Seed Script
scripts/seed_git_history.py reads the git log of a local repository and sends every commit to POST /api/v1/seed/commits in batches of 200.
Use this script once when first deploying Leliel against a repository that already has commit history. It populates Repo, Branch, and Commit nodes so that historical context is available in the graph before any CI builds have been ingested.
Prerequisites
- Python 3.12 with `requests` installed
- Leliel API running and reachable
- A local clone of the repository to seed
# Install the requests library into your active Python environment before running
pip install requests
Usage
# Run the seed script to post all commits from a local git repository to Leliel
python scripts/seed_git_history.py \
--repo-path /path/to/local/repo \
--repo-name my-repo \
--branch main \
--api-url http://localhost:8081 \
--api-key your-pipeline-key
Arguments
| Argument | Required | Default | Description |
|---|---|---|---|
| `--repo-path` | Yes | — | Path to the local git repository to read commit history from |
| `--repo-name` | Yes | — | Repo name as stored in the graph. Must match the `repo` field in build ingest payloads. |
| `--branch` | No | `main` | Branch name to walk; uses `--first-parent` to follow the mainline only |
| `--api-url` | Yes | — | Knowledge API base URL, no trailing slash |
| `--api-key` | Yes | — | Pipeline key (`KNOWLEDGE_API_KEY`) |
| `--dry-run` | No | — | Parse and count commits but do not POST to the API |
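The argument table can be read as an `argparse` definition. The following is a minimal sketch under that assumption, not the script's actual source: the function name `build_parser` and the help strings are illustrative, and only the flags, defaults, and required status come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the argument table (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Seed Leliel with commit history from a local git repository."
    )
    parser.add_argument("--repo-path", required=True,
                        help="Path to the local git repository")
    parser.add_argument("--repo-name", required=True,
                        help="Repo name as stored in the graph")
    parser.add_argument("--branch", default="main",
                        help="Branch whose mainline is walked")
    parser.add_argument("--api-url", required=True,
                        help="Knowledge API base URL, no trailing slash")
    parser.add_argument("--api-key", required=True,
                        help="Pipeline key")
    # A flag with no value: present means True, absent means False.
    parser.add_argument("--dry-run", action="store_true",
                        help="Parse and count commits without posting")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args([
    "--repo-path", "/path/to/local/repo",
    "--repo-name", "my-repo",
    "--api-url", "http://localhost:8081",
    "--api-key", "your-pipeline-key",
])
print(args.branch, args.dry_run)  # main False
```

Because `--branch` carries a default and `--dry-run` is a boolean flag, both can be omitted, which matches the "No" entries in the Required column.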
Dry run
Run with --dry-run first to confirm the repo path and commit count before writing any data:
# Parse commits from git log without sending any data to the Knowledge API
python scripts/seed_git_history.py \
--repo-path /path/to/local/repo \
--repo-name my-repo \
--api-url http://localhost:8081 \
--api-key your-pipeline-key \
--dry-run
Output:
Batch processing
Commits are posted in batches of 200, and progress is printed after each batch completes.
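One way to picture the batching is the sketch below. It assumes the parsed commits are held in a list of dictionaries and that sending is delegated to a caller-supplied function; `chunked`, `post_in_batches`, `post_batch`, and the wording of the progress messages are all hypothetical and may differ from what the script prints. Only the batch size of 200 and the dry-run behaviour come from this page.

```python
from typing import Callable, Iterator, Sequence

BATCH_SIZE = 200  # batch size stated in this page


def chunked(items: Sequence[dict], size: int = BATCH_SIZE) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_in_batches(
    commits: Sequence[dict],
    post_batch: Callable[[Sequence[dict]], None],
    dry_run: bool = False,
) -> int:
    """Send commits batch by batch and return how many were sent."""
    if dry_run:
        # Count only; nothing is handed to post_batch.
        print(f"Dry run: {len(commits)} commits parsed, nothing posted")
        return 0
    sent = 0
    for batch in chunked(commits):
        post_batch(batch)
        sent += len(batch)
        # Hypothetical progress line, printed once per completed batch.
        print(f"Posted {sent}/{len(commits)} commits")
    return sent


# Demonstration with a stand-in sender that records batch sizes.
commits = [{"sha": f"{i:040x}"} for i in range(450)]
sizes: list[int] = []
post_in_batches(commits, lambda batch: sizes.append(len(batch)))
print(sizes)  # [200, 200, 50]
```

In real use, `post_batch` would wrap an HTTP POST to /api/v1/seed/commits. The request body and authentication header are not specified on this page, so they are left out of the sketch.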
Notes
The script uses git log --first-parent to walk only the mainline of the specified branch. Commits that arrived on the branch via a merge are not followed into their source branches. This keeps the commit graph clean and avoids importing large numbers of commits that belong to feature branches.
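The mainline walk could be reproduced with `subprocess` roughly as follows. This is a sketch, not the script's code: the helper names, the choice of fields (hash, author name, author date, subject), and the separator characters are assumptions. `--first-parent` and the `%H`, `%an`, `%aI`, `%s` placeholders are standard `git log` features.

```python
import subprocess

# Control characters that are very unlikely to occur in commit metadata.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"
# Assumed field selection: hash, author name, ISO 8601 author date, subject.
LOG_FORMAT = FIELD_SEP.join(["%H", "%an", "%aI", "%s"]) + RECORD_SEP


def parse_log(raw: str) -> list[dict]:
    """Turn separator-delimited git log output into a list of commit dicts."""
    commits = []
    for record in raw.split(RECORD_SEP):
        # Strip only newlines: str.strip() with no argument would also
        # remove a trailing FIELD_SEP left by an empty subject.
        record = record.strip("\n")
        if not record:
            continue
        sha, author, authored_at, subject = record.split(FIELD_SEP, 3)
        commits.append({
            "sha": sha,
            "author": author,
            "authored_at": authored_at,
            "subject": subject,
        })
    return commits


def read_mainline(repo_path: str, branch: str = "main") -> list[dict]:
    """Read first-parent history of `branch`, newest commit first."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--first-parent",
         f"--pretty=format:{LOG_FORMAT}", branch],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Calling `read_mainline("/path/to/local/repo", "main")` returns one dictionary per mainline commit. A merge commit appears once, and the commits on its second parent are not listed.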
The --repo-name value controls the repo property stored on every Repo, Branch, and Commit node created by the seed. It must match the repo field used in build ingest payloads. If they differ, builds and seed commits will be stored under separate Repo nodes in the graph.
Re-seeding is safe
All seed writes use MERGE. Running the script multiple times against the same repo will update existing nodes without creating duplicates.
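The effect of MERGE, and of a mismatched --repo-name, can be modelled in plain Python. The sketch below is an in-memory analogy rather than the graph queries the API runs: `merge_commit` is a hypothetical helper, and keying commits by `(repo, sha)` is an assumption made for illustration.

```python
def merge_commit(store: dict, repo: str, sha: str, **props) -> bool:
    """Upsert a commit keyed by (repo, sha); return True if it was created."""
    key = (repo, sha)
    created = key not in store
    # Existing entries are updated in place, so no duplicate is added.
    store.setdefault(key, {}).update(props)
    return created


store: dict = {}

# First seed run creates the entry.
first = merge_commit(store, "my-repo", "abc123", subject="Initial commit")
# Re-running the seed matches the same key and only updates properties.
second = merge_commit(store, "my-repo", "abc123", subject="Initial commit")
print(first, second, len(store))  # True False 1

# A different repo name yields a different key, hence a separate entry.
third = merge_commit(store, "My-Repo", "abc123", subject="Initial commit")
print(third, len(store))  # True 2
```

The last call shows why --repo-name must match the repo field in build ingest payloads: the same commit under a different repo name is treated as new data.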