@filipchristiansen filipchristiansen commented Jun 15, 2025

Private-repo support & sparse-checkout fix

This PR enables Gitingest to ingest private GitHub repositories and resolves the long-standing sparse-checkout CLI mismatch.

✨ New Features

| Area | What’s new |
| --- | --- |
| CLI | `--token`/`-t` flag (with `GITHUB_TOKEN` env-var fallback) to supply a GitHub PAT. |
| Cloning | • Automatically injects `http.https://github.com/.extraheader=Authorization: Basic …` when a PAT is present.<br>• Validates PAT format (`github_pat_*` or `ghp_*`).<br>• Fully supports partial clones and commit pins. |
| Repo detection | `check_repo_exists` and branch-listing now hit the GitHub REST API with auth when required. |
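
For readers who want to see how the header injection and PAT check fit together, here is a minimal sketch. It is not the PR's actual code: the names `validate_pat`, `auth_config_args`, and `clone_command`, the exact regular expression, the `x-access-token` placeholder user name, and the clone flags are all assumptions made for illustration.

```python
from __future__ import annotations

import base64
import re

# Assumed shape: a known prefix followed by letters, digits, or underscores.
# The project's real check may be stricter (for example, a minimum length).
PAT_PATTERN = re.compile(r"(?:github_pat_|ghp_)[A-Za-z0-9_]+")


def validate_pat(token: str) -> None:
    """Raise ValueError if the token does not look like a GitHub PAT."""
    if not PAT_PATTERN.fullmatch(token):
        raise ValueError("token must start with 'github_pat_' or 'ghp_'")


def auth_config_args(token: str) -> list[str]:
    """Return `git -c ...` arguments that attach a Basic auth header for github.com."""
    validate_pat(token)
    # Assumption: a placeholder user name; the PAT travels as the password half.
    credentials = base64.b64encode(f"x-access-token:{token}".encode()).decode()
    return [
        "-c",
        f"http.https://github.com/.extraheader=Authorization: Basic {credentials}",
    ]


def clone_command(url: str, dest: str, token: str | None = None) -> list[str]:
    """Build a shallow clone command, adding the auth header only when a PAT is given."""
    cmd = ["git"]
    if token:
        cmd += auth_config_args(token)
    cmd += ["clone", "--single-branch", "--depth=1", url, dest]
    return cmd
```

Passing the header through `-c` keeps the token out of the remote URL, so it is not written into the cloned repository's `.git/config`.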

🐛 Fixes & clean-ups

  • Sparse-checkout and commit checkout now run as separate git commands (correct syntax).
  • Tidied docs
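
The two-step checkout can be sketched roughly as follows. This is an illustration rather than the PR's implementation: `checkout_steps` and `run_steps` are hypothetical helpers, and the sketch assumes the repository was already cloned in a way that allows sparse checkout.

```python
from __future__ import annotations

import subprocess


def checkout_steps(
    repo_dir: str,
    subpath: str | None = None,
    commit: str | None = None,
) -> list[list[str]]:
    """Return the git invocations to run, one argv list per command."""
    steps: list[list[str]] = []
    if subpath:
        # Step 1: restrict the working tree to the requested path.
        steps.append(["git", "-C", repo_dir, "sparse-checkout", "set", subpath])
    if commit:
        # Step 2: pin the commit in its own invocation, never appended to step 1.
        steps.append(["git", "-C", repo_dir, "checkout", commit])
    return steps


def run_steps(steps: list[list[str]]) -> None:
    """Run each command in order and stop at the first failure."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

Keeping the commands as separate argv lists means `git sparse-checkout set` never receives `checkout <sha>` as extra path arguments.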

🧪 Tests

  • Mocks updated to verify token-aware calls.
  • Added assertions for new checkout sequence.

🗒️ Usage

# PAT with at least repo-read scope
export GITHUB_TOKEN=github_pat_xxx
gitingest myorg/myprivaterepo

or:

gitingest myorg/myprivaterepo --token github_pat_xxx
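
The relationship between the two forms above (explicit flag first, environment variable as fallback) can be sketched like this. `resolve_token` is a hypothetical helper, not a function confirmed to exist in Gitingest.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    cli_token: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer the --token value; otherwise fall back to GITHUB_TOKEN, if set."""
    if cli_token:
        return cli_token
    # An empty GITHUB_TOKEN is treated the same as an unset one.
    return env.get("GITHUB_TOKEN") or None
```

Returning `None` when neither source is set leaves the public-repo path unchanged, since no auth header is added.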

No breaking changes — public-repo workflows continue to work as before.

Next: surface the token option in the web UI (gitingest.com) and update docs.

…cs/CLI

* Run `git sparse-checkout set …` and `git checkout <sha>` as two calls—matches Git’s CLI rules and fixes failures.
* Tidy clone path creation via _ensure_directory; use DEFAULT_TIMEOUT.
* Clarify CLI/help strings and schema docstrings.
* Update tests for the new two-step checkout flow.
* CLI: new `--token/-t` flag (fallback to `GITHUB_TOKEN`)
* clone_repo:
  * injects Basic-auth header when a PAT is supplied
  * validates PAT format (`github_pat_*`)
* git_utils:
  * `create_git_auth_header`, `validate_github_token`, `create_git_command`
  * `_check_github_repo_exists` & branch-listing now work with tokens
* os_utils.ensure_directory extracted for reuse
* tests updated to reflect new call signatures
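
As a rough illustration of a token-aware existence check against the GitHub REST API, the sketch below builds the request and interprets the status code without performing any network call. The helper names `repo_request` and `repo_exists` are hypothetical, and the real `_check_github_repo_exists` may use a different HTTP client and different error handling.

```python
from __future__ import annotations

import urllib.request


def repo_request(owner: str, repo: str, token: str | None = None) -> urllib.request.Request:
    """Build a GET request for the repository endpoint, authenticated if a token is given."""
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"https://api.github.com/repos/{owner}/{repo}", headers=headers
    )


def repo_exists(status: int) -> bool:
    """Map an HTTP status code to an existence answer."""
    if status == 200:
        return True
    if status == 404:
        # GitHub also answers 404 for private repos the caller cannot see.
        return False
    raise RuntimeError(f"unexpected status code from GitHub API: {status}")
```

Because a private repository looks like a missing one to an unauthenticated caller, the same 404 can mean "does not exist" or "token missing or lacks access".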

Copilot AI left a comment

Pull Request Overview

This PR introduces private repository support and fixes issues with sparse-checkout usage by updating command invocations, token handling, and associated utilities. Key changes include adding token parameters to functions (e.g. check_repo_exists, clone_repo, etc.), adjusting CLI options to accept a GitHub PAT, and refactoring git command construction for cloning and checkout processes.

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `tests/test_repository_clone.py` | Updated test expectations to reflect the new token parameter for repo existence checks. |
| `src/gitingest/utils/os_utils.py` | Added a helper to ensure directories exist before operations. |
| `src/gitingest/utils/git_utils.py` | Enhanced functions to include token-based authentication and support private repo checks via GitHub API. |
| `src/gitingest/schemas/ingestion_schema.py` | Added a new blob field to the clone configuration schema. |
| `src/gitingest/query_parsing.py` | Updated parsing functions to support token passing for handling private repos. |
| `src/gitingest/entrypoint.py` | Introduced token handling in both async and synchronous entry points. |
| `src/gitingest/config.py` | Added a `DEFAULT_TIMEOUT` constant for consistent timeout handling. |
| `src/gitingest/cloning.py` | Updated `clone_repo` to handle token validation, authentication and split sparse-checkout/commit checkout commands. |
| `src/gitingest/cli.py` | Extended CLI options to include token support and improved help messages for clarity. |

@coderamp-labs coderamp-labs deleted a comment from Copilot AI Jun 15, 2025
@filipchristiansen filipchristiansen force-pushed the feat/private-repo-support branch from 75131b5 to e8156a9 Compare June 15, 2025 20:22

@cyclotruc cyclotruc left a comment

Tested and the changes seem very reasonable to me, thank you

@cyclotruc cyclotruc merged commit 1dd133c into main Jun 15, 2025
18 checks passed
@cyclotruc cyclotruc deleted the feat/private-repo-support branch June 15, 2025 21:30
@ChrisCarini

Hi @cyclotruc, are there plans to release this change on PyPI anytime soon?

@cyclotruc

@ChrisCarini this has just been released!

https://github.com/cyclotruc/gitingest/releases/tag/v0.1.5

@ChrisCarini

Awesome, thank you!
