Commit af53885

feat: enhance parser domain-agnostic support (coderamp-labs#117)
* feat: make parser domain-agnostic to support multiple Git hosts
  - added list of known domains/Git hosts in `query_parser.py`
  - fixed bug from coderamp-labs#115: corrected case handling for URL components; scheme, domain, username, and repository are case-insensitive, but paths beyond (e.g., file names, branches) are case-sensitive
  - implemented `try_domains_for_user_and_repo` in `query_parser.py` to iteratively guess the correct domain until success or supported hosts are exhausted
  - added helper functions `_get_user_and_repo_from_path`, `_validate_host`, and `_validate_scheme` in `query_parser.py`
  - extended `_parse_repo_source` in `query_parser.py` to be Git host agnostic by using `try_domains_for_user_and_repo`
  - added tests `test_parse_url_unsupported_host` and `test_parse_query_with_branch` in `test_query_parser.py`
  - created new file `test_git_host_agnostic.py` to verify domain/Git host agnostic behavior
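
The case-handling rule and the host-guessing loop described in the commit message can be illustrated with a minimal sketch. This is not the code from `query_parser.py` (which is not shown in this diff): the function names `normalize_repo_url` and `first_matching_host`, the `KNOWN_HOSTS` list, and the `exists` callback are all hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical host list for illustration; the commit keeps its own list in query_parser.py.
KNOWN_HOSTS = ["github.com", "gitlab.com", "bitbucket.org"]


def normalize_repo_url(url: str) -> str:
    """Lowercase scheme, host, user, and repo; leave the rest of the path untouched."""
    parsed = urlparse(url)
    scheme = parsed.scheme.lower()
    host = parsed.netloc.lower()
    if scheme not in ("http", "https"):
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    if host not in KNOWN_HOSTS:
        raise ValueError(f"Unsupported host: {host!r}")

    parts = parsed.path.strip("/").split("/")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("Expected a path of the form /<user>/<repo>")

    user, repo = parts[0].lower(), parts[1].lower()
    rest = parts[2:]  # branch names and file paths are case-sensitive, so keep them as-is
    return "/".join([f"{scheme}://{host}", user, repo, *rest])


def first_matching_host(user: str, repo: str, exists) -> str:
    """Try each known host in order until `exists(url)` reports success."""
    for host in KNOWN_HOSTS:
        if exists(f"https://{host}/{user}/{repo}"):
            return host
    raise ValueError(f"No supported host has {user}/{repo}")
```

For example, `normalize_repo_url("HTTPS://GitHub.com/User/Repo/tree/Main/README.md")` lowercases everything up to the repository name but keeps `Main/README.md` intact. The `exists` callback is injected so the loop can be exercised without network access; how the commit actually probes each host is not visible here.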
1 parent 6d8cb1a commit af53885

22 files changed, +429 -167 lines changed

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ FROM python:3.12-slim
 ENV PYTHONUNBUFFERED=1
 ENV PYTHONDONTWRITEBYTECODE=1
 
-# Install git
+# Install Git
 RUN apt-get update \
 && apt-get install -y --no-install-recommends git curl\
 && rm -rf /var/lib/apt/lists/*

README.md

Lines changed: 15 additions & 15 deletions
@@ -11,13 +11,13 @@
 
 Turn any Git repository into a prompt-friendly text ingest for LLMs.
 
-You can also replace `hub` with `ingest` in any GitHub URL to access the coresponding digest
+You can also replace `hub` with `ingest` in any GitHub URL to access the coresponding digest.
 
-[gitingest.com](https://gitingest.com/) · [Chrome Extension](https://chromewebstore.google.com/detail/adfjahbijlkjfoicpjkhjicpjpjfaood) · [Firefox Add-on](https://addons.mozilla.org/firefox/addon/gitingest/)
+[gitingest.com](https://gitingest.com) · [Chrome Extension](https://chromewebstore.google.com/detail/adfjahbijlkjfoicpjkhjicpjpjfaood) · [Firefox Add-on](https://addons.mozilla.org/firefox/addon/gitingest)
 
 ## 🚀 Features
 
-- **Easy code context**: Get a text digest from a git repository URL or a directory
+- **Easy code context**: Get a text digest from a Git repository URL or a directory
 - **Smart Formatting**: Optimized output format for LLM prompts
 - **Statistics about**:
 - File and directory structure
@@ -36,11 +36,12 @@ pip install gitingest
 
 <!-- markdownlint-disable MD033 -->
 <a href="https://chromewebstore.google.com/detail/adfjahbijlkjfoicpjkhjicpjpjfaood" target="_blank" title="Get Gitingest Extension from Chrome Web Store"><img height="48" src="https://github.com/user-attachments/assets/20a6e44b-fd46-4e6c-8ea6-aad436035753" alt="Available in the Chrome Web Store" /></a>
-<a href="https://addons.mozilla.org/firefox/addon/gitingest/" target="_blank" title="Get Gitingest Extension from Firefox Add-ons"><img height="48" src="https://github.com/user-attachments/assets/c0e99e6b-97cf-4af2-9737-099db7d3538b" alt="Get The Add-on for Firefox" /></a>
+<a href="https://addons.mozilla.org/firefox/addon/gitingest" target="_blank" title="Get Gitingest Extension from Firefox Add-ons"><img height="48" src="https://github.com/user-attachments/assets/c0e99e6b-97cf-4af2-9737-099db7d3538b" alt="Get The Add-on for Firefox" /></a>
 <a href="https://microsoftedge.microsoft.com/addons/detail/nfobhllgcekbmpifkjlopfdfdmljmipf" target="_blank" title="Get Gitingest Extension from Firefox Add-ons"><img height="48" src="https://github.com/user-attachments/assets/204157eb-4cae-4c0e-b2cb-db514419fd9e" alt="Get from the Edge Add-ons" /></a>
 <!-- markdownlint-enable MD033 -->
 
 The extension is open source at [lcandy2/gitingest-extension](https://github.com/lcandy2/gitingest-extension).
+
 Issues and feature requests are welcome to the repo.
 
 ## 💡 Command line usage
@@ -71,7 +72,7 @@ summary, tree, content = ingest("path/to/directory")
 summary, tree, content = ingest("https://github.com/cyclotruc/gitingest")
 ```
 
-By default, this won't write a file but can be enabled with the `output` argument
+By default, this won't write a file but can be enabled with the `output` argument.
 
 ## 🌐 Self-host
 
@@ -87,31 +88,30 @@ By default, this won't write a file but can be enabled with the `output` argumen
 docker run -d --name gitingest -p 8000:8000 gitingest
 ```
 
-The application will be available at `http://localhost:8000`
+The application will be available at `http://localhost:8000`.
 
 If you are hosting it on a domain, you can specify the allowed hostnames via env variable `ALLOWED_HOSTS`.
 
 ```bash
-#Default: "gitingest.com,*.gitingest.com,localhost, 127.0.0.1".
+# Default: "gitingest.com, *.gitingest.com, localhost, 127.0.0.1".
 ALLOWED_HOSTS="example.com, localhost, 127.0.0.1"
 ```
 
 ## 🛠️ Stack
 
-- [Tailwind CSS](https://tailwindcss.com/) - Frontend
+- [Tailwind CSS](https://tailwindcss.com) - Frontend
 - [FastAPI](https://github.com/fastapi/fastapi) - Backend framework
-- [Jinja2](https://jinja.palletsprojects.com/) - HTML templating
+- [Jinja2](https://jinja.palletsprojects.com) - HTML templating
 - [tiktoken](https://github.com/openai/tiktoken) - Token estimation
-- [apianalytics.dev](https://www.apianalytics.dev/) - Simple Analytics
+- [apianalytics.dev](https://www.apianalytics.dev) - Simple Analytics
 
-### Looking for a javascript/node package?
+### Looking for a JavaScript/Node package?
 
 Check out the NPM alternative 📦 Repomix: <https://github.com/yamadashy/repomix>
 
 ## ✔️ Contributing to Gitingest
 
-Gitingest aims to be friendly for first time contributors, with a simple python and html codebase.
-If you need any help while working with the code, reach out to us on [discord](https://discord.com/invite/zerRaGK9EC)
+Gitingest aims to be friendly for first time contributors, with a simple python and html codebase. If you need any help while working with the code, reach out to us on [Discord](https://discord.com/invite/zerRaGK9EC).
 
 ### Ways to help (non-technical)
 
@@ -125,7 +125,7 @@ Gitingest aims to be friendly for first time contributors, with a simple python
 2. Setup the dev environment (see Development section bellow)
 3. Run unit tests with `pytest`
 4. Commit your changes and run `pre-commit`
-5. Open a pull request on Github for review and feedback
+5. Open a pull request on GitHub for review and feedback
 6. (Optionnal) Invite project maintainer to your branch for easier collaboration
 
 ## 🔧 Development
@@ -161,7 +161,7 @@ Gitingest aims to be friendly for first time contributors, with a simple python
 pytest
 ```
 
-The application should be available at `http://localhost:8000`
+The application should be available at `http://localhost:8000`.
 
 ### Working on the CLI
 
src/gitingest/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-""" Gitingest: A package for ingesting data from git repositories. """
+""" Gitingest: A package for ingesting data from Git repositories. """
 
 from gitingest.query_ingestion import run_ingest_query
 from gitingest.query_parser import parse_query

src/gitingest/cli.py

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@
 @click.option("--max-size", "-s", default=MAX_FILE_SIZE, help="Maximum file size to process in bytes")
 @click.option("--exclude-pattern", "-e", multiple=True, help="Patterns to exclude")
 @click.option("--include-pattern", "-i", multiple=True, help="Patterns to include")
-def main(
+async def main(
 source: str,
 output: str | None,
 max_size: int,
@@ -54,7 +54,7 @@ def main(
 
 if not output:
 output = "digest.txt"
-summary, _, _ = ingest(source, max_size, include_patterns, exclude_patterns, output=output)
+summary, _, _ = await ingest(source, max_size, include_patterns, exclude_patterns, output=output)
 
 click.echo(f"Analysis complete! Output written to: {output}")
 click.echo("\nSummary:")
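
A note on the `async def main(` change above: to my knowledge, Click invokes a command callback as a plain function, so a coroutine function used as the callback would need something to drive it to completion. Whether this commit handles that elsewhere is not visible in this diff. The sketch below shows one common way to bridge a synchronous entry point to a coroutine with `asyncio.run`; `run_sync` and `fake_ingest` are hypothetical names used only for illustration, and Click itself is left out so the example stays standard-library only.

```python
import asyncio
import functools


def run_sync(async_fn):
    """Wrap a coroutine function so that a plain call runs it to completion."""

    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
        return asyncio.run(async_fn(*args, **kwargs))

    return wrapper


async def fake_ingest(source: str) -> tuple[str, str, str]:
    # Stand-in for the awaited `ingest(...)` call; not the real gitingest function.
    await asyncio.sleep(0)
    return (f"summary of {source}", "tree", "content")


@run_sync
async def main(source: str) -> str:
    summary, _, _ = await fake_ingest(source)
    return summary
```

With a wrapper like this, the decorated `main` can be called synchronously (for example, as a command callback) while its body still uses `await`.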

src/gitingest/exceptions.py

Lines changed: 2 additions & 2 deletions
@@ -23,7 +23,7 @@ def __init__(self, pattern: str) -> None:
 
 class AsyncTimeoutError(Exception):
 """
-Raised when an async operation exceeds its timeout limit.
+Exception raised when an async operation exceeds its timeout limit.
 
 This exception is used by the `async_timeout` decorator to signal that the wrapped
 asynchronous function has exceeded the specified time limit for execution.
@@ -38,7 +38,7 @@ def __init__(self, max_files: int) -> None:
 
 
 class MaxFileSizeReachedError(Exception):
-"""Raised when the maximum file size is reached."""
+"""Exception raised when the maximum file size is reached."""
 
 def __init__(self, max_size: int):
 super().__init__(f"Maximum file size limit ({max_size/1024/1024:.1f}MB) reached.")

src/gitingest/query_ingestion.py

Lines changed: 3 additions & 1 deletion
@@ -170,7 +170,9 @@ def _read_file_content(file_path: Path) -> str:
 
 def _sort_children(children: list[dict[str, Any]]) -> list[dict[str, Any]]:
 """
-Sort children nodes with:
+Sort the children nodes of a directory according to a specific order.
+
+Order of sorting:
 1. README.md first
 2. Regular files (not starting with dot)
 3. Hidden files (starting with dot)
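
The three ordering rules visible in this hunk can be sketched with a ranked sort key. This is only an illustration, not the body of `_sort_children`: the hunk is cut off after rule 3, so any further rules are omitted, and the `"name"` key and the alphabetical tie-break are assumptions.

```python
from typing import Any


def sort_children_sketch(children: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Order nodes: README.md first, then regular files, then dot-files."""

    def rank(node: dict[str, Any]) -> tuple[int, str]:
        name = node["name"]  # assumed key; the real node layout is not shown in the diff
        if name == "README.md":
            return (0, name)
        if not name.startswith("."):
            return (1, name)
        return (2, name)

    # Within each group, names are compared alphabetically (an assumption).
    return sorted(children, key=rank)
```

Using a tuple key keeps the rule order explicit and makes it easy to append further groups as additional rank values.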
