Skip to content

Conversation

@filipchristiansen
Copy link
Contributor

@filipchristiansen filipchristiansen commented Jan 8, 2025

This pull request addresses the issue where the injest URL field was not case insensitive, as described in #110.

Changes:

  • Added source = source.lower() in the parse_query function to handle mixed-case URLs.
  • Implemented a new test test_parse_query_mixed_case to ensure that URLs with uppercase letters are processed correctly.

These changes ensure that URLs with different casing are correctly recognized, resolving the reported issue.

Resolves #110

- Convert `source` to lowercase in `parse_query`
- Add `test_parse_query_mixed_case` to verify functionality

Resolves #110
Copy link
Member

@cyclotruc cyclotruc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch thank you! Merging

@cyclotruc cyclotruc merged commit 551d09a into coderamp-labs:main Jan 8, 2025
8 checks passed
@filipchristiansen filipchristiansen deleted the fix/injest-url-case-insensitive branch January 8, 2025 22:33
filipchristiansen added a commit that referenced this pull request Jan 10, 2025
- added list of known domains/Git hosts in `query_parser.py`
- fixed bug from [#115](#115): corrected case handling for URL components—scheme, domain, username, and repository are case-insensitive, but paths beyond (e.g., file names, branches) are case-sensitive
- implemented `try_domains_for_user_and_repo` in `query_parser.py` to iteratively guess the correct domain until success or supported hosts are exhausted
- added helper functions `_get_user_and_repo_from_path`, `_validate_host`, and `_validate_scheme` in `query_parser.py`
- extended `_parse_repo_source` in `query_parser.py` to be Git host agnostic by using `try_domains_for_user_and_repo`
- added tests `test_parse_url_unsupported_host` and `test_parse_query_with_branch` in `test_query_parser.py`
- created new file `test_git_host_agnostic.py` to verify domain/Git host agnostic behavior
filipchristiansen added a commit that referenced this pull request Jan 10, 2025
- added list of known domains/Git hosts in `query_parser.py`
- fixed bug from [#115](#115): corrected case handling for URL components—scheme, domain, username, and repository are case-insensitive, but paths beyond (e.g., file names, branches) are case-sensitive
- implemented `try_domains_for_user_and_repo` in `query_parser.py` to iteratively guess the correct domain until success or supported hosts are exhausted
- added helper functions `_get_user_and_repo_from_path`, `_validate_host`, and `_validate_scheme` in `query_parser.py`
- extended `_parse_repo_source` in `query_parser.py` to be Git host agnostic by using `try_domains_for_user_and_repo`
- added tests `test_parse_url_unsupported_host` and `test_parse_query_with_branch` in `test_query_parser.py`
- created new file `test_git_host_agnostic.py` to verify domain/Git host agnostic behavior
cyclotruc pushed a commit that referenced this pull request Jan 13, 2025
* feat: make parser domain-agnostic to support multiple Git hosts

- added list of known domains/Git hosts in `query_parser.py`
- fixed bug from [#115](#115): corrected case handling for URL components—scheme, domain, username, and repository are case-insensitive, but paths beyond (e.g., file names, branches) are case-sensitive
- implemented `try_domains_for_user_and_repo` in `query_parser.py` to iteratively guess the correct domain until success or supported hosts are exhausted
- added helper functions `_get_user_and_repo_from_path`, `_validate_host`, and `_validate_scheme` in `query_parser.py`
- extended `_parse_repo_source` in `query_parser.py` to be Git host agnostic by using `try_domains_for_user_and_repo`
- added tests `test_parse_url_unsupported_host` and `test_parse_query_with_branch` in `test_query_parser.py`
- created new file `test_git_host_agnostic.py` to verify domain/Git host agnostic behavior
FOLKS-Tech pushed a commit to FOLKS-Tech/gitingest that referenced this pull request Sep 5, 2025
FOLKS-Tech pushed a commit to FOLKS-Tech/gitingest that referenced this pull request Sep 5, 2025
* feat: make parser domain-agnostic to support multiple Git hosts

- added list of known domains/Git hosts in `query_parser.py`
- fixed bug from [coderamp-labs#115](coderamp-labs#115): corrected case handling for URL components—scheme, domain, username, and repository are case-insensitive, but paths beyond (e.g., file names, branches) are case-sensitive
- implemented `try_domains_for_user_and_repo` in `query_parser.py` to iteratively guess the correct domain until success or supported hosts are exhausted
- added helper functions `_get_user_and_repo_from_path`, `_validate_host`, and `_validate_scheme` in `query_parser.py`
- extended `_parse_repo_source` in `query_parser.py` to be Git host agnostic by using `try_domains_for_user_and_repo`
- added tests `test_parse_url_unsupported_host` and `test_parse_query_with_branch` in `test_query_parser.py`
- created new file `test_git_host_agnostic.py` to verify domain/Git host agnostic behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

injest url field isn't case insensitive.

2 participants