@filipchristiansen
Contributor

This PR replaces the old dictionary-based query object with a new ParsedQuery dataclass, improving type safety and readability across the codebase.

Key points:

  • Introduces a ParsedQuery dataclass that consolidates all query-related fields (repo, branch, commit, patterns, etc.)

  • Updates the following functions in the query_parser module to return a query of type ParsedQuery (instead of dict[str, Any]):

    • _parse_repo_source
    • _parse_path
    • parse_query
  • Updates the following functions in the query_ingestion module to take a query argument of type ParsedQuery (instead of dict[str, Any]):

    • run_ingest_query
    • _ingest_directory
    • _ingest_single_file
    • _create_tree_structure
    • _create_summary_string
    • _extract_files_content
    • _process_item
    • _process_symlink
    • _scan_directory
  • Switches ignore/include patterns to sets for clearer overrides and deduplication

  • Moves the maximum size constants (MAX_FILE_SIZE, etc.) to config.py, or imports them from there

  • Aligns all references and tests to the new dataclass approach

This unification makes it easier to understand and maintain how query data flows through the ingestion process.
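
For readers unfamiliar with the pattern, here is a minimal sketch of what a dataclass-based query with set-valued patterns could look like. It is not the code from this PR: only the field names mentioned above (repo, branch, commit, patterns) come from the description, while the defaults, the `max_file_size` field, and the `apply_include_override` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ParsedQuery:
    """Illustrative stand-in for the PR's dataclass; the real field list may differ."""

    repo: Optional[str] = None
    branch: Optional[str] = None
    commit: Optional[str] = None
    max_file_size: int = 10 * 1024 * 1024  # assumed default, not taken from the PR
    # Sets deduplicate patterns automatically and make overrides a set operation.
    ignore_patterns: Set[str] = field(default_factory=set)
    include_patterns: Set[str] = field(default_factory=set)


def apply_include_override(query: ParsedQuery) -> ParsedQuery:
    """Hypothetical helper: an explicitly included pattern is no longer ignored."""
    query.ignore_patterns -= query.include_patterns
    return query


query = ParsedQuery(
    repo="example/repo",  # placeholder value
    branch="main",
    ignore_patterns={"*.pyc", "*.md", "*.pyc"},  # the duplicate collapses in the set
    include_patterns={"*.md"},
)
apply_include_override(query)
print(sorted(query.ignore_patterns))  # ['*.pyc']
```

Compared with a dict[str, Any], attribute access such as `query.branch` can be checked by a type checker, and a misspelled field name fails at construction time rather than silently producing a missing key.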

@filipchristiansen force-pushed the refactor/dict-to-dataclass branch 2 times, most recently from 14b607e to a9373d4 on January 15, 2025 07:40
- Introduce ParsedQuery dataclass to store query parameters and metadata
- Update ingestion and parser modules to use ParsedQuery instead of dict[str, Any]
- Convert ignore_patterns and include_patterns to sets
- Clean references to max size and pattern handling
- Update tests to reflect new dataclass usage
@filipchristiansen force-pushed the refactor/dict-to-dataclass branch from a9373d4 to 0c6242a on January 17, 2025 08:45
Member

@cyclotruc left a comment

Tested it, looks good

@cyclotruc merged commit d721b00 into main on Jan 17, 2025
8 checks passed
@filipchristiansen deleted the refactor/dict-to-dataclass branch on January 19, 2025 14:43
FOLKS-Tech pushed a commit to FOLKS-Tech/gitingest that referenced this pull request Sep 5, 2025

3 participants