Neverdecel · Sep 7, 2025 · Sep 3, 2025
diff --git a/.github/workflows/ci-tests.yml b/.github/workflows/ci-tests.yml
@@ -0,0 +1,66 @@
+name: CI Tests
+
+on:
+  push:
+    branches: [ main, master, develop ]
+  pull_request:
+    branches: [ main, master, develop ]
+
+jobs:
+  test-imports:
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v4
+
+    - name: Set up Python 3.11
+      uses: actions/setup-python@v5
+      with:
+        python-version: '3.11'
+
+    - name: Cache pip dependencies
+      uses: actions/cache@v4
+      with:
+        path: ~/.cache/pip
+        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
+        restore-keys: |
+          ${{ runner.os }}-pip-
+
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install -r requirements.txt
+
+    - name: Test Import Structure
+      run: |
+        python -c "import coderag.config; print('✓ Config import successful')"
+        python -c "import coderag.embeddings; print('✓ Embeddings import successful')"
+        python -c "import coderag.index; print('✓ Index import successful')"
+        python -c "import coderag.search; print('✓ Search import successful')"
+        python -c "import coderag.monitor; print('✓ Monitor import successful')"
+      env:
+        OPENAI_API_KEY: dummy-key-for-testing
+
+  quality-and-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python 3.11
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+          pip install black flake8 isort mypy pytest
+      - name: Lint and type-check
+        run: |
+          black --check .
+          isort --check-only .
+          flake8 . --max-line-length=88 --ignore=E203,W503
+          mypy .
+      - name: Run tests
+        env:
+          PYTHONPATH: ${{ github.workspace }}
+        run: pytest -q
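The five `python -c` commands in the `Test Import Structure` step stop at the first failing import. As a rough alternative, the same check could be written as one Python helper that reports every failing module. This is a minimal sketch, assuming only the module names listed in the workflow; `check_imports` is a hypothetical helper and not part of the repository.

```python
import importlib
import os
from typing import Dict, Iterable, Optional

# Module names taken from the workflow's "Test Import Structure" step.
CODERAG_MODULES = (
    "coderag.config",
    "coderag.embeddings",
    "coderag.index",
    "coderag.search",
    "coderag.monitor",
)


def check_imports(modules: Iterable[str]) -> Dict[str, Optional[str]]:
    """Try to import each module; map its name to None or an error string."""
    # Same placeholder key the workflow sets, applied only if none is present.
    os.environ.setdefault("OPENAI_API_KEY", "dummy-key-for-testing")
    results: Dict[str, Optional[str]] = {}
    for name in modules:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # Catch any import-time failure, not only ImportError.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling `check_imports(CODERAG_MODULES)` from the repository root and failing the job when any value is not `None` would cover the same ground as the shell step.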
diff --git a/.gitignore b/.gitignore
@@ -27,3 +27,5 @@ node_modules/
 *.tmp
 plan.md
 metadata.npy
+test_env/
+*.npy
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -0,0 +1,36 @@
+repos:
+  - repo: https://github.com/psf/black
+    rev: 23.12.1
+    hooks:
+      - id: black
+        language_version: python3
+        args: ['--line-length=88']
+
+  - repo: https://github.com/pycqa/flake8
+    rev: 7.0.0
+    hooks:
+      - id: flake8
+        args: ['--max-line-length=88', '--ignore=E203,W503']
+
+  - repo: https://github.com/pycqa/isort
+    rev: 5.13.2
+    hooks:
+      - id: isort
+        args: ["--profile", "black"]
+
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v1.8.0
+    hooks:
+      - id: mypy
+        additional_dependencies: [types-all]
+        args: [--ignore-missing-imports, --no-strict-optional]
+
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.5.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-added-large-files
+      - id: check-merge-conflict
+      - id: debug-statements
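The hook arguments above differ slightly from the CI job, which runs `isort --check-only .` and `mypy .` without the extra flags. For running the checks with the hook arguments where `pre-commit` itself is not installed, a small runner might look like the sketch below. `run_quality_checks` and its injectable `run`/`available` parameters are hypothetical names introduced for illustration; the command-line flags are the ones already used in this configuration.

```python
import shutil
import subprocess
from typing import Callable, List, Sequence

# Commands mirror the hook arguments in .pre-commit-config.yaml.
QUALITY_COMMANDS: List[List[str]] = [
    ["black", "--check", "--line-length=88", "."],
    ["isort", "--check-only", "--profile", "black", "."],
    ["flake8", "--max-line-length=88", "--ignore=E203,W503", "."],
    ["mypy", "--ignore-missing-imports", "--no-strict-optional", "."],
]


def _tool_available(tool: str) -> bool:
    """Return True if the tool is on PATH."""
    return shutil.which(tool) is not None


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_quality_checks(
    commands: Sequence[Sequence[str]] = QUALITY_COMMANDS,
    run: Callable[[Sequence[str]], int] = _run,
    available: Callable[[str], bool] = _tool_available,
) -> List[str]:
    """Run each command; return the tools that failed or are not installed."""
    failed: List[str] = []
    for cmd in commands:
        tool = cmd[0]
        # A missing tool counts as a failure and is never executed.
        if not available(tool) or run(cmd) != 0:
            failed.append(tool)
    return failed
```

The `run` and `available` parameters exist so the runner can be exercised in tests without the real tools installed.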
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,37 @@
+# Repository Guidelines
+
+## Project Structure & Module Organization
+- `coderag/`: Core library (`config.py`, `embeddings.py`, `index.py`, `search.py`, `monitor.py`).
+- `app.py`: Streamlit UI. `main.py`: backend/indexer. `prompt_flow.py`: RAG orchestration.
+- `scripts/`: Utilities (e.g., `initialize_index.py`, `run_monitor.py`).
+- `tests/`: Minimal checks (e.g., `test_faiss.py`).
+- `example.env` → copy to `.env` for local secrets; CI lives in `.github/`.
+
+## Build, Test, and Development Commands
+- Create env: `python -m venv venv && source venv/bin/activate`.
+- Install deps: `pip install -r requirements.txt`.
+- Run backend: `python main.py` (indexes and watches `WATCHED_DIR`).
+- Run UI: `streamlit run app.py`.
+- Quick test: `python tests/test_faiss.py` (FAISS round‑trip sanity check).
+- Quality suite: `pre-commit run --all-files` (black, isort, flake8, mypy, basics).
+
+## Coding Style & Naming Conventions
+- Formatting: Black (88 cols), isort profile "black"; run `black . && isort .`.
+- Linting: flake8 with `--ignore=E203,W503` to match Black.
+- Typing: mypy (py311 target; ignore missing imports OK). Prefer typed signatures and docstrings.
+- Indentation: 4 spaces. Names: `snake_case` for files/functions, `PascalCase` for classes, constants `UPPER_SNAKE`.
+- Imports: first‑party module is `coderag` (see `pyproject.toml`).
+
+## Testing Guidelines
+- Place tests in `tests/` as `test_*.py`. Keep unit tests deterministic; mock OpenAI calls where possible.
+- Run directly (`python tests/test_faiss.py`) or with pytest if available (`pytest -q`).
+- Ensure `.env` or env vars provide `OPENAI_API_KEY` for integration tests; avoid hitting rate limits in CI.
+
+## Commit & Pull Request Guidelines
+- Use Conventional Commits seen in history: `feat:`, `fix:`, `docs:`, `ci:`, `refactor:`, `simplify:`.
+- Before pushing: `pre-commit run --all-files` and update docs when behavior changes.
+- PRs: clear description, linked issues, steps to validate; include screenshots/GIFs for UI changes; note config changes (`.env`).
+
+## Security & Configuration Tips
+- Never commit secrets. Start with `cp example.env .env`; set `OPENAI_API_KEY`, `WATCHED_DIR`, `FAISS_INDEX_FILE`.
+- Avoid logging sensitive data. Regenerate the FAISS index if dimensions or models change (`python scripts/initialize_index.py`).
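The testing guideline in AGENTS.md asks for deterministic unit tests with OpenAI calls mocked. One way to do that with the standard library is sketched below. It assumes the code under test reaches the API through a client object exposing `embeddings.create(...)` and reads `response.data[0].embedding`, as the OpenAI Python SDK does; `embed_text`, `make_fake_client`, and the `"fake-model"` name are hypothetical and do not come from `coderag`.

```python
from typing import Any, List
from unittest.mock import MagicMock


def embed_text(client: Any, text: str, model: str) -> List[float]:
    """Hypothetical wrapper: request one embedding via an OpenAI-style client."""
    response = client.embeddings.create(model=model, input=text)
    return list(response.data[0].embedding)


def make_fake_client(vector: List[float]) -> MagicMock:
    """Build a stand-in client whose embeddings.create() returns `vector`."""
    client = MagicMock()
    client.embeddings.create.return_value.data = [MagicMock(embedding=vector)]
    return client


def test_embed_text_is_deterministic() -> None:
    client = make_fake_client([0.1, 0.2, 0.3])
    result = embed_text(client, "def f(): pass", model="fake-model")
    assert result == [0.1, 0.2, 0.3]
    # No network access: the only call made is to the mock.
    client.embeddings.create.assert_called_once_with(
        model="fake-model", input="def f(): pass"
    )
```

Because the fake client never touches the network, a test like this needs no real `OPENAI_API_KEY` and cannot hit rate limits in CI.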
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
@@ -0,0 +1,194 @@
+# 🛠️ Development Guide
+
+## Setting Up Development Environment
+
+### 1. Clone and Setup
+
+```bash
+git clone https://github.com/your-username/CodeRAG.git
+cd CodeRAG
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+pip install -r requirements.txt
+```
+
+### 2. Configure Pre-commit Hooks
+
+```bash
+pip install pre-commit
+pre-commit install
+```
+
+This will run code quality checks on every commit:
+- **Black**: Code formatting
+- **isort**: Import sorting
+- **Flake8**: Linting and style checks
+- **MyPy**: Type checking
+- **Basic hooks**: Trailing whitespace, file endings, etc.
+
+### 3. Environment Variables
+
+Copy `example.env` to `.env` and configure:
+
+```bash
+cp example.env .env
+```
+
+Variables to set (only `OPENAI_API_KEY` is required; `WATCHED_DIR` has a default):
+```env
+OPENAI_API_KEY=your_key_here  # Required for embeddings and chat
+WATCHED_DIR=/path/to/code     # Directory to index (default: current dir)
+```
+
+## Code Quality Standards
+
+### Type Hints
+All functions should have type hints:
+
+```python
+def process_file(filepath: str, content: str) -> Optional[np.ndarray]:
+    """Process a file and return embeddings."""
+    ...
+```
+
+### Error Handling
+Use structured logging and proper exception handling:
+
+```python
+import logging
+logger = logging.getLogger(__name__)
+
+try:
+    result = risky_operation()
+except SpecificError as e:
+    logger.error("Operation failed: %s", e)
+    result = None
+```
+
+### Documentation
+Use concise docstrings for public functions:
+
+```python
+def search_code(query: str, k: int = 5) -> List[Dict[str, Any]]:
+    """Search the FAISS index using a text query.
+
+    Args:
+        query: The search query text
+        k: Number of results to return
+
+    Returns:
+        List of search results with metadata
+    """
+```
+
+## Testing Your Changes
+
+### Manual Testing
+```bash
+# Test backend indexing
+python main.py
+
+# Test Streamlit UI (separate terminal)
+streamlit run app.py
+```
+
+### Code Quality Checks
+```bash
+# Format code
+black .
+isort .
+
+# Check linting
+flake8 . --max-line-length=88 --ignore=E203,W503
+
+# Type checking
+mypy .
+
+# Run all pre-commit checks
+pre-commit run --all-files
+```
+
+## Adding New Features
+
+1. **Create feature branch**: `git checkout -b feature/new-feature`
+2. **Add logging**: Use the logger for all operations
+3. **Add type hints**: Follow existing patterns
+4. **Handle errors**: Graceful degradation and user-friendly messages
+5. **Update tests**: Add tests for new functionality
+6. **Update docs**: Update README if needed
+
+## Architecture Guidelines
+
+### Keep It Simple
+- Maintain the single-responsibility principle
+- Avoid unnecessary abstractions
+- Focus on the core RAG functionality
+
+### Error Handling Strategy
+- Log errors with context
+- Return None/empty lists for failures
+- Show user-friendly messages in UI
+- Don't crash the application
+
+### Performance Considerations
+- Limit search results (default: 5)
+- Truncate long content for context
+- Cache embeddings when possible
+- Monitor memory usage with large codebases
+
+## Debugging Tips
+
+### Enable Debug Logging
+```python
+logging.basicConfig(level=logging.DEBUG)
+```
+
+### Check Index Status
+```python
+from coderag.index import inspect_metadata
+inspect_metadata(5)  # Show first 5 entries
+```
+
+### Test Embeddings
+```python
+from coderag.embeddings import generate_embeddings
+result = generate_embeddings("test code")
+print(f"Shape: {result.shape if result is not None else 'None'}")
+```
+
+## Common Development Issues
+
+**Import Errors**
+- Ensure you're in the virtual environment
+- Check that `PYTHONPATH` includes the project root
+- Verify all dependencies are installed
+
+**OpenAI API Issues**
+- Check API key validity
+- Monitor rate limits and usage
+- Test with a simple embedding request
+
+**FAISS Index Corruption**
+- Delete existing index files and rebuild
+- Check file permissions
+- Ensure consistent embedding dimensions
+
+## Project Structure
+
+```
+CodeRAG/
+├── coderag/              # Core library
+│   ├── __init__.py
+│   ├── config.py         # Configuration management
+│   ├── embeddings.py     # OpenAI integration
+│   ├── index.py          # FAISS operations
+│   ├── search.py         # Search functionality
+│   └── monitor.py        # File monitoring
+├── scripts/              # Utility scripts
+├── tests/                # Test files
+├── .github/              # GitHub workflows
+├── main.py               # Backend service
+├── app.py                # Streamlit frontend
+├── prompt_flow.py        # RAG orchestration
+└── requirements.txt      # Dependencies
+```
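The "Error Handling Strategy" and "Performance Considerations" lists in DEVELOPMENT.md (log with context, return empty lists on failure, limit results to 5, truncate long content) can be combined in one wrapper. The sketch below is illustrative only: `safe_search`, the `"content"` key, and the `max_chars=500` default are assumptions, not names or values taken from `coderag.search`.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)

SearchFn = Callable[[str, int], List[Dict[str, Any]]]


def safe_search(
    search_fn: SearchFn, query: str, k: int = 5, max_chars: int = 500
) -> List[Dict[str, Any]]:
    """Call search_fn; on failure log with context and return an empty list."""
    try:
        results = search_fn(query, k)
    except Exception:
        # logger.exception records the traceback alongside the query context.
        logger.exception("Search failed for query %r (k=%d)", query, k)
        return []
    trimmed: List[Dict[str, Any]] = []
    for item in results[:k]:
        entry = dict(item)  # copy so the caller's data is not mutated
        content = entry.get("content")
        if isinstance(content, str):
            entry["content"] = content[:max_chars]
        trimmed.append(entry)
    return trimmed
```

A UI layer can then treat an empty list as "no results" and show a friendly message, while the log keeps the details of the failure.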