66 changes: 66 additions & 0 deletions .github/workflows/ci-tests.yml
@@ -0,0 +1,66 @@
name: CI Tests

on:
  push:
    branches: [ main, master, develop ]
  pull_request:
    branches: [ main, master, develop ]

jobs:
  test-imports:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Cache pip dependencies
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Test Import Structure
        run: |
          python -c "import coderag.config; print('✓ Config import successful')"
          python -c "import coderag.embeddings; print('✓ Embeddings import successful')"
          python -c "import coderag.index; print('✓ Index import successful')"
          python -c "import coderag.search; print('✓ Search import successful')"
          python -c "import coderag.monitor; print('✓ Monitor import successful')"
        env:
          OPENAI_API_KEY: dummy-key-for-testing

  quality-and-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.11
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install black flake8 isort mypy pytest
      - name: Lint and type-check
        run: |
          black --check .
          isort --check-only .
          flake8 . --max-line-length=88 --ignore=E203,W503
          mypy .
      - name: Run tests
        env:
          PYTHONPATH: ${{ github.workspace }}
        run: pytest -q
2 changes: 2 additions & 0 deletions .gitignore
@@ -27,3 +27,5 @@ node_modules/
*.tmp
plan.md
metadata.npy
test_env/
*.npy
36 changes: 36 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,36 @@
repos:
  - repo: https://github.com/psf/black
    rev: 23.12.1
    hooks:
      - id: black
        language_version: python3
        args: ['--line-length=88']

  - repo: https://github.com/pycqa/flake8
    rev: 7.0.0
    hooks:
      - id: flake8
        args: ['--max-line-length=88', '--ignore=E203,W503']

  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        args: ["--profile", "black"]

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0
    hooks:
      - id: mypy
        additional_dependencies: [types-all]
        args: [--ignore-missing-imports, --no-strict-optional]

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
      - id: check-merge-conflict
      - id: debug-statements
37 changes: 37 additions & 0 deletions AGENTS.md
@@ -0,0 +1,37 @@
# Repository Guidelines

## Project Structure & Module Organization
- `coderag/`: Core library (`config.py`, `embeddings.py`, `index.py`, `search.py`, `monitor.py`).
- `app.py`: Streamlit UI. `main.py`: backend/indexer. `prompt_flow.py`: RAG orchestration.
- `scripts/`: Utilities (e.g., `initialize_index.py`, `run_monitor.py`).
- `tests/`: Minimal checks (e.g., `test_faiss.py`).
- `example.env` → copy to `.env` for local secrets; CI lives in `.github/`.

## Build, Test, and Development Commands
- Create env: `python -m venv venv && source venv/bin/activate`.
- Install deps: `pip install -r requirements.txt`.
- Run backend: `python main.py` (indexes and watches `WATCHED_DIR`).
- Run UI: `streamlit run app.py`.
- Quick test: `python tests/test_faiss.py` (FAISS round‑trip sanity check).
- Quality suite: `pre-commit run --all-files` (black, isort, flake8, mypy, basics).

## Coding Style & Naming Conventions
- Formatting: Black (88 cols), isort profile "black"; run `black . && isort .`.
- Linting: flake8 with `--ignore=E203,W503` to match Black.
- Typing: mypy (py311 target; ignore missing imports OK). Prefer typed signatures and docstrings.
- Indentation: 4 spaces. Names: `snake_case` for files/functions, `PascalCase` for classes, constants `UPPER_SNAKE`.
- Imports: first‑party module is `coderag` (see `pyproject.toml`).

## Testing Guidelines
- Place tests in `tests/` as `test_*.py`. Keep unit tests deterministic; mock OpenAI calls where possible.
- Run directly (`python tests/test_faiss.py`) or with pytest if available (`pytest -q`).
- Ensure `.env` or env vars provide `OPENAI_API_KEY` for integration tests; avoid hitting rate limits in CI.
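
A minimal sketch of a deterministic test with the OpenAI call mocked out, assuming the OpenAI v1 client interface (`client.embeddings.create`). The helper `embed_text` and the model name are hypothetical stand-ins for illustration, not the project's actual code:

```python
from unittest.mock import MagicMock


def embed_text(client, text):
    """Hypothetical helper: request an embedding and return the raw vector."""
    # The model name is an assumption for this sketch, not the project's setting.
    response = client.embeddings.create(model="text-embedding-3-small", input=text)
    return response.data[0].embedding


def test_embed_text_uses_mocked_client():
    # Stand-in for an OpenAI client, so no network call is made.
    client = MagicMock()
    client.embeddings.create.return_value.data = [
        MagicMock(embedding=[0.1, 0.2, 0.3])
    ]

    vector = embed_text(client, "def add(a, b): return a + b")

    assert vector == [0.1, 0.2, 0.3]
    client.embeddings.create.assert_called_once()
```

Because the client is injected, the test needs neither a real `OPENAI_API_KEY` nor network access, which also keeps CI clear of rate limits.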

## Commit & Pull Request Guidelines
- Use the Conventional Commits-style prefixes seen in history: `feat:`, `fix:`, `docs:`, `ci:`, `refactor:`, `simplify:`.
- Before pushing: `pre-commit run --all-files` and update docs when behavior changes.
- PRs: clear description, linked issues, steps to validate; include screenshots/GIFs for UI changes; note config changes (`.env`).

## Security & Configuration Tips
- Never commit secrets. Start with `cp example.env .env`; set `OPENAI_API_KEY`, `WATCHED_DIR`, `FAISS_INDEX_FILE`.
- Avoid logging sensitive data. Regenerate the FAISS index if dimensions or models change (`python scripts/initialize_index.py`).
194 changes: 194 additions & 0 deletions DEVELOPMENT.md
@@ -0,0 +1,194 @@
# 🛠️ Development Guide

## Setting Up Development Environment

### 1. Clone and Setup

```bash
git clone https://github.com/your-username/CodeRAG.git
cd CodeRAG
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

### 2. Configure Pre-commit Hooks

```bash
pip install pre-commit
pre-commit install
```

This will run code quality checks on every commit:
- **Black**: Code formatting
- **isort**: Import sorting
- **Flake8**: Linting and style checks
- **MyPy**: Type checking
- **Basic hooks**: Trailing whitespace, file endings, etc.

### 3. Environment Variables

Copy `example.env` to `.env` and configure:

```bash
cp example.env .env
```

Variables to configure:
```env
OPENAI_API_KEY=your_key_here # Required for embeddings and chat
WATCHED_DIR=/path/to/code # Directory to index (default: current dir)
```
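
As a rough sketch of how these variables could be read at startup: the helper name `load_settings` and the returned keys are hypothetical, and the current-directory fallback mirrors the comment in the block above rather than any confirmed project behavior.

```python
import os
from typing import Dict, Mapping, Optional


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: read the two variables described above."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        # Fail early with an actionable message instead of failing mid-request.
        raise RuntimeError("OPENAI_API_KEY is not set; copy example.env to .env")
    return {
        "openai_api_key": api_key,
        # Assumed fallback: index the current directory when WATCHED_DIR is unset.
        "watched_dir": env.get("WATCHED_DIR", os.getcwd()),
    }
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.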

## Code Quality Standards

### Type Hints
All functions should have type hints:

```python
from typing import Optional
import numpy as np

def process_file(filepath: str, content: str) -> Optional[np.ndarray]:
    """Process a file and return embeddings."""
    ...
```

### Error Handling
Use structured logging and proper exception handling:

```python
import logging

logger = logging.getLogger(__name__)


def run_operation():
    # risky_operation and SpecificError are placeholders for your own code.
    try:
        return risky_operation()
    except SpecificError as e:
        logger.error("Operation failed: %s", e)
        return None
```

### Documentation
Use concise docstrings for public functions:

```python
from typing import Any, Dict, List


def search_code(query: str, k: int = 5) -> List[Dict[str, Any]]:
    """Search the FAISS index using a text query.

    Args:
        query: The search query text
        k: Number of results to return

    Returns:
        List of search results with metadata
    """
```

## Testing Your Changes

### Manual Testing
```bash
# Test backend indexing
python main.py

# Test Streamlit UI (separate terminal)
streamlit run app.py
```

### Code Quality Checks
```bash
# Format code
black .
isort .

# Check linting
flake8 .

# Type checking
mypy .

# Run all pre-commit checks
pre-commit run --all-files
```

## Adding New Features

1. **Create feature branch**: `git checkout -b feature/new-feature`
2. **Add logging**: Use the logger for all operations
3. **Add type hints**: Follow existing patterns
4. **Handle errors**: Graceful degradation and user-friendly messages
5. **Update tests**: Add tests for new functionality
6. **Update docs**: Update README if needed

## Architecture Guidelines

### Keep It Simple
- Maintain the single-responsibility principle
- Avoid unnecessary abstractions
- Focus on the core RAG functionality

### Error Handling Strategy
- Log errors with context
- Return None/empty lists for failures
- Show user-friendly messages in UI
- Don't crash the application
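
The strategy above can be sketched as a thin wrapper around a search call. The names `safe_search` and `search_fn` are hypothetical, and the broad `except Exception` is a deliberate choice for an outer boundary in this sketch, not a pattern taken from the codebase:

```python
import logging
from typing import Callable, List

logger = logging.getLogger(__name__)


def safe_search(query: str, search_fn: Callable[[str], List[dict]]) -> List[dict]:
    """Run search_fn and degrade to an empty list instead of raising."""
    try:
        return search_fn(query)
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback;
        # the query is included so the log entry carries context.
        logger.exception("Search failed for query %r", query)
        return []
```

A caller in the UI can then treat an empty list as "no results" and show a friendly message rather than a stack trace.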

### Performance Considerations
- Limit search results (default: 5)
- Truncate long content for context
- Cache embeddings when possible
- Monitor memory usage with large codebases
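
Two of these points, truncating long content and caching embeddings, can be illustrated with a short sketch. The character limit, the in-memory dictionary cache, and the function names are assumptions made for the example; the project may handle both differently:

```python
import hashlib
from typing import Callable, Dict, List

MAX_CONTEXT_CHARS = 2000  # assumed limit for this sketch, not a project constant

_embedding_cache: Dict[str, List[float]] = {}


def truncate(content: str, limit: int = MAX_CONTEXT_CHARS) -> str:
    """Cut content to at most `limit` characters, marking the cut with '...'."""
    return content if len(content) <= limit else content[:limit] + "..."


def cached_embedding(
    content: str, embed_fn: Callable[[str], List[float]]
) -> List[float]:
    """Call embed_fn only the first time a given content string is seen."""
    # Hash the content so the cache key stays small even for large files.
    key = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        _embedding_cache[key] = embed_fn(content)
    return _embedding_cache[key]
```

Keying on a content hash means an unchanged file is never re-embedded, which saves API calls when a watcher fires repeatedly for the same content.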

## Debugging Tips

### Enable Debug Logging
```python
logging.basicConfig(level=logging.DEBUG)
```

### Check Index Status
```python
from coderag.index import inspect_metadata
inspect_metadata(5) # Show first 5 entries
```

### Test Embeddings
```python
from coderag.embeddings import generate_embeddings
result = generate_embeddings("test code")
print(f"Shape: {result.shape if result is not None else 'None'}")
```

## Common Development Issues

**Import Errors**
- Ensure you're in the virtual environment
- Check PYTHONPATH includes project root
- Verify all dependencies are installed

**OpenAI API Issues**
- Check API key validity
- Monitor rate limits and usage
- Test with a simple embedding request

**FAISS Index Corruption**
- Delete existing index files and rebuild
- Check file permissions
- Ensure consistent embedding dimensions
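
A quick way to check for inconsistent dimensions before rebuilding is to compare each vector's length with the dimension the index expects (a FAISS index exposes this as `index.d`). The helper below is a hypothetical, dependency-free sketch that works on plain sequences:

```python
from typing import List, Sequence


def find_dimension_mismatches(
    vectors: Sequence[Sequence[float]], expected_dim: int
) -> List[int]:
    """Return the positions of vectors whose length differs from expected_dim."""
    return [i for i, vec in enumerate(vectors) if len(vec) != expected_dim]
```

An empty result means every vector matches; any returned position points at an embedding that was likely produced by a different model or configuration.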

## Project Structure

```
CodeRAG/
├── coderag/              # Core library
│   ├── __init__.py
│   ├── config.py         # Configuration management
│   ├── embeddings.py     # OpenAI integration
│   ├── index.py          # FAISS operations
│   ├── search.py         # Search functionality
│   └── monitor.py        # File monitoring
├── scripts/              # Utility scripts
├── tests/                # Test files
├── .github/              # GitHub workflows
├── main.py               # Backend service
├── app.py                # Streamlit frontend
├── prompt_flow.py        # RAG orchestration
└── requirements.txt      # Dependencies
```