Skip to content

A Model Context Protocol (MCP) server that builds rich code graphs to provide deep contextual understanding of TypeScript codebases to Large Language Models.

License

Notifications You must be signed in to change notification settings

drewdrewH/code-graph-context

Repository files navigation

Code Graph Context

npm version MIT License TypeScript Neo4j NestJS OpenAI MCP

Give your AI coding assistant a photographic memory of your codebase.

Code Graph Context is an MCP server that builds a semantic graph of your TypeScript codebase, enabling Claude to understand not just individual files, but how your entire system fits together.

Config-Driven & Extensible: Define custom framework schemas to capture domain-specific patterns beyond the included NestJS support. The parser is fully configurable to recognize your architectural patterns, decorators, and relationships.

                    ┌─────────────────────────────────────────────────────────────┐
                    │                     YOUR CODEBASE                           │
                    │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
                    │  │ Service  │  │Controller│  │  Module  │  │  Entity  │    │
                    │  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘    │
                    └───────┼─────────────┼─────────────┼─────────────┼──────────┘
                            │             │             │             │
                            ▼             ▼             ▼             ▼
                    ┌─────────────────────────────────────────────────────────────┐
                    │                   CODE GRAPH CONTEXT                        │
                    │                                                             │
                    │   AST Parser ──► Neo4j Graph ──► Vector Embeddings          │
                    │   (ts-morph)     (Relationships)  (OpenAI)                  │
                    │                                                             │
                    └─────────────────────────────────────────────────────────────┘
                                                │
                                                ▼
                    ┌─────────────────────────────────────────────────────────────┐
                    │                      CLAUDE CODE                            │
                    │                                                             │
                    │   "What services depend on UserService?"                    │
                    │   "What's the blast radius if I change this function?"      │
                    │   "Find all HTTP endpoints that accept a UserDTO"           │
                    │   "Refactor this across all 47 files that use it"           │
                    │                                                             │
                    └─────────────────────────────────────────────────────────────┘

Why Code Graph Context?

Without Code Graph With Code Graph
Claude reads files one at a time Claude understands the entire dependency tree
"What uses this?" requires manual searching Instant impact analysis with risk scoring
Refactoring misses edge cases Graph traversal finds every reference
Large codebases overwhelm context Semantic search finds exactly what's relevant
Multi-file changes are error-prone Swarm agents coordinate parallel changes

Features

  • Multi-Project Support: Parse and query multiple projects in a single database with complete isolation
  • Semantic Search: Vector-based search using OpenAI embeddings to find relevant code
  • Natural Language Querying: Convert questions into Cypher queries
  • Framework-Aware: Built-in NestJS schema with ability to define custom framework patterns
  • Weighted Graph Traversal: Intelligent traversal scoring paths by importance and relevance
  • Workspace Support: Auto-detects Nx, Turborepo, pnpm, Yarn, and npm workspaces
  • Parallel & Async Parsing: Multi-threaded parsing with Worker threads for large codebases
  • Streaming Import: Chunked processing for projects with 100+ files
  • Incremental Parsing: Only reparse changed files
  • File Watching: Real-time graph updates on file changes
  • Impact Analysis: Assess refactoring risk (LOW/MEDIUM/HIGH/CRITICAL)
  • Dead Code Detection: Find unreferenced exports with confidence scoring
  • Duplicate Detection: Structural (AST hash) and semantic (embedding similarity) duplicates
  • Swarm Coordination: Multi-agent stigmergic coordination with pheromone decay

Architecture

TypeScript Source → AST Parser (ts-morph) → Neo4j Graph + Vector Embeddings → MCP Tools

Core Components:

  • src/core/parsers/typescript-parser.ts - AST parsing with ts-morph
  • src/storage/neo4j/neo4j.service.ts - Graph storage and queries
  • src/core/embeddings/embeddings.service.ts - OpenAI embeddings
  • src/mcp/mcp.server.ts - MCP server and tool registration

Dual-Schema System:

  • Core Schema: AST-level nodes (ClassDeclaration, MethodDeclaration, ImportDeclaration, etc.)
  • Framework Schema: Semantic interpretation (NestController, NestService, HttpEndpoint, etc.)

Nodes have both coreType (AST) and semanticType (framework meaning), enabling queries like "find all controllers" while maintaining AST precision.

Quick Start

Prerequisites

  • Node.js >= 18
  • Docker (for Neo4j)
  • OpenAI API Key

1. Install

npm install -g code-graph-context
code-graph-context init  # Sets up Neo4j via Docker

2. Configure Claude Code

Add to Claude Code with your OpenAI API key:

claude mcp add --scope user code-graph-context \
  -e OPENAI_API_KEY=sk-your-key-here \
  -- code-graph-context

That's it. Restart Claude Code and you're ready to go.

3. Parse Your Project

In Claude Code, say:

"Parse this project and build the code graph"

Claude will run parse_typescript_project and index your codebase.


Configuration Files

Claude Code stores MCP server configs in JSON files. The location depends on scope:

Scope File Use Case
User (global) ~/.claude.json Available in all projects
Project .claude.json in project root Project-specific config
Local .mcp.json in project root Git-ignored local overrides

Manual Configuration

If you prefer to edit the config files directly:

~/.claude.json (user scope - recommended):

{
  "mcpServers": {
    "code-graph-context": {
      "command": "code-graph-context",
      "env": {
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

From source installation:

{
  "mcpServers": {
    "code-graph-context": {
      "command": "node",
      "args": ["/absolute/path/to/code-graph-context/dist/cli/cli.js"],
      "env": {
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}

Environment Variables

Variable Required Default Description
OPENAI_API_KEY Yes - For embeddings and NL queries
NEO4J_URI No bolt://localhost:7687 Neo4j connection URI
NEO4J_USER No neo4j Neo4j username
NEO4J_PASSWORD No PASSWORD Neo4j password

Core Capabilities

Semantic Code Search

Find code by describing what you need, not by memorizing file paths:

"Find where user authentication tokens are validated"
"Show me the database connection pooling logic"
"What handles webhook signature verification?"

Impact Analysis

Before you refactor, understand the blast radius:

┌─────────────────────────────────────────────────────────────┐
│ Impact Analysis: UserService.findById()                     │
├─────────────────────────────────────────────────────────────┤
│ Risk Level: HIGH                                            │
│                                                             │
│ Direct Dependents (12):                                     │
│   └── AuthController.login()                                │
│   └── ProfileController.getProfile()                        │
│   └── AdminService.getUserDetails()                         │
│   └── ... 9 more                                            │
│                                                             │
│ Transitive Dependents (34):                                 │
│   └── 8 controllers, 15 services, 11 tests                  │
│                                                             │
│ Affected Files: 23                                          │
│ Recommendation: Add deprecation warning before changing     │
└─────────────────────────────────────────────────────────────┘

Graph Traversal

Explore relationships in any direction:

UserController
    │
    ├── INJECTS ──► UserService
    │                   │
    │                   ├── INJECTS ──► UserRepository
    │                   │                   │
    │                   │                   └── MANAGES ──► User (Entity)
    │                   │
    │                   └── INJECTS ──► CacheService
    │
    └── EXPOSES ──► POST /users
                        │
                        └── ACCEPTS ──► CreateUserDTO

Dead Code Detection

Find code that can be safely removed:

Dead Code Analysis: 47 items found
├── HIGH confidence (23): Exported but never imported
│   └── formatLegacyDate() in src/utils/date.ts:45
│   └── UserV1DTO in src/dto/legacy/user.dto.ts:12
│   └── ... 21 more
├── MEDIUM confidence (18): Private, never called
└── LOW confidence (6): May be used dynamically

Duplicate Code Detection

Identify DRY violations across your codebase:

Duplicate Groups Found: 8

Group 1 (Structural - 100% identical):
├── validateEmail() in src/auth/validation.ts:23
└── validateEmail() in src/user/validation.ts:45
    Recommendation: Extract to shared utils

Group 2 (Semantic - 94% similar):
├── parseUserInput() in src/api/parser.ts:78
└── sanitizeInput() in src/webhook/parser.ts:34
    Recommendation: Review for consolidation

Swarm Coordination

Execute complex, multi-file changes with parallel AI agents.

The swarm system enables multiple Claude agents to work on your codebase simultaneously, coordinating through the code graph without stepping on each other.

                         ┌──────────────────┐
                         │   ORCHESTRATOR   │
                         │                  │
                         │ "Add JSDoc to    │
                         │  all services"   │
                         └────────┬─────────┘
                                  │
                    ┌─────────────┼─────────────┐
                    │             │             │
                    ▼             ▼             ▼
             ┌──────────┐  ┌──────────┐  ┌──────────┐
             │ Worker 1 │  │ Worker 2 │  │ Worker 3 │
             │          │  │          │  │          │
             │ Claiming │  │ Working  │  │ Claiming │
             │ AuthSvc  │  │ UserSvc  │  │ PaySvc   │
             └──────────┘  └──────────┘  └──────────┘
                    │             │             │
                    └─────────────┼─────────────┘
                                  │
                                  ▼
                    ┌─────────────────────────────┐
                    │      PHEROMONE TRAILS       │
                    │                             │
                    │  AuthService: [claimed]     │
                    │  UserService: [modifying]   │
                    │  PayService:  [claimed]     │
                    │  CacheService: [available]  │
                    │                             │
                    └─────────────────────────────┘

Two Coordination Mechanisms

1. Pheromone System (Stigmergic)

Agents leave markers on code nodes that decay over time—like ants leaving scent trails:

Pheromone Half-Life Meaning
exploring 2 min "I'm looking at this"
claiming 1 hour "This is my territory"
modifying 10 min "I'm actively changing this"
completed 24 hours "I finished work here"
warning Never "Don't touch this"
blocked 5 min "I'm stuck"

Self-healing: If an agent crashes, its pheromones decay and the work becomes available again.

2. Task Queue (Blackboard)

Explicit task management with dependencies:

┌─────────────────────────────────────────────────────────────┐
│                        TASK QUEUE                           │
├─────────────────────────────────────────────────────────────┤
│ [available] Add JSDoc to UserService         priority: high │
│ [claimed]   Add JSDoc to AuthService         agent: worker1 │
│ [blocked]   Update API docs ─────────────────► depends on ──┤
│ [in_progress] Add JSDoc to PaymentService    agent: worker2 │
│ [completed] Add JSDoc to CacheService        ✓              │
└─────────────────────────────────────────────────────────────┘

Swarm Tools

Tool Purpose
swarm_orchestrate Decompose a task and spawn worker agents
swarm_post_task Add a task to the queue
swarm_get_tasks Query tasks with filters
swarm_claim_task Claim/start/release a task
swarm_complete_task Complete/fail/request review
swarm_pheromone Leave a marker on a code node
swarm_sense Query what other agents are doing
swarm_cleanup Remove pheromones after completion

Example: Parallel Refactoring

// Orchestrator decomposes and creates tasks
swarm_orchestrate({
  projectId: "backend",
  task: "Rename getUserById to findUserById across the codebase",
  maxAgents: 3
})

// Returns a plan:
{
  swarmId: "swarm_abc123",
  plan: {
    totalTasks: 12,
    parallelizable: 8,
    sequential: 4,  // These have dependencies
    tasks: [
      { id: "task_1", title: "Update UserService.findUserById", status: "available" },
      { id: "task_2", title: "Update UserController references", status: "blocked", depends: ["task_1"] },
      ...
    ]
  },
  workerInstructions: "..."  // Copy-paste to spawn workers
}

Install the Swarm Skill

For optimal swarm execution, install the included Claude Code skill that teaches agents the coordination protocol:

# Copy to your global skills directory
mkdir -p ~/.claude/skills
cp -r skills/swarm ~/.claude/skills/

Or for a specific project:

cp -r skills/swarm .claude/skills/

The skill provides:

  • Worker agent protocol with step-by-step workflow
  • Multi-phase orchestration patterns (discovery, contracts, implementation, validation)
  • Common failure modes and how to prevent them
  • Complete tool reference

Once installed, just say "swarm" or "parallel agents" and Claude will use the skill automatically.

See skills/swarm/SKILL.md for the full documentation.


All Tools

Tool Description
Discovery
list_projects List parsed projects in the database
search_codebase Semantic search using vector embeddings
traverse_from_node Explore relationships from a node
natural_language_to_cypher Convert questions to Cypher queries
Analysis
impact_analysis Assess refactoring risk (LOW/MEDIUM/HIGH/CRITICAL)
detect_dead_code Find unreferenced exports and methods
detect_duplicate_code Find structural and semantic duplicates
Parsing
parse_typescript_project Build the graph from source
check_parse_status Monitor async parsing jobs
start_watch_project Auto-update graph on file changes
stop_watch_project Stop file watching
list_watchers List active file watchers
Swarm
swarm_orchestrate Plan and spawn parallel agents
swarm_post_task Add task to the queue
swarm_get_tasks Query tasks
swarm_claim_task Claim/start/release tasks
swarm_complete_task Complete/fail/review tasks
swarm_pheromone Leave coordination markers
swarm_sense Query what others are doing
swarm_cleanup Clean up after swarm completion
Utility
test_neo4j_connection Verify database connectivity

Tool Workflow Patterns

Pattern 1: Discovery → Focus → Deep Dive

list_projects → search_codebase → traverse_from_node → traverse (with skip for pagination)

Pattern 2: Pre-Refactoring Safety

search_codebase("function to change") → impact_analysis(nodeId) → review risk level

Pattern 3: Code Health Audit

detect_dead_code → detect_duplicate_code → prioritize cleanup

Pattern 4: Multi-Agent Work

swarm_orchestrate → spawn workers → swarm_get_tasks(includeStats) → swarm_cleanup

Multi-Project Support

All query tools require projectId for isolation. You can use:

  • Project ID: proj_a1b2c3d4e5f6 (auto-generated)
  • Project name: my-backend (from package.json)
  • Project path: /path/to/project (resolved automatically)
// These all work:
search_codebase({ projectId: "my-backend", query: "auth" })
search_codebase({ projectId: "proj_a1b2c3d4e5f6", query: "auth" })
search_codebase({ projectId: "/path/to/my-backend", query: "auth" })

Framework Support

NestJS (Built-in)

Deep understanding of NestJS patterns:

  • Controllers with route analysis (@Controller, @Get, @Post, etc.)
  • Services with dependency injection mapping (@Injectable)
  • Modules with import/export relationships (@Module)
  • Guards, Pipes, Interceptors as middleware chains
  • DTOs with validation decorators (@IsString, @IsEmail, etc.)
  • Entities with TypeORM relationship mapping

NestJS-Specific Relationships:

  • INJECTS - Dependency injection
  • EXPOSES - Controller exposes HTTP endpoint
  • MODULE_IMPORTS, MODULE_PROVIDES, MODULE_EXPORTS - Module system
  • GUARDED_BY, TRANSFORMED_BY, INTERCEPTED_BY - Middleware

Custom Framework Schemas

The parser is config-driven. Define your own framework patterns:

// Example: Custom React schema
const REACT_SCHEMA = {
  name: 'react',
  decoratorPatterns: [
    { pattern: /^use[A-Z]/, semanticType: 'ReactHook' },
    { pattern: /^with[A-Z]/, semanticType: 'HOC' },
  ],
  nodeTypes: [
    { coreType: 'FunctionDeclaration', condition: (node) => node.name?.endsWith('Provider'), semanticType: 'ContextProvider' },
  ],
  relationships: [
    { type: 'PROVIDES_CONTEXT', from: 'ContextProvider', to: 'ReactHook' },
  ]
};

The dual-schema system means every node has:

  • coreType: AST-level (ClassDeclaration, FunctionDeclaration)
  • semanticType: Framework meaning (NestController, ReactHook)

This enables queries like "find all hooks that use context" while maintaining AST precision for refactoring.


Troubleshooting

MCP Server Not Connecting

# Check the server is registered
claude mcp list

# Verify Neo4j is running
docker ps | grep neo4j

# Test manually
code-graph-context status

Missing OPENAI_API_KEY

Symptoms: "Failed to generate embedding" errors

Fix: Ensure the key is in your config file:

# Check current config
cat ~/.claude.json | grep -A5 "code-graph-context"

# Re-add with key
claude mcp remove code-graph-context
claude mcp add --scope user code-graph-context \
  -e OPENAI_API_KEY=sk-your-key-here \
  -- code-graph-context

Neo4j Memory Issues

For large codebases, increase memory limits:

# Stop and recreate with more memory
code-graph-context stop
code-graph-context init --memory 4G

Parsing Timeouts

Use async mode for large projects:

parse_typescript_project({
  projectPath: "/path/to/project",
  tsconfigPath: "/path/to/project/tsconfig.json",
  async: true  // Returns immediately, poll with check_parse_status
})

CLI Commands

code-graph-context init [options]   # Set up Neo4j container
code-graph-context status           # Check Docker/Neo4j status
code-graph-context stop             # Stop Neo4j container

Init options:

  • -p, --port <port> - Bolt port (default: 7687)
  • --http-port <port> - Browser port (default: 7474)
  • --password <password> - Neo4j password (default: PASSWORD)
  • -m, --memory <size> - Heap memory (default: 2G)
  • -f, --force - Recreate container

Contributing

git clone https://github.com/drewdrewH/code-graph-context.git
cd code-graph-context
npm install
npm run build
npm run dev  # Watch mode

Conventional Commits: feat|fix|docs|refactor(scope): description


License

MIT - see LICENSE


Links

About

A Model Context Protocol (MCP) server that builds rich code graphs to provide deep contextual understanding of TypeScript codebases to Large Language Models.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •