Add a new prompt & tool for diagnosing FPs/FNs from query runs.#70
Conversation
Dependency ReviewThe following issues were found:
License Issuesserver/package.json
OpenSSF Scorecard
Scanned Files
|
|
Mostly written by a local agent, especially the zip decoding stuff. |
There was a problem hiding this comment.
Pull request overview
This PR adds a new MCP tool and workflow prompt to help developers analyze false positives in their CodeQL query results by reading source code directly from database archives. The implementation enables agents to explore code at alert locations without requiring the original source tree, which is particularly valuable for diagnosing query precision issues.
Changes:
- Added
read_database_sourcetool to read source files from CodeQL database archives (src.zip or src/ directory) - Added
run_query_and_summarize_false_positivesprompt to guide agents through FP analysis workflows - Added comprehensive test coverage with 12 test cases covering both src.zip and src/ directory modes
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| server/src/tools/codeql/read-database-source.ts | New tool implementation with archive reading, path resolution, and line-range slicing capabilities |
| server/test/src/tools/codeql/read-database-source.test.ts | Comprehensive test suite covering listing, reading, URI resolution, line ranges, and error cases |
| server/src/tools/codeql/index.ts | Export for the new read_database_source tool |
| server/src/tools/codeql-tools.ts | Registration of the new tool in alphabetical order |
| server/src/prompts/run-query-and-summarize-false-positives.prompt.md | New prompt with YAML frontmatter, task instructions, and output format guidance |
| server/src/prompts/workflow-prompts.ts | Schema and registration for the new prompt |
| server/test/src/tools/codeql-tools.test.ts | Updated tool count from 34 to 35 |
| server/test/src/prompts/workflow-prompts.test.ts | Updated prompt count from 10 to 11 |
| server/package.json | Added adm-zip dependency and @types/adm-zip dev dependency |
| package-lock.json | Added dependency entries for adm-zip packages; contains engine version inconsistency |
| docs/ql-mcp/prompts.md | Updated prompt count and added new prompt to reference table |
server/src/prompts/run-query-and-summarize-false-positives.prompt.md
Outdated
Show resolved
Hide resolved
server/src/prompts/run-query-and-summarize-false-positives.prompt.md
Outdated
Show resolved
Hide resolved
server/src/prompts/run-query-and-summarize-false-positives.prompt.md
Outdated
Show resolved
Hide resolved
Rather than relying on SARIF, give the MCP server a tool to allow agents to read the source code from the zip archive of a database. Create a prompt that instructs the agent to try to find a prior runs, rerun if necessary, and then use this tool to find FPs/FNs.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Nathan Randall <70299490+data-douser@users.noreply.github.com>
b85e017 to
1b02f68
Compare
Adds client integration tests for the new 'read_database_source' MCP server tool.
server/src/prompts/run-query-and-summarize-false-positives.prompt.md
Outdated
Show resolved
Hide resolved
server/src/prompts/run-query-and-summarize-false-positives.prompt.md
Outdated
Show resolved
Hide resolved
- Register run-query-and-summarize-false-positives prompt in PROMPT_TEMPLATES so it is embedded in the bundle instead of returning the fallback message - Replace backtick tool references with #tool_name form in the prompt template for consistent Copilot Chat rendering - Fix CRLF line splitting in applyLineRange (split on /\r?\n/) - Add default listing cap (1000 entries) to prevent oversized MCP responses from large src.zip archives - Enforce max uncompressed size (10 MB) before decompressing zip entries - Use dirname() instead of join(.., '..') in test helper for clarity
| function* walkDirectory(dir: string, base: string = dir): Generator<string> { | ||
| for (const entry of readdirSync(dir)) { | ||
| const fullPath = join(dir, entry); | ||
| if (statSync(fullPath).isDirectory()) { | ||
| yield* walkDirectory(fullPath, base); | ||
| } else { | ||
| // Return path relative to base, normalised with forward slashes | ||
| yield fullPath.slice(base.length).replace(/\\/g, '/').replace(/^\//, ''); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The walkDirectory function does not explicitly check for or skip symbolic links when recursively traversing the src/ directory. This could potentially allow reading files outside the intended database source directory if symlinks are present. Consider adding a check to skip symlinks or resolve them safely within the database boundary.
For example, you could check lstatSync instead of statSync to detect symlinks before following them, or verify that resolved paths remain within the database directory.
| // 3. Suffix match — collect all candidates, prefer the longest (most specific) | ||
| const candidates: { entry: string; length: number }[] = []; | ||
| for (const entry of available) { | ||
| const entryNorm = entry.replace(/^\//, ''); | ||
| if (entryNorm.endsWith(withoutLeading) || withoutLeading.endsWith(entryNorm)) { | ||
| candidates.push({ entry, length: entryNorm.length }); | ||
| } | ||
| } |
There was a problem hiding this comment.
The suffix matching logic at line 143 uses a bidirectional match condition: entryNorm.endsWith(withoutLeading) || withoutLeading.endsWith(entryNorm). The second condition (withoutLeading.endsWith(entryNorm)) could match unintended entries when requesting a longer path that happens to end with a shorter entry name.
For example, if requesting foo/bar/File.java and the archive contains just File.java, this would match even though the requested path has additional directory components. This behavior seems intentional based on the comment about "handles absolute SARIF URIs", but the bidirectional match could lead to unexpected results in some cases.
Consider documenting this behavior more explicitly or restricting the second condition to only apply when it makes sense semantically.
Rather than relying on SARIF, give the MCP server a tool to allow agents to read the source code from the zip archive of a database.
Create a prompt that instructs the agent to try to find a prior runs, rerun if necessary, and then use this tool to find FPs/FNs.
Example results of usage:
False Positive Analysis on gRPC
Redis: 0 results — clean.
gRPC: 290 results — all false positives, one root cause.
Root cause:
ABSL_DCHECK→ assertion failure handler →demangle.ccEvery alert points to
MapKeyComparator::operator()atwire_format.cc:1176:The
ABSL_DCHECKmacro expands to code that, on assertion failure, invokes a diagnostic handler that calls into abseil's stack symbolizer, which callsDemangle()indemangle.cc. That function operates on a locally-allocatedState*struct:This produces ~290 duplicate alerts for one call site, one per pointer-write in
demangle.cc.There are two compounding false positive mechanisms here:
AliasVariableModificationGlobalSideEffect. HereState*is locally allocated — not a persistent global. The query can't distinguish pointer-to-local from pointer-to-global.ABSL_DCHECKis only executed on crash/assertion paths in debug builds. The query treats all reachable call paths equally, with no distinction for assertion/diagnostic code.Recommended fix: The most impactful single change would be to restrict to direct side effects only — replacing
getAnExternalOrGlobalSideEffectInFunction(f)(which recurses through all callees) with only effects whereeffect.getEnclosingFunction() = f. This eliminates all transitive false positives at the cost of missing cases where a predicate calls a small helper that performs a side effect.