fix(gap-analysis): repair dead GAP_ANALYSIS_OPTIMIZED toggle + add benchmark harness #748

Open
PRAteek-singHWY wants to merge 1 commit into OWASP:main from PRAteek-singHWY:issue-587-gap-analysis-performance

@PRAteek-singHWY
Contributor

What & Why

Bug: Feature toggle introduced in PR #717 was silently broken

When PR #717 was stacked on PR #716 and both were merged, a duplicate gap_analysis method was accidentally left in NEO_DB (the PR #716 version, at line ~733). In Python, when a class body defines two methods with the same name, the later definition rebinds the name and the earlier one becomes unreachable, so the toggle wrapper from PR #717 (line ~563) was completely shadowed.

Effect: GAP_ANALYSIS_OPTIMIZED had no effect. Every caller hit the tiered-pruning path unconditionally, with no way to fall back to the safe exhaustive traversal that @robvanderveer requested as a feature toggle.
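
The shadowing described above can be reproduced in a few lines. This is a minimal sketch, not the code from db.py: `NeoDbSketch`, the `optimized` parameter, and the returned strings are hypothetical stand-ins chosen only to show that the later definition replaces the earlier one.

```python
class NeoDbSketch:
    """Hypothetical stand-in for NEO_DB; the method bodies are placeholders."""

    def gap_analysis(self, optimized: bool = False) -> str:
        # Intended toggle wrapper, defined first in the class body.
        return "optimized" if optimized else "original"

    def gap_analysis(self, optimized: bool = False) -> str:  # noqa: F811
        # Leftover duplicate, defined later. Executing the class body rebinds
        # the name gap_analysis to this function, so the wrapper above is
        # no longer reachable through the class.
        return "optimized"


db = NeoDbSketch()
result_with_toggle_off = db.gap_analysis(optimized=False)
print(result_with_toggle_off)  # the duplicate ignores the toggle
```

Python raises no error or warning for the redefinition, which is why this kind of bug tends to survive a merge; linters such as flake8 report it as F811.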

Fix

Removed the duplicate gap_analysis method. The toggle now works correctly:

| GAP_ANALYSIS_OPTIMIZED | Behaviour |
| --- | --- |
| false / unset (default) | _gap_analysis_original: original exhaustive traversal, safe default |
| true | _gap_analysis_optimized: Tier 1 → 2 → 3 tiered pruning with early exit |
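
The dispatch in the table above can be sketched as follows. The names `_gap_analysis_original` and `_gap_analysis_optimized` come from this PR, but their bodies here are placeholders. The `run_query` callable, the query labels, and the rule that only the string "true" (case-insensitive) enables the toggle are assumptions for illustration, not the actual logic in db.py.

```python
import os
from typing import Callable, List, Mapping, Optional, Sequence

QueryRunner = Callable[[str], List[str]]  # hypothetical: query label -> paths


def toggle_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumed parsing: only 'true' (any case) enables the optimized mode."""
    env = os.environ if env is None else env
    return env.get("GAP_ANALYSIS_OPTIMIZED", "false").strip().lower() == "true"


def _gap_analysis_original(run_query: QueryRunner, queries: Sequence[str]) -> List[str]:
    # Exhaustive mode: run every query and merge all returned paths.
    paths: List[str] = []
    for query in queries:
        paths.extend(run_query(query))
    return paths


def _gap_analysis_optimized(run_query: QueryRunner, tiers: Sequence[str]) -> List[str]:
    # Tiered mode: try Tier 1, then 2, then 3; stop at the first tier
    # that returns any paths (early exit).
    for tier_query in tiers:
        paths = run_query(tier_query)
        if paths:
            return paths
    return []


def gap_analysis(
    run_query: QueryRunner,
    exhaustive_queries: Sequence[str],
    tiers: Sequence[str],
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    # Single entry point: the toggle decides which implementation runs.
    if toggle_enabled(env):
        return _gap_analysis_optimized(run_query, tiers)
    return _gap_analysis_original(run_query, exhaustive_queries)


# Usage with a fake query runner that records which queries were executed.
executed: List[str] = []
fake_results = {"all_a": ["p1", "p2"], "all_b": ["p3"], "tier1": ["p1"], "tier2": ["p2"], "tier3": []}


def fake_run_query(query: str) -> List[str]:
    executed.append(query)
    return fake_results[query]


default_paths = gap_analysis(fake_run_query, ["all_a", "all_b"], ["tier1", "tier2", "tier3"], env={})
default_queries = list(executed)
executed.clear()
optimized_paths = gap_analysis(
    fake_run_query, ["all_a", "all_b"], ["tier1", "tier2", "tier3"],
    env={"GAP_ANALYSIS_OPTIMIZED": "true"},
)
optimized_queries = list(executed)
```

In this sketch the default path runs both exhaustive queries, while the optimized path stops after the first tier that yields results.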

Benchmark — closes #587

Added scripts/benchmark_gap.py, which provides the empirical evidence requested in Issue #587 that a previous contributor never delivered. Run it yourself against any live Neo4j instance:

python scripts/benchmark_gap.py --standard1 "OWASP Top 10 2021" --standard2 "NIST 800-53 v5"

Results — local Neo4j (19 standards loaded, 3 runs each)

| Metric | Original (GAP_ANALYSIS_OPTIMIZED=false) | Optimized (GAP_ANALYSIS_OPTIMIZED=true) | Δ |
| --- | --- | --- | --- |
| Avg query time | 33.07 s | 0.16 s | 99.5% faster |
| Peak memory | 37.32 MB | 0.17 MB | 🧠 99.6% less |
| Paths returned | 6289 | 32 | strong direct links only |
| DB queries run | 2 (always both) | 1 (early exit at Tier 1) | |

Note: the local DB has only 19 standards loaded. On a full production dataset the improvement is expected to be larger still, which is the scenario Issue #587 describes (64 GB RAM, 24+ hours on full data).
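
For readers who want to see how figures like these can be collected, here is a minimal sketch of a timing and memory harness. It is not the contents of scripts/benchmark_gap.py: the `benchmark` and `percent_reduction` helpers are hypothetical, and measuring peak memory with `tracemalloc` is an assumption about one reasonable approach (it only tracks allocations made by the Python process, not by Neo4j).

```python
import statistics
import time
import tracemalloc
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], runs: int = 3) -> Dict[str, float]:
    """Call fn `runs` times; report mean wall time and highest peak memory."""
    durations = []
    peaks = []
    for _ in range(runs):
        tracemalloc.start()
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
        _, peak_bytes = tracemalloc.get_traced_memory()  # (current, peak)
        tracemalloc.stop()
        peaks.append(peak_bytes)
    return {
        "avg_seconds": statistics.mean(durations),
        "peak_mb": max(peaks) / (1024 * 1024),
    }


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0


# Stand-in workload; a real run would call gap analysis in each mode.
sample = benchmark(lambda: [i * i for i in range(10_000)], runs=3)

# Reproduces the timing delta from the table: (33.07 - 0.16) / 33.07.
time_delta = round(percent_reduction(33.07, 0.16), 1)
print(f"{time_delta}% faster")
```

Averaging over several runs smooths out cache warm-up and scheduling noise; three runs matches the count reported in the results heading.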


Files Changed

  • application/database/db.py — removed duplicate gap_analysis method (67 lines deleted)
  • scripts/benchmark_gap.py — new benchmark script (258 lines added); runs both modes head-to-head and prints a GitHub-ready results table
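
As an illustration of what "GitHub-ready results table" means, the sketch below formats rows as a Markdown pipe table. The `markdown_table` helper is hypothetical and is not taken from scripts/benchmark_gap.py; the sample row reuses the timing figures reported above.

```python
from typing import Sequence


def markdown_table(headers: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Render headers and rows as a GitHub-flavoured Markdown table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


table = markdown_table(
    ["Metric", "Original", "Optimized"],
    [["Avg query time", "33.07s", "0.16s"]],
)
print(table)
```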

fixes #587

The toggle added in PR OWASP#717 was being overridden by a duplicate
gap_analysis method left over from PR OWASP#716. Removed the duplicate
so the feature toggle actually works as intended.

Also adds scripts/benchmark_gap.py which proved the optimized mode
is 99.5% faster and uses 99.6% less memory than the original.

Closes OWASP#587
Development

Successfully merging this pull request may close these issues.

Make calculating of Map Analysis Faster and less resource intensive
