Description
Description of the bug
When reimporting CodeQL SARIF into the same test with DD_DEDUPLICATION_ALGORITHM_PER_PARSER='{"SARIF":"unique_id_from_tool_or_hash_code"}', findings are closed even though the new SARIF contains the same vulnerability with an identical unique_id_from_tool. Only hash_code changes between scans, which is expected for a line/location change.
Given the metadata below, I expect DefectDojo to keep the existing finding open and reuse it (same vuln, new location), but instead some findings are being marked closed and no new findings are created.
Environment
- DefectDojo edition: Community
- DefectDojo version: 2.53.5
- Deployment: docker-compose
- Relevant env var:
  DD_DEDUPLICATION_ALGORITHM_PER_PARSER: '{"SARIF": "unique_id_from_tool_or_hash_code"}'
- Test type: SARIF
- Endpoint used: /api/v2/reimport-scan/
- close_old_findings on reimport: true
- Same Product → same Engagement → same Test for both scans
What I’m doing
- Initial CodeQL SARIF upload to a SARIF test via reimport-scan (the first time behaves like an import).
- Later CodeQL SARIF reimport into the same test using /api/v2/reimport-scan/ with close_old_findings=true.
- The code has changed, so the line/location changed, but the logical vuln is the same.
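For reference, the reimport call from the steps above can be sketched as follows. This is only an illustration of the request shape: the base URL, token, and test ID are placeholders, and the helper name build_reimport_request is hypothetical. The form fields test, scan_type, and close_old_findings are the ones I pass to /api/v2/reimport-scan/; the SARIF file goes in the multipart field file.

```python
from urllib.parse import urljoin


def build_reimport_request(base_url: str, api_token: str, test_id: int) -> dict:
    """Assemble the parts of a reimport call (hypothetical helper).

    The SARIF report itself would be attached as the multipart field
    "file" when the request is actually sent.
    """
    return {
        "url": urljoin(base_url, "/api/v2/reimport-scan/"),
        "headers": {"Authorization": f"Token {api_token}"},
        "data": {
            "test": str(test_id),          # same Test for both scans
            "scan_type": "SARIF",
            "close_old_findings": "true",  # the setting under discussion
        },
    }


# Placeholder values, not my real instance.
request = build_reimport_request("https://defectdojo.example.com", "REDACTED", 123)
print(request["url"])
```

With an HTTP client such as requests, this would be sent as a multipart POST with the file opened in binary mode.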
Observed behavior
- Some findings are being closed (mitigated/inactive) after reimport.
- No new findings are created for those vulns.
- In SARIF Explorer and in the DefectDojo API, I can see that the vuln is still present in the new SARIF and the unique_id_from_tool value is identical between fresh and reimport scans.
- Only hash_code changes between scans.
Expected behavior
With DD_DEDUPLICATION_ALGORITHM_PER_PARSER='{"SARIF":"unique_id_from_tool_or_hash_code"}' I expect:
- On reimport into the same test:
  - A new SARIF result with the same unique_id_from_tool should be matched to the existing finding and keep it open.
  - No new finding should be created (since it's the same vuln).
  - close_old_findings=true should only close findings that have no matching unique_id_from_tool or hash_code in the new SARIF.
In other words: same test + same unique_id_from_tool + different location/hash_code should result in one open finding, not a closed finding and no new one.
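To make the expectation concrete, here is a minimal sketch of the matching semantics I am assuming. It is not DefectDojo's implementation; the Finding class and the reimport function are hypothetical names used only to illustrate "match on unique_id_from_tool, else on hash_code, then close whatever was not matched".

```python
from dataclasses import dataclass


@dataclass
class Finding:
    unique_id_from_tool: str
    hash_code: str
    active: bool = True


def reimport(existing, incoming, close_old_findings=True):
    """Illustrative model of the expected behavior, not DefectDojo code.

    Returns (created, closed): findings that had no match and would be
    created, and existing findings that were not seen and got closed.
    """
    seen = set()  # indexes of existing findings matched by the new report
    created = []
    for new in incoming:
        match = next(
            (
                i
                for i, old in enumerate(existing)
                if i not in seen
                and (
                    (new.unique_id_from_tool
                     and new.unique_id_from_tool == old.unique_id_from_tool)
                    or new.hash_code == old.hash_code
                )
            ),
            None,
        )
        if match is None:
            created.append(new)
        else:
            seen.add(match)
            existing[match].active = True  # matched findings stay open

    closed = []
    if close_old_findings:
        for i, old in enumerate(existing):
            if i not in seen:
                old.active = False
                closed.append(old)
    return created, closed


# Same unique_id_from_tool, different hash_code (placeholder values).
existing = [Finding("id-A", "hash-1")]
incoming = [Finding("id-A", "hash-2")]
created, closed = reimport(existing, incoming)
print(len(created), len(closed), existing[0].active)
```

Under this model, the case from this issue yields no created finding, no closed finding, and the existing finding stays active.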
Example metadata
Below are 3 findings, with metadata from the initial (fresh) scan and the reimported scan.
Fresh scan metadata:
- unique_id_from_tool: primaryLocationLineHash:2abebf2b9f7e8f07:1|primaryLocationStartColumnFingerprint:14
  hash_code: fd88a20b1bd5ce0674cfa22284f08e7cdb3d3369e58d7969afdd03b01b041fa5
- unique_id_from_tool: primaryLocationLineHash:7c1ccbae89e35318:1|primaryLocationStartColumnFingerprint:13
  hash_code: 60caa862d2020e301b64b4f52ae88cfbbce991196ffddd417e50af9274d05980
- unique_id_from_tool: primaryLocationLineHash:db9e4b3bba297e41:1|primaryLocationStartColumnFingerprint:12
  hash_code: f1c6f6c4cccb48980e93b8758e42e48ae2cf7188f75b402fbfc7ce97b146e18c
Reimport scan metadata (same test, same vulns with changed location):
- unique_id_from_tool: primaryLocationLineHash:2abebf2b9f7e8f07:1|primaryLocationStartColumnFingerprint:14
  hash_code: 1a49e4cdb4a19cc434a3da0a8a92ac8546487b3043a11f70205167cde9f3a908
- unique_id_from_tool: primaryLocationLineHash:7c1ccbae89e35318:1|primaryLocationStartColumnFingerprint:13
  hash_code: fa51a69946d6090670522afa22b81e56c6b189cdbaf48dda6f1f95caab230a85
- unique_id_from_tool: primaryLocationLineHash:db9e4b3bba297e41:1|primaryLocationStartColumnFingerprint:12
  hash_code: e5c242ce20f8cdf6878235c46e01dc4011853a9979c2cde4af7ecc223b04acbf
For all 3:
- unique_id_from_tool is identical between fresh and reimport.
- hash_code is different between fresh and reimport (expected for location change).
Despite this, some of these findings are being closed after reimport and no new finding is created.
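The comparison of the two metadata sets can also be checked mechanically. The snippet below uses the values listed in this issue, keyed by unique_id_from_tool; the variable names are only for illustration.

```python
# unique_id_from_tool -> hash_code, copied from the metadata in this issue.
fresh = {
    "primaryLocationLineHash:2abebf2b9f7e8f07:1|primaryLocationStartColumnFingerprint:14":
        "fd88a20b1bd5ce0674cfa22284f08e7cdb3d3369e58d7969afdd03b01b041fa5",
    "primaryLocationLineHash:7c1ccbae89e35318:1|primaryLocationStartColumnFingerprint:13":
        "60caa862d2020e301b64b4f52ae88cfbbce991196ffddd417e50af9274d05980",
    "primaryLocationLineHash:db9e4b3bba297e41:1|primaryLocationStartColumnFingerprint:12":
        "f1c6f6c4cccb48980e93b8758e42e48ae2cf7188f75b402fbfc7ce97b146e18c",
}
reimported = {
    "primaryLocationLineHash:2abebf2b9f7e8f07:1|primaryLocationStartColumnFingerprint:14":
        "1a49e4cdb4a19cc434a3da0a8a92ac8546487b3043a11f70205167cde9f3a908",
    "primaryLocationLineHash:7c1ccbae89e35318:1|primaryLocationStartColumnFingerprint:13":
        "fa51a69946d6090670522afa22b81e56c6b189cdbaf48dda6f1f95caab230a85",
    "primaryLocationLineHash:db9e4b3bba297e41:1|primaryLocationStartColumnFingerprint:12":
        "e5c242ce20f8cdf6878235c46e01dc4011853a9979c2cde4af7ecc223b04acbf",
}

# Every unique_id_from_tool from the fresh scan is present in the reimport.
same_ids = fresh.keys() == reimported.keys()

# IDs whose hash_code differs between the two scans.
changed_hashes = [uid for uid in fresh if fresh[uid] != reimported[uid]]

print(same_ids, len(changed_hashes))  # prints: True 3
```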
Why this seems wrong
According to the deduplication docs and UNIQUE_ID_FROM_TOOL_OR_HASH_CODE semantics, if an incoming finding has the same unique_id_from_tool as an existing one, they should be considered the same logical finding (docs.defectdojo).
In a reimport-scan for the same test, that should keep the existing finding open and mark it as "seen"; close_old_findings=true should only close findings that had no matching unique_id_from_tool or hash_code in the new scan (docs.defectdojo).
Here, reimport behaves as if the finding was not seen, even though there is a matching unique_id_from_tool in the new SARIF. This looks similar to other reports where valid findings get mitigated on reimport (github).
Request
- Can you please confirm the intended behavior of reimport-scan with UNIQUE_ID_FROM_TOOL_OR_HASH_CODE for SARIF?
- If my understanding is correct, can this be treated as a bug in the close_old_findings logic for SARIF reimport?