C++: Fix cartesian-like join in ExternalFlow.qll#20783
C++: Fix cartesian-like join in ExternalFlow.qll#20783MathiasVP merged 6 commits intogithub:mainfrom
ExternalFlow.qll#20783Conversation
6c9d5f4 to
dfdc2a6
Compare
There was a problem hiding this comment.
Pull Request Overview
This PR refactors the C++ dataflow external flow signature matching logic to properly utilize namespace information when identifying functions. Previously, namespace was noted as "not being used" in a comment, but this change integrates namespace checking throughout the signature matching process.
Key changes:
- Introduces a
getFunctionhelper predicate to centralize function identification logic based on namespace, type, and name - Extends
signatureMatchespredicate to include namespace as a parameter, enabling proper namespace-aware signature matching - Refactors
interpretElement0to use the new helper, improving code maintainability by reducing duplication
Reviewed Changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| cpp/ql/lib/semmle/code/cpp/dataflow/ExternalFlow.qll | Adds getFunction helper predicate, updates signatureMatches to use namespace parameter, refactors interpretElement0 to leverage new helper, removes outdated comment about namespace not being used, adjusts classHasQualifiedName bindingset annotation |
| cpp/ql/test/library-tests/dataflow/taint-tests/test_mad-signatures.ql | Updates test predicate arity from /5 to /6 to match the new signatureMatches signature |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Please disregard the "Pull Request Overview" comment by Copilot. It completely misses the point of this PR 😭 |
|
Note that the coding standards failure is my fault, please ignore. I know how to fix this. |
|
There seems to be a slowdown in DCA. |
I'll take a look. Note that the majority of the slowdown is coming from the unstable macOS project. But worth double checking anyway! Edit: Yeah, it looks like the code generated for the new version of |
…dingset's. The repeated inlining caused by 'bindiingset's created some #shared predicates with repeated regex matching.
| | stl.h:678:33:678:38 | format | std | (format_string,Args &&) | | format<Args> | 0 | | ||
| | stl.h:678:33:678:38 | format | std | (format_string,Args &&) | | format<Args> | 1 | | ||
| | stl.h:678:33:678:38 | format | std | (format_string,Args &&) | | format<Args> | 1 | | ||
| | taint.cpp:735:7:735:12 | malloc | | (size_t) | | malloc | 0 | |
There was a problem hiding this comment.
This change is expected since the signatureMatches predicate now has its columns restricted to only functions for which we have a MaD row. And since we don't have any MaD summary for malloc this signatureMatches result disappears
|
DCA was uneventful after 29a294f 🎉 |
The
signatureMatchespredicate only related theFunctioncolumn to theicolumn (and not to thenameortypecolumns) which meant that it returned waaaayyyy to manyFunctions.Functionally the behavior was correct since the resulting
Functionwas properly related to those entities in the caller (i.e, theinterpretElement0predicate). But it did mean we had a rather large blowup in the number of results for this predicate (which our tests also showed).Thanks for highlithing this to me, @jketema!
Commit-by-commit review recommended.