Swift: mass enable diff-informed data flow#19662
Conversation
There was a problem hiding this comment.
Pull Request Overview
Enables diff-informed incremental data flow analysis across numerous Swift security queries by adding the required predicate and updates the change notes to reflect this new capability.
- Added
observeDiffInformedIncrementalMode()predicate to all relevant ConfigSig and StateConfigSig modules to opt into diff-informed mode. - Updated
2025-06-04-diff-informed.mdchange notes to document the new feature.
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| swift/ql/lib/codeql/swift/security/XXEQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/WeakPasswordHashingQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/UnsafeUnpackQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/UnsafeJsEvalQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/UncontrolledFormatStringQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/StringLengthConflationQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/StaticInitializationVectorQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/SqlInjectionQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/PredicateInjectionQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/PathInjectionQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/InsufficientHashIterationsQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/HardcodedEncryptionKeyQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/ECBEncryptionQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/ConstantSaltQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/ConstantPasswordQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/CommandInjectionQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/CleartextTransmissionQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/codeql/swift/security/CleartextLoggingQuery.qll | Added diff-informed incremental mode predicate |
| swift/ql/lib/change-notes/2025-06-04-diff-informed.md | Documented that built-in Swift queries can now run in diff-informed mode |
Comments suppressed due to low confidence (2)
swift/ql/lib/change-notes/2025-06-04-diff-informed.md:4
- [nitpick] This note is concise but could benefit from listing the specific modules or providing a link to detailed docs so users know exactly which queries are affected.
* A number of built-in Swift queries can now run in diff-informed mode.
swift/ql/lib/codeql/swift/security/WeakSensitiveDataHashingQuery.qll:42
- There are no accompanying tests to verify that diff-informed incremental mode behaves as expected. Consider adding unit or integration tests to cover this new predicate in incremental runs.
predicate observeDiffInformedIncrementalMode() { any() }
| any(CleartextLoggingAdditionalFlowStep s).step(n1, n2) | ||
| } | ||
|
|
||
| predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
What does this mean?
(I wrote many of these queries, I'm sure I can figure out if it's appropriate)
There was a problem hiding this comment.
This line enables diff-informed mode, i.e. only showing results that are in a diff range. There is currently some internal documentation here, not sure if it's documented publicly yet.
geoffw0
left a comment
There was a problem hiding this comment.
Happy with all of these. 👍
The only Swift queries that don't select dataflow sources and sinks in the usual way are the regular expression queries, which you haven't tagged (apart from regex injection, which is indeed "simple" and you have correctly tagged it).
The other regex queries use three(?) data flow configurations to track the flow of regular expression strings, regular expression objects and regular expression configuration flag, putting things together at the site of the regex eval. We probably could make those work as well with a bit of effort (I don't recommend prioritizing it).
| any(InsecureTlsExtensionsAdditionalFlowStep s).step(nodeFrom, nodeTo) | ||
| } | ||
|
|
||
| predicate observeDiffInformedIncrementalMode() { any() } |
There was a problem hiding this comment.
This surprises me. This library is only used by the swift/insecure-tls query, which is an entirely straightforward dataflow query, so I would expect this to work.
There was a problem hiding this comment.
Looking at the select clause in the corresponding .ql file, it seems only the sink node is used as a location source, which requires an override of getASelectedSourceLocation() { none() }. The patch script currently misses these cases, which would explain why this one slipped through. To properly check the correctness of these configs, we need to look at the corresponding select, which is an awkward code review flow. Another mystery is why only this query failed the diff-informed consistency check.
There was a problem hiding this comment.
Oh I see. It would also be possible to add the source as a $@ parameter to make this a normal query.
There was a problem hiding this comment.
P.S. this is the run with the failing InsecureTLS diff-informed consistency check (Wrongly omitted: | :0:0:0:0; strange location): https://github.com/github/semmle-code/actions/runs/15425113807/job/43409978713
There was a problem hiding this comment.
It turns out the problem wasn't about selecting the source or not since we've modified the diff-informed tests to not care about omitted results. The problem was that the test selects an element without a location, leading to errors like this:
Wrongly omitted: | :0:0:0:0 | [post] self | This TLS configuration is insecure. |
That problem is orthogonal to diff-informed queries, but it's probably easiest to leave this query not diff-informed until the problem is fixed.
There was a problem hiding this comment.
Makes sense. FYI fixing that particular issue will be very low priority for the foreseeable future, unless it's affecting customers. I think we should just press on and get everything else diff-informed.
|
It turns out that some of the generated changes in the PRs were not correct, e.g. because they should have also generated a |
|
Happy to re-review when this is ready. |
72da199 to
02341c0
Compare
|
Update: no changes since last time I opened the PR. It turns out that it's sound (but not optimally performant) to leave |
geoffw0
left a comment
There was a problem hiding this comment.
LGTM, thanks for doing this.
An auto-generated patch that enables diff-informed data flow in the obvious cases. Builds on github#18343 and github/codeql-patch#88
27dce51 to
2078a34
Compare
|
Sorry for the review invalidation; I removed the change note commit since diff-informed queries are not publicly documented yet. |
|
Note, according to the follow-up PR, 6 of these queries (ConstantPasswordQuery.qll, InsufficientHashIterationsQuery.qll, StaticInitializationVectorQuery.qll, StringLengthConflationQuery.qll, UnsafeJsEvalQuery.qll, UnsafeUnpackQuery.qll) have a missing source/sink in their select clause; the other ones should have both. |
An auto-generated patch that enables diff-informed data flow in the obvious cases.
Builds on #18343 and https://github.com/github/codeql-patch/pull/88