Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR optimizes performance in Java query libraries by adding the overlay[local?] annotation to expensive predicates. The purpose is to make certain predicates local to improve query execution performance, particularly for predicates that contain slow operations like regular expressions.
Key changes:
- Added
overlay[local?]annotations to three expensive predicate classes - Targets predicates with known performance issues due to regex operations and other expensive computations
- Maintains existing functionality while improving execution efficiency
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
java/ql/lib/semmle/code/java/security/UrlForwardQuery.qll |
Added local overlay annotation to BarrierPrefix class |
java/ql/lib/semmle/code/java/security/StaticInitializationVectorQuery.qll |
Added local overlay annotation to ArrayUpdate class |
java/ql/lib/semmle/code/java/security/BrokenCryptoAlgorithmQuery.qll |
Added local overlay annotation to BrokenAlgoLiteral class |
kaspersv
left a comment
There was a problem hiding this comment.
Did you run a DCA overlay experiment to test if these changes had any impact on overlay accuracy for these queries?
| ma = this and | ||
| ma.getArgument(0) = array | ||
| | | ||
| m.getAnOverride*().hasQualifiedName("java.io", ["InputStream", "RandomAccessFile"], "read") or |
There was a problem hiding this comment.
These m.getAnOverride*() aren't file-local though they seem unlikely to matter in practice.
There was a problem hiding this comment.
Good point as these are getAnOverride so point backwards to what I thought they did.
|
I have a DCA experiment here; https://github.com/github/codeql-dca-main/issues/32115 . it's in progress as it failed the first time. |
|
You used the source suite intended for performance testing for the DCA experiment, so I'm not sure what the baseline is for overlay accuracy, but the results look pretty bad with 0% accuracy for several high-severity queries (e.g., |
|
Sorry, I ran a performance check by default and forgot change it for accuracy. Here is another run with accuracy testing settings: https://github.com/github/codeql-dca-main/issues/32285 |
|
I'm not sure about the failures, but they seem to be due to the files already existing so I think the results are mostly fine. |
The accuracy results include data for 145 out of 146 targets. I'm assuming the missing target is the one that failed ( |
This makes some expensive predicates in "query" libraries local. These all seem obviously local to me and they are all slow (mostly due to regexes, but soemtiems for other reasons)