ControlFlow: Split only on relevant values.#20519
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR addresses performance regressions in Java NullMaybe.ql by optimizing control-flow reachability calculations. The changes restrict tracked values to only those that may affect relevant branches while improving precision through singleton value prioritization.
- Restricts tracked values to those that can determine relevant branches
- Prioritizes singleton values for maximum precision
- Modifies the universal quantification to include branch-related value-splits
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| Guards.qll | Adds isSingleton() predicate to check if a value represents a precise singleton |
| ControlFlow.qll | Major refactoring to optimize value tracking by filtering irrelevant values and prioritizing singletons |
603ccd1 to
ab77d0f
Compare
ab77d0f to
109b6a1
Compare
|
Dca looks good. A few additional FPs are removed and performance improves. |
hvitved
left a comment
There was a problem hiding this comment.
Very nice! Only some trivial coments.
| def.getDefinition() = e and | ||
| exprHasValue(e, gv) and | ||
| not exists(GuardValue gv0 | exprHasValue(e, gv0) and smaller(gv0, gv)) | ||
| (exists(GuardValue gv0 | exprHasValue(e, gv0) and gv0.isSingleton()) implies gv.isSingleton()) |
There was a problem hiding this comment.
Nit: I would prefer
any(GuardValue gv0 | exprHasValue(e, gv0)).isSingleton() implies gv.isSingleton()| ssaControlsBranchEdge(t, bb1, bb2, gv) and | ||
| not exists(GuardValue gv0 | ssaControlsBranchEdge(t, bb1, bb2, gv0) and smaller(gv0, gv)) and | ||
| ( | ||
| exists(GuardValue gv0 | ssaControlsBranchEdge(t, bb1, bb2, gv0) and gv0.isSingleton()) |
There was a problem hiding this comment.
Again, I think any(GuardValue gv0 | ssaControlsBranchEdge(t, bb1, bb2, gv0)).isSingleton() is a bit nicer.
| SourceVariable src, BasicBlock bb1, BasicBlock bb2, SourceVariable var, SsaDefinition t, | ||
| GuardValueOrAny val | ||
| SourceVariable src, BasicBlock bb1, BasicBlock bb2, SourceVariable var, GuardValue condgv, | ||
| SsaDefinition t, GuardValueOption val |
There was a problem hiding this comment.
I think the QL doc ought to mention condgv.
| private predicate stepSsaValuePhi( | ||
| SourceVariable src, BasicBlock bb1, BasicBlock bb2, SourceVariable var, SsaDefinition t1, | ||
| SsaDefinition t2, GuardValueOrAny val | ||
| SourceVariable src, BasicBlock bb1, BasicBlock bb2, SourceVariable var, GuardValue condgv, |
| private predicate stepSsaValueNoRedef( | ||
| SourceVariable src, BasicBlock bb1, BasicBlock bb2, SourceVariable var, SsaDefinition t, | ||
| GuardValue val | ||
| SourceVariable src, BasicBlock bb1, BasicBlock bb2, SourceVariable var, GuardValue condgv, |
| private predicate sourceReachesBlockWithTrackedVar( | ||
| SsaDefinition srcDef, BasicBlock srcBb, SsaDefinition def, BasicBlock bb, FinallyStack fs, | ||
| SsaDefOption tracked, GuardValueExt val, SourceVariable var | ||
| SsaDefOption tracked, GuardValueOption val, SourceVariable var, GuardValue condgv |
All fixed! |
QA for 2.23.1 highlighted performance regressions in two repos for Java
NullMaybe.ql. The underlying reason was that the control-flow reachability calculated too many splits with irrelevant values that couldn't determine any relevant branches.This PR restricts the tracked values to those that may affect a relevant branch and at the same time improves precision by including the branch-related value-split in the universal quantification (the
forall). We prioritise tracked values such that singletons take priority over any other value as that's maximally precise, for non-singletons we discard those values that cannot determine a particular branch.