Java: Use force local to make parsing local after global regex finding. #20378
Pull Request Overview
This PR optimizes regex flow analysis in Java by making parsing local after global regex finding. The change uses forceLocal to localize the usedAsRegex predicate while maintaining the global regex finding capabilities.
Key changes:
- Refactors the `usedAsRegex` predicate to use local evaluation with `forceLocal`
- Extracts the original logic into a helper predicate `usedAsRegexImpl`
- Adds an `overlay[local]` annotation to the main predicate
 * Holds if `regex` is used as a regex, with the mode `mode` (if known).
 * If regex mode is not known, `mode` will be `"None"`.
 *
 * As an optimisation, only regexes containing an infinite repitition quatifier (`+`, `*`, or `{x,}`)
There's a typo in the comment: 'repitition' should be 'repetition'.
Code change looks good. Do you have links to DCA experiments? Since we are moving the overlay frontier, there is a risk of optimisation regressions under non-overlay evaluation, so I would expect a DCA experiment that shows no performance impact under non-overlay evaluation and a DCA experiment that shows little to no accuracy regression under overlay evaluation and possibly a speedup.
I just ran DCA and I think the results look good: https://github.com/github/codeql-dca-main/blob/data/alexet/pr-20378-220197__nightly__nightly-queries/reports/summaries/time.theme.md
Yes, timing results look good. There were extraction differences for
smowton states that the extractor errors can't possibly be caused by this PR so should be disregarded.
With this the regex parsing becomes local.
The assumption is that strings in the base don't become regexes or stop being regexes. From what I have seen, such a change in status seems unlikely.
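To make the assumption concrete, here is a toy Python model of the trade-off. It is not CodeQL and not the implementation in this PR: plain sets stand in for relations, and the names `used_as_regex`, `forced_local`, and the file and literal names are all hypothetical. It assumes that forcing a predicate local means cached base results are reused for literals in unchanged files, and only literals in changed files are recomputed.

```python
# Toy model only: plain Python sets stand in for CodeQL relations.
# All names here are hypothetical and chosen for illustration.

def used_as_regex(literals, uses):
    """Global analysis: a literal is a regex if any file uses it as one.

    literals: dict mapping literal name -> file that defines it
    uses: set of (using_file, literal_name) pairs
    """
    used = {name for _file, name in uses}
    return {name for name in literals if name in used}


def forced_local(base_result, base_literals, overlay_literals,
                 overlay_uses, changed_files):
    """Keep the cached base answer for literals in unchanged files and
    recompute only for literals defined in changed files."""
    kept = {n for n in base_result if base_literals[n] not in changed_files}
    fresh = used_as_regex(
        {n: f for n, f in overlay_literals.items() if f in changed_files},
        overlay_uses,
    )
    return kept | fresh


base_literals = {"a+": "A.java", "plain": "A.java"}
base_uses = {("B.java", "a+")}
base_result = used_as_regex(base_literals, base_uses)

# Case 1: the file defining the literals changes (a new literal "b*" is
# added and used). The changed file is recomputed, so both views agree.
edited_literals = {"a+": "A.java", "plain": "A.java", "b*": "A.java"}
edited_uses = {("B.java", "a+"), ("A.java", "b*")}
local_1 = forced_local(base_result, base_literals, edited_literals,
                       edited_uses, {"A.java"})
global_1 = used_as_regex(edited_literals, edited_uses)

# Case 2: only the *using* file changes and drops the regex use.
# The literal's own file is unchanged, so the cached answer is reused
# and the forced-local view keeps a result the global view would drop.
local_2 = forced_local(base_result, base_literals, base_literals,
                       set(), {"B.java"})
global_2 = used_as_regex(base_literals, set())
```

In this model, case 2 is the situation the assumption rules out as unlikely: a string in an unchanged base file stops being a regex because of an edit elsewhere. Case 1 is the common path, where both views agree.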