Actions: Fix bad performance in getTargetPath#19186
Merged
Conversation
Seen on `github/codeql`, some queries had very poor performance:
```
[2/24 eval 36m4s] Evaluation done; writing results to
codeql/actions-queries/Security/CWE-312/ExcessiveSecretsExposure.bqrs
```
Investigating further lead to the following worrying sequence of joins
(after I ran out of patience and cancelled the query):
```
[2025-04-01 12:31:03] Tuple counts for
Yaml::YamlInclude.getTargetPath/0#dispred#32565107#fb#reorder_1_0/2@i6#9f4b2jw1
after 8m40s:
...
559418 ~33% {1} r5 = SCAN
`Yaml::YamlNode.getLocation/0#dispred#24555c57#prev_delta` OUTPUT In.1
...
909345525 ~821% {3} r7 = JOIN r5 WITH
`Yaml::YamlNode.getLocation/0#dispred#24555c57#prev` CARTESIAN PRODUCT
OUTPUT Rhs.1, Lhs.0 'result', Rhs.0
909342139 ~779% {3} | JOIN WITH
`Locations::Location.getFile/0#dispred#dcf38c8d#prev` ON FIRST 1 OUTPUT
Rhs.1, Lhs.1 'result', Lhs.2
909338753 ~794% {3} | JOIN WITH containerparent_10#join_rhs
ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'result', Lhs.2
909335367 ~824% {3} | JOIN WITH
`FileSystem::Container.getAbsolutePath/0#dispred#d234e6fa` ON FIRST 1
OUTPUT Lhs.2, Lhs.1 'result', Rhs.1
883246724 ~812% {3} | JOIN WITH
`Yaml::YamlNode.getDocument/0#dispred#ee1eb3bf#bf_10#join_rhs` ON FIRST
1 OUTPUT Rhs.1 'this', Lhs.1 'result', Lhs.2
760047185 ~838% {5} | JOIN WITH yaml_scalars ON FIRST 1
OUTPUT Lhs.1 'result', Lhs.0 'this', Rhs.2, _, Lhs.2
0 ~0% {4} | REWRITE WITH Tmp.3 := "/", Out.3 :=
(In.4 ++ Tmp.3 ++ InOut.2), TEST Out.3 = InOut.0 KEEPING 4
{4} | REWRITE WITH NOT [TEST InOut.2
startsWith "/"]
...
```
The culprit turned out to be the following method on class `YamlInclude`
```ql
private string getTargetPath() {
exists(string path | path = this.getValue() |
if path.matches("/%")
then result = path
else
result =
this.getDocument().getLocation().getFile().getParentContainer().getAbsolutePath()
+ "/" +
path
)
}
```
Basically, in the `else` branch, the evaluator was producing all
possible values of `result` before filtering out the ones where the
`path` component started with a forward slash.
To fix this, I opted to factor out the logic into two helper predicates,
each accounting for whether `this.getValue()` does or does not start
with a `/`. With this, evaluating the original query from a clean cache
takes roughly 3.3s.
Contributor
There was a problem hiding this comment.
Copilot wasn't able to review any files in this pull request.
Files not reviewed (1)
- shared/yaml/codeql/yaml/Yaml.qll: Language not supported
Tip: If you use Visual Studio Code, you can request a review from Copilot before you push from the "Source Control" tab. Learn more
adityasharad
previously approved these changes
Apr 1, 2025
Collaborator
adityasharad
left a comment
There was a problem hiding this comment.
Nice work, and makes sense. Just one question.
adityasharad
approved these changes
Apr 1, 2025
Collaborator
adityasharad
left a comment
There was a problem hiding this comment.
Considering a change note, but I think I'd rather attach one to the extractor change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Seen on
github/codeql, some queries had very poor performance:Investigating further lead to the following worrying sequence of joins (after I ran out of patience and cancelled the query):
The culprit turned out to be the following method on class
YamlIncludeBasically, in the
elsebranch, the evaluator was producing all possible values ofresultbefore filtering out the ones where thepathcomponent started with a forward slash.To fix this, I opted to factor out the logic into two helper predicates, each accounting for whether
this.getValue()does or does not start with a/. With this, evaluating the original query from a clean cache takes roughly 3.3s.