Ssa: Trim the use-use relation to skip irrelevant nodes by aschackmull · Pull Request #19044 · github/codeql

aschackmull · 2025-03-17T13:08:28Z

This PR contains 3 tweaks to the shared SSA use-use step relation in the data flow integration module. Each of them trims the step relation in order to generate fewer nodes and fewer edges.

WriteDefinitions are skipped for Java. There should be no need for an extra node in the step from the RHS of an assignment to the first use of the variable, but some languages may currently depend on the flow out of definitions, so this is opt-in only for now.
Synthetic reads on the input edges to phi nodes are necessary for proper BarrierGuards. However, only a fraction of them can actually be needed by any guard, so we can restrict their creation considerably. Furthermore, I've added a hook that lets certain SSA usages skip these nodes entirely when BarrierGuards are not going to be used. I expect to use this option in the VariableCapture use case.
Finally, phi nodes only exist as intermediate nodes to prevent blow-ups in the number of edges, so if the successor is unique, we can safely skip the phi node.
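
The edge-count argument behind the last point can be made concrete with a small sketch. This is a toy model for illustration only, not the QL implementation: the class name and the `inputs` and `successors` parameters are hypothetical, standing for the number of phi inputs and the number of nodes that read the phi definition.

```java
public class PhiEdgeCount {
    // Edges needed if every phi input steps directly to every successor.
    static int direct(int inputs, int successors) {
        return inputs * successors;
    }

    // Edges needed with the phi node in between: one per input, one per successor.
    static int viaPhi(int inputs, int successors) {
        return inputs + successors;
    }

    public static void main(String[] args) {
        // Several successors: the phi node keeps the edge count additive.
        System.out.println(direct(3, 4) + " direct vs " + viaPhi(3, 4) + " via phi");
        // Unique successor: going direct needs one edge (and one node) fewer.
        System.out.println(direct(3, 1) + " direct vs " + viaPhi(3, 1) + " via phi");
    }
}
```

With three inputs and four successors the model gives 12 direct edges against 7 via the phi node; with a unique successor it gives 3 against 4, which is the case where skipping the phi node is a pure win.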

Copilot

Copilot wasn't able to review any files in this pull request.

Files not reviewed (2)

java/ql/lib/semmle/code/java/dataflow/internal/SsaImpl.qll: Language not supported
shared/ssa/codeql/ssa/Ssa.qll: Language not supported

geoffw0 · 2025-03-18T11:51:23Z

Looking at the DCA runs:

CPP has 4 new and 4 lost results for cpp/invalid-pointer-deref. They're not just moved results. I'm not sure what's going on with these.
Rust shows a large increase in data flow inconsistencies, which should be understood before we merge this. There's also a small (but surprising) improvement to taint reach and a 3.8% analysis slowdown (I'm not sure if either is significant).
Swift hasn't been run (yet)

I'm a bit surprised because from the PR description I wasn't expecting to see changes in results.

aschackmull · 2025-03-18T12:09:45Z

I've restarted DCA after fixing a performance bug; the 3.8% slowdown on Rust is now 0.1%. As for the data flow inconsistencies (e.g. for Rust), these were fixed by the commit "SSA: Skip identity steps".

There should be no need to run DCA for Swift, since Swift doesn't use the Data Flow Integration module.

As I mentioned on Slack, result changes should not occur, but they do for C++ because of the rather ad-hoc DataFlow::flowsToBackEdge barrier, which breaks the SSA abstraction boundary.

aschackmull · 2025-03-18T14:30:24Z

Hmm, looks like at least Java needs some tweaking before this can work as intended. The dependence of the trimming on guard.controls causes the SSA stage to collapse with range analysis. And the C++ situation also needs some additional thought to address the result differences. Let me put this in draft for now.

aschackmull · 2025-03-26T08:19:21Z

I've fixed the caching aspect for Java and introduced a hook for C++, such that the flowsToBackEdge barrier ought to keep working as-is. For C++ this shaves a bit off the potential improvement, but we still get a fairly decent node/edge reduction.

MathiasVP

C++ 👍 I'll leave the shared parts to someone else, as I don't really have time to look at this now, unfortunately 😭

hvitved

LGTM, a few questions.

hvitved · 2025-03-28T07:49:25Z

shared/ssa/codeql/ssa/Ssa.qll

    predicate guardControlsBlock(Guard guard, BasicBlock bb, boolean branch);
+
+    /**
+     * Holds if `WriteDefinition`s should be included as an intermediate node


This is restricted to certain WriteDefinitions, right?

aschackmull · 2025-03-28

Yes and no. Flipping this switch will remove all certain writes from the use-use steps. It will also cause us to step over uncertain write nodes for the RHS-to-first-use steps, but those nodes will remain in the graph as stepping stones from prior uses. It may be possible to skip over uncertain writes when stepping from prior uses as well, but that requires additional analysis to ensure that we only do so when it won't cause blowups, and the potential gain seemed so small that I didn't bother with that particular graph reduction.
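
For the certain-write case, the effect of the switch on the local step chain can be sketched as follows. This is a toy model under stated assumptions, not the shared library's code: the node names `rhs`, `def` and `useN` and the `includeDefNode` flag are hypothetical stand-ins for the assignment's right-hand side, the WriteDefinition node, the reads of the variable, and the opt-in switch.

```java
import java.util.ArrayList;
import java.util.List;

public class UseUseSketch {
    // Builds the step chain for one certain write followed by `uses` reads
    // in straight-line code.
    static List<String> steps(int uses, boolean includeDefNode) {
        List<String> edges = new ArrayList<>();
        String prev = "rhs";
        if (includeDefNode) {
            // Extra intermediate node between the RHS and the first use.
            edges.add("rhs -> def");
            prev = "def";
        }
        for (int i = 1; i <= uses; i++) {
            String use = "use" + i;
            // The first edge leaves the write; later edges are use-use steps.
            edges.add(prev + " -> " + use);
            prev = use;
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(steps(2, true));  // [rhs -> def, def -> use1, use1 -> use2]
        System.out.println(steps(2, false)); // [rhs -> use1, use1 -> use2]
    }
}
```

In this model, skipping the definition saves exactly one node and one edge per certain write, while the use-use part of the chain is unchanged.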

hvitved · 2025-03-28T07:56:30Z

shared/ssa/codeql/ssa/Ssa.qll

+    private predicate relevantPhiInputNode(SsaPhiExt phi, BasicBlock input) {
+      DfInput::supportBarrierGuardsOnPhiEdges() and
+      // If the input isn't explicitly read then a guard cannot check it.
+      exists(DfInput::getARead(getAPhiInputDef(phi, input))) and


So this means that for

if (b) { x = foo; read(x); } else { x = bar; } read(x);

we will not generate an input edge from the else block?

hvitved · 2025-03-28T08:00:40Z

shared/ssa/codeql/ssa/Ssa.qll

+      (
+        exists(DfInput::Guard g | g.controlsBranchEdge(input, phi.getBasicBlock(), _))
+        or
+        exists(BasicBlock prev |


I think this could deserve a comment with an example
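
One candidate example for the first disjunct, written as Java source under analysis, might look like the sketch below. It reflects one reading of the `controlsBranchEdge` condition rather than the comment that was eventually added, and `isSafe`, `sink` and the class name are hypothetical.

```java
public class PhiEdgeGuard {
    // Hypothetical validation check acting as a barrier guard.
    static boolean isSafe(String s) {
        return s.matches("[a-z]*");
    }

    // Hypothetical sink; it just returns its argument so the result can be inspected.
    static String sink(String s) {
        return s;
    }

    static String run(String x) {
        if (!isSafe(x)) {
            x = ""; // first phi input: a fresh definition of x
        }
        // A phi node for x sits at this join point. The second input arrives
        // straight from the condition on the edge where isSafe(x) held, so the
        // guard controls that edge but no basic block of its own. A synthetic
        // read on the edge gives a barrier guard a node to attach to.
        return sink(x);
    }

    public static void main(String[] args) {
        System.out.println(run("abc")); // prints "abc"
        System.out.println(run("A!").isEmpty()); // prints "true"
    }
}
```

Here the incoming definition of `x` is explicitly read by `isSafe(x)`, and the guard controls the branch edge from the condition block into the block holding the phi node, which are the two properties the quoted conjuncts test for.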

hvitved · 2025-03-28T08:06:08Z

shared/ssa/codeql/ssa/Ssa.qll

+        if phiHasUniqNextNode(phi)
+        then flowFromRefToNode(v, bbPhi, -1, nodeTo)
+        else nodeTo.(SsaDefinitionExtNodeImpl).getDefExt() = phi


Pull out into separate predicate to avoid code duplication?

aschackmull added the no-change-note-required label Mar 17, 2025

Copilot AI review requested due to automatic review settings March 17, 2025 13:08

aschackmull requested a review from a team as a code owner March 17, 2025 13:08

Copilot AI reviewed Mar 17, 2025

github-actions bot added the Java label Mar 17, 2025

aschackmull requested review from a team as code owners March 18, 2025 09:44

github-actions bot added C# Ruby Rust labels Mar 18, 2025

aschackmull marked this pull request as draft March 18, 2025 14:31

aschackmull added 13 commits March 25, 2025 12:31

SSA: Add support for skipping WriteDefinitions in use-use.

5aa7029

Java: Skip SSA definition nodes in data flow.

7c82f51

SSA: Rename SsaInputDefinitionExt

c778bf6

SSA: Skip irrelevant phi input nodes.

669f926

SSA: Skip phi nodes with unique successor.

4e2ad97

SSA: Skip identity steps.

36532bc

SSA: Fix a poor join-order and avoid SSA recomputation.

0162b84

C#: Accept test changes.

b3bea97

Java: Accept test changes.

f27e819

Ruby: Accept test changes.

e7e5f75

Rust: Accept test changes.

ae47339

C++: Keep all phi input back edges.

4d04391

Java/SSA: Keep proper distinction between cached stages.

d5d0274

aschackmull force-pushed the ssa/useuse-trim branch from f797ede to d5d0274 Compare March 25, 2025 12:44

github-actions bot added JS C++ labels Mar 25, 2025

C++: Accept test changes.

8749bdb

aschackmull marked this pull request as ready for review March 26, 2025 08:19

aschackmull requested review from a team as code owners March 26, 2025 08:19

MathiasVP reviewed Mar 27, 2025

hvitved reviewed Mar 28, 2025

SSA: Address review comments.

c6cee48

hvitved approved these changes Mar 28, 2025

aschackmull merged commit 0c74f21 into github:main Mar 28, 2025
59 checks passed

aschackmull deleted the ssa/useuse-trim branch March 28, 2025 10:55

4 participants