JIT: refine SSA info in ambiguously threaded blocks#126907
JIT: refine SSA info in ambiguously threaded blocks#126907AndyAyersMS wants to merge 1 commit intodotnet:mainfrom
Conversation
If we partially jump-thread a block with PHIs, the remaining preds that target block may all bring in the same SSA def for some of the PHIs. If so, remove the PHIs and update the SSA/VN for the PHI uses in the block. In particular there may just be one ambiguous pred. Also: enhance AP slightly to handle non-negative cases. Fixes dotnet#126703. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
@EgorBo PTAL |
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
Refines CoreCLR JIT optimizations around jump-threading through PHI-containing blocks by improving SSA/VN rewriting when only ambiguously-threaded predecessors remain, and adds a small assertion propagation enhancement to fold comparisons against negative constants when the compared value is proven non-negative.
Changes:
- Track and remove redundant PHI-def statements in partially jump-threaded blocks when remaining incoming paths agree on the same SSA def, and rewrite SSA/VN for affected uses.
- Adjust successor SSA replacement computation to reflect post-threading reaching paths.
- Enhance assertion propagation to fold
==/!=against negative constants when the value is known non-negative; add a regression test with disasm CHECKs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/tests/JIT/opt/RedundantBranch/JumpThreadPhi.cs | Adds disasm CHECK expectations to validate improved codegen for the repro pattern. |
| src/coreclr/jit/redundantbranchopts.cpp | Implements PHI-use rewriting + optional PHI removal when ambiguous preds converge on a single SSA def after partial threading. |
| src/coreclr/jit/assertionprop.cpp | Adds folding for X ==/!= (negative const) when X is proven never-negative. |
| // CHECK: cmp eax, 32 | ||
| // CHECK-NOT: cmove | ||
| // CHECK-NOT: cmp eax, -1 |
There was a problem hiding this comment.
The new disasm CHECK patterns are very xarch- and register-specific (e.g., tzcnt, cmp eax, 32, cmove). If these CHECKs are evaluated on non-xarch targets, or if register allocation/codegen changes (e.g., using a different register than eax), this test may become flaky/fail without a regression in the optimization. Consider making the CHECKs architecture-neutral (e.g., checking for a semantic helper/pattern rather than exact x86 mnemonics/registers), or otherwise constraining the CHECKs so they only run for the intended target(s).
| // CHECK: cmp eax, 32 | |
| // CHECK-NOT: cmove | |
| // CHECK-NOT: cmp eax, -1 | |
| // CHECK: cmp{{.*}}, 32 | |
| // CHECK-NOT: cmov | |
| // CHECK-NOT: cmp{{.*}}, -1 |
| } | ||
|
|
||
| // Can we fold "X ==/!= negative-constant" when X is known to be non-negative? | ||
| if (tree->OperIs(GT_EQ, GT_NE)) |
There was a problem hiding this comment.
Pretty sure it should be handled by
below this code. If it doesn't we need to check why, I can quickly check
There was a problem hiding this comment.
We might also be able to evaluate better in VN? Perhaps that would be a simpler or complementary change (we could then fully jump thread).
Eg VN should know that (EQ (never-negative V3.2), -1) is always false.
There was a problem hiding this comment.
We might also be able to evaluate better in VN?
I am not sure we can, in your case (I've checked locally) we know that the left side is never negative due to X >= 1 assertion (I analyzed a random SPMI diff generated by this change specifically)
So I've checked locally - this change does generate a couple of diffs, but mainly because for some reason valueTreeVN = optConservativeNormalVN(valueTree) returns some other VN than
auto relopVN = optConservativeNormalVN(tree);
VNFuncApp funcApp;
bool found = vnStore->GetVNFunc(relopVN, &funcApp);
auto relopOp1VN = funcApp.m_args[0];
auto relopOp2VN = funcApp.m_args[1];I'm not sure why it doesn't match. Both valueTreeVN and relopOp1VN are normalized VN MemOpaque:NotInLoop
There was a problem hiding this comment.
I propose we don't check in this change and I'll investigate why VN don't match. Because this effectively duplicate the logic inside GetRangeFromAssertions for relops.
Ideally, GetRangeFromAssertions should recursively call itself for op1 and op2 for a relop, and see if it can evaluate the relations between two ranges e.g. [1..int.Max] != [-1..-1] -> always true.
There was a problem hiding this comment.
Yeah, so it seems some optimization physically changed trees in a relop. So now VNFuncApp's op1 and op2 for that relop don't match actual VNs for tree->gtOp1 and gtOp2. I can workaround it
There was a problem hiding this comment.
RBO can change the VN on the compare, perhaps it's the culprit?
runtime/src/coreclr/jit/redundantbranchopts.cpp
Lines 2430 to 2456 in e697a30
I suppose if we are going to change things we should try to do so consistently. But it is not generally possible to incrementally repair VNs.
There was a problem hiding this comment.
@AndyAyersMS I've filed a PR that handles it slightly simpler and in a more general way (and more diffs): #126917




If we partially jump-thread a block with PHIs, the remaining preds that target block may all bring in the same SSA def for some of the PHIs. If so, remove the PHIs and update the SSA/VN for the PHI uses in the block.
In particular there may just be one ambiguous pred.
Also: enhance AP slightly to handle non-negative cases compared to negative values.
Fixes #126703.