-
Notifications
You must be signed in to change notification settings - Fork 415
Fix bit-select sensitivity lists leaking LLHD ops (Fixes #9469) #9481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@fabianschuiki done! |
9eaf40c to
ff25200
Compare
|
@fabianschuiki can you approve the workflow to confirm once? |
- Add complete CVE submission package for CIRCT array indexing vulnerability - Include Docker reproduction environment for Ubuntu 24.04 x64 - Add comprehensive technical report (15KB) with CVSS 5.3 scoring - Include test cases (vulnerable + workaround code) - Add automated test scripts and result verification - Document complete reproduction workflow - Tested successfully on macOS M3 Pro with Docker Issue: llvm/circt#9469 Fix PR: llvm/circt#9481 CVSS: 5.3 (MEDIUM) - AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:L/A:L
|
Hey @5iri, thanks a lot for taking a stab at fixing this! I see you've had to tweak things in a few different places, spread across Deseq, Mem2Reg, and MooreToCore. Are all of these necessary? The changes feel to very specifically allow for |
|
ohh yeah I didn't think about that! I was kind of fixated on fixing it for that particular example that I was going through each layer to fix them and not the other way around. I will work on a[n] array indexing and would also need to work on multi-bit. I’m going to broaden the trigger handling to cover any comb.extract (array/struct projections too), hoist that bit out of the process, and add a regression with @(posedge clkin_data[n]). |
|
That sounds great! My suspicion is that you probably don't need to have code handling all this specific cases in a lot of different places. This is probably down to Mem2Reg not handling field projections well enough, or Deseq having to be tought about projections. |
|
@fabianschuiki I think I found the issue! in I think this can be done by a simple general fix which is to make |
|
it still seems to fail :( I am now thinking there is a problem in the way the bit-select is observed. since I see from what I have observed, trigger isn’t one of the hard-coded patterns. To make this work generally for any single-bit bit‑select clock, we need to relax EDIT: |
|
The fact that Deseq sees a Deseq currently assumes that there are no aliases and that a clock has been reduced to a single SSA value. In theory, if Mem2Reg and Common Subexpression Elimination do their job properly, the clock value seen by Deseq should still be a single SSA value, just coming from a single |
|
you are right! I do see multiple I'll share the snippet below |
|
Thanks for sharing! That I'd suggest starting with a simpler, minimal input example that compares a version that works with a version that doesn't: module Foo(input bit clk, input bit x, output bit y);
always_ff @(posedge clk) x <= y;
endmodule
module Bar(input bit [41:0] clk, input bit x, output bit y);
always_ff @(posedge clk[9]) x <= y;
endmoduleRunning this through hw.module @Foo(in %clk : i1, in %x : i1, out y : i1) {
%0 = llhd.constant_time <0ns, 0d, 1e>
%1 = llhd.constant_time <0ns, 1d, 0e>
%true = hw.constant true
%false = hw.constant false
%clk_0 = llhd.sig name "clk" %false : i1
%2 = llhd.prb %clk_0 : i1
%x_1 = llhd.sig name "x" %false : i1
%y = llhd.sig %false : i1
llhd.process {
cf.br ^bb1
^bb1: // 3 preds: ^bb0, ^bb2, ^bb3
%4 = llhd.prb %clk_0 : i1 // <--- single clock bit
llhd.wait (%2 : i1), ^bb2
^bb2: // pred: ^bb1
%5 = llhd.prb %clk_0 : i1 // <--- single clock bit
%6 = comb.xor bin %4, %true : i1
%7 = comb.and bin %6, %5 : i1
cf.cond_br %7, ^bb3, ^bb1
^bb3: // pred: ^bb2
%8 = llhd.prb %y : i1
llhd.drv %x_1, %8 after %1 : i1
cf.br ^bb1
}
llhd.drv %clk_0, %clk after %0 : i1
llhd.drv %x_1, %x after %0 : i1
%3 = llhd.prb %y : i1
hw.output %3 : i1
}
hw.module @Bar(in %clk : i42, in %x : i1, out y : i1) {
%0 = llhd.constant_time <0ns, 0d, 1e>
%1 = llhd.constant_time <0ns, 1d, 0e>
%true = hw.constant true
%false = hw.constant false
%c0_i42 = hw.constant 0 : i42
%clk_0 = llhd.sig name "clk" %c0_i42 : i42
%2 = llhd.prb %clk_0 : i42
%x_1 = llhd.sig name "x" %false : i1
%y = llhd.sig %false : i1
llhd.process {
cf.br ^bb1
^bb1: // 3 preds: ^bb0, ^bb2, ^bb3
%4 = llhd.prb %clk_0 : i42 // <--- multiple clock bits
%5 = comb.extract %4 from 9 : (i42) -> i1 // <--- multiple clock bits
llhd.wait (%2 : i42), ^bb2
^bb2: // pred: ^bb1
%6 = llhd.prb %clk_0 : i42 // <--- multiple clock bits
%7 = comb.extract %6 from 9 : (i42) -> i1 // <--- multiple clock bits
%8 = comb.xor bin %5, %true : i1
%9 = comb.and bin %8, %7 : i1
cf.cond_br %9, ^bb3, ^bb1
^bb3: // pred: ^bb2
%10 = llhd.prb %y : i1
llhd.drv %x_1, %10 after %1 : i1
cf.br ^bb1
}
llhd.drv %clk_0, %clk after %0 : i42
llhd.drv %x_1, %x after %0 : i1
%3 = llhd.prb %y : i1
hw.output %3 : i1
}This looks pretty good and as you would expect. Running this further to see how things look just after LLHD's hoist signals pass with hw.module @Foo(in %clk : i1, in %x : i1, out y : i1) {
%0 = llhd.constant_time <0ns, 0d, 1e>
%true = hw.constant true
%false = hw.constant false
%clk_0 = llhd.sig name "clk" %false : i1
%1 = llhd.prb %clk_0 : i1
%x_1 = llhd.sig name "x" %false : i1
%y = llhd.sig %false : i1
%2 = llhd.constant_time <0ns, 1d, 0e>
%3:2 = llhd.process -> i1, i1 {
%false_2 = hw.constant false
%false_3 = hw.constant false
cf.br ^bb1(%1, %false_2, %false_3 : i1, i1, i1)
^bb1(%5: i1, %6: i1, %7: i1): // 3 preds: ^bb0, ^bb2, ^bb3
llhd.wait yield (%6, %7 : i1, i1), (%1 : i1), ^bb2(%5 : i1)
^bb2(%8: i1): // pred: ^bb1
%9 = comb.xor bin %8, %true : i1
%10 = comb.and bin %9, %1 : i1
%false_4 = hw.constant false
%false_5 = hw.constant false
cf.cond_br %10, ^bb3, ^bb1(%1, %false_4, %false_5 : i1, i1, i1)
^bb3: // pred: ^bb2
%true_6 = hw.constant true
cf.br ^bb1(%1, %4, %true_6 : i1, i1, i1)
}
llhd.drv %x_1, %3#0 after %2 if %3#1 : i1
llhd.drv %clk_0, %clk after %0 : i1
llhd.drv %x_1, %x after %0 : i1
%4 = llhd.prb %y : i1
hw.output %4 : i1
}
hw.module @Bar(in %clk : i42, in %x : i1, out y : i1) {
%0 = llhd.constant_time <0ns, 0d, 1e>
%true = hw.constant true
%false = hw.constant false
%c0_i42 = hw.constant 0 : i42
%clk_0 = llhd.sig name "clk" %c0_i42 : i42
%1 = llhd.prb %clk_0 : i42
%x_1 = llhd.sig name "x" %false : i1
%y = llhd.sig %false : i1
%2 = llhd.constant_time <0ns, 1d, 0e>
%3:2 = llhd.process -> i1, i1 {
%5 = llhd.prb %clk_0 : i42
%false_2 = hw.constant false
%false_3 = hw.constant false
cf.br ^bb1(%5, %false_2, %false_3 : i42, i1, i1)
^bb1(%6: i42, %7: i1, %8: i1): // 3 preds: ^bb0, ^bb2, ^bb3
%9 = comb.extract %6 from 9 : (i42) -> i1
llhd.wait yield (%7, %8 : i1, i1), (%1 : i42), ^bb2
^bb2: // pred: ^bb1
%10 = llhd.prb %clk_0 : i42
%11 = comb.extract %10 from 9 : (i42) -> i1
%12 = comb.xor bin %9, %true : i1
%13 = comb.and bin %12, %11 : i1
%false_4 = hw.constant false
%false_5 = hw.constant false
cf.cond_br %13, ^bb3, ^bb1(%10, %false_4, %false_5 : i42, i1, i1)
^bb3: // pred: ^bb2
%true_6 = hw.constant true
cf.br ^bb1(%10, %4, %true_6 : i42, i1, i1)
}
llhd.drv %x_1, %3#0 after %2 if %3#1 : i1
llhd.drv %clk_0, %clk after %0 : i42
llhd.drv %x_1, %x after %0 : i1
%4 = llhd.prb %y : i1
hw.output %4 : i1
}The important difference here seems to be this (lots of identical stuff deleted): // hw.module @Foo
%1 = llhd.prb %clk_0 : i1
%3:2 = llhd.process -> i1, i1 {
cf.br ^bb1(%1, %false_2, %false_3 : i1, i1, i1) // <--- passes global %1 to bb1 arg %5
^bb1(%5: i1, %6: i1, %7: i1): // 3 preds: ^bb0, ^bb2, ^bb3
llhd.wait yield (%6, %7 : i1, i1), (%1 : i1), ^bb2(%5 : i1) // <--- old clock is a block arg
^bb2(%8: i1): // pred: ^bb1
// detect posedge with %8 and %1
cf.cond_br %10, ^bb3, ^bb1(%1, %false_4, %false_5 : i1, i1, i1) // <--- passes global %1 to bb1 arg %5
}
// hw.module @Bar
%1 = llhd.prb %clk_0 : i42
%3:2 = llhd.process -> i1, i1 {
%5 = llhd.prb %clk_0 : i42 // <--- probe not hoisted out for some reason
cf.br ^bb1(%5, %false_2, %false_3 : i42, i1, i1) // <--- passes local %5 to bb1 arg %6
^bb1(%6: i42, %7: i1, %8: i1): // 3 preds: ^bb0, ^bb2, ^bb3
%9 = comb.extract %6 from 9 : (i42) -> i1
llhd.wait yield (%7, %8 : i1, i1), (%1 : i42), ^bb2 // <--- old clock not a block arg
^bb2: // pred: ^bb1
%10 = llhd.prb %clk_0 : i42 // <--- probe not hoisted out for some reason
%11 = comb.extract %10 from 9 : (i42) -> i1
// detect posedge with %9 and %11
cf.cond_br %13, ^bb3, ^bb1(%10, %false_4, %false_5 : i42, i1, i1) // <--- passes local %10 to bb1 arg %6
}For some reason, LLHD's HoistSignals pass does not properly hoist that hw.module @Bar(in %clk : i42, in %x : i1, out y : i1) {
%0 = llhd.constant_time <0ns, 0d, 1e>
%true = hw.constant true
%false = hw.constant false
%c0_i42 = hw.constant 0 : i42
%clk_0 = llhd.sig name "clk" %c0_i42 : i42
%1 = llhd.prb %clk_0 : i42
%x_1 = llhd.sig name "x" %false : i1
%y = llhd.sig %false : i1
%2 = llhd.constant_time <0ns, 1d, 0e>
%3:2 = llhd.process -> i1, i1 {
%false_2 = hw.constant false
%false_3 = hw.constant false
cf.br ^bb1(%1, %false_2, %false_3 : i42, i1, i1)
^bb1(%6: i42, %7: i1, %8: i1): // 3 preds: ^bb0, ^bb2, ^bb3
%9 = comb.extract %6 from 9 : (i42) -> i1
llhd.wait yield (%7, %8 : i1, i1), (%1 : i42), ^bb2(%9 : i1)
^bb2(%10: i1): // pred: ^bb1
%11 = comb.extract %1 from 9 : (i42) -> i1
%12 = comb.xor bin %10, %true : i1
%13 = comb.and bin %12, %11 : i1
%false_4 = hw.constant false
%false_5 = hw.constant false
cf.cond_br %13, ^bb3, ^bb1(%1, %false_4, %false_5 : i42, i1, i1)
^bb3: // pred: ^bb2
%true_6 = hw.constant true
cf.br ^bb1(%1, %4, %true_6 : i42, i1, i1)
}
llhd.drv %x_1, %3#0 after %2 if %3#1 : i1
llhd.drv %clk_0, %clk after %0 : i42
llhd.drv %x_1, %x after %0 : i1
%4 = llhd.prb %y : i1
hw.output %4 : i1
}Once this is hoisted out properly, we can go teach Deseq to look through |
|
I hadn't checked whether the hoisting works in my current solution and it seems to be working as expected. The current implementation that I have done doesn't have the multiple probe issue. I now think that this might not be the best solution to move forward, as
The relaxation of clock matching is my current issue and for that I am unable to find another solution. |
|
I think should be able to make sure the extracted clock bit is carried as the |
|
Hey @fabianschuiki, I've tried to investigate further and wanted to share what I found before deciding how to proceed. TLDR: My I compared the IR pipeline with and without my changes using your test case: Without the changes:
https://gist.github.com/5iri/f9c87098d5319841552b91ec8e47c96a With my changes:
https://gist.github.com/5iri/0225fa19130fdfb7115a0578c2f33004 The fix is in Mem2Reg's Before I go down the HoistSignals route, I wanted to check: Is there an architectural reason why fixing this in Mem2Reg is the wrong approach?
|
|
Oh this is super cool, thanks for your detailed investigation. I think you're onto something: the fact that the This makes me think that we are looking at two distinct issues:
Would you mind creating distinct PRs for these issues? I think the solution for the Mem2Reg issue can be very simple: instead of capturing probes (or projections of probes) that are live across waits, let's capture every value that's live across the wait. The pass already creates a With Mem2Reg fixed, we can cook up a fix for the Deseq issue. I think we should just replace |
|
yeah this makes sense! I'll work on the PRs right away! Thank you so much for your help! |
Summary of the fixes made
MooreToCore.cpp- Madellhd.waitobserve thei1extracted bit instead of thei8signal:- Added logic to trace extract ops to their underlying signal
- Replace multi-bit observed values with their
i1trigger counterpartsMem2Reg.cpp- Extended captureAcrossWait to handle comb.extract results:Deseq.cpp- Two fixes:The result:
@(posedge data[0])now correctly producesseq.firregwith theextracted bit as the clock and solves #9469