Martien.physreg liveranges by martien-de-jong · Pull Request #747 · Xilinx/llvm-aie

martien-de-jong · 2025-12-31T10:00:13Z

This is a proof of concept (POC) of register allocation during postpipelining.

We add:

RegDefUseTracker, which virtualizes the live ranges that are safe to reallocate,
SchedulingInterpreter, which computes register live ranges from the scheduled pipeline timing,
PostRegAlloc, which uses the two to allocate the virtual registers after postpipelining,
Scarce range scheduling, a postpipelining scheduler heuristic that avoids overlap between live ranges competing for a single register.

Status:
It's aggressive enough to reach II=7 on gemm-bfp16-opt0, but sadly, the code it produces is not correct. I'm trying to find out what is causing my diff failure.

llvm/lib/Target/AIE/AIERegDefUseTracker.cpp

martien-de-jong · 2026-01-08T12:15:05Z

llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h

    /// register. There may be multiple current definitions for a register with
    /// disjunct lanemasks.
-    VReg2SUnitMultiMap CurrentVRegDefs;
+    VReg2SUnitOperIdxMultiMap CurrentVRegDefs;


This was asymmetric between Uses and Defs. We need the operand index of the outstanding defs to compute operand latencies.
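
To illustrate why the def side needs the operand index, here is a minimal sketch in plain C++. `DefRecord`, `OutstandingDefs` and `defOperandsOf` are hypothetical stand-ins invented for this example; they are not the types in ScheduleDAGInstrs.h.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical record for one outstanding definition of a register.
struct DefRecord {
  unsigned SUnitNum;   // scheduling unit that contains the def
  unsigned OperandIdx; // operand of that instruction that writes the register
};

// Toy stand-in for a VReg -> (SUnit, operand index) multimap.
using OutstandingDefs = std::multimap<unsigned, DefRecord>;

// Collect the operand indices of all outstanding defs of VReg. With only the
// SUnit stored, a per-operand latency lookup would have nothing to key on.
std::vector<unsigned> defOperandsOf(const OutstandingDefs &Defs,
                                    unsigned VReg) {
  std::vector<unsigned> Result;
  auto Range = Defs.equal_range(VReg);
  for (auto It = Range.first; It != Range.second; ++It)
    Result.push_back(It->second.OperandIdx);
  return Result;
}
```

A register can have several entries at once, for example one per disjunct lanemask, which is why the sketch uses a multimap.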

martien-de-jong · 2026-01-08T12:18:13Z

llvm/lib/Target/AIE/AIEMaxLatencyFinder.cpp

+
+  // Use TRI's regsOverlap which handles both physical and virtual registers,
+  // including subregisters and lane masks
+  return TRI->regsOverlap(SrcReg, DstReg);


I guess this was only needed transiently, but it looks really good.

Nice that they work on RegUnits.
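
As a toy model of what working on RegUnits buys us: two physical registers alias exactly when they share at least one register unit. The sketch below is not TRI's implementation; `RegUnitSet` and `unitsOverlap` are names made up for the illustration.

```cpp
#include <cassert>
#include <set>

// Toy model: a register is described by the set of register units it covers.
using RegUnitSet = std::set<unsigned>;

// Two registers overlap if they have at least one unit in common.
bool unitsOverlap(const RegUnitSet &A, const RegUnitSet &B) {
  for (unsigned Unit : A)
    if (B.count(Unit))
      return true;
  return false;
}
```

With this model a full register covering units {0, 1} overlaps its low half {0}, but not a neighbouring register covering {2, 3}.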

martien-de-jong · 2026-01-08T12:21:55Z

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

-        PostSWP->isPostPipelineCandidate(*TheBlock))
-      staticallyMaterializeMultiSlotInstructions(*TheBlock, HR);
+        PostSWP->isPostPipelineCandidate(*TheBlock)) {
+      staticallyMaterializeMultiSlotInstructions(*TheBlock, HR, MaterializeAll);


It would have been nice to be able to skip the scheduler before postpipelining. Sadly, the scheduler sometimes makes better decisions.

martien-de-jong · 2026-01-08T12:26:08Z

llvm/lib/Target/AIE/AIEScheduleInterpreter.cpp

+    for (int T = 0; T < II; ++T) {
+      LaneBitmask Mask = LanesByOffset[T];
+      if (Mask.any()) {
+        // Show a simple indicator - could be enhanced to show actual lanes


Indeed. Full lanemasks are bulky though.

llvm/lib/Target/AIE/AIEPostRegAlloc.cpp

martien-de-jong · 2026-01-08T14:12:49Z

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

+static cl::opt<bool> TestRegDefUseTracker(
+    "aie-test-regdefuse-tracker", cl::Hidden, cl::init(false),
+    cl::desc("[AIE] TEST MODE: Run RegDefUseTracker analysis on all loops "
+             "(for testing only)"));


This accommodates a dump of the early stages of live range analysis.

martien-de-jong · 2026-01-08T14:14:59Z

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

+
+void BlockState::restorePipelining() {
+  // Restore to the original allocation of the virtual registers
+  RegTracker->restoreOriginalPhysRegs();


These registers were used by the scheduler whose result we're going to use as a fallback.

martien-de-jong · 2026-01-13T14:15:51Z

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

+    BS.FixPoint.PipelinerMode = firstPipelinerMode();
+    if (BS.FixPoint.PipelinerMode != PostPipelinerMode::None) {
+      return SchedulingStage::Pipelining;
+    }


This looks a bit weird: we have been pipelining and are trying to reset to the first allowed pipeliner mode for the next II. The result should be invariant, so I don't think we can get None here. Perhaps assert instead.

martien-de-jong · 2026-01-13T14:21:24Z

llvm/lib/Target/AIE/AIEInterBlockScheduling.cpp

+
+  // For virtual mode, re-analyze and virtualize
+  if (FixPoint.PipelinerMode == PostPipelinerMode::Virtual) {
+    // RegTracker might not exist if we have multiple regions


Someone missed that we can't do physical mode either if we have more than one region.
I would hope that RegTracker is always there for a SWP candidate.

This module analyses live ranges of physical registers that can be safely reallocated in a basic block. It supplies facilities to rewrite to virtual registers and to restore the original allocation.
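
A minimal sketch of the rewrite-and-restore idea, under simplifying assumptions: operands are plain register numbers in a flat vector, and a live range is given as a list of operand positions. `ToyTracker` and its members are hypothetical and do not mirror the RegDefUseTracker interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

class ToyTracker {
public:
  // Rewrite every operand of one live range to a fresh virtual id and
  // remember which physical register it replaced.
  unsigned virtualize(std::vector<unsigned> &Operands,
                      const std::vector<std::size_t> &RangeOperands) {
    unsigned Virt = NextVirt++;
    for (std::size_t Idx : RangeOperands) {
      Original[Virt] = Operands[Idx];
      Operands[Idx] = Virt;
    }
    return Virt;
  }

  // Put the original physical registers back, e.g. to fall back on an
  // earlier schedule.
  void restore(std::vector<unsigned> &Operands) const {
    for (unsigned &Op : Operands) {
      auto It = Original.find(Op);
      if (It != Original.end())
        Op = It->second;
    }
  }

private:
  std::map<unsigned, unsigned> Original; // virtual id -> physical register
  unsigned NextVirt = 0x80000000u;       // hypothetical virtual id space
};
```

Two separate live ranges of the same physical register get distinct virtual ids, which is what removes the false dependence between them.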

This module produces an EventSchedule from the instructions and their issue cycles. The event schedule contains the read and write events of the virtual registers occurring in the instructions, ordered along the timeline of processor pipeline stages. From the EventSchedule, the modulo live ranges for a particular II can be constructed. These represent the lanes of each register that are live at a particular point.
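
The folding of a linear live range into modulo slots can be sketched as follows. This is a simplification: it tracks one boolean per slot instead of a lane mask, and it assumes the range occupies the half-open cycle interval [Def, LastUse). The function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mark the modulo slots (0 .. II-1) in which a value is live, given the
// cycle of its write (Def) and of its last read (LastUse) in the linear
// schedule. Assumption: the value occupies cycles [Def, LastUse).
std::vector<bool> moduloLiveSlots(int Def, int LastUse, int II) {
  std::vector<bool> Live(II, false);
  for (int C = Def; C < LastUse; ++C) {
    int Slot = ((C % II) + II) % II; // also valid for negative cycles
    Live[Slot] = true;
  }
  return Live;
}

// Two ranges cannot share a register if they are live in the same slot.
bool slotsConflict(const std::vector<bool> &A, const std::vector<bool> &B) {
  for (std::size_t T = 0; T < A.size() && T < B.size(); ++T)
    if (A[T] && B[T])
      return true;
  return false;
}
```

For II=7, a range written in cycle 5 and last read in cycle 9 wraps around and occupies slots 5, 6, 0 and 1.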

This is a dedicated register allocator for use by the postpipeliner. We compute some metrics and run a few different score functions on those metrics to define an allocation order. We allocate in that order and fail as soon as we can't find a register that is available over the whole live range.
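
A sketch of that allocate-in-score-order loop, assuming a single precomputed score per range and boolean modulo slots. `LiveRange`, `Score` and `allocate` are hypothetical names, and range ids are assumed to be 0 .. N-1; the actual metrics and score functions are not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct LiveRange {
  int Id;                 // assumed to be 0 .. N-1
  int Score;              // hypothetical priority; higher is allocated first
  std::vector<bool> Live; // modulo slots in which the range is live (size II)
};

// Assign each range the first register that is free in all of its live
// slots. Returns false as soon as one range cannot be placed.
bool allocate(std::vector<LiveRange> Ranges, int NumRegs, int II,
              std::vector<int> &Assignment) {
  std::stable_sort(Ranges.begin(), Ranges.end(),
                   [](const LiveRange &A, const LiveRange &B) {
                     return A.Score > B.Score;
                   });
  std::vector<std::vector<bool>> Busy(NumRegs, std::vector<bool>(II, false));
  Assignment.assign(Ranges.size(), -1);
  for (const LiveRange &R : Ranges) {
    int Chosen = -1;
    for (int Reg = 0; Reg < NumRegs && Chosen < 0; ++Reg) {
      bool Free = true;
      for (int T = 0; T < II && Free; ++T)
        Free = !(R.Live[T] && Busy[Reg][T]);
      if (Free)
        Chosen = Reg;
    }
    if (Chosen < 0)
      return false; // no register available over the whole live range
    for (int T = 0; T < II; ++T)
      if (R.Live[T])
        Busy[Chosen][T] = true;
    Assignment[R.Id] = Chosen;
  }
  return true;
}
```

Running the same loop with a different score function only changes the sort order, so trying several orders is cheap.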

This is a strategy that prioritizes the scheduling of scarce ranges. Scarce ranges are live ranges that compete for a single available register. The live ranges are virtualized, which means we have no serializing WAR deps. However, we need to be careful not to have more than one live at a time, which means we want to finish one range before starting a new one. We try all legal permutations of these live ranges. For the current live range, we first prioritize all its ancestors, then the instructions in the range itself. Once we are finished with the range, we simulate the WAR dependences that are necessary to keep the next ranges non-overlapping.
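
The simulated WAR dependences for one chosen order can be sketched like this. The sketch assumes each range has a single defining node and a single last-use node, and it enumerates all permutations without the legality filter, which is not described here. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

struct ScarceRange {
  int DefNode;     // DAG node that starts the range
  int LastUseNode; // DAG node that ends the range
};

using Edge = std::pair<int, int>; // (from, to): "from" must precede "to"

// For a given order of the ranges, the last use of each range must precede
// the def of the next one, so that at most one range is live at a time.
std::vector<Edge> warEdgesFor(const std::vector<ScarceRange> &Ranges,
                              const std::vector<int> &Order) {
  std::vector<Edge> Edges;
  for (std::size_t I = 0; I + 1 < Order.size(); ++I)
    Edges.emplace_back(Ranges[Order[I]].LastUseNode,
                       Ranges[Order[I + 1]].DefNode);
  return Edges;
}

// Enumerate every order of N ranges (no legality check in this sketch).
std::vector<std::vector<int>> allOrders(int N) {
  std::vector<int> Order(N);
  std::iota(Order.begin(), Order.end(), 0);
  std::vector<std::vector<int>> Result;
  do {
    Result.push_back(Order);
  } while (std::next_permutation(Order.begin(), Order.end()));
  return Result;
}
```

The number of orders grows factorially, which is acceptable only because few ranges compete for the same single register.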

F-Stuckmann · 2026-02-05T15:14:19Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

+    const AIEHazardRecognizer &HR,
+    ResourceScoreboard<FuncUnitWrapper> &Scoreboard) {
+  const int Step = fromTop() ? 1 : -1;
+  if (Step < 0) {


Why does fromTop change the stepping direction? Also, why are First and Last in the wrong order?

andcarminati · 2026-02-06T08:54:57Z

llvm/test/CodeGen/AIE/aie2/schedule/postpipeliner/gemm-nopresched.mir

  ; CHECK-NEXT:    vshuffle x2, x8, x0, r16; vmac.f bml5, bml5, x9, x7, r2
  ; CHECK-NEXT:  .L_LEnd0:
-  ; CHECK-NEXT:    vshuffle x10, x1, x3, r3; vmac.f bmh4, bmh4, x6, x5, r2
+  ; CHECK-NEXT:    nopb ; nopa ; nops ; nopx ; vshuffle x10, x1, x3, r3; vmac.f bmh4, bmh4, x6, x5, r2


Curious, what induced this change?

andcarminati · 2026-02-06T09:29:07Z

llvm/lib/Target/AIE/AIEMultiSlotInstrMaterializer.cpp

+    auto *AltA = Formats->getAlternateInstsOpcode(A->getOpcode());
+    auto *AltB = Formats->getAlternateInstsOpcode(B->getOpcode());
+
+    return AltA->size() < AltB->size();


Check: fewer options, higher priority.
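
In other words, the comparator yields a most-constrained-first order. A self-contained sketch of that ordering, with a hypothetical `MultiSlotInstr` that stores the number of alternate opcodes directly:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct MultiSlotInstr {
  std::string Name;         // hypothetical label, for illustration only
  unsigned NumAlternatives; // number of alternate opcodes (slot choices)
};

// Instructions with fewer alternatives come first, so they pick a slot
// before the more flexible instructions take it.
void sortMostConstrainedFirst(std::vector<MultiSlotInstr> &Instrs) {
  std::stable_sort(Instrs.begin(), Instrs.end(),
                   [](const MultiSlotInstr &A, const MultiSlotInstr &B) {
                     return A.NumAlternatives < B.NumAlternatives;
                   });
}
```

Using a stable sort keeps the original program order among instructions with the same number of alternatives.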

andcarminati · 2026-02-06T09:40:11Z

llvm/lib/Target/AIE/AIEMultiSlotInstrMaterializer.cpp

  auto SlotToBanks = getAssignedSlots(MBB, TII, HR);

-  if (!assignSlots(SlotToBanks, MBB, TII, HR)) {
+  if (assignSlots(SlotToBanks, MBB, TII, HR)) {


Maybe a comment here explaining that we care about memory banks first.

andcarminati · 2026-02-06T09:46:18Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

+  }
+
+  const int Limit = Last + Step;
+  for (int C = First; C != Limit; C += Step) {


The original implementation has some debug messages, are they not useful? Also, we could remove the original fit.

andcarminati · 2026-02-06T09:50:25Z

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

+    const AIEHazardRecognizer &HR,
+    ResourceScoreboard<FuncUnitWrapper> &Scoreboard) {
+  const int Step = fromTop() ? 1 : -1;
+  if (Step < 0) {


We could have something like:

if (!fromTop()) { std::swap(First, Last); }

It makes the code self-documenting.

I still find this whole business funny. Why exactly are we doing that?
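
For reference, a standalone sketch of the loop shape under discussion, assuming First <= Last on entry and assuming the truncated `if (Step < 0)` branch swaps the bounds. `visitOrder` is a made-up helper that only records the cycles visited.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Visit the cycles First..Last inclusive, top-down or bottom-up.
// Assumes First <= Last on entry.
std::vector<int> visitOrder(int First, int Last, bool FromTop) {
  const int Step = FromTop ? 1 : -1;
  if (!FromTop)
    std::swap(First, Last); // bottom-up starts at the high end

  std::vector<int> Visited;
  const int Limit = Last + Step; // one step past the final cycle
  for (int C = First; C != Limit; C += Step)
    Visited.push_back(C);
  return Visited;
}
```

Because the loop terminates on `C != Limit`, it relies on the bounds being in the right order for the chosen direction; otherwise it would run past the range.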

F-Stuckmann · 2026-02-06T15:43:46Z

llvm/lib/Target/AIE/AIEScheduleInterpreter.h

+/// Key identifying a live range and its subregister
+struct LRKey {
+  unsigned LRId;      // Live range identifier
+  unsigned SubRegIdx; // Subregister index (0 for full register)


CHECK: Is SubRegIdx sufficient for comparing registers of different register classes, such as ex and x regs, or ex and y regs?

I think this check could be stricter by checking whether the blocked regunits overlap.

martien-de-jong requested review from F-Stuckmann, SagarMaheshwari99, abhinay-anubola, abnikant, andcarminati, katerynamuts, khallouh, konstantinschwarz, mludevid, niwinanto and stephenneuendorffer as code owners December 31, 2025 10:00

martien-de-jong marked this pull request as draft December 31, 2025 10:00

F-Stuckmann reviewed Jan 5, 2026


martien-de-jong force-pushed the martien.physreg-liveranges branch 2 times, most recently from cc8fca8 to 33264be Compare January 8, 2026 10:48

martien-de-jong marked this pull request as ready for review January 8, 2026 11:13

martien-de-jong force-pushed the martien.physreg-liveranges branch from 7930abc to dcc908c Compare January 13, 2026 12:49

[AIE][POSTPIPELINER] Register info in DAG dump

f49aebf

Martien de Jong added 13 commits January 23, 2026 12:38

[ScheduleDAGInstr] Compute accurate latencies for Anti dependencies

9ac5ec3

Also for virtual registers. For architectures where latencies can go negative, this has impact on RecMII

[AIE] Generalize MaxLatencyFinder for virtual registers

7e2ca99

[NFC] auto update of tests

ecea0e8

The old check lines matched, ignoring the leading nops.

[AIE] full MSP materialization based on slot statistics

6da8167

[AIE][INTERBLOCK] Central point to update the fixpoint stage

9e9ac49

The FixPoint updaters just return the new state.

[AIE][POSTPIPELINER] Reset RecMII when recomputing

b262088

[AIE][POSTPIPELINER] More generic interface to fit an instruction.

d265c28

[NFC] Remove verbose option in postpipeliner tests.

be21ef8

[AIE] Add LaneMaskVector

dce8912

This is abstracting the live ranges to be used by PostRegAlloc

[AIE] Add RegDefUseTracker

c70265a

This module analyses live ranges of physical registers that can be safely reallocated in a basic block. It supplies facilities to rewrite to virtual registers and to restore the original allocation.

martien-de-jong force-pushed the martien.physreg-liveranges branch from dcc908c to 133f034 Compare January 30, 2026 12:03

Martien de Jong added 3 commits January 30, 2026 13:47

Virtual pipeliner mode integration

052f5a3

ref updates

95d733c

add tests

699934a

martien-de-jong force-pushed the martien.physreg-liveranges branch from 133f034 to 699934a Compare January 30, 2026 12:48

F-Stuckmann reviewed Feb 5, 2026

andcarminati reviewed Feb 6, 2026

F-Stuckmann reviewed Feb 6, 2026

3 participants
