fix: Improve CUDA compute capability warning messages by AsadShahid04 · Pull Request #7658 · ai-dynamo/dynamo

AsadShahid04 · 2026-03-26T09:52:58Z

- Add detailed logging of which GPU architectures are being compiled
- Clearly show compiled architectures (sm_XX) in build output
- Helps users diagnose 'no kernel image available' errors

Fixes #7569

Summary by CodeRabbit

Chores
- Enhanced CUDA architecture tracking and logging during the build process, now recording and reporting which architectures are compiled versus skipped.

copy-pr-bot · 2026-03-26T09:53:02Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-03-26T09:53:08Z

👋 Hi AsadShahid04! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

coderabbitai · 2026-03-26T09:57:10Z

Walkthrough

Modified the get_cuda_arch_flags() function in the build script to track compiled versus skipped CUDA SM architectures during parsing of CUDA_ARCHS. Introduced skipped_archs and compiled_archs vectors to record architecture handling, with conditional aggregated warning emission for compiled architectures.
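A minimal sketch of the tracking described in the walkthrough, not the actual contents of `build.rs`. It assumes `CUDA_ARCHS` is a comma-separated list of numeric SM versions and that an architecture is skipped when it exceeds the toolkit's maximum; the helper name `partition_archs` and the parameter `max_supported` are hypothetical.

```rust
/// Hypothetical helper: splits requested SM architectures into those the
/// toolkit can compile (<= max_supported) and those that must be skipped.
/// Assumes a comma-separated numeric list; the real parsing may differ.
fn partition_archs(cuda_archs: &str, max_supported: u32) -> (Vec<u32>, Vec<u32>) {
    let mut compiled_archs = Vec::new();
    let mut skipped_archs = Vec::new();
    for token in cuda_archs.split(',') {
        // Entries that are empty or not plain integers are ignored in this sketch.
        let Ok(arch_num) = token.trim().parse::<u32>() else {
            continue;
        };
        if arch_num <= max_supported {
            compiled_archs.push(arch_num);
        } else {
            skipped_archs.push(arch_num);
        }
    }
    (compiled_archs, skipped_archs)
}
```

Returning both vectors keeps the reporting step separate from parsing, so the compiled and skipped summaries can each be emitted once after the loop.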

Changes

| Cohort / File(s) | Summary |
|---|---|
| CUDA Architecture Tracking `lib/kvbm-kernels/build.rs` | Updated build script to track compiled and skipped CUDA SM architectures into separate vectors (`compiled_archs`, `skipped_archs`) and emit aggregated warning listing compiled architectures when at least one is accepted. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The description is incomplete; it lacks the Overview, Details, and Where should the reviewer start sections specified in the template. | Rewrite the description using the provided template, including Overview, Details, and Where should the reviewer start sections for clarity. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Linked Issues check | ❓ Inconclusive | The PR improves CUDA warning messages by logging compiled architectures, but does not directly address the root cause of issue `#7569` (kernel image unavailable for RTX PRO 6000 Blackwell). | Clarify how improved logging enables diagnosis or resolution of the cudaErrorNoKernelImageForDevice issue, or explain if this is a preparatory change for a larger fix. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately reflects the main change: improving CUDA compute capability warning messages through better logging of compiled GPU architectures. |
| Out of Scope Changes check | ✅ Passed | All changes to build.rs focus on CUDA architecture logging and warning improvements, which are directly related to the PR objectives and issue `#7569` diagnosis. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

lib/kvbm-kernels/build.rs (2)

318-319: skipped_archs is populated but never used.

The vector collects skipped architectures on line 336, but is never read afterwards. Either remove this dead code or add a summary warning (similar to the compiled_archs message) that lists all skipped architectures in aggregate—this would be valuable for the diagnostic goal of this PR.

♻️ Option: Add an aggregated skipped-architectures warning

     if !compiled_archs.is_empty() {
         println!(
             "cargo:warning=Building kernels for GPU architectures: sm_{}",
             compiled_archs
                 .iter()
                 .map(|a| a.to_string())
                 .collect::<Vec<_>>()
                 .join(", sm_")
         );
     }
+
+    if !skipped_archs.is_empty() {
+        println!(
+            "cargo:warning=Skipped architectures (unsupported by CUDA toolkit): sm_{}",
+            skipped_archs
+                .iter()
+                .map(|(arch, _)| arch.to_string())
+                .collect::<Vec<_>>()
+                .join(", sm_")
+        );
+    }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/kvbm-kernels/build.rs` around lines 318 - 319, The variable skipped_archs
is populated but never used; either remove it or emit an aggregated warning like
the compiled_archs summary. Update the build logic around compiled_archs and
skipped_archs to (a) if you keep skipped_archs, emit a single cargo:warning
line via println! summarizing skipped_archs (e.g., "Skipped
architectures: ...") after the loop similar to the compiled_archs message, or
(b) if you remove it, delete the skipped_archs declaration and all push/usage
sites; locate symbols compiled_archs and skipped_archs in build.rs to apply the
change.

336-336: Consider storing just arch_num if you don't need max.

Currently storing (arch_num, max) but max is the same for all skipped entries within a single build. If you add the aggregated warning (as suggested above), only arch_num is needed.

✨ Simplified storage

-    let mut skipped_archs = Vec::new();
+    let mut skipped_archs: Vec<u32> = Vec::new();
 ...
-            skipped_archs.push((arch_num, max));
+            skipped_archs.push(arch_num);
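As a sketch of how the simplified `Vec<u32>` storage could feed the aggregated warning suggested above: the helper names `format_sm_list` and `skipped_warning`, and the exact message wording, are assumptions for illustration and are not taken from `build.rs`. The shared `max` is passed once instead of being stored per entry.

```rust
/// Hypothetical helper: renders [80, 90] as "sm_80, sm_90".
fn format_sm_list(archs: &[u32]) -> String {
    archs
        .iter()
        .map(|a| format!("sm_{a}"))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Hypothetical helper: builds one aggregated cargo warning line for the
/// skipped architectures, or None when nothing was skipped.
fn skipped_warning(skipped_archs: &[u32], max: u32) -> Option<String> {
    if skipped_archs.is_empty() {
        return None;
    }
    Some(format!(
        "cargo:warning=Skipped architectures (above toolkit maximum sm_{max}): {}",
        format_sm_list(skipped_archs)
    ))
}
```

In a build script, the returned line would be printed with `println!`, since Cargo only surfaces lines on stdout that start with `cargo:warning=`.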

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/kvbm-kernels/build.rs` at line 336, The code currently pushes tuples with
(arch_num, max) into skipped_archs but max is identical for all entries; change
skipped_archs to store only arch_num (e.g., Vec<u32>), update the push site where
where skipped_archs.push((arch_num, max)) to push only arch_num, and update any
downstream uses that read skipped_archs to expect a single-value list (adjusting
any pattern matches, formatting in the aggregated warning, and where max is read
to use the common max value instead). Ensure the variable declaration/type for
skipped_archs and all references (including the push and the warning/aggregation
logic) are updated consistently.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/kvbm-kernels/build.rs`:
- Around line 347-356: Add an explicit warning when compiled_archs is empty:
after the existing if !compiled_archs.is_empty() block, add an else branch that
emits a cargo:warning stating that no GPU architectures were compiled because
all requested architectures exceed max_compute (reference the compiled_archs
variable and max_compute threshold) and suggest that this will produce "no
kernel image available" runtime errors; include the requested architectures or
indication of the threshold in the message for diagnostics.

---

Nitpick comments:
In `@lib/kvbm-kernels/build.rs`:
- Around line 318-319: The variable skipped_archs is populated but never used;
either remove it or emit an aggregated warning like the compiled_archs summary.
Update the build logic around compiled_archs and skipped_archs to (a) if you
keep skipped_archs, emit a single cargo:warning line via println! summarizing
skipped_archs (e.g., "Skipped architectures: ...") after the loop similar to the
compiled_archs message, or (b) if you remove it, delete the skipped_archs
declaration and all push/usage sites; locate symbols compiled_archs and
skipped_archs in build.rs to apply the change.
- Line 336: The code currently pushes tuples with (arch_num, max) into
skipped_archs but max is identical for all entries; change skipped_archs to
store only arch_num (e.g., Vec<u32>), update the push site where
skipped_archs.push((arch_num, max)) to push only arch_num, and update any
downstream uses that read skipped_archs to expect a single-value list (adjusting
any pattern matches, formatting in the aggregated warning, and where max is read
to use the common max value instead). Ensure the variable declaration/type for
skipped_archs and all references (including the push and the warning/aggregation
logic) are updated consistently.
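The inline comment about an empty `compiled_archs` could be handled roughly as follows. This is a sketch under stated assumptions: the function name `compiled_summary`, its parameters, and the message text are hypothetical, and only the `cargo:warning=` prefix and the "Building kernels for GPU architectures" wording come from the review above.

```rust
/// Hypothetical helper: returns the cargo warning line to print after the
/// architecture loop. When nothing was compiled, the message names the
/// requested architectures and the threshold so the cause is visible.
fn compiled_summary(compiled_archs: &[u32], requested: &[u32], max_compute: u32) -> String {
    // Renders [80, 90] as "sm_80, sm_90".
    let join = |archs: &[u32]| {
        archs
            .iter()
            .map(|a| format!("sm_{a}"))
            .collect::<Vec<_>>()
            .join(", ")
    };
    if compiled_archs.is_empty() {
        format!(
            "cargo:warning=No GPU architectures were compiled: requested [{}] all exceed \
             the toolkit maximum sm_{max_compute}; expect 'no kernel image available' \
             errors at runtime",
            join(requested)
        )
    } else {
        format!(
            "cargo:warning=Building kernels for GPU architectures: {}",
            join(compiled_archs)
        )
    }
}
```

Returning the line rather than printing it keeps the branch easy to unit-test; the build script would pass the result to `println!`.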

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a1c167a6-938a-4f32-b1f1-24c7031d8e17

📥 Commits

Reviewing files that changed from the base of the PR and between 06f1701 and f6243d1.

📒 Files selected for processing (1)

lib/kvbm-kernels/build.rs

- Add detailed logging of which GPU architectures are being compiled
- Clearly show compiled architectures (sm_XX) in build output
- Helps users diagnose 'no kernel image available' errors

Fixes ai-dynamo#7569

Signed-off-by: Asad Shahid <[email protected]>

pull-request-size bot added the size/S label Mar 26, 2026

github-actions bot added the external-contribution and fix labels Mar 26, 2026

coderabbitai bot reviewed Mar 26, 2026

AsadShahid04 force-pushed the inf-58-cuda-compute-warnings branch 2 times, most recently from e08ec14 to 793dbe6 March 27, 2026 00:36

AsadShahid04 force-pushed the inf-58-cuda-compute-warnings branch from 793dbe6 to 4930108 March 27, 2026 00:37

ci: retrigger lychee check

33a4a25

Signed-off-by: Asad Shahid <[email protected]>

AsadShahid04 wants to merge 2 commits into ai-dynamo:main from AsadShahid04:inf-58-cuda-compute-warnings