
fix: Improve CUDA compute capability warning messages #7658

Open
AsadShahid04 wants to merge 2 commits into ai-dynamo:main from AsadShahid04:inf-58-cuda-compute-warnings

Conversation


@AsadShahid04 AsadShahid04 commented Mar 26, 2026

  • Add detailed logging of which GPU architectures are being compiled
  • Clearly show compiled architectures (sm_XX) in build output
  • Helps users diagnose 'no kernel image available' errors

Fixes #7569

Summary by CodeRabbit

  • Chores
    • Enhanced CUDA architecture tracking and logging during the build process, now recording and reporting which architectures are compiled versus skipped.


copy-pr-bot bot commented Mar 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Contributor

👋 Hi AsadShahid04! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the external-contribution and fix labels Mar 26, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 26, 2026

Walkthrough

Modified the get_cuda_arch_flags() function in the build script to track compiled versus skipped CUDA SM architectures while parsing CUDA_ARCHS. Introduced skipped_archs and compiled_archs vectors to record how each architecture was handled, and added one aggregated warning that lists the compiled architectures when at least one is accepted.
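
The tracking described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `partition_archs`, the comma-separated input format, and the `arch_num <= max` acceptance rule are all assumptions made for the example.

```rust
/// Splits a comma-separated list of SM numbers (e.g. "80,90,120") into
/// architectures that are accepted and architectures that are skipped.
///
/// ASSUMPTIONS (not taken from the PR): the function name, the input
/// format, and the rule that an architecture is accepted when it does not
/// exceed `max`, the highest SM number the installed toolkit can target.
fn partition_archs(cuda_archs: &str, max: u32) -> (Vec<u32>, Vec<u32>) {
    let mut compiled_archs = Vec::new();
    let mut skipped_archs = Vec::new();

    for token in cuda_archs.split(',') {
        // Ignore blank entries and anything that is not a plain number.
        let Ok(arch_num) = token.trim().parse::<u32>() else {
            continue;
        };
        if arch_num <= max {
            compiled_archs.push(arch_num);
        } else {
            skipped_archs.push(arch_num);
        }
    }

    (compiled_archs, skipped_archs)
}
```

Returning both vectors keeps the parsing step free of I/O, so the build script can decide separately which warnings to print.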

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CUDA Architecture Tracking: lib/kvbm-kernels/build.rs | Updated build script to track compiled and skipped CUDA SM architectures into separate vectors (compiled_archs, skipped_archs) and emit aggregated warning listing compiled architectures when at least one is accepted. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The description is incomplete; it lacks required sections such as Overview, Details, and Where should the reviewer start sections specified in the template. | Rewrite the description using the provided template, including Overview, Details, and Where should the reviewer start sections for clarity. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Linked Issues check | ❓ Inconclusive | The PR improves CUDA warning messages by logging compiled architectures, but does not directly address the root cause of issue #7569 (kernel image unavailable for RTX PRO 6000 Blackwell). | Clarify how improved logging enables diagnosis or resolution of the cudaErrorNoKernelImageForDevice issue, or explain if this is a preparatory change for a larger fix. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change: improving CUDA compute capability warning messages through better logging of compiled GPU architectures. |
| Out of Scope Changes check | ✅ Passed | All changes to build.rs focus on CUDA architecture logging and warning improvements, which are directly related to the PR objectives and issue #7569 diagnosis. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
lib/kvbm-kernels/build.rs (2)

318-319: skipped_archs is populated but never used.

The vector collects skipped architectures on line 336, but is never read afterwards. Either remove this dead code or add a summary warning (similar to the compiled_archs message) that lists all skipped architectures in aggregate—this would be valuable for the diagnostic goal of this PR.

♻️ Option: Add an aggregated skipped-architectures warning
     if !compiled_archs.is_empty() {
         println!(
             "cargo:warning=Building kernels for GPU architectures: sm_{}",
             compiled_archs
                 .iter()
                 .map(|a| a.to_string())
                 .collect::<Vec<_>>()
                 .join(", sm_")
         );
     }
+
+    if !skipped_archs.is_empty() {
+        println!(
+            "cargo:warning=Skipped architectures (unsupported by CUDA toolkit): sm_{}",
+            skipped_archs
+                .iter()
+                .map(|(arch, _)| arch.to_string())
+                .collect::<Vec<_>>()
+                .join(", sm_")
+        );
+    }
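
Both warnings in the suggestion above build their list by putting `sm_` in the format string and joining with `", sm_"`, which only reads correctly because each block is guarded by `is_empty()`. One possible alternative, sketched here with a hypothetical helper name (`format_sm_list` does not appear in the PR), is to prefix each element individually so the same function serves both messages:

```rust
/// Formats SM numbers as a human-readable list such as "sm_80, sm_90".
///
/// Hypothetical helper for illustration; it is not part of the PR.
/// Because the prefix is applied per element, an empty slice yields an
/// empty string rather than a dangling "sm_".
fn format_sm_list(archs: &[u32]) -> String {
    archs
        .iter()
        .map(|a| format!("sm_{a}"))
        .collect::<Vec<_>>()
        .join(", ")
}
```

With a helper like this, each warning could be written as `println!("cargo:warning=...: {}", format_sm_list(&compiled_archs))`, assuming `compiled_archs` holds plain `u32` values.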
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/kvbm-kernels/build.rs` around lines 318 - 319, The variable skipped_archs
is populated but never used; either remove it or emit an aggregated warning like
the compiled_archs summary. Update the build logic around compiled_archs and
skipped_archs to (a) if you keep skipped_archs, emit a single cargo:warning line
via println! summarizing skipped_archs (e.g., "Skipped architectures: ...")
after the loop, similar to the compiled_archs message, or (b) if you remove it,
delete the skipped_archs declaration and all push/usage sites; locate symbols
compiled_archs and skipped_archs in build.rs to apply the change.

336-336: Consider storing just arch_num if you don't need max.

Currently storing (arch_num, max) but max is the same for all skipped entries within a single build. If you add the aggregated warning (as suggested above), only arch_num is needed.

✨ Simplified storage
-    let mut skipped_archs = Vec::new();
+    let mut skipped_archs: Vec<u32> = Vec::new();
 ...
-            skipped_archs.push((arch_num, max));
+            skipped_archs.push(arch_num);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/kvbm-kernels/build.rs` at line 336, The code currently pushes tuples with
(arch_num, max) into skipped_archs but max is identical for all entries; change
skipped_archs to store only arch_num (e.g., Vec<u32>), update the push site
where skipped_archs.push((arch_num, max)) to push only arch_num, and update any
downstream uses that read skipped_archs to expect a single-value list (adjusting
any pattern matches, formatting in the aggregated warning, and where max is read
to use the common max value instead). Ensure the variable declaration/type for
skipped_archs and all references (including the push and the warning/aggregation
logic) are updated consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/kvbm-kernels/build.rs`:
- Around line 347-356: Add an explicit warning when compiled_archs is empty:
after the existing if !compiled_archs.is_empty() block, add an else branch that
emits a cargo:warning stating that no GPU architectures were compiled because
all requested architectures exceed max_compute (reference the compiled_archs
variable and max_compute threshold) and suggest that this will produce "no
kernel image available" runtime errors; include the requested architectures or
indication of the threshold in the message for diagnostics.

---

Nitpick comments:
In `@lib/kvbm-kernels/build.rs`:
- Around line 318-319: The variable skipped_archs is populated but never used;
either remove it or emit an aggregated warning like the compiled_archs summary.
Update the build logic around compiled_archs and skipped_archs to (a) if you
keep skipped_archs, emit a single cargo:warning line via println! summarizing
skipped_archs (e.g., "Skipped architectures: ...") after the loop, similar to the
compiled_archs message, or (b) if you remove it, delete the skipped_archs
declaration and all push/usage sites; locate symbols compiled_archs and
skipped_archs in build.rs to apply the change.
- Line 336: The code currently pushes tuples with (arch_num, max) into
skipped_archs but max is identical for all entries; change skipped_archs to
store only arch_num (e.g., Vec<u32>), update the push site where
skipped_archs.push((arch_num, max)) to push only arch_num, and update any
downstream uses that read skipped_archs to expect a single-value list (adjusting
any pattern matches, formatting in the aggregated warning, and where max is read
to use the common max value instead). Ensure the variable declaration/type for
skipped_archs and all references (including the push and the warning/aggregation
logic) are updated consistently.
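
The inline comment on lines 347-356 asks for an explicit warning when nothing was compiled. A minimal sketch of that branch follows, under stated assumptions: the function name `arch_summary_warnings`, the nested helper, the exact wording of the messages, and the use of plain `u32` values are illustrative and do not come from the PR. The function returns the `cargo:warning=` lines instead of printing them so the logic can be checked in isolation.

```rust
/// Builds the `cargo:warning=` lines summarizing the architecture selection.
///
/// ASSUMPTIONS (illustrative only): the function name, the message wording,
/// and that `max_compute` is the highest SM number the toolkit supports.
fn arch_summary_warnings(
    compiled_archs: &[u32],
    skipped_archs: &[u32],
    max_compute: u32,
) -> Vec<String> {
    // Local helper: renders [80, 90] as "sm_80, sm_90".
    fn sm_list(archs: &[u32]) -> String {
        archs
            .iter()
            .map(|a| format!("sm_{a}"))
            .collect::<Vec<_>>()
            .join(", ")
    }

    let mut lines = Vec::new();
    if compiled_archs.is_empty() {
        // Nothing was accepted: say so loudly and name the threshold.
        lines.push(format!(
            "cargo:warning=No GPU architectures were compiled; requested \
             architectures ({}) exceed the toolkit maximum sm_{}. Expect \
             'no kernel image is available' errors at runtime.",
            sm_list(skipped_archs),
            max_compute
        ));
    } else {
        lines.push(format!(
            "cargo:warning=Building kernels for GPU architectures: {}",
            sm_list(compiled_archs)
        ));
    }
    lines
}
```

In a build script, each returned line would be passed to `println!("{line}")` so that Cargo surfaces it as a warning. The sketch does not special-case an empty request list; in that situation the parenthesized list in the first message would simply be empty.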

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a1c167a6-938a-4f32-b1f1-24c7031d8e17

📥 Commits

Reviewing files that changed from the base of the PR and between 06f1701 and f6243d1.

📒 Files selected for processing (1)
  • lib/kvbm-kernels/build.rs

@AsadShahid04 AsadShahid04 force-pushed the inf-58-cuda-compute-warnings branch 2 times, most recently from e08ec14 to 793dbe6, March 27, 2026 00:36
- Add detailed logging of which GPU architectures are being compiled
- Clearly show compiled architectures (sm_XX) in build output
- Helps users diagnose 'no kernel image available' errors

Fixes ai-dynamo#7569

Signed-off-by: Asad Shahid <[email protected]>
@AsadShahid04 AsadShahid04 force-pushed the inf-58-cuda-compute-warnings branch from 793dbe6 to 4930108, March 27, 2026 00:37
Signed-off-by: Asad Shahid <[email protected]>

Labels

external-contribution, fix, size/S

Development

Successfully merging this pull request may close these issues.

[BUG]: CUDA error: no kernel image is available for execution on the device

1 participant