Change repo name and merge CODEOWNER files #1
Merged
systems-assistant bot pushed a commit that referenced this pull request on Jul 17, 2025
Rename to ROCm Systems Profiler (rocprof-sys)

systems-assistant bot pushed a commit that referenced this pull request on Jul 22, 2025
Create rocm_ci_caller.yml init file to call shared workflow

systems-assistant bot pushed a commit that referenced this pull request on Jul 22, 2025
Create kws_caller.yml and rocm_ci_caller.yml

systems-assistant bot pushed a commit that referenced this pull request on Jul 22, 2025
Enabling per PR based KWS check and PSDB check

jayhawk-commits pushed a commit that referenced this pull request on Aug 5, 2025
Test PR [ROCm/rocprofiler commit: f5267f7]

jayhawk-commits pushed a commit that referenced this pull request on Aug 5, 2025
Test PR [ROCm/roctracer commit: b28af2f]

ammallya pushed a commit that referenced this pull request on Aug 6, 2025
Rename to ROCm Systems Profiler (rocprof-sys) [ROCm/rocprofiler-systems commit: f3c699e]

jayhawk-commits pushed a commit that referenced this pull request on Aug 8, 2025
Create rocm_ci_caller.yml init file to call shared workflow [ROCm/rocm_smi_lib commit: bb122ef]

jayhawk-commits pushed a commit that referenced this pull request on Aug 11, 2025
Create kws_caller.yml and rocm_ci_caller.yml [ROCm/rocminfo commit: fad2fcd]

jayhawk-commits pushed a commit that referenced this pull request on Aug 11, 2025
[ROCm/rocm-core commit: 5071fe5]

jayhawk-commits pushed a commit that referenced this pull request on Aug 11, 2025
Enabling per PR based KWS check and PSDB check [ROCm/ROCR-Runtime commit: d70d3fb]

kcossett-amd added a commit to kcossett-amd/rocm-systems that referenced this pull request on Oct 16, 2025
Co-authored-by: Pratik Basyal <[email protected]>

dgaliffiAMD added a commit that referenced this pull request on Oct 21, 2025
…ument to avoid instrumenting around C "main" wrapper (#1322)

* Add check for Fortran main
* Comment change
* MAIN__ -> Fortran main
* Cray Compiler comment change
* Add changelog and troubleshooting comments
* Improve CHANGELOG.md message
* Change CHANGELOG msg to be in 7.2.0
* Apply review change #1
  Co-authored-by: Pratik Basyal <[email protected]>
* Apply review change #2
  Co-authored-by: Pratik Basyal <[email protected]>
* Apply review change #3
  Co-authored-by: Pratik Basyal <[email protected]>

---------

Co-authored-by: Pratik Basyal <[email protected]>
Co-authored-by: David Galiffi <[email protected]>

ggottipa-amd pushed a commit that referenced this pull request on Oct 31, 2025
…ument to avoid instrumenting around C "main" wrapper (#1322)

* Add check for Fortran main
* Comment change
* MAIN__ -> Fortran main
* Cray Compiler comment change
* Add changelog and troubleshooting comments
* Improve CHANGELOG.md message
* Change CHANGELOG msg to be in 7.2.0
* Apply review change #1
  Co-authored-by: Pratik Basyal <[email protected]>
* Apply review change #2
  Co-authored-by: Pratik Basyal <[email protected]>
* Apply review change #3
  Co-authored-by: Pratik Basyal <[email protected]>

---------

Co-authored-by: Pratik Basyal <[email protected]>
Co-authored-by: David Galiffi <[email protected]>

ammallya pushed a commit that referenced this pull request on Nov 17, 2025
The bug was reproduced like this.

In terminal #1, run command:

    sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

In terminal #2, inject errors:

    while true; do sudo amdgpuras -b 7 -s 1 -m 6 -t 2; sleep 2; done

The terminal #1 starts dumping cper entry information that it captures. After 20 entries have been captured, open terminal #3 and run same command as terminal #1:

    sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

From terminal #3, there will be no output, even when terminal #1 continues capturing and printing information.

The fix: Since we already have more than 20 CPER entries available in the GPU buffer, when we run the command from terminal #3 to start capturing from the beginning and pass 20 buffers to copy entries to, the C++ API returns a code saying there is more data available. The Python CLI should not treat this as an error, but should continue to print what the API returned.

---------

Signed-off-by: Oosman Saeed <[email protected]>
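
The fix described in this commit message can be illustrated with a minimal Python sketch. It is not the amd-smi code: `Status`, `drain_entries`, and `make_fake_fetch` are hypothetical names invented here, and the fake fetch function only stands in for the C++ API, whose real signature and return codes are not shown on this page. The sketch assumes the API reports a distinct "more data available" status when the caller's 20 buffers fill up, and shows the caller keeping the returned entries instead of discarding them as a failure.

```python
from enum import Enum, auto


class Status(Enum):
    """Hypothetical status codes; the real amdsmi codes are not shown in the source."""
    SUCCESS = auto()
    MORE_DATA = auto()  # the supplied buffers were filled and entries remain
    ERROR = auto()


def drain_entries(fetch, cursor=0, batch_size=20):
    """Collect entries, treating MORE_DATA as partial success rather than an error."""
    collected = []
    while True:
        status, entries, cursor = fetch(cursor, batch_size)
        if status is Status.ERROR:
            raise RuntimeError("fetch failed")
        # Keep whatever was returned, even if the API says more is pending.
        collected.extend(entries)
        if status is Status.SUCCESS:
            return collected, cursor


def make_fake_fetch(total):
    """Stand-in for the C++ API (assumption): serves `total` entries in batches."""
    def fetch(cursor, batch_size):
        end = min(cursor + batch_size, total)
        entries = list(range(cursor, end))
        status = Status.MORE_DATA if end < total else Status.SUCCESS
        return status, entries, end
    return fetch


if __name__ == "__main__":
    entries, cursor = drain_entries(make_fake_fetch(45))
    print(len(entries), cursor)
```

With 45 pending entries and a batch size of 20, the first two calls report `MORE_DATA` and the third reports `SUCCESS`; a caller that treated `MORE_DATA` as an error would print nothing, which matches the symptom seen in terminal #3.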

ammallya pushed a commit that referenced this pull request on Nov 18, 2025
The bug was reproduced like this.

In terminal #1, run command:

    sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

In terminal #2, inject errors:

    while true; do sudo amdgpuras -b 7 -s 1 -m 6 -t 2; sleep 2; done

The terminal #1 starts dumping cper entry information that it captures. After 20 entries have been captured, open terminal #3 and run same command as terminal #1:

    sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

From terminal #3, there will be no output, even when terminal #1 continues capturing and printing information.

The fix: Since we already have more than 20 CPER entries available in the GPU buffer, when we run the command from terminal #3 to start capturing from the beginning and pass 20 buffers to copy entries to, the C++ API returns a code saying there is more data available. The Python CLI should not treat this as an error, but should continue to print what the API returned.

---------

Signed-off-by: Oosman Saeed <[email protected]>

[ROCm/amdsmi commit: 5b95d22]

ammallya pushed a commit that referenced this pull request on Nov 21, 2025
The bug was reproduced like this.

In terminal #1, run command:

    sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

In terminal #2, inject errors:

    while true; do sudo amdgpuras -b 7 -s 1 -m 6 -t 2; sleep 2; done

The terminal #1 starts dumping cper entry information that it captures. After 20 entries have been captured, open terminal #3 and run same command as terminal #1:

    sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

From terminal #3, there will be no output, even when terminal #1 continues capturing and printing information.

The fix: Since we already have more than 20 CPER entries available in the GPU buffer, when we run the command from terminal #3 to start capturing from the beginning and pass 20 buffers to copy entries to, the C++ API returns a code saying there is more data available. The Python CLI should not treat this as an error, but should continue to print what the API returned.

---------

Signed-off-by: Oosman Saeed <[email protected]>

[ROCm/amdsmi commit: 5b95d22]

jharryma pushed a commit that referenced this pull request on Jan 7, 2026
#2349)

* [RDC] Optimize RDC counter sampling with greedy packing algorithm (#1590)

* Optimize RDC counter sampling with greedy packing algorithm

This change significantly reduces the number of rocprofiler-sdk sample calls by implementing a greedy packing algorithm that groups multiple counters into the minimal number of hardware profiles.

Key improvements:
- Implement greedy packing algorithm to combine counters into minimal profiles
- Add ProfileSet structure to manage packed counter configurations
- Cache packed profile sets for reuse across queries
- Group telemetry field requests by GPU for bulk processing
- Reduce sample calls by ~35% (from 100 to 65 for typical workloads)

Performance impact:
- 13 counters now packed into 3 profiles (77% compression)
- Reduces overhead from profile creation and context switching
- More efficient utilization of hardware counter resources

Implementation details:
- Added create_profiles_for_counters() using greedy algorithm
- Added sample_counters_with_packing() for bulk sampling
- Modified telemetry layer to use rocp_lookup_bulk()
- Preserves all field transformations and special handling

Testing shows successful packing with expected performance gains. No functional changes to external APIs or behavior.

Co-Authored-By: Ben Welton <[email protected]>

* Address PR review feedback

This commit addresses all review comments from the initial PR:

1. Fix division by zero risk in debug logging
   - Added check for empty counters vector before calculating compression ratio
   - Avoids potential division by zero when logging profile creation stats
2. Improve thread safety for statistics tracking
   - Changed static uint64_t to std::atomic<uint64_t> for thread-safe counters
   - Prevents race conditions in multi-threaded sampling scenarios
3. Remove unused variable
   - Removed unused profile_index variable that was incremented but never used
   - Cleaned up dead code
4. Clean up code formatting
   - Removed extra blank lines for consistency
   - Applied formatting fixes across modified files
5. Refactor code duplication between rocp_lookup and rocp_lookup_bulk
   - Created apply_field_transformation() helper function
   - Eliminates ~70 lines of duplicated switch statement logic
   - Centralizes field transformation logic in single location
   - Makes future maintenance easier
6. Document non-rocprofiler metrics handling
   - Added comments explaining how bulk lookup handles special cases
   - Clarifies that non-profiler fields like KFD_ID are handled in transformation

All changes maintain backward compatibility and pass compilation.

Co-Authored-By: Ben Welton <[email protected]>

---------

Co-authored-by: Ben Welton <[email protected]>
Co-authored-by: Adam Pryor <[email protected]>

* [rdc] maintain counter cache per agent

---------

Co-authored-by: Benjamin Welton <[email protected]>
Co-authored-by: Ben Welton <[email protected]>
Co-authored-by: Mythreya <[email protected]>
Co-authored-by: chiranjeevi-amd <[email protected]>
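
The greedy packing idea in this commit message can be sketched in Python. This is only an illustration under stated assumptions, not the RDC implementation (which is not shown on this page): `pack_counters` and `compression_ratio` are hypothetical names, and the capacity model (each counter occupies one slot in a named hardware block, and each profile has a fixed number of slots per block) is an assumption. The block names and capacities in the usage example are made up so that the totals match the 13-counters-into-3-profiles figure quoted above.

```python
from collections import Counter as Tally


def pack_counters(counters, block_capacity):
    """First-fit greedy packing of (name, block) counters into profiles.

    Assumption: a profile can hold at most block_capacity[block] counters
    from any one hardware block. Returns a list of profiles, each a list
    of counter names.
    """
    profiles = []  # counter names per profile
    usage = []     # slots used per block, parallel to `profiles`
    for name, block in counters:
        capacity = block_capacity[block]
        if capacity < 1:
            raise ValueError(f"block {block!r} has no usable slots")
        for members, used in zip(profiles, usage):
            if used[block] < capacity:
                # The first existing profile with a free slot takes the counter.
                members.append(name)
                used[block] += 1
                break
        else:
            # No existing profile has room, so open a new one.
            profiles.append([name])
            usage.append(Tally({block: 1}))
    return profiles


def compression_ratio(num_counters, num_profiles):
    """Fraction of sample calls saved; guards against an empty counter list."""
    if num_counters == 0:
        return 0.0
    return 1.0 - num_profiles / num_counters


if __name__ == "__main__":
    # Illustrative data only: 12 counters on block "A" (4 slots per profile)
    # and 1 counter on block "B" (2 slots per profile).
    counters = [(f"a{i}", "A") for i in range(12)] + [("b0", "B")]
    profiles = pack_counters(counters, {"A": 4, "B": 2})
    ratio = compression_ratio(len(counters), len(profiles))
    print(len(profiles), f"{ratio:.0%}")
```

First-fit is a heuristic and is not guaranteed to find the fewest possible profiles in general. The empty-input guard in `compression_ratio` mirrors the division-by-zero fix listed under the review feedback.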

ammallya pushed a commit that referenced this pull request on Jan 21, 2026
silence warnings in functional testsuite

ammallya pushed a commit that referenced this pull request on Jan 21, 2026
silence warnings in functional testsuite [ROCm/rocshmem commit: a9f2eff]
No description provided.