Feat/per thread timing dist #517

Draft
MoefulYe wants to merge 3 commits into eunomia-bpf:master from MoefulYe:feat/per-thread-timing-dist

@MoefulYe

This pull request introduces a new GPU per-thread timing distribution demo under the example/gpu/per-thread-timing-dist directory. The demo includes a CUDA kernel that runs a synthetic workload, an eBPF program that collects per-thread timing data, and a user-space application to display timing distributions and summary statistics. The changes provide a complete workflow for measuring and visualizing the runtime distribution of GPU threads using eBPF.

New Demo Implementation:

  • Added a Makefile to automate building the CUDA kernel, eBPF program, and user-space application, including support for third-party dependencies and environment setup.
  • Introduced the timed_work_kernel.cu CUDA kernel that runs a synthetic workload with varying iterations per thread to generate a non-trivial timing distribution.

eBPF Program for Timing Collection:

  • Implemented timing-dist.bpf.c eBPF program to trace entry/exit of the CUDA kernel, record per-thread start times, and update a log2-based histogram and summary statistics using custom GPU helper functions.
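A minimal sketch of the log2 histogram and summary update described above, written as plain host-side C rather than eBPF. The struct layout, the `MAX_SLOTS` value, and the names `timing_hist`, `log2_slot`, and `hist_record` are assumptions made for illustration. They are not taken from timing-dist.bpf.c, and the custom GPU helper functions the PR mentions are not modeled here.

```c
#include <stdint.h>

#define MAX_SLOTS 32 /* hypothetical bucket count, not from the PR */

/* Hypothetical layout: one counter per power-of-two bucket plus a running summary. */
struct timing_hist {
    uint64_t slots[MAX_SLOTS];
    uint64_t count;
    uint64_t sum_ns;
    uint64_t min_ns;
    uint64_t max_ns;
};

/* Returns floor(log2(v)) for v > 0; v == 0 falls into bucket 0. */
static unsigned log2_slot(uint64_t v)
{
    unsigned slot = 0;
    while (v > 1) {
        v >>= 1;
        slot++;
    }
    return slot;
}

/* Records one per-thread duration (exit time minus entry time). */
static void hist_record(struct timing_hist *h, uint64_t delta_ns)
{
    unsigned slot = log2_slot(delta_ns);
    if (slot >= MAX_SLOTS)
        slot = MAX_SLOTS - 1; /* clamp very long durations into the last bucket */
    h->slots[slot]++;

    if (h->count == 0 || delta_ns < h->min_ns)
        h->min_ns = delta_ns;
    if (delta_ns > h->max_ns)
        h->max_ns = delta_ns;
    h->count++;
    h->sum_ns += delta_ns;
}
```

With this layout, bucket `i` counts durations in the range 2^i to 2^(i+1) - 1 nanoseconds, and the mean is `sum_ns / count` whenever `count` is non-zero.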

User-space Application for Visualization:

  • Added timing-dist.c user-space program that attaches the eBPF program, periodically prints the timing distribution histogram and summary statistics, and handles graceful termination via signals.
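The printing and termination logic could look roughly like the sketch below. It assumes the histogram has already been read into a plain array of at most 32 bucket counters. The names `exiting`, `on_signal`, `bar_len`, and `print_log2_hist` are hypothetical, and the code that attaches the eBPF program and reads its maps is deliberately left out.

```c
#include <signal.h>
#include <stdint.h>
#include <stdio.h>

/* Set from the signal handler so the main loop can exit cleanly. */
static volatile sig_atomic_t exiting = 0;

static void on_signal(int signo)
{
    (void)signo;
    exiting = 1;
}

/* Scales a bucket count to a bar of at most `width` characters. */
static int bar_len(uint64_t val, uint64_t max, int width)
{
    if (max == 0)
        return 0;
    return (int)(val * (uint64_t)width / max);
}

/* Prints one row per log2 bucket, up to the last non-empty bucket.
 * Assumes nslots <= 32 so the shifts below cannot overflow. */
static void print_log2_hist(const uint64_t *slots, int nslots, int width)
{
    uint64_t max = 0;
    int last = -1;

    for (int i = 0; i < nslots; i++) {
        if (slots[i] > max)
            max = slots[i];
        if (slots[i] > 0)
            last = i;
    }

    for (int i = 0; i <= last; i++) {
        uint64_t lo = 1ULL << i;             /* bucket 0 also holds zero-length samples */
        uint64_t hi = (1ULL << (i + 1)) - 1;
        int n = bar_len(slots[i], max, width);

        printf("%10llu -> %-10llu : %-8llu |",
               (unsigned long long)lo, (unsigned long long)hi,
               (unsigned long long)slots[i]);
        for (int j = 0; j < width; j++)
            putchar(j < n ? '*' : ' ');
        printf("|\n");
    }
}
```

A caller would typically register the handler with `signal(SIGINT, on_signal)` and `signal(SIGTERM, on_signal)`, then loop while `exiting` is zero, sleeping for an interval and calling `print_log2_hist` on a fresh snapshot each time.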

Project Structure and Build Artifacts:

  • Updated .gitignore to exclude build outputs and binaries for the new demo.
