Before we do anything else, the key thing I think we need here is to introduce a new "GPU" table and a "ugpu" concept. Multi-machine traces are very much a thing now, and if we're going to do this properly we should follow the example of ucpu and model this in a way that supports them. The only reason we didn't bother until now is that our GPU support was very minimal and just enough to get by. But now, with CLs like this, I would take the time to model this properly.
Add gpu_id dimension to render stage tracks in the trace processor,
matching what gpu counters already do. In the UI, extract the gpu_id
dimension and create per-GPU sub-groups within each track group when
multiple GPUs are present. This follows the pattern already used by
the GPU Frequency plugin, which already handles multiple GPUs by
creating per-GPU tracks under a shared "GPU Frequency" sub-group.
Single GPU (unchanged):
  GPU
    Gpu 0 Frequency      [_____----____--]
    Counters
      CounterX           [____------__--_]
    Hardware Queues
      Queue 0            [event1] [event2]
    GPU Memory           [____-----------]
Multiple GPUs:
  GPU
    GPU Frequency
      Gpu 0 Frequency    [_____----____--]
      Gpu 1 Frequency    [___----_-------]
    Counters
      GPU 0 Counters
        CounterX         [____------__--_]
        CounterY         [__----_____--__]
      GPU 1 Counters
        CounterX         [__--------__--_]
        CounterY         [__----__-__--__]
    Hardware Queues
      GPU 0 Hardware Queues
        Queue 0          [event1] [event2]
        Queue 1          [event5] [event6]
      GPU 1 Hardware Queues
        Queue 0          [event7] [event8]
        Queue 1          [event0] [event1]
    GPU Memory           [____-----------]
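The grouping rule described above (per-GPU sub-groups only when more than one GPU is present) can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the `GpuTrack` and `TrackNode` shapes and the `buildGpuTree` function are hypothetical and do not correspond to the Perfetto UI's actual types.

```typescript
// Hypothetical track shape; not the Perfetto UI's actual types.
interface GpuTrack {
  name: string;
  group: string; // e.g. "Counters" or "Hardware Queues"
  gpuId: number; // value of the gpu_id dimension
}

// Hypothetical tree node used only for this illustration.
interface TrackNode {
  name: string;
  children: TrackNode[];
}

function buildGpuTree(tracks: GpuTrack[]): TrackNode {
  const root: TrackNode = {name: 'GPU', children: []};

  // Sub-groups are only created when more than one gpu_id is seen.
  const multiGpu = new Set(tracks.map((t) => t.gpuId)).size > 1;

  const groups = new Map<string, TrackNode>();
  const subGroups = new Map<string, TrackNode>();

  for (const track of tracks) {
    // Find or create the top-level group, e.g. "Counters".
    let group = groups.get(track.group);
    if (group === undefined) {
      group = {name: track.group, children: []};
      groups.set(track.group, group);
      root.children.push(group);
    }

    let parent = group;
    if (multiGpu) {
      // Find or create the per-GPU sub-group, e.g. "GPU 0 Counters".
      const key = `${track.group}/${track.gpuId}`;
      let sub = subGroups.get(key);
      if (sub === undefined) {
        sub = {name: `GPU ${track.gpuId} ${track.group}`, children: []};
        subGroups.set(key, sub);
        group.children.push(sub);
      }
      parent = sub;
    }
    parent.children.push({name: track.name, children: []});
  }
  return root;
}
```

With a single GPU the tracks sit directly under their group, so existing traces keep their current layout; the extra nesting level only appears once a second `gpu_id` shows up.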
I'll take a look at ucpu. I very much want to support the multi-machine case and was planning to tackle that next, but I can look at what we need there first instead. Note that I changed this a bit in the last version of this PR. It's now just the minimal changes needed so that counters and render stages are not completely broken when you have multiple GPUs. I left the GPU Frequency logic unchanged, as it already has multi-GPU support, and I tried to make counters and render stages as consistent with it as possible. Maybe now that this is simpler we can land it before we tackle the multi-machine use case? Either way is fine with me. Something like this would be nice to have temporarily if the multi-machine use case takes some time to get right, though.
// For multi-GPU traces, create per-GPU sub-groups within each
// group (e.g., "GPU 0 Counters" inside "Counters").
if (
I would not pollute the trace processor track plugin with custom code like this. If you need this, you should remove it from this plugin and add it to a new plugin (or to an existing GPU plugin) instead.
Sounds good. We already have a plugin for GPU Frequency. I'll see how things look if I add one for GPU Counters and one for GPU Hardware Queues.
I'm fine to land something temporary if you're going to follow up with a proper multi-machine suitable solution based on ugpu.
I'll follow up with multi-machine support for sure, as that's critical to me. I took a brief look at ucpu and I'm not completely convinced that we need to introduce a ugpu concept. Maybe it makes sense longer term if we end up building a lot of things that need to be aware of the specific GPU, but it might be simpler to just require code that needs that awareness to use gpu_id and machine_id together as the key. I'll try both ways and we can decide.
The problem with having dual keys is that they are significantly less efficient, especially when doing joins between tables.
Ack, I'll focus on doing this using a new ugpu then.
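To make the single-key idea concrete, here is a rough sketch of how a (machine_id, gpu_id) pair could be interned into one dense ugpu value, so that other tables carry and join on a single integer rather than a composite key. Everything here is an assumption for illustration: the `GpuTable` class, the `GpuRow` shape, and the column names are hypothetical and are not the trace processor's actual schema or API.

```typescript
// Hypothetical row shape for a "gpu" table; column names are assumptions.
interface GpuRow {
  ugpu: number; // unique id across all machines in the trace
  machineId: number;
  gpuId: number; // id as reported by that machine
}

class GpuTable {
  private rows: GpuRow[] = [];
  private index = new Map<string, number>();

  // Returns the ugpu for (machineId, gpuId), allocating one on first use.
  intern(machineId: number, gpuId: number): number {
    const key = `${machineId}:${gpuId}`;
    const existing = this.index.get(key);
    if (existing !== undefined) return existing;

    // ugpu doubles as the row index, so lookups by ugpu are O(1).
    const ugpu = this.rows.length;
    this.rows.push({ugpu, machineId, gpuId});
    this.index.set(key, ugpu);
    return ugpu;
  }

  // Resolves a ugpu back to its machine and per-machine GPU id.
  lookup(ugpu: number): GpuRow | undefined {
    return this.rows[ugpu];
  }
}
```

In this sketch the composite key is resolved once, when a track or event is first recorded, and every later join or group-by compares one integer column.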
Bug: #5097