GroupedGemm: MXFP8 via cuBLAS #2454

@ptrendx

Description

Develop GroupedGemm support for the MXFP8 quantization format using cuBLAS. The implementation should handle grouped data types correctly, batch the per-group GEMMs efficiently, and integrate with device-side MoE workflows.
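
To make the expected numerics concrete, here is a minimal CPU reference sketch of a grouped GEMM over MXFP8-style operands. It does not call cuBLAS and is not the proposed implementation; it only illustrates what a test could compare against. It assumes E4M3 elements, blocks of 32 values along K sharing one power-of-two scale, tokens already sorted by expert, and weights stored as N x K. All function names are hypothetical.

```python
import math

BLOCK = 32        # elements sharing one scale along K (MX block size)
E4M3_MAX = 448.0  # largest finite FP8 E4M3 magnitude
E4M3_EMAX = 8     # exponent of the top E4M3 binade (448 = 1.75 * 2**8)


def round_e4m3(x):
    """Round x to the nearest E4M3-representable value, saturating at +-448."""
    if x == 0.0:
        return 0.0
    a = min(abs(x), E4M3_MAX)
    # floor(log2(a)), clamped so small values use the subnormal spacing.
    e = max(math.frexp(a)[1] - 1, -6)
    step = 2.0 ** (e - 3)  # 3 mantissa bits
    return math.copysign(min(round(a / step) * step, E4M3_MAX), x)


def mxfp8_roundtrip(row):
    """Quantize one row along K to block-scaled FP8, then dequantize it."""
    out = []
    for start in range(0, len(row), BLOCK):
        block = row[start:start + BLOCK]
        amax = max(abs(v) for v in block)
        # Power-of-two shared scale; clamping to the scale format's range is omitted.
        scale = 2.0 ** (math.frexp(amax)[1] - 1 - E4M3_EMAX) if amax > 0 else 1.0
        out.extend(round_e4m3(v / scale) * scale for v in block)
    return out


def grouped_gemm_reference(a_rows, weights_t, group_sizes):
    """Compute out[i] = a_rows[i] @ weights_t[g].T for each row i in group g.

    a_rows:      list of length-K rows, sorted so each group's rows are contiguous
    weights_t:   one N x K matrix per group (for MoE, one per expert)
    group_sizes: number of rows belonging to each group
    """
    out, first = [], 0
    for w_t, m in zip(weights_t, group_sizes):
        w_q = [mxfp8_roundtrip(w_row) for w_row in w_t]
        for a in a_rows[first:first + m]:
            a_q = mxfp8_roundtrip(a)
            out.append([sum(x * y for x, y in zip(a_q, w_row)) for w_row in w_q])
        first += m
    return out


# Two experts, K = 32, N = 2: one token routed to expert 0, two to expert 1.
tokens = [[1.0] * 32 for _ in range(3)]
experts = [[[2.0] * 32] * 2, [[0.5] * 32] * 2]
result = grouped_gemm_reference(tokens, experts, [1, 2])
```

The loop over groups is the part a device-side grouped GEMM would replace with a single launch; the group sizes here stand in for the per-expert token counts that an MoE router produces.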
