@humansand

## Miles

- MXFP8 & NVFP4
  - [x] https://github.com/radixark/miles/pull/614
  - [x] https://github.com/radixark/miles/issues/567
  - [ ] https://github.com/radixark/miles/pull/919 (`--fp8-param-gather` for mxfp8)
- MXFP8
  - [x] https://github.com/radixark/miles/pull/512
  - [ ] https://github.com/radixark/miles/pull/963
- NVFP4
  - [ ] https://github.com/radixark/miles/pull/546
  - [x] https://github.com/radixark/miles/pull/907
  - [ ] https://github.com/radixark/miles/pull/1054

## SGLang

- MXFP8 & NVFP4
  - [x] https://github.com/sgl-project/sglang/pull/20214 (`flashinfer_trtllm_routed` moe backend)
- MXFP8
  - [x] https://github.com/sgl-project/sglang/pull/17449
  - [x] https://github.com/sgl-project/sglang/pull/18742
  - [ ] https://github.com/sgl-project/sglang/pull/17294
  - [x] https://github.com/sgl-project/sglang/pull/19537
  - [x] https://github.com/sgl-project/sglang/pull/21280
  - [x] https://github.com/sgl-project/sglang/pull/21576
  - [x] https://github.com/sgl-project/sglang/pull/22484
- NVFP4
  - [ ] https://github.com/sgl-project/sglang/pull/18012
  - [x] https://github.com/sgl-project/sglang/pull/18085
  - [x] https://github.com/sgl-project/sglang/pull/22204
  - [ ] https://github.com/sgl-project/sglang/pull/22918

## TransformerEngine

- MXFP8 & NVFP4
  - [x] https://github.com/NVIDIA/TransformerEngine/pull/2644 (`NVTE_BACKWARD_OVERRIDE=high_precision|dequantized`)
  - [ ] https://github.com/NVIDIA/TransformerEngine/pull/2865
- NVFP4
  - [ ] https://github.com/NVIDIA/TransformerEngine/pull/2931
  - [ ] CuTe DSL kernels in cuDNN frontend

## FlashInfer

- MXFP8
  - [x] https://github.com/flashinfer-ai/flashinfer/pull/2581 (`cutlass_fused_moe` mxfp8)
- NVFP4
  - [ ] https://github.com/flashinfer-ai/flashinfer/pull/3027