[doc] feat: Add FLOPs calculator and FP8/FP4 dequantization guidance to adding-model-support skill #2934

Open
cuichenx wants to merge 1 commit into yuya/public-skills from chcui/skill-flops-dequant-guidance

Conversation

@cuichenx
Contributor

Summary

  • Adds Step 4 — Check for quantized weights (FP8 / FP4) to the Discovery phase of the adding-model-support skill. Documents the silent failure mode (the bridge loads raw quantized values → broken model with no error) and two fix approaches: a standalone dequant script and an in-bridge maybe_modify_loaded_hf_weight() hook.
  • Adds an Update FLOPs calculator for new architectural blocks section to Phase 2. Covers when and how to update flop_utils.py for new blocks (GDN, MTP, Mamba, novel MoE), referencing PR #2925 ([perf] feat: add GDN (Gated DeltaNet) FLOPs calculator) as the canonical example.
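To make the dequantization fix concrete, here is a minimal sketch of the in-bridge hook approach, using numpy with simulated quantized values. Real FP8/FP4 checkpoints store torch float8 tensors, and the hook name, signature, and `<name>_scale` key convention below are assumptions for illustration, not the bridge's actual API:

```python
import numpy as np

def maybe_modify_loaded_hf_weight(name, tensors):
    """Illustrative in-bridge hook: if a weight has a companion
    '<name>_scale' entry in the checkpoint, dequantize it to full
    precision; otherwise pass it through unchanged.

    Quantized values are simulated here as small floats; a real
    checkpoint would hold torch float8 (FP8) or packed FP4 data.
    """
    scale_key = f"{name}_scale"
    if scale_key not in tensors:
        return tensors[name]  # not quantized, pass through unchanged
    quantized = tensors[name].astype(np.float32)
    scale = tensors[scale_key].astype(np.float32)
    # Broadcasting handles per-tensor or per-channel scales alike.
    return quantized * scale

# Simulated checkpoint: quantized weight plus a per-tensor scale.
ckpt = {
    "mlp.up_proj.weight": np.array([[2.0, -4.0], [8.0, 16.0]]),
    "mlp.up_proj.weight_scale": np.array(0.5, dtype=np.float32),
}
w = maybe_modify_loaded_hf_weight("mlp.up_proj.weight", ckpt)
# Loading the raw values (no scale applied) would silently produce
# weights off by a factor of 1/scale -- the failure mode above.
```

Skipping the hook raises no error: the model loads, but every quantized matrix is wrong by its scale factor, which is exactly why the skill flags this as a silent failure.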

Test plan

…to adding-model-support skill

- Step 4 (Discovery): Check for quantized weights (FP8/FP4) that silently
  break models without dequantization. Documents standalone script and
  in-bridge hook approaches.
- Phase 2: Update FLOPs calculator when new architectural blocks (GDN, MTP,
  Mamba) differ from standard attention/MLP. References PR #2925 as example.

Signed-off-by: Chen Cui <[email protected]>
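The FLOPs-calculator point above can be sketched as follows. These are hypothetical helper functions in the spirit of flop_utils.py (the names and formulas are illustrative assumptions, not the repo's actual API), showing why a linear-time block such as GDN or Mamba needs its own term rather than reusing the standard attention formula:

```python
def attention_flops(seq_len, hidden, num_layers=1):
    """Standard self-attention forward FLOPs per layer: QKV plus output
    projections (8*s*h^2) and the score/value matmuls (4*s^2*h)."""
    return num_layers * (8 * seq_len * hidden**2 + 4 * seq_len**2 * hidden)

def mlp_flops(seq_len, hidden, ffn_hidden, num_layers=1):
    """Two dense projections (h -> ffn and ffn -> h), 2 FLOPs per MAC."""
    return num_layers * (4 * seq_len * hidden * ffn_hidden)

def linear_attention_flops(seq_len, hidden, num_layers=1):
    """Illustrative linear-time mixer (a GDN/Mamba-style recurrence):
    cost grows with s, not s^2, so the attention formula overcounts it."""
    return num_layers * (8 * seq_len * hidden**2)

s, h = 4096, 1024
print(attention_flops(s, h) > linear_attention_flops(s, h))  # prints True
```

The gap between the two formulas is the 4\*s^2\*h quadratic term, which is why a model mixing GDN or Mamba blocks with standard attention gets a misleading MFU number unless each block type contributes its own term.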
