Description
Hello, I want to run SFT on Kimi K2.5, but some components and scripts appear to be missing.
- Training logic is missing in the K2.5 MoE block.
  Only inference logic is implemented in `DeepseekV3MoE` and `MoEGate`.
  https://huggingface.co/moonshotai/Kimi-K2.5/blob/main/modeling_deepseek.py#L456
  https://huggingface.co/moonshotai/Kimi-K2.5/blob/main/modeling_deepseek.py#L536
- There is no BF16 checkpoint for training.
  The Kimi-K2.5 base model is released in INT4 format, and no script is provided for converting INT4 to BF16.
  https://huggingface.co/moonshotai/Kimi-K2.5/tree/main
  https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/SFT_Installation_Guide_KimiK2.5.md#12-convert-int4--bf16
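To make the second item concrete, here is roughly the arithmetic I would expect an INT4 to BF16 conversion script to perform. It is only a sketch under assumptions I have not verified against the released checkpoint: signed 4-bit values, two per byte with the low nibble first, and one scale per group of weights. The names `unpack_int4`, `dequantize`, and `group_size` are hypothetical, and the real packing layout may differ. The result is plain Python floats; the final cast to BF16 would happen in the tensor library.

```python
def unpack_int4(packed: bytes) -> list[int]:
    """Split each byte into two signed 4-bit values (low nibble first; assumed order)."""
    values = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            # Map the unsigned nibble 0..15 to the signed range -8..7 (two's complement).
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values


def dequantize(packed: bytes, scales: list[float], group_size: int) -> list[float]:
    """Multiply each group of `group_size` int4 values by that group's scale."""
    values = unpack_int4(packed)
    if len(values) != len(scales) * group_size:
        raise ValueError("number of scales does not match number of packed values")
    return [v * scales[i // group_size] for i, v in enumerate(values)]


# Two bytes hold four int4 values, split here into two groups of two.
weights = dequantize(bytes([0x21, 0xF8]), scales=[0.5, 0.25], group_size=2)
print(weights)
```

An official script would additionally need to read the actual tensor names, packing order, and scale or zero-point layout from the checkpoint, which is the part I cannot reconstruct on my own.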
Could you help me solve the problems mentioned above? Will the official training scripts be open-sourced in the future?