[rank0]: AssertionError
[rank3]: Traceback (most recent call last):
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/cli/_megatron/rlhf.py", line 7, in <module>
[rank3]:     megatron_rlhf_main()
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/pipelines/train/rlhf.py", line 73, in megatron_rlhf_main
[rank3]:     return MegatronRLHF(args).main()
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/pipelines/base.py", line 52, in main
[rank3]:     result = self.run()
[rank3]:              ^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/pipelines/train/sft.py", line 70, in run
[rank3]:     trainer = self.prepare_trainer()
[rank3]:               ^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/pipelines/train/rlhf.py", line 34, in prepare_trainer
[rank3]:     return trainer_cls(args, self.template, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/trainers/gkd_trainer.py", line 65, in __init__
[rank3]:     super().__init__(args, template)
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/trainers/base.py", line 70, in __init__
[rank3]:     self.prepare_model()
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/trainers/gkd_trainer.py", line 86, in prepare_model
[rank3]:     super().prepare_model()
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/trainers/rlhf_mixin.py", line 30, in prepare_model
[rank3]:     super().prepare_model()
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/trainers/base.py", line 186, in prepare_model
[rank3]:     self.unwrapped_models = get_mcore_model(args, self.template.config)
[rank3]:                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/ms-swift-v4.2.1/swift/megatron/model/utils.py", line 82, in get_mcore_model
[rank3]:     models = _get_mcore_model(config)
[rank3]:              ^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/mcore-bridge/src/mcore_bridge/model/register.py", line 178, in get_mcore_model
[rank3]:     model = loader.build_model(pre_process=pre_process, post_process=post_process)
[rank3]:             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/ma-user/work/mcore-bridge/src/mcore_bridge/model/gpts/qwen3_next_gdn.py", line 141, in build_model
[rank3]:     assert hasattr(layer.self_attention.out_norm, 'zero_centered_gamma')
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: AssertionError
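The failing assertion at the bottom of the traceback checks that each layer's `self_attention.out_norm` has a `zero_centered_gamma` attribute. To narrow this down, it may help to list which layers fail the check and which norm class they actually use. The snippet below is only a rough sketch of such a diagnostic: `find_layers_missing_attr`, `NormWithGamma`, `NormWithoutGamma`, and `make_layer` are hypothetical names made up for illustration, and the stand-in objects only mimic the attribute path from the assertion. It is not code from mcore-bridge or ms-swift.

```python
from types import SimpleNamespace


def find_layers_missing_attr(layers, attr="zero_centered_gamma"):
    """Return (index, norm class name) for each layer whose out_norm lacks `attr`."""
    missing = []
    for idx, layer in enumerate(layers):
        # Same attribute path as the failing assertion in the traceback.
        out_norm = layer.self_attention.out_norm
        if not hasattr(out_norm, attr):
            missing.append((idx, type(out_norm).__name__))
    return missing


# Hypothetical stand-ins, for illustration only. In a real run the
# iterable would be the model's decoder layers (an assumption here).
class NormWithGamma:
    zero_centered_gamma = True


class NormWithoutGamma:
    pass


def make_layer(norm):
    """Build a fake layer exposing `self_attention.out_norm`."""
    return SimpleNamespace(self_attention=SimpleNamespace(out_norm=norm))


layers = [make_layer(NormWithGamma()), make_layer(NormWithoutGamma())]
print(find_layers_missing_attr(layers))  # [(1, 'NormWithoutGamma')]
```

If the reported class name differs between the MindSpeed versions, that would suggest the norm implementation was swapped for one that does not expose this attribute. That is only a guess until verified on the real model.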
Bug Description
Following PR #9382, we manually updated MindSpeed to 0.16.0 (https://gitcode.com/Ascend/MindSpeed/tree/core_r0.16.0) and ran GRPO training with Megatron + MindSpeed. The ms-swift version is 4.2.1. Training fails with the AssertionError shown in the traceback above.
How to Reproduce
Qwen3.5 GKD/OPSD training
Additional Information
No response