bugfix: make CP compatible with MTP#1150
shifengmin wants to merge 3 commits into jd-opensource:release/v0.9.0 from
Conversation
Code Review
This pull request introduces support for Multi-Token Prediction (MTP) combined with Context Parallelism (CP) by adding mtp_shifted_token_ids to the input pipeline and establishing a dedicated cp_group for runtime collectives. Feedback indicates that the logic for constructing shifted token IDs is incorrect when padding is involved, and the cp_group initialization may lead to incorrect group memberships. Additionally, the embedding gather logic in the worker implementation is flawed for CP+MTP scenarios, and synchronous CPU transfers in the scheduler thread should be avoided to prevent performance degradation.
b99ebcf to b0a9efb
state_.mrope_positions_vec.reserve(sequences.size());
state_.block_tables_vec.reserve(sequences.size());
state_.acc_logprob_vec.reserve(sequences.size());
state_.mtp_shifted_token_ids.reserve(1000);
replaced with a pre-defined const
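The fix described above can be sketched as follows; the constant name is hypothetical, since the actual identifier chosen in the PR is not shown here:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical named constant replacing the magic number 1000 flagged in review.
constexpr std::size_t kMtpShiftedTokenIdsReserve = 1000;

// Reserve capacity up front so repeated inserts during input preparation
// do not trigger reallocations; returns the resulting capacity.
std::size_t ReservedCapacity() {
  std::vector<int> mtp_shifted_token_ids;
  mtp_shifted_token_ids.reserve(kMtpShiftedTokenIdsReserve);
  return mtp_shifted_token_ids.capacity();
}
```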
CHECK(input_params.mtp_shifted_token_ids.defined());
CHECK_EQ(input_params.mtp_shifted_token_ids.numel(),
         prefill_input.token_ids.numel());
prefill_input.token_ids = input_params.mtp_shifted_token_ids.clone();
unnecessary, removed
                         state.extra_token_ids.end());
state_.mtp_shifted_token_ids.insert(state_.mtp_shifted_token_ids.end(),
                                    state.mtp_shifted_token_ids.begin(),
                                    state.mtp_shifted_token_ids.end());
What's the difference between extra_token_ids and mtp_shifted_token_ids?
mtp_shifted_token_ids: the token sequence left-shifted by one position and padded with -1, following the MTP prefill input preparation rules.
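The shifting rule described in this reply can be sketched as below; the helper name and the plain-vector signature are hypothetical and not taken from the PR:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: left-shift a sequence's token ids by one and pad the
// tail with -1, per the MTP prefill input preparation rule described above.
// Position i of the output holds the token that position i should predict.
std::vector<int64_t> BuildMtpShiftedTokenIds(
    const std::vector<int64_t>& token_ids) {
  if (token_ids.empty()) return {};
  std::vector<int64_t> shifted;
  shifted.reserve(token_ids.size());
  // Drop the first token so the sequence shifts left by one.
  shifted.insert(shifted.end(), token_ids.begin() + 1, token_ids.end());
  // Pad the tail with -1 so the length matches the original input.
  shifted.push_back(-1);
  return shifted;
}
```

For example, an input of `{5, 7, 9}` would yield `{7, 9, -1}`.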