UCT/CUDA/COPY: Qualify cuda_copy for RNDV with peer-failure EH #11319
Open
pentschev wants to merge 3 commits into openucx:master from
Advertise UCT_IFACE_FLAG_ERRHANDLE_PEER_FAILURE so ucp_wireup_fill_peer_err_criteria does not exclude cuda_copy from RMA BW lanes. Advertise UCT_MD_FLAG_INVALIDATE and UCT_MD_FLAG_INVALIDATE_RMA and handle UCT_MD_MEM_DEREG_FLAG_INVALIDATE in mem dereg (invoke completion), matching ucp_wireup_add_rma_bw_lanes and ucp_memh_dereg expectations.
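The dereg behavior described above (invoke the completion when the invalidate flag is passed) can be sketched as follows. This is a minimal illustration, not the cuda_copy code from the PR: every `sketch_*` type, function, and constant is a hypothetical stand-in, and the real UCT parameter structures and flag values are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the invalidate flag; not the UCT value. */
#define SKETCH_DEREG_FLAG_INVALIDATE 0x1u

/* Hypothetical completion object: a callback plus an invocation counter. */
typedef struct sketch_completion {
    void (*func)(struct sketch_completion *self);
    int  invoked;
} sketch_completion_t;

/* Hypothetical dereg parameters: flags, memory handle, completion. */
typedef struct {
    unsigned            flags;
    void                *memh;
    sketch_completion_t *comp;
} sketch_dereg_params_t;

void sketch_count_cb(sketch_completion_t *self)
{
    self->invoked++;
}

/* Returns 0 on success, -1 on invalid parameters. */
int sketch_mem_dereg(const sketch_dereg_params_t *params)
{
    if ((params == NULL) || (params->memh == NULL)) {
        return -1;
    }

    if (params->flags & SKETCH_DEREG_FLAG_INVALIDATE) {
        /* The caller waits for this completion, so it must be present. */
        if ((params->comp == NULL) || (params->comp->func == NULL)) {
            return -1;
        }
        /* Local copies keep no remote key state, so complete right away. */
        params->comp->func(params->comp);
    }

    return 0;
}

/* Helper: deregister a dummy region and report how often the callback ran. */
int sketch_dereg_demo(unsigned flags)
{
    int dummy_region             = 0;
    sketch_completion_t comp     = { sketch_count_cb, 0 };
    sketch_dereg_params_t params = { flags, &dummy_region, &comp };

    assert(comp.invoked == 0);
    return (sketch_mem_dereg(&params) == 0) ? comp.invoked : -1;
}
```

The point of the sketch is only the ordering: if the invalidate flag is set, the completion fires as part of dereg; without the flag, it is never touched.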
yosefe
requested changes
Apr 9, 2026
Comment on lines +107 to +109
/* UCT_IFACE_FLAG_ERRHANDLE_PEER_FAILURE required for RMA BW wireup
 * (ucp_wireup_fill_peer_err_criteria) when error handling is requested.
 * Transfers are local copies; UCP handles invalidation when a peer fails. */
Contributor
we should change UCP layer to not require peer failure or invalidate support when connecting endpoint to same worker, including mem type endpoints. instead of changing UCT.
@shasson5 can help if needed
Contributor
Author
When an endpoint is wired to the same UCP worker (same unpacked address UUID), there is no independent remote peer for cross-worker RMA. Skip requiring UCT_IFACE_FLAG_ERRHANDLE_PEER_FAILURE and UCT_MD_FLAG_INVALIDATE_RMA in RMA BW wireup criteria in that case. In ucp_request_get_invalidation_map(), return an empty invalidation map for UCP_EP_CONFIG_KEY_FLAG_SELF so RMA BW lanes without MD invalidate support (e.g. cuda_copy) remain valid for same-worker / memtype EPs with peer-failure error handling. Extend ucp_wireup_fill_peer_err_criteria() and ucp_wireup_fill_aux_criteria() with worker + unpacked address so auxiliary and other lane selection paths apply the same rule.
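The rule in this commit message can be sketched in isolation. The code below is an illustration under stated assumptions, not the UCP implementation: all `sketch_*` names and `SKETCH_*` bit values are hypothetical, the worker and unpacked address are reduced to a UUID, and the invalidation map is modeled as a plain 64-bit MD bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; the real UCX flags differ. */
#define SKETCH_INIT_ERR_MODE_PEER_FAILURE 0x1u
#define SKETCH_IFACE_FLAG_PEER_FAILURE    0x2u
#define SKETCH_EP_KEY_FLAG_SELF           0x1u

/* Hypothetical, reduced worker and unpacked-address types. */
typedef struct { uint64_t uuid; } sketch_worker_t;
typedef struct { uint64_t uuid; } sketch_unpacked_addr_t;

/* Non-zero when the address refers to the local worker. */
int sketch_is_same_worker(const sketch_worker_t *worker,
                          const sketch_unpacked_addr_t *addr)
{
    assert(worker != NULL);
    return (addr != NULL) && (addr->uuid == worker->uuid);
}

/* Iface flags to require for peer-failure error handling. */
unsigned sketch_peer_err_iface_flags(const sketch_worker_t *worker,
                                     const sketch_unpacked_addr_t *addr,
                                     unsigned ep_init_flags)
{
    if (!(ep_init_flags & SKETCH_INIT_ERR_MODE_PEER_FAILURE)) {
        return 0; /* error handling was not requested */
    }

    if (sketch_is_same_worker(worker, addr)) {
        return 0; /* no independent remote peer, relax the requirement */
    }

    return SKETCH_IFACE_FLAG_PEER_FAILURE;
}

/* MDs to invalidate: none for a self configuration, otherwise the given map. */
uint64_t sketch_invalidation_map(unsigned key_flags, uint64_t rma_bw_md_map)
{
    return (key_flags & SKETCH_EP_KEY_FLAG_SELF) ? 0 : rma_bw_md_map;
}

/* Helper: evaluate the criteria for a pair of UUIDs. */
unsigned sketch_required_flags_for(uint64_t local_uuid, uint64_t remote_uuid,
                                   unsigned ep_init_flags)
{
    sketch_worker_t worker      = { local_uuid };
    sketch_unpacked_addr_t addr = { remote_uuid };

    return sketch_peer_err_iface_flags(&worker, &addr, ep_init_flags);
}
```

In this model a same-worker endpoint requires no peer-failure iface flag and yields an empty invalidation map, while a cross-worker endpoint keeps both requirements.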
eec6960 to 4fd4eb4
yosefe
reviewed
Apr 10, 2026
Contributor
yosefe
left a comment
test failure seems relevant
Comment on lines 1049 to +1055
if (ep_init_flags & UCP_EP_INIT_ERR_MODE_PEER_FAILURE) {
    /* No independent remote worker when connecting an EP to itself (loopback,
     * memtype EPs, etc.): peer-failure iface caps and MD invalidation are
     * relaxed in wireup; see ucp_request_get_invalidation_map(). */
    if ((unpacked_addr != NULL) && (unpacked_addr->uuid == worker->uuid)) {
        return;
    }
Contributor
IMO it would be better to update ucp_ep_err_mode_init_flags to not set UCP_EP_INIT_ERR_MODE_PEER_FAILURE for uuid==remote_uuid case
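One way to read this suggestion is to decide at flag-computation time, so later wireup code never sees the peer-failure flag for a self connection. The sketch below is an assumption about the shape of such a change, not the actual `ucp_ep_err_mode_init_flags` code: the `alt_*` names, the enum, the bit value, and the added UUID parameters are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit value for illustration; the real UCX flag differs. */
#define ALT_INIT_ERR_MODE_PEER_FAILURE 0x1u

/* Hypothetical error-handling modes. */
typedef enum {
    ALT_ERR_MODE_NONE,
    ALT_ERR_MODE_PEER
} alt_err_mode_t;

/* Init flags for an endpoint; a same-UUID connection gets no peer-failure
 * flag, so downstream lane selection needs no special case. */
unsigned alt_err_mode_init_flags(alt_err_mode_t mode, uint64_t local_uuid,
                                 uint64_t remote_uuid)
{
    if (mode != ALT_ERR_MODE_PEER) {
        return 0;
    }

    if (local_uuid == remote_uuid) {
        return 0; /* endpoint targets its own worker */
    }

    return ALT_INIT_ERR_MODE_PEER_FAILURE;
}

/* Helper: non-zero if lane selection would require peer-failure support. */
int alt_needs_peer_failure(alt_err_mode_t mode, uint64_t local_uuid,
                           uint64_t remote_uuid)
{
    unsigned flags = alt_err_mode_init_flags(mode, local_uuid, remote_uuid);

    assert((flags & ~ALT_INIT_ERR_MODE_PEER_FAILURE) == 0);
    return (flags & ALT_INIT_ERR_MODE_PEER_FAILURE) != 0;
}
```

Compared with checking the UUID inside each criteria function, this keeps the same-worker decision in one place, at the cost of needing the remote UUID wherever the flags are computed.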
What?
Relax peer-failure and MD invalidation wireup requirements when an endpoint targets the same UCP worker as the local side (same unpacked-address UUID), including memtype endpoints used for intra-worker GPU copies.
Changes:
select.c / ucp_wireup_fill_peer_err_criteria(): If unpacked_addr->uuid == worker->uuid, do not add UCT_IFACE_FLAG_ERRHANDLE_PEER_FAILURE to local iface criteria. Thread worker + unpacked address through ucp_wireup_fill_aux_criteria(), ucp_wireup_select_wireup_msg_lane(), and ucp_wireup_select_aux_transport() so auxiliary / wireup-msg selection follows the same rule.
select.c / ucp_wireup_add_rma_bw_lanes(): Add UCT_MD_FLAG_INVALIDATE_RMA to RMA BW MD criteria only when the remote address is not the same worker (select_params->address->uuid != ep->worker->uuid), in addition to the existing peer-failure + RNDV checks.
ucp_request.c / ucp_request_get_invalidation_map(): If UCP_EP_CONFIG_KEY_FLAG_SELF is set on the EP config key, return an empty invalidation map. Same-worker RMA BW lanes may use MDs without UCT_MD_FLAG_INVALIDATE_RMA (e.g. cuda_copy) without tripping asserts on peer-failure teardown.
Why?
Previously, peer-failure + RNDV wireup required iface peer-failure caps and MD RMA invalidation for every EP, including connections where the "remote" is the same worker (loopback, memtype EPs for host/device staging).
cuda_copy does not expose those UCT capabilities, so it was dropped from RMA BW selection and users saw cuda_ipc or worse paths for intra-worker CUDA work, even though there is no separate remote worker whose failure must invalidate cross-node RMA keys.
For same-worker EPs, UCP_EP_CONFIG_KEY_FLAG_SELF already marks the configuration; skipping strict UCT requirements in wireup and skipping MD invalidation in ucp_request_get_invalidation_map() matches the real semantics: no independent remote peer for that connection, so the stricter cross-peer invalidation contract is unnecessary. Cross-worker endpoints are unchanged and still require the full iface + MD behavior.
Closes #11318