prov/shm: allow sender-side CMA for RMA writes with FI_REMOTE_CQ_DATA #12300

Open
yinliaws wants to merge 1 commit into ofiwg:main from yinliaws:fix-write-fast

@yinliaws
Contributor

For RMA writes larger than SMR_INJECT_SIZE that carry FI_REMOTE_CQ_DATA (writedata), allow smr_rma_fast (sender-side CMA) instead of falling through to the receiver-side CMA path. The sender calls process_vm_writev to write directly into the target's registered MR, then posts ofi_op_write_async with the cq_data embedded. The receiver generates the remote CQ entry when it sees SMR_REMOTE_CQ_DATA in the write_async notification.

This eliminates the freestack allocation and return-queue round-trip for writedata, recovering a 20-25% bandwidth regression at message sizes of 1-8 MB on Graviton and AMD platforms.
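
For reviewers who want the control flow in one place, here is a minimal sketch of the path selection and the receiver-side handling described above. It is not the prov/shm code: every `sketch_`/`SKETCH_` name, the threshold value, and the flag bit are hypothetical stand-ins, and the assumption that plain writes above the threshold already take the sender-side path is inferred from the description rather than taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the values or names used in prov/shm. */
#define SKETCH_INJECT_SIZE    4096u      /* assumed size threshold */
#define SKETCH_REMOTE_CQ_DATA (1u << 0)  /* assumed notification flag bit */

enum sketch_path {
    SKETCH_PATH_SMALL,        /* len <= threshold: handled elsewhere, out of scope */
    SKETCH_PATH_SENDER_CMA,   /* sender writes straight into the target MR */
    SKETCH_PATH_RECEIVER_CMA, /* receiver moves the data; needs a return-queue round trip */
};

/* Notification the sender posts after a sender-side write (hypothetical layout). */
struct sketch_notify {
    uint32_t flags;
    uint64_t cq_data;
};

/*
 * Pick a path for an RMA write. allow_cq_data_fast models the behavior
 * change: when false, writes with remote CQ data fall through to the
 * receiver-side path; when true, they may use the sender-side path.
 */
static enum sketch_path sketch_select_write_path(size_t len, bool has_cq_data,
                                                 bool allow_cq_data_fast)
{
    if (len <= SKETCH_INJECT_SIZE)
        return SKETCH_PATH_SMALL;
    if (has_cq_data && !allow_cq_data_fast)
        return SKETCH_PATH_RECEIVER_CMA;
    return SKETCH_PATH_SENDER_CMA;
}

/* Sender side: embed cq_data in the notification when the write carries it. */
static struct sketch_notify sketch_build_notify(bool has_cq_data, uint64_t cq_data)
{
    struct sketch_notify n = { 0, 0 };

    if (has_cq_data) {
        n.flags |= SKETCH_REMOTE_CQ_DATA;
        n.cq_data = cq_data;
    }
    return n;
}

/*
 * Receiver side: returns true when a remote CQ entry should be generated,
 * and stores the data for that entry in *cq_data_out.
 */
static bool sketch_handle_write_async(const struct sketch_notify *n,
                                      uint64_t *cq_data_out)
{
    assert(n != NULL && cq_data_out != NULL);

    if (!(n->flags & SKETCH_REMOTE_CQ_DATA))
        return false;
    *cq_data_out = n->cq_data;
    return true;
}
```

In this model, a 1 MB writedata call selects `SKETCH_PATH_RECEIVER_CMA` when `allow_cq_data_fast` is false and `SKETCH_PATH_SENDER_CMA` when it is true; the receiver then generates a completion only if the flag is set in the notification.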

Signed-off-by: Yin Li <yinliq@amazon.com>