AMSS-NCKU/AMSS_NCKU_source/MPatch.C at 50e2a845f8c3e936e334514a66051082dcc41505

64-BitBrainstorm_2026/AMSS-NCKU


CGH0S7 50e2a845f8 Replace MPI_Allreduce with owner-rank MPI_Bcast in Patch::Interp_Points

The two MPI_Allreduce calls (data + weight) were the top hotspot, at 38.5%
of CPU time. Because all ranks traverse the same block list and agree on
point ownership, the global reduction can be replaced with a targeted
MPI_Bcast from each owner rank. This also eliminates the weight array and
its Allreduce entirely, removes redundant heap allocations (shellf, weight,
DH, llb, uub), and writes interpolation results directly into the output
buffer.
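
The reasoning behind the change can be illustrated with a small serial model. This is a minimal sketch, not the code in MPatch.C: it uses no MPI, and the names `RankBuffers`, `combine_by_sum_reduction`, and `combine_by_owner_copy` are hypothetical. It assumes that each point has exactly one owner rank, that non-owner ranks contribute zeros, and that the weight records one contribution per point. Under those assumptions, summing data and weight over all ranks and dividing gives the same result as copying the owner's value, which is what a broadcast rooted at the owner would deliver.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial stand-in for a set of MPI ranks (assumption, not AMSS-NCKU code):
// buffers[r][p] is what rank r holds for point p after its local pass.
using RankBuffers = std::vector<std::vector<double>>;

// Models the old pattern: sum data and weight across all ranks
// (the role played by the two reductions), then normalize per point.
std::vector<double> combine_by_sum_reduction(const RankBuffers& data,
                                             const RankBuffers& weight)
{
    assert(!data.empty() && data.size() == weight.size());
    const std::size_t npoints = data.front().size();
    std::vector<double> sum(npoints, 0.0), wsum(npoints, 0.0);
    for (std::size_t r = 0; r < data.size(); ++r) {
        for (std::size_t p = 0; p < npoints; ++p) {
            sum[p] += data[r][p];
            wsum[p] += weight[r][p];
        }
    }
    for (std::size_t p = 0; p < npoints; ++p) {
        assert(wsum[p] > 0.0);  // every point must have at least one contributor
        sum[p] /= wsum[p];
    }
    return sum;
}

// Models the new pattern: owner[p] names the single rank that computed
// point p, and the result is that rank's value (the role played by a
// broadcast rooted at the owner). No weight array is needed.
std::vector<double> combine_by_owner_copy(const RankBuffers& data,
                                          const std::vector<int>& owner)
{
    std::vector<double> out(owner.size(), 0.0);
    for (std::size_t p = 0; p < owner.size(); ++p) {
        out[p] = data[static_cast<std::size_t>(owner[p])][p];
    }
    return out;
}
```

In this model the owner-copy version touches one value per point instead of one value per rank per point, and the weight buffer disappears. That matches the savings the commit message describes, though the actual communication cost in MPI depends on the implementation and the network.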

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-09 22:39:18 +08:00

37 KiB
