The 8 nested while(Ppc){while(Pp){OutBdLow2Hi(single,single,...)}}
loops across RestrictProlong (3 overloads) and ProlongRestrict each
produced N_c × N_f separate transfer() → MPI_Waitall barriers.
Replace with the existing batch OutBdLow2Hi(MyList<Patch>*,...) which
merges all patch pairs into a single transfer() call with 1 MPI_Waitall.
Also add Restrict_cached, OutBdLow2Hi_cached, OutBdLow2Himix_cached
to Parallel (unused for now — kept as infrastructure for future use).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
194 KiB
194 KiB