AMSS-NCKU

64-BitBrainstorm_2026/AMSS-NCKU

Author	SHA1	Message	Date
CGH0S7	caf192b2e4	Remove MPI dependency, replace with single-process stub for non-MPI builds - Add mpi_stub.h providing all MPI types/constants/functions as no-ops (nprocs=1, myrank=0) with memcpy-based Allreduce and clock_gettime Wtime - Replace #include <mpi.h> with conditional #ifdef MPI_STUB in 31 files (19 headers + 12 source files) preserving ability to build with real MPI - Change makefile.inc: CLINKER mpiicpx->icpx, add -DMPI_STUB to CXXAPPFLAGS - Update makefile_and_run.py: run ./ABE directly instead of mpirun -np N - Set MPI_processes=1 in AMSS_NCKU_Input.py Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-02-10 22:51:11 +08:00
CGH0S7	738498cb28	Optimize MPI communication in RestrictProlong and surface_integral Cache Sync in RestrictProlong: replace 11 basic Parallel::Sync() calls with Parallel::Sync_cached() across RestrictProlong, RestrictProlong_aux, and ProlongRestrict to avoid rebuilding grid segment lists every call. Merge paired MPI_Allreduce in surface_integral: combine 9 pairs of consecutive RP/IP Allreduce calls into single calls with count=2*NN. Merge scalar MPI_Allreduce in surf_MassPAng: combine 3 groups of 7 scalar Allreduce calls (mass + angular/linear momentum) into single calls with count=7. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 22:07:12 +08:00
CGH0S7	42b9cf1ad9	Optimize MPI Sync with merged transfers, caching, and async overlap Phase 1: Merge N+1 transfer() calls into a single transfer() per Sync(PatchList), reducing N+1 MPI_Waitall barriers to 1 via new Sync_merged() that collects all intra-patch and inter-patch grid segment lists into combined per-rank arrays. Phase 2: Cache grid segment lists and reuse grow-only communication buffers across RK4 substeps via SyncCache struct. Caches are per-level and per-variable-list (predictor/corrector), invalidated on regrid. Eliminates redundant build_ghost_gsl/build_owned_gsl0/build_gstl rebuilds and malloc/free cycles between regrids. Phase 3: Split Sync into async Sync_start/Sync_finish to overlap Cartesian ghost zone exchange (MPI_Isend/Irecv) with Shell patch synchronization. Uses MPI tag 2 to avoid conflicts with SH->Synch() which uses transfer() with tag 1. Also updates makefile.inc paths and flags for local build environment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 21:03:37 +08:00
CGH0S7	f2fc9af70e	asc26 amss-ncku initialized	2026-01-13 15:01:15 +08:00

4 Commits