bd4ce3fbf3
GPU-accelerate Shell-Patch BSSN evolution

Phase 1: Enable GPU-resident state for Cartesian patches in Shell mode.
- Remove the WithShell guard from bssn_cuda_use_resident_sync().
- Add GPU-to-CPU state sync before shell CPU consumers (SHStep,
  CS_Inter, inline shell RHS blocks).
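
A minimal host-only sketch of the "sync before CPU consumers" idea from Phase 1, assuming a simple stale flag. Every name here (ResidentState, gpu_step, sync_to_host, cpu_consumer_sum) is hypothetical and not taken from this repository, and a std::memcpy stands in for the device-to-host copy so the sketch runs without a GPU:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for GPU-resident state. "device" is an ordinary host
// buffer here; in real code the copy below would be a device-to-host transfer.
struct ResidentState {
    std::vector<double> device;  // authoritative copy after a GPU step
    std::vector<double> host;    // mirror that CPU consumers read
    bool host_stale = false;     // true when device is newer than host
    int sync_count = 0;          // number of copies actually performed

    explicit ResidentState(std::size_t n) : device(n, 0.0), host(n, 0.0) {}

    // Pretend GPU update: writes only the device copy and marks host stale.
    void gpu_step(double value) {
        for (double &x : device) x = value;
        host_stale = true;
    }

    // Copy device -> host only if the host mirror is out of date.
    void sync_to_host() {
        if (!host_stale) return;
        std::memcpy(host.data(), device.data(), device.size() * sizeof(double));
        host_stale = false;
        ++sync_count;
    }
};

// Hypothetical CPU consumer: it syncs first, then reads the host mirror.
double cpu_consumer_sum(ResidentState &s) {
    s.sync_to_host();
    double sum = 0.0;
    for (double x : s.host) sum += x;
    return sum;
}
```

With this pattern, a second CPU consumer called before the next GPU step reuses the host mirror and triggers no extra copy.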

Phase 2: GPU-accelerate the BSSN Shell-Patch RHS.
- Create bssn_gpu.h with the RHS_SS_PARA macro and the gpu_rhs_ss
  declaration.
- Fix compilation bugs in legacy bssn_gpu_rhs_ss.cu (deprecated
  cudaThreadSynchronize, tmp_con2 redeclaration, ijkmin3_h typo,
  CUDA_SAFE_CALL, missing compare_result guard).
- Add bssn_gpu_rhs_ss.o to CFILES_CUDA_BSSN with a build rule.
- Write the cuda_compute_rhs_bssn_ss() wrapper, which bridges the
  Fortran and GPU parameter conventions, and redirect all shell RHS
  call sites to it via #define.
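
The wrapper-plus-#define redirect described above could look roughly like the following host-only sketch. All names and signatures (gpu_rhs_demo, cuda_rhs_demo_wrapper, f_rhs_demo, call_site_demo) are hypothetical illustrations, not the actual signatures of gpu_rhs_ss or cuda_compute_rhs_bssn_ss(), and a plain loop stands in for the kernel launch:

```cpp
#include <cstddef>

// Hypothetical GPU-style entry point: scalars by value, extents as separate
// ints. A host loop replaces the kernel launch so the sketch runs anywhere.
int gpu_rhs_demo(int nx, int ny, int nz, double dt,
                 const double *in, double *out) {
    const std::size_t n = static_cast<std::size_t>(nx) * ny * nz;
    for (std::size_t i = 0; i < n; ++i) out[i] = dt * in[i];
    return 0;
}

// Hypothetical wrapper with a Fortran-style signature: every argument is
// passed by address and the grid extents arrive as an int[3] array.
int cuda_rhs_demo_wrapper(int *ex, double *dt, double *in, double *out) {
    return gpu_rhs_demo(ex[0], ex[1], ex[2], *dt, in, out);
}

// Redirect: call sites written against the legacy routine name now resolve
// to the wrapper without being edited individually.
#define f_rhs_demo cuda_rhs_demo_wrapper

// An unchanged "call site" that still uses the legacy name.
int call_site_demo(double *in, double *out) {
    int ex[3] = {2, 2, 1};
    double dt = 0.5;
    return f_rhs_demo(ex, &dt, in, out);
}
```

The design trade-off of a macro redirect is that it touches no call sites, at the cost of hiding the substitution from anyone reading those call sites in isolation.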

Verified: a 30-step Shell-Patch GPU run completes without errors or NaNs.
Step wall time is ~4.4 s (step_fn ~2.0 s + RP ~0.68 s + constraint ~0.70 s).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 18:50:10 +08:00 |