abnerluo
d9c7ea8085
Use cudaMemcpyAsync with dedicated transfer stream for H2D/D2H transfers
Add a cudaStream_t to GpuBuffers for async H2D/D2H transfers in the BSSN
and Z4C substep functions. Add cudaStreamSynchronize(0) before D2H to
enforce kernel/transfer ordering across streams, and a sync between the
state and matter H2D uploads to prevent an h_stage race when RK4==0.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-28 08:23:34 +08:00
bb20c9a876
Fix ADM Constraint Violation Analysis
2026-04-15 19:19:16 +08:00
8fe60ea703
Add zero matter handling and interpolation for resident state in CUDA BSSN
2026-04-15 00:25:53 +08:00
9ab7e7c7f9
Fuse phases 5 and 6 for Gamma_rhs computation and optimize phases 8 and 9 for efficiency
2026-04-14 23:23:04 +08:00
f9119e8a2a
Add resident-GA mode switch and simplify sync logic
2026-04-14 21:09:27 +08:00
726d743376
Fuse Ricci assembly and optimize trK/Aij gauge kernels
2026-04-14 19:20:12 +08:00
af344bf1e5
Add Phase-10 Ricci kernels and batch launch flow
2026-04-14 19:00:22 +08:00
7191fc0b96
Move resident sync comm buffers into StepAllocation pool
2026-04-13 21:04:44 +08:00
b3ec244cf9
Add batched first/second derivative kernels for CUDA RHS
2026-04-13 20:51:08 +08:00
e952ee8e91
Batch GA/BH subset sync with indexed GPU pack/unpack buffers
2026-04-13 20:40:09 +08:00
c5d1268dd1
Batch patch-boundary copy and gate CPU BC in GPU substeps
2026-04-13 11:52:17 +08:00
4bdfc90f22
Pass pointer tables as kernel args and skip redundant symbol uploads
2026-04-13 11:19:00 +08:00
c49a4e00c9
Batch symbd_pack/lopsided/kodiss over all state variables
2026-04-13 11:02:55 +08:00
1b3c0b80d2
Refactor CUDA step buffers to remove loop-time allocations
2026-04-13 10:33:03 +08:00
636e35bfd8
Add direct CUDA resident-state sync path and profiling hooks
2026-04-13 00:57:05 +08:00
7f2a391dd2
Cache matter fields in StepContext across RK4 substeps
2026-04-12 22:19:45 +08:00
4fa12a2009
Integrate CUDA support into RK4 substep execution
2026-04-12 22:11:44 +08:00
86a683de26
Replace legacy ABEGPU stack with ABE_CUDA backend
2026-04-12 21:19:14 +08:00