|
|
39450228f5
|
Accelerate Shell-Patch interpolation fast paths
|
2026-05-08 13:26:16 +08:00 |
|
|
|
063f28b3b4
|
Add Shell-Patch GPU runtime fast paths
|
2026-05-08 09:26:36 +08:00 |
|
|
|
1064a68d16
|
Optimize BSSN-EM 8th-order AMR transfers
|
2026-05-07 21:38:16 +08:00 |
|
|
|
dcc83bafcb
|
Support 2nd and 8th order CUDA AMR paths
|
2026-05-07 20:31:26 +08:00 |
|
|
|
c4d8d41b25
|
Cover Z4C CUDA AMR restrict prolong
|
2026-05-07 19:49:09 +08:00 |
|
|
|
0076b3ca18
|
Optimize 6th-order CUDA AMR stencils
|
2026-05-07 19:22:37 +08:00 |
|
|
|
9ff2f065be
|
Apply BSSN AMR sync default to EScalar
|
2026-05-07 17:12:33 +08:00 |
|
|
|
2317e4abde
|
Fix BSSN GPU resident AMR sync default
|
2026-05-07 17:11:09 +08:00 |
|
|
|
fea2dcc0d5
|
Fix BSSN-EM runtime crash
|
2026-05-07 16:47:55 +08:00 |
|
|
|
5525465cad
|
Support CUDA finite-difference order selection
|
2026-05-07 16:28:02 +08:00 |
|
|
|
96829d0441
|
Optimize Z4C GPU runtime defaults
|
2026-05-07 15:37:09 +08:00 |
|
|
|
83afaf19ce
|
Skip zero EM resident downloads
|
2026-05-07 13:04:46 +08:00 |
|
|
|
cb911dec06
|
Add EM GPU fast paths and defaults
|
2026-05-07 12:18:56 +08:00 |
|
|
|
dd0e20d8c7
|
Fix BSSN-EScalar CUDA boundary and scalar KO
|
2026-05-06 15:44:35 +08:00 |
|
|
|
ffa0d801ed
|
Default Python GPU runner to EScalar fast path
|
2026-05-06 00:12:46 +08:00 |
|
|
|
ae64a22178
|
Complete BSSN-EScalar CUDA resident transfers
|
2026-05-05 23:57:42 +08:00 |
|
|
|
85fe29cc2e
|
Optimize BSSN-EScalar CUDA path
|
2026-05-05 10:47:46 +08:00 |
|
|
|
06f62dee36
|
Switch back to Intel toolchain as the default option
Intel MPI also appears to support CUDA-aware communication when I_MPI_OFFLOAD is set to 1. In addition, I_MPI_OFFLOAD_IPC=0 is needed to avoid segfaults.
|
2026-05-01 21:59:13 +08:00 |
|
|
|
35b6ceff02
|
Broaden cached CUDA sync paths
|
2026-05-01 18:03:04 +08:00 |
|
|
|
51f3819892
|
Save generated source formatting state
|
2026-04-30 20:47:44 +08:00 |
|
|
|
a9a3809148
|
Default Python launcher to fast GPU path
|
2026-04-30 20:15:34 +08:00 |
|
|
|
b1974ef146
|
Stabilize device AMR restrict across regrid
|
2026-04-30 20:01:18 +08:00 |
|
|
|
be9033f449
|
Add optional CUDA surface interpolation
|
2026-04-30 19:21:19 +08:00 |
|
|
|
6835608f92
|
Add configurable analysis MAP cadence
|
2026-04-30 19:10:12 +08:00 |
|
|
|
e0d0673c8e
|
Enable optimized GPU runs from Python launcher
|
2026-04-30 18:31:31 +08:00 |
|
|
|
da4d56ccf7
|
Optimize BSSN surface interpolation fast path
|
2026-04-30 18:25:21 +08:00 |
|
|
|
a6483d013d
|
Add CUDA AMR restrict diagnostics
|
2026-04-30 12:20:44 +08:00 |
|
|
|
8486532920
|
Add resident BSSN GPU point interpolation
|
2026-04-30 11:39:15 +08:00 |
|
|
|
18e9c9cc50
|
Optimize BSSN CUDA resident AMR prolong path
|
2026-04-30 10:58:15 +08:00 |
|
|
|
1ee229a91f
|
Add keyed BSSN CUDA resident banks
|
2026-04-29 19:44:19 +08:00 |
|
|
|
68eab03bac
|
Add opt-in BSSN CUDA resident AMR path
|
2026-04-29 19:15:37 +08:00 |
|
|
|
090d8657ae
|
Optimize BSSN CUDA state transfers
|
2026-04-29 18:34:31 +08:00 |
|
|
|
22c1e7168b
|
Optimize BSSN CUDA resident state and CUDA-aware MPI
|
2026-04-29 17:05:10 +08:00 |
|
|
|
a0dab90bcb
|
Switch to NVIDIA HPC Toolchain
|
2026-04-29 08:31:49 +08:00 |
|
|
|
c689cc8dc9
|
[WIP] Add CUDA support for Z4C
Rewrite done by Codex.
This still has errors; do not pick this commit yet.
|
2026-04-27 11:58:43 +08:00 |
|
|
|
60fee8f1c1
|
Fix Z4C C++ gauge damping ordering
|
2026-04-26 15:38:13 +08:00 |
|
|
|
843b116954
|
Add C++ Z4C RHS path and port some BSSN optimizations
|
2026-04-25 10:39:01 +08:00 |
|
|
|
c768e1220b
|
Also disable cached sync for Z4C
|
2026-04-25 10:25:54 +08:00 |
|
|
|
02f149e2e3
|
Disable cached sync for BSSN-EScalar
|
2026-04-25 10:17:47 +08:00 |
|
|
|
422e8ec4dc
|
Fallback BSSN-EScalar restrict/prolong path
|
2026-04-25 10:10:34 +08:00 |
|
|
|
c4909b9843
|
Update the accuracy check script to add image comparison checks
(cherry picked from commit ac82ebd889)
|
2026-04-25 09:40:12 +08:00 |
|
|
|
f521a97563
|
Fix ABE CPU version build error
|
2026-04-25 09:39:49 +08:00 |
|
|
|
53c55451b3
|
Update makefile and scripts for CUDA BSSN configuration and build commands
|
2026-04-25 09:19:50 +08:00 |
|
|
|
768345954f
|
Add optional BSSN kernel profiling switches
(cherry picked from commit 9c31384b2f)
|
2026-04-25 08:39:43 +08:00 |
|
|
|
9a6df6438b
|
Remove dead chi derivative setup in BSSN RHS
(cherry picked from commit e4e741caa1)
|
2026-04-25 08:38:01 +08:00 |
|
|
|
8e9463aa90
|
Localize chi Ricci intermediates in RHS
(cherry picked from commit 65e0f95f40)
|
2026-04-25 08:37:41 +08:00 |
|
|
|
7c6f15002e
|
Elide dead stores in BSSN RHS hot path
(cherry picked from commit f9fbf97e64)
|
2026-04-25 08:37:40 +08:00 |
|
|
|
6410c62e3e
|
Add fine-grained step timing and trim BH RHS overhead
(cherry picked from commit 968522995b)
|
2026-04-25 08:37:19 +08:00 |
|
|
|
11977eb82f
|
Merge wave and mass extraction interpolation
(cherry picked from commit f3988ac8ca)
|
2026-04-25 08:25:34 +08:00 |
|
|
|
cce8a44fc4
|
Cache wave extraction angular kernels
(cherry picked from commit e4c25eb21f)
|
2026-04-25 08:24:36 +08:00 |
|