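Background:
The hot-block splitting and rank remapping that a colleague implemented in
the previous commit gave a significant speedup, but it hard-coded the heavy
ranks (27/28/35/36) and the remapping table. That makes it an optimization
targeted at one specific test case, which violates rule 6 of the competition
rules (no optimizations specialized for particular parameters or test cases).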
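Goal of this commit:
Borrowing the idea of PGO (Profile-Guided Optimization) from compilers, turn
the case-specific optimization above into a general, automated two-pass
workflow that applies to any test case and therefore complies with the
competition rules.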
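Two-pass workflow:
Pass 1: profile collection (make INTERP_LB_MODE=profile ABE)
-DINTERP_LB_PROFILE is injected at compile time. On its first call,
Interp_Points in MPatch.C times itself with MPI_Wtime, gathers the per-rank
times with MPI_Gather, identifies hot ranks whose time exceeds 2.5 times the
mean, and writes the result to interp_lb_profile.bin.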
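The hot-rank test in Pass 1 can be sketched in Python as follows. This is a minimal illustration, not the code in MPatch.C: the function name find_hot_ranks and the sample timings are hypothetical, and it assumes the 2.5x threshold is applied to the arithmetic mean of the gathered per-rank times.

```python
from statistics import mean

HOT_FACTOR = 2.5  # threshold stated in this commit message: time > 2.5 x mean


def find_hot_ranks(times, factor=HOT_FACTOR):
    """Return the indices (ranks) whose time exceeds factor * mean(times).

    `times` stands in for the per-rank timings that rank 0 would hold
    after the gather step; index i is the time measured on rank i.
    """
    if not times:
        return []
    threshold = factor * mean(times)
    return [rank for rank, t in enumerate(times) if t > threshold]


# Hypothetical timings (seconds) for 8 ranks; ranks 2 and 5 are much slower.
sample_times = [1.0, 1.1, 9.0, 0.9, 1.0, 8.5, 1.0, 1.1]
hot_ranks = find_hot_ranks(sample_times)
```

Because the hot ranks themselves raise the mean, a relative threshold like this only flags ranks that stand well clear of the rest; a perfectly balanced run yields an empty list.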
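Intermediate step: generate the compile-time header
python3 gen_interp_lb_header.py reads profile.bin, automatically computes
the splitting strategy and the remapping table, and generates
interp_lb_profile_data.h, which contains:
- interp_lb_splits[][3]: (block_id, r_left, r_right) for each hot block
- interp_lb_remaps[][2]: rank remapping for the neighbor blocks displaced by the split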
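To make the shape of the generated header concrete, here is a minimal Python sketch of the emission step only. It is not gen_interp_lb_header.py: the helper names, the `int` element type, the `_COUNT` macros, the include guard, and the (block_id, new_rank) reading of a remap row are all assumptions, and the sketch takes already-computed tables as input instead of parsing profile.bin, whose layout is not described here.

```python
def format_c_array(name, rows, width):
    """Render rows of integers as a `static const int name[][width]` array."""
    rows = [tuple(int(v) for v in row) for row in rows]
    for row in rows:
        if len(row) != width:
            raise ValueError(
                f"{name}: expected {width} values per row, got {len(row)}"
            )
    # ISO C does not allow an empty initializer, so keep one placeholder row
    # when there is nothing to emit; the count macro still says 0.
    emitted = rows or [(-1,) * width]
    body = ",\n".join("    {" + ", ".join(map(str, row)) + "}" for row in emitted)
    return (
        f"#define {name.upper()}_COUNT {len(rows)}\n"
        f"static const int {name}[][{width}] = {{\n{body}\n}};\n"
    )


def render_header(splits, remaps):
    """Build the full header text from the two tables."""
    guard = "INTERP_LB_PROFILE_DATA_H"  # hypothetical include-guard name
    return (
        f"#ifndef {guard}\n#define {guard}\n\n"
        + format_c_array("interp_lb_splits", splits, 3)
        + "\n"
        + format_c_array("interp_lb_remaps", remaps, 2)
        + "\n"
        + f"#endif /* {guard} */\n"
    )


# Hypothetical tables: split block 12 across ranks 4 and 5, move block 13 to rank 6.
header_text = render_header(splits=[(12, 4, 5)], remaps=[(13, 6)])
```

Writing `header_text` to interp_lb_profile_data.h would give Pass 2 plain constant arrays to compile in, with no file I/O at run time.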
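Pass 2: optimized build (make INTERP_LB_MODE=optimize ABE)
-DINTERP_LB_OPTIMIZE is injected at compile time. The profile data is baked
into the executable as static const arrays (zero runtime overhead), and
distribute_optimize applies the splits and remaps directly during block
creation.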
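How the two tables might be applied at block-creation time can be sketched as below. This is only an illustration of the idea, not distribute_optimize itself: the dictionary-based block-to-rank assignment, the function name apply_plan, and the interpretation of a split row as "two halves owned by r_left and r_right" and of a remap row as (block_id, new_rank) are assumptions.

```python
def apply_plan(assignment, splits, remaps):
    """Apply split and remap tables to a block -> rank assignment.

    assignment: dict mapping block_id to its original owner rank.
    splits:     iterable of (block_id, r_left, r_right).
    remaps:     iterable of (block_id, new_rank).
    Returns a dict mapping block_id to the list of ranks that own it
    (two ranks for a split block, one rank otherwise).
    """
    plan = {block_id: [rank] for block_id, rank in assignment.items()}
    for block_id, r_left, r_right in splits:
        if block_id not in plan:
            raise KeyError(f"split refers to unknown block {block_id}")
        plan[block_id] = [r_left, r_right]
    for block_id, new_rank in remaps:
        if block_id not in plan:
            raise KeyError(f"remap refers to unknown block {block_id}")
        plan[block_id] = [new_rank]
    return plan


# Hypothetical case: block 1 is hot, so it is split across ranks 1 and 2,
# and block 2, which used to live on rank 2, moves to rank 3.
original = {0: 0, 1: 1, 2: 2}
optimized = apply_plan(original, splits=[(1, 1, 2)], remaps=[(2, 3)])
```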
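Specific changes:
- makefile.inc: add the INTERP_LB_MODE variable (off/profile/optimize)
and the corresponding INTERP_LB_FLAGS preprocessor macro definitions
- makefile: add $(INTERP_LB_FLAGS) to CXXAPPFLAGS and add the
interp_lb_profile.o build target
- gen_interp_lb_header.py: script that automatically converts profile.bin
into interp_lb_profile_data.h
- interp_lb_profile_data.h: auto-generated header of compile-time constants
- interp_lb_profile.bin: binary data produced by the profile-collection pass
- AMSS_NCKU_Program.py: automatically copy profile.bin to the run directory at build time
- makefile_and_run.py: switch the default build command to INTERP_LB_MODE=optimize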
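Note on generality:
The workflow does not depend on any hard-coded rank numbers or test-case
parameters. For a different grid configuration, process count, or physical
problem, rerunning Pass 1 to collect a profile is enough to generate the
matching optimization plan automatically. This is the same idea as PGO in
compilers: profile first, then optimize, which is a general
performance-optimization methodology.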
AMSS-NCKU
What can AMSS-NCKU do
AMSS-NCKU is a numerical relativity program developed in China. It numerically solves Einstein's equations and computes how the gravitational field evolves over time.
AMSS-NCKU solves Einstein's equations using the finite difference method with adaptive mesh refinement.
Currently, AMSS-NCKU can handle binary black hole systems and multiple black hole systems, compute their time evolution, and compute the gravitational waves emitted during these processes.
The Development of AMSS-NCKU
In 2008, the AMSS-NCKU code was successfully developed, enabling numerical simulation of binary black hole and multiple black hole systems using the BSSN equations.
In 2013, AMSS-NCKU added numerical simulation of black hole systems using the Z4C equations, greatly improving the accuracy of the calculation.
In 2015, AMSS-NCKU implemented hybrid CPU and GPU computing for the BSSN equations, improving computational efficiency.
In 2024, we developed a Python interface for AMSS-NCKU to make the code easier for new users and to support further development.
Authors of AMSS-NCKU
Cao, Zhoujian (Beijing Normal University; Academy of Mathematics and Systems Science, Chinese Academy of Sciences; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences)
Yo, Hwei-Jang (National Cheng Kung University)
Liu, Runqiu (Academy of Mathematics and Systems Science, Chinese Academy of Sciences)
Du, Zhihui (Tsinghua University)
Ji, Liwei (Rochester Institute of Technology)
Zhao, Zhichao (China Agricultural University)
Qiao, Chenkai (Chongqing University of Technology)
Yu, Jui-Ping (Former student)
Lin, Chun-Yu (Former student)
Zuo, Yi (Student)
Install the prerequisite packages and software for the AMSS-NCKU code
Here, we take the Ubuntu 22.04 system as an example.
- Install the C++, Fortran, and CUDA compilers.
$ sudo apt-get install gcc
$ sudo apt-get install gfortran
$ sudo apt-get install make
$ sudo apt-get install build-essential
$ sudo apt-get install nvidia-cuda-toolkit
- Install the MPI tools
$ sudo apt install openmpi-bin
$ sudo apt install libopenmpi-dev
- Install Python 3
$ sudo apt-get install python3
$ sudo apt-get install python3-pip
- Install the required Python packages
$ pip install numpy
$ pip install scipy
$ pip install matplotlib
$ pip install SymPy
$ pip install opencv-python
$ pip install notebook
$ pip install torch
- Install the OpenCV tools
$ sudo apt-get install libopencv-dev
$ sudo apt-get install python3-opencv
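After the installation steps above, a quick check that the Python packages are importable can save a failed run later. The script below is a hedged sketch and not part of AMSS-NCKU: the function name missing_modules is hypothetical, and the import names in REQUIRED_MODULES are the usual ones for the packages listed above (OpenCV is imported as cv2).

```python
import importlib.util

# Import names for the packages installed with pip above.
# "cv2" is the import name provided by the OpenCV Python package.
REQUIRED_MODULES = [
    "numpy",
    "scipy",
    "matplotlib",
    "sympy",
    "cv2",
    "notebook",
    "torch",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

The check only confirms that each module can be located, not that its version is compatible with AMSS-NCKU.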
How to use AMSS-NCKU
- Set the compilation parameters
Modify the makefile.inc file in the AMSS_NCKU_source directory to match your machine.
On Ubuntu 22.04 the default settings do not need to be changed.
- Enter the AMSS-NCKU Python code folder and edit the input.
The input settings for an AMSS-NCKU simulation are stored in the Python script AMSS_NCKU_Input.py. Edit the parameters in this script and save it.
- Build the executable and run the AMSS-NCKU simulation.
Run the following command in a bash terminal:
$ python AMSS_NCKU_Program.py
or
$ python3 AMSS_NCKU_Program.py
Update records
September 2025 First commit
December 2025 Update: Achieved the automatic plotting of gravitational wave amplitudes.
January 2026 Update: Fixed some bugs.
Tips
Because testing has been limited, the code is likely to contain some unknown bugs.
An actual evolution of a binary black hole system takes a relatively long time to compute. To catch problems early (for example, in the automatic plotting that runs after the simulation), first set the final evolution time in the input script AMSS_NCKU_Input.py to 5M as a test.
If that run completes without errors, set the final evolution time in AMSS_NCKU_Input.py to the value for an actual simulation (about 1000M) and start the real run. This avoids wasting computing resources.
Set the computing resources to match your own machine (the number of MPI processes is set in the input script).
Declaration
This code includes C++ / Fortran code from the original AMSS-NCKU code. A small number of functions draw on BAM.
In addition, the apparent horizon calculation draws on some code from the AHFinderDirect thorn in Cactus.