AMSS-NCKU

64-BitBrainstorm_2026/AMSS-NCKU

Author	SHA1	Message	Date
CGH0S7	19274e93d1	Fix boundary handling in bssn_rhs_opt.f90 to prevent NaNs. Refactored calc_derivs and calc_dderivs to include correct boundary handling logic matching the legacy code. Implemented fallback to 2nd-order derivatives near boundaries where 4th-order stencils cannot be used. Added logic to initialize output arrays to zero to avoid uninitialized memory access.	2026-01-19 20:03:22 +08:00
CGH0S7	ae1a474cca	Fix compilation errors and complete logic in BSSN RHS optimization	2026-01-19 19:22:52 +08:00
CGH0S7	cbb8fb3a87	patched last commit	2026-01-19 17:14:28 +08:00
CGH0S7	4472d89a9f	Optimize bssn_rhs calculation with cache blocking and vectorization. Implemented cache blocking (BLK=8) in bssn_rhs_opt.f90 to improve L1/L2 cache hit rate. Introduced bssn_rhs_opt.f90 module with vectorized derivative and physics kernels. Renamed original implementation to bssn_rhs_legacy.f90 for fallback. Updated bssn_rhs.f90 to act as a dispatcher, using the optimized path for ghost_width=3. Updated makefile to include new source files. Added DEBUG_NAN_CHECK macro to optionally disable NaN checks in production.	2026-01-19 16:39:24 +08:00
CGH0S7	9deeda9831	Refactor verification method and optimize numerical kernels with oneMKL BLAS. This commit transitions the verification approach from post-Newtonian theory comparison to regression testing against baseline simulations, and optimizes critical numerical kernels using Intel oneMKL BLAS routines. Verification changes: replace PN theory-based RMS calculation with trajectory-based comparison; compare optimized results against baseline (GW150914-origin) on the XY plane; compute RMS independently for BH1 and BH2 and report the maximum as the final metric; update documentation to reflect the new regression test methodology. Performance optimizations: replace manual vector operations with oneMKL BLAS routines (norm2() and scalarproduct() now use cblas_dnrm2/cblas_ddot in C++; L2 norm calculations use DDOT for dot products in Fortran; interpolation weighted sums use DDOT in Fortran); disable OpenMP threading (switch to sequential MKL) for better performance. Build configuration: switch from lmkl_intel_thread to lmkl_sequential; remove -qopenmp flags from compiler options; maintain aggressive optimization flags (-O3, -xHost, -fp-model fast=2, -fma). Other changes: update .gitignore for GW150914-origin, docs, and temporary files.	2026-01-18 14:25:21 +08:00
CGH0S7	3a7bce3af2	Update Intel oneAPI configuration and CPU binding settings. Update makefile.inc with Intel oneAPI compiler flags and oneMKL linking. Configure taskset CPU binding to use nohz_full cores (4-55, 60-111). Set build parallelism to 104 jobs for faster compilation. Update MPI process count to 48 in input configuration.	2026-01-17 20:41:02 +08:00
CGH0S7	c6945bb095	Rename verify_accuracy.py to AMSS_NCKU_Verify_ASC26.py and improve visual output	2026-01-17 14:54:33 +08:00
CGH0S7	0d24f1503c	Add accuracy verification script for GW150914 simulation. Verify RMS error < 1% (black hole trajectory vs. post-Newtonian theory). Verify ADM constraint violation < 2 (Grid Level 0). Return exit code 0 on pass, 1 on fail. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-17 00:37:30 +08:00
CGH0S7	cb252f5ea2	Optimize numerical algorithms with Intel oneMKL. FFT.f90: replace hand-written Cooley-Tukey FFT with oneMKL DFTI. ilucg.f90: replace manual dot product loop with BLAS DDOT. gaussj.C: replace Gauss-Jordan elimination with LAPACK dgesv/dgetri. makefile.inc: add MKL include paths and library linking. All optimizations maintain mathematical equivalence and numerical precision.	2026-01-16 10:58:11 +08:00
CGH0S7	7a76cbaafd	Add numactl CPU binding to avoid cores 0-3 and 56-59. Bind all computation processes (ABE, ABEGPU, TwoPunctureABE) to CPU cores 4-55 and 60-111 using numactl --physcpubind to prevent interference with system processes on reserved cores.	2026-01-16 10:24:46 +08:00
CGH0S7	57a7376044	Switch compiler toolchain from GCC to Intel oneAPI. makefile.inc: replace GCC compilers with Intel oneAPI (C/C++: gcc/g++ -> icx/icpx; Fortran: gfortran -> ifx; MPI linker: mpic++ -> mpiicpx) and update LDLIBS and compiler flags accordingly. macrodef.h: fix include path (microdef.fh -> macrodef.fh). Requires: source /home/intel/oneapi/setvars.sh before build.	2026-01-15 16:32:12 +08:00
CGH0S7	cd5ceaa15f	main branch updated	2026-01-14 08:55:53 +08:00
CGH0S7	f2fc9af70e	asc26 amss-ncku initialized	2026-01-13 15:01:15 +08:00

13 Commits
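
Commit 19274e93d1 describes falling back from 4th-order to 2nd-order derivative stencils near boundaries and zero-initializing the output arrays. The actual change is in Fortran (bssn_rhs_opt.f90) and is not shown in this log, so the following is only a minimal one-dimensional Python sketch of that idea. The function name first_derivative, the uniform grid spacing h, and the choice to leave the two outermost points at zero are assumptions for illustration, not the repository's logic.

```python
def first_derivative(f, h):
    """Centered first derivative on a uniform 1D grid (illustrative sketch).

    Uses the standard 4th-order centered stencil where two neighbours exist
    on each side, falls back to the 2nd-order centered stencil one point in
    from each edge, and leaves the outermost points at zero (an assumption).
    """
    n = len(f)
    df = [0.0] * n  # zero-initialize so no entry is ever left undefined
    for i in range(n):
        if 2 <= i <= n - 3:
            # 4th-order centered difference
            df[i] = (f[i - 2] - 8.0 * f[i - 1] + 8.0 * f[i + 1] - f[i + 2]) / (12.0 * h)
        elif 1 <= i <= n - 2:
            # 2nd-order fallback: the wide stencil would reach past the grid
            df[i] = (f[i + 1] - f[i - 1]) / (2.0 * h)
        # i == 0 and i == n - 1 keep the initial 0.0 in this sketch
    return df


# Usage: f(x) = x**2 sampled at x = 0..5 with h = 1; both stencils are exact
# for quadratics, so interior values equal 2*x.
samples = [float(x * x) for x in range(6)]
result = first_derivative(samples, 1.0)
```

Because the output starts as zeros, any point that neither stencil covers holds a defined value rather than uninitialized memory, which is the failure mode the commit message says it guards against.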