Enable OpenMP threading in the finite-difference kernels (diff_new, diff_new_sh, diff_newwb,
lopsidediff, kodiss, kodiss_sh) by adding collapse(3) directives to 36 triple-nested loops.
Update build flags (-qopenmp), MPI process binding, and runtime configuration.
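
As a rough illustration of the loop pattern described above, here is a minimal sketch in C. The function name diff_x_centered, the helper idx, the array layout, and the second-order centered stencil are all assumptions made for this example; the actual kernels may use a different language, stencil order, and boundary treatment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: flat index into an nx*ny*nz array, x varying fastest. */
static size_t idx(int i, int j, int k, int nx, int ny) {
    return (size_t)i + (size_t)nx * ((size_t)j + (size_t)ny * (size_t)k);
}

/* Illustrative kernel: second-order centered d/dx on interior points.
 * The three loops are perfectly nested with rectangular bounds, which is
 * what collapse(3) needs in order to merge them into one iteration space. */
void diff_x_centered(const double *f, double *fx,
                     int nx, int ny, int nz, double dx) {
    assert(nx >= 3 && dx > 0.0);
    const double inv2dx = 1.0 / (2.0 * dx);

    #pragma omp parallel for collapse(3)
    for (int k = 0; k < nz; ++k) {
        for (int j = 0; j < ny; ++j) {
            for (int i = 1; i < nx - 1; ++i) {
                fx[idx(i, j, k, nx, ny)] =
                    (f[idx(i + 1, j, k, nx, ny)] - f[idx(i - 1, j, k, nx, ny)])
                    * inv2dx;
            }
        }
    }
}
```

Without an OpenMP flag (-qopenmp for the Intel compilers, -fopenmp for GCC) the pragma is ignored and the loop runs serially with the same result, since each iteration writes a distinct element of fx.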

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Update makefile.inc with Intel oneAPI compiler flags and oneMKL linking
- Configure taskset CPU binding to use nohz_full cores (4-55, 60-111)
- Set build parallelism to 104 jobs for faster compilation
- Update MPI process count to 48 in input configuration
Bind all computation processes (ABE, ABEGPU, TwoPunctureABE) to
CPU cores 4-55 and 60-111 using numactl --physcpubind to prevent
interference with system processes on reserved cores.
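
To make the core list concrete, the sketch below expands a list in the "4-55,60-111" form into a per-core mask and counts the selected cores. The function parse_cpulist and the assumption of 112 logical CPUs (ids 0-111) are hypothetical and exist only for this illustration; the commits themselves rely on the binding tools, not on code like this.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: mark every core named in a list such as
 * "4-55,60-111" in `mask` (one byte per core) and return how many
 * cores were marked. Returns -1 on a malformed list or on a core id
 * outside [0, ncores). */
int parse_cpulist(const char *list, unsigned char *mask, int ncores) {
    assert(list != NULL && mask != NULL);
    int count = 0;
    for (int c = 0; c < ncores; ++c) mask[c] = 0;

    const char *p = list;
    while (*p != '\0') {
        char *end;
        long lo = strtol(p, &end, 10);      /* first core of the range */
        if (end == p) return -1;
        long hi = lo;
        if (*end == '-') {                  /* optional "-last" part */
            p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p) return -1;
        }
        if (lo < 0 || hi < lo || hi >= ncores) return -1;
        for (long c = lo; c <= hi; ++c) {
            if (!mask[c]) { mask[c] = 1; ++count; }
        }
        if (*end == ',') ++end;             /* move on to the next range */
        else if (*end != '\0') return -1;
        p = end;
    }
    return count;
}
```

For the list "4-55,60-111" with ncores set to 112, the function marks 52 cores in each range, 104 in total, and leaves cores 0-3 and 56-59 unmarked.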