Enable multi-threaded MKL for better resource utilization
- Changed from libmkl_sequential to libmkl_intel_thread
- Added automatic MKL thread count configuration (104 cores / MPI_processes)
- Updated runtime scripts to set MKL_NUM_THREADS environment variable
- Added comprehensive optimization documentation

Expected improvement: 5-15% from better MKL utilization
Note: Main performance bottleneck is in computation loops, not MKL functions
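
The thread-count rule above (104 cores divided by the number of MPI processes) could look roughly like the following in a runtime script. This is only a sketch: the function name `mkl_threads_for` and the example rank count of 8 are hypothetical and do not come from the scripts changed in this commit.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit's scripts):
# threads per MPI rank = total cores / number of ranks, never below 1.
mkl_threads_for() {
    total_cores=$1
    mpi_procs=$2
    # Guard against a zero or negative rank count (assumed fallback: 1 rank).
    if [ "$mpi_procs" -lt 1 ]; then
        mpi_procs=1
    fi
    threads=$(( total_cores / mpi_procs ))
    # Integer division yields 0 when ranks outnumber cores; clamp to 1.
    if [ "$threads" -lt 1 ]; then
        threads=1
    fi
    echo "$threads"
}

# 104 cores is the figure from the commit message; 8 ranks is an assumed example.
MKL_NUM_THREADS=$(mkl_threads_for 104 8)
export MKL_NUM_THREADS
echo "MKL_NUM_THREADS=$MKL_NUM_THREADS"
```

The variable has to be exported before the MPI job is launched so that each rank inherits it. Integer division rounds down, which leaves some cores idle when the rank count does not divide 104 evenly.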
@@ -6,10 +6,12 @@
 ## Intel oneAPI version with oneMKL (Optimized for performance)
 filein = -I/usr/include/ -I${MKLROOT}/include
 
-## Using sequential MKL (OpenMP disabled for better single-threaded performance)
+## Using multi-threaded MKL for better scalability with MPI
+## This allows MKL functions (FFT, BLAS, LAPACK) to use multiple threads internally
+## while keeping the application code as pure MPI (no OpenMP pragmas in user code)
 LDLIBS = -L/usr/lib/x86_64-linux-gnu -L/usr/lib64 -lifcore -limf -lmpi \
-         -L${MKLROOT}/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core \
-         -lpthread -lm -ldl
+         -L${MKLROOT}/lib -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core \
+         -liomp5 -lpthread -lm -ldl
 
 ## Aggressive optimization flags:
 ## -O3: Maximum optimization