Compiling JDFTx to use the NVIDIA A100 GPUs on NERSC Perlmutter requires a specific combination of gcc and cuda modules to work correctly. The additional flags below are required to link the non-threaded Cray libsci and to avoid a crash within libsci during cleanup. The following commands may be used to invoke cmake (assuming a bash shell):
    #Select the appropriate compiler and load necessary modules
    module load PrgEnv-gnu
    module load cmake cray-fftw systemlayer
    module load gcc/9.3.0
    module load cuda/11.2.1
    export CRAY_ACCEL_TARGET=nvidia80   # needed for CUDA-aware MPI support

    CC=cc CXX=CC cmake \
       -D EnableProfiling=yes \
       -D FFTW3_PATH=${FFTW_ROOT} \
       -D GSL_PATH=${GSL_ROOT} \
       -D CBLAS_LIBRARY=${GSL_ROOT}/lib/libgslcblas.so \
       -D EnableCUDA=yes \
       -D EnableCuSolver=yes \
       -D CudaAwareMPI=yes \
       -D PinnedHostMemory=yes \
       -D CUDA_NVCC_FLAGS="-Wno-deprecated-gpu-targets -allow-unsupported-compiler -std=c++14" \
       -D CUDA_ARCH=compute_80 \
       -D CUDA_CODE=sm_80 \
       -D EXTRA_CXX_FLAGS="-DDONT_FINALIZE_MPI -L/opt/cray/pe/lib64 -lsci_gnu_82" \
       ../jdftx-git/jdftx
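After cmake completes, build as usual from within the build directory; the parallelism level below is just a placeholder:

    # Build with the same modules still loaded (job count is a placeholder)
    make -j16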
Note that the GSL module on Perlmutter (at the time of writing) is not compatible with the compiler versions needed for cuda. Consequently, you need to compile the GNU Scientific Library yourself in some path, indicated as ${GSL_ROOT} above. Fetch the latest GSL source code, extract it, and within the source directory:
    #Compile GSL with same compiler versions as needed for JDFTx
    module load PrgEnv-gnu
    module load gcc/9.3.0
    ./configure CC=cc CXX=CC FC=ftn --enable-shared --prefix=${GSL_ROOT}
    make -j32
    make install
Do this first and then use the same ${GSL_ROOT} in the jdftx cmake command above.
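For concreteness, a minimal sketch of fetching and extracting GSL and choosing ${GSL_ROOT}; the version number and install path below are placeholders, so substitute the latest release and any writable directory:

    # Placeholder install location for GSL (any writable path works)
    export GSL_ROOT=$HOME/codes/gsl

    # Fetch and unpack a GSL release (2.7.1 is a placeholder version)
    wget https://ftp.gnu.org/gnu/gsl/gsl-2.7.1.tar.gz
    tar xzf gsl-2.7.1.tar.gz
    cd gsl-2.7.1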
When running jobs on Perlmutter, make sure the same modules are loaded (the four module load lines in the compile script above) before launching a job. Here's an example job script using two nodes with four GPUs each:
    #!/bin/bash
    #SBATCH -A account_g          # replace with valid account
    #SBATCH -t 10                 # replace with suitable time limit
    #SBATCH -C gpu
    #SBATCH -q regular
    #SBATCH -N 2
    #SBATCH -n 8
    #SBATCH --ntasks-per-node=4
    #SBATCH -c 32
    #SBATCH --gpus-per-task=1

    export SLURM_CPU_BIND="cores"
    export JDFTX_MEMPOOL_SIZE=8192       # adjust as needed (in MB)
    export MPICH_GPU_SUPPORT_ENABLED=1   # needed for CUDA-aware MPI support

    srun /path/to/jdftx_gpu -i inputfile.in
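If you would rather load the modules inside the job script than in your login environment, add the same four module load lines from the compile script before the srun line; submission is then the usual sbatch call (the script name is a placeholder):

    # Same four module load lines as in the compile script above,
    # placed before the srun line in the job script
    module load PrgEnv-gnu
    module load cmake cray-fftw systemlayer
    module load gcc/9.3.0
    module load cuda/11.2.1

    # Submit the job (script name is a placeholder)
    sbatch job.sh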
Please report any issues with building or running JDFTx on Perlmutter. In particular, the module versions may need to change after each Cray update. Some of the additional tweaks above will hopefully become unnecessary as the Cray libraries stabilize.
Use the GNU compilers and MKL to compile JDFTx on NERSC Edison/Cori. The following commands may be used to invoke cmake (assuming a bash shell):
    #Select the right compiler and load necessary modules
    module swap PrgEnv-intel PrgEnv-gnu
    module load gcc cmake gsl cray-fftw
    module unload darshan
    export CRAYPE_LINK_TYPE="dynamic"

    #From inside your build directory
    #(assuming relative paths as in the generic instructions above)
    CC="cc -dynamic -lmpich" CXX="CC -dynamic -lmpich" cmake \
       -D EnableProfiling=yes \
       -D EnableMKL=yes \
       -D ForceFFTW=yes \
       -D ThreadedBLAS=no \
       -D GSL_PATH=${GSL_DIR} \
       -D FFTW3_PATH=${FFTW_DIR} \
       -D CMAKE_INCLUDE_PATH=${FFTW_INC} \
       ../jdftx-VERSION/jdftx
    make -j12
The optional ThreadedBLAS=no setting above links single-threaded MKL, with threads managed by JDFTx instead. This slightly reduces performance (by around 5%) compared to using MKL threads, but MKL threads frequently cause a crash when pthreads are created elsewhere in JDFTx on NERSC.
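As a rough illustration only, a sketch of a corresponding job script for Cori Haswell nodes; the time limit, node count, constraint, and -c value are placeholders and assumptions, so check the NERSC documentation for current recommended settings:

    #!/bin/bash
    #SBATCH -t 30             # placeholder time limit (minutes)
    #SBATCH -q regular
    #SBATCH -C haswell        # assumption: Cori Haswell partition
    #SBATCH -N 2              # placeholder node count
    #SBATCH -n 2              # one MPI task per node
    #SBATCH -c 64             # assumption: 2 hyperthreads x 32 cores per Haswell node

    export SLURM_CPU_BIND="cores"
    # JDFTx manages the threads within each task since MKL is single-threaded here
    srun /path/to/jdftx -i inputfile.in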
Use the GNU compilers and MKL for the easiest compilation on TACC Stampede. The following commands may be used to invoke cmake (assuming bash shell):
    #Select gcc as the compiler:
    module load gcc/4.7.1
    module load mkl gsl cuda cmake fftw3

    #Configure:
    CC=gcc CXX=g++ cmake \
       -D EnableCUDA=yes \
       -D EnableMKL=yes \
       -D ForceFFTW=yes \
       -D MKL_PATH="$TACC_MKL_DIR" \
       -D FFTW3_PATH="$TACC_FFTW3_DIR" \
       -D GSL_PATH="$TACC_GSL_DIR" \
       ../jdftx-VERSION/jdftx
    make -j12
Run make (as shown above) on the login nodes, or on a node from the gpu queue if you loaded cuda; otherwise it should work on any machine.
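To run the resulting executable, a hedged sketch of a Stampede job script; the queue name, allocation, and paths are placeholders (TACC systems launch MPI jobs with ibrun rather than srun):

    #!/bin/bash
    #SBATCH -J jdftx
    #SBATCH -A project_name   # placeholder allocation
    #SBATCH -p gpu            # assumption: GPU queue; use a CPU queue for non-CUDA builds
    #SBATCH -N 2              # placeholder node count
    #SBATCH -n 2
    #SBATCH -t 00:30:00       # placeholder time limit

    # Load the same modules used for compilation
    module load gcc/4.7.1 mkl gsl cuda fftw3
    ibrun /path/to/jdftx_gpu -i inputfile.in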