An excellent tutorial can be found here (look for the English version) : http://www.idris.fr/formations/openmp/
Only a few tips and examples are provided here, because the IDRIS tutorial is more than enough to learn OpenMP.
Resources :
Beware : calling random_number inside OpenMP parallel regions results in extremely slow code execution. Only one random_number call (C/Fortran) can execute at a time on a single socket (not a core !), so the threads end up waiting for each other. A good way to work around this issue is to use Marsaglia’s Ziggurat algorithm. An OpenMP example can be found here : http://people.sc.fsu.edu/~jburkardt/f_src/ziggurat_openmp/ziggurat_openmp.html
(Many thanks to Adrien Cassagne, who translated the Heat equation and Conjugate gradient examples to C.)
gcc -fopenmp myprogramme.c -o myprogramme.exe
Same command with g++ (.cpp files) and gfortran (.f and .f90 files). Then set the desired number of threads (by default, the system uses the number of available logical cores). Here, we ask for 2 threads :
export OMP_NUM_THREADS=2
You can now launch the program as usual :
./myprogramme.exe
Note : you will need to specify the number of desired threads in each new terminal/console used.
icc -openmp myprogramme.c -o myprogramme.exe
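Same command with icpc (.cpp files) and ifort (.f and .f90 files). On recent Intel compiler versions the flag is spelled -qopenmp; -openmp is the older, deprecated spelling. Then set the desired number of threads (by default, the system uses the number of available logical cores). Here, we ask for 2 threads :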
export OMP_NUM_THREADS=2
You can now launch the program as usual :
./myprogramme.exe
Note : you will need to specify the number of desired threads in each new terminal/console used.
Information on the node :
> numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 8 9 10 11
node 0 size: 18423 MB
node 0 free: 17137 MB
node 1 cpus: 4 5 6 7 12 13 14 15
node 1 size: 18432 MB
node 1 free: 17479 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
Intel compiler only :
export OMP_NUM_THREADS=4
export KMP_AFFINITY=verbose,granularity=fine,proclist=[0,1,4,5],explicit