Using Intel® MKL with Threaded Applications

It seems that calling threaded Intel MKL routines from multiple application threads can lead to conflicts (including incorrect answers or program failures) or, at best, unexpectedly long CPU times.

A good and thorough description of the issue, as well as the workaround, is given on the Intel website: Using Intel® MKL with Threaded Applications.

The crux of the problem, according to Intel, is as follows:

Intel MKL can be aware that it is in a parallel region only if the threaded program and Intel MKL are using the same threading library. If the user program is threaded by some other means, Intel MKL may operate in multithreaded mode and the computations may be corrupted.

Intel's recommendations for several cases are as follows:


  1. User threads the program using OS threads (pthreads on Linux*, Win32* threads on Windows*). If more than one thread calls Intel MKL and the function being called is threaded, it is important that threading in Intel MKL be turned off. Set OMP_NUM_THREADS=1 in the environment.
  2. User threads the program using OpenMP directives and/or pragmas and compiles the program using a compiler other than a compiler from Intel. This is more problematic because setting OMP_NUM_THREADS in the environment affects both the compiler's threading library and the threading library with Intel MKL. In this case, the safe approach is to set OMP_NUM_THREADS=1.
  3. Multiple programs are running on a multiple-CPU system. In cluster applications, the parallel program can run separate instances of the program on each processor. However, the threading software will see multiple processors on the system even though each processor has a separate process running on it. In this case OMP_NUM_THREADS should be set to 1.
  4. If the OMP_NUM_THREADS environment variable is not set, then the default number of threads is assumed to be 1.
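
For cases 1 to 3 above, one way to apply the recommendation is to set OMP_NUM_THREADS=1 for a single launch of the program, without changing the rest of the shell session. The following is only a minimal sketch for a POSIX shell: the `app` string is a hypothetical stand-in for the real threaded executable that calls Intel MKL, and it does nothing except report the value it inherited.

```shell
#!/bin/sh
# Hypothetical stand-in for a threaded application that calls Intel MKL.
# It only prints the OMP_NUM_THREADS value it inherited; substitute the
# real executable here.
app='echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"'

# Prefix form: the variable is placed in the environment of this one
# command only, so other programs started from the shell are unaffected.
app_output=$(OMP_NUM_THREADS=1 sh -c "$app")
echo "$app_output"
```

The prefix form is convenient when only one program on the machine needs MKL threading turned off.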


Setting the Number of Threads for OpenMP* (OMP)

The OpenMP* software responds to the environment variable OMP_NUM_THREADS:
  1. Windows*: Open the Environment panel of the System Properties box of the Control Panel on Microsoft* Windows NT*, or set it in the shell the program is running in with the command: set OMP_NUM_THREADS=.
  2. Linux*: To set and export the variable: "export OMP_NUM_THREADS=".
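
On Linux, the word "export" matters: a plain assignment is visible only inside the current shell, while an exported variable is inherited by the programs started from it. The sketch below illustrates this in a POSIX shell, using the value 1 from the recommendations above and a child `sh` process as a stand-in for the real program (both are assumptions made for illustration).

```shell
#!/bin/sh
# Start from a clean state so the demonstration does not depend on
# whatever the calling environment already exported.
unset OMP_NUM_THREADS

# Plain assignment: the value exists in this shell but is not passed
# on to child processes.
OMP_NUM_THREADS=1
child_before=$(sh -c 'echo "${OMP_NUM_THREADS:-unset}"')

# After export, child processes inherit the value.
export OMP_NUM_THREADS
child_after=$(sh -c 'echo "${OMP_NUM_THREADS:-unset}"')

echo "child sees before export: $child_before"
echo "child sees after export:  $child_after"
```

Checking the value from a child process in this way is a quick test of whether the setting will actually reach the application.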

This issue was mentioned by Axel Kohlmeyer at this forum on Parallelization Issues.