Parallel processing
Assume the application is started on two MPI processes. With
MLIB_NUMBER_OF_THREADS set to 1, the code would run with one MPI process computing
C = αAB + βC
and another computing
F = αDE + βF
Setting MLIB_NUMBER_OF_THREADS to 2 would allow nested parallelism and run the code
Default CPS library stack is too small for MLIB
In libcps, the HP Compiler Parallel Support library, a CPS thread has a default stack size of 8 MB. For performance reasons, several subprograms in HP MLIB place temporary arrays on the stack, and these arrays can exceed the default size. With the default CPS stack size, these routines overwrite neighboring stacks, resulting in errors that are difficult to diagnose.
The solution is to set the CPS thread stack size attribute to a value large enough to accommodate every MLIB subprogram the thread may call. Currently, 8 MB × (the number of threads) should be sufficient for all MLIB subprograms.
The environment variable CPS_STACK_SIZE expects its value in KB. Setting the stack size as follows would be sufficient for programs that execute on two threads (2 × 8 MB = 16 MB = 16384 KB):
For C shell:
% setenv CPS_STACK_SIZE 16384
For Korn shell:
$ export CPS_STACK_SIZE=16384
Default Pthread library stack is too small for MLIB
The stack allocated for each new thread created by a direct call to pthread_create might not be large enough for HP MLIB. Several subprograms in HP MLIB store temporary work arrays on the stack to improve performance. If the stack is not large enough, these routines overwrite neighboring stacks, resulting in errors that are difficult to diagnose.