Parallel processing
endif
c$omp end parallel
call omp_set_nested(.false.)
...
Using MLIB_NUMBER_OF_THREADS set to 1, the code would run
C = αAB + βC
and another for
F = αDE + βF
Setting MLIB_NUMBER_OF_THREADS to 2 would allow nested parallelism and run the code
If a parallel VECLIB subprogram is called from a parallelized loop or region, VECLIB will automatically avoid
• MLIB_NUMBER_OF_THREADS
• The number of threads still available in the system
• A fixed upper bound: the result will never be larger than four. Specifically:
MIN(MLIB_NUMBER_OF_THREADS, threads still available, 4)
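The limit above can be worked through with a few sample values. The following is a minimal sketch in free-form Fortran, assuming the MIN expression gives the number of threads a parallelized VECLIB call may use. The function thread_limit, its argument names, and the sample inputs are hypothetical and serve only as illustration; they are not part of VECLIB.

```fortran
program thread_limit_demo
  implicit none

  ! Sample inputs are made up for illustration:
  ! (value of MLIB_NUMBER_OF_THREADS, threads still available)
  print '(i0)', thread_limit(8, 6)   ! prints 4: the fixed cap of four applies
  print '(i0)', thread_limit(2, 6)   ! prints 2: MLIB_NUMBER_OF_THREADS applies
  print '(i0)', thread_limit(8, 1)   ! prints 1: only one thread is still available

contains

  ! Hypothetical helper (not a VECLIB routine): evaluates
  ! MIN(MLIB_NUMBER_OF_THREADS, threads still available, 4).
  integer function thread_limit(mlib_threads, avail_threads)
    integer, intent(in) :: mlib_threads    ! assumed value of MLIB_NUMBER_OF_THREADS
    integer, intent(in) :: avail_threads   ! assumed count of threads still available
    integer, parameter  :: hard_cap = 4    ! the fixed upper bound stated in the text

    thread_limit = min(mlib_threads, avail_threads, hard_cap)
  end function thread_limit

end program thread_limit_demo
```

Whichever of the three quantities is smallest determines the result, so raising MLIB_NUMBER_OF_THREADS above four has no effect in this sketch.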
Message
Nested parallelism can be achieved when calling VECLIB parallelized subprograms from an MPI process. (See “Parallelized subprograms in VECLIB” on page 1104.) Consider the following code:
...
call mpi_init(ierr)
call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr)
if (myid .eq. 0) then
  call dgemm('n', 'n', m, m, m, alpha, a, lda, b, ldb, beta, c, ldc)
else
  call dgemm('n', 'n', m, m, m, alpha, d, ldd, e, lde, beta, f, ldf)
endif
...
Chapter 1 Introduction to VECLIB 21