HP UX Performance Tools 48

Parallel processing

•Call VECLIB subprograms in a parallelized loop or region. VECLIB supports nested parallelism where the outer parallelism is implemented through OpenMP while the inner parallelism is implemented with VECLIB SMP parallelism. To use this mechanism, you must be familiar with the techniques of parallel processing. Refer to the Parallel Programming Guide for HP-UX Systems for details.

•Use the Message Passing Interface (MPI) explicit parallel model. Refer to the HP MPI User’s Guide or the MPI(1) man page for details.

VECLIB subprograms are reentrant, meaning that they may be called several times in parallel to do independent computations without one call interfering with another. You can use this feature to call VECLIB subprograms in a parallelized loop or region.

The compiler does not automatically parallelize loops containing a function reference or subroutine call. You can force it to parallelize such a loop by defining OpenMP parallel regions.

For example, the following Fortran code makes parallel calls to subprogram DAXPY:

      NTHREADS = 4
C$OMP PARALLEL DO NUM_THREADS(NTHREADS)
      DO J = 1, N
         CALL DAXPY (N-I, A(I,J), A(I+1,I), 1, A(I+1,J), 1)
      END DO
C$OMP END PARALLEL DO
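The fragment above omits declarations and data. A minimal, self-contained sketch of the same pattern might look like the following. It is written in free-form Fortran for readability, and the program name, array sizes, and values are illustrative assumptions, not taken from this guide. So that the sketch builds without VECLIB, it also defines a simple stand-in DAXPY (positive increments only); when linking against VECLIB or another BLAS, that stand-in should be removed.

```fortran
program parallel_daxpy
  implicit none
  integer, parameter :: n = 1000, ncols = 8   ! illustrative sizes (assumed)
  double precision :: x(n), y(n, ncols)
  integer :: j

  x = 1.0d0
  y = 2.0d0

  ! Each iteration updates a different column of y, so the calls are
  ! independent and can safely run on separate threads.
  !$omp parallel do num_threads(4)
  do j = 1, ncols
     call daxpy(n, dble(j), x, 1, y(1, j), 1)   ! y(:,j) := j*x + y(:,j)
  end do
  !$omp end parallel do

  ! Every element of column j should now equal 2 + j.
  do j = 1, ncols
     if (any(abs(y(:, j) - (2.0d0 + dble(j))) > 1.0d-12)) then
        error stop 'unexpected result'
     end if
  end do
  print *, 'all columns updated'
end program parallel_daxpy

! Stand-in for the library DAXPY (y := alpha*x + y), provided only so this
! sketch compiles on its own. It handles positive increments only.
subroutine daxpy(n, alpha, x, incx, y, incy)
  implicit none
  integer :: n, incx, incy
  double precision :: alpha, x(*), y(*)
  integer :: i, ix, iy
  ix = 1
  iy = 1
  do i = 1, n
     y(iy) = y(iy) + alpha * x(ix)
     ix = ix + incx
     iy = iy + incy
  end do
end subroutine daxpy
```

If the compiler's OpenMP option is not enabled, the directives are treated as comments and the loop runs serially with the same result.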

When optimizing a parallel program, you may want to make parallel calls to VECLIB subprograms that perform independent operations but whose call statements are not in a loop. OpenMP supports the PARALLEL and END PARALLEL directives, which define a block of code to be executed by multiple threads in parallel.
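As a rough illustration of calls outside a loop, the sketch below places two independent DAXPY calls in a PARALLEL region and uses the OpenMP SECTIONS construct to hand each call to a different thread. The use of SECTIONS, the vector names, and the values are assumptions chosen for this example; the guide itself only names the PARALLEL and END PARALLEL directives. As before, a simple stand-in DAXPY (positive increments only) is included so the sketch builds without VECLIB, and it should be removed when linking against the library.

```fortran
program parallel_region_calls
  implicit none
  integer, parameter :: n = 1000            ! illustrative size (assumed)
  double precision :: x(n), y(n), u(n), v(n)

  x = 1.0d0
  y = 2.0d0
  u = 3.0d0
  v = 4.0d0

  ! The two calls touch disjoint data, so they can run concurrently.
  !$omp parallel num_threads(2)
  !$omp sections
  !$omp section
  call daxpy(n, 2.0d0, x, 1, y, 1)          ! y := 2*x + y  -> 4
  !$omp section
  call daxpy(n, -1.0d0, u, 1, v, 1)         ! v := -u + v   -> 1
  !$omp end sections
  !$omp end parallel

  if (any(abs(y - 4.0d0) > 1.0d-12)) error stop 'y is wrong'
  if (any(abs(v - 1.0d0) > 1.0d-12)) error stop 'v is wrong'
  print *, 'both independent updates completed'
end program parallel_region_calls

! Stand-in for the library DAXPY (y := alpha*x + y), provided only so this
! sketch compiles on its own. It handles positive increments only.
subroutine daxpy(n, alpha, x, incx, y, incy)
  implicit none
  integer :: n, incx, incy
  double precision :: alpha, x(*), y(*)
  integer :: i, ix, iy
  ix = 1
  iy = 1
  do i = 1, n
     y(iy) = y(iy) + alpha * x(ix)
     ix = ix + incx
     iy = iy + incy
  end do
end subroutine daxpy
```

This works only because the library routine is reentrant and the two calls share no output data; calls that write to the same vector would need synchronization or a different decomposition.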

OpenMP-based nested parallelism

Nested parallelism can be achieved when calling VECLIB parallelized subprograms from an OpenMP parallel region. (See “Parallelized subprograms in VECLIB” on page 1104.) Consider the following code running on an HP platform with at least four processors:

...

      call omp_set_nested (.true.)
c$omp parallel NUM_THREADS(2)
      myid = omp_get_thread_num()
      if (myid.eq.0) then
      call dgemm('n', 'n', m, m, m, alpha, a, lda, b, ldb, beta, c, ldc)
      else
      call dgemm('n', 'n', m, m, m, alpha, d, ldd, e, lde, beta, f, ldf)

20HP MLIB User’s Guide