As this command line implies, if you link and compile separately, you must use f90, not ld. The command line to link must also include the +Oparallel and +O3 options in order to link in the parallel runtime support.

Performance and parallelization

To ensure the best runtime performance from programs compiled for parallel execution, do not run more than one parallel program on a multiprocessor machine at the same time. Running two or more parallel programs simultaneously may result in their sharing the same processors, which will degrade performance. You should run a parallel-executing program at a higher priority than any other user program; see rtprio(1) for information about setting real-time priorities.

Running a parallel program on a heavily loaded system may also slow performance.

Profiling parallelized programs

You can profile a program that has been compiled for parallel execution in much the same way as for non-parallel programs:

1. Compile the program with the +gprof option.

2. Run the program to produce profiling data.

3. Run gprof against the program.

4. View the output from gprof.

The differences are:

Step 2 produces a gmon.out file with the CPU times for all executing threads.

In Step 4, the flat profile that you view uses the following notation to denote DO loops that were parallelized:

routine_name##pr_line_nnnn

where routine_name is the name of the routine containing the loop, pr (parallel region) indicates that the loop was parallelized, and nnnn is the line number of the start of the loop.

Conditions inhibiting loop parallelization

The following sections describe conditions that can cause the compiler not to parallelize a loop. These include the following:

Calling routines with side effects

Indeterminate iteration counts

Data dependences

Calling routines with side effects

The compiler will not parallelize any loop containing a call to a routine that has side effects. A routine has side effects if it does any of the following:

Modifies its arguments

Modifies a global, common-block, or SAVE variable

Redefines variables that are local to the calling routine

Performs I/O

Calls another subroutine or function that does any of the above
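The contrast between a routine with side effects and one without can be sketched as follows. This is a minimal illustration only: the names counters, total, bump, scale2, and side_effect_demo are hypothetical and do not come from this manual. Whether the compiler actually parallelizes the second loop also depends on the other conditions described in this chapter.

```fortran
! Hypothetical example: all names here are illustrative.
MODULE counters
  IMPLICIT NONE
  INTEGER :: total = 0          ! global state visible to every caller
CONTAINS
  ! Has side effects: modifies its argument and a global variable.
  SUBROUTINE bump(x)
    REAL, INTENT(INOUT) :: x
    x = x + 1.0
    total = total + 1
  END SUBROUTINE bump

  ! No side effects: only reads its argument and returns a value.
  REAL FUNCTION scale2(x)
    REAL, INTENT(IN) :: x
    scale2 = 2.0 * x
  END FUNCTION scale2
END MODULE counters

PROGRAM side_effect_demo
  USE counters
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000
  REAL :: a(n), b(n)
  INTEGER :: i

  a = 1.0

  ! This loop calls a routine with side effects, so by the rule
  ! above the compiler will not parallelize it.
  DO i = 1, n
    CALL bump(a(i))
  END DO

  ! This loop calls only a side-effect-free function, so the call
  ! by itself does not inhibit parallelization.
  DO i = 1, n
    b(i) = scale2(a(i))
  END DO

  PRINT *, total, b(1), b(n)
END PROGRAM side_effect_demo
```

In the first loop, bump both redefines its argument and updates the module variable total, which matches two of the conditions listed above. The second loop writes only to b(i), and scale2 touches nothing outside its own result.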

You can use the DIR$ NO SIDE EFFECTS directive to force the compiler to ignore side effects when determining whether to parallelize the loop. For information about this directive, see .

