NOTE: A subroutine (but not a function) is always expected to have side effects. If you apply this directive to a subroutine call, the optimizer assumes that the call has no effect on program results and can eliminate the call to improve performance.
Indeterminate iteration counts
If the compiler finds that a runtime determination of a loop's iteration count cannot be made before the loop starts to execute, the compiler will not parallelize the loop. The reason for this precaution is that the runtime code must know the iteration count in order to determine how many iterations to distribute to the executing processors.
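To illustrate why the count must be known in advance, here is a minimal sketch of one way a fixed number of iterations could be divided among processors. The block-partitioning scheme, the program name, and the values of `n` and `nprocs` are assumptions made for illustration; the text does not describe how the actual runtime schedules iterations.

```fortran
program block_partition
  implicit none
  integer, parameter :: n = 10       ! iteration count, known before the loop starts
  integer, parameter :: nprocs = 4   ! hypothetical number of processors
  integer :: p, chunk, extra, lo, hi

  chunk = n / nprocs                 ! iterations that every processor receives
  extra = mod(n, nprocs)             ! leftover iterations, one each for the first processors

  lo = 1
  do p = 1, nprocs
     hi = lo + chunk - 1
     if (p <= extra) hi = hi + 1     ! the first "extra" processors take one more iteration
     print *, 'processor', p, ': iterations', lo, 'to', hi
     lo = hi + 1
  end do
end program block_partition
```

None of these bounds can be computed unless `n` is available before the first iteration runs, which is the situation the conditions listed next rule out.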
The following conditions can prevent a runtime count:
• The loop is a
• An EXIT statement appears in the loop.
• The loop contains a conditional GO TO statement that exits from the loop.
• The loop modifies either the
• The loop is a DO WHILE construct and the condition being tested is defined within the loop.
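The following sketch contrasts a loop whose iteration count is fixed on entry with two loops that match conditions in the list above. The program, array names, and data values are hypothetical and chosen only for illustration.

```fortran
program iteration_counts
  implicit none
  integer, parameter :: n = 8
  real :: a(n), b(n), x
  integer :: i

  do i = 1, n
     a(i) = real(i)
  end do
  b = 0.0

  ! Countable: the loop runs exactly n times, known before it starts.
  do i = 1, n
     b(i) = 2.0 * a(i)
  end do

  ! Not countable in advance: the EXIT depends on data examined inside
  ! the loop, so the number of iterations is only known once it stops.
  do i = 1, n
     if (a(i) > 4.5) exit
     b(i) = 0.0
  end do
  print *, 'EXIT loop stopped at i =', i

  ! Not countable in advance: the DO WHILE condition tests x, and x is
  ! defined (reassigned) inside the loop body.
  x = 1.0
  do while (x < 100.0)
     x = 2.0 * x
  end do
  print *, 'DO WHILE loop ended with x =', x
end program iteration_counts
```

Only the first loop gives the runtime a count it can divide among processors before execution begins.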
Data dependences
When a loop is parallelized, the iterations are executed independently on different processors, and the order of execution will differ from the serial order when executing on a single processor. This difference is not a problem if the iterations can occur in any order with no effect on the results. Consider the following loop:
DO I = 1, 5
   A(I) = A(I) * B(I)
END DO
In this example, the array A will always end up with the same data regardless of whether the order of execution is
Such is not the case in the following:
DO I = 2, 5
   A(I) =
END DO
In this loop, the order of execution does matter. The data used in iteration I is dependent upon the data that was produced in the previous iteration
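The effect of execution order can be checked directly by running each kind of loop forward and backward and comparing the results. This is a minimal sketch: the reversed loops stand in for "some other order" that parallel execution might produce, and the dependent loop assumes a recurrence of the form A(I) = A(I-1) * B(I), which is an assumption made for illustration rather than the body given in the text. All names and data values are hypothetical.

```fortran
program order_matters
  implicit none
  integer, parameter :: n = 5
  real :: b(n), a_fwd(n), a_rev(n)
  integer :: i

  do i = 1, n
     b(i) = real(i + 1)
  end do

  ! Independent iterations: each A(I) uses only its own element.
  a_fwd = 1.0
  a_rev = 1.0
  do i = 1, n
     a_fwd(i) = a_fwd(i) * b(i)
  end do
  do i = n, 1, -1
     a_rev(i) = a_rev(i) * b(i)
  end do
  print *, 'independent loop, same result in both orders:', all(a_fwd == a_rev)

  ! Dependent iterations (assumed recurrence): A(I) reads A(I-1),
  ! which the previous iteration is expected to have written.
  a_fwd = 1.0
  a_rev = 1.0
  do i = 2, n
     a_fwd(i) = a_fwd(i-1) * b(i)
  end do
  do i = n, 2, -1
     a_rev(i) = a_rev(i-1) * b(i)
  end do
  print *, 'dependent loop, same result in both orders:  ', all(a_fwd == a_rev)
end program order_matters
```

The first comparison holds and the second does not, because the reversed dependent loop reads A(I-1) before the iteration that should have updated it has run.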
Not all data dependences inhibit parallelization. The following paragraphs discuss some of the exceptions.
Nested loops and matrices
Some nested loops that operate on matrices may have a data dependence in the inner loop only, allowing the outer loop to be parallelized. Consider the following:
DO I = 1, 10
   DO J = 2, 100
      A(J,I) =
   END DO
END DO
The data dependence in this nested loop occurs in the inner (J) loop: each row access of A(J,I) depends upon the preceding row