HP UX Fortran Software manual Example 3 Example, Do I+1, N, +allowunaligned

Page 22

C$DIR IVDEP

Rules and behavior:

The IVDEPdirective is an assertion to the compiler’s optimizer about the order of memory references inside a DOloop.

The IVDEPdirective tells the compiler to begin dependence analysis by assuming all dependences occur in the same forward direction as their appearance in the normal scalar execution order. This contrasts with normal compiler behavior, which is for the dependence analysis to make no initial assumptions about the direction of a dependence.

The IVDEPdirective must precede the DOstatement for each DOloop it affects. No source code lines, other than the following.

The IVDEPdirective is applied to a DOloop in which the user knows that dependences are in lexical order. For example, if two memory references in the loop touch the same memory location and one of them modifies the memory location, then the first reference to touch the location has to be the one that appears earlier lexically in the program source code. This assumes that the right-hand side of an assignment statement is earlier than the left-hand side.

The IVDEPdirective informs the compiler that the program would behave correctly if the statements were executed in certain orders other than the sequential execution order, such as executing the first statement or block to completion of all iterations, then the next statement or block for all iterations, and so forth. The optimizer can use this information, along with whatever else it can prove about the dependences, to choose other execution orders.

Example 3 EXAMPLE:

In the following example, the IVDEP directive provides more information about the dependences within the loop, which may enable loop transformations to occur:

C$DIR IVDEP

DO I+1, N

A(INDARR(I)) = A(INDARR(I)) + B(I)

END DO

In this case, the scalar execution order follows:

Retrieve INDARR(I)

Use the result from Step 1 to retrieve A(INDARR(I));

Retrieve B(I);

Add the results from Steps 2 and 3 ;

Store the results from Step 4 into the location indicated by A(INDARR(I))from Step1.

IVDEPdirects the compiler to initially assume that when Steps 1 and 5 access a common memory location, Step 1 always accesses the location first because Step 1 occurs earlier in the execution sequence. This approach lets the compiler reorder instructions, as long as it chooses an instruction schedule that maintains the relative order of the array references.

+allow_unaligned

Relaxes the natural data type rules for alignment.

+[no]asm

+asm compiles the named programs and leaves the assembler-language output in

 

corresponding files whose names have the extension. The assembler-language output produced

 

by this option is not supported as input to the assembler. The default is +noasm. The -Soption

 

can be used to perform the same function as +asm.

+[no]autodbl

+autodblincreases the default size of integer, logical, and real items to 8 bytes; see Table

 

2-3. It also increases the default size of double precision and complex items to 16 bytes.

 

This option does not increase the size of the following:

 

Items of character type

 

Items declared with the BYTEstatement

 

Items declared with the DOUBLE COMPLEXstatement

 

Explicitly sized items

For example, the following are unaffected by +autodbl:

INTEGER(KIND=4)

INTEGER(4) J

REAL*8 D

22 Compiling and linking

Image 22
Contents HP Fortran Programmer Guide AbstractPage Contents Debugging Using the on statementControlling data storage Performance and optimizationCalling C routines from HP Fortran 110 Using Fortran directives 123Writing HP-UX applications 107 Migrating to HP Fortran 131Porting to HP Fortran 141 Fortran 2003 Features 151Documentation Feedback 153 Glossary 154 Index 159 HP secure development lifecycle HP Fortran compiler environment An overview of HP FortranAn overview of HP Fortran +dryrun DriverOptions for controlling the f90 driver +preinclude= filePreprocessor Options for controlling the C preprocessorFront-end Options for controlling the front end+moddir=directory Back-end Options for controlling optimizationOptimization Options for controlling code generation+Onooptimization +DAmodelLinker Options for controlling the LinkerLdirectory +FPflagsOoutfile HP-UX operating system ToolsWl ,options Compiling and linking Compiling with the f90 commandF90 command syntax $ f90 hello.f90Command-line options Command-line optionsF90 command syntax Example 2 hello.f90Commonly-used options Command-line options by categoryCommonly-used options +saveOption descriptions Options listed by categoryDo I+1, N Example 3 Example+allowunaligned +autodbl +autodbl4 Data type sizes and +autodbl414164 Boption+check=bounds +cpp=default+charlit77 +nocfc+DDdatamodel Name=def+DAmodel DatamodelareItanium2 BlendedItanium NativeValues for the +FP option Gformat77 Signals recognized by the +fpexception option+hugecommon Example 4 % f90 +hugecommon=results pcvals.f90 +initheapcomplex=rvalival /usr/include directory +noimplicitnone+indirectcommonlist=file +initheapinteger=ival+nocheckuf +io77Ipo +nolibsWith different values of optlevel Levels of optimizationRequires concurrent use of the +Oprofile=use option +noobjdebug+pa1 +r8 +demandload option. The default is +nodemandload+nodemandload the default +realconstant=singleF90com Tx,pathTp,/usr/ccs/lbin/cpp End.oWx,arg1,arg2,...,argN Bextern =symbol ,symbol Symbol binding optionsBdefault=symbol,symbol Bhidden =symbol ,symbolReviewing general optimization options Using optimization optionsF90 +O3 +Osize myprog.f90 +Onoautopar +Oconservative+Onoall +Oautopar and omit +OparallelFine-tuning optimization options F90 +O3 +Onomoveflops +Ofltacc myprog.f90+Ocachepadcommon option Default is +OnocxlimitedrangeDefault is +Odataprefetch +Onocxlimitedrange+Onoentrysched +Onofenvaccess+Onofastaccess +OnofailsafeOptimizations performed by +Onofltacc +Onoinlinefilename +Oinlinebudget=n +Oinlinebudget enables+Onoinline +Onoinline=function1,function2Values for the +Oinlinebudget option Millicode versions of intrinsic functions+Oloopunroll=4 +inlinelevel num+Onoloopunroll=factor +Onoloopunrolljam+Onoparmsoverlap Default is+Onoparmsoverlap+Oparallelintrinsics +OnopipelineDefault is +Oshortdata=8 Default is +Onopromoteindirectcalls+Onorecovery For +Oprofile=collectarc,strideFilenames Filenames recognized by f90Linking HP Fortran programs Linking with f90 vs. ldLinking to libraries Libraries linked by default on PA-RISCLibraries linked by default on Itanium $ f90 -c hello.f90 # compileLinking to nondefault libraries Linking HP Fortran 90 routinesLinking to shared libraries Additional HP Fortran librariesOpt/fortran90/lib/pa2064/ -lF90 -lisamstub Library search rules Special-purpose compilationsCompiling programs with modules $ f90 -Wl,-a,archive prog.f90 -lmSpecial-purpose compilations Example 6 Example 2-2 main.f90 ExamplesExample Example 7 Example 2-3 code.f90$ f90 -o dostats data.f90 code.f90 main.f90 Compiling with makeExample 8 Example 2-4 data.f90 $ dostatsExample 9 Example 2-5 makefile Compiling for different PA-RISC machinesManaging .mod files $ makeCreating shared libraries Compiling with +picUsing the C preprocessor Linking with -bExample 13 Example 2-9 cppdirect.f90 Using the C preprocessorProcessing cpp directives $ f90 +cpp=yes -D Debug cppdirect.f90Creating shared executables Creating demand-loadable executablesSaving the cpp output file $ f90 +noshared prog.f90 Compiling in 64-bit modeUsing environment variables HP Fortran environment variablesHPF90OPTS environment variable F90ROOT environment variableSTF90COM64 environment variable $ f90 +list hello.f90Lpath environment variable Floating installationFloating installation Mpnumberofthreads environment variableSetting up floating installation Alternate-path/opt/fortran90.3.6.1Automatic and static variables Controlling data storageDisabling implicit typing Disabling implicit typingContains Controlling data storageIncreasing the precision of constants Increasing default data sizes Increasing default data sizesIncreasing default data sizes Usr/lib/libpthread.sl Sharing data among programsWhich creates multiple threads $ gotosleep Sharing data among programs$ wakeup Modules vs. common blocksIm up Modules vs. common blocks Debugging Using the HP WDB debuggerStripping debugging information Handling runtime exceptions Signals recognized by +fpexceptionSignal Floating-point exceptionsFloating-point exceptions Bus error exception= 1.0/0.0 Illegal instruction exception Segmentation violation exceptionUsing debugging lines Bad argument exceptionOn REAL8 DIV 0 Call divzerotrap Using the on statementExceptions handled by the on statement Exceptions handled by the on statementOn Double Precision DIV 0 Call divzerotrap Actions specified by onExceptions handled by the on statement Example 14 Example5-1 abort.f90 Ignoring errorsTerminating program execution Example 15 Example5-2 ignore.f90Trapping integer overflow exceptions Calling a trap procedureTrapping floating-point exceptions On Double Precision Overflow Call trapAllowing core dumps Trapping +Ctrl-C trap interruptsExample 17 Example5-4 callitrap.f90 Example 18 Example 5-5 allowcore.f90 On Real Overflow IgnorePerformance and optimization Using profilersUsing profilers HP CaliperProgram.c Comparing Program PerformanceOpt/ansic/bin/cc -Aa +O3 -o program +Oprofile=collect ProgramprogramargumentsSpecifying PBO file names and locations Using Options to Control Data CollectionGprof $ gprof prog gprof.outProf Using options to control optimizationUsing +O to set optimization levels $ f90 +O4 file.f90+O3 Using the optimization options+O2, -O +O4$ f90 +O4 +Oaggressive +Ofltacc prog.f90 Fine-tuning optimization options$ f90 +02 +Oaggressive +Osize prog.f90 Packaged optimization options+O2 Is +Onofastaccess at+Ofastaccess at level +Ofltacc=relaxedFast +Ofltacc=relaxed . This+Onoinitcheck +Olibcalls +Oinlinelevel num+Onolibcalls +Onoloopunroll=n+Onoparminit +Opipeline+Orecovery +Oshortdata=8 +Oregreassoc+Onoreturn +Ovectorize option on+Onowholeprogrammode Conservative vs. aggressive optimization+Owholeprogrammode Compiling for parallel execution Conservative, aggressive, and default optimizationsParallelizing HP Fortran programs F90 +O3 +Oparallel -c x.f90 y.f90 F90 +O3 -c z.f90Conditions inhibiting loop parallelization Performance and parallelizationProfiling parallelized programs Calling routines with side effects parallellizationIndeterminate iteration counts Data dependencesF90 +O3 +Ovectorize prog.f90 Using the +Ovectorize optionVectorization Vector routines called by +OvectorizeSdot Controlling vectorization locallySaxpy VecdmultaddREAL, External sdot Calling Blas library routinesExample 19 Example 6-1 axpy.f90 Industry-wide standard VectorizationControlling code generation for performance $ fprog arg1 another arg Accessing command-line argumentsWriting HP-UX applications Example 20 Example 7-1 getargs.f90Performing I/O using HP-UX system calls Using HP-UX file I/OStream I/O using Fstream Calling HP-UX system and library routinesUsing HP-UX file I/O Obtaining an HP-UX file descriptorData types Calling C routines from HP FortranData type correspondence for HP Fortran and C Size differences between HP Fortran and C data types Unsigned integersLogicals Size differences after compiling with +autodblComplex sqrcomplexCOMPLEX cmxval Complex numbersExample 21 Example 8-1 passcomplex.f90 Pointers Argument-passing conventionsDerived types Example 22 Example 8-2 sqrcomplex.cVoid fooint *ptr, int iarray100, int Integer ptr INTEGER, DIMENSION100 iarrayCase sensitivity Call foo%REFptr, %REFiarray, %VALiExample 24 Example 8-4 testsort.f90 Example 23 Example 8-3 sortem.c$HP$ Alias bubblesort = BubbleSort%REF,%VAL Case sensitivityArrays Memory layout of a two-dimensional array in Fortran and CREAL, DIMENSION2,3,4 IntExample 25 Example 8-5 passarray.f90 Example 26 Example 8-6 getarray.cFortran hidden length argument StringsNull-terminated string Passing a stringFollowing are example C and Fortran programs StringsExample 27 Example 8-7 passchars.f90 File handlingExample 28 Example 8-8 getstring.c Example 29 Example 8-9 fnumtest.f90 File handlingExtern int somedata Sharing dataInt somedata Extern int globals100Directive syntax Using Fortran directivesUsing HP Fortran directives HP Fortran directives$HP$ Alias name = external-name arg-pass-mode-list SyntaxDescription and restrictions NameCase sensitivity Local and global usageArgument-passing conventions Example 31 Example 9-1 prstr.c StringsFor more information Example 32 Example 9-2 passstr.f90Example 33 Example Disables the inclusion of source lines in the listing fileSpecified on the command line Listing fileCompatibility directives recognized by HP Fortran Compatibility directivesControlling vectorization Vendor Directive CrayControlling checks for side effects Controlling parallelizationControlling dependence checks Compatibility directivesUsing Fortran directives Incompatibilities with HP Fortran Command-line options not supportedMigrating to HP Fortran Compiler limitsIntrinsic functions Format field widthsFloating-point constants Double Precision x =Procedure calls and definitions Data types and constantsFoo**REALbar, 8 ! foo**bar Input/outputDirectives KEY=Migration issues Migration issuesSource code issues MiscellaneousDirectives HP Fortran 77 directives supported by f90 optionsConflicting intrinsics and libU77 routine names Command-line option issuesIntrinsic functions F77 options supported by f90Object code issues Data file issuesApproaches to migration HP-supplied migration tools$ fid +800 file.f $ fid +es program.f Compatibility statements Porting to HP FortranCompatibility extensions END structure definitionPointer Cray-style Compiler directivesCompatibility directives +Oparallel orDirective prefixes recognized by HP Fortran Intrinsic proceduresNonstandard intrinsic procedures in HP Fortran +Oparallel or +OvectorizeUsing porting options Uninitialized variablesOne-trip do loops Using porting optionsLarge word size $ f90 testloop.f90Example 34 Example 11-1 clash.f90 Name conflictsExternal int1 Names with appended underscores Source formatsEscape sequences Porting from Tru64 to HP Fortran+cfc Nof66alternate for +noonetrip EnhancementsNew options Porting from Tru64 to HP FortranCheck noboundsoptions for example, -nocheckbounds +nopadsrc AltparamInput/output enhancements Fortran 2003 FeaturesInteroperability with C Miscellaneous enhancementsFortran 2003 Features Object orientation featuresData enhancements Documentation Feedback 153Glossary GlossarySo on. See also row-major order 155Also filename extension Memory fault 157See ttv Index Symbols159