HP UX Fortran Software manual Parallelizing HP Fortran programs, Compiling for parallel execution


Table 30 (page 160) lists the assumptions that the optimizer makes about your program when you compile with +Oconservative, +Oaggressive, or neither option (the default). The table also lists the fine-tuning options that are invoked by +Oconservative and +Oaggressive. The options listed for the default case are a subset of the ones invoked by +Oconservative and +Oaggressive. For information about the fine-tuning options listed in the third column, see the table on page 152.

Table 30 Conservative, aggressive, and default optimizations

Specified options: +Onoconservative +Onoaggressive (the default)
Assumptions:
    Standard-conforming
Invoked options:
    +Onoentrysched
    +Omoveflops
    +Onoparmsoverlap
    +Onovectorize

Specified options: +Oconservative
Assumptions:
    Nonstandard
    Sensitive to rounding differences
    Contains floating-point expressions that must be evaluated in the specified order
    Procedure arguments may overlap
Invoked options:
    +Ofltacc
    +Onomoveflops
    +Oparmsoverlap

Specified options: +Oaggressive
Assumptions:
    Standard-conforming
    Contains floating-point expressions that permit re-ordering for optimization
    Does not contain uninitialized variables
Invoked options:
    +Oentrysched
    +Onofltacc
    +Onoinitcheck
    +Ovectorize

NOTE: The +Oaggressive and +Oconservative options are valid only on PA-RISC systems.

Parallelizing HP Fortran programs

The following sections discuss how to use the +Oparallel option and the parallel directives when preparing and compiling HP Fortran programs for parallel execution. Later sections also discuss reasons why the compiler may not have performed parallelization. The last section describes runtime warning and error messages unique to parallel-executing programs.

For a description of the +Oparallel option, see “Fine-tuning optimization options” (page 40).

NOTE: The +Oparallel option is not available on Integrity systems for HP Fortran Version 3.2 and later. On those systems, use the +Oautopar option instead to parallelize loops.

Compiling for parallel execution

The following command lines compile (without linking) three source files: x.f90, y.f90, and z.f90. The files x.f90 and y.f90 are compiled for parallel execution. The file z.f90 is compiled for serial execution, even though its object file will be linked with x.o and y.o.

f90 +O3 +Oparallel -c x.f90 y.f90

f90 +O3 -c z.f90

The following command line links the three object files, producing the executable file para_prog:

f90 +O3 +Oparallel -o para_prog x.o y.o z.o
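The compile and link steps above can be collected into a small build script. The following is a minimal sketch, not part of the HP Fortran distribution: the variable names F90, PAR_FLAG, and RUN and the function names run and build_para_prog are made up for illustration. By default the script only prints the three command lines shown above; it executes them only when RUN=1 is set. Overriding PAR_FLAG (for example with +Oautopar on Integrity systems) is an assumption about how you might adapt the script, and this sketch does not verify that the substituted option is accepted at link time.

```shell
#!/bin/sh
# Sketch only: prints the f90 command lines from the text above.
# Set RUN=1 in the environment to actually execute them.
# F90, PAR_FLAG, and RUN are illustrative names, not HP Fortran settings.
F90=${F90:-f90}
PAR_FLAG=${PAR_FLAG:-+Oparallel}

# Echo a command line, and run it only when RUN=1.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

build_para_prog() {
    # Compile x.f90 and y.f90 for parallel execution, without linking.
    run "$F90" +O3 "$PAR_FLAG" -c x.f90 y.f90 || return 1
    # Compile z.f90 for serial execution, without linking.
    run "$F90" +O3 -c z.f90 || return 1
    # Link all three object files; the parallel option appears again here,
    # as in the link command shown in the text.
    run "$F90" +O3 "$PAR_FLAG" -o para_prog x.o y.o z.o || return 1
}

build_para_prog
```

Run without RUN=1, the script is a dry run, which is a convenient way to check the option placement before invoking the compiler. The || return 1 after each step stops the build at the first command that fails.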
