HP-UX Fortran Software manual: Fine-tuning optimization options

The +Onoconservative option relaxes the optimizer’s assumptions about the target program. The default is +Onoconservative. This option has been deprecated since HP-UX version 11i.

+O[no]limit

+Olimit suppresses optimizations that significantly increase compilation time or that can consume large amounts of memory at compile time. This option is only effective at optimization level 2 or higher. The +Onolimit option allows optimizations to be performed regardless of their effect on compilation time or memory usage. The default is +Olimit.

+O[no]loop_transform

Enables [disables] the following transformations: loop unroll and jam, loop distribution, loop interchange, loop blocking, loop fusion, and loop unroll. The default is +Oloop_transform.
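
As an illustration of one of the transformations named above, the following sketch shows the general shape of loop unroll and jam on a matrix-vector product. It is written in Python only so that it can be run anywhere; the function names are hypothetical, and the code models the idea of the transformation, not the code that the HP Fortran optimizer actually generates.

```python
def matvec_original(a, x):
    """Compute y(i) = sum over j of a(i,j) * x(j) with a plain nested loop."""
    n, m = len(a), len(x)
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_unroll_and_jam(a, x):
    """Same computation after unroll and jam with an unroll factor of 2.

    The outer loop is unrolled by 2, and the two resulting copies of the
    inner loop are fused ("jammed") into a single inner loop.
    """
    n, m = len(a), len(x)
    y = [0.0] * n
    i = 0
    while i + 1 < n:
        for j in range(m):
            xj = x[j]                      # read once, used for two rows
            y[i] += a[i][j] * xj
            y[i + 1] += a[i + 1][j] * xj
        i += 2
    if i < n:                              # leftover row when n is odd
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [1.0, 1.0]
print(matvec_original(a, x) == matvec_unroll_and_jam(a, x))
```

Both versions accumulate each y(i) in the same order, so the results are identical. The transformed version reuses each x(j) for two rows per inner iteration, which is the kind of data reuse that this transformation is meant to expose.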

+O[no]size

+Osize suppresses optimizations that significantly increase code size. This option is only effective at optimization level 2 or higher. The +Onosize option permits optimizations that can increase code size. The default is +Onosize.

Fine-tuning optimization options

The following options allow you to fine-tune the optimization process by providing control over the specific techniques that the optimizer applies to your program. The syntax for using these options is:

+O[no]optimization

where optimization is a parameter that specifies an optimization technique to apply to your program. The different parameters are described below. The prefix no negates the effect of optimization.

The options do not override a specified level of optimization, nor do they imply a particular level. To use any of these options you must also include the +On option on the same command line, where n specifies the level at which the type of optimization can be performed.

For example, if you find that the optimizer is causing your program to produce different floating-point results from those produced by the unoptimized program, you could use the following command line to suppress optimizations that affect floating-point calculations:

f90 +O3 +Onomoveflops +Ofltacc my_prog.f90

If an option is mistakenly used at a level for which the corresponding optimization is not performed, the compiler will issue a warning message.

The defaults given in the following descriptions are in effect only at the specified optimization levels, unless stated otherwise.

+O[no]cache_pad_common

+Ocache_pad_common can improve program performance by padding common blocks to avoid cache collisions. Cache-line collisions occur when the difference between the addresses of two data points is a multiple of the cache size. By inserting empty space between large variables (for example, arrays), the optimizer ensures that they do not start at nearby addresses, where the possibility of a cache collision is greater. This option is only effective at optimization level 3 or higher.
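
The collision condition described above can be checked with simple arithmetic. The following Python sketch models a direct-mapped cache; the cache size, line size, array sizes, and function names are all assumptions chosen for illustration and do not describe any particular HP processor or the padding that the compiler actually inserts.

```python
# Hypothetical direct-mapped cache parameters (illustrative values only).
CACHE_SIZE = 32 * 1024   # total cache size in bytes
LINE_SIZE = 32           # bytes per cache line


def cache_line_index(address):
    """Return the cache line that a byte address maps to."""
    return (address % CACHE_SIZE) // LINE_SIZE


def collides(addr_a, addr_b):
    """True when two distinct memory lines compete for the same cache line."""
    same_slot = cache_line_index(addr_a) == cache_line_index(addr_b)
    distinct_memory_lines = addr_a // LINE_SIZE != addr_b // LINE_SIZE
    return same_slot and distinct_memory_lines


# Two arrays of 4096 eight-byte elements stored back to back in a common
# block: the second starts 4096 * 8 = 32768 bytes (one cache size) after
# the first, so their first elements map to the same cache line.
base_a = 0
base_b = base_a + 4096 * 8
print(collides(base_a, base_b))

# Moving the second array by one cache line of padding removes the collision.
padded_b = base_b + LINE_SIZE
print(collides(base_a, padded_b))
```

In this model the unpadded arrays evict each other's data whenever corresponding elements are accessed together, while a small amount of padding shifts the second array onto different cache lines.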
