HP-UX Fortran Software manual: +Oinline_level num, +Onolibcalls, +Olibcalls, +Onoloop_unroll=n


Table 29 Fine-tuning optimization options (continued)

Option

Level

Function

+O[no]inline

+O3 or higher

Enable [disable] inlining. The default is +Oinline.

+Oinline_level num

All

This option controls inlining in Fortran. The format for num is N[.n], where num is either an integer from 0 to 9 or a value with a single decimal place from 0.0 to 9.0. For more information on this option, see the f90(1) manpage.

+O[no]libcalls

All

Substitute [do not substitute] millicode versions of specific intrinsics. The default is +Olibcalls.

+O[no]loop_block

+O3 or higher

Loop blocking is a combination of strip mining and interchange that improves data cache locality. It is provided primarily to deal with nested loops that manipulate arrays that are too large to fit into the data cache. Under certain circumstances, loop blocking allows reuse of these arrays by transforming the loops that manipulate them so that they manipulate strips of the arrays that fit into the cache.
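The strip mining and interchange described for loop blocking can be sketched by hand. The program below is only an illustration of the idea, not the code the compiler generates: the program name, the array extent n, and the strip size bs are assumptions chosen for the example. It transposes an array with the original nest and with a blocked nest, then checks that both give the same result.

```fortran
program loop_block_sketch
  implicit none
  integer, parameter :: n  = 256     ! array extent (illustrative assumption)
  integer, parameter :: bs = 32      ! strip size (assumed, not a compiler value)
  real :: a(n, n), b_ref(n, n), b_blk(n, n)
  integer :: i, j, ii, jj

  ! Fill the source array with distinct values.
  do j = 1, n
    do i = 1, n
      a(i, j) = real(i) + 0.001 * real(j)
    end do
  end do

  ! Original nest: a transpose. One of the two arrays is always
  ! walked across columns, so it has poor cache locality.
  do j = 1, n
    do i = 1, n
      b_ref(j, i) = a(i, j)
    end do
  end do

  ! Blocked nest: strip-mine both loops by bs, then interchange so the
  ! strip loops (jj, ii) are outermost. Each bs-by-bs tile of a and b_blk
  ! is finished before the next tile is touched.
  do jj = 1, n, bs
    do ii = 1, n, bs
      do j = jj, min(jj + bs - 1, n)
        do i = ii, min(ii + bs - 1, n)
          b_blk(j, i) = a(i, j)
        end do
      end do
    end do
  end do

  if (all(b_blk == b_ref)) then
    print '(a)', 'blocked result matches the original loop nest'
  else
    print '(a)', 'MISMATCH between blocked and original loop nest'
  end if
end program loop_block_sketch
```

The min() bounds let the blocked nest handle extents that are not a multiple of the strip size.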

+O[no]loop_unroll=n

+O2 or higher

Unroll [do not unroll] program loops by a factor of n. The default is +Oloop_unroll=4.
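As a rough picture of what unrolling by a factor of 4 means, the sketch below unrolls a simple loop by hand and compares it with the original. It is an illustration only: the program name, the array length, and the loop body are assumptions, and the compiler's own unrolled code may be organized differently.

```fortran
program loop_unroll_sketch
  implicit none
  integer, parameter :: n = 1003     ! deliberately not a multiple of 4
  real :: x(n), y_ref(n), y_unr(n)
  integer :: i, m

  do i = 1, n
    x(i) = real(i)
  end do

  ! Original loop: one element per iteration.
  do i = 1, n
    y_ref(i) = 2.0 * x(i) + 1.0
  end do

  ! Unrolled by a factor of 4: four elements per iteration, so the
  ! loop test and branch run about a quarter as often.
  m = n - mod(n, 4)
  do i = 1, m, 4
    y_unr(i)     = 2.0 * x(i)     + 1.0
    y_unr(i + 1) = 2.0 * x(i + 1) + 1.0
    y_unr(i + 2) = 2.0 * x(i + 2) + 1.0
    y_unr(i + 3) = 2.0 * x(i + 3) + 1.0
  end do

  ! Cleanup loop for the iterations left over when n is not a multiple of 4.
  do i = m + 1, n
    y_unr(i) = 2.0 * x(i) + 1.0
  end do

  if (all(y_unr == y_ref)) then
    print '(a)', 'unrolled result matches the original loop'
  else
    print '(a)', 'MISMATCH between unrolled and original loop'
  end if
end program loop_unroll_sketch
```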

+O[no]loop_unroll_jam

+O3 or higher

Loop unroll-and-jam involves partially unrolling one or more loops higher in the nest than the innermost loop, and fusing ("jamming") the resulting loops back together. This transformation is primarily intended to increase register reuse and decrease memory loads and stores per operation within an iteration of a nested loop.
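The unroll-and-jam transformation can also be written out by hand. The sketch below assumes a simple matrix-vector style nest; the program name, array sizes, and data are made up for the illustration, and it is not meant to reproduce the compiler's output. The outer loop is unrolled by 2 and the two copies of the inner loop are fused, so each x(i) is loaded once and used twice.

```fortran
program unroll_jam_sketch
  implicit none
  integer, parameter :: n = 65       ! odd on purpose, to exercise the cleanup loop
  real :: a(n, n), x(n), y_ref(n), y_uj(n)
  integer :: i, j, m

  ! Small integer-valued data keeps every sum exact in single precision.
  do j = 1, n
    x(j) = real(mod(j, 5))
    do i = 1, n
      a(i, j) = real(mod(i + j, 7))
    end do
  end do

  ! Original nest: y(j) = sum over i of a(i,j) * x(i).
  y_ref = 0.0
  do j = 1, n
    do i = 1, n
      y_ref(j) = y_ref(j) + a(i, j) * x(i)
    end do
  end do

  ! Unroll the outer (j) loop by 2, then fuse ("jam") the two copies
  ! of the inner loop into one.
  y_uj = 0.0
  m = n - mod(n, 2)
  do j = 1, m, 2
    do i = 1, n
      y_uj(j)     = y_uj(j)     + a(i, j)     * x(i)
      y_uj(j + 1) = y_uj(j + 1) + a(i, j + 1) * x(i)
    end do
  end do

  ! Cleanup for the leftover outer iteration when n is odd.
  do j = m + 1, n
    do i = 1, n
      y_uj(j) = y_uj(j) + a(i, j) * x(i)
    end do
  end do

  if (all(y_uj == y_ref)) then
    print '(a)', 'unroll-and-jam result matches the original loop nest'
  else
    print '(a)', 'MISMATCH between unroll-and-jam and original loop nest'
  end if
end program unroll_jam_sketch
```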

+moduleoptimize

All

The compiler reads only required information from a module file. Optimized module files are created by discarding redundant information while importing the module file. In case of nested modules or hierarchical modules, the compilation time and memory requirement of
Using options to control optimization
