metric will be close to zero. High values would tend to suggest that the PBO information, used by the optimizer when creating the binary code, might have been invalid.
•%ALAT Miss
This is the percentage of ALAT accesses for which the ALAT has no information about the memory address (misses), out of the total number of ALAT accesses. Instructions that access the ALAT include ld.a, ld.sa, ldf.a, ldf.sa, and ld.c.nc.
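As a rough illustration of the ratio described above, the following Python sketch computes a miss percentage from two raw counts. The function name and the counter values are hypothetical, and returning 0.0 when there are no accesses is an assumption made here, not documented tool behavior.

```python
def alat_miss_percent(alat_misses: int, alat_accesses: int) -> float:
    """Return ALAT misses as a percentage of all ALAT accesses."""
    if alat_accesses == 0:
        # Assumption: with no accesses at all, report 0.0 instead of dividing by zero.
        return 0.0
    return 100.0 * alat_misses / alat_accesses


# Hypothetical counts: 25 misses out of 1000 ALAT accesses.
print(alat_miss_percent(25, 1000))  # 2.5
```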
fp Event Set
The fp event set provides information relating to floating-point operation density, execution rate, and flush/trap event density.
If you use this event set, the default is to make the measurements irrespective of CPU operating state (that is, user, system, or interrupt states). By default, the idle state is not included in the measurement. You can use command-line options to limit the scope of the measurement. Specifically, you can:
•Limit measurement to a specific privilege level: -m event_set[:all|user|kernel]
•Include idle: --exclude-idle False
•Exclude the interruption state: --measure-on-interrupts off
•Only measure the interruption state: --measure-on-interrupts only
The event per kinst (event per 1000 instructions) metrics are computed using all instructions retired. This includes nops, predicated-off instructions, failed speculation instructions and their associated recovery code, as well as architecturally visible instructions. You can eliminate idle loop effects by using the command-line option --exclude-idle True (which is the default). The effects of failed speculative operations and TLB misses cannot be directly eliminated, but you can estimate their impact from the cspec, dspec, and tlb event sets. You can use the cpi event set to obtain the fraction of all instructions retired that have an architecturally visible result, except for predicated-off branches, which are counted as useful instructions (non-taken branch) by the Itanium 2 PMU.
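The per-kinst normalization can be sketched in a few lines of Python. This is only an illustration of the arithmetic (event count divided by all instructions retired, scaled by 1000); the function name and the sample counts are hypothetical and are not taken from the tool's output.

```python
def events_per_kinst(event_count: int, instructions_retired: int) -> float:
    """Return the number of events per 1000 instructions retired."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    # The denominator is ALL instructions retired, including nops and
    # predicated-off instructions, as described in the text.
    return 1000.0 * event_count / instructions_retired


# Hypothetical counts: 1500 events over 2,000,000 instructions retired.
print(events_per_kinst(1500, 2_000_000))  # 0.75
```

Because the denominator includes instructions with no architecturally visible result, the same event count yields a lower per-kinst value in code with many nops or predicated-off instructions.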
Correspondence Between Floating-Point Instructions and Operations
In interpreting this information, it is important to realize that there is not necessarily a 1-to-1 correspondence between floating-point instructions and floating-point operations as counted by the performance monitoring unit (PMU). The following list shows each instruction, the operation it performs, and the number of operations corresponding to the instruction:
Instruction | Operation | Number of operations
FNORM | Floating-Point Normalize | 1
FADD | Floating-Point Add | 1
FMA | Floating-Point Multiply Add | 2
FMS | Floating-Point Multiply Subtract | 2
FSUB | Floating-Point Subtract | 1
FMPY | Floating-Point Multiply | 1
FMIN | Floating-Point Minimum | 1
FAMIN | Floating-Point Absolute Minimum | 1
FMAX | Floating-Point Maximum | 1
FAMAX | Floating-Point Absolute Maximum | 1
FCMP | Floating-Point Compare | 1
FCVT.fx | Convert Floating-Point to Integer | 1
FPMA | Floating-Point Parallel Multiply Add | 4
FPMPY | Floating-Point Parallel Multiply | 4
FPMS | Floating-Point Parallel Multiply Subtract | 4
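To make the instruction-to-operation correspondence concrete, the following Python sketch converts a hypothetical instruction mix into an operation count, using only the weights listed above. The dictionary name, function name, and sample mix are illustrative assumptions; instructions not shown in the list above are deliberately rejected rather than guessed.

```python
# Operations per instruction, copied from the entries listed above.
OPS_PER_INSTRUCTION = {
    "FNORM": 1, "FADD": 1, "FMA": 2, "FMS": 2, "FSUB": 1,
    "FMPY": 1, "FMIN": 1, "FAMIN": 1, "FMAX": 1, "FAMAX": 1,
    "FCMP": 1, "FCVT.FX": 1, "FPMA": 4, "FPMPY": 4, "FPMS": 4,
}


def count_fp_operations(instruction_counts: dict) -> int:
    """Sum floating-point operations for a {mnemonic: count} instruction mix."""
    total = 0
    for mnemonic, count in instruction_counts.items():
        key = mnemonic.upper()
        if key not in OPS_PER_INSTRUCTION:
            # Do not guess a weight for an instruction that is not listed.
            raise KeyError(f"no operation count listed for {mnemonic!r}")
        total += OPS_PER_INSTRUCTION[key] * count
    return total


# Hypothetical mix: 100 FMA (2 ops each), 50 FADD (1 each), 10 FPMA (4 each).
print(count_fp_operations({"FMA": 100, "FADD": 50, "FPMA": 10}))  # 290
```

In this example 160 instructions account for 290 operations, which is why instruction counts and operation counts should not be compared directly.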