metric will be close to zero. High values would tend to suggest that the PBO information, used by the optimizer when creating the binary code, might have been invalid.

%ALAT Miss

This is the percentage of the number of times that the ALAT does not have any information regarding a memory address (misses) out of the total number of times the ALAT is accessed. Instructions that access the ALAT include ld.a, ld.sa, ldf.a, ldf.sa, and ld.c.nc.
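
As a rough illustration of how this percentage is formed, the following Python sketch computes it from two raw counts. The function and parameter names are hypothetical and are not Caliper identifiers. Returning 0.0 when the ALAT is never accessed is also an assumption.

```python
def alat_miss_percent(alat_misses: int, alat_accesses: int) -> float:
    """Return ALAT misses as a percentage of all ALAT accesses."""
    if alat_accesses == 0:
        # Assumption: report 0.0 when the ALAT was never accessed.
        return 0.0
    return 100.0 * alat_misses / alat_accesses


# 25 misses out of 1000 accesses
print(alat_miss_percent(25, 1000))  # 2.5
```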

fp Event Set

The fp event set provides information about floating-point operation density, execution rate, and flush/trap event density.

If you use this event set, the default is to make the measurements irrespective of CPU operating state (that is, user, system, or interrupt states). By default, the idle state is not included in the measurement. You can use command-line options to limit the scope of the measurement. Specifically, you can:

Limit measurement to a specific privilege level: -m event_set[:all|user|kernel]

Include idle: --exclude-idle False

Exclude the interruption state: --measure-on-interrupts off

Only measure the interruption state: --measure-on-interrupts only
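
The options above can be combined. The following Python sketch assembles an option list from the choices described in this section. It is only an illustration: the helper name scope_options and its parameters are hypothetical, the privilege suffix is assumed to be one of all, user, or kernel, and only the option spellings shown above are used. The sketch does not run Caliper or assume any particular command name.

```python
from typing import List, Optional

# Assumption: the privilege suffix after the event set is one of these values.
PRIVILEGE_LEVELS = ("all", "user", "kernel")
# Only the two values shown in this section are accepted here.
INTERRUPT_MODES = ("off", "only")


def scope_options(event_set: str,
                  privilege: Optional[str] = None,
                  include_idle: bool = False,
                  interrupts: Optional[str] = None) -> List[str]:
    """Build the measurement-scope options for one event set (hypothetical helper)."""
    if privilege is not None and privilege not in PRIVILEGE_LEVELS:
        raise ValueError(f"unknown privilege level: {privilege!r}")
    if interrupts is not None and interrupts not in INTERRUPT_MODES:
        raise ValueError(f"unknown interrupt mode: {interrupts!r}")

    # -m event_set[:level]
    measure = event_set if privilege is None else f"{event_set}:{privilege}"
    options = ["-m", measure]

    # Idle is excluded by default, so an option is needed only to include it.
    if include_idle:
        options += ["--exclude-idle", "False"]

    if interrupts is not None:
        options += ["--measure-on-interrupts", interrupts]
    return options


# User-level fp measurements with the interruption state excluded.
print(" ".join(scope_options("fp", privilege="user", interrupts="off")))
```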

The event per kinst (events per 1000 instructions) metrics are computed using all instructions retired. This includes nops, predicated-off instructions, failed speculation instructions and their associated recovery code, as well as the architecturally visible instructions. You can eliminate idle-loop effects by using the command-line option --exclude-idle True (which is the default). The effects of failed speculative operations and TLB misses cannot be eliminated directly, but you can estimate their impact from the cspec, dspec, and tlb event sets. You can use the cpi event set to obtain the fraction of all instructions retired that have an architecturally visible result, except for predicated-off branches, which the Itanium 2 PMU counts as useful instructions (non-taken branches).
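
To make the per-kinst normalization concrete, here is a minimal Python sketch. The function and parameter names are hypothetical. The denominator is the count of all instructions retired, so it includes nops, predicated-off instructions, and failed speculation, as described above.

```python
def events_per_kinst(event_count: int, instructions_retired: int) -> float:
    """Return the number of events per 1000 instructions retired."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    # Divide by the instruction count expressed in thousands.
    return event_count / (instructions_retired / 1000.0)


# 1,500 events over 2,000,000 retired instructions
print(events_per_kinst(event_count=1_500, instructions_retired=2_000_000))  # 0.75
```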

Correspondence Between Floating-Point Instructions and Operations

In interpreting this information, it is important to realize that there is not necessarily a 1-to-1 correspondence between floating-point instructions and floating-point operations as counted by the performance monitoring unit (PMU). The following list shows each instruction, the operation it performs, and the number of operations counted for that instruction:

FNORM

Floating-Point Normalize: 1 operation

FADD

Floating-Point Add: 1 operation

FMA

Floating-Point Multiply Add: 2 operations

FMS

Floating-Point Multiply Subtract: 2 operations

FSUB

Floating-Point Subtract: 1 operation

FMPY

Floating-Point Multiply: 1 operation

FMIN

Floating-Point Minimum: 1 operation

FAMIN

Floating-Point Absolute Minimum: 1 operation

FMAX

Floating-Point Maximum: 1 operation

FAMAX

Floating-Point Absolute Maximum: 1 operation

FCMP

Floating-Point Compare: 1 operation

FCVT.fx

Convert Floating-Point to Integer: 1 operation

FPMA

Floating-Point Parallel Multiply Add: 4 operations

FPMPY

Floating-Point Parallel Multiply: 4 operations

FPMS

Floating-Point Parallel Multiply Subtract: 4 operations
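
The weights in the list above can be applied to per-instruction counts to estimate a total floating-point operation count. The Python sketch below transcribes only the instructions listed above. The table name, the function name, and the sample counts are hypothetical, and a mnemonic that is not in the table raises KeyError rather than being guessed.

```python
from typing import Dict

# Operations counted per instruction, transcribed from the list above.
FP_OPS_PER_INSTRUCTION: Dict[str, int] = {
    "FNORM": 1, "FADD": 1, "FMA": 2, "FMS": 2, "FSUB": 1, "FMPY": 1,
    "FMIN": 1, "FAMIN": 1, "FMAX": 1, "FAMAX": 1, "FCMP": 1, "FCVT.fx": 1,
    "FPMA": 4, "FPMPY": 4, "FPMS": 4,
}


def total_fp_operations(instruction_counts: Dict[str, int]) -> int:
    """Sum the weighted operation count over the given instruction counts."""
    total = 0
    for mnemonic, count in instruction_counts.items():
        # KeyError here means the mnemonic is not covered by the table.
        total += FP_OPS_PER_INSTRUCTION[mnemonic] * count
    return total


# Made-up counts: 100 FADD, 50 FMA, 10 FPMA
sample_counts = {"FADD": 100, "FMA": 50, "FPMA": 10}
print(total_fp_operations(sample_counts))  # 100*1 + 50*2 + 10*4 = 240
```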

