Exclude the interruption state: --measure-on-interrupts off

Only measure the interruption state: --measure-on-interrupts only

Metrics Available from this Measurement

The following metrics are available from this event set. These descriptions do not take into account any command-line options you might use.

The metrics are:

Raw CPI

The raw CPI is computed using all instructions retired, including nops and predicated-off instructions. The relationship between effective and raw CPI values can be obtained from the cpi measurement.
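As a rough illustration of how raw and effective CPI differ, the following Python sketch computes both from counter totals. The counter values are hypothetical, and the sketch assumes that effective CPI excludes nops and predicated-off instructions from the instruction count; the cpi measurement remains the authoritative source for these values.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Return cycles per instruction; reject a non-positive instruction count."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return cycles / instructions


# Hypothetical counter totals, not taken from a real run.
cycles = 1_200_000
retired_all = 1_000_000      # all retired instructions, nops and predicated-off included
retired_nops = 150_000       # retired nop instructions
retired_pred_off = 50_000    # retired instructions whose predicate was off

# Raw CPI uses every retired instruction.
raw_cpi = cpi(cycles, retired_all)

# Assumption: effective CPI counts only instructions that did useful work.
effective_cpi = cpi(cycles, retired_all - retired_nops - retired_pred_off)

print(f"raw CPI = {raw_cpi:.2f}, effective CPI = {effective_cpi:.2f}")
```

Because the effective instruction count can only be smaller than or equal to the raw count, the effective CPI is never lower than the raw CPI for the same run.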

Itlb

This counts the number of cycles where there are no back-end stalls or flushes, the decoupling buffer is empty, and the front end is stalled due to an L1 TLB miss. The miss is serviced either by the L2 TLB or, on an L2 TLB miss, by the hardware page walker (HPW), provided the TLB entry is found somewhere in the cache hierarchy. This does not count cycles attributable to software TLB miss handling when the HPW fails to find the requisite translation.

Icache

This counts the number of cycles where there are no back-end stalls or flushes, the decoupling buffer is empty, and the front end is stalled due to an instruction cache miss at any level of the cache hierarchy (L1, L2, L3).

Branch

This counts the number of stall cycles associated with branch execution. There are two components to this category. The first is stalls due to execution bubbles caused by a front-end resteer, that is, a taken branch. The second component is stalls due to the recirculation of branches while they are waiting for branch history information used in predicting branch direction.

Unstall Execute

This is the percentage of cycles when the back end is executing instructions without stalling. Depending on code characteristics and resource limitations, the number of instructions executing varies from 1 to 6, which is the maximum dispatch for the Itanium 2 processor. Taken branches, non-double-bundle aligned branch targets, and explicit stop bits are the primary determinants of code-based execution limitations. You can obtain some idea of this from the dispersal event set.
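To make the percentage concrete, here is a minimal Python sketch that derives the unstalled-execute percentage and the average number of instructions per unstalled cycle. The function name and input values are hypothetical, and the sketch assumes that all retired instructions execute during unstalled cycles; it is not how the tool itself reports the metric.

```python
MAX_DISPATCH = 6  # maximum instructions per cycle on the Itanium 2 processor (from the text)


def unstalled_summary(total_cycles: int, unstalled_cycles: int, instructions: int):
    """Return (percent unstalled, instructions per unstalled cycle, dispatch utilization)."""
    if total_cycles <= 0 or not 0 < unstalled_cycles <= total_cycles:
        raise ValueError("need 0 < unstalled_cycles <= total_cycles")
    percent_unstalled = 100.0 * unstalled_cycles / total_cycles
    # Assumption: instructions only make progress in unstalled cycles.
    per_cycle = instructions / unstalled_cycles
    utilization = per_cycle / MAX_DISPATCH
    return percent_unstalled, per_cycle, utilization


# Hypothetical totals, not taken from a real run.
percent, per_cycle, utilization = unstalled_summary(
    total_cycles=1_000_000, unstalled_cycles=400_000, instructions=1_000_000
)
print(f"{percent:.1f}% unstalled, {per_cycle:.2f} instructions per unstalled cycle")
```

An average well below 6 instructions per unstalled cycle suggests that code characteristics, such as taken branches or explicit stop bits, limit dispatch more than stalls do.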

BE Flush

This counts the number of stall cycles resulting from a pipeline flush caused by a branch misprediction, an exception, an ALAT flush, or a serialization flush.

Scoreboard

This counts stall cycles due to dependencies on integer or floating-point operations, floating-point flushes, and control or application register reads or writes.

L1Dtlb

This counts the number of cycles stalled due to a level 1 data TLB miss that hits in the level 2 data TLB. This is sometimes called an L1DTLB transfer stall. If the level 2 TLB misses, the hardware page walker (HPW) is invoked to insert the required entry into the level 2 TLB, which is then forwarded to the level 1 data TLB.

L2Dtlb

This counts the number of cycles stalled due to a level 2 data TLB miss during the time the HPW is actively attempting to resolve the requested TLB entry. If the entry is not in the cache,
