The events-per-kinst (events per 1000 instructions) metrics are computed using all instructions retired. This count includes nops, predicated-off instructions, failed speculative instructions and their associated recovery code, as well as architecturally visible instructions. You can eliminate the effects of idle loops by using the command-line option --exclude-idle True (which is the default). The effects of failed speculative operations and TLB misses cannot be eliminated directly, but you can estimate the impact of these events from the cspec, dspec, and tlb event sets. You can use the cpi event set to obtain the fraction of all instructions retired that have an architecturally visible result, except for predicated-off branches, which the Itanium 2 PMU counts as useful instructions (non-taken branches).
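As a minimal sketch of the per-kinst normalization described above, the following Python function scales a raw event count by the number of instructions retired. The function and parameter names are hypothetical and chosen for illustration only; they are not part of the tool's interface.

```python
def per_kinst(event_count: int, instructions_retired: int) -> float:
    """Return events per 1000 instructions retired.

    Both arguments are raw counts over the same measurement interval.
    instructions_retired is assumed to include nops, predicated-off
    instructions, and failed speculative instructions, as the text states.
    """
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return event_count * 1000 / instructions_retired


if __name__ == "__main__":
    # Hypothetical counts: 50 misses over 10,000 retired instructions.
    print(per_kinst(50, 10_000))
```

Because the denominator includes instructions with no architecturally visible result, the same event count yields a lower per-kinst value in code with many nops or predicated-off instructions.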

Metrics Available from this Measurement

The following metrics are available from this event set. These descriptions do not take into account any command-line options you might use.

The metrics are:

Total - Misses Per Sec

This is the number of demand instruction cache line accesses and instruction prefetch cache line accesses that miss the L1 instruction cache and ISB per second.

Dfetch - Misses Per Sec

This is the number of demand instruction cache line accesses that miss both the L1 instruction cache and the ISB per second.

Pfetch - Misses Per Sec

This is the number of streaming and non-streaming prefetches that miss the L1I cache and ISB per second. These are the prefetches that will actually be issued to the L1 and possibly outer levels of the cache hierarchy, potentially culminating in a request to memory.

Total - Misses Per Kinst

This is the number of demand instruction cache line accesses and instruction prefetch cache line accesses that miss the L1 instruction cache and ISB per 1000 instructions retired.

Dfetch - Misses Per Kinst

This is the number of demand instruction cache line accesses that miss the L1 instruction cache and ISB per 1000 instructions retired.

Pfetch - Misses Per Kinst

This is the number of streaming and non-streaming prefetches that miss the L1I cache and ISB per 1000 instructions retired. These are the prefetches that will actually be issued to the L1 and possibly outer levels of the cache hierarchy, potentially culminating in a request to memory.
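The miss metrics above can be sketched as simple ratios of raw counts. The following Python example assumes, based on the descriptions, that the Total count is the sum of the demand-fetch and prefetch miss counts. All names are hypothetical, and the counts in the usage example are invented for illustration.

```python
def miss_rates(dfetch_misses: int, pfetch_misses: int,
               instructions_retired: int, elapsed_seconds: float) -> dict:
    """Compute per-second and per-kinst miss metrics from raw counts.

    Assumption: total misses = demand-fetch misses + prefetch misses.
    """
    if instructions_retired <= 0 or elapsed_seconds <= 0:
        raise ValueError("denominators must be positive")
    counts = {
        "Total": dfetch_misses + pfetch_misses,
        "Dfetch": dfetch_misses,
        "Pfetch": pfetch_misses,
    }
    rates = {}
    for name, count in counts.items():
        rates[f"{name} - Misses Per Sec"] = count / elapsed_seconds
        rates[f"{name} - Misses Per Kinst"] = count * 1000 / instructions_retired
    return rates


if __name__ == "__main__":
    # Hypothetical measurement: 2 seconds, 100,000 instructions retired.
    for metric, value in miss_rates(200, 300, 100_000, 2.0).items():
        print(f"{metric}: {value}")
```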

Ifills Per Kinst

This is the number of (64-byte) lines per 1000 instructions retired that are moved from the ISB to the L1I cache. For the Itanium 2 family of processors (McKinley, Madison, and Deerfield), this should be approximately equal to the number of ISB Lines per 1000 instructions retired.

ISB Lines Per Kinst

This is the number of cache line chunks (64 bytes) that were delivered from the L1 cache and beyond to the ISB per 1000 instructions retired. For the Itanium 2 family of processors (McKinley, Madison, and Deerfield), this should be approximately equal to the L1I cache fill rate.

%ISB Line Usage

This is the percentage of ISB lines that are actually delivered to the L1I cache. For the Itanium 2 family of processors (McKinley, Madison, and Deerfield), this fraction will be at or slightly less than 100%.
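As an illustration of how %ISB Line Usage relates to the two preceding metrics, the sketch below computes it as the ratio of lines moved from the ISB into the L1I cache to lines delivered to the ISB. The function and parameter names are hypothetical, and the exact formula the tool uses is an assumption inferred from the descriptions.

```python
def isb_line_usage_percent(ifills: int, isb_lines: int) -> float:
    """Percentage of ISB lines that were delivered to the L1I cache.

    ifills:    lines moved from the ISB to the L1I cache
    isb_lines: lines delivered to the ISB
    Both counts must cover the same measurement interval.
    """
    if isb_lines <= 0:
        raise ValueError("isb_lines must be positive")
    return 100.0 * ifills / isb_lines


if __name__ == "__main__":
    # Hypothetical counts: 990 of 1000 ISB lines reached the L1I cache.
    print(isb_line_usage_percent(990, 1000))
```

The same ratio results from dividing Ifills Per Kinst by ISB Lines Per Kinst, since both share the instructions-retired denominator.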
