Table 28 Information in itlb Measurement Reports (continued)

Column    Description

% ITLB L2 Fill

Percent of sampled instruction TLB misses that hit the L2 instruction TLB for the given program object. L2 fills are not reported for, and do not apply to, Itanium systems.

% ITLB HPW Fill

Percent of sampled instruction TLB misses that were handled by the HPW for the given program object.

% ITLB Soft Fill

Percent of sampled instruction TLB misses that were handled by software for the given program object.

Kernel Thread Identification Number

Kernel Thread ID suffixed with the name of the routine that the thread will execute once it is created.

Load Module

Shared library or the main executable.

Function

Routine from your application.

File

Source file associated with a function.

Line / Slot / Col,Offset

The column contains one of these:

• A source-code line number for rows showing statements

• An instruction slot number for rows showing instructions not on a bundle boundary

• A source-code column number followed by an offset from the beginning address of a function for rows showing instructions on a bundle boundary

Column and line numbers are preceded by “~” when they are approximate due to optimization.

>Statement / Instruction

The column contains either a source statement, preceded by “>”, or a disassembled instruction. Statements that are out of order due to optimization are preceded by “*>”.

Function Details

A cache line is the smallest unit of data that is transferred at one time between main memory and the instruction cache. On Itanium 2 systems, cache lines are 64 bytes (12 instructions). Cache lines are the finest level of granularity available in itlb measurement reports.
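The 64-byte, 12-instruction figure follows from the Itanium bundle format, in which each 128-bit (16-byte) bundle carries three instruction slots. A minimal Python sketch of that arithmetic, and of how an instruction address maps to the start of its cache line, is shown here. The constant and function names are illustrative only and are not part of HP Caliper.

```python
CACHE_LINE_BYTES = 64    # Itanium 2 instruction cache line size, as stated in the text
BUNDLE_BYTES = 16        # one 128-bit Itanium instruction bundle
SLOTS_PER_BUNDLE = 3     # instruction slots carried by each bundle


def instructions_per_cache_line() -> int:
    """Number of instruction slots that fit in one cache line."""
    return (CACHE_LINE_BYTES // BUNDLE_BYTES) * SLOTS_PER_BUNDLE


def cache_line_start(address: int) -> int:
    """Round an address down to the start of its 64-byte cache line."""
    return address & ~(CACHE_LINE_BYTES - 1)


print(instructions_per_cache_line())      # 12
print(hex(cache_line_start(0x4000A7)))    # 0x400080
```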

These reports show data associated with a cache line on the same row as the first instruction of the cache line. Each set of instructions that makes up a cache line is preceded and followed by a row of dashes (“- - - -”). The cache lines shown might not be contiguous; non-contiguous cache lines are separated by a row of tildes (“~ ~ ~ ~”).

How Instruction TLB Metrics Are Obtained

HP Caliper obtains instruction TLB metrics from the processor's performance monitoring unit (PMU).

Exact counts are obtained from the PMU's performance monitor configuration (PMC)/performance monitor data (PMD) register pairs. Sampled instruction TLB metrics are obtained from the PMU's instruction event address register (I-EAR).

HP Caliper takes a sample at every Nth instruction TLB miss, where N is defined in the itlb measurement configuration file in the config subdirectory of the HP Caliper home directory. You can override the value in the measurement configuration file by using the -s option. At each sample point, HP Caliper records both the cache line that resulted in an instruction TLB miss and the level of the TLB hierarchy that satisfied the miss (L2 instruction TLB, HPW, or software).
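As an illustration of this sampling scheme, the following Python sketch keeps every Nth miss from a synthetic event stream and computes the three fill percentages for a single program object. It is a simplified model under stated assumptions, not HP Caliper's implementation: the event tuples, the level labels, and the `sample_every_nth` and `fill_percentages` helpers are hypothetical, and real sampling is performed by the PMU rather than in software.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical event record: (cache-line address, level that satisfied the miss).
Miss = Tuple[int, str]
LEVELS = ("L2", "HPW", "SOFT")


def sample_every_nth(misses: Iterable[Miss], n: int) -> List[Miss]:
    """Keep the n-th, 2n-th, 3n-th, ... miss from the stream."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [miss for i, miss in enumerate(misses, start=1) if i % n == 0]


def fill_percentages(samples: List[Miss]) -> Dict[str, float]:
    """Percent of sampled misses satisfied at each level of the TLB hierarchy."""
    counts = Counter(level for _, level in samples)
    total = len(samples)
    if total == 0:
        return {level: 0.0 for level in LEVELS}
    return {level: 100.0 * counts[level] / total for level in LEVELS}


# Synthetic stream of eight misses (made-up addresses and levels).
misses = [
    (0x1000, "L2"), (0x1040, "HPW"), (0x1000, "L2"), (0x1080, "L2"),
    (0x1040, "SOFT"), (0x1000, "HPW"), (0x10C0, "L2"), (0x1000, "SOFT"),
]
samples = sample_every_nth(misses, n=2)
print(fill_percentages(samples))  # {'L2': 25.0, 'HPW': 50.0, 'SOFT': 25.0}
```

In this model, the unsampled stream is 50% L2 fills, but the sampled result is 25%. Sampled percentages are estimates whose accuracy depends on N and on the number of samples collected.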

HP Caliper attributes samples for a given cache line to the function associated with the start address of the cache line. Because cache lines can cross function boundaries, data attributed to functions will not always be completely accurate. However, only cache-line data at the boundaries of the function are potentially misattributed.
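This attribution rule can be sketched as a lookup of the cache line's start address in a sorted table of function start addresses. The symbol table, the addresses, and the `attribute_sample` helper below are hypothetical. The example shows how a cache line that straddles two functions is credited entirely to the function that contains its first byte.

```python
from bisect import bisect_right

CACHE_LINE_BYTES = 64  # Itanium 2 cache line size, as stated earlier in this section

# Hypothetical symbol table: (start address, function name), sorted by address.
FUNCTIONS = [(0x1000, "init"), (0x1090, "compute"), (0x1200, "report")]
STARTS = [start for start, _ in FUNCTIONS]


def attribute_sample(address: int) -> str:
    """Return the function that owns the start address of the sampled cache line."""
    line_start = address & ~(CACHE_LINE_BYTES - 1)
    index = bisect_right(STARTS, line_start) - 1
    if index < 0:
        raise ValueError("address precedes every known function")
    return FUNCTIONS[index][1]


# 0x10A0 lies inside compute(), but its cache line starts at 0x1080, inside init().
print(attribute_sample(0x10A0))  # init
# A cache line that lies wholly inside compute() is attributed correctly.
print(attribute_sample(0x10C0))  # compute
```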

Frequent sampling increases HP Caliper's perturbation of your application. In the extreme case of taking one sample for each TLB miss event, the kernel will trap on every event, making the resulting data of limited, if any, value.
