Util Data

Data bus utilization gives a lower-bound estimate of the total data bus utilization caused by bus transactions that transfer data, that is, BRL, BRIL, BWL, and nonzero-byte BRP/BWP transactions. The lower bound is computed as follows:

DATA BUS CYCLES/SEC = ((BRL + BRIL + BWL + IMPLICIT WB)/sec * 4.0) + ((nonzero byte BRP's/BWP's)/sec * 1.0)

DATA UTIL = 100 * (DATA BUS CYCLES/SEC) / (BUS CYCLES/SEC)

The constants (4.0 and 1.0) represent the number of cycles for which the data bus is occupied to perform the requisite data transfer. All cache line transfers (BRL, BRIL, BWL) require four cycles. The nonzero BRP's/BWP's require one or two cycles (16, 32, 64 bytes). Because most of the nonzero BRP's/BWP's are to I/O ports and semaphores, the computation assumes a single-cycle transfer. Thus, there is a small possibility of undercounting cycles.
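The computation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the tool: the function name, the parameter names, and the sample rates are all hypothetical, and every input is assumed to be a per-second rate.

```python
# Cycles the data bus is occupied per transaction (constants from the text).
LINE_TRANSFER_CYCLES = 4.0     # BRL, BRIL, BWL, and implicit writebacks
PARTIAL_TRANSFER_CYCLES = 1.0  # nonzero-byte BRP/BWP, assumed single-cycle


def data_bus_utilization(brl, bril, bwl, implicit_wb, nonzero_partials,
                         bus_cycles):
    """Return the lower-bound data bus utilization as a percentage.

    All arguments are per-second rates (hypothetical parameter names).
    """
    data_bus_cycles = (
        (brl + bril + bwl + implicit_wb) * LINE_TRANSFER_CYCLES
        + nonzero_partials * PARTIAL_TRANSFER_CYCLES
    )
    return 100.0 * data_bus_cycles / bus_cycles


# Made-up sample rates, chosen only to exercise the formula.
util = data_bus_utilization(
    brl=1_000_000, bril=200_000, bwl=300_000, implicit_wb=0,
    nonzero_partials=100_000, bus_cycles=200_000_000,
)
print(f"DATA UTIL = {util:.2f}%")
```

With these sample rates, the cache line transfers account for 6,000,000 data bus cycles per second and the partial transfers for 100,000, so the partials contribute little even if some of them actually take two cycles.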

BRL

Bus Read Line is the transaction used to read cache lines, due either to an instruction cache miss or to a load data miss.

BRIL

Bus Read Invalidate Line is the transaction used when a store miss occurs, that is, a read for ownership. In Itanium 2, this transaction is also used when a store hit occurs on a shared line. In this case, the BRIL invalidates all remote copies of the cache line, and the memory controller returns the line to the cache even though the cache already holds it. Itanium 2 does not implement the BIL optimization, which would have allowed remote copies to be invalidated without performing a superfluous memory request.

BWL

Bus Writeback Line is used when a dirty cache line is replaced as a consequence of servicing a BRL or BRIL bus transaction.

BRC

This is the number of current memory read transactions on the bus.

BIL

Bus Invalidate Line is used to cause lines to be flushed from the cache. Since Itanium 2 does not implement the BIL optimization, this can only be generated by the fc (flush cache) instruction. This is a zero-byte memory read transaction, although an implicit writeback will occur if the BIL hits a modified line.

Ccast Out

These zero-byte write transactions would normally occur only in systems that use directory-based cache coherence. The purpose of this transaction is to inform the coherency directory that a clean cache line was evicted from the CPU's cache (that is, the CPU is no longer an owner of the cache line). Snoopy-based cache coherency systems do not require this notification, because all caches are automatically interrogated on all memory cache line reads/writes.

PRTL

This is the number of partial (less than 128 byte) reads (BRP) or writes (BWP) per second. Partial transactions are normally due to reading/writing memory-mapped I/O control registers, semaphore operations, clean castouts (if monitoring a system with directory-based cache coherency), and sending interprocessor interrupts.

threadswitch Event Set

Available only on dual-core Itanium 2 and newer systems.

The threadswitch event set provides data about the impact of Hyper-Threading on the measured process. It provides a full statistical breakdown of thread switch activity.
