Data Summary
% Total Dcache Latency Cycles | Cumulat % of Total | Sampled Dcache Misses | Dcache Latency Cycles | Avg. Dcache Laten. Cycles | L2 7 | 14 | 64 | 150 | 250 | 350 | 450 | > | Data Entry |
66.82 | 66.82 | 42 | 580 | 13.8 | 62 | 29 | 7 | 0 | 0 | 2 | 0 | 0 | Heap |
7.72 | 74.54 | 10 | 67 | 6.7 | 80 | 20 | 0 | 0 | 0 | 0 | 0 | 0 | Memory mapped shared library |
5.65 | 80.18 | 5 | 49 | 9.8 | 40 | 60 | 0 | 0 | 0 | 0 | 0 | 0 | Process Text Region |
4.84 | 85.02 | 4 | 42 | 10.5 | 25 | 50 | 25 | 0 | 0 | 0 | 0 | 0 | libc.so.1::_arena_rmutex |
4.72 | 89.75 | 5 | 41 | 8.2 | 40 | 60 | 0 | 0 | 0 | 0 | 0 | 0 | Process Data Region |
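The relationship between the first five columns can be checked with a short sketch. It assumes, based only on the sample rows above, that the average latency is the latency cycles divided by the sampled misses, and that the cumulative column is a running sum of the % Total column. The tuple layout and the function name check_rows are illustrative and not part of HP Caliper.

```python
# Values copied from the sample rows above; the latency bucket columns are omitted.
# (pct_total, cumulative_pct, sampled_misses, latency_cycles, avg_latency, data_entry)
rows = [
    (66.82, 66.82, 42, 580, 13.8, "Heap"),
    (7.72, 74.54, 10, 67, 6.7, "Memory mapped shared library"),
    (5.65, 80.18, 5, 49, 9.8, "Process Text Region"),
    (4.84, 85.02, 4, 42, 10.5, "libc.so.1::_arena_rmutex"),
    (4.72, 89.75, 5, 41, 8.2, "Process Data Region"),
]


def check_rows(rows, tol=0.02):
    """Recompute the average and cumulative columns and compare with the report.

    Assumption: avg = latency_cycles / sampled_misses, and cumulative is a
    running sum of pct_total. The tolerance absorbs the report's rounding.
    """
    results = []
    running = 0.0
    for pct, cum, misses, cycles, avg, name in rows:
        running += pct
        derived_avg = cycles / misses
        avg_ok = abs(derived_avg - avg) <= 0.05
        cum_ok = abs(running - cum) <= tol
        results.append((name, round(derived_avg, 1), round(running, 2), avg_ok, cum_ok))
    return results


for name, derived_avg, running, avg_ok, cum_ok in check_rows(rows):
    print(f"{name}: avg={derived_avg} cumulative={running} avg_ok={avg_ok} cum_ok={cum_ok}")
```

Under these assumptions every sample row is consistent with the reported values, with differences of at most 0.01 in the cumulative column that are attributable to rounding.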
The Data Entry column shows the global variable name, process region name, or unknown data address.
The process regions are:
• Process Text Region - the address space occupied by the process text/instructions
• Process Data Region - the address space occupied by initialized data and uninitialized data (.bss)
• Heap - the address space where dynamically allocated memory resides
• Data and Heap combined - when HP Caliper cannot discover the data and heap regions separately
• Process Stack Region - the user stack area
• Shared mem - all the shared memory areas mapped to the process
• RSE Stack - the RSE stack area
• Memory mapped shared library - the data area of the shared libraries mapped to the process
• Memory mapped region - all other memory mapped regions
If there is more than one region of the same type, they are combined and reported as a single entry.
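The combining of same-type regions into a single entry can be illustrated with a minimal sketch. The address ranges, the sample values, and the names REGIONS, region_of, and summarize are all hypothetical; this is not how HP Caliper is implemented, and the mapping of addresses to global variable names is left out.

```python
from collections import defaultdict

# Hypothetical address ranges: (start, end_exclusive, region_type).
# Two separate ranges share the type "Memory mapped region".
REGIONS = [
    (0x1000, 0x2000, "Process Text Region"),
    (0x4000, 0x5000, "Memory mapped region"),
    (0x6000, 0x8000, "Heap"),
    (0x9000, 0xA000, "Memory mapped region"),
]


def region_of(addr, regions=REGIONS):
    """Return the region type containing addr, or "unknown" if none matches."""
    for start, end, kind in regions:
        if start <= addr < end:
            return kind
    return "unknown"


def summarize(samples, regions=REGIONS):
    """Aggregate (data_address, latency) samples by region type.

    Ranges with the same type contribute to one entry, mirroring the
    single combined entry described in the text.
    """
    totals = defaultdict(lambda: [0, 0])  # type -> [sampled misses, latency cycles]
    for addr, latency in samples:
        entry = totals[region_of(addr, regions)]
        entry[0] += 1
        entry[1] += latency
    return {kind: tuple(counts) for kind, counts in totals.items()}


# Hypothetical samples: two land in different "Memory mapped region" ranges.
samples = [(0x4100, 7), (0x9100, 14), (0x6100, 64), (0xF000, 7)]
summary = summarize(samples)
print(summary)
```

In this sketch the two samples that fall in different memory mapped ranges end up in one "Memory mapped region" entry, and the address outside every range is counted as unknown.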
The Data Summary report is generated
The Data Summary report can be merged or differenced across two databases that contain the Data Summary information.
If a process calls exec(), HP Caliper does not discover the process regions. In this case, the data addresses are mapped to global variables, and any unassigned samples are reported as unknown samples. A diagnostic message is generated with the report.
How Data Cache Metrics Are Obtained
HP Caliper obtains data cache metrics from the processor's performance monitoring unit (PMU).
Exact counts are obtained from the PMU's set of performance monitor configuration (PMC)/performance monitor data (PMD) register pairs. Sampled data cache metrics are obtained from the PMU's data event address register.
HP Caliper takes samples every Nth data cache miss, where N is defined in the dcache measurement configuration file in the HP Caliper home directory config subdirectory. At each sample point, HP Caliper records both the instruction that resulted in a data cache miss and the latency (number of clock cycles) incurred by the miss. You can override the value in the measurement configuration file by using the
For data cache miss sampling, the PMU can monitor only one data cache load at a time. Since there are likely to be multiple loads in progress at any given moment, the PMU can process only a subset of data cache misses. The PMU randomizes which loads it monitors.
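The two effects described above (only one load monitored at a time, and a sample taken every Nth observed miss) can be sketched with a toy model. Everything here is an assumption made for illustration: the Miss record, the cycle values, and the first-come choice of which load to monitor. The text states that the PMU randomizes which loads it monitors, so the deterministic choice below is a deliberate simplification, not the hardware's behavior.

```python
from dataclasses import dataclass


@dataclass
class Miss:
    ip: int       # address of the instruction that missed (hypothetical values)
    start: int    # cycle at which the load was issued
    latency: int  # cycles incurred by the miss


def observable_misses(misses):
    """Keep only misses that start while no monitored miss is in flight.

    Toy model of "one load at a time": the first load to arrive when the
    monitor is free is tracked. Real hardware randomizes this choice.
    """
    seen = []
    busy_until = 0
    for m in sorted(misses, key=lambda m: m.start):
        if m.start >= busy_until:
            seen.append(m)
            busy_until = m.start + m.latency
    return seen


def sample_every_nth(misses, n):
    """Return every Nth observed miss (the Nth, 2Nth, 3Nth, ...)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return misses[n - 1::n]


# Hypothetical miss stream: the second load overlaps the first and is not observed.
stream = [
    Miss(ip=0x10, start=0, latency=10),
    Miss(ip=0x20, start=5, latency=7),
    Miss(ip=0x30, start=10, latency=14),
    Miss(ip=0x40, start=30, latency=7),
]
observed = observable_misses(stream)
sampled = sample_every_nth(observed, 2)
print([hex(m.ip) for m in observed], [(hex(m.ip), m.latency) for m in sampled])
```

Each sampled record carries the instruction address and the latency, the two items the text says are recorded at a sample point.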
This means that the number of data cache misses observed through