Table 26 Information in fprof Measurement Reports
Column | Description
% Total IP Samples | Percent of the total IP samples attributable to a given program object.
Cumulat % of Total | Running sum of the percent of total IP samples accounted for by the given program object and those listed above it.
IP Samples | Total number of IP samples attributed to the given program object.
Kernel Thread Identification Number | Kernel thread ID suffixed with the name of the routine that the thread will execute once it is created.
Load Module | Shared library or the main executable.
Function | Routine from your application.
File | Source file associated with a function.
Line / Slot / Col,Offset | The column contains one of these:
 | • A [...]
 | • An instruction slot number for rows showing instructions not on a bundle boundary
 | • A [...] for rows showing instructions on a bundle boundary
 | Column and line numbers are preceded by “~” when they are approximate due to optimization.
>Statement / Instruction | The column contains either a source statement preceded by “>” or a disassembled instruction. Statements that are out of order due to optimization are preceded by “*>”.

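The relationship between the IP Samples, % Total IP Samples, and Cumulat % of Total columns can be illustrated with a small calculation. The following is a minimal sketch, not HP Caliper code: the function name summarize, the sample counts, and the assumption that rows are listed hottest first are all hypothetical.

```python
def summarize(samples):
    """Compute report-style rows from a dict of {program object: IP sample count}.

    Returns a list of (name, ip_samples, pct_total, cumulative_pct) tuples.
    Assumption: rows are ordered from most to fewest samples.
    """
    total = sum(samples.values())
    if total == 0:
        return []
    rows = []
    running_count = 0
    ordered = sorted(samples.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ordered:
        running_count += count
        pct = 100.0 * count / total
        # Derive the cumulative value from the running count, not from
        # summed percentages, so rounding error does not accumulate.
        cumulative = 100.0 * running_count / total
        rows.append((name, count, round(pct, 2), round(cumulative, 2)))
    return rows


# Hypothetical sample counts for three functions.
rows = summarize({"parse": 300, "compute": 600, "main": 100})
for name, count, pct, cumulative in rows:
    print(f"{pct:7.2f} {cumulative:7.2f} {count:8d}  {name}")
```

The last row's cumulative value is always 100 because every sample is attributed to exactly one program object.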
How fprof Metrics Are Obtained
HP Caliper obtains fprof metrics using the performance monitoring unit (PMU).
Exact counts are obtained from the PMU's performance monitor configuration (PMC)/performance monitor data (PMD) register pairs. Sampled IPs are obtained from the operating system.
HP Caliper takes samples by using the overflow of one of the PMU's event counters as a sampling trigger. A sample is taken every Nth PMU event, where both N and the sampling event are defined in the fprof measurement configuration file, located in the config subdirectory of the HP Caliper home directory. You can override the value in the measurement configuration file by using the
The list of processor metrics you can use for the sampling event is available in the file itanium2_cpu_counters.txt, located in the doc/text subdirectory of the HP Caliper home directory.
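Sampling on every Nth event by means of a counter overflow can be pictured with a small simulation. This is a sketch under stated assumptions, not HP Caliper's or the kernel's implementation: the class OverflowSampler, the 8-bit counter width, and the preload-and-rearm scheme are hypothetical, chosen only to show how an overflow can act as a periodic trigger.

```python
COUNTER_BITS = 8  # hypothetical width, kept narrow so the example is easy to follow


class OverflowSampler:
    """Simulate an event counter that triggers a sample each time it overflows."""

    def __init__(self, period, counter_bits=COUNTER_BITS):
        self.modulus = 1 << counter_bits
        if not 1 <= period <= self.modulus:
            raise ValueError("period must fit in the counter")
        self.period = period
        self.samples = []
        self._rearm()

    def _rearm(self):
        # Preload the counter so that exactly `period` increments wrap it to zero.
        self.counter = (self.modulus - self.period) % self.modulus

    def on_event(self, ip):
        """Count one occurrence of the sampling event observed at address `ip`."""
        self.counter = (self.counter + 1) % self.modulus
        if self.counter == 0:  # the counter wrapped: treat this as the overflow
            self.samples.append(ip)
            self._rearm()


# Ten events at consecutive 16-byte addresses, sampled every 4th event.
sampler = OverflowSampler(period=4)
for i in range(10):
    sampler.on_event(0x4000 + 16 * i)
print([hex(ip) for ip in sampler.samples])
```

With a period of 4, only the 4th and 8th events produce samples, so a larger N lowers overhead at the cost of fewer samples.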
The IP collected at each sampling point is the IP recorded by the kernel (in the process's save state) when the PMU overflow trap is taken. The kernel does not record an instruction slot number. Thus, the finest granularity HP Caliper reports is the instruction bundle.
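Because Itanium instruction bundles are 16 bytes long and 16-byte aligned, bundle-level attribution amounts to grouping samples by address with the low four bits cleared. The following is a minimal sketch: the helper names bundle_address and samples_per_bundle and the sample addresses are hypothetical, and the masking step is shown only to make the grouping rule explicit.

```python
from collections import Counter

BUNDLE_SIZE = 16  # bytes; an Itanium bundle is 128 bits and 16-byte aligned


def bundle_address(ip):
    """Return the address of the bundle that contains `ip`."""
    return ip & ~(BUNDLE_SIZE - 1)


def samples_per_bundle(ips):
    """Count samples per bundle; the slots within a bundle cannot be told apart."""
    return Counter(bundle_address(ip) for ip in ips)


# Hypothetical sampled addresses.
counts = samples_per_bundle([0x4000, 0x4000, 0x4010, 0x4008])
for address, count in sorted(counts.items()):
    print(f"{address:#x}: {count}")
```

All samples that fall in the same bundle are reported together, regardless of which of the bundle's instructions was executing.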
The IP that HP Caliper records is the address of the next instruction that will execute when the kernel resumes execution of your application. It is not the address of the instruction that caused the event that resulted in the PMU overflow trap, because of the delays associated with incrementing the PMU counter, detecting the overflow, and triggering the trap. As a result, the instruction that caused the PMU overflow executes some number of cycles, typically in the low tens, before the instruction at the sampled address. The recorded address might or might not point to the instruction causing the event, depending on pipeline stalls.
The latency between the event triggering the sample and the actual sample is not a problem if you are using fprof to find hot spots in your application. It is only an issue if you try to use fprof to find particular instructions that cause the events recorded by the PMU, in which case you must take the latency into account.
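One way to take the latency into account is to treat each sampled address as the end of a window of candidate bundles rather than as the exact culprit. The sketch below is illustrative only: it measures the delay in bundles along a recorded execution trace, which is a simplification (the actual delay is a number of cycles and varies with pipeline stalls), and the function names, the trace, and the delay values are hypothetical.

```python
def sampled_ip(trace, trigger_index, skid):
    """Return the address reported for an event raised at trace[trigger_index].

    `skid` is the assumed delay, in bundles, between the event and the trap.
    """
    return trace[min(trigger_index + skid, len(trace) - 1)]


def candidate_window(trace, sample_index, max_skid):
    """Return the bundles that could have caused a sample seen at trace[sample_index].

    The sampled bundle itself is included, because a stall can leave the
    sampled address pointing at the instruction that caused the event.
    """
    start = max(0, sample_index - max_skid)
    return trace[start:sample_index + 1]


# Hypothetical straight-line trace of eight consecutive bundles.
trace = [0x4000 + 16 * i for i in range(8)]
reported = sampled_ip(trace, trigger_index=2, skid=3)
window = candidate_window(trace, sample_index=5, max_skid=3)
print(hex(reported), [hex(a) for a in window])
```

For hot-spot analysis the window usually stays inside the same function or loop, which is why the delay does not change the overall picture.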