While executing those instructions will not cause an application to crash in the absence of HP Caliper, they will still have an impact on performance. Executing a break instruction causes a trap to the breakpoint handler in the kernel.

The presence of trigger macros may disable some optimizations that the compiler could otherwise perform.

The trigger instructions are defined so that code will not be moved around them. This ensures that code appearing in the source between two sample points is not executed before or after those samples are taken.

This prevents the compiler from reordering statements across sample points while optimizing code, so the measured program may perform worse than it would otherwise. For example, with sample points inside a loop, loop-invariant promotion or other loop transformations may become illegal or less effective. For sample points placed at the entrance and exit of functions, performance could be affected if the function is inlined.

Unfortunately, the only way to check for such issues is to compare the code generated by the compiler with and without those macros, and estimate whether the program measurements are significantly affected.
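The reordering restriction described above can be illustrated with a small sketch. It does not use the HP Caliper trigger macros: SAMPLE_POINT is a hypothetical stand-in, defined here as a GCC/Clang-style compiler barrier, and it models only the "code will not be moved around it" property, not the break instruction or the kernel trap. The function names are also illustrative.

```c
#include <stddef.h>

/* ASSUMPTION: SAMPLE_POINT is a hypothetical stand-in, NOT an HP Caliper
 * macro. It emits no instruction, but the "memory" clobber tells a
 * GCC-compatible compiler not to move memory accesses across it. */
#define SAMPLE_POINT() __asm__ volatile("" ::: "memory")

/* No sample point: the optimizer is free to load *scale once,
 * before the loop (loop-invariant promotion). */
long sum_scaled_plain(const int *data, size_t n, const int *scale)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += (long)data[i] * *scale;
    return total;
}

/* Sample point inside the loop: the load of *scale cannot be hoisted
 * across the barrier, so it is typically repeated on every iteration. */
long sum_scaled_with_sample_point(const int *data, size_t n, const int *scale)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        SAMPLE_POINT();
        total += (long)data[i] * *scale;
    }
    return total;
}
```

Both functions return the same value; only the generated code differs. Compiling the file to assembly with optimization enabled and comparing the two loop bodies is one way to carry out the with-and-without comparison on a small scale.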

Restricting PMU Measurements to Specific Code Regions

By default, HP Caliper measures PMU events for your entire program. You can, however, restrict measurements to performance-sensitive regions of code. This feature is enabled with the CALIPER_PMU_ENABLE and CALIPER_PMU_DISABLE macros and the --user-regions option.
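As a minimal sketch of how the two macros might bracket a region of interest, consider the following. Several details are assumptions: the macros are written in function-like form, the no-op fallback definitions exist only so that the fragment compiles without the HP Caliper header (which this section does not name), and dot_product is a hypothetical kernel chosen for illustration.

```c
#include <stddef.h>

/* ASSUMPTION: in a real build these macros come from the header shipped
 * with HP Caliper. The no-op fallbacks below only keep this sketch
 * compilable on its own; the function-like form is also an assumption. */
#ifndef CALIPER_PMU_ENABLE
#define CALIPER_PMU_ENABLE()  ((void)0)
#endif
#ifndef CALIPER_PMU_DISABLE
#define CALIPER_PMU_DISABLE() ((void)0)
#endif

/* Hypothetical kernel: only the loop is intended to be measured. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;

    CALIPER_PMU_ENABLE();            /* start of the measured region */
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    CALIPER_PMU_DISABLE();           /* end of the measured region */

    return sum;
}
```

With the real macros in place, the program would then be run under HP Caliper with the --user-regions option so that events outside the bracketed loop are excluded from the measurement.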

You can use this feature with these measurements:

alat

branch

dcache

dtlb

ecount

fprof

icache

itlb

pmu_trace

scgprof

While you can also use this feature with the cgprof measurement, it might lead to inconsistent results. This is because the time statistics are collected using the PMU, while the call graph and function counts are collected using dynamic instrumentation.

Reasons to use this feature include:

Analyzing a particular loop or function. You can restrict measurements to a particular loop to get information such as:

ecount    Number of events occurring in the loop

fprof     Hot spots in the loop

branch    Analysis of the loop branches

dcache    Data cache misses in the loop

Analyzing a particular phase in an application.

For applications with important startup or shutdown phases, it is sometimes beneficial to limit measurements to the “in-between” phase. This technique allows you to use test cases that run

