This prevents the compiler from reordering statements while optimizing code, so the measured program may perform worse than it otherwise would. For example, with sample points inside a loop, loop-invariant promotion or other loop transformations might become illegal or less effective. For sample points placed at the entrance and exit of functions, performance might be affected if the function is inlined.

Unfortunately, the only way to detect such issues is to compare the code the compiler generates with and without those macros, and then estimate whether the program measurements are significantly affected.

Restricting PMU Measurements to Specific Code Regions

By default, HP Caliper measures PMU events for your entire program. You can, however, restrict measurements to performance-sensitive regions of code. This feature is enabled with the CALIPER_PMU_ENABLE and CALIPER_PMU_DISABLE macros and the --user-regions option.
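As a minimal sketch of how the two macros might bracket a region of interest, the following C fragment wraps a single loop. It rests on assumptions not stated in this section: that the macros are invoked without arguments, and that they are normally supplied by an HP Caliper header. The no-op fallback definitions and the dot_product function are hypothetical stand-ins so that the fragment compiles without Caliper installed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a real build gets these macros from the HP Caliper header,
 * and they take no arguments. The no-op stand-ins below exist only so
 * this sketch compiles on a system without Caliper. */
#ifndef CALIPER_PMU_ENABLE
#define CALIPER_PMU_ENABLE()  ((void)0)
#endif
#ifndef CALIPER_PMU_DISABLE
#define CALIPER_PMU_DISABLE() ((void)0)
#endif

/* Hypothetical kernel: the loop is the performance-sensitive region. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;

    assert(n == 0 || (a != NULL && b != NULL));

    CALIPER_PMU_ENABLE();           /* measured region starts here */
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    CALIPER_PMU_DISABLE();          /* measured region ends here */

    return sum;
}
```

In this sketch the macros sit outside the loop rather than inside it, so the loop body itself is left unchanged. The exact header name and any arguments to the --user-regions option are not shown here.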

You can use this feature with these measurements:

alat

branch

dcache

dtlb

ecount

fprof

icache

itlb

pmu_trace

scgprof

While you can also use this feature with the cgprof measurement, it might lead to inconsistent results. This is because the time statistics are collected using the PMU, while the call graph and function counts are collected using dynamic instrumentation.

Reasons to use this feature include:

Analyzing a particular loop or function. You can restrict measurements to a particular loop to get information such as:

ecount    Number of events occurring in the loop

fprof    Hot spots in the loop

branch    Analysis of the loop branches

dcache    Data cache misses in the loop

Analyzing a particular phase in an application.

For applications with important startup or shutdown phases, it is sometimes beneficial to limit measurements to the “in-between” phase. This technique allows you to use test cases that run for a shorter time, without having to worry about the effects caused by the startup and shutdown code.

Similarly, the data collection can be restricted to the startup or shutdown phases to target those for performance improvements.
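The phase-based use described above could look roughly like the following sketch, which leaves startup and shutdown unmeasured and enables the PMU only for the "in-between" phase. The phase functions, the work counter, and the no-argument form of the macros are all hypothetical. The fallback macro definitions are stand-ins for whatever the HP Caliper header actually provides.

```c
#include <assert.h>

/* Assumption: the real macros come from the HP Caliper header and take
 * no arguments; these no-op stand-ins only make the sketch buildable. */
#ifndef CALIPER_PMU_ENABLE
#define CALIPER_PMU_ENABLE()  ((void)0)
#endif
#ifndef CALIPER_PMU_DISABLE
#define CALIPER_PMU_DISABLE() ((void)0)
#endif

/* Hypothetical application phases. Each one just bumps a counter that
 * stands in for real work. */
static long work_done;

static void app_startup(void)       { work_done += 1; }
static void app_process_batch(void) { work_done += 1; }
static void app_shutdown(void)      { work_done += 1; }

/* Runs all three phases; only the batch-processing phase is measured. */
long run_application(int batches)
{
    assert(batches >= 0);
    work_done = 0;

    app_startup();                  /* not measured */

    CALIPER_PMU_ENABLE();           /* steady-state phase: measured */
    for (int i = 0; i < batches; i++) {
        app_process_batch();
    }
    CALIPER_PMU_DISABLE();

    app_shutdown();                 /* not measured */

    return work_done;
}
```

To target the startup or shutdown phase instead, the same pair of macros would be moved around the corresponding call.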
