Intel IXP400 manual Using Intel XScale Core PMU to Determine Cache Efficiency


Intel® IXP400 Software

Access-Layer Components: Performance Profiling (IxPerfProfAcc) API

17.10.1.4 Using Intel XScale® Core PMU to Determine Cache Efficiency

In this example, the user wants to measure instruction cache efficiency. The user would use the event counting process to count the total number of instructions executed and the number of instruction cache misses that require fetch requests to external memory.

Only two of the four event counters are needed. The remaining two counters will not provide relevant results in this example and may be set to the appropriate default event value.

1. To begin the counting, call the start function with the following parameters:

ixPerfProfAccXscalePmuEventCounting (FALSE, 2,
    IX_PERFPROF_ACC_XSCALE_PMU_EVENT_INST_EXEC,
    IX_PERFPROF_ACC_XSCALE_PMU_EVENT_CACHE_MISS,
    IX_PERFPROF_ACC_XSCALE_PMU_EVENT_MAX,
    IX_PERFPROF_ACC_XSCALE_PMU_EVENT_MAX);

2. Declare a results variable:

IxPerfProfAccXscalePmuResults results;

3. To end the counting, call the stop function, passing the address of the results variable:

ixPerfProfAccXscalePmuEventCountStop (&results);

4. Print the total value (combining the upper and lower 32 bits) of all the counters:

printf("total clk count = 0x%x%08x\n", results.clk_samples, results.clk_value);
printf("total event 1 count = 0x%x%08x\n", results.event1_samples, results.event1_value);
printf("total event 2 count = 0x%x%08x\n", results.event2_samples, results.event2_value);
printf("total event 3 count = 0x%x%08x\n", results.event3_samples, results.event3_value);
printf("total event 4 count = 0x%x%08x\n", results.event4_samples, results.event4_value);

Note: Because only event counters 1 and 2 were configured to monitor events, event counters 3 and 4 will remain at zero and their results are irrelevant.

5. Calculate the appropriate statistics from the results to determine the instruction cache efficiency. The instruction cache miss rate is the number of instruction cache misses (monitored by event counter 2) divided by the total number of instructions executed (monitored by event counter 1):

Instruction cache miss rate
= instruction cache misses / total number of instructions executed
= total event 2 count / total event 1 count

6. The average number of cycles taken to execute an instruction (also known as cycles-per-instruction) is the total clock count (monitored by the clock counter) divided by the total number of instructions executed (monitored by event counter 1):

cycles-per-instruction
= total clock count / total number of instructions executed
= total clk count / total event 1 count
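The calculations in steps 4 through 6 can be sketched in C as follows. This is a minimal illustration, not part of the IxPerfProfAcc API: the helper names pmu_total, pmu_ratio, and pmu_print_cache_report are hypothetical, and the sketch assumes (as step 4 implies) that each *_samples field holds the upper 32 bits of a counter and each *_value field holds the lower 32 bits.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: combine the upper and lower 32-bit halves of a
 * counter into a single 64-bit total. */
static uint64_t pmu_total(uint32_t upper, uint32_t lower)
{
    return ((uint64_t)upper << 32) | (uint64_t)lower;
}

/* Hypothetical helper: divide two totals, returning 0.0 instead of
 * dividing by zero when nothing was counted. */
static double pmu_ratio(uint64_t numerator, uint64_t denominator)
{
    return denominator == 0 ? 0.0 : (double)numerator / (double)denominator;
}

/* Hypothetical helper: print the instruction cache miss rate and the
 * cycles-per-instruction figure from the raw counter halves.
 * Event 1 is assumed to count instructions executed and event 2 to
 * count instruction cache misses, as configured in step 1. */
static void pmu_print_cache_report(uint32_t clk_upper, uint32_t clk_lower,
                                   uint32_t ev1_upper, uint32_t ev1_lower,
                                   uint32_t ev2_upper, uint32_t ev2_lower)
{
    uint64_t clocks       = pmu_total(clk_upper, clk_lower);
    uint64_t instructions = pmu_total(ev1_upper, ev1_lower);
    uint64_t misses       = pmu_total(ev2_upper, ev2_lower);

    printf("instructions executed    = %" PRIu64 "\n", instructions);
    printf("instruction cache misses = %" PRIu64 "\n", misses);
    printf("miss rate                = %f\n", pmu_ratio(misses, instructions));
    printf("cycles-per-instruction   = %f\n", pmu_ratio(clocks, instructions));
}
```

After the stop function has filled in results, the report could be produced with a call such as pmu_print_cache_report(results.clk_samples, results.clk_value, results.event1_samples, results.event1_value, results.event2_samples, results.event2_value). Combining the halves into a 64-bit integer before dividing avoids losing the upper word when a counter has overflowed 32 bits.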

April 2005

IXP400 Software Version 2.0

Programmer’s Guide


Document Number: 252539, Revision: 007

 
