Texas Instruments SPRAA56 appendix RTA Techniques for Performance Measurement

4 RTA Techniques for Performance Measurement

The RTA techniques described in this section are largely application-specific calls to DSP/BIOS RTA services via APIs in the run-time code. These API calls can be added to any application without modifying its logical structure.

In the video application, the performance overhead of the RTA tools is expected to be minimal because the calls are made at the frame rate of 30 or 25 Hz, or in some cases only once every 30 or 25 frames, which is a very slow rate compared to the speed of the DSP. In applications where the frame rate is faster than 30 Hz (for example, voice or audio), less frequent calls to RTA services may be preferable. For instance, you might display benchmarking statistics only every N frames, where N is chosen to give a display period of about one half second.

See Appendix A: Performance Impact for information on measuring overhead.
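The choice of N for a half-second display period is simple arithmetic, sketched below in plain JavaScript. The helper name framesPerDisplay and the 100 Hz audio frame rate are assumptions made for illustration; they do not come from the application.

```javascript
// Hypothetical helper: number of frames between statistics updates
// for a target display period (defaults to one half second).
function framesPerDisplay(frameRateHz, displayPeriodSec) {
    var period = (displayPeriodSec === undefined) ? 0.5 : displayPeriodSec;
    // Never return less than 1, so very slow frame rates still update every frame.
    return Math.max(1, Math.round(frameRateHz * period));
}

var n30 = framesPerDisplay(30);     // 30 Hz video: update every 15 frames
var n25 = framesPerDisplay(25);     // 25 Hz video: 12.5 rounds up to 13 frames
var nAudio = framesPerDisplay(100); // assumed 100 Hz audio frame rate: 50 frames

console.log(n30, n25, nAudio);
```

Rounding to the nearest whole frame keeps the display period close to the target without requiring the frame rate to divide evenly.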

4.1 Measuring Function Execution Time with the UTL Module

The first benchmarking technique uses the UTL module from Reference Frameworks. UTL_stsStart and UTL_stsStop calls were inserted before and after the functions of interest, and UTL_stsPeriod was used in each of the three data tasks to measure the period of one complete loop through that task. Because the UTL module acts as a wrapper for DSP/BIOS STS objects, the STS objects must be created during DSP/BIOS configuration. The statistics objects are named according to the following convention:

“sts” + task pseudonym + function benchmarked

The appInstrument.tci Tconf configuration script contains the following loop that creates these STS objects. For example, the stsProcCell0 STS object is created for the first processing function (cell 0) in the process task.

    /* Array of string names to be used to create STS objects */
    var stsNames = new Array("InVid", "OutVid", "Proc");

    var stsStruct = new Array(
        new Array("BusUtil", "Cell0", "Period", "Total", "Wait0"),
        new Array("BusUtil", "Cell0", "Period", "Total", "Wait0"),
        new Array("BusUtil", "Cell0", "Cell1", "Period", "Total", "Nframes")
    );

    /* STS objects for use with UTL_sts* functions */
    for (i = 0; i < APPSTSTIMECOUNT; i++) {
        for (j = 0; j < stsStruct[i].length; j++) {
            var stsTime = tibios.STS.create("sts" + stsNames[i] + stsStruct[i][j]);
            if (stsStruct[i][j] != "BusUtil") {
                stsTime.unitType = "High resolution time based";
                stsTime.operation = "A * x";
            } else {
                stsTime.unitType = "Not time based";
                stsTime.operation = "Nothing";
            }
        }
    }
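To see which object names the naming convention produces, the same tables can be walked in plain JavaScript outside of Tconf. This is only a sketch: it does not use the tibios object, the function name listStsObjectNames is hypothetical, and it assumes that APPSTSTIMECOUNT equals the number of tasks (3), since its definition is not shown in this section.

```javascript
// Same name tables as in the Tconf loop, written as array literals.
var stsNames = ["InVid", "OutVid", "Proc"];
var stsStruct = [
    ["BusUtil", "Cell0", "Period", "Total", "Wait0"],
    ["BusUtil", "Cell0", "Period", "Total", "Wait0"],
    ["BusUtil", "Cell0", "Cell1", "Period", "Total", "Nframes"]
];

// Assumption: APPSTSTIMECOUNT matches the number of tasks listed in stsNames.
var APPSTSTIMECOUNT = stsNames.length;

// Hypothetical helper: build "sts" + task pseudonym + function benchmarked
// for every entry, without creating any DSP/BIOS objects.
function listStsObjectNames() {
    var names = [];
    for (var i = 0; i < APPSTSTIMECOUNT; i++) {
        for (var j = 0; j < stsStruct[i].length; j++) {
            names.push("sts" + stsNames[i] + stsStruct[i][j]);
        }
    }
    return names;
}

var allNames = listStsObjectNames();
console.log(allNames.join("\n"));
```

Under that assumption the loop yields 16 names, including stsProcCell0 for the first processing cell of the process task.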

The Tconf scripts are used to generate the DSP/BIOS configuration CDB file at design time, which in turn links the appropriate kernel modules into the executable image during a build.
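The start/stop and period measurements described at the beginning of this section can be illustrated with a small host-side model in plain JavaScript. This is not the UTL or DSP/BIOS API: the function names makeStats, statsStart, statsStop, and statsPeriod are hypothetical, the timestamps are arbitrary tick values passed in by the caller, and the model only assumes that a statistics object accumulates a count, a total, and a maximum.

```javascript
// Host-side model of a statistics accumulator (count, total, max).
// Simplification: max starts at 0, which is adequate for non-negative deltas.
function makeStats() {
    return { count: 0, total: 0, max: 0, prev: null };
}

// Fold one elapsed-time sample into the accumulator.
function accumulate(sts, delta) {
    sts.count += 1;
    sts.total += delta;
    if (delta > sts.max) { sts.max = delta; }
}

// Model of a "start" call: remember the current timestamp.
function statsStart(sts, now) {
    sts.prev = now;
}

// Model of a "stop" call: accumulate the time elapsed since the matching start.
function statsStop(sts, now) {
    if (sts.prev === null) { return; } // stop without a start is ignored
    accumulate(sts, now - sts.prev);
    sts.prev = null;
}

// Model of a "period" call: accumulate the time since the previous
// period call, then restart the measurement from the current timestamp.
function statsPeriod(sts, now) {
    if (sts.prev !== null) { accumulate(sts, now - sts.prev); }
    sts.prev = now;
}

// Execution time of one function, bracketed by start/stop on two loop passes.
var cell0 = makeStats();
statsStart(cell0, 100); statsStop(cell0, 140); // 40 ticks
statsStart(cell0, 200); statsStop(cell0, 260); // 60 ticks

// Loop period, sampled once per pass through the task loop.
var period = makeStats();
var loopTimes = [0, 33, 66, 100];
for (var k = 0; k < loopTimes.length; k++) {
    statsPeriod(period, loopTimes[k]);
}

console.log("cell0 average:", cell0.total / cell0.count, "max:", cell0.max);
console.log("period average:", period.total / period.count, "max:", period.max);
```

The start/stop pair measures only the bracketed function, while the single period call captures everything that happens between successive loop iterations, including time spent blocked or preempted.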

DSP/BIOS Real-Time Analysis (RTA) and Debugging Applied to a Video Application
