Texas Instruments SPRAA56


In video applications that handle the full 720x480 resolution, each frame contains about 675 KB of data. Such applications must constantly move video frames from internal working memory buffers to external frame buffers and back, which often results in several MB of memory transfers through the external bus for each frame. At 30 frames per second, the required memory transfer bandwidth can be a significant resource requirement. As resolutions increase to high-definition sizes of 1440x720 or even 1920x1080, and frame rates rise to 60 frames per second, memory bandwidth can become more of a limitation than CPU cycles.
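As a rough check of these figures, the sketch below computes the size of one frame and the bandwidth needed to move it across the external bus. It assumes 2 bytes per pixel (YUV 4:2:2 storage), which is what makes a 720x480 frame come to 675 KB. The function names are illustrative and are not part of the example application.

```c
#include <stdint.h>

/* Bytes in one frame, given its dimensions and average bytes per pixel.
 * Assumption: 2 bytes per pixel corresponds to YUV 4:2:2 storage. */
static uint32_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Bytes per second crossing the external bus when every frame is moved
 * `passes` times (for example, once in and once out gives passes = 2). */
static uint64_t bus_bytes_per_second(uint32_t bytes_per_frame, uint32_t fps,
                                     uint32_t passes)
{
    return (uint64_t)bytes_per_frame * fps * passes;
}
```

Under these assumptions, frame_bytes(720, 480, 2) is 691,200 bytes (675 KB), and one copy in plus one copy out at 30 frames per second, bus_bytes_per_second(691200, 30, 2), is 41,472,000 bytes per second.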

The architecture of the video software framework often determines the amount of memory bandwidth required. Frameworks that repeatedly move video frames from external memory to internal working buffers and back introduce unnecessary memory bandwidth overhead that may limit the frame rate. Therefore, it is important to understand the memory bus utilization of the whole system and its components.

Data structures for measuring the memory bus utilization of the input, processing, and display tasks are included in the modified example. The actual values logged into the data structures are estimated, based on the defined size of the frames being moved to internal buffers for processing.
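One way to organize such counters is sketched below. This is not the structure used in the modified example: the type and function names are hypothetical. As in the text, the logged values are estimates accumulated from known buffer sizes, not measurements taken on the bus.

```c
#include <stdint.h>

/* Hypothetical per-task record of estimated external bus traffic. */
typedef struct {
    uint32_t bytes_in;   /* external -> internal, current frame */
    uint32_t bytes_out;  /* internal -> external, current frame */
    uint64_t total;      /* running total over all completed frames */
    uint32_t frames;     /* number of completed frames */
} BusUtil;

/* Log a buffer of known size copied into internal memory. */
static void busutil_log_in(BusUtil *u, uint32_t nbytes)
{
    u->bytes_in += nbytes;
}

/* Log a buffer of known size copied back out to external memory. */
static void busutil_log_out(BusUtil *u, uint32_t nbytes)
{
    u->bytes_out += nbytes;
}

/* Call once per frame: fold the frame counters into the running total,
 * reset them, and return the estimated bytes moved for that frame. */
static uint32_t busutil_end_frame(BusUtil *u)
{
    uint32_t frame_total = u->bytes_in + u->bytes_out;
    u->total += frame_total;
    u->frames++;
    u->bytes_in = 0;
    u->bytes_out = 0;
    return frame_total;
}
```

Each of the input, processing, and display tasks could own one such record, and the per-frame value returned by busutil_end_frame() could be passed to whatever logging mechanism the application already uses.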

For the case of YUV4:2:0 to YUV4:2:2 color conversion, the external memory bus utilization and data flow are shown in Figure 5. A D1 frame (345,600 bytes of luminance data) and two chroma buffers, each ¼ that size, are copied to internal memory sections for processing, totaling 1.5 times a frame's worth of data. The data copied back out to external memory after conversion has twice as many chroma samples, for a total of 2 times the D1 frame size in pixels. The estimated total bus utilization is therefore 1.5N + 2N = 3.5N bytes, where N is the frame size in pixels.
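The estimate can be written out as a short calculation. The sketch below assumes 8-bit samples (one byte per luminance or chroma sample) and a frame size divisible by 4. The type and function names are illustrative only.

```c
#include <stdint.h>

/* Estimated external bus traffic for one 4:2:0 -> 4:2:2 conversion. */
typedef struct {
    uint32_t bytes_in;   /* copied from external to internal memory */
    uint32_t bytes_out;  /* copied from internal back to external memory */
} ConvTraffic;

static ConvTraffic yuv420_to_422_traffic(uint32_t width, uint32_t height)
{
    uint32_t n = width * height;    /* N: frame size in pixels */
    ConvTraffic t;
    t.bytes_in  = n + 2 * (n / 4);  /* Y plus Cb and Cr at N/4 each = 1.5N */
    t.bytes_out = n + 2 * (n / 2);  /* Y plus Cb and Cr at N/2 each = 2N */
    return t;
}
```

For a D1 frame (720x480, N = 345,600) this gives 518,400 bytes in and 691,200 bytes out, or 1,209,600 bytes (3.5N) per converted frame.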

 

 

 

 

[Figure 5 diagram. Source buffers in external memory: Y, 720*480 = 345,600 B; Cb, 86,400 B; Cr, 86,400 B. Internal L2 memory: scratch, 14,400 B. Destination buffers in external memory: Y, 345,600 B; Cb, 172,800 B; Cr, 172,800 B.]

Figure 5. External/Internal Memory Transfers, YUV4:2:0 to 4:2:2 Conversion Function

DSP/BIOS Real-Time Analysis (RTA) and Debugging Applied to a Video Application
