Texas Instruments TMS320 DSP manual Data Memory

Page 19

www.ti.com

Data Memory

void PRE_filter1(int input[], int length, int *z)

{

int I, tmp;

for (I = 0; I < length; I++) {

tmp = input[i] - z[0] + (13 * z[1] + 16) / 32; z[1] = z[0];

z[0] = input[i]; input[i] = tmp;

}

}

This technique of replacing references to global data with references to parameters illustrates a general technique that can be used to make virtually any Code reentrant. One simply defines a "state object" as one that contains all of the state necessary for the algorithm; a pointer to this state is passed to the algorithm (along with the input and output data).

typedef struct

PRE_Obj { /* state obj for pre-emphasis alg */ int z0;

int z1; } PRE_Obj;

void

PRE_filter2(PRE_Obj *pre, int input[], int length)

{

int I, tmp;

for (I = 0; I < length; I++)

{

tmp = input[i] - pre->z0 + (13 * pre->z1 + 16) / 32;

pre->z1 = pre->z0; pre->z0 = input[i]; input[i] = tmp;

}

}

Although the C Code looks more complicated than our original implementation, its performance is comparable, it is fully reentrant, and its performance can be configured on a "per data object" basis. Since each state object can be placed in any data memory, it is possible to place some objects in on-chip memory and others in external memory. The pointer to the state object is, in effect, the function'sprivate "data page pointer." All of the function'sdata can be efficiently accessed by a constant offset from this pointer.

Notice that while performance is comparable to our original implementation, it is slightly larger and slower because of the state object redirection. Directly referencing global data is often more efficient than referencing data via an address register. On the other hand, the decrease in efficiency can usually be factored out of the time-critical loop and into the loop-setup Code. Thus, the incremental performance cost is minimal and the benefit is that this same Code can be used in virtually any system—independent of whether the system must support a single channel or multiple channels, or whether it is preemptive or non-preemptive.

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." —Donald Knuth "Structured Programming with go to Statements," Computing Surveys, Vol. 6, No. 4, December, 1974, page 268.

2.3Data Memory

The large performance difference between on-chip data memory and off-chip memory (even 0 wait-state SRAM) is so large that every algorithm vendor designs their Code to operate as much as possible within the on-chip memory. Since the performance gap is expected to increase dramatically in the next 3-5 years, this trend will continue for the foreseeable future. The TMS320C6000 series, for example, incurs a 25 wait state penalty for external SDRAM data memory access. Future processors may see this penalty increase to 80 or even 100 wait states!

SPRU352G –June 2005 –Revised February 2007

General Programming Guidelines

19

Image 19
Contents Rules and Guidelines Users GuideSubmit Documentation Feedback Contents Use of the DMA Resource Urls List of Figures Intended Audience Read This FirstDocument Overview Guideline n Related DocumentationText Conventions Rule nOverview Rules for TMS320C5x Rules for TMS320C6x Scope of the StandardRequirements of the Standard Rules and GuidelinesIntentional Omissions Goals of the StandardFrameworks System ArchitectureCore Run-Time Support AlgorithmsGeneral Programming Guidelines Rule Use of C LanguageThreads and Reentrancy ThreadsReentrancy Preemptive vs. Non-Preemptive MultitaskingExample Data Memory Data MemoryScratch versus Persistent Memory SpacesScratch vs Persistent Memory Allocation Guideline Algorithm versus ApplicationROM-ability Program MemorySection Name Purpose Use of Peripherals Use of PeripheralsInterfaces and Modules Algorithms Packaging Algorithm Component ModelImplementation Fir.h Interfaces and ModulesExternal Identifiers Module Initialization and Finalization Naming ConventionsModule Instance Objects Run-Time Object Creation and Deletion Design-Time Object CreationExample Module Module ConfigurationMultiple Interface Support Description Required Interface InheritanceSummary ElementAlgorithms AlgorithmsObject Code PackagingDebug Verses Release Header FilesModuleversvendorvariant.1arch Data Memory Program Memory Interrupt Latency Execution Time Algorithm Performance CharacterizationExternal Heap MemorySize Static Local and Global Data Memory Stack MemoryData Bss Object files Size Operation Interrupt LatencyExecution Time Mips Is Not EnoughExecution Timeline for Two Periodic Tasks Execution Time Model198000 Process59000 198000 Submit Documentation Feedback DSP-Specific Guidelines Register Types CPU Register TypesData Models Use of Floating PointTMS320C6xxx Rules and Guidelines Endian Byte OrderingCSR Field Use Type Register ConventionsStatus Register Register Use TypeTMS320C54xx Rules and Guidelines Interrupt LatencyProgram Models TMS320C54xx Rules and Guidelines ST0 Field Name Use Type Status RegistersST1 Field Name Use Type Stack Architecture TMS320C55x Rules and GuidelinesPmst Field Name Use Type Example RelocatabilitySSP ST2 Field Name Use Type Status BitsST3 Field Name Use Type Homy General TMS320C24xx GuidelinesTMS320C28x Rules and Guidelines TMS320C28x Rules and GuidelinesXAR0 M0M1MAP Submitting DMA Transfer Requests Use of the DMA ResourceAlgorithm and Framework OverviewLogical Channel Requirements for the Use of the DMA ResourceData Transfer Synchronization Data Transfer PropertiesDMA Guideline Abstract InterfaceDMA Rule Data Transfers bytes Frequency Resource CharacterizationAverage Maximum Strong Ordering of DMA Transfer Requests Runtime APIsDevice Independent DMA Optimization Guideline Submitting DMA Transfer Requests13 C6xxx Specific DMA Rules and Guidelines Cache Coherency Issues for Algorithm Producers14 C55x Specific DMA Rules and Guidelines Supporting Packed/Burst Mode DMA TransfersNon-Preemptive System Minimizing Logical Channel Reconfiguration OverheadAddressing Automatic Endianism Conversion Issues Inter-Algorithm SynchronizationPreemptive System Algorithm B Algorithm a Inter-Algorithm Synchronization Appendix a General Rules DMA Rules Performance Characterization RulesGeneral Guidelines DMA Guidelines Submit Documentation Feedback Core Run-Time APIs DSP/BIOS Run-time Support LibraryTI C-Language Run-Time Support Library DSP/BIOS Run-time Support LibraryBooks BibliographySubmit Documentation Feedback Glossary of Terms GlossaryGlossary of Terms Glossary of Terms Important Notice