Texas Instruments TMS320 DSP manual Memory Spaces, Scratch versus Persistent

Page 20

www.ti.com

Data Memory

While the amount of on-chip data memory may be adequate for each algorithm in isolation, the increased number of MIPS available on modern DSPs encourages systems to perform multiple algorithms concurrently with a single chip. Thus, some mechanism must be provided to efficiently share this precious resource among algorithm components from one or more third parties.

2.3.1 Memory Spaces

In an ideal DSP, there would be an unlimited amount of on-chip memory and algorithms would simply always use this memory. In practice, however, the amount of on-chip memory is very limited and there are even two common types of on-chip memory with very different performance characteristics: dual-access memory which allows simultaneous read and write operations in a single instruction cycle, and single access memory that only allows a single access per instruction cycle.

Because of these practical considerations, most DSP algorithms are designed to operate with a combination of on-chip and external memory. This works well when there is sufficient on-chip memory for all the algorithms that need to operate concurrently; the system developer simply dedicates portions of on-chip memory to each algorithm. It is important, however, that no algorithm assume specific region of on-chip memory or contain any "hard Coded" addresses; otherwise the system developer will not be able to optimally allocate the on-chip memory among all algorithms.

Rule 3

Algorithm data references must be fully relocatable (subject to alignment requirements). That is, there must be no “hard-coded”data memory locations.

Note that algorithms can directly access data contained in a static data structure located by the linker. This rule only requires that all such references be done symbolically; i.e., via a relocatable label rather than a fixed numerical address.

In systems where the set of algorithms is not known in advance or when there is insufficient on-chip memory for the worst-case working set of algorithms, more sophisticated management of this precious resource is required. In particular, we need to describe how the on-chip memory can be shared at run-time among an arbitrary number of algorithms.

2.3.2 Scratch versus Persistent

In this section, we develop a general model for sharing regions of memory among algorithms. This model is used to share the on-chip memory of a DSP, for example. This model is essentially a generalization of the technique commonly used by compilers to share CPU registers among functions. Compilers often partition the CPU registers into two groups: "scratch" and "preserve." Scratch registers can be freely used by a function without having to preserve their value upon return from the function. Preserve registers, on the other hand, must be saved prior to being modified and restored prior to returning from the function. By partitioning the register set in this way, significant optimizations are possible; functions do not need to save and restore scratch registers, and callers do not need to save preserve registers prior to calling a function and restore them after the return.

Consider the program execution trace of an application that calls two distinct functions, say a() and b().

Void main()

{

... /* use scratch registers r1 and r2 */

/* call function

 

a() */

 

 

a() {

 

 

... /* use scratch registers r0, r1, and r2 */

 

 

}

 

 

/* call function b()

 

 

*/

 

 

b() {

 

 

... /* use scratch registers r0 and r1*/

 

 

}

 

 

}

 

20

General Programming Guidelines

SPRU352G –June 2005 –Revised February 2007

Image 20
Contents Users Guide Rules and GuidelinesSubmit Documentation Feedback Contents Use of the DMA Resource Urls List of Figures Document Overview Read This FirstIntended Audience Related Documentation Text ConventionsRule n Guideline nOverview Scope of the Standard Rules for TMS320C5x Rules for TMS320C6xRules and Guidelines Requirements of the StandardGoals of the Standard Intentional OmissionsSystem Architecture FrameworksAlgorithms Core Run-Time SupportGeneral Programming Guidelines Use of C Language Threads and ReentrancyThreads Rule Preemptive vs. Non-Preemptive Multitasking ReentrancyExample Data Memory Data MemoryMemory Spaces Scratch versus PersistentScratch vs Persistent Memory Allocation Algorithm versus Application GuidelineSection Name Purpose Program MemoryROM-ability Use of Peripherals Use of PeripheralsAlgorithm Component Model Interfaces and Modules Algorithms PackagingInterfaces and Modules Implementation Fir.hExternal Identifiers Module Instance Objects Naming ConventionsModule Initialization and Finalization Design-Time Object Creation Run-Time Object Creation and DeletionModule Configuration Example ModuleMultiple Interface Support Interface Inheritance SummaryElement Description RequiredAlgorithms AlgorithmsPackaging Object CodeHeader Files Debug Verses ReleaseModuleversvendorvariant.1arch Algorithm Performance Characterization Data Memory Program Memory Interrupt Latency Execution TimeSize Heap MemoryExternal Stack Memory Static Local and Global Data MemoryData Bss Object files Size Interrupt Latency Execution TimeMips Is Not Enough OperationExecution Time Model Execution Timeline for Two Periodic Tasks59000 198000 Process198000 Submit Documentation Feedback DSP-Specific Guidelines CPU Register Types Register TypesUse of Floating Point TMS320C6xxx Rules and GuidelinesEndian Byte Ordering Data ModelsRegister Conventions Status RegisterRegister Use Type CSR Field Use TypeProgram Models Interrupt LatencyTMS320C54xx Rules and Guidelines TMS320C54xx Rules and Guidelines ST1 Field Name Use Type Status RegistersST0 Field Name Use Type Pmst Field Name Use Type TMS320C55x Rules and GuidelinesStack Architecture Relocatability ExampleSSP ST3 Field Name Use Type Status BitsST2 Field Name Use Type Homy TMS320C24xx Guidelines GeneralTMS320C28x Rules and Guidelines TMS320C28x Rules and GuidelinesXAR0 M0M1MAP Use of the DMA Resource Submitting DMA Transfer RequestsOverview Algorithm and FrameworkRequirements for the Use of the DMA Resource Logical ChannelData Transfer Properties Data Transfer SynchronizationDMA Rule Abstract InterfaceDMA Guideline Average Maximum Resource CharacterizationData Transfers bytes Frequency Runtime APIs Strong Ordering of DMA Transfer RequestsSubmitting DMA Transfer Requests Device Independent DMA Optimization GuidelineCache Coherency Issues for Algorithm Producers 13 C6xxx Specific DMA Rules and GuidelinesSupporting Packed/Burst Mode DMA Transfers 14 C55x Specific DMA Rules and GuidelinesMinimizing Logical Channel Reconfiguration Overhead Addressing Automatic Endianism Conversion IssuesInter-Algorithm Synchronization Non-Preemptive SystemPreemptive System Algorithm B Algorithm a Inter-Algorithm Synchronization Appendix a General Rules Performance Characterization Rules DMA RulesGeneral Guidelines DMA Guidelines Submit Documentation Feedback DSP/BIOS Run-time Support Library Core Run-Time APIsDSP/BIOS Run-time Support Library TI C-Language Run-Time Support LibraryBibliography BooksSubmit Documentation Feedback Glossary Glossary of TermsGlossary of Terms Glossary of Terms Important Notice