Texas Instruments TMS320 DSP manual Stack Memory, Static Local and Global Data Memory


www.ti.com

Data Memory

In the example above, the algorithm requires 960 16-bit words of single-access on-chip memory, 720 16-bit words of external persistent memory, and there are no special alignment requirements for this memory. Note that the entries in this table are not required to be constants; they may be functions of algorithm instance creation parameters.

4.1.2 Stack Memory

In addition to bulk "heap" memory, algorithms often make use of the stack for very efficient allocation of temporary storage. For most real-time systems, the total amount of stack memory for a thread is set once (either when the program is built or when the thread is created) and never changes during execution of the thread. This is done to ensure deterministic execution of the thread. It is important, therefore, that the system integrator know the worst-case stack space requirements for every algorithm.

Rule 20

All algorithms must characterize their worst-case stack space memory requirements (including alignment).

Stack space requirements for an algorithm must be characterized using a table such as that shown below.

 

               Size    Align
Stack Space    400     0

Both the size and alignment fields should be expressed in units of 8-bit bytes. If no special alignment is required, the alignment number should be set to zero.

In the example above, the algorithm requires 400 bytes (200 16-bit words) of stack memory and there is no special alignment requirement for this memory. Note that the entry in this table is not required to be a constant; it may be a function of the algorithm's instance creation parameters.
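Because the stack requirement may depend on creation parameters, a vendor might publish it as a formula, not a single number. The following is a minimal sketch, assuming a hypothetical filter algorithm that keeps one 16-bit temporary per tap on the stack plus a fixed overhead for call frames. The names `FilterParams`, `filterStackBytes`, and `bytesToWords16` and the constants are illustrative assumptions, not part of the standard.

```c
#include <stddef.h>

/* Hypothetical creation parameters for an illustrative filter algorithm. */
typedef struct {
    size_t numTaps;   /* filter length chosen at instance creation */
} FilterParams;

/* Assumed fixed cost of call frames and saved registers, in 8-bit bytes. */
#define FILTER_STACK_FIXED_BYTES 80u

/* Worst-case stack requirement in 8-bit bytes, the unit used in the table. */
size_t filterStackBytes(const FilterParams *params)
{
    /* Assumption: one 16-bit (2-byte) temporary per tap lives on the stack. */
    return FILTER_STACK_FIXED_BYTES + 2u * params->numTaps;
}

/* Convert a byte count to 16-bit words, rounding up. */
size_t bytesToWords16(size_t bytes)
{
    return (bytes + 1u) / 2u;
}
```

With 160 taps this formula gives 400 bytes, or 200 16-bit words, which matches the size of the table entry shown earlier.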

One way to achieve reentrancy in a function is to declare all scratch data objects on the local stack. If the stack is in on-chip memory, this provides easy access to fast scratch memory.

The problem with this approach to reentrancy is that, if carried too far, it may require a very large stack. While this is not a problem for single-threaded applications, traditional multi-threaded applications must allocate a separate stack for each thread. It is unlikely that more than a few of these stacks will fit in on-chip memory. Moreover, even in a single-threaded environment, an algorithm has no control over the placement of the system stack; it may end up with easy access to very slow memory.

These problems can be avoided by algorithms taking advantage of the IALG interface to declare their scratch data memory requirements. This gives the application the chance to decide whether to allocate the memory from the stack or the heap, whichever is best for the system overall.
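As a rough illustration of this idea, the sketch below requests a working buffer through a memory-request record marked as scratch, instead of declaring it as a large local array. The types are simplified stand-ins modelled loosely on the IALG memory table and defined locally so the example compiles on its own. They are not the definitions from the standard's header files, and `exampleAlloc`, `exampleProcess`, and all sizes are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the real IALG definitions. */
typedef enum { MEM_SCRATCH, MEM_PERSIST } MemAttrs;
typedef enum { MEM_ONCHIP, MEM_EXTERNAL } MemSpace;

typedef struct {
    size_t   size;       /* requested size, in units of sizeof(char) */
    size_t   alignment;  /* 0 means no special alignment */
    MemSpace space;      /* preferred memory space */
    MemAttrs attrs;      /* scratch or persistent */
    void    *base;       /* set by the application once memory is assigned */
} MemRec;

/* Hypothetical per-instance state kept between calls. */
typedef struct {
    size_t frameLen;
} ExampleState;

/* Declare memory needs instead of relying on a large local array.
 * Returns the number of records filled in. */
int exampleAlloc(size_t frameLen, MemRec memTab[])
{
    /* Record 0: persistent instance state. */
    memTab[0].size      = sizeof(ExampleState);
    memTab[0].alignment = 0;
    memTab[0].space     = MEM_EXTERNAL;
    memTab[0].attrs     = MEM_PERSIST;
    memTab[0].base      = NULL;

    /* Record 1: scratch working buffer, preferably in fast memory. */
    memTab[1].size      = frameLen * sizeof(short);
    memTab[1].alignment = 0;
    memTab[1].space     = MEM_ONCHIP;
    memTab[1].attrs     = MEM_SCRATCH;
    memTab[1].base      = NULL;

    return 2;
}

/* Use the buffer the application supplied; only small locals are on the stack. */
long exampleProcess(const MemRec memTab[], const short *in, size_t frameLen)
{
    short *work = (short *)memTab[1].base;
    long sum = 0;
    size_t i;

    for (i = 0; i < frameLen; i++) {
        work[i] = (short)(in[i] / 2);   /* halve each sample into scratch */
        sum += work[i];
    }
    return sum;
}
```

In this arrangement the application decides where the buffer behind `memTab[1].base` lives, so the algorithm's own stack use stays small regardless of the frame length.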

Guideline 5

Algorithms should keep stack size requirements to a minimum.

4.1.3 Static Local and Global Data Memory

Static data memory is any data memory that is allocated and placed when the program is built and remains fixed during program execution. In many DSP architectures, there are special instructions that can be used to access static data very efficiently by encoding the address of the data in the instruction's opcode. Therefore, once the program is built, this memory cannot be moved.

Rule 21

Algorithms must characterize their static data memory requirements.
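For illustration, a C module might contain static data of the following kinds. This is a sketch with hypothetical names (`EXAMPLE_window`, `gainTable`, `EXAMPLE_scale`, `EXAMPLE_staticDataSize`). The objects are kept read-only here as an illustrative choice. Each has a size fixed at build time, which is the quantity a static data characterization would report.

```c
#include <stddef.h>

/* Global static data: visible to other modules, placed when the program is built. */
const short EXAMPLE_window[4] = { 1, 3, 3, 1 };

/* File-scope static data: visible only within this source file. */
static const short gainTable[2] = { 1, 2 };

/* Scale a sample using the file-scope table and a static local constant. */
short EXAMPLE_scale(short x, int idx)
{
    /* Static local data: also allocated at build time, not on the stack. */
    static const short offset = 3;

    return (short)(x * gainTable[idx] + offset);
}

/* Total static data used by this module, in units of sizeof(char). */
size_t EXAMPLE_staticDataSize(void)
{
    return sizeof(EXAMPLE_window) + sizeof(gainTable) + sizeof(short);
}
```

Note that `sizeof` counts in units of `char`, which is not 8 bits wide on every DSP, so a figure obtained this way may need converting to the unit used in the published table.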

SPRU352G – June 2005 – Revised February 2007

Algorithm Performance Characterization
