www.ti.com
Data Memory
While the amount of on-chip data memory may be adequate for each algorithm in isolation, the increased number of MIPS available on modern DSPs encourages systems to perform multiple algorithms concurrently with a single chip. Thus, some mechanism must be provided to efficiently share this precious resource among algorithm components from one or more third parties.
2.3.1 Memory Spaces
In an ideal DSP, there would be an unlimited amount of on-chip memory and algorithms would simply always use this memory. In practice, however, the amount of on-chip memory is very limited and there are even two common types of on-chip memory with very different performance characteristics: dual-access memory which allows simultaneous read and write operations in a single instruction cycle, and single access memory that only allows a single access per instruction cycle.
Because of these practical considerations, most DSP algorithms are designed to operate with a combination of on-chip and external memory. This works well when there is sufficient on-chip memory for all the algorithms that need to operate concurrently; the system developer simply dedicates portions of on-chip memory to each algorithm. It is important, however, that no algorithm assume a specific region of on-chip memory or contain any "hard-coded" addresses; otherwise the system developer will not be able to optimally allocate the on-chip memory among all algorithms.
Rule 3
Algorithm data references must be fully relocatable (subject to alignment requirements). That is, there must be no "hard-coded" data memory locations.
Note that algorithms can directly access data contained in a static data structure located by the linker. This rule only requires that all such references be done symbolically; i.e., via a relocatable label rather than a fixed numerical address.
In systems where the set of algorithms is not known in advance or when there is insufficient on-chip memory for the worst-case working set of algorithms, more sophisticated management of this precious resource is required. In particular, we need to describe how the on-chip memory can be shared at run-time among an arbitrary number of algorithms.
2.3.2 Scratch versus Persistent
In this section, we develop a general model for sharing regions of memory among algorithms. This model is used to share the on-chip memory of a DSP, for example. This model is essentially a generalization of the technique commonly used by compilers to share CPU registers among functions. Compilers often partition the CPU registers into two groups: "scratch" and "preserve." Scratch registers can be freely used by a function without having to preserve their value upon return from the function. Preserve registers, on the other hand, must be saved prior to being modified and restored prior to returning from the function. By partitioning the register set in this way, significant optimizations are possible; functions do not need to save and restore scratch registers, and callers do not need to save preserve registers prior to calling a function and restore them after the return.
Consider the program execution trace of an application that calls two distinct functions, say a() and b().
void main()
{
    ... /* use scratch registers r1 and r2 */
    /* call function a() */
    a() {
        ... /* use scratch registers r0, r1, and r2 */
    }
    /* call function b() */
    b() {
        ... /* use scratch registers r0 and r1 */
    }
}
General Programming Guidelines (SPRU352G, June 2005, Revised February 2007)