Intel® IXP42X Product Line of Network Processors and IXC1100 Control Plane Processor
September 2006 DM
Order Number: 252480-006US
Intel XScale® Processor—Intel® IXP42X product line and IXC1100 control plane processors
3.10.4.2.5 Mini-Data Cache
The mini-data cache is best used for data structures that have short temporal lives
or that cover vast amounts of data space. Accessing such data through the data cache
would evict much, if not all, of the valuable data held there, and that eviction
reduces performance. Placing this data in a mini-data cache memory region instead
prevents pollution of the data cache while still providing the benefits of cached
accesses.
A prime example of using the mini-data cache is caching the procedure call stack. The
stack can be allocated to the mini-data cache so that its use does not thrash the main
data cache. This keeps local variables separate from global data.
The following are examples of data that could be assigned to the mini-data cache:
• The stack space of a frequently occurring interrupt. The stack is used only for the
  duration of the interrupt, which is usually very short.
• Video buffers. These are usually large and can occupy the whole cache.
Overuse of the mini-data cache will thrash the cache. This is easy to do because the
mini-data cache has only two ways per set. For example, a loop that executes a simple
statement such as A[i] = B[i] + C[i], where A, B, and C reside in a mini-data cache
memory region and each array is aligned on a 1-K boundary, will quickly thrash the
cache.
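To see why three arrays aligned this way collide, it can help to compute the set that each address selects. The sketch below assumes the mini-data cache geometry of the Intel XScale core (2 Kbytes, two ways, 32-byte lines, and therefore 32 sets). That geometry, the constant names, the helper functions, and the base addresses in the note that follows are illustrative assumptions and are not stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mini-data cache geometry (not given in this section). */
#define MINI_DC_SIZE 2048u /* total bytes */
#define MINI_DC_WAYS 2u    /* ways per set */
#define MINI_DC_LINE 32u   /* bytes per cache line */
#define MINI_DC_SETS (MINI_DC_SIZE / (MINI_DC_WAYS * MINI_DC_LINE))

static_assert(MINI_DC_SETS == 32u, "assumed geometry should give 32 sets");

/* Set selected by an address: cache line number modulo the number of sets. */
unsigned mini_dc_set(uint32_t addr)
{
    return (addr / MINI_DC_LINE) % MINI_DC_SETS;
}

/* Returns 1 when all three addresses select the same set. With only two
 * ways per set, three such lines cannot be resident at the same time. */
int three_way_conflict(uint32_t a, uint32_t b, uint32_t c)
{
    return mini_dc_set(a) == mini_dc_set(b) &&
           mini_dc_set(b) == mini_dc_set(c);
}
```

With hypothetical base addresses 0x10000, 0x10400, and 0x10800 (each on a 1-K boundary), A[i], B[i], and C[i] select the same set for every i, so three lines compete for two ways on every iteration. Offsetting B and C by one and two cache lines (0x10420 and 0x10840) places the three streams in different sets under the same assumptions.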
3.10.4.2.6 Data Alignment
Cache lines begin on 32-byte address boundaries. To maximize cache line use and
minimize cache pollution, data structures should be aligned on 32-byte boundaries and
sized to a multiple of the cache line size. Aligning data structures on cache line
boundaries also simplifies the later addition of prefetch instructions to optimize
performance.
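One way to express this alignment rule in source code is sketched below. It assumes a C11 compiler (alignas and static_assert); the structure packet_desc, its fields, the array desc_ring, and the helper is_line_aligned are hypothetical names used only for illustration. Toolchains without C11 support would need a compiler-specific alignment attribute instead.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u /* cache line size stated in this section */

/* Hypothetical record: 24 bytes of fields, padded by the compiler to one
 * full cache line because the first member requests 32-byte alignment. */
struct packet_desc {
    alignas(CACHE_LINE) uint32_t addr;
    uint32_t len;
    uint32_t flags;
    uint32_t reserved[3];
};

/* Compile-time checks: line-aligned and sized to a whole number of lines. */
static_assert(alignof(struct packet_desc) == CACHE_LINE, "not line aligned");
static_assert(sizeof(struct packet_desc) % CACHE_LINE == 0, "not line sized");

/* Every element of this array starts on a cache line boundary. */
struct packet_desc desc_ring[8];

/* Returns 1 when the pointer lies on a cache line boundary. */
int is_line_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}
```

Because each element occupies exactly one line, touching one descriptor never pulls part of a neighboring descriptor into the cache, and a prefetch of any member brings in the whole element.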
Not aligning data on cache lines has the disadvantage of shifting the prefetch address
by the amount of the misalignment. Consider the following example, an array of
structures tdata[] and a loop that prefetches the next element while processing the
current one. In this case, if tdata[] is not aligned to a cache line, then the prefetch
using the address of tdata[i+1].ia may not include element id. If the array were
aligned on a cache line + 12 bytes, then the prefetch would have to be placed on
&tdata[i+1].id.
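The effect of the misalignment can be checked with a little address arithmetic. The sketch below is an illustration only: it assumes that long is 32 bits wide, so that one element occupies 16 bytes, and models this with int32_t. The names tdata_elem, line_of, and prefetch_ia_covers_id are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u /* cache line size stated in this section */

/* Mirror of one tdata[] element, using int32_t to model a 32-bit long. */
struct tdata_elem {
    int32_t ia;
    int32_t ib;
    int32_t ic;
    int32_t id;
};

static_assert(sizeof(struct tdata_elem) == 16, "element assumed to be 16 bytes");

/* Cache line number that holds the member at member_off of element index,
 * for an array that starts base_off bytes past a cache line boundary. */
unsigned line_of(unsigned base_off, unsigned index, size_t member_off)
{
    size_t byte = base_off + index * sizeof(struct tdata_elem) + member_off;
    return (unsigned)(byte / CACHE_LINE);
}

/* Returns 1 when a prefetch of &tdata[index].ia also covers tdata[index].id,
 * that is, when both members fall in the same cache line. */
int prefetch_ia_covers_id(unsigned base_off, unsigned index)
{
    return line_of(base_off, index, offsetof(struct tdata_elem, ia)) ==
           line_of(base_off, index, offsetof(struct tdata_elem, id));
}
```

Under these assumptions an aligned array (base_off of 0) keeps every element within one line. With base_off of 12, every odd-numbered element has ia at the end of one line and ib, ic, and id in the next, which is the case in which the prefetch has to target id rather than ia.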
/* Loop for Section 3.10.4.2.5: A, B, and C are each aligned on a 1-K boundary */
for (i = 0; i < IMAX; i++)
{
    A[i] = B[i] + C[i];
}
/* Example for Section 3.10.4.2.6: prefetching the next tdata[] element */
struct {
    long ia;
    long ib;
    long ic;
    long id;
} tdata[IMAX];

for (i = 0; i < IMAX; i++)
{
    PREFETCH(tdata[i+1]);
    tdata[i].ia = tdata[i].ib + tdata[i].ic - tdata[i].id;
    ....
    tdata[i].id = 0;
}
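PREFETCH is not defined in this section. As a minimal sketch, assuming a GCC or Clang toolchain, it could be mapped to the __builtin_prefetch builtin, which emits a prefetch hint where the target supports one. The array length, the extra trailing element, the function name process_tdata, and the arithmetic in the loop body are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define IMAX 64 /* hypothetical array length */

/* Assumed definition: GCC/Clang builtin taking the address to prefetch. */
#define PREFETCH(x) __builtin_prefetch(&(x))

/* int32_t models a 32-bit long, giving a 16-byte element. */
struct tdata_elem {
    int32_t ia;
    int32_t ib;
    int32_t ic;
    int32_t id;
};

static_assert(sizeof(struct tdata_elem) == 16, "element assumed to be 16 bytes");

/* One spare element so that &tdata[i + 1] is a valid address on the
 * final iteration, when i + 1 equals IMAX. */
struct tdata_elem tdata[IMAX + 1];

void process_tdata(void)
{
    for (int i = 0; i < IMAX; i++) {
        PREFETCH(tdata[i + 1]); /* request the next element early */
        /* Arithmetic chosen for illustration only. */
        tdata[i].ia = tdata[i].ib + tdata[i].ic - tdata[i].id;
        tdata[i].id = 0;
    }
}
```

The prefetch takes the address of the whole element, which is also the address of its first member ia, so the alignment concern described in this section applies to this form as well.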