Intel® IXP42X product line and IXC1100 control plane processors—Intel XScale® Processor
Intel® IXP42X Product Line of Network Processors and IXC1100 Control Plane Processor
DM September 2006
184 Order Number: 252480-006US
If the structure is not sized to a multiple of the cache line size, then the prefetch
address must be advanced appropriately and will require extra prefetch instructions.
Consider the following example:
In this case, the prefetch address is advanced by half a cache line (16 bytes) on each
iteration, so every other prefetch instruction is ignored because it falls in a cache line
that has already been requested. In addition, an extra register is required to track the
next prefetch address.
In general, data that is not aligned and sized to cache lines adds computational overhead.
Additional prefetch considerations are discussed in greater detail in the following sections.
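As a rough sketch of the alternative, the structure can be padded to exactly one cache
line so that a single prefetch covers each element and no separate prefetch address has
to be tracked. The names padded_elem and process are hypothetical, int32_t stands in for
long to make the field size explicit, and __builtin_prefetch is a GCC/Clang builtin used
here in place of the PREFETCH pseudocode; none of these come from this manual. The sketch
also assumes the array itself starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* cache line size in bytes, as stated in this section */

/* Hypothetical padded variant of the example structure: five 32-bit
 * fields (20 bytes) plus 12 bytes of padding fill one cache line. */
typedef struct {
    int32_t ia, ib, ic, id, ie;
    uint8_t pad[CACHE_LINE - 5 * sizeof(int32_t)];
} padded_elem;

static_assert(sizeof(padded_elem) == CACHE_LINE, "one element per cache line");

/* Sum-and-clear loop that issues one prefetch hint per element. */
void process(padded_elem *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            __builtin_prefetch(&t[i + 1]);  /* hint only; assumed GCC/Clang */
        t[i].ia = t[i].ib + t[i].ic + t[i].id + t[i].ie;
        t[i].ie = 0;
    }
}
```

The trade-off is 12 bytes of padding per element in exchange for one useful prefetch per
iteration; whether that is worthwhile depends on how much of the array is actually touched.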
3.10.4.2.7 Literal Pools
The IXP42X product line and IXC1100 control plane processors do not have a single
instruction that can move every literal (a constant or an address) into a register. One
technique for loading a literal into a register is to load it from a memory location that
has been initialized with the constant or address. These blocks of constants are referred
to as literal pools. See “Basic Optimizations” on page 173 for more information on how to
do this. It is advantageous to place all the literals together in a single pool. Literal
pools are located in the text (code) address space so that they can be loaded using
PC-relative addressing. However, references to the literal pool load the data into the
data cache rather than the instruction cache. A literal may therefore be present in both
the data and instruction caches, which wastes cache space.
For maximum efficiency, the compiler should align all literal pools on cache line
boundaries and size each pool to a multiple of 32 bytes (the size of a cache line). One
additional optimization is to group frequently used literals into the same cache line.
The advantage is that once one of the literals has been loaded, the other seven in that
line are immediately available from the data cache.
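The sizing arithmetic can be sketched in a few lines of C. This is only an illustration:
the helper names (literals_per_line, padded_pool_bytes, literal_line) are hypothetical,
and LITERAL_SIZE assumes every literal is one 32-bit word, which is what makes eight
literals fit in a 32-byte line.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE   32u  /* bytes per cache line, as stated in this section */
#define LITERAL_SIZE 4u   /* assumption: each literal is one 32-bit word */

static_assert(CACHE_LINE % LITERAL_SIZE == 0, "literals must tile a cache line");

/* Number of literals that share one cache line: 32 / 4 = 8. */
size_t literals_per_line(void)
{
    return CACHE_LINE / LITERAL_SIZE;
}

/* Pool size in bytes, rounded up to a whole number of cache lines. */
size_t padded_pool_bytes(size_t n_literals)
{
    size_t raw = n_literals * LITERAL_SIZE;
    return (raw + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
}

/* Index of the cache line, within an aligned pool, that holds literal idx. */
size_t literal_line(size_t idx)
{
    return idx / literals_per_line();
}
```

For example, a pool of nine literals occupies 36 bytes and would be padded to 64, and
literals 0 through 7 share a line, so placing the most frequently used literals in those
slots means one data cache fill serves all of them.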
3.10.4.3 Cache Considerations
3.10.4.3.1 Cache Conflicts, Pollution, and Pressure
Cache pollution occurs when data that is never used is loaded into the cache. Cache
pressure occurs when data that is not temporal to the current process (that is, data the
process will not reuse soon) is loaded into the cache.
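/* Example: each element is 20 bytes, which is not a multiple of the
   32-byte cache line. ADDRESS and PREFETCH are pseudocode. */
struct {
    long ia;
    long ib;
    long ic;
    long id;
    long ie;
} tdata[IMAX];
ADDRESS preadd = tdata;
for (i = 0; i < IMAX; i++)
{
    PREFETCH(preadd += 16);   /* advance by half a cache line */
    tdata[i].ia = tdata[i].ib + tdata[i].ic + tdata[i].id +
                  tdata[i].ie;
    ....
    tdata[i].ie = 0;
}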