MCF548x Reference Manual, Rev. 3
7-16 Freescale Semiconductor

7.9.4.2 Cache Pushes

Cache pushes occur for line replacement and as required for the execution of the CPUSHL instruction. To
reduce the requested data’s latency in the new line, the modified line being replaced is temporarily placed
in the push buffer while the new line is fetched from memory. After the bus transfer for the new line
completes, the modified cache line is written back to memory and the push buffer is invalidated.

7.9.4.2.1 Push and Store Buffers

The 16-byte push buffer reduces latency for requested new data on a cache miss by holding a displaced
modified data cache line while the new data is read from memory.
If a cache miss displaces a modified line, a miss read reference is immediately generated. While waiting
for the response, the current contents of the cache location load into the push buffer. When the burst-read
bus transaction completes, the cache controller can generate the appropriate line-write bus transaction to
write the push buffer contents into memory.
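
The ordering described above can be illustrated with a small host-side model. This is only a sketch: the event names and the function miss_bus_sequence() are hypothetical and do not correspond to any hardware signal or register. It shows that the miss read is issued first and that the push buffer write-back follows only when the displaced line was modified.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event codes for this sketch; not hardware-defined names. */
enum bus_event { EV_LINE_READ, EV_PUSH_WRITE };

/*
 * Record the bus transactions generated by a data cache miss.
 * The burst read for the new line is always issued first. If the displaced
 * line was modified, its write-back from the push buffer follows the read.
 * Returns the number of events stored in out[], which must hold 2 entries.
 */
size_t miss_bus_sequence(bool displaced_line_modified, enum bus_event out[2])
{
    size_t n = 0;

    out[n++] = EV_LINE_READ;        /* miss read is generated immediately */
    if (displaced_line_modified)
        out[n++] = EV_PUSH_WRITE;   /* push buffer drains after the read */
    return n;
}
```

For a modified line the model returns the read followed by the push write; for an unmodified line it returns the read alone, because nothing is placed in the push buffer.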
In imprecise mode, the FIFO store buffer can defer pending writes to maximize performance. The store
buffer can support as many as four entries (16 bytes maximum) for this purpose.
Data writes destined for the store buffer cannot stall the core. The store buffer effectively provides a
measure of decoupling between the pipeline’s ability to generate writes (one per cycle maximum) and the
external bus’s ability to retire those writes. In imprecise mode, writes stall only if the store buffer is full
and a write operation is on the internal bus. The internal write cycle is held, stalling the data execution
pipeline.
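
The decoupling provided by the store buffer can be sketched as a four-entry FIFO in software. The structure and function names below (store_buffer, sb_post, sb_retire) are hypothetical and exist only for this illustration; the model assumes one entry per posted write and ignores bus timing. A false return from sb_post() stands for the stall condition in which the buffer is full and another write is presented.

```c
#include <stdbool.h>
#include <stdint.h>

#define SB_ENTRIES 4u   /* four entries of as many as 4 bytes each */

/* One pending write; the layout is an assumption made for this sketch. */
struct sb_entry {
    uint32_t addr;
    uint32_t data;
    uint8_t  size;      /* 1, 2, or 4 bytes */
};

struct store_buffer {
    struct sb_entry e[SB_ENTRIES];
    unsigned head;      /* index of the oldest entry */
    unsigned count;     /* number of valid entries */
};

/* Pipeline side: returns false when the buffer is full (write must stall). */
bool sb_post(struct store_buffer *sb, uint32_t addr, uint32_t data, uint8_t size)
{
    if (sb->count == SB_ENTRIES)
        return false;

    unsigned tail = (sb->head + sb->count) % SB_ENTRIES;
    sb->e[tail] = (struct sb_entry){ addr, data, size };
    sb->count++;
    return true;
}

/* Bus side: retire the oldest entry in FIFO order; false if empty. */
bool sb_retire(struct store_buffer *sb, struct sb_entry *out)
{
    if (sb->count == 0)
        return false;

    *out = sb->e[sb->head];
    sb->head = (sb->head + 1) % SB_ENTRIES;
    sb->count--;
    return true;
}
```

Four consecutive calls to sb_post() succeed on an empty buffer; a fifth fails until sb_retire() frees an entry, which mirrors the pipeline stalling until the external bus retires a write.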
If the store buffer is not used (that is, store buffer disabled or cache-inhibited precise mode), external bus
cycles are generated directly for each pipeline write operation. The instruction is held in the pipeline until
external bus transfer termination is received. Therefore, each write is stalled for 5 cycles, making the
minimum write time equal to 6 cycles when the store buffer is not used. See Section 3.2.1.2, “Operand
Execution Pipeline (OEP).”
The data store buffer enable bit, CACR[DESB], enables the data store buffer. This bit can
be set and cleared with the MOVEC instruction. DESB is cleared at reset, so all writes are performed in
order (precise mode). When DESB is set, ACRn[CM] or CACR[DDCM] determines the mode used. Cacheable
write-through and cache-inhibited imprecise modes use the store buffer.
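
A minimal sketch of setting and clearing DESB from C follows. It rests on several assumptions that should be checked against the CACR register description: that DESB is bit 29 of CACR, that software keeps a shadow copy of CACR because the value is written with MOVEC rather than read back, and that the code is built with a GCC-compatible compiler that defines __mcoldfire__ for ColdFire targets. The helper names cacr_set_desb() and cacr_write() are hypothetical. MOVEC is a supervisor instruction, so cacr_write() must run in supervisor mode.

```c
#include <stdint.h>

/* ASSUMPTION: DESB is CACR bit 29; confirm in the CACR register description. */
#define CACR_DESB (UINT32_C(1) << 29)

/* Return the shadow CACR value with the data store buffer enabled or disabled. */
uint32_t cacr_set_desb(uint32_t cacr_shadow, int enable)
{
    return enable ? (cacr_shadow | CACR_DESB) : (cacr_shadow & ~CACR_DESB);
}

/*
 * Write CACR with MOVEC. On builds that are not for a ColdFire target this
 * function does nothing, so the helper above can be exercised on a host.
 */
void cacr_write(uint32_t value)
{
#if defined(__mcoldfire__)
    __asm__ volatile ("movec %0,%%cacr" : : "r" (value) : "memory");
#else
    (void)value;
#endif
}
```

A typical use would be cacr_write(cacr_set_desb(shadow, 1)), with shadow updated to the returned value so that later changes to other CACR fields preserve the DESB setting.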
Each store buffer entry holds as many as 4 bytes of data and corresponds to one of the bus cycles
the write generates. A misaligned longword write to a write-through region therefore creates two entries
if the address is on an odd-word boundary and three entries if it is on an odd-byte boundary, one entry
per bus cycle.
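
The entry counts above can be reproduced with a short calculation. The sketch below assumes that each bus cycle transfers the largest naturally aligned byte, word, or longword that fits in the remaining data; that decomposition rule is an assumption made for illustration, and the function name sb_entries_for_write() is hypothetical.

```c
#include <stdint.h>

/*
 * Count the bus cycles, and therefore store buffer entries, needed for a
 * write of `size` bytes (1, 2, or 4) starting at `addr`.
 * ASSUMPTION: each cycle moves the largest naturally aligned chunk
 * (4, 2, or 1 bytes) that does not exceed the remaining size.
 */
unsigned sb_entries_for_write(uint32_t addr, unsigned size)
{
    unsigned entries = 0;

    while (size > 0) {
        unsigned chunk = 4;

        /* Shrink the chunk until it fits and is aligned; 1 always qualifies. */
        while (chunk > size || (addr % chunk) != 0)
            chunk /= 2;

        addr += chunk;
        size -= chunk;
        entries++;
    }
    return entries;
}
```

Under this assumption, a longword write to 0x1000 uses one entry, a longword write to 0x1002 (odd-word boundary) uses two word entries, and a longword write to 0x1001 (odd-byte boundary) uses three entries: byte, word, byte.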

7.9.4.2.2 Push and Store Buffer Bus Operation

As soon as the push or store buffer has valid data, the internal bus controller uses the next available external
bus cycle to generate the appropriate write cycles. If another cache fill is required (for
example, to process a cache miss) while the processor pipeline continues executing instructions, the
pipeline stalls until the push and store buffers are empty and then generates the required external bus
transaction.
Supervisor instructions, the NOP instruction, and exception processing synchronize the processor core and
guarantee the push and store buffers are empty before proceeding. Note that the NOP instruction should
be used only to synchronize the pipeline. The preferred no-operation function is the TPF instruction. See
the ColdFire Programmer’s Reference Manual for more information on the TPF instruction.
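
As an illustration of using NOP for synchronization, the sketch below writes a register and then executes NOP so that the push and store buffers are empty before execution continues. It assumes a GCC-compatible compiler with inline assembly and the __mcoldfire__ predefined macro; on other targets it falls back to a compiler-only barrier, which does not drain any hardware buffer. The function names sync_write_buffers() and write_reg_synced() are hypothetical.

```c
#include <stdint.h>

/*
 * Synchronize the pipeline so that pending buffered writes have completed.
 * On ColdFire builds this issues NOP; elsewhere it is only a compiler
 * barrier, kept so the example can be compiled on a host.
 */
static inline void sync_write_buffers(void)
{
#if defined(__mcoldfire__)
    __asm__ volatile ("nop" : : : "memory");
#else
    __asm__ volatile ("" : : : "memory");
#endif
}

/* Write a register, then wait until the write has left the buffers. */
void write_reg_synced(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
    sync_write_buffers();
}
```

This pattern is mainly of interest in imprecise mode, where a write may still be queued in the store buffer when the next instruction executes.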