Intel® IXP42X product line and IXC1100 control plane processors—Intel XScale® Processor
Intel® IXP42X Product Line of Network Processors and IXC1100 Control Plane Processor
DM September 2006
170 Order Number: 252480-006US
While instructions are issued in order, the main execution, memory, and MAC
pipelines are not lock-stepped and therefore have different execution times. This
means that instructions may finish out of program order: short ‘younger’ instructions
may finish earlier than long ‘older’ ones. (The term ‘to finish’ is used here to
indicate that the operation has been completed and the result has been written back to
the register file.)
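
As an illustration only, the following Python sketch models in-order issue with out-of-order finish. The latencies in ASSUMED_LATENCY and the function name finish_order are hypothetical values chosen for the example; they are not the processor's actual timings, and the sketch assumes no data hazards between the instructions.

```python
# Toy model: one instruction issues per cycle, in program order, but the
# cycle in which it finishes depends on which pipeline executes it.
# These latencies are illustrative assumptions, not real XScale timings.
ASSUMED_LATENCY = {"alu": 1, "load": 3, "mac": 4}


def finish_order(program):
    """Return instruction texts sorted by the cycle in which they finish.

    `program` is a list of (text, kind) pairs in program order, where
    `kind` is one of the keys of ASSUMED_LATENCY.
    """
    finished = []
    for issue_cycle, (text, kind) in enumerate(program):
        finish_cycle = issue_cycle + ASSUMED_LATENCY[kind]
        # Ties are broken by issue order (older instruction first).
        finished.append((finish_cycle, issue_cycle, text))
    finished.sort()
    return [text for _, _, text in finished]


program = [
    ("mul r0, r1, r2", "mac"),   # oldest, longest
    ("add r3, r4, r5", "alu"),   # younger, short
    ("ldr r6, [r7]", "load"),    # youngest
]
print(finish_order(program))
# → ['add r3, r4, r5', 'mul r0, r1, r2', 'ldr r6, [r7]']
```

In this sketch the younger add finishes before the older mul, even though both were issued in program order.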
3.10.2.1.4 Register Scoreboarding
In certain situations, the pipeline may need to be stalled because of register
dependencies between instructions. A register dependency occurs when a previous
MAC or load instruction is about to modify a register whose new value has not yet been
returned to the register file, and the current instruction needs access to that same
register. Only the destinations of MAC operations and memory loads are scoreboarded;
the destinations of ALU instructions are not.
If no register dependencies exist, the pipeline will not be stalled. For example, if a load
operation has missed the data cache, subsequent instructions that do not depend on
the load may complete independently.
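
The scoreboarding rule described above can be sketched as a small Python model. The class name Scoreboard and its methods are hypothetical, and the model assumes that "access" covers both reading a source register and writing a destination register; it is a simplified illustration, not a description of the hardware logic.

```python
class Scoreboard:
    """Tracks registers that have a MAC or load result still outstanding."""

    def __init__(self):
        self.pending = set()

    def issue(self, kind, dest, sources):
        """Return True if the instruction may issue, False if it must stall.

        `kind` is "mac", "load", or "alu". Assumption: an instruction
        stalls if any register it reads or writes is still pending.
        """
        if self.pending & (set(sources) | {dest}):
            return False
        # Only MAC and load destinations are scoreboarded; ALU results are not.
        if kind in ("mac", "load"):
            self.pending.add(dest)
        return True

    def writeback(self, reg):
        """Mark `reg` as returned to the register file."""
        self.pending.discard(reg)


sb = Scoreboard()
print(sb.issue("load", "r0", ["r5"]))       # load that misses the cache: issues
print(sb.issue("alu", "r1", ["r2", "r3"]))  # independent of r0: issues
print(sb.issue("alu", "r4", ["r0", "r1"]))  # needs r0: stalls
sb.writeback("r0")                          # load data returns
print(sb.issue("alu", "r4", ["r0", "r1"]))  # dependency resolved: issues
```

The second instruction proceeds while the load is outstanding because it touches no scoreboarded register; only the instruction that reads r0 has to wait.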
3.10.2.1.5 Use of Bypassing
The pipeline of the IXP42X product line and IXC1100 control plane processors makes
extensive use of bypassing to minimize data hazards. Bypassing allows results to be
forwarded from multiple sources, eliminating the need to stall the pipeline.
3.10.2.2 Instruction Flow Through the Pipeline
The IXP42X product line and IXC1100 control plane processors’ pipeline issues a single
instruction per clock cycle. Instruction execution begins at the F1 pipe stage and
completes at the WB pipe stage.
Although a single instruction may be issued per clock cycle, all three pipelines (MAC,
memory, and main execution) may be processing instructions simultaneously. If there
are no data hazards, then each instruction may complete independently of the others.
With the exception of the MAC unit, each pipe stage takes a single clock cycle (machine
cycle) to perform its subtask.
3.10.2.2.1 ARM* V5TE Instruction Execution
Figure 29 on page 169 uses arrows to show the possible flow of instructions in the
pipeline. Instruction execution flows from the F1 pipe stage to the RF pipe stage. The
RF pipe stage may issue a single instruction to either the X1 pipe stage or the MAC unit
(multiply instructions go to the MAC, while all others continue to X1). This means that,
in any given cycle, either M1 or X1 will be idle.
All load/store instructions are routed to the memory pipeline after the effective
addresses have been calculated in X1.
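
The routing just described can be summarized in a short Python sketch. The function name route and the mnemonic sets are hypothetical and deliberately incomplete (condition codes and other suffixes are ignored); the sketch only restates the rule that multiplies go to the MAC, everything else goes to X1, and loads/stores continue to the memory pipeline after X1.

```python
# Partial mnemonic lists, chosen for illustration only.
MULTIPLY = {"mul", "mla", "smull", "umull", "smlal", "umlal"}
LOAD_STORE = {"ldr", "str", "ldrb", "strb", "ldm", "stm"}


def route(mnemonic):
    """Return the path an instruction takes once it reaches the RF stage."""
    op = mnemonic.lower()
    if op in MULTIPLY:
        return ["RF", "MAC"]            # X1 is idle for this issue slot
    if op in LOAD_STORE:
        return ["RF", "X1", "memory"]   # effective address computed in X1
    return ["RF", "X1"]                 # MAC is idle for this issue slot


for op in ("mul", "ldr", "add"):
    print(op, "->", " -> ".join(route(op)))
```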
The ARM V5TE bx (branch and exchange) instruction, which is used to branch between
ARM and Thumb code, causes the entire pipeline to be flushed. (The bx instruction is not
dynamically predicted by the BTB.) If the processor is in Thumb mode, the ID pipe
stage dynamically expands each Thumb instruction into a normal ARM V5TE RISC
instruction and execution resumes as usual.
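
To make the idea of Thumb expansion concrete, the Python sketch below maps one Thumb encoding, MOV Rd, #imm8, to its architecturally equivalent 32-bit ARM encoding, MOVS Rd, #imm8. The function name expand_thumb_mov_imm is hypothetical, and the sketch shows only the encoding equivalence for this single instruction form; it makes no claim about how the ID stage is implemented in hardware.

```python
def expand_thumb_mov_imm(halfword):
    """Expand a 16-bit Thumb 'MOV Rd, #imm8' into the equivalent ARM word.

    Thumb encoding: 00100 Rd(3 bits) imm8.
    ARM encoding:   cond=AL, MOVS Rd, #imm8 -> 0xE3B00000 | Rd << 12 | imm8.
    """
    if (halfword >> 11) != 0b00100:
        raise ValueError("not a Thumb MOV Rd, #imm8 encoding")
    rd = (halfword >> 8) & 0x7    # Thumb can only name r0-r7 here
    imm8 = halfword & 0xFF
    return 0xE3B00000 | (rd << 12) | imm8


print(hex(expand_thumb_mov_imm(0x2001)))  # Thumb MOV r0, #1 → 0xe3b00001
print(hex(expand_thumb_mov_imm(0x2305)))  # Thumb MOV r3, #5 → 0xe3b03005
```

The S bit is set in the ARM form because this Thumb MOV always updates the condition flags.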