User’s Manual
IBM PowerPC 750GX and 750GL RISC Microprocessor
6.3 Timing Considerations
The 750GX is a superscalar processor; as many as three instructions can be issued to the execution units (one branch instruction to the branch processing unit, and two instructions issued from the dispatch queue to the other execution units) during each clock cycle. Only one instruction can be dispatched to each execution unit per clock cycle.
Although instructions appear to the programmer to execute in program order, the 750GX improves performance by executing multiple instructions at a time, using hardware to manage dependencies. When an instruction is dispatched, the register file or a Rename Register written by a previous instruction provides the source data to the execution unit. The register files and Rename Registers have sufficient bandwidth to allow dispatch of two instructions per clock under most conditions.
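The per-cycle issue limits described above can be sketched as a simple constraint check. This is a minimal illustration, not a model of the actual dispatch logic: the function name `can_issue_together` and the unit labels used in the usage lines are placeholders chosen for this example, and register-file bandwidth and data dependencies are deliberately ignored.

```python
from collections import Counter

MAX_BRANCHES_PER_CYCLE = 1    # one branch to the branch processing unit
MAX_DISPATCHED_PER_CYCLE = 2  # two instructions from the dispatch queue


def can_issue_together(units):
    """Check a group of instructions against the per-cycle issue limits.

    `units` lists the target execution unit of each instruction, with
    "BPU" marking a branch. All other labels are illustrative placeholders.
    """
    branches = [u for u in units if u == "BPU"]
    others = [u for u in units if u != "BPU"]
    if len(branches) > MAX_BRANCHES_PER_CYCLE:
        return False
    if len(others) > MAX_DISPATCHED_PER_CYCLE:
        return False
    # Only one instruction may go to any single execution unit per cycle.
    return all(count == 1 for count in Counter(others).values())


if __name__ == "__main__":
    print(can_issue_together(["BPU", "IU1", "LSU"]))  # one branch + two others
    print(can_issue_together(["IU1", "IU1"]))         # same unit twice
```

Under these assumptions, a branch plus two instructions bound for different units passes the check, while two instructions bound for the same unit do not.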
The 750GX’s BPU decodes and executes branches immediately after they are fetched. When a conditional branch cannot be resolved due to a CR data (or any) dependency, the branch direction is predicted and execution continues on the predicted path. If the prediction is incorrect, the following steps are taken:
1. The instruction queue is purged and fetching continues from the correct path.
2. Any instructions that precede the predicted branch in program order and are still in the completion queue are allowed to complete.
3. Instructions fetched on the mispredicted path of the branch are purged.
4. Fetching resumes along the correct (other) path.
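The recovery steps above can be sketched as a small state update. This is a hedged illustration only: `PipelineState`, `recover_from_mispredict`, and the sequence-number scheme are hypothetical names introduced here, not structures from the 750GX. The sketch assumes that step 2 applies to instructions older than the branch in program order, and it does not model a completion-queue entry for the branch itself.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    # Fetched but not yet dispatched instructions (hypothetical representation).
    instruction_queue: list = field(default_factory=list)
    # (sequence_number, mnemonic) pairs in program order; lower means older.
    completion_queue: list = field(default_factory=list)
    fetch_address: int = 0


def recover_from_mispredict(state, branch_seq, correct_target):
    """Apply the four recovery steps to a simplified pipeline state."""
    # Steps 1 and 3: everything waiting in the instruction queue came from
    # the mispredicted path, so it is discarded.
    state.instruction_queue.clear()
    # Steps 2 and 3: entries older than the branch stay and may complete;
    # entries that follow the branch are purged.
    state.completion_queue = [
        (seq, name) for seq, name in state.completion_queue if seq < branch_seq
    ]
    # Steps 1 and 4: fetching restarts on the correct path.
    state.fetch_address = correct_target
    return state


if __name__ == "__main__":
    state = PipelineState(
        instruction_queue=["wrong_path_a", "wrong_path_b"],
        completion_queue=[(1, "add"), (2, "lwz"), (4, "mulli")],
        fetch_address=0x1010,
    )
    recover_from_mispredict(state, branch_seq=3, correct_target=0x2000)
    print(state)
```

In the usage lines, the branch is given sequence number 3, so the entries numbered 1 and 2 survive and the entry numbered 4, which was fetched after the branch, is dropped.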
After an execution unit finishes executing an instruction, it places resulting data into the appropriate GPR or FPR Rename Register. The results are then stored into the correct GPR or FPR during the
Section 6.3.1 describes this process in greater detail.
6.3.1 General Instruction Flow
As many as four instructions can be fetched into the instruction queue (IQ) in a single clock cycle. Instructions enter the IQ and are issued to the various execution units from the dispatch queue. The 750GX tries to keep the IQ full at all times, unless
The number of instructions requested in a clock cycle is determined by the number of vacant spaces in the IQ during the previous clock cycle. This is shown in the examples in this section. Although the instruction queue can accept as many as four new instructions in a single clock cycle, if only one IQ entry is vacant, only one instruction is fetched. Typically, instructions are fetched from the L1 instruction cache, but they might also be fetched from the branch target instruction cache (BTIC) if a branch is taken. If the instruction request for a taken branch hits in the BTIC, the BTIC can usually present the first two instructions of the new instruction stream in the next clock cycle, giving enough time for the next pair of instructions to be fetched from the L1 instruction cache. This results in no idle cycles in the instruction stream (also known as a
gx_06.fm.(1.2) | Instruction Timing |
March 27, 2006 | Page 215 of 377 |