Architecture

In addition to the normal Branch instructions, the R3900 Processor Core provides Branch Likely instructions, which allow the instruction at the branch target address to be placed in the delay slot. If the branch condition of a Branch Likely instruction is met, the instruction in the delay slot is executed and the branch is taken. If the branch is not taken, the instruction in the delay slot is treated as a NOP. With the R3000A, which does not support Branch Likely instructions, the only instructions that can be placed in the delay slot are those whose execution does no harm when the branch is not taken.

If no instruction is placed in the delay slot, a NOP is placed just after the branch instruction.
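The delay-slot rules above can be summarized in a short Python sketch. It is an illustrative model only, not processor code; the function name delay_slot_executes and its parameters are assumptions introduced here.

```python
def delay_slot_executes(branch_taken: bool, likely: bool) -> bool:
    """Return True if the delay-slot instruction is executed.

    likely=True models a Branch Likely instruction,
    likely=False models a normal Branch instruction.
    """
    if likely:
        # Branch Likely: the delay slot is treated as a NOP
        # when the branch is not taken.
        return branch_taken
    # Normal Branch: the delay-slot instruction executes
    # whether or not the branch is taken.
    return True


for likely in (False, True):
    for taken in (False, True):
        kind = "Branch Likely" if likely else "Branch"
        print(kind, "taken" if taken else "not taken",
              "-> delay slot executes:", delay_slot_executes(taken, likely))
```

The only case in which the delay slot is discarded is a Branch Likely instruction whose condition is not met, which is why the instruction at the branch target can safely be moved into the slot.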

4.3 Nonblocking Load Function

The nonblocking load function prevents the pipeline from stalling when a data cache miss occurs and refill cycles are required. Instructions after the load instruction that do not use registers affected by the load continue to be executed. An example is shown in Figure 4-4. Here a cache miss occurs with the first instruction, a load. The two instructions that follow complete before the load does. The fourth instruction (ADD) uses a register that is written by the load instruction, so the pipeline is stalled until the cache data becomes valid.

LW r3, 0(r0)    F   D   E   M   R   R   R   R   W
ADD r6, r4, r2      F   D   E   M   W           r3
ADD r7, r5, r2          F   D   E   M   W
ADD r8, r9, r3              F   D   ES  ES  ES  E   M   W

R : Refill cycle, ES : Stall in E stage

Figure 4-4. Nonblocking load function
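The timing of Figure 4-4 can be reproduced with a small scoreboard sketch in Python. This is a simplified model under stated assumptions, not a description of the actual hardware: instructions issue in order, one per cycle; the latency values ALU and LOAD_MISS are assumptions read off the figure (four refill cycles); and the function name schedule is introduced here for illustration.

```python
def schedule(program):
    """Return (e_cycle, stall_cycles) for each instruction.

    program is a list of (dest, sources, latency) tuples, where latency is
    the number of cycles after an instruction's E stage at which its result
    can first be used by the E stage of a later instruction.
    """
    ready = {}     # register -> first cycle its value is usable in E
    prev_e = 2     # the first instruction has F in cycle 1, D in 2, E in 3
    timing = []
    for dest, sources, latency in program:
        earliest = prev_e + 1  # in-order: at most one E stage per cycle
        e_cycle = max([earliest] + [ready.get(reg, 0) for reg in sources])
        timing.append((e_cycle, e_cycle - earliest))
        ready[dest] = e_cycle + latency
        prev_e = e_cycle
    return timing


ALU = 1                 # assumption: result forwarded to the next E stage
LOAD_MISS = 1 + 4 + 1   # assumption: M stage, four R cycles, then usable in W

figure_4_4 = [
    ("r3", ["r0"], LOAD_MISS),    # LW  r3, 0(r0)  (cache miss)
    ("r6", ["r4", "r2"], ALU),    # ADD r6, r4, r2
    ("r7", ["r5", "r2"], ALU),    # ADD r7, r5, r2
    ("r8", ["r9", "r3"], ALU),    # ADD r8, r9, r3 (depends on the load)
]

for e_cycle, stalls in schedule(figure_4_4):
    print("E stage in cycle", e_cycle, "after", stalls, "stall cycle(s)")
```

In this model the two independent ADD instructions execute in cycles 4 and 5 with no stall, while the dependent ADD waits three cycles and executes in cycle 9, matching the three ES cycles in the figure.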

4.4 Multiply and Multiply/Add Instructions (MULT, MULTU, MADD, MADDU)

The R3900 Processor Core can execute multiply and multiply/add instructions back to back, and the immediately following instruction can use the results in the HI/LO registers without a pipeline stall (Figure 4-5(a)). When the result is written to a general-purpose register and used by the immediately following instruction, the stall is only one clock cycle (Figure 4-5(b)).

MADD r9, r5, r1  F      D      E(M1)  M(M2)  W
MADD r9, r6, r2         F      D      E(M1)  M(M2)  W
MADD r9, r7, r3                F      D      E(M1)  M(M2)  W
MADD r9, r8, r4                       F      D      E(M1)  M(M2)  W
MFHI r10                                     F      D      E      M      W

M1 : First multiply stage ; M2 : Second multiply stage

(a) Continued execution of MADD

MULT r3, r2, r1  F      D      E(M1)  M(M2)  W
ADD r5, r4, r3          F      D      ES     E      M      W

(b) When there is data dependency in a general-purpose register

Figure 4-5. Pipeline operation with multiply instructions
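The MADD sequence of Figure 4-5(a) amounts to a four-term sum of products accumulated in HI/LO. The Python sketch below illustrates the arithmetic only. It assumes that MADD adds the signed 64-bit product of its two source operands to the HI/LO pair and that HI/LO are zero beforehand; the write to the destination register r9 is not modeled, and the helper names (to_signed32, madd, dot4) are introduced here for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def madd(hi, lo, rs, rt):
    """Add the signed product rs * rt to the 64-bit HI/LO pair (assumed semantics)."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + to_signed32(rs) * to_signed32(rt)) & MASK64
    return acc >> 32, acc & MASK32


def dot4(a, b):
    """Mirror Figure 4-5(a): four MADD steps, then read HI/LO. Returns (hi, lo)."""
    hi, lo = 0, 0  # assumption: HI/LO cleared before the sequence
    for rs, rt in zip(a, b):
        hi, lo = madd(hi, lo, rs, rt)
    return hi, lo


hi, lo = dot4([1, 2, 3, 4], [10, 20, 30, 40])
print(hi, lo)  # 0 300
```

Because each MADD can enter the pipeline one cycle after the previous one, a loop of this kind costs one cycle per term in the pipeline of Figure 4-5(a), with the final MFHI reading the upper word of the accumulated sum.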
