Special Cases of Alpha Instruction Execution

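Figure 2–9 Pipeline Timing for Integer Load Instructions

[Figure: pipeline timing diagram. The horizontal axis is Cycle Number (1 through 8). Rows show ILD (stages Q, R, E, D, B, with the Dcache hit indication), Instruction 1 (Q, R), and Instruction 2 (Q).]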

There are two cycles in which the IQ may speculatively issue instructions that use load data before Dcache hit information is known. Any instructions that are issued by the IQ within this 2-cycle speculative window are kept in the IQ with their requests inhibited until the load instruction’s hit condition is known, even if they are not dependent on the load operation. If the load instruction hits, then these instructions are removed from the queue. If the load instruction misses, then the execution of these instructions is aborted and the instructions are allowed to request service again.

For example, in Figure 2–9, instruction 1 and instruction 2 are issued within the speculative window of the load instruction. If the load instruction hits, then both instructions will be deleted from the queue by the start of cycle 7 (one cycle later than normal for instruction 1 and at the normal time for instruction 2). If the load instruction misses, both instructions are aborted from the execution pipelines and may request service again in cycle 6.
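The hit and miss outcomes for the speculative window can be sketched as a small Python model. This is an illustrative sketch only, not a description of the hardware's implementation: the function name `resolve_speculative_window` and the layout of the returned dictionary are hypothetical, and cycle-level queue timing is not modeled.

```python
def resolve_speculative_window(window_instrs, load_hit):
    """Decide what happens to instructions issued in the 2-cycle window.

    window_instrs: names of instructions issued inside the window
                   (whether or not they depend on the load).
    load_hit:      True if the load hit in the Dcache.
    """
    if load_hit:
        # Hit: the held instructions are removed from the queue.
        return {"removed_from_queue": list(window_instrs), "request_again": []}
    # Miss: execution is aborted and the instructions may request service again.
    return {"removed_from_queue": [], "request_again": list(window_instrs)}


# The two instructions from the Figure 2-9 example, for both outcomes.
hit_case = resolve_speculative_window(["instruction 1", "instruction 2"], load_hit=True)
miss_case = resolve_speculative_window(["instruction 1", "instruction 2"], load_hit=False)
```

In both cases every instruction issued inside the window is affected, including those that do not depend on the load data.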

IQ-issued instructions are aborted if they are issued within the speculative window of an integer load instruction that misses in the Dcache, even if they do not depend on the load data. However, if load instructions are likely to miss, the 21264/EV67 can still benefit from scheduling the instruction stream for Dcache miss latency. The 21264/EV67 includes a 4-bit saturating counter that is incremented by one when a load instruction hits and decremented by two when a load instruction misses. When the upper bit of the counter equals zero, the integer load latency is increased to five cycles and the speculative window is removed.
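The counter behavior described above can be sketched in Python as follows. This is a minimal software model under stated assumptions, not the hardware's implementation: the class name `LoadHitCounter`, its method names, and the reset value of 15 are hypothetical (the text does not give the counter's initial value).

```python
class LoadHitCounter:
    """Toy model of the 4-bit saturating load hit/miss counter."""

    WIDTH = 4
    MAX = (1 << WIDTH) - 1  # 15, the saturation ceiling

    def __init__(self, value=MAX):
        # Assumption: the reset value is not stated in the text; 15 is illustrative.
        self.value = value

    def record(self, hit):
        """Add 1 on a hit, subtract 2 on a miss, saturating at 0 and 15."""
        if hit:
            self.value = min(self.MAX, self.value + 1)
        else:
            self.value = max(0, self.value - 2)

    @property
    def speculation_enabled(self):
        """True while the upper bit (bit 3) of the counter is set.

        When the upper bit is zero, the integer load latency is increased
        to five cycles and the speculative window is removed.
        """
        return bool(self.value >> (self.WIDTH - 1))


counter = LoadHitCounter()
for _ in range(4):            # four consecutive misses: 15 -> 13 -> 11 -> 9 -> 7
    counter.record(hit=False)
after_misses = (counter.value, counter.speculation_enabled)
counter.record(hit=True)      # one hit: 7 -> 8, upper bit set again
after_hit = (counter.value, counter.speculation_enabled)
```

Because a miss subtracts two and a hit adds only one, a run of misses removes the speculative window faster than the same number of hits restores it.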

Since load instructions to R31 do not produce a result, they do not create a speculative window when they execute and, therefore, never waste IQ-issue cycles if they miss.

Floating-point load instructions that hit in the Dcache have a latency of four cycles. Figure 2–10 shows the pipeline timing for floating-point load instructions. In Figure 2–10:

Symbol    Meaning
Q         Issue queue
R         Register file read
E         Execute
D         Dcache access
B         Data bus active

Alpha 21264/EV67 Hardware Reference Manual

Internal Architecture 2–25
