22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Floating-Point Execution Unit

The floating-point execution unit (FPU) is implemented as a coprocessor that has its own out-of-order control in addition to the data path. The FPU handles all register operations for x87 instructions, all 3DNow! operations, and all MMX operations. The FPU consists of a stack renaming unit, a register renaming unit, a scheduler, a register file, and three parallel execution units. Figure 3 shows a block diagram of the dataflow through the FPU.

[Figure 3 block diagram. Labels, top to bottom: Instruction Control Unit; Stack Map; Register Rename; Scheduler (36-entry); FPU Register File (88-entry); execution pipes FADD (MMX ALU, 3DNow!), FMUL (MMX ALU, MMX Mul, 3DNow!), and FSTORE. Pipeline stage axis: 7, 8, 9, 10, 11, 12 to 15.]

Figure 3. Floating-Point Unit Block Diagram
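The stack renaming step in Figure 3 can be pictured with the architectural x87 rule that ST(i) names the register at (TOP + i) modulo 8. The following C fragment is only an illustrative sketch of that rule. The type and function names (x87_stack_model, st_to_abs, model_push, model_pop) are hypothetical, and nothing here describes how the stack map is actually built in hardware.

```c
#include <assert.h>

/* Hypothetical model of the architectural x87 register stack.
   It illustrates stack-relative to absolute translation only;
   it is not a description of the processor's stack map logic. */
enum { X87_REGS = 8 };

typedef struct {
    unsigned top;   /* architectural top-of-stack pointer, 0..7 */
} x87_stack_model;

/* Translate the stack-relative name ST(i) into an absolute
   register number using (TOP + i) modulo 8. */
static unsigned st_to_abs(const x87_stack_model *s, unsigned i)
{
    assert(i < X87_REGS);
    return (s->top + i) % X87_REGS;
}

/* A pushing instruction such as FLD decrements TOP modulo 8. */
static void model_push(x87_stack_model *s)
{
    s->top = (s->top + X87_REGS - 1) % X87_REGS;
}

/* A popping instruction such as FSTP increments TOP modulo 8. */
static void model_pop(x87_stack_model *s)
{
    s->top = (s->top + 1) % X87_REGS;
}
```

Once stack-relative names are resolved to absolute registers in this way, later stages can treat x87 operands like ordinary register operands, which is presumably why the stack map precedes register renaming in the figure.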

As shown in Figure 3 on page 137, the floating-point logic uses three separate execution positions, or pipes, for superscalar x87, 3DNow!, and MMX operations. The first pipe is generally known as the adder pipe (FADD). It contains the 3DNow! add, MMX ALU/shifter, and floating-point add execution units. The second pipe is known as the multiplier pipe (FMUL). It contains a 3DNow!/MMX multiplier/reciprocal unit, an MMX ALU, and a floating-point multiplier/divider/square root unit. The third pipe is known as the floating-point load/store pipe (FSTORE). It handles floating-point constant loads (FLDZ, FLDPI, etc.), stores, and FILDs, as well as many OP primitives used in VectorPath sequences.
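The pipe assignments described above can be summarized as a small lookup. The C fragment below is a sketch only: the operation classes and the names fpu_op_class, pipes_for, and can_use_distinct_pipes are hypothetical, the classes are coarse groupings taken from the paragraph above rather than a per-instruction table, and the pairing check ignores data dependencies and every scheduler constraint.

```c
#include <assert.h>

/* One bit per execution pipe named in the text. */
enum { PIPE_FADD = 1 << 0, PIPE_FMUL = 1 << 1, PIPE_FSTORE = 1 << 2 };

/* Hypothetical coarse operation classes, taken from the prose. */
typedef enum {
    OP_FP_ADD,        /* floating-point add                      */
    OP_3DNOW_ADD,     /* 3DNow! add                              */
    OP_MMX_ALU,       /* MMX ALU operation (ALU in FADD, FMUL)   */
    OP_FP_MUL,        /* floating-point multiply/divide/sqrt     */
    OP_3DNOW_MMX_MUL, /* 3DNow!/MMX multiply or reciprocal       */
    OP_FP_CONST_LOAD, /* FLDZ, FLDPI, etc.                       */
    OP_FP_STORE       /* floating-point stores                   */
} fpu_op_class;

/* Return the set of pipes that contain a unit for this class. */
static unsigned pipes_for(fpu_op_class op)
{
    switch (op) {
    case OP_FP_ADD:
    case OP_3DNOW_ADD:     return PIPE_FADD;
    case OP_MMX_ALU:       return PIPE_FADD | PIPE_FMUL;
    case OP_FP_MUL:
    case OP_3DNOW_MMX_MUL: return PIPE_FMUL;
    case OP_FP_CONST_LOAD:
    case OP_FP_STORE:      return PIPE_FSTORE;
    }
    assert(0 && "unknown operation class");
    return 0;
}

/* Nonzero if the two classes could occupy two different pipes.
   Dependencies and scheduler limits are deliberately ignored. */
static int can_use_distinct_pipes(fpu_op_class a, fpu_op_class b)
{
    unsigned ma = pipes_for(a), mb = pipes_for(b);
    for (unsigned p = PIPE_FADD; p <= PIPE_FSTORE; p <<= 1) {
        if ((ma & p) && (mb & ~p))
            return 1;
    }
    return 0;
}
```

Under this model, a floating-point add and a floating-point multiply map to different pipes, while two floating-point adds compete for FADD. That suggests mixing operation types in the instruction stream where the algorithm allows it.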

AMD Athlon™ Processor Microarchitecture

137
