AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

AMD Athlon™ Processor Microarchitecture Summary

The AMD Athlon processor brings superscalar performance and high operating frequency to PC systems running industry-standard x86 software. A brief summary of the next-generation design features implemented in the AMD Athlon processor is as follows:

High-speed double-rate local bus interface

Large, split 128-Kbyte level-one (L1) cache

Dedicated backside level-two (L2) cache

Instruction predecode and branch detection during cache line fills

Decoupled decode/execution core

Three-way x86 instruction decoding

Dynamic scheduling and speculative execution

Three-way integer execution

Three-way address generation

Three-way floating-point execution

3DNow!™ technology and MMX™ single-instruction multiple-data (SIMD) instruction extensions

Super data forwarding

Deep out-of-order integer and floating-point execution

Register renaming

Dynamic branch prediction

The AMD Athlon processor communicates through a next-generation high-speed local bus that is beyond the current Socket 7 or Super7™ bus standard. The local bus can transfer data at twice the rate of the bus operating frequency by using both the rising and falling edges of the clock (see “AMD Athlon™ System Bus” on page 139 for more information).
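The double-rate arithmetic described above can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, and the 100-MHz clock and 64-bit data width used in the usage note are assumed example values that this section does not state.

```c
#include <stdint.h>

/* Hypothetical helper: number of data transfers per second on a bus that
 * moves data on `edges_per_cycle` clock edges (2 for a double-rate bus
 * that uses both the rising and the falling edge). */
uint64_t transfers_per_second(uint64_t bus_clock_hz, unsigned edges_per_cycle)
{
    return bus_clock_hz * edges_per_cycle;
}

/* Hypothetical helper: peak bytes per second, given the data-path width
 * in bits. The width is an input here, not a value taken from this section. */
uint64_t peak_bytes_per_second(uint64_t bus_clock_hz,
                               unsigned edges_per_cycle,
                               unsigned bus_width_bits)
{
    return transfers_per_second(bus_clock_hz, edges_per_cycle)
           * (bus_width_bits / 8u);
}
```

Under the assumed values, transfers_per_second(100000000, 2) gives 200,000,000 transfers per second, twice the clock rate, and peak_bytes_per_second(100000000, 2, 64) gives 1,600,000,000 bytes per second.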

To reduce on-chip cache miss penalties and to avoid subsequent data load or instruction fetch stalls, the AMD Athlon processor has a dedicated high-speed backside L2 cache. The large 128-Kbyte L1 on-chip cache and the backside L2 cache allow the
