AMD x86 manual Accelerating Floating-Point Divides and Square Roots, Improved ordering Preferred


22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

quadword alignment), so that quadword operands might be misaligned, even if this technique is used and the compiler does allocate variables in the order they are declared.

The following example demonstrates the reordering of local variable declarations:

Original ordering (Avoid):

short   ga, gu, gi;
long    foo, bar;
double  x, y, z[3];
char    a, b;
float   baz;

Improved ordering (Preferred):

double  z[3];
double  x, y;
long    foo, bar;
float   baz;
short   ga, gu, gi;
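
The effect of the reordering can be checked with a little arithmetic. The sketch below is not from the manual: it assumes the sizes of a 32-bit x86 target (short 2 bytes, long and float 4, double 8, char 1), assumes the compiler packs locals back to back in declaration order starting from a quadword-aligned address, and uses a hypothetical helper, count_misaligned, to count the variables that miss their natural alignment. Real compilers may insert padding or reorder locals, so treat it as an illustration only.

```c
#include <stddef.h>

/* One local variable: its natural alignment and its total size in bytes.
   The sizes below are assumptions for a 32-bit x86 target. */
struct var {
    const char *name;
    size_t align;
    size_t bytes;
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Count variables that miss their natural alignment when packed back to
   back, in declaration order, starting at an 8-byte-aligned offset 0. */
static size_t count_misaligned(const struct var *v, size_t n)
{
    size_t offset = 0, bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (offset % v[i].align != 0)
            bad++;
        offset += v[i].bytes;
    }
    return bad;
}

/* Original ordering (Avoid). */
static const struct var original_order[] = {
    {"ga", 2, 2}, {"gu", 2, 2}, {"gi", 2, 2},
    {"foo", 4, 4}, {"bar", 4, 4},
    {"x", 8, 8}, {"y", 8, 8}, {"z", 8, 24},
    {"a", 1, 1}, {"b", 1, 1},
    {"baz", 4, 4},
};

/* Improved ordering (Preferred), exactly as listed above. */
static const struct var improved_order[] = {
    {"z", 8, 24},
    {"x", 8, 8}, {"y", 8, 8},
    {"foo", 4, 4}, {"bar", 4, 4},
    {"baz", 4, 4},
    {"ga", 2, 2}, {"gu", 2, 2}, {"gi", 2, 2},
};
```

Under these assumptions, count_misaligned(original_order, COUNT(original_order)) reports 5 misaligned variables (foo, bar, x, y, and z), while the improved ordering reports 0.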

See “Sort Variables According to Base Type Size” on page 56 for more information from a different perspective.

Accelerating Floating-Point Divides and Square Roots

Divides and square roots have a much longer latency than other floating-point operations, even though the AMD Athlon processor provides significant acceleration of these two operations. In some codes, these operations occur so often as to seriously impact performance. In these cases, it is recommended to port the code to 3DNow! inline assembly or to use a compiler that can generate 3DNow! code. If code has hot spots that use single-precision arithmetic only (i.e., all computation involves data of type float) and for some reason cannot be ported to 3DNow!, the following technique may be used to improve performance.
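
The technique relies on the precision-control field of the x87 FPU control word, described next. As a minimal sketch, not taken from the manual, the field could be changed from C as follows. It assumes GCC or Clang extended inline assembly on an x86 target; the helper names (x87_get_cw, x87_set_cw, x87_set_precision) are hypothetical, and the bit values follow the x87 control word layout (precision control in bits 8 and 9).

```c
#include <assert.h>
#include <stdint.h>

/* Precision-control (PC) field: bits 8-9 of the x87 FPU control word. */
#define X87_PC_MASK     0x0300u
#define X87_PC_SINGLE   0x0000u  /* 24-bit significand */
#define X87_PC_DOUBLE   0x0200u  /* 53-bit significand */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit significand */

/* Read the current x87 control word (FNSTCW). */
static uint16_t x87_get_cw(void)
{
    uint16_t cw;
    __asm__ __volatile__("fnstcw %0" : "=m"(cw));
    return cw;
}

/* Load a new x87 control word (FLDCW). */
static void x87_set_cw(uint16_t cw)
{
    __asm__ __volatile__("fldcw %0" : : "m"(cw));
}

/* Set only the PC field and return the previous control word so the
   caller can restore it after the hot spot. This affects x87
   instructions only; code compiled to SSE arithmetic is unaffected. */
static uint16_t x87_set_precision(uint16_t pc)
{
    assert((pc & ~X87_PC_MASK) == 0);
    uint16_t old = x87_get_cw();
    x87_set_cw((uint16_t)((old & ~X87_PC_MASK) | pc));
    return old;
}
```

A float-only hot spot would call x87_set_precision(X87_PC_SINGLE) on entry, keep the returned value, and pass it to x87_set_cw on exit, so that code elsewhere that expects double or extended precision is not disturbed.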

The x87 FPU has a precision-control field as part of the FPU control word. The precision-control setting determines the precision to which results are rounded. It affects the basic arithmetic operations, including divides and square roots. AMD Athlon and AMD-K6® family processors implement divide and square root in such a fashion as to compute only the number of bits
