22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Floating-Point Subexpression Elimination

Not every instruction needs to be followed by an FXCH to gain access to two new stack entries. When two instructions share a source operand, no FXCH is required between them. Where there is an opportunity for subexpression elimination, reduce the number of superfluous FXCH instructions by placing the shared source operand at the top of the stack. For example, consider evaluating the arguments of the function call:

func( (x*y), (x+z) )

Example 1 (Avoid):

FLD     Z
FLD     Y
FLD     X
FADD    ST, ST(2)
FXCH    ST(1)
FMUL    ST, ST(2)
CALL    FUNC
FSTP    ST(0)

Example 2 (Preferred):

FLD     Z
FLD     Y
FLD     X
FMUL    ST(1), ST
FADDP   ST(2), ST
CALL    FUNC
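To see why Example 2 needs no FXCH, it can help to trace the register stack by hand. The following C sketch is only a toy model of the x87 stack, not anything taken from the manual: the type X87Stack and the helpers fld, pop, fmul_sti_st, faddp_sti_st, and preferred_sequence are hypothetical names introduced for illustration. It replays the five instructions of Example 2 and reports what is left in ST(0) and ST(1) when FUNC would be called.

```c
/* Toy model of the x87 register stack: reg[0] plays the role of ST(0). */
typedef struct {
    double reg[8];
} X87Stack;

/* FLD: push a value, moving every existing entry one slot deeper. */
static void fld(X87Stack *s, double value) {
    for (int i = 7; i > 0; i--)
        s->reg[i] = s->reg[i - 1];
    s->reg[0] = value;
}

/* Pop ST(0), moving every remaining entry one slot toward the top. */
static void pop(X87Stack *s) {
    for (int i = 0; i < 7; i++)
        s->reg[i] = s->reg[i + 1];
    s->reg[7] = 0.0;
}

/* FMUL ST(i), ST: ST(i) = ST(i) * ST(0), no pop. */
static void fmul_sti_st(X87Stack *s, int i) {
    s->reg[i] *= s->reg[0];
}

/* FADDP ST(i), ST: ST(i) = ST(i) + ST(0), then pop. */
static void faddp_sti_st(X87Stack *s, int i) {
    s->reg[i] += s->reg[0];
    pop(s);
}

/* Replays Example 2 and returns the two values left on the stack. */
static void preferred_sequence(double x, double y, double z,
                               double *product, double *sum) {
    X87Stack s = {{0}};
    fld(&s, z);             /* ST(0)=z                        */
    fld(&s, y);             /* ST(0)=y   ST(1)=z              */
    fld(&s, x);             /* ST(0)=x   ST(1)=y   ST(2)=z    */
    fmul_sti_st(&s, 1);     /* ST(0)=x   ST(1)=x*y ST(2)=z    */
    faddp_sti_st(&s, 2);    /* ST(0)=x*y ST(1)=x+z            */
    *product = s.reg[0];
    *sum = s.reg[1];
}
```

Because the shared operand x stays in ST(0), both the multiply and the add can read it as their source, and the popping form of the add discards it once it is no longer needed.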

Check Argument Range of Trigonometric Instructions Efficiently

The transcendental instructions FSIN, FCOS, FPTAN, and FSINCOS are architecturally restricted in their argument range. Only arguments with a magnitude of <= 2^63 can be evaluated. If the argument is out of range, the C2 bit in the FPU status word is set, and the argument is returned unchanged as the result. Software needs to guard against such (extremely infrequent) cases.
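One way to picture such a guard is as a range test on the argument itself. The C sketch below is an illustration only: it models the limit quoted above with two comparisons rather than inspecting the C2 bit, and the names TRIG_ARG_LIMIT and trig_arg_in_range are hypothetical, not part of any library or of the manual.

```c
/* 2^63, the magnitude limit quoted in the text (exactly representable
   as a double). */
#define TRIG_ARG_LIMIT 9223372036854775808.0

/* Returns 1 if x lies within [-2^63, 2^63], 0 otherwise.
   A NaN fails both comparisons and is therefore reported as out of range. */
static int trig_arg_in_range(double x) {
    return x <= TRIG_ARG_LIMIT && x >= -TRIG_ARG_LIMIT;
}
```

A caller that gets 0 back would have to reduce the argument before evaluating the function, for instance by taking the remainder modulo 2π. Since out-of-range arguments are extremely infrequent, the test should cost as little as possible on the common path.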
