25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

9.16 Complex-Number Arithmetic Using SSE, SSE2, and 3DNow!™ Instructions

Optimization

Use vectorizing SSE, SSE2, and 3DNow! instructions to perform complex-number calculations.

Application

This optimization applies to:

32-bit software

64-bit software

Rationale

Complex numbers have a “real” part and an “imaginary” part (where the imaginary part is denoted by the letter i). For example, the complex number z1 might have a real part equal to 4 and an imaginary part equal to 3, written as 4 + 3i. Multiplying and adding complex numbers is an integral part of digital signal processing. Complex number addition is illustrated here using two complex numbers, z1 (4 + 3i) and z2 (5 + 2i):

z1 + z2 = (4 + 3i) + (5 + 2i) = [4+5] + [3+2]i = 9 + 5i

or:

sum.real = z1.real + z2.real
sum.imag = z1.imag + z2.imag

Complex number multiplication is illustrated here using the same two complex numbers:

z1 × z2 = (4 + 3i)(5 + 2i) = [4 × 5 - 3 × 2] + [3 × 5 + 4 × 2]i = 14 + 23i

or:

product.real = z1.real * z2.real - z1.imag * z2.imag
product.imag = z1.real * z2.imag + z1.imag * z2.real
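The two sets of formulas above can be written directly as scalar C, which is useful as a reference when checking a vectorized version. This is a minimal sketch, not code from this guide: the type name complex_f32 and the function names complex_add and complex_mul are hypothetical, and single-precision storage is an assumption.

```c
/* Hypothetical type for illustration: one complex number stored as a
   two-element vector, real part first and imaginary part second. */
typedef struct {
    float real;
    float imag;
} complex_f32;

/* sum = z1 + z2, computed part by part. */
static complex_f32 complex_add(complex_f32 z1, complex_f32 z2)
{
    complex_f32 sum;
    sum.real = z1.real + z2.real;
    sum.imag = z1.imag + z2.imag;
    return sum;
}

/* product = z1 * z2, following the product.real / product.imag formulas. */
static complex_f32 complex_mul(complex_f32 z1, complex_f32 z2)
{
    complex_f32 product;
    product.real = z1.real * z2.real - z1.imag * z2.imag;
    product.imag = z1.real * z2.imag + z1.imag * z2.real;
    return product;
}
```

With z1 = 4 + 3i and z2 = 5 + 2i, these functions reproduce the worked results 9 + 5i and 14 + 23i.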

Complex numbers are stored as streams of two-element vectors, the two elements being the real and imaginary parts of the complex numbers. Addition of complex numbers can be achieved using vectorizing SSE, SSE2, or 3DNow! instructions, such as PFADD, ADDPS, and ADDPD. Multiplication of complex numbers is more involved.
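Because the real and imaginary parts are interleaved, adding two streams of complex numbers reduces to an element-wise vector add. The following sketch uses C compiler intrinsics rather than assembly; the function name complex_add_stream and the choice of unaligned loads and stores are assumptions made for illustration. _mm_add_ps corresponds to the ADDPS instruction.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: adds n single-precision complex numbers stored as
   interleaved (real, imag) pairs. Unaligned accesses are used so the
   sketch works with any buffer; aligned buffers could use
   _mm_load_ps/_mm_store_ps instead. */
static void complex_add_stream(float *dst, const float *a, const float *b,
                               size_t n)
{
    size_t i = 0;

    /* Each 128-bit register holds two complex numbers (four floats). */
    for (; i + 2 <= n; i += 2) {
        __m128 va = _mm_loadu_ps(a + 2 * i);
        __m128 vb = _mm_loadu_ps(b + 2 * i);
        _mm_storeu_ps(dst + 2 * i, _mm_add_ps(va, vb));  /* ADDPS */
    }

    /* Scalar tail when n is odd. */
    for (; i < n; ++i) {
        dst[2 * i]     = a[2 * i]     + b[2 * i];
        dst[2 * i + 1] = a[2 * i + 1] + b[2 * i + 1];
    }
}
```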

According to the multiplication formulas, the real and imaginary parts of one of the numbers must be interchanged, and the products must be accumulated positively or negatively depending on whether the imaginary or the real portion of the product is being computed.
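One way to express the interchange and the signed accumulation with SSE intrinsics is sketched below. It is an illustration under stated assumptions, not necessarily the instruction sequence this guide recommends: the function name complex_mul_ps is hypothetical, and the sign change is applied by XORing the sign bit of the even lanes. Each register holds two complex numbers, z1 = [a0, b0, a1, b1] and z2 = [c0, d0, c1, d1].

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: multiplies two pairs of single-precision complex
   numbers held as interleaved (real, imag) values. */
static __m128 complex_mul_ps(__m128 z1, __m128 z2)
{
    /* Broadcast the real and imaginary parts of z2 within each pair. */
    __m128 z2_re = _mm_shuffle_ps(z2, z2, _MM_SHUFFLE(2, 2, 0, 0)); /* c0 c0 c1 c1 */
    __m128 z2_im = _mm_shuffle_ps(z2, z2, _MM_SHUFFLE(3, 3, 1, 1)); /* d0 d0 d1 d1 */

    /* Interchange the real and imaginary parts of z1. */
    __m128 z1_sw = _mm_shuffle_ps(z1, z1, _MM_SHUFFLE(2, 3, 0, 1)); /* b0 a0 b1 a1 */

    /* Sign-bit mask for the even (real) lanes only. */
    __m128 sign = _mm_setr_ps(-0.0f, 0.0f, -0.0f, 0.0f);

    __m128 t1 = _mm_mul_ps(z1, z2_re);     /*  a*c   b*c */
    __m128 t2 = _mm_mul_ps(z1_sw, z2_im);  /*  b*d   a*d */
    t2 = _mm_xor_ps(t2, sign);             /* -b*d   a*d */

    /* real = a*c - b*d, imag = b*c + a*d */
    return _mm_add_ps(t1, t2);
}
```

The real lanes accumulate negatively (a*c - b*d) and the imaginary lanes positively (b*c + a*d), which matches the product.real and product.imag formulas.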

Chapter 9

Optimizing with SIMD Instructions
