Chapter 3 General 64-Bit Optimizations 63
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
Each of the products of the components of a and b (for example, a1 * b1) is composed of 64 bits: an
upper 32 bits and a lower 32 bits. It is convenient to represent these individual products as d, e, f, and
g, as follows:
a0 * b0 = d1:d0 = d1 * 2^32 + d0
a1 * b0 = e1:e0 = e1 * 2^32 + e0
a0 * b1 = f1:f0 = f1 * 2^32 + f0
a1 * b1 = g1:g0 = g1 * 2^32 + g0
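The four partial products can be illustrated in C. This is a minimal sketch rather than code from this guide: it assumes a and b are 64-bit operands split into 32-bit halves (a = a1:a0, b = b1:b0), and the names partial_products_t and split_products are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Hypothetical helper type: each 64-bit partial product stored as
   its upper (x1) and lower (x0) 32-bit halves. */
typedef struct {
    uint32_t d1, d0;  /* a0 * b0 */
    uint32_t e1, e0;  /* a1 * b0 */
    uint32_t f1, f0;  /* a0 * b1 */
    uint32_t g1, g0;  /* a1 * b1 */
} partial_products_t;

static partial_products_t split_products(uint64_t a, uint64_t b)
{
    /* Split each operand into 32-bit halves (assumed layout: x = x1:x0). */
    uint32_t a0 = (uint32_t)a, a1 = (uint32_t)(a >> 32);
    uint32_t b0 = (uint32_t)b, b1 = (uint32_t)(b >> 32);

    /* Widen one operand first so each multiply yields a full 64-bit result. */
    uint64_t d = (uint64_t)a0 * b0;
    uint64_t e = (uint64_t)a1 * b0;
    uint64_t f = (uint64_t)a0 * b1;
    uint64_t g = (uint64_t)a1 * b1;

    partial_products_t p;
    p.d1 = (uint32_t)(d >> 32); p.d0 = (uint32_t)d;
    p.e1 = (uint32_t)(e >> 32); p.e0 = (uint32_t)e;
    p.f1 = (uint32_t)(f >> 32); p.f0 = (uint32_t)f;
    p.g1 = (uint32_t)(g >> 32); p.g0 = (uint32_t)g;
    return p;
}
```

The cast to uint64_t before each multiplication matters: without it, a 32-bit by 32-bit multiply in C would discard the upper 32 bits of the product.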
Substitution yields the following equation:
c = (g1 * 2^32 + g0) * 2^64 + (e1 * 2^32 + e0 + f1 * 2^32 + f0) * 2^32 + (d1 * 2^32 + d0)
Simplifying yields this equation:
c = g1 * 2^96 + (e1 + f1 + g0) * 2^64 + (d1 + e0 + f0) * 2^32 + d0
It is convenient to represent the terms that are multiplied by each power of 2 as c3, c2, c1, and c0, as
follows:
g1 = c3
e1 + f1 + g0 = c2
d1 + e0 + f0 = c1
d0 = c0
Substituting again yields:
c = c3 * 2^96 + c2 * 2^64 + c1 * 2^32 + c0
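The derivation above can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not the implementation given in this guide: the names u128_words_t and mul_64x64 are hypothetical, and a and b are assumed to be unsigned 64-bit operands. The equation treats c1 and c2 as plain coefficients, but each is a sum of three 32-bit values and can exceed 32 bits, so the sketch carries the excess into the next-higher word in order to store the result as four 32-bit words.

```c
#include <stdint.h>

/* Hypothetical result type: c = c3:c2:c1:c0, four 32-bit words. */
typedef struct {
    uint32_t c3, c2, c1, c0;
} u128_words_t;

static u128_words_t mul_64x64(uint64_t a, uint64_t b)
{
    /* Split operands into 32-bit halves: a = a1:a0, b = b1:b0. */
    uint32_t a0 = (uint32_t)a, a1 = (uint32_t)(a >> 32);
    uint32_t b0 = (uint32_t)b, b1 = (uint32_t)(b >> 32);

    /* Four 64-bit partial products: d = d1:d0, e = e1:e0, and so on. */
    uint64_t d = (uint64_t)a0 * b0;
    uint64_t e = (uint64_t)a1 * b0;
    uint64_t f = (uint64_t)a0 * b1;
    uint64_t g = (uint64_t)a1 * b1;

    u128_words_t c;

    /* c0 = d0 */
    c.c0 = (uint32_t)d;

    /* c1 = d1 + e0 + f0; accumulate in 64 bits so the carry is kept. */
    uint64_t s1 = (d >> 32) + (uint32_t)e + (uint32_t)f;
    c.c1 = (uint32_t)s1;

    /* c2 = e1 + f1 + g0, plus the carry out of c1. */
    uint64_t s2 = (e >> 32) + (f >> 32) + (uint32_t)g + (s1 >> 32);
    c.c2 = (uint32_t)s2;

    /* c3 = g1, plus the carry out of c2. The full product fits in
       128 bits, so this word cannot overflow. */
    c.c3 = (uint32_t)((g >> 32) + (s2 >> 32));

    return c;
}
```

For example, with a = b = 0xFFFFFFFFFFFFFFFF both middle sums overflow 32 bits, and the carries produce the result words c3:c2:c1:c0 = FFFFFFFF:FFFFFFFE:00000000:00000001.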