Floating-Point Addition and Subtraction
Data Formats and Floating-Point Operation
Example 5–14. Floating-Point Subtraction
A subtraction is performed in this example. Let:
α = 01.0000000000000000000000000000001 × 2^0
b = 01.0000000000000000000000000000000 × 2^0
The operation performed is α - b. The mantissas are already aligned because
the two numbers have the same exponent. The result is a large cancellation
of the upper bits, as shown below.
  01.0000000000000000000000000000001 × 2^0
- 01.0000000000000000000000000000000 × 2^0
  ------------------------------------------
  00.0000000000000000000000000000001 × 2^0
The result must be normalized. In this case, a left shift of 31 is required. The
exponent of the result is modified accordingly. The result is:
  01.0000000000000000000000000000001 × 2^0
- 01.0000000000000000000000000000000 × 2^0
  ------------------------------------------
  01.0000000000000000000000000000000 × 2^-31
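The cancellation and normalization in Example 5-14 can be reproduced with plain integer arithmetic. The following is a minimal sketch, not the device's implementation: it assumes the mantissa is held as an integer scaled by 2^31 (31 bits to the right of the binary point), and the names FRACTION_BITS, diff, shift, and normalized are chosen here for illustration only. It handles only a positive, nonzero difference, which is all this example needs.

```python
# Mantissas are modeled as integers in units of 2**-31 (an assumption of this sketch).
FRACTION_BITS = 31

a = (1 << FRACTION_BITS) | 1   # 01.000...001
b = 1 << FRACTION_BITS         # 01.000...000
exponent = 0                   # both operands are scaled by 2^0

# Exponents are equal, so the mantissas subtract directly.
diff = a - b                   # 00.000...001 (upper bits cancel)

# For a positive result, the leading 1 must land just left of the binary
# point, i.e. at bit position FRACTION_BITS.
shift = FRACTION_BITS + 1 - diff.bit_length()
normalized = diff << shift
result_exponent = exponent - shift

print(shift, result_exponent)  # 31 -31
```

The left shift of 31 moves the single surviving bit into the position immediately to the left of the binary point, and the exponent drops by the same amount so that the value is unchanged.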
Example 5–15. Floating-Point Addition With a 32-Bit Shift
This example illustrates a situation where a full 32-bit shift is necessary to
normalize the result. Let:
α = 01.1111111111111111111111111111111 × 2^127
b = 10.0000000000000000000000000000000 × 2^127
The operation to be performed is α + b.
  01.1111111111111111111111111111111 × 2^127
+ 10.0000000000000000000000000000000 × 2^127
  --------------------------------------------
  11.1111111111111111111111111111111 × 2^127
Normalizing the result requires a left shift of 32 and a subtraction of 32 from
the exponent. The result is:
  01.1111111111111111111111111111111 × 2^127
+ 10.0000000000000000000000000000000 × 2^127
  --------------------------------------------
  10.0000000000000000000000000000000 × 2^95
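The 32-bit shift in Example 5-15 can be checked with a short sketch that treats each mantissa as a two's-complement value with 2 integer bits and 31 fraction bits. This is an illustration under stated assumptions, not the device's algorithm: the helper names parse, fmt, and normalize are hypothetical, only left shifts are modeled, and the normalized ranges assumed are 01.xxx for positive values and 10.xxx for negative values.

```python
FRACTION_BITS = 31
ONE = 1 << FRACTION_BITS  # the value 1.0 in units of 2**-31

def parse(bits):
    """Read 'ii.fff...' as a two's-complement number scaled by 2**31."""
    digits = bits.replace(".", "")
    value = int(digits, 2)
    if digits[0] == "1":               # sign bit set: value is negative
        value -= 1 << len(digits)
    return value

def fmt(mantissa):
    """Render a scaled mantissa back into the 'ii.fff...' bit pattern."""
    width = FRACTION_BITS + 2
    digits = format(mantissa & ((1 << width) - 1), "0{}b".format(width))
    return digits[:2] + "." + digits[2:]

def normalize(mantissa, exponent):
    """Left-shift until the mantissa lies in [1, 2) or [-2, -1)."""
    if mantissa == 0 or not (-2 * ONE <= mantissa < 2 * ONE):
        raise ValueError("only nonzero, non-overflowing mantissas are handled")
    shift = 0
    while not (ONE <= mantissa < 2 * ONE or -2 * ONE <= mantissa < -ONE):
        mantissa <<= 1
        exponent -= 1
        shift += 1
    return mantissa, exponent, shift

a = parse("01." + "1" * 31)
b = parse("10." + "0" * 31)
total = a + b                          # exponents are equal, so add directly
m, e, s = normalize(total, 127)
print(fmt(total), s, e, fmt(m))
```

The sum is the smallest-magnitude negative mantissa (all ones), so the sketch needs 32 left shifts before the pattern reaches 10.000...0, leaving an exponent of 127 - 32 = 95.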