General Operation

Thus, the 48-bit accumulator definition is a function of the EMAC operating mode. Given that each 48-bit accumulator is the concatenation of 16-bit accumulator extension register (ACCextn) contents and 32-bit ACCn contents, the specific definitions are as follows:

if MACSR[6:5] == 00 /* signed integer mode */
    Complete Accumulator[47:0] = {ACCextn[15:0], ACCn[31:0]}

if MACSR[6:5] == -1 /* signed fractional mode */
    Complete Accumulator[47:0] = {ACCextn[15:8], ACCn[31:0], ACCextn[7:0]}

if MACSR[6:5] == 10 /* unsigned integer mode */
    Complete Accumulator[47:0] = {ACCextn[15:0], ACCn[31:0]}

The four accumulators are represented as an array, ACCn, where n selects the register.
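The concatenations above can be modeled in C as a quick check of the bit layout. This is a minimal sketch, not device code: the function name `emac_complete_acc` and its parameters are hypothetical, the 48-bit result is held in the low bits of a `uint64_t`, and the "-1" encoding for fractional mode is read here as "bit 5 set, bit 6 ignored", which is an assumption about the notation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: assemble the complete 48-bit accumulator from the
 * 16-bit extension (ACCextn) and the 32-bit accumulator (ACCn).
 * macsr_6_5 holds MACSR[6:5] in its two low bits.
 * Assumption: any value with the low bit set selects fractional mode.
 */
uint64_t emac_complete_acc(unsigned macsr_6_5, uint16_t accext, uint32_t acc)
{
    if (macsr_6_5 & 0x1u) {
        /* Fractional: {ACCextn[15:8], ACCn[31:0], ACCextn[7:0]} */
        return ((uint64_t)(accext >> 8) << 40)
             | ((uint64_t)acc << 8)
             | (uint64_t)(accext & 0xFFu);
    }
    /* Signed or unsigned integer: {ACCextn[15:0], ACCn[31:0]} */
    return ((uint64_t)accext << 32) | (uint64_t)acc;
}
```

In the integer modes the extension simply widens the accumulator at the top. In fractional mode the extension is split, so its lower byte adds eight bits below ACCn and its upper byte adds eight bits above.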

Although the multiplier array is implemented in a four-stage pipeline, all arithmetic MAC instructions have an effective issue rate of 1 cycle, regardless of input operand size or type.

All arithmetic operations use register-based input operands, and summed values are stored internally in an accumulator. Thus, an additional move instruction is needed to store data in a general-purpose register. One new feature found in EMAC instructions is the ability to choose the upper or lower word of a register as a 16-bit input operand. This is useful in filtering operations if one data register is loaded with the input data and another is loaded with the coefficient. Two 16-bit multiply accumulates can be performed without fetching additional operands between instructions by alternating the word choice during the calculations.
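The word-select feature can be illustrated with a small C model of two signed 16 x 16 multiply-accumulates taken from a single data register and a single coefficient register. This is a sketch under stated assumptions: `sext16` and `mac_two_taps` are hypothetical names, an `int64_t` stands in for the 48-bit accumulator, and saturation, rounding, and fractional-mode shifting are not modeled.

```c
#include <stdint.h>

/* Sign-extend the low 16 bits of w without relying on
 * implementation-defined narrowing conversions. */
static int32_t sext16(uint32_t w)
{
    w &= 0xFFFFu;
    return (w & 0x8000u) ? (int32_t)w - 0x10000 : (int32_t)w;
}

/*
 * Hypothetical model: accumulate upper*upper, then lower*lower, using the
 * same two 32-bit registers, so no operand fetch is needed in between.
 */
int64_t mac_two_taps(int64_t acc, uint32_t data, uint32_t coef)
{
    acc += (int64_t)sext16(data >> 16) * sext16(coef >> 16); /* upper words */
    acc += (int64_t)sext16(data) * sext16(coef);             /* lower words */
    return acc;
}
```

For example, with data = 0x0003FFFE (words 3 and -2) and coef = 0x00040005 (words 4 and 5), the two products are 12 and -10, so an accumulator that starts at zero ends at 2.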

The EMAC has four accumulator registers versus the MAC’s single accumulator. The additional registers improve the performance of some algorithms by minimizing pipeline stalls needed to store an accumulator value back to general-purpose registers. Many algorithms require multiple calculations on a given data set. By applying different accumulators to these calculations, it is often possible to store one accumulator without any stalls while performing operations involving a different destination accumulator.
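As a rough illustration of running several calculations over one data set, the C sketch below keeps two independent running sums in a single pass. The names `dual_result` and `dual_dot` are hypothetical, and the model shows only the data flow: it does not reproduce pipeline timing or the stall behavior described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Two running sums, standing in for two of the four accumulators. */
struct dual_result {
    int64_t acc0;
    int64_t acc1;
};

/*
 * Hypothetical model: apply two coefficient sets (h0, h1) to the same
 * input samples, each product going to its own accumulator.
 */
struct dual_result dual_dot(const int16_t *x, const int16_t *h0,
                            const int16_t *h1, size_t n)
{
    struct dual_result r = {0, 0};
    for (size_t i = 0; i < n; i++) {
        r.acc0 += (int32_t)x[i] * h0[i]; /* first calculation  */
        r.acc1 += (int32_t)x[i] * h1[i]; /* second calculation */
    }
    return r;
}
```

Because the two sums never depend on each other, one result can be stored while accumulation continues into the other, which is the situation the additional accumulators are intended to exploit.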

The need to move large amounts of data presents an obstacle to obtaining high throughput rates in DSP engines. New and existing ColdFire instructions can accommodate these requirements. A MOVEM instruction can move large blocks of data efficiently by generating line-sized burst references. The ability to simultaneously load an operand from memory into a register and execute a MAC instruction makes some DSP operations such as filtering and convolution more manageable.

The programming model includes a 16-bit mask register (MASK), which can optionally be used to generate an operand address during MAC + MOVE instructions. The application of this register with auto-increment addressing mode supports efficient implementation of circular data queues for memory operands.
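The way a mask can turn an auto-incrementing pointer into a circular queue can be sketched in C. This section does not spell out how MASK is applied, so the model below rests on an assumption: the 16-bit mask is ANDed with the lower 16 bits of the address and the upper 16 bits pass through unchanged. The function names `masked_addr` and `circ_next` are hypothetical.

```c
#include <stdint.h>

/* Assumed behavior: AND the address with {0xFFFF, MASK}, so only the
 * lower 16 address bits can be cleared by the mask. */
uint32_t masked_addr(uint32_t addr, uint16_t mask)
{
    return addr & (0xFFFF0000u | (uint32_t)mask);
}

/* Advance the pointer by step bytes, then apply the mask so that it wraps
 * back to the start of the queue. */
uint32_t circ_next(uint32_t addr, uint16_t mask, uint32_t step)
{
    return masked_addr(addr + step, mask);
}
```

With a 16-byte queue at 0x20000100 and a mask of 0xFFEF (bit 4 cleared), stepping by 4 from 0x2000010C gives 0x20000110, which the mask folds back to 0x20000100. The scheme relies on the queue base having a zero in the cleared bit position.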


Chapter 3. Enhanced Multiply-Accumulate Unit (EMAC)
