Optimized 128-Bit Velocity Engine

[Figure: a 32-bit processor compared with the 128-bit Velocity Engine]

The Velocity Engine can manipulate 128 bits of data at a time, up to four times faster than the general processing units in 32-bit processors.

White Paper

PowerPC G5

The PowerPC G5 uses a dual-pipelined Velocity Engine optimized with two independent queues and dedicated 128-bit registers and data paths for efficient instruction and data flow. This 128-bit vector processing unit accelerates data manipulation by applying a single instruction to multiple data elements at the same time, a technique known as SIMD (single instruction, multiple data) processing. Originally implemented in the PowerPC G4, the Velocity Engine in the PowerPC G5 uses the same set of 162 instructions, enabling it to run, and accelerate, existing Mac OS X applications that have been optimized for the Velocity Engine.

Vector processing is useful for transforming large sets of data, such as manipulating an image or rendering a video effect. For example, when a designer uses a filter to apply a motion blur to an image, each pixel of the image must be changed according to the same set of instructions, which makes it a highly repetitive processing task. Each Velocity Engine pipeline speeds up this task by processing up to 128 bits of data at once, as four 32-bit integers, eight 16-bit integers, sixteen 8-bit integers, or four 32-bit single-precision floating-point values, all in a single clock cycle.
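The lane arithmetic described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Velocity Engine code: the names `lanes`, `pack`, `unpack`, and `vector_add` are hypothetical, the lane ordering is an arbitrary choice, and the add wraps around on overflow as a simplifying assumption.

```python
VECTOR_BITS = 128  # register width stated in the text


def lanes(element_bits):
    """Number of elements of the given width that fit in one 128-bit register."""
    return VECTOR_BITS // element_bits


def pack(values, element_bits):
    """Pack unsigned elements into one integer standing in for a vector register."""
    if len(values) != lanes(element_bits):
        raise ValueError("wrong number of elements for this lane width")
    mask = (1 << element_bits) - 1
    register = 0
    for i, value in enumerate(values):
        register |= (value & mask) << (i * element_bits)
    return register


def unpack(register, element_bits):
    """Split the simulated register back into its individual lanes."""
    mask = (1 << element_bits) - 1
    return [(register >> (i * element_bits)) & mask
            for i in range(lanes(element_bits))]


def vector_add(a, b, element_bits):
    """Lane-wise add: one conceptual instruction updates every lane.

    Assumption: each lane wraps around on overflow (modular arithmetic).
    """
    mask = (1 << element_bits) - 1
    summed = [(x + y) & mask
              for x, y in zip(unpack(a, element_bits), unpack(b, element_bits))]
    return pack(summed, element_bits)


# Brighten sixteen 8-bit pixel values with a single simulated vector add.
pixels = pack(list(range(16)), 8)
offset = pack([10] * 16, 8)
brightened = unpack(vector_add(pixels, offset, 8), 8)
```

Here `lanes(32)`, `lanes(16)`, and `lanes(8)` give the four, eight, and sixteen elements listed in the text, and the sixteen pixel values are all adjusted by one call to `vector_add` rather than by sixteen separate additions.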

Two Double-Precision Floating-Point Units

Today’s powerful applications demand both precision and performance. That’s why the PowerPC G5 has two double-precision floating-point units, enabling it to complete at least two 64-bit mathematical calculations per clock cycle. In fact, each floating-point unit can perform both an add and a multiply with a single instruction. This dramatically accelerates highly complex computations that are critical in research simulations and in many of the applications used to manipulate or render 3D graphics and video content.
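As a rough illustration of the arithmetic behind these figures, the sketch below counts floating-point operations under the assumption that every instruction issued is a combined multiply-add. The function names are hypothetical, and Python performs the multiply and the add as two separate operations; the code only counts each pair as one step.

```python
FPU_COUNT = 2       # from the text: two double-precision floating-point units
FLOPS_PER_FMA = 2   # one multiply plus one add issued as a single instruction


def peak_flops_per_cycle(units=FPU_COUNT, flops_per_instruction=FLOPS_PER_FMA):
    """Upper bound on floating-point operations per clock cycle.

    Assumes every unit issues a multiply-add on every cycle, which real
    workloads will not always achieve.
    """
    return units * flops_per_instruction


def dot_product(xs, ys):
    """Dot product written as a chain of multiply-add steps.

    Returns the result and the number of multiply-add steps taken.
    """
    acc = 0.0
    steps = 0
    for x, y in zip(xs, ys):
        acc = acc + x * y  # counted as one multiply-add step
        steps += 1
    return acc, steps


result, steps = dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A three-element dot product takes three multiply-add steps instead of six separate multiplies and adds, and two units each completing a multiply-add give a best case of four floating-point operations per cycle under the stated assumption.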

Weather prediction is one example of a highly iterative computing task made possible by floating-point math. Large-scale models simulate weather patterns over time by measuring multiple influences, such as atmospheric pressure and airflow, at various instants and recalculating the model every minute. The PowerPC G5 provides the precision and performance to deliver accurate results within a useful timeframe.

Two Integer Units

Integer units perform simple integer operations, such as add, subtract, and compare, which are commonly used in many basic computer functions, as well as in imaging, video, and audio applications. The PowerPC G5 has two integer units capable of a broad range of simple and complex instructions involving both 32-bit and 64-bit data. What’s more, they take full advantage of the processor’s 64-bit registers and data paths to complete 64-bit integer calculations in a single pass.
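To see what a single pass saves, consider how a 64-bit addition can be built from 32-bit pieces. The sketch below is illustrative only and uses hypothetical function names: it models a 64-bit add done directly, and the same add done as two 32-bit additions linked by a carry, which is the usual approach when only 32-bit registers are available.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def add64_single_pass(a, b):
    """64-bit add as one operation on 64-bit registers (wraps on overflow)."""
    return (a + b) & MASK64


def add64_two_passes(a, b):
    """The same add built from two 32-bit additions and a carry."""
    low = (a & MASK32) + (b & MASK32)             # pass 1: low halves
    carry = low >> 32                             # carry out of the low half
    high = (a >> 32) + (b >> 32) + carry          # pass 2: high halves plus carry
    return ((high & MASK32) << 32) | (low & MASK32)


a = 0x00000001FFFFFFFF
b = 0x0000000000000001
direct = add64_single_pass(a, b)
split = add64_two_passes(a, b)
```

Both routes produce `0x0000000200000000` for this input; the difference is that the split version needs two dependent additions where the 64-bit version needs one.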

Two Load/Store Units

Load/store units manage data as it is processed, loading it into the registers of each functional unit and, after processing, storing the new data in L1 cache, L2 cache, or main memory, as appropriate. The PowerPC G5 is generously equipped with three large sets of registers: A general-purpose register file contains 64-bit registers for integer calculations; a floating-point register file contains 64-bit registers for floating-point calculations; and a vector register file contains 128-bit registers for the Velocity Engine. Each register file holds 32 registers for architected values, as well as 48 rename, or proxy, registers. With two load/store units, the PowerPC G5 is able to keep these registers filled with data for maximum processing efficiency.
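The register counts above can be tallied in a few lines. This is a back-of-the-envelope sketch that takes the stated figures at face value; the `RegisterFile` class and its property names are hypothetical and serve only to organize the arithmetic.

```python
from dataclasses import dataclass

ARCHITECTED = 32  # registers for architected values, per register file (from the text)
RENAME = 48       # rename (proxy) registers, per register file (from the text)


@dataclass(frozen=True)
class RegisterFile:
    name: str
    width_bits: int

    @property
    def total_registers(self):
        """Architected plus rename registers in this file."""
        return ARCHITECTED + RENAME

    @property
    def capacity_bytes(self):
        """Total storage if every register holds a full-width value."""
        return self.total_registers * self.width_bits // 8


REGISTER_FILES = [
    RegisterFile("general-purpose", 64),
    RegisterFile("floating-point", 64),
    RegisterFile("vector", 128),
]

total_bytes = sum(rf.capacity_bytes for rf in REGISTER_FILES)
```

Under these figures each file holds 80 registers, which works out to 640 bytes for each of the two 64-bit files and 1,280 bytes for the 128-bit vector file.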