Apple PowerPC G5 Optimized 128-Bit Velocity Engine, Two Double-Precision Floating-Point Units

[Figure: a 32-bit processor compared with the 128-bit Velocity Engine]

The Velocity Engine can manipulate 128 bits of data at a time, up to four times faster than the general processing units in 32-bit processors.

White Paper

PowerPC G5

Optimized 128-Bit Velocity Engine

The PowerPC G5 uses a dual-pipelined Velocity Engine optimized with two independent queues and dedicated 128-bit registers and data paths for efficient instruction and data flow. This 128-bit vector processing unit accelerates data manipulation by applying a single instruction to multiple data at the same time, a technique known as SIMD (single instruction, multiple data) processing. Originally implemented in the PowerPC G4, the Velocity Engine in the PowerPC G5 uses the same set of 162 instructions, enabling it to run, and accelerate, existing Mac OS X applications that have been optimized for the Velocity Engine.

Vector processing is useful for transforming large sets of data, such as manipulating an image or rendering a video effect. For example, when a designer uses a filter to apply a motion blur to an image, each pixel of the image must be changed according to the same set of instructions, a highly repetitive processing task. Each Velocity Engine pipeline speeds up this task by processing up to 128 bits of data as four 32-bit integers, eight 16-bit integers, sixteen 8-bit integers, or four 32-bit single-precision floating-point values, all in a single clock cycle.
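
The lane layouts described above can be illustrated with a minimal Python sketch. It is not Velocity Engine code: the function names (lane_count, simd_add, reinterpret) are hypothetical, the loop only emulates what the hardware does in one instruction, and the wrap-around behavior on overflow is an assumption made for this sketch.

```python
import struct

VECTOR_BITS = 128  # register width stated in the text


def lane_count(lane_bits):
    """Number of equal-width lanes that fit in one 128-bit register."""
    return VECTOR_BITS // lane_bits


def simd_add(a, b, lane_bits):
    """Emulate one SIMD integer add: the same operation on every lane.

    Each lane wraps at its own width (an assumption of this sketch).
    """
    lanes = lane_count(lane_bits)
    if len(a) != lanes or len(b) != lanes:
        raise ValueError("operands must fill exactly one 128-bit register")
    mask = (1 << lane_bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]


def reinterpret(raw, fmt):
    """View the same 16 bytes under a different lane layout (big-endian)."""
    if len(raw) * 8 != VECTOR_BITS:
        raise ValueError("expected exactly 16 bytes")
    return list(struct.unpack(fmt, raw))


raw = bytes(range(16))                 # one 128-bit register's worth of data
as_bytes = reinterpret(raw, ">16B")    # sixteen 8-bit integers
as_halves = reinterpret(raw, ">8H")    # eight 16-bit integers
as_words = reinterpret(raw, ">4I")     # four 32-bit integers

# Brighten sixteen 8-bit pixel values with one emulated vector add.
brighter = simd_add(as_bytes, [10] * 16, lane_bits=8)
```

The same 16 bytes yield 16, 8, or 4 elements depending on the lane width chosen, which is why narrower data types gain more from each vector instruction.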

Two Double-Precision Floating-Point Units

Today’s powerful applications demand both precision and performance. That’s why the PowerPC G5 has two double-precision floating-point units, enabling it to complete at least two 64-bit mathematical calculations per clock cycle. In fact, each floating-point unit can perform both an add and a multiply with a single instruction. This dramatically accelerates highly complex computations that are critical in research simulations and in many of the applications used to manipulate or render 3D graphics and video content.
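
As a rough illustration of the arithmetic behind this claim, the following Python sketch counts peak floating-point operations per cycle when each of the two units issues a combined multiply and add. The function names are hypothetical, the 2.0 GHz clock rate is only an example input, and the figures are theoretical peaks rather than measured performance. The sketch also does not model how the hardware rounds results.

```python
def multiply_add(a, b, c):
    """a * b + c, treated here as one instruction that counts as two operations."""
    return a * b + c


def peak_flops_per_cycle(fp_units=2, flops_per_instruction=2):
    """Two units, each doing an add and a multiply per instruction."""
    return fp_units * flops_per_instruction


def peak_gflops(clock_ghz, fp_units=2, flops_per_instruction=2):
    """Theoretical peak in billions of floating-point operations per second."""
    return clock_ghz * peak_flops_per_cycle(fp_units, flops_per_instruction)


def dot(xs, ys):
    """Dot product written as a chain of multiply-adds."""
    total = 0.0
    for x, y in zip(xs, ys):
        total = multiply_add(x, y, total)
    return total


per_cycle = peak_flops_per_cycle()   # 2 units * 2 operations
example_peak = peak_gflops(2.0)      # example clock rate, an assumption
example_dot = dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A dot product is a natural fit for this instruction shape because every step is a multiply followed by an accumulate.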

Weather prediction is one example of a highly iterative computing task made possible by floating-point math. Large-scale models simulate weather patterns over time by measuring multiple influences, such as atmospheric pressure and airflow, at various instants and recalculating the model every minute. The PowerPC G5 provides the precision and performance to deliver accurate results within a useful timeframe.

Two Integer Units

Integer units perform simple integer mathematics, such as add, subtract, and compare, which are commonly used in many basic computer functions, as well as in imaging, video, and audio applications. The PowerPC G5 has two integer units capable of a broad range of simple and complex instructions involving both 32-bit and 64-bit data. What’s more, they take full advantage of the processor’s 64-bit registers and data paths to complete 64-bit integer calculations in a single pass.
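
To show why a single pass matters, the Python sketch below contrasts a 64-bit add done in one step with the same add assembled from two 32-bit steps plus a carry, which is the general approach a 32-bit integer unit would need. The function names are hypothetical and the code emulates fixed-width arithmetic with masks; it is not a model of the actual G5 or G4 instruction sequences.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_single_pass(a, b):
    """One 64-bit add, wrapping at 64 bits."""
    return (a + b) & MASK64


def add64_two_passes(a, b):
    """The same add built from two 32-bit adds and a carry."""
    # Pass 1: add the low 32-bit halves and capture the carry out.
    low = (a & MASK32) + (b & MASK32)
    carry = low >> 32
    # Pass 2: add the high 32-bit halves plus the carry from pass 1.
    high = ((a >> 32) & MASK32) + ((b >> 32) & MASK32) + carry
    return ((high & MASK32) << 32) | (low & MASK32)


a = 0x00000001FFFFFFFF
b = 0x0000000000000001
one_pass = add64_single_pass(a, b)
two_pass = add64_two_passes(a, b)
```

Both functions return the same value; the difference is that the second depends on a carry passed between two dependent steps.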

Two Load/Store Units

Load/store units manage data as it is processed, loading it into the registers of each functional unit and, after processing, storing the new data in L1 cache, L2 cache, or main memory, as appropriate. The PowerPC G5 is generously equipped with three large sets of registers: A general-purpose register file contains 64-bit registers for integer calculations; a floating-point register file contains 64-bit registers for floating-point calculations; and a vector register file contains 128-bit registers for the Velocity Engine. Each register file holds 32 registers for architected values, as well as 48 rename, or proxy, registers. With two load/store units, the PowerPC G5 is able to keep these registers filled with data for maximum processing efficiency.
