5

The Evolution of Chip Multithreading (CMT)

Sun Microsystems, Inc.

a result, while these designs provide some additional throughput and scalability, they can consume considerable power and generate significant heat — without a commensurate increase in overall performance.

Chip Multithreading (CMT) with CoolThreads Technology

Sun engineers were early to recognize the disparity between processor speeds and memory access rates. While processor speeds continue to double every two years, memory speeds have typically doubled only every six years. As a result, memory latency now dominates the performance of many applications, erasing even very impressive gains in clock rates. This growing disconnect is the result of memory suppliers focusing on density and cost, rather than speed, as their design center.
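The doubling rates quoted above imply a gap that compounds over time. The following is a minimal sketch of that arithmetic, assuming idealized exponential growth at exactly the stated rates; the function name and parameters are illustrative and do not come from the original text.

```python
def speed_ratio_growth(years, cpu_doubling_years=2.0, mem_doubling_years=6.0):
    """Factor by which the processor-to-memory speed ratio grows over `years`.

    Assumes smooth exponential growth at the doubling periods given in the
    text (two years for processors, six years for memory).
    """
    cpu_gain = 2 ** (years / cpu_doubling_years)
    mem_gain = 2 ** (years / mem_doubling_years)
    return cpu_gain / mem_gain


for years in (6, 12):
    # Prints the number of years and the growth factor of the speed ratio.
    print(years, speed_ratio_growth(years))
```

Under these assumptions the ratio of processor speed to memory speed quadruples every six years, since processors gain a factor of eight while memory gains a factor of two.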

Unfortunately, this relative gap between processor and memory speeds leaves ultra-fast processors idle as much as 85 percent of the time, waiting for memory transactions to complete. Ironically, as traditional processor execution pipelines get faster and more complex, the effect of memory latency grows — fast, expensive processors spend more cycles doing nothing. Worse still, idle processors continue to draw power and generate heat. It is easy to see that frequency (gigahertz) is truly a misleading indicator of real performance.
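One way to see why clock frequency alone can mislead is to scale it by the fraction of cycles that actually do work. The sketch below is only an illustration: the function name is hypothetical, and the second processor in the usage note is an invented example rather than a measured system.

```python
def effective_ghz(clock_ghz, idle_fraction):
    """Clock rate scaled by the fraction of cycles spent doing useful work.

    This is a deliberately simple model: it ignores instructions per cycle,
    caches, and everything else except time spent waiting.
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return clock_ghz * (1.0 - idle_fraction)


# Hypothetical comparison: a fast clock that is idle 85 percent of the time
# versus a slower clock that is idle only 20 percent of the time.
fast_but_idle = effective_ghz(3.0, 0.85)
slow_but_busy = effective_ghz(1.4, 0.20)
```

With these assumed inputs the 3.0 GHz part delivers roughly 0.45 GHz of useful cycles, while the 1.4 GHz part delivers roughly 1.12 GHz, so the lower-frequency design does more work per second in this simplified model.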

First introduced with the UltraSPARC T1 processor, chip multithreading takes advantage of CMP advances, but adds a critical capability: the ability to scale with threads rather than frequency. Unlike traditional single-threaded processors and even most current multicore (CMP) processors, hardware multithreaded processor cores allow rapid switching between active threads as other threads stall for memory. Figure 1 illustrates the differences among CMP, fine-grained hardware multithreading (FG-MT), and chip multithreading. The key to this approach is that each core in a CMT processor is designed to switch between multiple threads on each clock cycle. As a result, the processor’s execution pipeline stays busy doing useful work, even as memory operations for stalled threads continue in parallel.
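The effect of switching threads every cycle can be illustrated with a toy model of a single core. This is a rough sketch under stated assumptions, not a description of the UltraSPARC pipeline: every thread is assumed to issue a fixed number of instructions and then wait a fixed number of cycles for memory, the core picks the next ready thread in round-robin order, and all names and numbers are hypothetical.

```python
def simulate_core(num_threads, compute_cycles, memory_latency, total_cycles):
    """Return the fraction of cycles in which the core issues an instruction.

    Each thread issues `compute_cycles` instructions (one per cycle when it is
    selected), then stalls for `memory_latency` cycles before it can issue
    again. On every cycle the core selects the next ready thread in
    round-robin order; if no thread is ready, the cycle is idle.
    """
    remaining = [compute_cycles] * num_threads  # instructions left before the next stall
    ready_at = [0] * num_threads                # first cycle each thread may issue again
    next_thread = 0                             # round-robin starting point
    busy = 0

    for cycle in range(total_cycles):
        for offset in range(num_threads):
            t = (next_thread + offset) % num_threads
            if ready_at[t] <= cycle:
                busy += 1
                remaining[t] -= 1
                if remaining[t] == 0:
                    # The thread now waits on memory; others may run meanwhile.
                    ready_at[t] = cycle + 1 + memory_latency
                    remaining[t] = compute_cycles
                next_thread = (t + 1) % num_threads
                break

    return busy / total_cycles


# Hypothetical workload: 3 cycles of work followed by a 17-cycle memory wait.
single = simulate_core(1, 3, 17, 2000)
multi = simulate_core(4, 3, 17, 2000)
```

With one thread and these assumed numbers, the core is busy 15 percent of the time, which corresponds to the 85 percent idle figure cited earlier. With four threads the busy fraction rises because other threads issue while one waits. Identical threads in this model tend to stall together, so real gains depend on the workload mix.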

[Figure 1 panels: Chip Multiprocessing (CMP), n cores per processor; Fine-Grained Multithreading (FG-MT), m strands per core; Chip Multithreading (CMT), n x m threads per processor. Legend: Memory Latency, Compute.]

Figure 1. Chip multithreading combines CMP and fine-grained hardware multithreading
