[Figure 6 diagram. Labels recovered from the image: four Node Controllers (NDC), each with an L3 cache copy tag, linked point to point (CC-NUMA, low latency). Processor bus: 6.4 GB/s (FSB 400 MHz), 10.6 GB/s (FSB 667 MHz). Node bandwidth: 4.8 GB/s (FSB 400 MHz), 5.3 GB/s (FSB 667 MHz). Memory bus: 4.8 GB/s (FSB 400 MHz), 5.3 GB/s (FSB 667 MHz), through Memory Controllers (MC) to DDR2 memory. PCI bus: 2 GB/s x3, PCI-Express (4 lane), through a PCI bridge to the PCI slots.]
Figure 6. Hitachi Node Controller connects multiple server blades

By dividing the SMP system across several server blades, the distributed design solves the memory bus contention problem. A processor’s access to its on-board memory incurs no penalty: the two processors (four cores) on a blade can access up to 64 GB at the full speed of local memory. When a processor needs data that is not in its locally attached memory, its node controller must contact the node controller that holds the data to retrieve it, so the latency is higher than for local memory. Because remote memory takes longer to access than local memory, this design is known as a non-uniform memory architecture (NUMA). The advantage of non-uniform memory is the ability to scale to a larger number of processors within a single system image while preserving the speed of local memory access.
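The trade-off between local and remote access described above can be illustrated with a simple weighted average. This is a minimal sketch, not a model of the BladeSymphony 1000: the function name and the latency values used in the usage note are hypothetical, chosen only to show how the share of local accesses drives the average.

```python
def average_latency_ns(local_fraction, local_ns, remote_ns):
    """Average memory latency when a given fraction of accesses is local.

    All inputs are illustrative assumptions; the white paper gives no
    latency figures for local or remote memory.
    """
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    # Weighted average of the two latencies.
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns
```

For example, with assumed latencies of 100 ns local and 200 ns remote, `average_latency_ns(0.5, 100.0, 200.0)` gives 150.0, and the result approaches the local figure as more accesses stay on the blade.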

While there is a penalty for accessing remote memory, a number of operating systems have been enhanced to improve performance on NUMA system designs. These operating systems take the location of data into account when scheduling tasks, running each task on the CPU closest to its data where possible. Some operating systems can also relocate data in memory to move it closer to the processors where it is needed. For operating systems that are not NUMA aware, the BladeSymphony 1000 offers a number of memory interleaving options that can improve performance.
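To show why interleaving can help an operating system that is not NUMA aware, the sketch below compares two generic ways of mapping a physical address to a node. The white paper does not describe the actual BladeSymphony 1000 interleaving options, so the function names, the stripe size, and the per-node capacity here are assumptions for illustration only.

```python
def node_contiguous(addr, node_capacity_bytes):
    """Non-interleaved layout: each node owns one contiguous address range."""
    return addr // node_capacity_bytes


def node_interleaved(addr, num_nodes, stripe_bytes):
    """Interleaved layout: consecutive stripes rotate across the nodes."""
    return (addr // stripe_bytes) % num_nodes


def nodes_touched(addresses, mapper):
    """Return the set of nodes that a sequence of addresses lands on."""
    return {mapper(a) for a in addresses}
```

With four nodes and a hypothetical 4096-byte stripe, a sequential scan of 16 KiB touches all four nodes under `node_interleaved`, whereas under `node_contiguous` the same scan stays on a single node. Interleaving spreads the load so that no single node's memory becomes a hot spot, at the cost of making most accesses remote.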

Each Node Controller can connect to up to three other Node Controllers, providing a point-to-point connection between every pair of Node Controllers. The advantage of point-to-point connections is that they eliminate a shared bus, which would be prone to contention, and also eliminate a crossbar switch, which reduces contention relative to a bus but adds complexity and latency. A remote memory access is streamlined because it passes through only two Node Controllers, which gives lower latency than other SMP systems.
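The point-to-point arrangement above amounts to a full mesh: with four Node Controllers, each has three links and any remote access involves only the requesting and the owning controller. The following sketch counts the links and lists the controllers on a path; the function names and the integer node identifiers are hypothetical, introduced only for this illustration.

```python
from itertools import combinations


def full_mesh_links(nodes):
    """Return every point-to-point link in a fully connected set of nodes."""
    return list(combinations(nodes, 2))


def links_per_node(nodes):
    """Count how many links terminate at each node in the full mesh."""
    counts = {n: 0 for n in nodes}
    for a, b in full_mesh_links(nodes):
        counts[a] += 1
        counts[b] += 1
    return counts


def controllers_on_path(src, dst):
    """Controllers a memory access passes through in a full mesh."""
    # A local access involves one controller; a remote access involves two.
    return [src] if src == dst else [src, dst]
```

For four nodes, `full_mesh_links(range(4))` yields six links and `links_per_node(range(4))` reports three links at every node, which matches the limit of three peer connections per Node Controller stated above.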

www.hitachi.com

BladeSymphony 1000 Architecture White Paper 15
