System Features
Memory latency is the amount of time required for a processor to retrieve data from memory; it is lowest when the processor accesses its local memory.
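On Linux-based NUMA systems, local memory placement can also be requested explicitly from software. The following minimal sketch is not taken from the Altix documentation; it assumes the standard libnuma library is available and uses numa_alloc_local() to place a buffer on the calling processor's node, where latency is lowest.

/*
 * Minimal sketch (assumes Linux with libnuma installed):
 * allocate a buffer in memory local to the calling processor,
 * which keeps memory latency at its minimum.
 * Build with:  cc numa_local.c -lnuma
 */
#include <stdio.h>
#include <stdlib.h>
#include <numa.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this system\n");
        return EXIT_FAILURE;
    }

    size_t size = 64 * 1024 * 1024;       /* 64 MB buffer */

    /* Allocate on the node local to the calling CPU: lowest latency. */
    char *local_buf = numa_alloc_local(size);
    if (local_buf == NULL) {
        fprintf(stderr, "local allocation failed\n");
        return EXIT_FAILURE;
    }

    local_buf[0] = 1;                     /* touch to fault in a page */
    numa_free(local_buf, size);
    return EXIT_SUCCESS;
}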
Distributed Shared I/O
As with distributed shared memory (DSM), I/O devices are distributed among the blade nodes within the IRUs (each base I/O blade node has two NUMAlink ports) and are accessible by all compute nodes within the SSI through the NUMAlink interconnect fabric.
ccNUMA Architecture
As the name implies, the cache-coherent non-uniform memory access (ccNUMA) architecture combines two concepts: cache coherency and non-uniform memory access (NUMA).
Cache Coherency
The Altix 450 systems use caches to reduce memory latency. Although data exists in local or remote memory, copies of the data can exist in various processor caches throughout the system. Cache coherency keeps the cached copies consistent.
To keep the copies consistent, the ccNUMA architecture uses a directory-based coherence protocol, in which each block of memory has an associated directory entry.
Each directory entry indicates the state of the memory block that it represents. For example, when the block is not cached, it is in an unowned state. When only one processor has a copy of the memory block, it is in an exclusive state. And when more than one processor has a copy of the block, it is in a shared state; a bit vector indicates which caches contain a copy.
When a processor modifies a block of data, other processors that hold copies of that block in their caches must be notified of the modification. The Altix 450 server series uses an invalidation method to maintain cache coherence: all unmodified copies of the block are purged from the other caches, and the processor that wants to modify the block receives exclusive ownership of it.
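The sketch below is an illustrative software model of this scheme, not the Altix hardware implementation. A directory entry records the block state and a bit vector of sharing caches, and a write request invalidates the other copies before granting the writer exclusive ownership. The type names, the 64-cache limit, and the request_write() helper are assumptions made for the example.

/*
 * Illustrative model of directory-based coherence with invalidation
 * (not the Altix hardware implementation).
 */
#include <stdint.h>
#include <stdio.h>

#define MAX_CACHES 64                 /* assumed system size for the sketch */

typedef enum {
    BLOCK_UNOWNED,                    /* no cache holds a copy              */
    BLOCK_EXCLUSIVE,                  /* exactly one cache holds the copy   */
    BLOCK_SHARED                      /* two or more caches hold copies     */
} block_state_t;

typedef struct {
    block_state_t state;
    uint64_t      sharers;            /* bit i set => cache i has a copy    */
    int           owner;              /* valid only in the EXCLUSIVE state  */
} directory_entry_t;

/* A processor requests write access to the block: invalidate every other
 * cached copy, then hand the requester exclusive ownership. */
static void request_write(directory_entry_t *dir, int requester)
{
    for (int i = 0; i < MAX_CACHES; i++) {
        if (i != requester && (dir->sharers & (1ULL << i))) {
            /* In hardware this would be an invalidate message to cache i. */
            printf("invalidate copy in cache %d\n", i);
        }
    }
    dir->sharers = 1ULL << requester;
    dir->owner   = requester;
    dir->state   = BLOCK_EXCLUSIVE;
}

int main(void)
{
    /* Block currently shared by caches 0, 2, and 5. */
    directory_entry_t entry = {
        .state   = BLOCK_SHARED,
        .sharers = (1ULL << 0) | (1ULL << 2) | (1ULL << 5),
        .owner   = -1,
    };

    request_write(&entry, 2);         /* cache 2 wants to modify the block */
    printf("block now exclusive to cache %d\n", entry.owner);
    return 0;
}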