40555 Rev. 3.00 June 2006

Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems

A.5 Why Is 0 Hop-1 Hop Case Slower Than 0 Hop-0 Hop Case on a System under High Background Load (High Subscription) for Write-Only Threads?

When a 0 hop-0 hop scenario is subjected to a very high background load, the system sees the following traffic pattern, where each node gets memory requests from the threads as described:

Node 0: 2 foreground threads.

Node 1: 1 background thread.

Node 3: 1 background thread.

Node 2: 1 background thread.

In the 0 hop-1 hop case, the system sees the following traffic pattern:

Node 0: 1 foreground thread.

Node 1: 1 foreground thread and 1 background thread.

Node 3: 1 background thread.

Node 2: 1 background thread.

The 0 hop-1 hop case suffers from a greater load imbalance than the 0 hop-0 hop case, with node 1 suffering the worst effect of this imbalance.

Each of the background threads, as before, asks for data at a rate of 4 GB/s, and each of the foreground threads asks for data at a rate of 2.98 GB/s.

Data shows a total memory access rate of 4.78 GB/s on node 1. Several buffer queues on node 1 are saturated and cannot absorb the data provided by the memory controller any faster.
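The imbalance described above can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of the memory system: it assumes that the per-thread request rates quoted in the text simply add up on each node, and the names `SCENARIOS`, `node_demand`, and `peak_demand` are illustrative only.

```python
# Per-thread request rates quoted in the text (GB/s).
FOREGROUND_GBPS = 2.98
BACKGROUND_GBPS = 4.0

# Threads whose memory requests land on each node: (foreground, background).
SCENARIOS = {
    "0 hop-0 hop": {0: (2, 0), 1: (0, 1), 2: (0, 1), 3: (0, 1)},
    "0 hop-1 hop": {0: (1, 0), 1: (1, 1), 2: (0, 1), 3: (0, 1)},
}


def node_demand(placement):
    """Return the requested rate in GB/s per node (assumes rates simply add)."""
    return {
        node: round(fg * FOREGROUND_GBPS + bg * BACKGROUND_GBPS, 2)
        for node, (fg, bg) in placement.items()
    }


def peak_demand(placement):
    """Return (node, GB/s) for the most heavily requested node."""
    demand = node_demand(placement)
    node = max(demand, key=demand.get)
    return node, demand[node]


for name, placement in SCENARIOS.items():
    print(name, node_demand(placement), "peak:", peak_demand(placement))
```

Under this additive assumption, both cases request the same total of 17.96 GB/s, but the peak per-node request moves from 5.96 GB/s on node 0 (0 hop-0 hop) to 6.98 GB/s on node 1 (0 hop-1 hop). The measured 4.78 GB/s on node 1 falls well short of that requested rate, which is consistent with the saturated buffer queues reported for that node.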

A.6 Support for a ccNUMA-Aware Scheduler for AMD64 ccNUMA Multiprocessor Systems

Developers should ensure that the OS is properly configured to support ccNUMA. All versions of Microsoft® Windows® XP for AMD64 and Windows Server for AMD64 support ccNUMA without any configuration changes. The 32-bit versions of Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition require the /PAE boot parameter to support ccNUMA. For 64-bit Linux®, a separate kernel that supports ccNUMA may be provided and should be selected. The 2.6.x Linux kernels feature NUMA awareness in the scheduler[11]. Most SuSE and Red Hat Enterprise distributions of 64-bit Linux include a ccNUMA-aware kernel. Solaris 10 and subsequent versions of Solaris for AMD64 support ccNUMA without any changes.
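On Linux, one quick way to check whether the running kernel exposes a NUMA topology is to look for per-node directories in sysfs. The sketch below assumes the standard `/sys/devices/system/node` location; the helper name `list_numa_nodes` is hypothetical, and the check applies to Linux only, not to the Windows or Solaris configurations mentioned above.

```python
import os
import re

# Standard Linux sysfs location for NUMA topology; other OSes differ.
SYSFS_NODE_DIR = "/sys/devices/system/node"


def list_numa_nodes(node_dir=SYSFS_NODE_DIR):
    """Return sorted NUMA node IDs found under node_dir, or [] if absent."""
    try:
        entries = os.listdir(node_dir)
    except OSError:
        # Directory missing or unreadable: no NUMA topology is visible.
        return []
    ids = []
    for entry in entries:
        # Node directories are named node0, node1, ...; skip other entries.
        match = re.fullmatch(r"node(\d+)", entry)
        if match:
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = list_numa_nodes()
    if len(nodes) > 1:
        print("NUMA nodes visible to the kernel:", nodes)
    else:
        print("No multi-node NUMA topology visible; check kernel and BIOS settings.")
```

A result with fewer nodes than the system has processors with attached memory suggests that the kernel in use is not ccNUMA-aware or that the firmware is not reporting the topology, though a single-node result is also expected on a uniprocessor system.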
