40555 Rev. 3.00 June 2006 | Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ |
| ccNUMA Multiprocessor Systems |
Here the same two foreground threads were run through the same cases as before, this time with background threads running on:
• Node 0 (Core 1)
• Node 1 (Core 1)
• Node 2 (Core 0)
• Node 3 (Core 0)
Each of these background threads reads a local 64 MB array, and the rate of memory demand of all the background threads is varied simultaneously from low to very high. At a low rate of memory demand, each background thread demands a memory bandwidth of 0.5 GB/s. At a very high rate of memory demand, each background thread demands a memory bandwidth of 4 GB/s, as shown in Table 1 on page 16.
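The source does not say how the background threads were throttled to a given demand rate. One common approach is to give each pass over the array a time budget and wait out whatever is left of it. The sketch below illustrates that pacing logic only. The names `pass_budget_s` and `paced_reader` are hypothetical, the units are assumed to be binary (1 GB = 1024 MB), and a Python loop will not reach the multi-GB/s rates of the original test.

```python
import time

MB = 1024 * 1024
GB = 1024 * MB  # assumption: binary units; the source does not say which


def pass_budget_s(array_bytes, target_bw_bytes_per_s):
    """Seconds one full pass over the array may take at the target demand rate."""
    return array_bytes / target_bw_bytes_per_s


def paced_reader(buf, target_bw_bytes_per_s, passes):
    """Read `buf` `passes` times, never faster than the target rate.

    Returns the achieved demand rate in bytes per second.
    """
    view = memoryview(buf)
    budget = pass_budget_s(len(view), target_bw_bytes_per_s)
    start = time.perf_counter()
    for i in range(passes):
        view.tobytes()  # copies the buffer, which reads every byte once
        deadline = start + (i + 1) * budget
        # Wait until this pass's time budget has fully elapsed.
        while True:
            remaining = deadline - time.perf_counter()
            if remaining <= 0:
                break
            time.sleep(remaining)
    elapsed = time.perf_counter() - start
    return passes * len(view) / elapsed


if __name__ == "__main__":
    # Per-pass budgets for the two demand levels described in the text.
    print(pass_budget_s(64 * MB, 0.5 * GB))  # low demand
    print(pass_budget_s(64 * MB, 4 * GB))    # very high demand
    # Small-scale run: 1 MB buffer paced at 64 MB/s.
    print(paced_reader(bytearray(MB), 64 * MB, passes=4) / MB)
```

Under these assumptions, a 64 MB array must be swept once every 0.125 s to demand 0.5 GB/s, and once every 0.015625 s to demand 4 GB/s. With four background threads, the aggregate background demand therefore ranges from 2 GB/s to 16 GB/s.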
Even with the background threads, there are still some free cores left in the system. We call this a highly subscribed condition.
This allows us to study the impact of the background load on the foreground threads.
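The core accounting behind this condition can be checked with a short sketch. It assumes a system of four dual-core nodes (inferred from the node and core numbers listed above) and assumes the two foreground threads run on core 0 of nodes 0 and 1 (read from the "0.0" and "1.0" prefixes of the Figure 7 case labels). The helper names are hypothetical.

```python
# Assumption: 4 nodes x 2 cores, inferred from the node/core numbers in the text.
NODES = 4
CORES_PER_NODE = 2

# (node, core) placements of the four background threads listed in the text.
BACKGROUND = [(0, 1), (1, 1), (2, 0), (3, 0)]
# Assumption: foreground threads on core 0 of nodes 0 and 1, read from the
# "0.0" and "1.0" prefixes of the Figure 7 case labels.
FOREGROUND = [(0, 0), (1, 0)]


def free_cores(nodes, cores_per_node, *placements):
    """Return the (node, core) slots not used by any thread placement."""
    used = set()
    for group in placements:
        for slot in group:
            if slot in used:
                raise ValueError(f"core {slot} assigned twice")
            used.add(slot)
    return [
        (n, c)
        for n in range(nodes)
        for c in range(cores_per_node)
        if (n, c) not in used
    ]


def subscription(nodes, cores_per_node, *placements):
    """Fraction of cores occupied by the given thread placements."""
    total = nodes * cores_per_node
    free = len(free_cores(nodes, cores_per_node, *placements))
    return (total - free) / total


if __name__ == "__main__":
    print(free_cores(NODES, CORES_PER_NODE, FOREGROUND, BACKGROUND))
    print(subscription(NODES, CORES_PER_NODE, FOREGROUND, BACKGROUND))
```

Under these assumptions, six of the eight cores are occupied and core 1 of nodes 2 and 3 stays free, which matches the statement that some cores remain free.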
As shown in Figure 7 and Figure 8 on page 28, under both low and very high loads and high subscription, we still observe that the worst performance scenario occurs when
[Chart: "LOW: Total Time for both threads". Vertical axis: 0 to 2.2 in steps of 0.2. Cases on the horizontal axis: 0.0.w.0 and 1.0.w.1 (0 Hops, 0 Hops); 0.0.w.1 and 1.0.w.3 (1 Hop, 1 Hop); 0.0.w.1 and 1.0.w.0 (1 Hop, 1 Hop). Bar annotations: 113%, 136%, 144%, "NO Xfire", "Xfire".]
Figure 7. Crossfire 1
Chapter 3 | Analysis and Recommendations | 27 |