Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems
40555 Rev. 3.00 June 2006
[Figure 8: bar chart titled "VERY HIGH: Total Time for both threads". Vertical axis runs from 0 to 2.2. Horizontal axis lists three thread configurations: "0.0.w.0 1.0.w.1 (0 Hops) (0 Hops)", "0.0.w.1 1.0.w.3 (1 Hops) (1 Hops)", and "0.0.w.1 1.0.w.0 (1 Hops) (1 Hops)". Bar annotations: "0 Hop", "1 Hop", "NO Xfire", "Xfire". Data labels: 158%, 186%, 195%.]
Figure 8. Crossfire 1
Next, we increase the number of background threads to six, running on:
•Node 0 (Core 1)
•Node 1 (Core 1)
•Node 2 (Cores 0 and 1)
•Node 3 (Cores 0 and 1)
Each of these background threads reads a local 64 MB array, and the rate of memory demand of each thread is very high. A very high rate of memory demand means that each background thread demands a memory bandwidth of 4 GB/s, as shown in Table 1 on page 16.
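As a rough check on the load this places on each node's memory, the following sketch tallies the background demand per node. It uses only the placement listed above and the 4 GB/s figure per thread. The names `BACKGROUND_THREADS` and `demand_per_node` are illustrative and do not come from the original test code.

```python
from collections import Counter

# (node, core) placement of the six background threads, as listed in the text.
BACKGROUND_THREADS = [(0, 1), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]

# "Very high" demand per background thread, in GB/s (figure taken from the text).
DEMAND_PER_THREAD_GBPS = 4


def demand_per_node(threads, per_thread_gbps):
    """Sum the memory demand of the background threads on each node.

    Each background thread reads a local array, so its demand is charged
    to the node on which it runs.
    """
    counts = Counter(node for node, _core in threads)
    return {node: n * per_thread_gbps for node, n in sorted(counts.items())}


per_node = demand_per_node(BACKGROUND_THREADS, DEMAND_PER_THREAD_GBPS)
total = sum(per_node.values())
print(per_node)  # {0: 4, 1: 4, 2: 8, 3: 8}
print(total)     # 24
```

Under these figures, nodes 2 and 3 each carry 8 GB/s of local background demand, nodes 0 and 1 each carry 4 GB/s, and the aggregate background demand is 24 GB/s.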
No free cores are left in the system. This is the fully subscribed condition.
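The fully subscribed condition can be checked with a short sketch. It assumes a system of four nodes with two cores each, and it assumes that the two foreground threads occupy Core 0 of Node 0 and Core 0 of Node 1. Both assumptions are inferred from the thread lists and figure labels, and the helper names are hypothetical.

```python
from itertools import product

NODES = range(4)           # assumption: four nodes, numbered 0 to 3
CORES_PER_NODE = range(2)  # assumption: two cores per node, numbered 0 and 1

# Assumed placement of the two foreground threads, as (node, core).
FOREGROUND = {(0, 0), (1, 0)}

# Placement of the six background threads, as listed in the text.
BACKGROUND = {(0, 1), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)}


def free_cores(foreground, background):
    """Return the sorted list of (node, core) pairs that run no thread."""
    all_cores = set(product(NODES, CORES_PER_NODE))
    return sorted(all_cores - (foreground | background))


def is_fully_subscribed(foreground, background):
    """True when every core runs exactly one thread."""
    no_overlap = not (foreground & background)
    return no_overlap and not free_cores(foreground, background)


print(free_cores(FOREGROUND, BACKGROUND))           # []
print(is_fully_subscribed(FOREGROUND, BACKGROUND))  # True
```

Removing any background thread from the set leaves a free core, so the system is no longer fully subscribed.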
As shown in Figure 9 on page 29, when the background load and the level of subscription are increased to the maximum possible, the no-crossfire case becomes slower than the crossfire case.
28 | Analysis and Recommendations | Chapter 3 |