40555 Rev. 3.00 June 2006 | Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ |
| ccNUMA Multiprocessor Systems |
[Bar chart: "VERY HIGH: Total Time for both threads". Vertical axis from 0 to 2.4 in steps of 0.2; bar labels 216%, 202%, 156%. Cases: 0.0.w.0 1.0.w.1 (0 Hop, 0 Hop); 0.0.w.1 1.0.w.3 (1 Hop, 1 Hop, no crossfire); 0.0.w.1 1.0.w.0 (1 Hop, 1 Hop, crossfire).]
Figure 9. Crossfire 1
In the no-crossfire case, the total memory bandwidth observed on the memory controller on node 3 is 4.5 GB/s, and several buffer queues on node 3 are saturated. For detailed analysis, refer to Section A.3 on page 42.
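The placement labels used in Figure 9 can be decoded mechanically. The following is a minimal sketch, assuming each label reads thread node, core, access type, memory node (so `0.0.w.1` would be a thread on node 0, core 0, writing to memory on node 1). It also assumes that "crossfire" means each thread accesses memory on the node where the other thread runs. The names `Placement`, `parse_placement`, and `is_crossfire` are hypothetical helpers introduced here for illustration.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    thread_node: int   # node the thread runs on
    core: int          # core within that node
    access: str        # access type, e.g. "r" or "w"
    memory_node: int   # node that holds the memory being accessed


def parse_placement(label: str) -> Placement:
    """Decode a label such as '0.0.w.1' (assumed order: node.core.access.memory)."""
    node, core, access, mem = label.split(".")
    return Placement(int(node), int(core), access, int(mem))


def is_crossfire(a: Placement, b: Placement) -> bool:
    """True when each thread targets memory on the node running the other thread."""
    return (a.thread_node != b.thread_node
            and a.memory_node == b.thread_node
            and b.memory_node == a.thread_node)


# The three label pairs that appear under the figure.
cases = [("0.0.w.0", "1.0.w.1"), ("0.0.w.1", "1.0.w.3"), ("0.0.w.1", "1.0.w.0")]
for first, second in cases:
    a, b = parse_placement(first), parse_placement(second)
    kind = "crossfire" if is_crossfire(a, b) else "no crossfire"
    print(f"{first} {second}: {kind}")
```

Under these assumptions only the last pair, `0.0.w.1` with `1.0.w.0`, is classified as crossfire.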
Thus, while, in general, all equal hop cases take equal time, there can be exceptions to this rule if some resources in the
3.4.2 Myth: Greater Hop Distance Always Means Slower Time.
As a general rule, a 2 hop case will be slower than a 1 hop case, which, in turn, will be slower than a 0 hop case, if the only change between the cases is thread and memory placement.
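The hop counts behind this general rule follow from the interconnect topology. The sketch below assumes a four-node system wired as a square (links 0-1, 0-2, 1-3, 2-3); that layout is an illustrative assumption rather than a statement about any particular platform, and `LINKS` and `hop_distance` are hypothetical names. It finds the minimum number of links between the node running a thread and the node holding its memory using a breadth-first search.

```python
from collections import deque

# Assumed topology: four nodes in a square. Replace LINKS to model a real system.
LINKS = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}


def hop_distance(src: int, dst: int, links=LINKS) -> int:
    """Return the minimum number of links between src and dst (breadth-first search)."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == dst:
            return hops
        for neighbor in links[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    raise ValueError(f"node {dst} is not reachable from node {src}")


# Thread on node 0 with memory placed on nodes 0, 1, and 3.
for thread_node, memory_node in [(0, 0), (0, 1), (0, 3)]:
    hops = hop_distance(thread_node, memory_node)
    print(f"thread on node {thread_node}, memory on node {memory_node}: {hops} hop(s)")
```

A hop count computed this way describes distance only; it says nothing about how loaded the links and memory controllers along the path are.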
For example, the synthetic test demonstrates how a given 0
0
Imagine yourself in the following situation: you are ready to check out at your favorite grocery store with a shopping cart full of groceries. Directly in front of you is a
Clearly most people would walk the 50 feet, suffer the latency and arrive at a
Chapter 3 | Analysis and Recommendations | 29 |