Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems
40555 Rev. 3.00 June 2006

In addition, three background threads are running on nodes 1, 2 and 3.
Each of these background threads accesses data locally. The rate of memory demand of each of these threads is varied simultaneously from low to medium to high to very high, as shown in Table 1 on page 16. This makes it possible to study the impact of the background load on the performance of the two foreground threads.
Even with the background threads, there are still some free cores left in the system; this is called the highly subscribed condition.
As shown in Figures 12 and 13 below and in Figure 14 on page 33, as the load increases from low to medium to high, the advantage of having the memory for one of the writer threads one or two hops away diminishes.
[Bar chart: "LOW: Total Time for both threads". Vertical axis from 0 to 1.8 in steps of 0.2. Bar labels: 145%, 136%, 127%, 126%.]
Configurations plotted:
0.0.w.0 0.1.w.0 (0 Hops) (0 Hops)
0.0.w.0 0.1.w.1 (0 Hops) (1 Hop)
0.0.w.0 0.1.w.2 (0 Hops) (1 Hop)
0.0.w.0 0.1.w.3 (0 Hops) (2 Hops)
Figure 12. Both
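
The hop counts attached to the four configurations can be reproduced with a small sketch. It assumes a four-node square interconnect in which node 0 links directly to nodes 1 and 2 and reaches node 3 through either of them; this link layout is an assumption chosen to match the hop labels in the figure, not something stated in this section. It also assumes that the last field of each configuration label is the node holding the thread's memory and that both foreground threads run on node 0. The names LINKS and hops are illustrative only.

```python
from collections import deque

# Assumed link layout for a four-node square: 0-1, 0-2, 1-3, 2-3.
# This is an assumption chosen to match the hop labels in the figure.
LINKS = {0: (1, 2), 1: (0, 3), 2: (0, 3), 3: (1, 2)}


def hops(src, dst, links=LINKS):
    """Return the minimum number of links between two nodes (breadth-first search)."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"node {dst} is not reachable from node {src}")


# Assumed reading of the labels: the thread runs on node 0 and its
# memory sits on node m (the last field of the label).
for m in range(4):
    print(f"memory on node {m}: {hops(0, m)} hop(s)")
```

Under these assumptions the sketch gives 0, 1, 1 and 2 hops for memory on nodes 0, 1, 2 and 3, which matches the labels of the four configurations.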
32 | Analysis and Recommendations | Chapter 3 |