Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems
Application Note
Publication # 40555
Revision: 3.00
Issue Date: June 2006
Contents
Trademarks
© 2006 Advanced Micro Devices, Inc. All rights reserved.
Chapter 2 Experimental Setup
Chapter 3 Analysis and Recommendations
A.2.1 What Resources Are Used When a Single Read-Only or
List of Figures
Revision History
Chapter 1 Introduction
1.1 Related Documents
10 http://msdn2.microsoft.com/en-us/library/ms186255SQL.90.aspx
14 http://msdn2.microsoft.com/en-us/library/tt15eb9t.aspx
Chapter 2 Experimental Setup
2.1 System Used
Figure 1. Quartet Topology
Figure 2. Internal Resources Associated with a Quartet Node
2.2 Synthetic Test
Data Access Rate Qualifiers
2.3 Reading and Interpreting Test Graphs
2.3.1 X-Axis Display
2.3.2 Labels Used
2.3.3 Y-Axis Display
Chapter 3 Analysis and Recommendations
3.1 Scheduling Threads
3.1.1 Multiple Threads-Independent Data
3.1.2 Multiple Threads-Shared Data
3.1.3 Scheduling on a Non-Idle System
3.2 Data Locality Considerations
3.2.1 Keeping Data Local by Virtue of First Touch
afterwards no longer needs the data structure and if only one of the worker threads needs the data structure. In other words, the data structure is not truly shared between the worker threads.
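As a rough illustration of keeping data local through first touch, the sketch below has each worker thread pin itself to a CPU and then allocate and write its own buffer, rather than having a main thread set the data up in advance. It assumes Linux (where `os.sched_setaffinity` is available) and an operating system that places a page on the node of the thread that first touches it. The helper names `pin_to_cpu`, `touch_own_buffer`, and `run_workers` are hypothetical and are not taken from this application note.

```python
import os
import threading

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")  # bytes per memory page


def pin_to_cpu(cpu):
    """Try to pin the calling thread to one CPU; return True on success."""
    try:
        os.sched_setaffinity(0, {cpu})  # 0 means "the calling thread" on Linux
        return True
    except (AttributeError, OSError):
        # Not on Linux, or pinning is not permitted: continue unpinned.
        return False


def touch_own_buffer(cpu, size, results, index):
    """Worker body: pin first, then allocate and write the buffer locally."""
    pin_to_cpu(cpu)
    buf = bytearray(size)
    # Write one byte per page so this thread has written to every page.
    for offset in range(0, size, PAGE_SIZE):
        buf[offset] = 1
    results[index] = buf


def run_workers(size=1 << 20, max_workers=4):
    """Start one worker per allowed CPU (up to max_workers); return buffers."""
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))[:max_workers]
    else:
        cpus = [0]  # assumption: fall back to a single unpinned worker
    results = [None] * len(cpus)
    threads = [
        threading.Thread(target=touch_own_buffer, args=(cpu, size, results, i))
        for i, cpu in enumerate(cpus)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_workers()` returns one buffer per worker. The main thread never writes to a buffer before its worker does, which is the point of the arrangement.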
3.3 Avoid Cache Line Sharing
Threads access local data
3.4 Common Hop Myths Debunked
3.4.1 Myth: All Equal Hop Cases Take Equal Time
Threads firing at each other crossfire
Next, we increase the number of background threads to six, running on
3.4.2 Myth: Greater Hop Distance Always Means Slower Time
Both threads access memory on node
0 hop-0 hop case for the write-only threads
In addition, three background threads are running on nodes 1, 2 and
Medium Total Time for both threads write-write
High Total Time for both threads write-write
3.5 Locks
Chapter 4 Conclusions
Data placement tools are also useful when a thread needs more data than the physical memory available on a node. Some operating systems also allow data migration through these tools or APIs: data can be migrated from the node where it was first touched to the node where it is subsequently accessed. This migration has a cost, so it should not be used frequently. For additional details on the tools and APIs that various operating systems offer for thread and memory placement, refer to Section A.7 on page
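The trade-off described above can be made concrete with a simple break-even estimate. The sketch below is illustrative only: it assumes a constant cost per local access, a constant cost per remote access, and a one-time migration cost, all in the same arbitrary units. These parameters and the function names are hypothetical, and any numbers used with them are placeholders rather than measurements from this application note.

```python
import math


def migration_pays_off(accesses, remote_cost, local_cost, migration_cost):
    """Return True if migrating saves more than the migration itself costs.

    accesses: expected number of accesses after the migration
    remote_cost, local_cost: assumed cost of one remote / one local access
    migration_cost: assumed one-time cost of moving the data
    """
    saved = accesses * (remote_cost - local_cost)
    return saved > migration_cost


def break_even_accesses(remote_cost, local_cost, migration_cost):
    """Smallest access count at which migration yields a net gain.

    Returns None when remote access is not more expensive than local
    access, because migration can then never pay for itself.
    """
    saving_per_access = remote_cost - local_cost
    if saving_per_access <= 0:
        return None
    return math.floor(migration_cost / saving_per_access) + 1
```

For instance, with placeholder costs of 3 units per remote access, 2 units per local access, and 10 units for the migration, `break_even_accesses(3, 2, 10)` gives 11: data that will be accessed fewer times than that is better left where it is under this model.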
Appendix A
A.1 Description of the Buffer Queues
Figure 16. Internal Resources Associated with a Quartet Node
A.2.3 What Role Do Buffers Play in the Throughput Observed?
A.4 Why Is 0 Hop-0 Hop Case Slower Than the 0 Hop-1 Hop Case on an Idle System for Write-Only Threads?
A.5 Why Is 0 Hop-1 Hop Case Slower Than
Controlling Process and Thread Affinity
A.7.1 Support Under Linux
A.7.2 Support Under Solaris
A.7.3 Support Under Microsoft Windows
Controlling Memory Affinity
A.8.1 Support Under Linux
A.8.2 Support Under Solaris
A.8.3 Support Under Microsoft Windows
A.8.4 Node Interleaving Configuration in the BIOS