Support under Solaris, Support under Microsoft Windows

40555 Rev. 3.00 June 2006	Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™
	ccNUMA Multiprocessor Systems

Controlling Memory Affinity

Both numactl and libnuma library functions can be used to set memory affinity[5]. Memory affinity set by tools like numactl applies to all the data accessed by the entire program (including child processes). Memory affinity set by libnuma or other library functions can be made to apply only to

specific data as determined by the program.

Both numactl and the libnuma API can be used to set a preferred memory affinity instead of forcibly binding it. In this case the binding specified is a hint to the OS; the OS may choose not to adhere to it.

At a high level, normal first touch binding, explicit binding and preferred binding are all available as memory policies on Linux.

By default, when none of the tools/API is used, Linux uses the first touch binding policy for all data. Once memory is bound, either by the OS, or by using the tools/API, the memory will normally remain resident on that node for its lifetime.

A.7.2 Support under Solaris

Sun Solaris provides several tools and API's for influencing thread/process and memory affinity[6].

Solaris provides a command line tool called pbind to set process affinity. There is also a shared library called liblgrp that provides an API that a program can call to set thread affinity.

Solaris provides a memory placement API to affect memory placement. A program can call the madvise( ) function to provide hints to the OS as to the memory policy to use. This API does not

allow binding of memory to an explicit node or set of nodes specified on the command line or in the program. But there are several policies other than the first touch policy that can be used.

For example, a thread can use madvise to migrate the data it needs to the node where it runs, instead of leaving it on a different node, on which it was first touched by another thread. There is, naturally, a cost associated with the migration.

Solaris provides a library called madv.so.1 which can interpose on memory allocation system calls and call the madvise function internally for the memory policy.

By default, Solaris uses the first touch binding policy for data that is not shared. Once memory is bound to a node it normally remains resident on that node for its lifetime.

Sun is also working on supporting several command line tools to control thread and memory placement. These are expected to be integrated in the upcoming versions of Solaris, but experimental versions are currently available[7].

A.7.3 Support under Microsoft® Windows®

In the Microsoft Windows environment, the function to bind a thread on particular core or cores is SetThreadAffinityMask( ). The function to run all threads in a process on particular core or cores is SetProcessAffinityMask( )[8].

Appendix A

64 specifications

AMD64 is a 64-bit architecture developed by Advanced Micro Devices (AMD) as an extension of the x86 architecture. Introduced in the early 2000s, it aimed to offer enhanced performance and capabilities to powering modern computing systems. One of the main features of AMD64 is its ability to address a significantly larger amount of memory compared to its 32-bit predecessors. While the old x86 architecture was limited to 4 GB of RAM, AMD64 can theoretically support up to 16 exabytes of memory, making it ideal for applications requiring large datasets, such as scientific computing and complex simulations.

Another key characteristic of AMD64 is its support for backward compatibility. This means that it can run existing 32-bit applications seamlessly, allowing users to upgrade their hardware without losing access to their existing software libraries. This backward compatibility is achieved through a mode known as Compatibility Mode, enabling users to benefit from both newer 64-bit applications and older 32-bit applications.

AMD64 also incorporates several advanced technologies to optimize performance. One such technology is the support for multiple cores and simultaneous multithreading (SMT). This allows processors to handle multiple threads concurrently, improving overall performance, especially in multi-tasking and multi-threaded applications. With the rise of multi-core processors, AMD64 has gained traction in both consumer and enterprise markets, providing users with an efficient computing experience.

Additionally, AMD64 supports advanced vector extensions (AVX), which enhance the capability of processors to perform single instruction, multiple data (SIMD) operations. This is particularly beneficial for tasks such as video encoding, scientific simulations, and cryptography, allowing these processes to be executed much faster, thereby increasing overall throughput.

Security features are also integrated within AMD64 architecture. Technologies like AMD Secure Execution and Secure Memory Encryption help protect sensitive data and provide an enhanced security environment for virtualized systems.

In summary, AMD64 is a powerful and versatile architecture that extends the capabilities of x86, offering enhanced memory addressing, backward compatibility, multi-core processing, vector extensions, and robust security features. These innovations have positioned AMD as a strong competitor in the computing landscape, catering to the demands of modern users and applications. The continuous evolution of AMD64 technology demonstrates AMD's commitment to pushing the boundaries of computing performance and efficiency.

Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™

Controlling Memory Affinity

A.7.2 Support under Solaris

A.7.3 Support under Microsoft® Windows®

Appendix A

64 specifications