Software Optimization Guide for AMD64 Processors

25112 Rev. 3.06 September 2005

AMD OpteronTM

Dual-Core Processor 0

Core 0

Core 1

 

 

Memory

HyperTransportTM

AMD OpteronTM

Dual-Core Processor 1

Core 2

Core 3

 

 

Memory

Figure 3. Dual-Core AMD Opteron™ Processor Configuration

OS Implications

An operating system running on an AMD Opteron platform will coordinate and manage the memory configuration so that an application does not have to be aware of this memory configuration. Thanks to the OS, the platform will simply appear to have one contiguous block of memory regardless of how many processors are in the platform.

Because of the difference in latencies in ccNUMA systems, the OS must make determinations that enable the best performance. It would be undesirable, for example, to spawn a thread on a processor while allocating the memory space for that thread on a different processor. For such reasons, it is important to be aware of the capabilities of the OS being used. Microsoft's Windows Server 2003 products are ccNUMA aware. The SUSE distribution of 64-bit Linux also has a ccNUMA aware kernel for AMD64 processors.

Windows applications that spawn several threads, where each thread operates on largely independent

data, might benefit from distributing those threads across several processors and allocating memory locally for each thread. This can be accomplished by using the SetThreadAffinityMask( ) function and by allocating memory blocks using VirtualAlloc( ) from within the thread that will be heavily

accessing that memory block. Memory is not actually committed until it is accessed and then it is

committed to the node that accesses it. For this reason, it is a good idea to initialize that memory block using memset() or other code which causes all the pages in the block to be accessed if there is a

chance another node could access it first. See the Microsoft documentation on MSDN for more details (search for SetThreadAffinityMask( )).

98

Cache and Memory Optimizations

Chapter 5

Page 114
Image 114
AMD 250 manual OS Implications, Dual-Core AMD Opteron Processor Configuration