AMD 64 manual Keeping Data Local by Virtue of first Touch, Analysis and Recommendations


Performance Guidelines for AMD Athlon™ 64 and AMD Opteron™ ccNUMA Multiprocessor Systems

40555 Rev. 3.00 June 2006

 

[Bar chart. Vertical axis: Time for write, 0 to 1.8. Bars, left to right: 0.0.w.0 (0 hop), 0.0.w.1 (1 hop), 0.0.w.2 (1 hop), 0.0.w.3 (2 hop). Bar labels in the extracted text: 113%, 127%, 129%, 149%.]

Figure 5. Write-Only Thread Running on Node 0, Accessing Data from 0, 1 and 2 Hops Away on an Idle System

In this test case, a write access is similar to a read access as far as the coherent HyperTransport™ link traffic or the memory traffic generated, except for certain key differences. A write access brings data into the cache much like a read and then modifies it in the cache. However, in this particular synthetic test case, there are several successive write accesses to sequential cache line elements in a 64-MB array. This results in a steady-state condition of cache line evictions, or write-backs, for each write access. This increases the memory and HyperTransport traffic that normally occurs for a write-only thread to almost twice that of a read-only thread. For our test bench, when a thread does local read-only accesses, it generates a memory bandwidth load of 1.64 GB/s, and when a thread performs local write-only accesses, it generates a memory bandwidth load of 2.98 GB/s. As a result, writes not only take longer than reads at any given hop distance, but also slow down more quickly as the hop distance grows.
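The access pattern described above can be sketched in C roughly as follows. This is a minimal illustration, not the benchmark used for the measurements: the function names `write_sweep` and `read_sweep`, the 64-byte cache line size, and the choice of one 64-bit access per line are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. The text does not state the line size. */
#define CACHE_LINE_BYTES 64

/*
 * Write-only sweep: store one 64-bit word into each successive cache line.
 * Each store first brings the line into the cache and then dirties it, so
 * once the cache is full every new line also forces a write-back of an
 * older dirty line. Returns the number of cache lines written.
 */
size_t write_sweep(volatile uint64_t *buf, size_t bytes, uint64_t value)
{
    const size_t stride = CACHE_LINE_BYTES / sizeof(uint64_t);
    const size_t words = bytes / sizeof(uint64_t);
    size_t lines = 0;

    for (size_t i = 0; i < words; i += stride) {
        buf[i] = value;
        lines++;
    }
    return lines;
}

/*
 * Read-only sweep over the same addresses: lines are brought into the
 * cache but never dirtied, so evictions need no write-back. Returns the
 * sum of the words read so the loads have an observable result.
 */
uint64_t read_sweep(const volatile uint64_t *buf, size_t bytes)
{
    const size_t stride = CACHE_LINE_BYTES / sizeof(uint64_t);
    const size_t words = bytes / sizeof(uint64_t);
    uint64_t sum = 0;

    for (size_t i = 0; i < words; i += stride)
        sum += buf[i];
    return sum;
}
```

Timing each sweep over a buffer much larger than the cache (the text uses a 64-MB array) would give a rough per-sweep cost for comparison. The absolute numbers depend on the system and need not match the figures quoted above.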

3.2.1 Keeping Data Local by Virtue of First Touch

In order to keep data local, it is recommended that the following principles be observed.

As long as a thread initializes the data it needs (writes to it for the first time) and does not rely on any other thread to perform the initialization, a ccNUMA-aware OS keeps data local on the node where the thread runs. This policy of keeping data local by writing to it for the first time is known as the local allocation policy by virtue of first touch. This is the default policy used by a ccNUMA-aware OS.
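A minimal sketch of this principle with POSIX threads might look like the following. The names `slice_t`, `init_then_use` and `first_touch_sum` are hypothetical and chosen only for this example. The sketch assumes that the allocation returns pages that have not yet been touched (typical for large allocations, but not guaranteed) and that the OS keeps each thread on the node where it started. Thread pinning is platform-specific and is left out.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* One contiguous slice of the shared array, owned by a single thread. */
typedef struct {
    double *base;
    size_t begin, end;
    double sum;
} slice_t;

/* Each worker initializes (first-touches) its own slice, then uses it. */
static void *init_then_use(void *arg)
{
    slice_t *s = arg;
    double sum = 0.0;

    for (size_t i = s->begin; i < s->end; i++)
        s->base[i] = 1.0;               /* first touch by the owning thread */
    for (size_t i = s->begin; i < s->end; i++)
        sum += s->base[i];              /* later accesses by the same thread */
    s->sum = sum;
    return NULL;
}

/*
 * Allocates n doubles without writing to them, lets nthreads workers
 * initialize and sum their own slices, and returns the total.
 * Returns -1.0 on invalid arguments or if a resource cannot be obtained.
 */
double first_touch_sum(size_t n, size_t nthreads)
{
    if (n == 0 || nthreads == 0)
        return -1.0;

    double *data = malloc(n * sizeof *data);   /* not initialized here */
    pthread_t *tid = malloc(nthreads * sizeof *tid);
    slice_t *slice = malloc(nthreads * sizeof *slice);
    double total = -1.0;

    if (data && tid && slice) {
        size_t chunk = n / nthreads;
        size_t started = 0;
        double sum = 0.0;

        for (size_t t = 0; t < nthreads; t++) {
            slice[t].base = data;
            slice[t].begin = t * chunk;
            slice[t].end = (t + 1 == nthreads) ? n : (t + 1) * chunk;
            slice[t].sum = 0.0;
            if (pthread_create(&tid[t], NULL, init_then_use, &slice[t]) != 0)
                break;
            started++;
        }
        for (size_t t = 0; t < started; t++) {
            pthread_join(tid[t], NULL);
            sum += slice[t].sum;
        }
        if (started == nthreads)
            total = sum;
    }
    free(slice);
    free(tid);
    free(data);
    return total;
}
```

The point of the design is that the allocating thread never writes to the array. Had it zeroed or filled the whole array itself, the first touch for every page would have happened on its node instead of on the workers' nodes.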

A ccNUMA-aware OS ensures local allocation by taking a page fault at the time of the first touch of the data. When the page fault occurs, the OS maps the virtual pages associated with the data to zeroed-out physical pages. The data is now resident on the node where the first touch occurred, and any subsequent accesses to the data will have to be serviced from that node.
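The page-fault mechanism can be observed from user space with a small sketch such as the one below. It assumes a Linux-like system that provides `mmap` with `MAP_ANONYMOUS` and reports minor page faults through `getrusage`. The function name `touch_fresh_pages` is hypothetical. The fault count is reported for information only, since the exact number depends on the OS and its configuration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Maps npages of fresh anonymous memory, writes one byte to each page
 * (the first touch), and checks that the rest of each page reads as zero.
 * Returns the number of pages that were zero-filled, or -1 on error.
 * If minor_faults is non-NULL, it receives the number of minor page
 * faults the process took while touching the pages.
 */
long touch_fresh_pages(size_t npages, long *minor_faults)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return -1;

    size_t len = npages * (size_t)page;
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    long zeroed = 0;
    for (size_t i = 0; i < npages; i++) {
        volatile unsigned char *pg = p + i * (size_t)page;
        int clean = 1;

        pg[0] = 1;                      /* first touch of this page */
        for (long b = 1; b < page; b++)
            if (pg[b] != 0)
                clean = 0;              /* would mean the page was not zeroed */
        zeroed += clean;
    }

    getrusage(RUSAGE_SELF, &after);
    if (minor_faults)
        *minor_faults = after.ru_minflt - before.ru_minflt;

    munmap((void *)p, len);
    return zeroed;
}
```

Calling `touch_fresh_pages(16, &faults)` on such a system would be expected to return 16. This check shows only that pages are zero-filled when first touched. It does not show which node backs them.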

