25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

    movdqa [rax+96], xmm6
    movdqa [rax+112], xmm7
    add    rax, 128          ; Bump up by 2 cache lines
    add    rdi, 128          ; for source and destination.
    dec    rcx
    jnz    Block_WriteToFrameBuffer

ChunkOfImageCopied:

/* Set up for next block in image (if necessary) */
/* until image is transferred. */

D.4 Memory Optimizations

AGP memory is system memory that is partitioned from the same memory that the operating system and applications use. The AGP card plugged into the AGP bus is always considered the master when performing AGP memory accesses since it reads and writes the system memory. The AGP card uses AGP memory for a variety of “surfaces,” including:

Texture maps

3-D object geometry and vertex data streams

Command buffers for 2-D and 3-D graphics engines

Video-capture buffers

Frame buffer (cost-reduced implementations)

The system memory used for AGP mastering is attached to the processor that has one of its HyperTransport links connected to an AGP tunnel device, such as the AMD-8151 HyperTransport AGP 3.0 graphics tunnel. AGP card requests (reads/writes) come into the processor through the HyperTransport link input and are arbitrated with processor requests for system memory in the system request queue (SRQ). From here, the AGP request address is passed into the processor’s address map and GART (graphics aperture remapping table), where the AGP physical address is translated into a physical DRAM page address, which can then be presented to the processor’s memory controller. Therefore, host processor to system memory throughput directly affects AGP memory bandwidth and throughput, as the two compete for SRQ entries and memory bandwidth. Figure 10 shows the command flow from the HyperTransport links to the SRQ.
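The address-translation step described above can be pictured with a small software model. The following is only a sketch under stated assumptions: the structure `gart_model`, the function `gart_translate`, the 4-KB page size, and the entry layout (DRAM page base in the upper bits, a valid flag in bit 0) are all hypothetical names and formats chosen for illustration. They do not describe the processor's actual GART entry format or registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GART_PAGE_SHIFT 12u                        /* assumption: 4-KB pages */
#define GART_PAGE_SIZE  (1ull << GART_PAGE_SHIFT)
#define GART_PTE_VALID  1ull                       /* hypothetical valid flag (bit 0) */

/* Hypothetical software model of a graphics aperture; not the hardware layout. */
typedef struct {
    uint64_t        aperture_base;  /* first address covered by the aperture */
    size_t          num_pages;      /* number of entries in table[] */
    const uint64_t *table;          /* entry = DRAM page base | GART_PTE_VALID */
} gart_model;

/* Translate an address inside the aperture to a DRAM address.
   Returns false if the address lies outside the aperture or the
   entry for its page is not marked valid. */
static bool gart_translate(const gart_model *g, uint64_t agp_addr,
                           uint64_t *dram_addr)
{
    assert(g != NULL && g->table != NULL && dram_addr != NULL);

    if (agp_addr < g->aperture_base)
        return false;

    uint64_t offset = agp_addr - g->aperture_base;
    uint64_t index  = offset >> GART_PAGE_SHIFT;   /* which aperture page */
    if (index >= g->num_pages)
        return false;

    uint64_t entry = g->table[index];
    if ((entry & GART_PTE_VALID) == 0)
        return false;

    /* Keep the page base from the table entry, then add the byte
       offset within the page from the incoming address. */
    uint64_t page_base = entry & ~(GART_PAGE_SIZE - 1);
    *dram_addr = page_base | (offset & (GART_PAGE_SIZE - 1));
    return true;
}
```

In this model, consecutive aperture pages may map to DRAM pages that are not contiguous, which is why each request needs a table lookup before it can be presented to the memory controller.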

Appendix D

AGP Considerations
