Software Optimization Guide for AMD64 Processors | 25112 Rev. 3.06 September 2005 |
REP string with low variable counts 168
unroll small loops 13
unrolling loops 145
M
memory
dynamic memory allocation 19
pushing memory data 157
MMX™ instructions
PANDN instruction 137
PREFETCHNTA/T0/T1/T2 instructions 105
MOVZX and MOVSX instructions 153
multiplication
by constant 164
multiplies over division,
N
Nonuniform Memory Access 96
O
operands
largest possible operand size, repeated string 168
P
parallelism 35
PF2ID instructions 52
pointers
dereferenced arguments 44
use
determining distance 108
multiple 107
PREFETCH and PREFETCHW instructions 104, 106, 108
prototypes 29
R
recursive functions 132
register reads and writes, partial 81
REP prefix 168
S
scalar code translated into 3DNow! code 138
scheduling 144
SHLD instruction 85
SHR instruction 85
SSE2 193, 355
stack
alignment considerations 122
string instructions 167
structure (struct) 41, 117, 119
subexpressions, explicitly extract common 37
superscalar processor 251
switch statement 25, 28, 33
U
W
write combining 113, 260,
X
XOR instruction 169