Chapter 6 Branch Optimizations 141
Software Optimization Guide for AMD64 Processors
25112 Rev. 3.06 September 2005
6.8 The LOOP Instruction
Optimization
Avoid using the LOOP instruction.
Application
This optimization applies to:
32-bit software
64-bit software
Rationale
The LOOP instruction has a latency of at least 8 cycles.
Example
Avoid code like this, which uses the LOOP instruction:
label:
    ...
    loop label
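Instead, replace the LOOP instruction with a DEC and a JNZ. Unlike LOOP, which leaves the flags untouched, DEC modifies the arithmetic flags (all except CF), so make sure no code relies on flag values carried across the end of the loop body: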
label:
    ...
    dec rcx        ; use ecx instead of rcx in 32-bit software
    jnz label
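
As a fuller illustration, the following is a minimal sketch of a counted loop built on the DEC/JNZ pair, wrapped in C so that it can be compiled and checked. It assumes GCC or Clang extended inline assembly on an x86-64 target, which by default uses AT&T operand order (source first) rather than the Intel order used in the listings above. The function name sum_dec_jnz and its parameters are hypothetical and are not part of the original guide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums n 64-bit values with a DEC/JNZ counted loop. */
uint64_t sum_dec_jnz(const uint64_t *values, size_t n)
{
    uint64_t total = 0;

    /* Like LOOP, a DEC/JNZ loop entered with a zero count would wrap
       around instead of exiting, so handle that case up front. */
    if (n == 0)
        return 0;

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ volatile(
        "1:\n\t"
        "add (%[p]), %[sum]\n\t"   /* total += *values            */
        "add $8, %[p]\n\t"         /* advance to the next element  */
        "dec %[cnt]\n\t"           /* decrement the loop counter   */
        "jnz 1b\n\t"               /* repeat while counter != 0    */
        : [sum] "+r"(total), [p] "+r"(values), [cnt] "+r"(n)
        :
        : "cc", "memory");
#else
    /* Portable fallback with the same behavior on other targets. */
    do {
        total += *values++;
    } while (--n != 0);
#endif

    return total;
}
```

The "cc" clobber tells the compiler that the flags are modified, which reflects the one behavioral difference from LOOP: DEC writes the arithmetic flags, while LOOP leaves them unchanged.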