Cycle Timings and Interlock Behavior
ARM DDI 0363E Copyright ©2009 ARM Limited. All rights reserved. 14-11
ID013010 Non-Confidential, Unrestricted Access
14.6 Sum of Absolute Differences (SAD)

Table 14-7 shows SAD instructions and gives their cycle timing behavior.
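
For orientation, the following is a minimal behavioral sketch of what the two SAD instructions compute, written in Python. It is not taken from this manual. The function names `usad8` and `usada8` are hypothetical helpers that mirror the instruction mnemonics, and registers are modeled as unsigned 32-bit integers. The sketch says nothing about timing.

```python
def usad8(rn: int, rm: int) -> int:
    """Sum of absolute differences of the four unsigned bytes in rn and rm."""
    total = 0
    for shift in (0, 8, 16, 24):
        a = (rn >> shift) & 0xFF   # unsigned byte lane from the first operand
        b = (rm >> shift) & 0xFF   # matching byte lane from the second operand
        total += abs(a - b)
    return total                   # at most 4 * 255, so it always fits in 32 bits


def usada8(rn: int, rm: int, ra: int) -> int:
    """Same sum, added to the accumulate value ra, wrapped to 32 bits."""
    return (usad8(rn, rm) + ra) & 0xFFFFFFFF


print(usad8(0x01020304, 0x04030201))       # 3 + 1 + 1 + 3 = 8
print(usada8(0x01020304, 0x04030201, 10))  # 8 + 10 = 18
```

The accumulate operand `ra` is the operand that footnote a in Table 14-7 refers to.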

14.6.1 Example interlocks

Table 14-8 shows interlock examples using USAD8 and USADA8 instructions.

Table 14-7 Sum of absolute differences instruction timing behavior

Instructions   Cycles   Early Reg    Result latency
USAD8          1        <Rn>, <Rm>   2 (a)
USADA8         1        <Rn>, <Rm>   2 (a)

a. Result latency is one fewer if the destination is the accumulate for a subsequent USADA8.

Table 14-8 Example interlocks

Instruction sequence   Behavior

USAD8 R1,R2,R3         Takes three cycles because USAD8 has a Result Latency of
ADD R5,R6,R1           two, and the ADD requires the result of the USAD8
                       instruction.

USAD8 R1,R2,R3         Takes three cycles. The MOV instruction is scheduled
MOV R9,R9              during the Result Latency of the USAD8 instruction.
ADD R5,R6,R1

USAD8 R1,R2,R3         Takes two cycles. The Result Latency is one less because
USADA8 R1,R4,R5,R1     the result is used as the accumulate for a subsequent
                       USADA8 instruction.
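
The cycle counts in Table 14-8 can be reproduced with a small timing model. The following Python sketch is illustrative only and is not a description of the processor pipeline. It assumes single issue in program order, one issue cycle per instruction, and a Result Latency of one for ADD and MOV (that value is an assumption of this sketch and is not stated in this section). It ignores the Early Reg column, which none of the three examples exercise. The names `Instr`, `RESULT_LATENCY`, and `count_cycles` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Result latencies in cycles. USAD8 and USADA8 come from Table 14-7.
# The ADD and MOV entries are assumptions made for this sketch.
RESULT_LATENCY = {"USAD8": 2, "USADA8": 2, "ADD": 1, "MOV": 1}
SAD_OPS = ("USAD8", "USADA8")


@dataclass
class Instr:
    op: str
    dest: str
    srcs: Tuple[str, ...]        # ordinary source registers
    acc: Optional[str] = None    # accumulate register (USADA8 only)


def count_cycles(seq):
    """Return the cycle in which the last instruction of seq issues."""
    ready = {}   # register -> (first cycle a consumer may issue, producer op)
    issue = 0    # issue cycle of the previous instruction
    for ins in seq:
        earliest = issue + 1
        for reg in ins.srcs:
            if reg in ready:
                earliest = max(earliest, ready[reg][0])
        if ins.acc is not None and ins.acc in ready:
            avail, producer = ready[ins.acc]
            # Footnote a: one cycle less when a SAD result feeds the
            # accumulate operand of a subsequent USADA8.
            if ins.op == "USADA8" and producer in SAD_OPS:
                avail -= 1
            earliest = max(earliest, avail)
        issue = earliest
        ready[ins.dest] = (issue + RESULT_LATENCY[ins.op], ins.op)
    return issue


usad8 = Instr("USAD8", "R1", ("R2", "R3"))
add = Instr("ADD", "R5", ("R6", "R1"))
mov = Instr("MOV", "R9", ("R9",))
usada8 = Instr("USADA8", "R1", ("R4", "R5"), acc="R1")

print(count_cycles([usad8, add]))        # 3
print(count_cycles([usad8, mov, add]))   # 3
print(count_cycles([usad8, usada8]))     # 2
```

In the second sequence the MOV occupies the cycle that would otherwise be an interlock, which is why the total stays at three cycles.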