
22007E/0 — November 1999

AMD Athlon™ Processor x86 Code Optimization

Table 7. Sample 1 – Integer Register Operations

Instruction                      Decode  Decode  Clocks
Number       Instruction         Pipe    Type    1  2  3  4  5  6  7  8
1            IMUL EAX, ECX       0       VP      D  I  M  M  M  M
2            INC ESI             0       DP         D  I  E
3            MOV EDI, 0x07F4     1       DP         D  I  E
4            ADD EDI, EBX        2       DP         D     I  E
5            SHL EAX, 8          0       DP            D        I  E
6            OR EAX, 0x0F        1       DP            D           I  E
7            INC EBX             2       DP            D     I  E
8            ADD ESI, EDX        0       DP               D  I  E

Comments for Each Instruction Number

1. The IMUL is a VectorPath instruction. It cannot be decoded or paired with other operations and, therefore, dispatches alone in pipe 0. The multiply latency is four cycles.

2. The simple INC operation is paired with instructions 3 and 4. The INC executes in IEU0 in cycle 4.

3. The MOV executes in IEU1 in cycle 4.

4. The ADD operation depends on instruction 3. It executes in IEU2 in cycle 5.

5. The SHL operation depends on the multiply result (instruction 1). The MacroOP waits in a reservation station and is eventually scheduled to execute in cycle 7 after the multiply result is available.

6. This operation executes in cycle 8 in IEU1.

7. This simple operation has a resource contention for execution in IEU2 in cycle 5. Therefore, the operation does not execute until cycle 6.

8. The ADD operation executes immediately in IEU0 after dispatching.
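
The dependency and contention reasoning in the comments for Table 7 can be checked mechanically. The following Python sketch is an illustration only, not part of the AMD material: it transcribes the execute cycles from Table 7, derives read-after-write dependencies from the register operands, and confirms that every consumer executes after its producer and that no unit is used twice in one cycle. Several things here are assumptions: the names `TABLE_7`, `raw_dependencies`, and `schedule_problems` are hypothetical, the `"MUL"` label for the multiplier is made up for this sketch, instruction 5 is assumed to use IEU0 because it decodes in pipe 0 (the comments do not name its unit), and flag dependencies are ignored.

```python
# Each entry: (number, text, unit, execute cycles, registers read, registers written).
# Execute cycles are the E (or M) cells of Table 7.
# Assumptions: "MUL" is a hypothetical label for the multiplier, and
# instruction 5 is placed in IEU0 because it decodes in pipe 0.
TABLE_7 = [
    (1, "IMUL EAX, ECX",   "MUL",  (3, 4, 5, 6), {"EAX", "ECX"}, {"EAX"}),
    (2, "INC ESI",         "IEU0", (4,),         {"ESI"},        {"ESI"}),
    (3, "MOV EDI, 0x07F4", "IEU1", (4,),         set(),          {"EDI"}),
    (4, "ADD EDI, EBX",    "IEU2", (5,),         {"EDI", "EBX"}, {"EDI"}),
    (5, "SHL EAX, 8",      "IEU0", (7,),         {"EAX"},        {"EAX"}),
    (6, "OR EAX, 0x0F",    "IEU1", (8,),         {"EAX"},        {"EAX"}),
    (7, "INC EBX",         "IEU2", (6,),         {"EBX"},        {"EBX"}),
    (8, "ADD ESI, EDX",    "IEU0", (6,),         {"ESI", "EDX"}, {"ESI"}),
]


def raw_dependencies(table):
    """Map each instruction number to the earlier instructions whose results it reads."""
    last_writer = {}  # register name -> number of the most recent writer
    deps = {}
    for num, _text, _unit, _cycles, reads, writes in table:
        producers = {last_writer[reg] for reg in reads if reg in last_writer}
        if producers:
            deps[num] = producers
        for reg in writes:
            last_writer[reg] = num
    return deps


def schedule_problems(table):
    """Return a list of violations: early consumers or doubly booked units."""
    cycles = {num: cyc for num, _t, _u, cyc, _r, _w in table}
    problems = []
    # A consumer must start executing after its producer's last execute cycle.
    for consumer, producers in raw_dependencies(table).items():
        for producer in sorted(producers):
            if min(cycles[consumer]) <= max(cycles[producer]):
                problems.append(
                    f"instruction {consumer} executes before {producer} finishes"
                )
    # A unit can execute only one operation per cycle.
    busy = {}  # (unit, cycle) -> instruction number
    for num, _text, unit, cyc, _r, _w in table:
        for cycle in cyc:
            if (unit, cycle) in busy:
                problems.append(
                    f"{unit} is used by {busy[(unit, cycle)]} and {num} in cycle {cycle}"
                )
            busy[(unit, cycle)] = num
    return problems


if __name__ == "__main__":
    print(raw_dependencies(TABLE_7))
    print(schedule_problems(TABLE_7))
```

With the data as transcribed, the derived dependencies are 4 on 3, 5 on 1, 6 on 5, and 8 on 2, and `schedule_problems` returns an empty list. Moving instruction 7 into cycle 5 in this model would report the IEU2 contention that comment 7 describes.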

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Execution Unit Resources
