User’s Manual

IBM PowerPC 750GX and 750GL RISC Microprocessor

Table 6-9 shows load-and-store instruction latencies. For pipelined load/store instructions, the Cycles column gives the total latency and the throughput, both in cycles, separated by a colon (latency:throughput).
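The latency:throughput notation can be read mechanically. The following is a minimal sketch, assuming a Cycles entry is a plain string such as "2:1" with any footnote marker already removed; the names Cycles and parse_cycles are illustrative and do not come from the manual.

```python
from typing import NamedTuple


class Cycles(NamedTuple):
    """One Cycles entry from the table (hypothetical helper type)."""
    latency: int     # cycles of total latency for a single instruction
    throughput: int  # throughput cycles for back-to-back instructions


def parse_cycles(entry: str) -> Cycles:
    """Split a 'latency:throughput' entry into its two integer parts."""
    latency_text, sep, throughput_text = entry.strip().partition(":")
    if not sep:
        raise ValueError(f"expected 'latency:throughput', got {entry!r}")
    return Cycles(int(latency_text), int(throughput_text))


# Example: the entry listed for lbz.
lbz_cycles = parse_cycles("2:1")
```

Here lbz_cycles.latency is 2 and lbz_cycles.throughput is 1, matching the 2:1 entry for lbz.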

Table 6-9. Load-and-Store Instructions

(Page 1 of 4)

Instruction                                     Mnemonic  Primary Opcode  Extended Opcode  Unit  Cycles     Serialization
Data Cache Block Flush                          dcbf      31              86               LSU   3:5 [1]    Execution
Data Cache Block Invalidate                     dcbi      31              470              LSU   3:3 [1]    Execution
Data Cache Block Store                          dcbst     31              54               LSU   3:5 [1]    Execution
Data Cache Block Touch                          dcbt      31              278              LSU   2:1        -
Data Cache Block Touch for Store                dcbtst    31              246              LSU   2:1        -
Data Cache Block Set to Zero                    dcbz      31              1014             LSU   3:6 [1,2]  Execution
External Control In Word Indexed                eciwx     31              310              LSU   2:1        -
External Control Out Word Indexed               ecowx     31              438              LSU   2:1        -
Instruction Cache Block Invalidate              icbi      31              982              LSU   3:4 [1]    Execution
Load Byte and Zero                              lbz       34              -                LSU   2:1        -
Load Byte and Zero with Update                  lbzu      35              -                LSU   2:1        -
Load Byte and Zero with Update Indexed          lbzux     31              119              LSU   2:1        -
Load Byte and Zero Indexed                      lbzx      31              87               LSU   2:1        -
Load Floating-Point Double                      lfd       50              -                LSU   2:1        -
Load Floating-Point Double with Update          lfdu      51              -                LSU   2:1        -
Load Floating-Point Double with Update Indexed  lfdux     31              631              LSU   2:1        -
Load Floating-Point Double Indexed              lfdx      31              599              LSU   2:1        -
Load Floating-Point Single                      lfs       48              -                LSU   2:1        -
Load Floating-Point Single with Update          lfsu      49              -                LSU   2:1        -
Load Floating-Point Single with Update Indexed  lfsux     31              567              LSU   2:1        -

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

1. For cache operations, the first number indicates the latency in finishing a single instruction; the second indicates the throughput for back-to-back cache operations. Throughput might be larger than the initial latency, as more cycles might be needed to complete the instruction to the cache, which stays busy, keeping subsequent cache operations from executing.

2. The throughput number of six cycles for dcbz assumes it is to nonglobal (M = 0) address space. For global address space, throughput is at least 11 cycles.

3. Load/store multiple/string instruction cycles are represented as a fixed number of cycles plus a variable number of cycles, where n is the number of words accessed by the instruction.
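Notes 1 and 2 can be turned into a rough cycle estimate for a run of back-to-back cache operations. The sketch below assumes a simple pipeline model that the manual does not state explicitly: the first operation finishes after its latency, and each further operation adds one throughput interval. The function name, the run length of 8, and the reuse of the 3-cycle dcbz latency for global address space are assumptions made for illustration; the 11-cycle figure is only the lower bound given in note 2.

```python
def estimate_cycles(latency: int, throughput: int, count: int) -> int:
    """Estimate cycles for `count` back-to-back operations.

    Assumed model (not from the manual): latency + (count - 1) * throughput.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    return latency + (count - 1) * throughput


DCBZ_LATENCY = 3                 # from the 3:6 table entry
DCBZ_THROUGHPUT_NONGLOBAL = 6    # note 2: nonglobal (M = 0) address space
DCBZ_THROUGHPUT_GLOBAL_MIN = 11  # note 2: "at least 11 cycles" (lower bound)

RUN_LENGTH = 8  # hypothetical number of consecutive dcbz instructions

nonglobal_estimate = estimate_cycles(
    DCBZ_LATENCY, DCBZ_THROUGHPUT_NONGLOBAL, RUN_LENGTH)
# Assumes the global case keeps the same 3-cycle latency; result is a minimum.
global_minimum = estimate_cycles(
    DCBZ_LATENCY, DCBZ_THROUGHPUT_GLOBAL_MIN, RUN_LENGTH)
```

Under this model, eight consecutive dcbz instructions take about 3 + 7 × 6 = 45 cycles in nonglobal address space and at least 3 + 7 × 11 = 80 cycles in global address space. Measured timing can differ, since the model ignores cache misses and other activity in the LSU.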

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Instruction Timing

 

 

 

 

 

 

gx_06.fm.(1.2)

Page 244 of 377

 

 

 

 

 

 

March 27, 2006