Table 6-9, Load-and-store instructions, Table 6-9 l – IBM POWERPC 750GL User Manual

Page 244

Advertising
background image


User’s Manual
IBM PowerPC 750GX and 750GL RISC Microprocessor

Instruction Timing

Page 244 of 377

gx_06.fm.(1.2)

March 27, 2006

Table 6-9 shows load-and-store instruction latencies. Pipelined load/store instructions are shown with cycles
of total latency and throughput cycles separated by a colon.

Table 6-9. Load-and-Store Instructions

(Page 1 of 4)

Instruction

Mnemonic

Primary
Opcode

Extended

Opcode

Unit

Cycles

Serialization

Data Cache Block Flush

dcbf

31

86

LSU

3:5

1

Execution

Data Cache Block

Invalidate

dcbi

31

470

LSU

3:3

1

Execution

Data Cache Block Store

dcbst

31

54

LSU

3:5

1

Execution

Data Cache Block

Touch

dcbt

31

278

LSU

2:1

Data Cache Block

Touch for Store

dcbtst

31

246

LSU

2:1

Data Cache Block set to

Zero

dcbz

31

1014

LSU

3:6

12

Execution

External Control In

Word Indexed

eciwx

31

310

LSU

2:1

External Control Out

Word Indexed

ecowx

31

438

LSU

2:1

Instruction Cache Block

Invalidate

icbi

31

982

LSU

3:4

1

Execution

Load Byte and Zero

lbz

34

LSU

2:1

Load Byte and Zero with

Update

lbzu

35

LSU

2:1

Load Byte and Zero with

Update Indexed

lbzux

31

119

LSU

2:1

Load Byte and Zero

Indexed

lbzx

31

87

LSU

2:1

Load Floating-Point

Double

lfd

50

LSU

2:1

Load Floating-Point
Double with Update

lfdu

51

LSU

2:1

Load Floating-Point
Double with Update

Indexed

lfdux

31

631

LSU

2:1

Load Floating-Point

Double Indexed

lfdx

31

599

LSU

2:1

Load Floating-Point

Single

lfs

48

LSU

2:1

Load Floating-Point

Single with Update

lfsu

49

LSU

2:1

Load Floating-Point

Single with Update

Indexed

lfsux

31

567

LSU

2:1

1. For cache operations, the first number indicates the latency in finishing a single instruction; the second indicates the throughput for

back-to-back cache operations. Throughput might be larger than the initial latency, as more cycles might be needed to complete
the instruction to the cache, which stays busy keeping subsequent cache operations from executing.

2. The throughput number of six cycles for dcbz assumes it is to nonglobal (M = 0) address space. For global address space,

throughput is at least 11 cycles.

3. Load/store multiple/string instruction cycles are represented as a fixed number of cycles plus a variable number of cycles, where n

is the number of words accessed by the instruction.

Advertising
This manual is related to the following products: