Intel ARCHITECTURE IA-32 User Manual

IA-32 Instruction Latency and Throughput (Appendix C, page C-13)

Table C-4 Streaming SIMD Extension Single-precision Floating-point Instructions (continued)

Instruction               Latency¹        Throughput    Execution Unit²

MOVLHPS³ xmm, xmm         4  4            2  2          MMX_SHFT
MOVMSKPS r32, xmm         6  6            2  2          FP_MISC
MOVSS xmm, xmm            4  4            2  2          MMX_SHFT
MOVUPS xmm, xmm           6  6            1  1          FP_MOVE
MULPS xmm, xmm            7  6  4+1       2  2  2       FP_MUL
MULSS xmm, xmm            7  6            2  2          FP_MUL
ORPS³ xmm, xmm            4  4  2         2  2  2       MMX_ALU
RCPPS³ xmm, xmm           6  6  2         4  4  2       MMX_MISC
RCPSS³ xmm, xmm           6  6  1         2  2  1       MMX_MISC, MMX_SHFT
RSQRTPS³ xmm, xmm         6  6  2         4  4  2       MMX_MISC
RSQRTSS³ xmm, xmm         6  6            4  4  1       MMX_MISC, MMX_SHFT
SHUFPS³ xmm, xmm, imm8    6  6  2         2  2  2       MMX_SHFT
SQRTPS xmm, xmm           40  39  29+28   40  39  58    FP_DIV
SQRTSS xmm, xmm           32  23  30      32  23  29    FP_DIV
SUBPS xmm, xmm            5  4  4         2  2  2       FP_ADD
SUBSS xmm, xmm            5  4  3         2  2  1       FP_ADD
UCOMISS xmm, xmm          7  6  1         2  2  1       FP_ADD, FP_MISC
UNPCKHPS³ xmm, xmm        6  6  3         2  2  2       MMX_SHFT
UNPCKLPS³ xmm, xmm        4  4  3         2  2  2       MMX_SHFT
XORPS³ xmm, xmm           4  4  2         2  2  2       MMX_ALU
FXRSTOR                   150
FXSAVE                    100

See “Table Footnotes”
