Figures – Intel ARCHITECTURE IA-32 User Manual

Figures

Figure 1-1  Typical SIMD Operations ... 1-3
Figure 1-2  SIMD Instruction Register Usage ... 1-4
Figure 1-3  The Intel NetBurst Microarchitecture ... 1-10
Figure 1-4  Execution Units and Ports in the Out-Of-Order Core ... 1-19
Figure 1-5  The Intel Pentium M Processor Microarchitecture ... 1-27
Figure 1-6  Hyper-Threading Technology on an SMP ... 1-35
Figure 1-7  Pentium D Processor, Pentium Processor Extreme Edition and Intel Core Duo Processor ... 1-41

Figure 2-1  Cache Line Split in Accessing Elements in an Array ... 2-31
Figure 2-2  Size and Alignment Restrictions in Store Forwarding ... 2-34

Figure 3-1  Converting to Streaming SIMD Extensions Chart ... 3-9
Figure 3-2  Hand-Coded Assembly and High-Level Compiler Performance Trade-offs ... 3-13
Figure 3-3  Loop Blocking Access Pattern ... 3-36

Figure 4-1  PACKSSDW mm, mm/mm64 Instruction Example ... 4-9
Figure 4-2  Interleaved Pack with Saturation ... 4-9
Figure 4-3  Result of Non-Interleaved Unpack Low in MM0 ... 4-12
Figure 4-4  Result of Non-Interleaved Unpack High in MM1 ... 4-12
Figure 4-5  pextrw Instruction ... 4-14
Figure 4-6  pinsrw Instruction ... 4-15
Figure 4-7  pmovmskb Instruction Example ... 4-17
Figure 4-8  pshuf Instruction Example ... 4-18
Figure 4-9  PSADBW Instruction Example ... 4-31

Figure 5-1  Homogeneous Operation on Parallel Data Elements ... 5-5
Figure 5-2  Dot Product Operation ... 5-8
Figure 5-3  Horizontal Add Using movhlps/movlhps ... 5-19
Figure 5-4  Asymmetric Arithmetic Operation of the SSE3 Instruction ... 5-23
Figure 5-5  Horizontal Arithmetic Operation of the SSE3 Instruction HADDPD ... 5-23

Figure 6-1  Effective Latency Reduction as a Function of Access Stride ... 6-22
