Intel ARCHITECTURE IA-32 User Manual




Example 3-4   Identification of SSE2 with cpuid ..... 3-5
Example 3-5   Identification of SSE2 by the OS ..... 3-6
Example 3-6   Identification of SSE3 with cpuid ..... 3-7
Example 3-7   Identification of SSE3 by the OS ..... 3-8
Example 3-8   Simple Four-Iteration Loop ..... 3-14
Example 3-9   Streaming SIMD Extensions Using Inlined Assembly Encoding ..... 3-15
Example 3-10  Simple Four-Iteration Loop Coded with Intrinsics ..... 3-16
Example 3-11  C++ Code Using the Vector Classes ..... 3-18
Example 3-12  Automatic Vectorization for a Simple Loop ..... 3-19
Example 3-13  C Algorithm for 64-bit Data Alignment ..... 3-23
Example 3-14  AoS Data Structure ..... 3-27
Example 3-15  SoA Data Structure ..... 3-28
Example 3-16  AoS and SoA Code Samples ..... 3-28
Example 3-17  Hybrid SoA Data Structure ..... 3-30
Example 3-18  Pseudo-code Before Strip Mining ..... 3-32
Example 3-19  Strip Mined Code ..... 3-33
Example 3-20  Loop Blocking ..... 3-35
Example 3-21  Emulation of Conditional Moves ..... 3-37
Example 4-1   Resetting the Register between __m64 and FP Data Types ..... 4-5
Example 4-2   Unsigned Unpack Instructions ..... 4-7
Example 4-3   Signed Unpack Code ..... 4-8
Example 4-4   Interleaved Pack with Saturation ..... 4-10
Example 4-5   Interleaved Pack without Saturation ..... 4-11
Example 4-6   Unpacking Two Packed-word Sources in a Non-interleaved Way ..... 4-13
Example 4-7   pextrw Instruction Code ..... 4-14
Example 4-8   pinsrw Instruction Code ..... 4-15
Example 4-9   Repeated pinsrw Instruction Code ..... 4-16
Example 4-10  pmovmskb Instruction Code ..... 4-17
Example 4-11  pshuf Instruction Code ..... 4-19
Example 4-12  Broadcast Using 2 Instructions ..... 4-19
Example 4-13  Swap Using 3 Instructions ..... 4-20
Example 4-14  Reverse Using 3 Instructions ..... 4-20
Example 4-15  Generating Constants ..... 4-21
Example 4-16  Absolute Difference of Two Unsigned Numbers ..... 4-23
Example 4-17  Absolute Difference of Signed Numbers ..... 4-24
Example 4-18  Computing Absolute Value ..... 4-25
Example 4-19  Clipping to a Signed Range of Words [high, low] ..... 4-27
