General rules on simd integer code, General rules on simd integer code -2 – Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization

For planning considerations when using the new SIMD integer instructions,
refer to “Checking for Streaming SIMD Extensions 2 Support” in
Chapter 3.

General Rules on SIMD Integer Code

The overall rules and suggestions are as follows:

Do not intermix 64-bit SIMD integer instructions with x87
floating-point instructions. See the “Using SIMD Integer with x87
Floating-point” section in this chapter. Note that all of the SIMD
integer instructions can be intermixed without penalty.
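The rule above can be illustrated with a minimal sketch in C. It assumes a GCC- or Clang-style x86 compiler that provides the MMX intrinsics in `<mmintrin.h>`; the function name `sum_first_lane_scaled` is hypothetical and chosen only for this example. The 64-bit MMX registers alias the x87 register stack, so the sketch finishes the SIMD integer work, issues EMMS through `_mm_empty()`, and only then performs floating-point arithmetic. On 64-bit targets compilers usually generate SSE2 code for `double`, so the hazard mainly concerns builds that use x87 code generation.

```c
#include <mmintrin.h>  /* MMX intrinsics: __m64, _mm_add_pi16, _mm_empty */

/* Hypothetical helper: adds two 16-bit values in lane 0 of an MMX
 * register, then scales the sum with floating-point arithmetic. */
double sum_first_lane_scaled(short a0, short b0, double scale)
{
    /* _mm_set_pi16 takes lanes from highest to lowest; lane 0 is last. */
    __m64 a = _mm_set_pi16(0, 0, 0, a0);
    __m64 b = _mm_set_pi16(0, 0, 0, b0);
    __m64 s = _mm_add_pi16(a, b);            /* PADDW: 64-bit SIMD integer add */

    /* Read the low 32 bits and keep the signed 16-bit lane 0. */
    int lane0 = (short)_mm_cvtsi64_si32(s);

    /* EMMS: release the MMX state before any x87 floating-point code. */
    _mm_empty();

    return lane0 * scale;                    /* floating-point work starts here */
}
```

Keeping all MMX work in one region and calling `_mm_empty()` once at its end avoids interleaving the two instruction classes.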

When writing SSE2 code that works with both integer and
floating-point data, use the subset of SIMD convert instructions or
load/store instructions to ensure that the input operands in XMM
registers contain a properly defined data type that matches the
instruction. Code sequences containing cross-typed usage will produce
the same result across different implementations, but will incur a
significant performance penalty. Using SSE or SSE2 instructions to
operate on type-mismatched SIMD data in the XMM register is strongly
discouraged.
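As a sketch of the type-matching rule above, the following C fragment uses SSE2 intrinsics from `<emmintrin.h>`. The function name `scale_int32x4` is hypothetical. Integer data enters the XMM register through an integer load, is moved into the floating-point domain with an explicit convert, and only then is used by a floating-point instruction.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: multiplies four 32-bit integers by a float factor.
 * Each instruction sees data of the type it was designed for. */
void scale_int32x4(const int in[4], float factor, float out[4])
{
    /* Integer load (MOVDQU) for integer data. */
    __m128i vi = _mm_loadu_si128((const __m128i *)in);

    /* Explicit convert (CVTDQ2PS): int32 lanes become float lanes. */
    __m128 vf = _mm_cvtepi32_ps(vi);

    /* Floating-point multiply on properly typed float data. */
    __m128 r = _mm_mul_ps(vf, _mm_set1_ps(factor));

    /* Floating-point store (MOVUPS) for float data. */
    _mm_storeu_ps(out, r);
}
```

The unaligned load and store forms are used here only so the sketch works with any caller-supplied arrays; aligned variants apply when the data is known to be 16-byte aligned.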

Use the optimization rules and guidelines described in Chapter 2
and Chapter 3 that apply to the Pentium 4, Intel Xeon and
Pentium M processors.

Take advantage of the hardware prefetcher where possible. Use the
prefetch instruction only when data access patterns are irregular and
the prefetch distance can be pre-determined. For details, refer to
Chapter 6, “Optimizing Cache Usage”.

Emulate conditional moves by using masked compares and logical
operations instead of using conditional branches.
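The branch-free selection described above might look like the following sketch, written with SSE2 intrinsics. The function name `select_if_greater_epi16` and its parameter layout are assumptions made for illustration. A compare produces a per-lane mask of all ones or all zeros, and AND, ANDNOT and OR then merge the two candidate values without any conditional branch.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: for eight signed 16-bit lanes, computes
 * out[i] = (a[i] > b[i]) ? x[i] : y[i] without branching. */
void select_if_greater_epi16(const short a[8], const short b[8],
                             const short x[8], const short y[8],
                             short out[8])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i vx = _mm_loadu_si128((const __m128i *)x);
    __m128i vy = _mm_loadu_si128((const __m128i *)y);

    /* PCMPGTW: each lane becomes 0xFFFF where a > b, otherwise 0x0000. */
    __m128i mask = _mm_cmpgt_epi16(va, vb);

    /* Keep x where the mask is set and y where it is clear. */
    __m128i from_x = _mm_and_si128(mask, vx);      /* mask & x    */
    __m128i from_y = _mm_andnot_si128(mask, vy);   /* (~mask) & y */

    _mm_storeu_si128((__m128i *)out, _mm_or_si128(from_x, from_y));
}
```

All eight lanes are resolved by the same four instructions regardless of the data, so there is no branch for the processor to mispredict.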
