Optimize floating-point performance, Optimize instruction selection – Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization

Minimize use of global variables and pointers.

Use the const modifier; use the static modifier for global
variables.

Use new cacheability instructions and memory-ordering behavior.
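As a rough illustration of the first two guidelines above, the C sketch below keeps table data in a static const array and accumulates into a local variable rather than a global one. The names kWeights and weighted_sum are hypothetical and do not come from the manual; the actual benefit depends on the compiler and the surrounding code.

```c
#include <stddef.h>

/* static limits the table to this file; const marks it read-only.
   Both qualifiers give the compiler more information to work with. */
static const int kWeights[4] = {1, 2, 3, 4};

/* Hypothetical helper: multiplies up to four inputs by the table
   entries and sums them. The input is const-qualified and the
   accumulator is a local, so no global state is read or written
   apart from the read-only table. */
int weighted_sum(const int *values, size_t n)
{
    int sum = 0;
    size_t limit = n < 4 ? n : 4;

    for (size_t i = 0; i < limit; ++i)
        sum += values[i] * kWeights[i];

    return sum;
}
```

A caller would pass its own buffer, for example weighted_sum(samples, 4), and keep the result in a local variable.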

Optimize Floating-point Performance

Avoid exceeding representable ranges during computation, since
handling these cases can have a performance impact. Do not use a
larger precision format (double-extended floating point) unless
required, since this increases memory size and bandwidth
utilization.

Use FISTTP to avoid changing rounding mode when possible or use
optimized fldcw; avoid changing floating-point control/status
registers (rounding modes) between more than two values.

Use efficient conversions, such as those that implicitly include a
rounding mode, in order to avoid changing control/status registers.

Take advantage of the SIMD capabilities of Streaming SIMD
Extensions (SSE) and of Streaming SIMD Extensions 2 (SSE2)
instructions. Enable flush-to-zero mode and DAZ mode when using
SSE and SSE2 instructions.

Avoid denormalized input values, denormalized output values, and
explicit constants that could cause denormal exceptions.

Avoid excessive use of the fxch instruction.
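Two of the guidelines above, conversions that carry their own rounding behavior and enabling flush-to-zero and DAZ modes, can be sketched with SSE/SSE2 compiler intrinsics. This is a minimal sketch, not code from the manual: the helper names trunc_to_int, round_to_int, and enable_ftz_daz are hypothetical, and it assumes an x86 target with SSE2 and a processor that supports the DAZ bit (writing an unsupported MXCSR bit faults on processors without it).

```c
#include <emmintrin.h>  /* SSE2: _mm_set_sd, _mm_cvttsd_si32, _mm_cvtsd_si32 */
#include <xmmintrin.h>  /* SSE:  _mm_getcsr, _mm_setcsr */

#define MXCSR_DAZ 0x0040u  /* MXCSR bit 6: denormals-are-zero */
#define MXCSR_FTZ 0x8000u  /* MXCSR bit 15: flush-to-zero */

/* Truncate toward zero with CVTTSD2SI. The truncation is part of the
   instruction, so no control register has to be rewritten. */
int trunc_to_int(double x)
{
    return _mm_cvttsd_si32(_mm_set_sd(x));
}

/* Convert with CVTSD2SI, which rounds according to the current MXCSR
   rounding mode (round-to-nearest-even unless the mode was changed). */
int round_to_int(double x)
{
    return _mm_cvtsd_si32(_mm_set_sd(x));
}

/* Set FTZ and DAZ for the calling thread and return the previous
   MXCSR value so the caller can restore it later.
   Assumption: the processor supports DAZ. */
unsigned int enable_ftz_daz(void)
{
    unsigned int old = _mm_getcsr();
    _mm_setcsr(old | MXCSR_FTZ | MXCSR_DAZ);
    return old;
}
```

A typical use would call enable_ftz_daz() once before an SSE/SSE2 compute loop and pass the saved value to _mm_setcsr() afterwards if the original behavior must be restored. FTZ and DAZ trade IEEE-conformant denormal handling for speed, so they suit code that can tolerate tiny values being treated as zero.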

Optimize Instruction Selection

Select instructions at the granularity of the path length of a
whole instruction sequence rather than one instruction at a time;
across that path, minimize the number of uops and the data/register
dependencies, and maximize retirement throughput.
