Optimize floating-point performance, Optimize instruction selection – Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization

• Minimize use of global variables and pointers.

• Use the const modifier; use the static modifier for global
  variables.

• Use new cacheability instructions and memory-ordering behavior.
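
The guidance on const, static, and global variables can be illustrated with a minimal C sketch. The names scale_table and scaled_sum are hypothetical and do not come from the manual; the sketch only shows a file-local read-only table, a const-qualified pointer parameter, and a local accumulator in place of a global one.

```c
#include <stddef.h>

/* Hypothetical lookup table. `static` gives it internal linkage, so the
   compiler knows no other translation unit can modify it; `const` marks
   it read-only. */
static const int scale_table[4] = {1, 2, 4, 8};

/* Hypothetical helper: the input is passed through a const-qualified
   pointer and the running total is a local variable rather than a
   global, so it can be kept in a register. */
int scaled_sum(const int *values, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += values[i] * scale_table[i & 3];
    return sum;
}
```

Whether these qualifiers change the generated code depends on the compiler and its optimization settings.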

Optimize Floating-point Performance

• Avoid exceeding representable ranges during computation, since
  handling these cases can have a performance impact. Do not use a
  larger precision format (double-extended floating point) unless
  required, since this increases memory size and bandwidth
  utilization.

• Use FISTTP to avoid changing the rounding mode when possible, or
  use optimized fldcw; avoid switching the floating-point
  control/status registers (rounding modes) between more than two
  values.

• Use efficient conversions, such as those that implicitly include a
  rounding mode, in order to avoid changing control/status registers.

• Take advantage of the SIMD capabilities of Streaming SIMD
  Extensions (SSE) and of Streaming SIMD Extensions 2 (SSE2)
  instructions. Enable flush-to-zero mode and DAZ mode when using
  SSE and SSE2 instructions.

• Avoid denormalized input values, denormalized output values, and
  explicit constants that could cause denormal exceptions.

• Avoid excessive use of the fxch instruction.
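
Two of the items above, conversions with an implicit rounding mode and the flush-to-zero/DAZ modes, can be sketched with compiler intrinsics. This is a minimal illustration, not code from the manual: the function names enable_ftz_daz and truncate_to_int are hypothetical, and the sketch assumes an x86 compiler that provides the standard xmmintrin.h, pmmintrin.h, and emmintrin.h headers and a processor that supports the DAZ bit in MXCSR.

```c
#include <xmmintrin.h>  /* SSE:  _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */
#include <emmintrin.h>  /* SSE2: _mm_set_sd, _mm_cvttsd_si32 */

/* Hypothetical helper: set the FTZ and DAZ bits in MXCSR for the
   calling thread. Assumes the processor supports DAZ. */
void enable_ftz_daz(void)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
}

/* Hypothetical helper: convert a double to int with truncation.
   The intrinsic maps to CVTTSD2SI, which always truncates, so the
   rounding-control bits do not have to be changed around the
   conversion. */
int truncate_to_int(double x)
{
    return _mm_cvttsd_si32(_mm_set_sd(x));
}
```

Note that FTZ and DAZ trade strict IEEE 754 behavior for speed, so they suit code that can tolerate denormal results being treated as zero.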

Optimize Instruction Selection

• Focus instruction selection at the granularity of path length for a
  sequence of instructions rather than on individual instruction
  selections; minimize the number of uops and the data/register
  dependencies in aggregates of the path length, and maximize
  retirement throughput.
