Planning considerations, Planning considerations -2 – Intel ARCHITECTURE IA-32 User Manual


IA-32 Intel® Architecture Optimization


•

Use MMX technology instructions and registers for copying data
that is not used later in SIMD floating-point computations.

•

Use the reciprocal instructions followed by iteration for increased
accuracy. On their own, these instructions yield reduced accuracy but
execute much faster than divide and square root. Note the following:

— If reduced accuracy is acceptable, use them with no iteration.

— If near full accuracy is needed, use a Newton-Raphson iteration.

— If full accuracy is needed, then use divide and square root, which
provide more accuracy but slow down performance.
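
The three accuracy levels above can be sketched with SSE intrinsics in C. This is a minimal illustration, not code from the manual: the function names recip4_approx, recip4_refined and recip4_exact are hypothetical, and the sketch assumes an x86 compiler that provides xmmintrin.h. The refinement is a single Newton-Raphson step for the reciprocal, x1 = x0 * (2 - a * x0), which roughly doubles the number of correct bits in the hardware estimate.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_rcp_ps, _mm_div_ps, ... */

/* Reduced accuracy: hardware reciprocal estimate of four floats,
   with no iteration (roughly 12 bits of precision). */
void recip4_approx(const float *a, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    _mm_storeu_ps(out, _mm_rcp_ps(va));
}

/* Near full accuracy: the estimate followed by one Newton-Raphson
   step, x1 = x0 * (2 - a * x0). */
void recip4_refined(const float *a, float *out)
{
    __m128 va  = _mm_loadu_ps(a);
    __m128 x0  = _mm_rcp_ps(va);               /* initial estimate */
    __m128 two = _mm_set1_ps(2.0f);
    __m128 x1  = _mm_mul_ps(x0, _mm_sub_ps(two, _mm_mul_ps(va, x0)));
    _mm_storeu_ps(out, x1);
}

/* Full accuracy: a true divide, which is slower. */
void recip4_exact(const float *a, float *out)
{
    __m128 va = _mm_loadu_ps(a);
    _mm_storeu_ps(out, _mm_div_ps(_mm_set1_ps(1.0f), va));
}
```

Each function reads four floats from a and writes four results to out; unaligned loads and stores are used so the caller does not have to align the arrays. Which variant is appropriate depends on how much error the surrounding computation tolerates.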

Planning Considerations

Whether adapting an existing application or creating a new one, using
SIMD floating-point instructions to achieve optimum performance gain
requires programmers to consider several issues. In general, when
choosing candidates for optimization, look for code segments that are
computationally intensive and floating-point intensive. Also consider
efficient use of the cache architecture.

The sections that follow answer the questions that should be raised
before implementation:

•

Can data layout be arranged to increase control parallelism or cache
utilization?

•

Which part of the code benefits from SIMD floating-point
instructions?

•

Is the current algorithm the most appropriate for SIMD
floating-point instructions?

•

Is the code floating-point intensive?

•

Do either single-precision floating-point or double-precision
floating-point computations provide enough range and precision?

•

Is the result of computation affected by enabling flush-to-zero or
denormals-to-zero modes?
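
One way to explore the last question is to run the same SSE operation with flush-to-zero switched on and off and compare the results. The sketch below is an illustration under stated assumptions, not code from the manual: the helper name sse_mul_with_ftz is hypothetical, and it assumes an x86 compiler that provides xmmintrin.h and that the underflow exception is masked (the default). Denormals-to-zero is controlled in a similar way through the _MM_SET_DENORMALS_ZERO_MODE macro in pmmintrin.h.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_SET_FLUSH_ZERO_MODE */

/* Multiply two floats with a scalar SSE instruction, with flush-to-zero
   either enabled or disabled. The caller's MXCSR value is restored
   before returning. */
float sse_mul_with_ftz(float a, float b, int enable_ftz)
{
    /* volatile keeps the compiler from folding the multiply at compile
       time or moving it outside the region where the mode is set. */
    volatile float va = a, vb = b, result;
    unsigned int saved = _mm_getcsr();

    _MM_SET_FLUSH_ZERO_MODE(enable_ftz ? _MM_FLUSH_ZERO_ON
                                       : _MM_FLUSH_ZERO_OFF);
    result = _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(va), _mm_set_ss(vb)));
    _mm_setcsr(saved);  /* restore the previous control/status word */

    return result;
}
```

For example, multiplying 1e-30f by 1e-10f gives a product below the smallest normal single-precision value. With flush-to-zero enabled the result is expected to be zero, and with it disabled a small denormal value. If results like these feed later computations, the mode can change the program's output.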
