Improving parallelism and the use of fxch – Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization

Assembly/Compiler Coding Rule 33. (H impact, L generality) Minimize the
number of changes to the precision mode.

Improving Parallelism and the Use of FXCH

The x87 instruction set relies on the floating-point stack for one of its
operands. If the dependence graph is a tree (that is, each intermediate
result is used only once) and the code is scheduled carefully, it is often
possible to use only operands that are on the top of the stack or in
memory, and to avoid using operands that are buried under the top of
the stack. When operands need to be pulled from the middle of the
stack, an fxch instruction can be used to swap the operand on the top of
the stack with another entry in the stack.
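The swap can be pictured with a small software model of the register stack. The sketch below is only an illustration, not processor code: the class X87Stack and its method names are hypothetical, chosen to mirror the fld, fxch, and fmul mnemonics, and list index 0 stands in for ST(0).

```python
class X87Stack:
    """Toy model of the x87 register stack (hypothetical helper).

    Index 0 of self.st plays the role of ST(0), index i the role of ST(i).
    """

    DEPTH = 8  # the x87 FPU provides eight stack registers

    def __init__(self):
        self.st = []

    def fld(self, value):
        """Push a value; the previous ST(0) becomes ST(1)."""
        if len(self.st) >= self.DEPTH:
            raise OverflowError("x87 stack overflow")
        self.st.insert(0, float(value))

    def fxch(self, i=1):
        """Swap ST(0) with ST(i)."""
        self.st[0], self.st[i] = self.st[i], self.st[0]

    def fmul_mem(self, value):
        """Multiply ST(0) by a memory operand; only the top entry is touched."""
        self.st[0] *= value


def demo_fxch():
    fpu = X87Stack()
    for v in (1.0, 2.0, 3.0):
        fpu.fld(v)          # stack, top first: [3.0, 2.0, 1.0]
    fpu.fxch(2)             # bring the buried operand to the top: [1.0, 2.0, 3.0]
    fpu.fmul_mem(10.0)      # operate on it as ST(0)
    return list(fpu.st)
```

In this model, demo_fxch() returns [10.0, 2.0, 3.0]: the value that was buried two entries deep is operated on as the top of the stack after a single exchange.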

The fxch instruction can also be used to enhance parallelism.
Dependent chains can be overlapped to expose more independent
instructions to the hardware scheduler. An fxch instruction may be
required to effectively increase the register name space so that more
operands can be simultaneously live.
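As a rough illustration of overlapping two dependent chains, the sketch below keeps two running sums live on a two-entry model of the stack and alternates between them with an exchange. The function interleaved_sums and its use of a Python list are assumptions made for this example; the comments name the x87 instruction each step is meant to stand for.

```python
def interleaved_sums(xs, ys):
    """Accumulate two independent sums, alternating which one sits in ST(0).

    st[0] models ST(0) and st[1] models ST(1). Returns both sums and the
    number of exchanges that the interleaving required.
    """
    st = [0.0, 0.0]      # st[0] accumulates xs, st[1] accumulates ys
    fxch_count = 0
    for x, y in zip(xs, ys):
        st[0] += x                       # fadd mem: chain A
        st[0], st[1] = st[1], st[0]      # fxch st(1): bring chain B to the top
        st[0] += y                       # fadd mem: chain B, independent of the add above
        st[0], st[1] = st[1], st[0]      # fxch st(1): chain A back on top
        fxch_count += 2
    return st[0], st[1], fxch_count
```

Each addition in chain B does not depend on the addition just issued for chain A, so the two can overlap in hardware. The price in this model is two exchanges per iteration, which is the extra instruction count that the interleaving introduces.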

Note, however, that fxch inhibits issue bandwidth in the trace cache. It
does this not only because it consumes a slot, but also because of issue
slot restrictions imposed on fxch. If the application is not bound by
issue or retirement bandwidth, fxch will have no impact.
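A back-of-the-envelope way to see when the extra slot matters is to treat a loop body as limited by whichever is larger: the cycles needed to issue its instructions, or the latency of its longest dependent chain. The function below is a deliberately crude sketch under that assumption. Its name, its parameters, and the numbers in the usage note are hypothetical, and it ignores the issue slot restrictions specific to fxch.

```python
def estimated_cycles(useful_ops, fxch_ops, issue_width, chain_latency_cycles):
    """Crude lower bound on cycles per loop body (illustrative model only).

    issue_width is the assumed number of instructions issued per cycle;
    chain_latency_cycles is the assumed latency of the longest dependent chain.
    """
    total_ops = useful_ops + fxch_ops
    issue_bound = -(-total_ops // issue_width)   # ceiling division
    return max(issue_bound, chain_latency_cycles)
```

With an assumed issue width of 3 and a chain latency of 10 cycles, a body of 6 useful instructions is estimated at 10 cycles with or without 2 added exchanges, because latency dominates. A body of 30 useful instructions goes from 10 to 12 cycles when 6 exchanges are added, because it was already limited by issue bandwidth.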

The Pentium 4 processor’s effective instruction window size is large
enough to permit instructions that are as far away as the next iteration to
be overlapped. This often obviates the need to use fxch to enhance
parallelism.

The fxch instruction should be used only when it is needed to express an
algorithm or to enhance parallelism. If the size of the register name space
is a problem, the use of XMM registers is recommended (see the section).

Assembly/Compiler Coding Rule 34. (M impact, M generality) Use fxch
only where necessary to increase the effective name space.
