Optimization with Spin-Locks – Intel ARCHITECTURE IA-32 User Manual


Multi-Core and Hyper-Threading Technology


User/Source Coding Rule 21. (M impact, H generality) Insert the PAUSE
instruction in fast spin loops and keep the number of loop repetitions to a
minimum to improve overall system performance.

On IA-32 processors that use the Intel NetBurst microarchitecture core,
the penalty of exiting from a spin-wait loop can be avoided by inserting
a PAUSE instruction in the loop. In spite of the name, the PAUSE
instruction improves performance by introducing a slight delay in the
loop and effectively causing the memory read requests to be issued at a
rate that allows immediate detection of any store to the synchronization
variable. This prevents the occurrence of a long delay due to memory
order violation.

One example of inserting the PAUSE instruction in a simplified spin-wait
loop is shown in Example 7-4(b). The PAUSE instruction is compatible
with all IA-32 processors. On IA-32 processors prior to Intel NetBurst
microarchitecture, the PAUSE instruction is essentially a NOP
instruction.
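The idea can be sketched in C as follows. This is not the manual's
Example 7-4(b); it is a minimal illustration that assumes a C11 compiler
and uses the _mm_pause() intrinsic to emit PAUSE. The names cpu_relax
and spin_wait, and the max_spins bound, are hypothetical and chosen only
for this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>          /* _mm_pause() emits the PAUSE instruction */
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)   /* assumption: no PAUSE on non-x86 builds */
#endif

/*
 * Hypothetical helper: poll *flag until it becomes nonzero, giving up
 * after max_spins iterations. Returns true if the flag was seen set.
 */
static bool spin_wait(atomic_int *flag, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; ++i) {
        /* Read the synchronization variable. */
        if (atomic_load_explicit(flag, memory_order_acquire) != 0)
            return true;
        /* Slow the loop slightly before the next read. */
        cpu_relax();
    }
    return false;
}
```

Bounding the number of iterations is one way to follow the rule's advice
to keep loop repetitions to a minimum; what a caller does after the
bound is reached (yield, block, retry) is left open here.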

Additional examples of optimizing spin-wait loops using the PAUSE
instruction are available in Application Note AP-949 “Using
Spin-Loops on Intel Pentium 4 Processor and Intel Xeon Processor.”

Inserting the PAUSE instruction has the added benefit of significantly
reducing the power consumed during the spin-wait because fewer
system resources are used.

Optimization with Spin-Locks

Spin-locks are typically used when several threads need to modify a
synchronization variable and the synchronization variable must be
protected by a lock to prevent unintentional overwrites. When the lock
is released, however, several threads may compete to acquire it at once.
Such thread contention significantly reduces performance scaling with
respect to frequency, number of discrete processors, and
Hyper-Threading Technology.
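For illustration, a simple spin-lock might look like the sketch below.
It uses a common "test and test-and-set" pattern, which is an assumption
of this sketch rather than a technique described in the text above:
waiting threads spin on plain reads with PAUSE and attempt the atomic
exchange only when the lock appears free. The names demo_spinlock,
demo_lock, demo_trylock and demo_unlock are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>          /* _mm_pause() emits the PAUSE instruction */
#define cpu_relax() _mm_pause()
#else
#define cpu_relax() ((void)0)   /* assumption: no PAUSE on non-x86 builds */
#endif

/* Hypothetical lock type: 0 means free, 1 means held. */
typedef struct {
    atomic_int locked;
} demo_spinlock;

/* Try once to take the lock; returns true on success. */
static bool demo_trylock(demo_spinlock *l)
{
    /* The exchange returns the previous value; 0 means we acquired it. */
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

/* Spin until the lock is acquired. */
static void demo_lock(demo_spinlock *l)
{
    while (!demo_trylock(l)) {
        /* Wait with read-only polling until the lock looks free. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            cpu_relax();
    }
}

/* Release the lock so that one waiting thread can acquire it. */
static void demo_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Even with this pattern, every waiter sees the release at about the same
time and races for the exchange, which is the contention described
above.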
