Example 7-4, Spin-wait Loop and PAUSE Instructions – Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization

Example 7-4

Spin-wait Loop and PAUSE Instructions

(a) An unoptimized spin-wait loop incurs a performance penalty when exiting
the loop. While spinning, it consumes execution resources without contributing
computational work.

do {
    // This loop can run faster than the speed of memory access.
    // Other worker threads cannot finish modifying sync_var until
    // outstanding loads from the spinning loop are resolved.
} while (sync_var != constant_value);

(b) Inserting the PAUSE instruction in a fast spin-wait loop prevents a
performance penalty to both the spinning thread and the worker thread.

do {
    _asm pause
    // Ensure this loop is de-pipelined, i.e. prevent more than one
    // load request to sync_var from being outstanding. This avoids
    // a performance penalty when the worker thread updates sync_var
    // and the spinning thread exits the loop.
} while (sync_var != constant_value);
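For compilers that do not accept the _asm syntax, loop (b) can be approximated in portable C11. The following is a minimal sketch, not the manual's own code: the helper name spin_wait_until and the cpu_relax macro are hypothetical, sync_var is assumed to be an atomic_int, and _mm_pause() from <immintrin.h> is used to emit PAUSE on x86 targets (other targets fall back to a no-op). Unlike the do/while form above, it tests the variable before the first PAUSE.

```c
#include <assert.h>
#include <stdatomic.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: no-op fallback on non-x86 targets */
#endif

/* Hypothetical helper: spin until *sync_var equals constant_value.
   Returns how many times PAUSE was executed before the loop exited. */
static unsigned long spin_wait_until(atomic_int *sync_var, int constant_value)
{
    unsigned long spins = 0;

    assert(sync_var != 0);
    /* The acquire load makes the worker's earlier writes visible once
       the expected value is observed. */
    while (atomic_load_explicit(sync_var, memory_order_acquire) != constant_value) {
        cpu_relax();              /* de-pipeline the loop between loads */
        ++spins;
    }
    return spins;
}
```

A worker thread would publish the value with atomic_store_explicit(sync_var, constant_value, memory_order_release). The returned count is only for illustration and can be dropped.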

(c) A spin-wait loop using a “test, test-and-set” technique to determine the
availability of the synchronization variable. This technique is recommended when
writing spin-wait loops to run on IA-32 architecture processors.

Spin_Lock:
    CMP lockvar, 0      ; // Check if lock is free.
    JE Get_Lock
    PAUSE               ; // Short delay.
    JMP Spin_Lock

Get_Lock:
    MOV EAX, 1
    XCHG EAX, lockvar   ; // Try to get lock.
    CMP EAX, 0          ; // Test if successful.
    JNE Spin_Lock

Critical_Section:
    <critical section code>
    MOV lockvar, 0      ; // Release lock.
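The same "test, test-and-set" structure can be written in C11 with <stdatomic.h>. This is a sketch under stated assumptions rather than code from the manual: the type spin_lock_t and the functions spin_try_lock, spin_lock and spin_unlock are hypothetical names, atomic_exchange_explicit stands in for XCHG, and _mm_pause() from <immintrin.h> stands in for PAUSE on x86 targets.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__i386__) || defined(__x86_64__) || defined(_M_IX86) || defined(_M_X64)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* emits the PAUSE instruction */
#else
#define cpu_relax() ((void)0)     /* assumption: no-op fallback on non-x86 targets */
#endif

/* Hypothetical lock type: 0 means free, 1 means held, as with lockvar. */
typedef struct { atomic_int lockvar; } spin_lock_t;

/* One acquisition attempt; corresponds to the Get_Lock block. */
static bool spin_try_lock(spin_lock_t *l)
{
    /* The exchange returns the previous value: 0 means we took the lock. */
    return atomic_exchange_explicit(&l->lockvar, 1, memory_order_acquire) == 0;
}

static void spin_lock(spin_lock_t *l)
{
    for (;;) {
        /* "Test": read-only spin with PAUSE while the lock looks held. */
        while (atomic_load_explicit(&l->lockvar, memory_order_relaxed) != 0)
            cpu_relax();
        /* "Test-and-set": attempt the exchange only once it looks free. */
        if (spin_try_lock(l))
            return;
    }
}

static void spin_unlock(spin_lock_t *l)
{
    /* Debug check: releasing a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&l->lockvar, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->lockvar, 0, memory_order_release);
}
```

Spinning on a plain load keeps waiting threads from issuing the more expensive atomic exchange until the lock appears free, which is the point of the technique in (c).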