The fence Instructions, The sfence Instruction – Intel ARCHITECTURE IA-32 User Manual


Optimizing Cache Usage


The maskmovq/maskmovdqu (non-temporal byte mask store of packed integer in an MMX technology or Streaming SIMD Extensions register) instructions store data from a register to the location specified by the edi register. The most significant bit in each byte of the second mask register is used to selectively write the data of the first register on a per-byte basis. The instruction is implicitly weakly-ordered (that is, successive stores may not write memory in original program-order), does not write-allocate, and thus minimizes cache pollution.
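
The per-byte masking described above can be sketched in C with the SSE2 intrinsic _mm_maskmoveu_si128 (declared in emmintrin.h), which compilers normally emit as maskmovdqu. This is a minimal illustration, not code from the manual: the helper name masked_store16 and its buffer layout are assumptions made for the example, and the compiler, not the programmer, arranges for the destination address to be in the register the instruction expects.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in _mm_sfence */
#include <stdint.h>

/* Hypothetical helper (not from the manual): copy to dst[0..15] only those
 * bytes of src whose corresponding mask byte has its most significant bit
 * set. Bytes of dst with a clear mask MSB are left untouched. */
static void masked_store16(uint8_t *dst, const uint8_t *src, const uint8_t *mask)
{
    __m128i data = _mm_loadu_si128((const __m128i *)src);   /* first register: data */
    __m128i sel  = _mm_loadu_si128((const __m128i *)mask);  /* second register: byte mask */

    /* Non-temporal, weakly-ordered byte-masked store. */
    _mm_maskmoveu_si128(data, sel, (char *)dst);

    /* Order the weakly-ordered store ahead of any later stores. */
    _mm_sfence();
}
```

Because the store is weakly-ordered, the sketch ends with a store fence so that callers can treat the helper like an ordinary store; a routine issuing many such stores would more likely fence once at the end.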

The fence Instructions

The following fence instructions are available: sfence, lfence, and mfence.

The sfence Instruction

The sfence (store fence) instruction guarantees that every store instruction that precedes the sfence instruction in program order is globally visible before any store instruction that follows the sfence. The sfence instruction provides an efficient way of ensuring ordering between routines that produce weakly-ordered results.
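
As a rough illustration of a routine that produces weakly-ordered results and then fences them, the sketch below fills a buffer with non-temporal stores (_mm_stream_si128) and issues a single _mm_sfence before returning. The function name stream_fill32 and its preconditions (16-byte-aligned destination, length a multiple of four words) are assumptions chosen for the example, not requirements stated in the manual text.

```c
#include <emmintrin.h>  /* _mm_set1_epi32, _mm_stream_si128, _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine (not from the manual): fill n 32-bit words with
 * `value` using non-temporal stores.
 * Assumptions: dst is 16-byte aligned and n is a multiple of 4. */
static void stream_fill32(int32_t *dst, int32_t value, size_t n)
{
    __m128i v = _mm_set1_epi32(value);

    for (size_t i = 0; i < n; i += 4) {
        /* Weakly-ordered store that bypasses the cache hierarchy. */
        _mm_stream_si128((__m128i *)(dst + i), v);
    }

    /* One store fence at the end: every store above becomes globally
     * visible before any store the caller issues afterwards. */
    _mm_sfence();
}
```

Fencing once per routine rather than once per store keeps the cost of the fence small relative to the work done, which is the usage the paragraph above has in mind.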

The use of weakly-ordered memory types can be important under certain data sharing relationships, such as a producer-consumer relationship. Using weakly-ordered memory can make assembling the data more efficient, but care must be taken to ensure that the consumer obtains the data that the producer intended it to see. Some common usage models may be affected in this way by weakly-ordered stores. Examples are:

- library functions, which use weakly-ordered memory to write results
- compiler-generated code, which also benefits from writing weakly-ordered results
- hand-crafted code
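
The producer-consumer concern described above can be sketched as follows. The producer assembles its data with weakly-ordered stores, issues sfence, and only then publishes a ready flag; the consumer reads the data only after it observes the flag. The struct mailbox type, the function names, and the use of C11 atomics for the flag are all assumptions made for this illustration and do not come from the manual.

```c
#include <emmintrin.h>   /* _mm_stream_si32, _mm_sfence */
#include <stdatomic.h>   /* C11 atomics for the ready flag */

#define PAYLOAD_WORDS 8

/* Hypothetical shared structure (not from the manual). */
struct mailbox {
    int payload[PAYLOAD_WORDS];
    atomic_int ready;            /* 0 = empty, 1 = payload published */
};

/* Producer: write the payload with non-temporal stores, then publish. */
static void produce(struct mailbox *mb, int seed)
{
    for (int i = 0; i < PAYLOAD_WORDS; i++) {
        _mm_stream_si32(&mb->payload[i], seed + i);  /* weakly-ordered store */
    }

    /* Without this fence the flag store below could become globally
     * visible before the weakly-ordered payload stores. */
    _mm_sfence();

    atomic_store_explicit(&mb->ready, 1, memory_order_release);
}

/* Consumer: returns 1 and copies the payload if it has been published,
 * otherwise returns 0 and leaves `out` untouched. */
static int try_consume(struct mailbox *mb, int *out)
{
    if (!atomic_load_explicit(&mb->ready, memory_order_acquire)) {
        return 0;
    }
    for (int i = 0; i < PAYLOAD_WORDS; i++) {
        out[i] = mb->payload[i];
    }
    return 1;
}
```

In a real program the two functions would run on different threads, with the consumer polling try_consume until it succeeds. The point of the sketch is only the ordering: payload stores, then sfence, then the flag.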
