Streaming Store Instruction Descriptions – Intel IA-32 Architecture User Manual

IA-32 Intel® Architecture Optimization

If the region is not mapped as WC, the streaming store might update the
line in place in the cache, and a subsequent sfence would not result in
the data being written to system memory. Explicitly mapping the region
as WC in this case ensures that any data read from this region will not
be placed in the processor’s caches. Without that mapping, a read of
this memory location by a non-coherent I/O device could return incorrect
or out-of-date results. For a processor which solely implements
approach (b), page 11, above, a streaming store can be used in this
non-coherent domain without requiring the memory region to also be
mapped as WB, since any cached data will be flushed to memory by the
streaming store.
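The ordering role of sfence described above can be sketched in C with compiler intrinsics. This is a minimal illustration, not the manual's own code: the helper name stream_fill_then_publish and the ready flag are hypothetical, and the sketch assumes a 16-byte-aligned buffer whose length is a multiple of four 32-bit words. It does not map the region as WC; memory typing is set up by the operating system or driver and is not shown here.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si128, _mm_set1_epi32 */
#include <xmmintrin.h>  /* SSE: _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill `buf` with `value` using streaming stores,
 * fence, and only then set a "ready" flag that a consumer would poll.
 * Assumptions: `buf` is 16-byte aligned and `count` is a multiple of 4. */
static void stream_fill_then_publish(int32_t *buf, size_t count,
                                     int32_t value, volatile int *ready)
{
    assert(((uintptr_t)buf & 15u) == 0);
    assert((count & 3u) == 0);

    const __m128i v = _mm_set1_epi32(value);
    for (size_t i = 0; i < count; i += 4) {
        /* Weakly-ordered, non-temporal 16-byte store. */
        _mm_stream_si128((__m128i *)(buf + i), v);
    }

    /* Make the streaming stores above globally visible before the
     * flag store below. */
    _mm_sfence();
    *ready = 1;
}
```

Whether the fenced data actually reaches system memory still depends on the memory type of the region, as the paragraph above explains; the fence only orders the stores.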

Streaming Store Instruction Descriptions

The movntq/movntdq (non-temporal store of packed integer in an MMX
technology or Streaming SIMD Extensions register) instructions store
data from a register to memory. These instructions are implicitly
weakly-ordered, do not write-allocate, and so minimize cache pollution.
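As a rough illustration of a packed-integer streaming store, the sketch below copies a block with the SSE2 intrinsic _mm_stream_si128, which compilers typically emit as movntdq. The helper name stream_copy_i128 is hypothetical, and the sketch assumes a 16-byte-aligned destination and a size that is a multiple of 16 bytes; the source may be unaligned because it is read with an ordinary unaligned load.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_stream_si128 */
#include <xmmintrin.h>  /* SSE: _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `bytes` bytes from `src` to `dst` with
 * non-temporal stores so the destination does not displace useful
 * cache lines.
 * Assumptions: `dst` is 16-byte aligned, `bytes` is a multiple of 16. */
static void stream_copy_i128(void *dst, const void *src, size_t bytes)
{
    assert(((uintptr_t)dst & 15u) == 0);
    assert((bytes & 15u) == 0);

    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;

    for (size_t i = 0; i < bytes / 16; ++i) {
        __m128i v = _mm_loadu_si128(s + i); /* ordinary unaligned load */
        _mm_stream_si128(d + i, v);         /* non-temporal store */
    }

    /* Streaming stores are weakly ordered; fence before the data is
     * handed to another agent. */
    _mm_sfence();
}
```

A copy like this tends to pay off only when the destination will not be read again soon; for data that is reused immediately, ordinary cached stores are usually the better choice.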

The movntps (non-temporal store of packed single-precision floating
point) instruction is similar to movntq. It stores data from a Streaming
SIMD Extensions register to memory in 16-byte granularity. Unlike
movntq, the memory address must be aligned to a 16-byte boundary or a
general protection exception will occur. The instruction is implicitly
weakly-ordered, does not write-allocate, and thus minimizes cache
pollution.
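The alignment requirement can be made concrete with a short C sketch built on the SSE intrinsic _mm_stream_ps, which compilers typically emit as movntps. The helper name stream_fill_ps and its return convention are hypothetical. It checks the destination address first and refuses to store when the address is not 16-byte aligned, rather than risking the general protection exception described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE: _mm_set1_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical helper: fill `count` floats at `dst` with `value` using
 * non-temporal stores. Returns 0 on success, or -1 without storing
 * anything if `dst` is not 16-byte aligned.
 * Assumption: `count` is a multiple of 4 (one 16-byte store per step). */
static int stream_fill_ps(float *dst, size_t count, float value)
{
    assert((count & 3u) == 0);

    if (((uintptr_t)dst & 15u) != 0) {
        return -1;  /* a misaligned streaming store would fault */
    }

    const __m128 v = _mm_set1_ps(value);
    for (size_t i = 0; i < count; i += 4) {
        _mm_stream_ps(dst + i, v);  /* 16 bytes per store */
    }

    _mm_sfence();  /* order the weakly-ordered stores before returning */
    return 0;
}
```

Returning an error code is only one possible design; a caller could instead fall back to ordinary stores for a misaligned buffer.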

CAUTION. Failure to map the region as WC may allow the line to be
speculatively read into the processor caches, that is, via the wrong
path of a mispredicted branch.
