Cacheability control, Cacheability control -9 – Intel ARCHITECTURE IA-32 User Manual

Optimizing Cache Usage

Currently, the prefetch instruction provides a greater performance gain than preloading because it:

• has no destination register; it only updates cache lines.

• does not stall the normal instruction retirement.

• does not affect the functional behavior of the program.

• has no cache line split accesses.

• does not cause exceptions except when the LOCK prefix is used; the LOCK prefix is not a valid prefix for use with the prefetch instructions and should not be used.

• does not complete its own execution if that would cause a fault.

The current advantages of the prefetch over preloading instructions are
processor-specific. The nature and extent of the advantages may change
in the future.
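
The difference between the two approaches can be sketched in C. This is a minimal illustration, not code from the manual: the function names, the array-summing workload, and the PREFETCH_DISTANCE value are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler that provides __builtin_prefetch, which generates a data prefetch instruction when the target supports one. The preload variant issues an ordinary load, so it needs a destination register and can fault on an invalid address; the prefetch variant is only a hint.

```c
#include <stddef.h>

/* Hypothetical look-ahead in elements; an untuned, illustrative value. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that a later element will be read soon.
   The hint has no destination register and does not change the result. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = expect a read, 3 = high temporal locality. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}

/* Same loop using preloading: a real load of the later element.
   The volatile access keeps the compiler from discarding the load. */
long sum_with_preload(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            (void)*(const volatile int *)&data[i + PREFETCH_DISTANCE];
        }
        total += data[i];
    }
    return total;
}
```

Both functions return the same sum; only the way the later cache line is requested differs. Whether either variant is faster depends on the processor and the access pattern, so any gain should be measured rather than assumed.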

In addition, there are cases where a prefetch instruction will not perform
the data prefetch. These include:

• the prefetch causes a DTLB (Data Translation Lookaside Buffer) miss. This applies to Pentium 4 processors with CPUID signature corresponding to family 15, model 0, 1 or 2. The prefetch instruction resolves a DTLB miss and fetches data on Pentium 4 processors with CPUID signature corresponding to family 15, model 3.

• an access to the specified address causes a fault/exception.

• the memory subsystem runs out of request buffers between the first-level cache and the second-level cache.

• the prefetch targets an uncacheable memory region, for example, USWC and UC.

• a LOCK prefix is used. This causes an invalid opcode exception.
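
The family and model conditions in the first item can be checked against the CPUID signature, that is, the value returned in EAX by CPUID leaf 1. The following C sketch decodes that value using the standard field layout (base family in bits 11:8, base model in bits 7:4, extended model in bits 19:16, extended family in bits 27:20). The type and function names are hypothetical, and the sketch assumes the caller has already obtained the signature; it only classifies the family 15 models named above and reports everything else as unknown.

```c
#include <stdint.h>

/* Hypothetical container for the decoded display family and model. */
typedef struct {
    unsigned family;
    unsigned model;
} cpu_signature;

/* Decode the EAX value returned by CPUID leaf 1. */
cpu_signature decode_signature(uint32_t eax)
{
    unsigned base_family = (eax >> 8) & 0xF;
    unsigned base_model  = (eax >> 4) & 0xF;
    unsigned ext_model   = (eax >> 16) & 0xF;
    unsigned ext_family  = (eax >> 20) & 0xFF;
    cpu_signature sig;

    /* The extended family is added only when the base family is 15. */
    sig.family = base_family;
    if (base_family == 0xF) {
        sig.family += ext_family;
    }
    /* The extended model applies only to base families 6 and 15. */
    sig.model = base_model;
    if (base_family == 0x6 || base_family == 0xF) {
        sig.model += ext_model << 4;
    }
    return sig;
}

/* Returns 1 if a prefetch that misses the DTLB still fetches data
   (family 15, model 3), 0 if it does not (family 15, model 0, 1 or 2),
   and -1 for any signature the text above does not describe. */
int prefetch_fetches_on_dtlb_miss(uint32_t eax)
{
    cpu_signature sig = decode_signature(eax);
    if (sig.family != 15) {
        return -1;
    }
    if (sig.model <= 2) {
        return 0;
    }
    if (sig.model == 3) {
        return 1;
    }
    return -1;
}
```

For example, a signature of 0x00000F29 decodes to family 15, model 2, so the function returns 0, while 0x00000F34 decodes to family 15, model 3 and returns 1.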

Cacheability Control

This section covers the mechanics of the cacheability control
instructions.
