Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Processor Family Overview

that the cache line that contains the memory location is owned by the
first-level data cache of the initiating core (that is, the line is in
exclusive or modified state). Then the processor looks for the cache line
in the cache and memory subsystems. The lookups for the locality of the
load or store operation proceed in the following order:

1. First-level cache of the initiating core

2. Second-level cache and the first-level cache of the other core

3. Memory
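The lookup order above can be modeled with a small Python sketch. This is a toy illustration only: the `Locality` enum, the `locate` function, and the use of Python sets to stand in for cache contents are assumptions made for the example, not a description of how the hardware is implemented.

```python
from enum import Enum


class Locality(Enum):
    """Where a cache line is found, in the order the text lists the lookups."""
    OWN_L1 = 1           # first-level cache of the initiating core
    L2_OR_OTHER_L1 = 2   # second-level cache or first-level cache of the other core
    MEMORY = 3           # not cached anywhere; served from memory


def locate(line_addr, own_l1, l2, other_l1):
    """Return the first level (in lookup order) that holds line_addr.

    own_l1, l2 and other_l1 are hypothetical sets of cache-line addresses.
    """
    if line_addr in own_l1:
        return Locality.OWN_L1
    if line_addr in l2 or line_addr in other_l1:
        return Locality.L2_OR_OTHER_L1
    return Locality.MEMORY


if __name__ == "__main__":
    own_l1 = {0x1000}
    l2 = {0x1000, 0x2000}
    other_l1 = {0x3000}
    for addr in (0x1000, 0x2000, 0x3000, 0x4000):
        print(hex(addr), locate(addr, own_l1, l2, other_l1).name)
```

A line present in several levels is reported at the earliest level in the order, which is why address 0x1000 in the demo resolves to the initiating core's first-level cache even though it is also in the second-level cache.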

Table 1-5 lists the performance characteristics of generic load and store
operations in an Intel Core Duo processor.

Numeric values in Table 1-5 are expressed in processor core cycles.

Throughput is expressed as the number of cycles to wait before the
same operation can start again. The latency of a bus transaction is
exposed in some of these operations, as indicated by entries
containing “+ bus transaction”. On Intel Core Duo processors, a
typical bus transaction may take 5.5 bus cycles. For a 667 MHz bus
and a core frequency of 2.167 GHz, the bus clock runs at 667/4 MHz
(the bus is quad-pumped), so the total latency is
14 + 5.5 * 2167/(667/4) ~ 86 core cycles.
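As a worked check of the arithmetic above, the following Python sketch converts bus cycles to core cycles and adds the fixed 14-cycle component. The function and parameter names are illustrative assumptions; the default values are the ones quoted in the text.

```python
def bus_transaction_core_cycles(base_cycles=14,
                                bus_cycles=5.5,
                                core_mhz=2167.0,
                                bus_transfer_mhz=667.0,
                                transfers_per_clock=4):
    """Latency of an operation with an exposed bus transaction, in core cycles.

    All parameter names are hypothetical; defaults follow the example in the text.
    """
    # The quoted bus speed is a transfer rate; divide by 4 to get the bus clock.
    bus_clock_mhz = bus_transfer_mhz / transfers_per_clock
    # Number of core cycles that elapse during one bus clock cycle.
    core_cycles_per_bus_cycle = core_mhz / bus_clock_mhz
    return base_cycles + bus_cycles * core_cycles_per_bus_cycle


total = bus_transaction_core_cycles()
print(f"{total:.1f} core cycles")  # prints "85.5 core cycles"
```

The expression evaluates to about 85.5, which the text rounds up to roughly 86 core cycles. Changing `core_mhz` or `bus_transfer_mhz` shows how the exposed latency scales with the ratio of core clock to bus clock.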

Sometimes a modified cache line has to be evicted to make room for a
new cache line. The modified cache line is evicted in parallel to
bringing in new data and does not require additional latency. However,

Table 1-5  Characteristics of Load and Store Operations in Intel Core Duo Processors

Data Locality                             Load Latency          Load Throughput    Store Latency         Store Throughput
1st-level cache (L1)
L1 of the other core in “Modified” state  14 + bus transaction                                           ~10
2nd-level cache
Memory                                    14 + bus transaction  Bus read protocol  14 + bus transaction  Bus write protocol
