Example 6-6, Spread Prefetch Instructions – Intel ARCHITECTURE IA-32 User Manual



Optimizing Cache Usage (Chapter 6, page 6-33)

Example 6-6 Spread Prefetch Instructions

NOTE. To avoid instruction execution stalls due to over-utilization of the resource, prefetch instructions must be interspersed with computational instructions.

; Before: all prefetches are clustered at the top of the loop
top_loop:
    prefetchnta [ebx+128]
    prefetchnta [ebx+1128]
    prefetchnta [ebx+2128]
    prefetchnta [ebx+3128]
    . . . .
    . . . .
    prefetchnta [ebx+17128]
    prefetchnta [ebx+18128]
    prefetchnta [ebx+19128]
    prefetchnta [ebx+20128]
    movps xmm1, [ebx]           ; computation starts only after every prefetch has issued
    addps xmm2, [ebx+3000]
    mulps xmm3, [ebx+4000]
    addps xmm1, [ebx+1000]
    addps xmm2, [ebx+3016]
    mulps xmm1, [ebx+2000]
    mulps xmm1, xmm2
    . . . . . . . .
    . . . . . .
    . . . . .
    add ebx, 128                ; advance 128 bytes per iteration
    cmp ebx, ecx
    jl top_loop

; After: the same prefetches are spread among the computational instructions
top_loop:
    prefetchnta [ebx+128]
    movps xmm1, [ebx]
    addps xmm2, [ebx+3000]
    mulps xmm3, [ebx+4000]
    prefetchnta [ebx+1128]      ; next prefetch issues after a few computational instructions
    addps xmm1, [ebx+1000]
    addps xmm2, [ebx+3016]
    prefetchnta [ebx+2128]
    mulps xmm1, [ebx+2000]
    mulps xmm1, xmm2
    prefetchnta [ebx+3128]
    . . . . . . .
    . . .
    prefetchnta [ebx+18128]
    . . . . . .
    prefetchnta [ebx+19128]
    . . . . . .
    . . . .
    prefetchnta [ebx+20128]
    add ebx, 128                ; advance 128 bytes per iteration
    cmp ebx, ecx
    jl top_loop
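The same spreading idea can be sketched in C. This is a minimal illustration under stated assumptions, not code from the manual: it assumes a GCC or Clang compiler (for `__builtin_prefetch`, used here with the lowest temporal-locality hint as a stand-in for `prefetchnta`), a 64-byte cache line, and three input streams. The names `spread_prefetch_sum`, `plain_sum`, `FLOATS_PER_LINE`, and `PREFETCH_AHEAD` are hypothetical, and the look-ahead distance is an arbitrary placeholder that would need tuning on real hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning constants, chosen for illustration only. */
#define FLOATS_PER_LINE 16  /* assumes a 64-byte cache line and 4-byte floats */
#define PREFETCH_AHEAD  4   /* look-ahead distance, in cache lines */

#if defined(__GNUC__) || defined(__clang__)
/* Read prefetch with the lowest temporal-locality hint. */
#define PREFETCH_NTA(p) __builtin_prefetch((p), 0, 0)
#else
#define PREFETCH_NTA(p) ((void)(p))  /* no-op on other compilers */
#endif

/* Reference version without prefetches: sum of a[i] * b[i] + c[i]. */
float plain_sum(const float *a, const float *b, const float *c, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += a[i] * b[i] + c[i];
    return acc;
}

/* Same sum, but each block issues three prefetches (one per input
 * stream), with a slice of the arithmetic between consecutive prefetches
 * instead of issuing all three back to back. */
float spread_prefetch_sum(const float *a, const float *b, const float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);

    float acc = 0.0f;
    size_t i = 0;

    for (; i + FLOATS_PER_LINE <= n; i += FLOATS_PER_LINE) {
        /* Clamp the look-ahead index so the address stays inside the arrays. */
        size_t ahead = i + (size_t)PREFETCH_AHEAD * FLOATS_PER_LINE;
        if (ahead >= n)
            ahead = n - 1;

        PREFETCH_NTA(&a[ahead]);                        /* prefetch 1 */
        for (size_t k = 0; k < 5; ++k)                  /* elements 0..4 */
            acc += a[i + k] * b[i + k] + c[i + k];

        PREFETCH_NTA(&b[ahead]);                        /* prefetch 2 */
        for (size_t k = 5; k < 10; ++k)                 /* elements 5..9 */
            acc += a[i + k] * b[i + k] + c[i + k];

        PREFETCH_NTA(&c[ahead]);                        /* prefetch 3 */
        for (size_t k = 10; k < FLOATS_PER_LINE; ++k)   /* elements 10..15 */
            acc += a[i + k] * b[i + k] + c[i + k];
    }

    /* Tail: remaining elements that do not fill a whole block. */
    for (; i < n; ++i)
        acc += a[i] * b[i] + c[i];

    return acc;
}
```

Both functions add the elements in the same order, so they return the same value. The prefetches only affect timing, and whether spreading them helps on a given processor has to be measured.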

