Example 6-8 – Intel ARCHITECTURE IA-32 User Manual

Page 328


IA-32 Intel® Architecture Optimization

6-38

Without strip-mining, all the x, y, z coordinates for the four vertices must
be re-fetched from memory in the second pass, that is, the lighting loop.
This under-utilizes the cache lines fetched during the transformation loop
and wastes bandwidth in the lighting loop.

Now consider the code in Example 6-8 where strip-mining has been
incorporated into the loops.

With strip-mining, all the vertex data can be kept in the cache (for
example, in one way of the second-level cache) during the strip-mined
transformation loop and reused in the lighting loop. Keeping data in the
cache reduces both bus traffic and the number of prefetches used.
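The strip size follows from the cache geometry. The sketch below is a minimal illustration, not taken from the manual: the helper name vertices_per_strip and the cache parameters passed to it are assumptions, and the vertex layout is the eight-float record v=[x,y,z,nx,ny,nz,tu,tv] used in Example 6-8.

```c
#include <assert.h>
#include <stddef.h>

/* Vertex layout from the example: v=[x,y,z,nx,ny,nz,tu,tv], eight floats. */
typedef struct { float x, y, z, nx, ny, nz, tu, tv; } Vertex;

/* Hypothetical helper: how many vertices fit in one way of a cache of
   cache_bytes total size with `ways` ways. The result is rounded down to a
   multiple of 4 because the loops process four vertices per iteration. */
static size_t vertices_per_strip(size_t cache_bytes, size_t ways)
{
    assert(ways > 0);
    size_t way_bytes = cache_bytes / ways;      /* capacity of a single way */
    size_t n = way_bytes / sizeof(Vertex);      /* whole vertices that fit  */
    return n - (n % 4);
}
```

Assuming 4-byte floats (so a 32-byte vertex) and a hypothetical 256-KB, 8-way second-level cache, vertices_per_strip(256 * 1024, 8) yields 1024 vertices per strip.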

Example 6-8   Data Access of a 3D Geometry Engine with Strip-mining

while (nstrip < NUM_STRIP) {
  /* Strip-mine the loop to fit data into one way of the second-level
     cache */
  while (nvtx < MAX_NUM_VTX_PER_STRIP) {
    prefetchnta vertex[i] data      // v=[x,y,z,nx,ny,nz,tu,tv]
    prefetchnta vertex[i+1] data
    prefetchnta vertex[i+2] data
    prefetchnta vertex[i+3] data
    TRANSFORMATION code
    nvtx+=4
  }
  while (nvtx < MAX_NUM_VTX_PER_STRIP) {
    /* x y z coordinates are in the second-level cache, no prefetch
       is required */
    compute the light vectors
    POINT LIGHTING code
    nvtx+=4
  }
}
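The pseudocode can be turned into compilable C along the following lines. This is a sketch under stated assumptions rather than the manual's implementation: transform_vertex and light_vertex are hypothetical placeholders for the TRANSFORMATION and POINT LIGHTING code, PREFETCH_AHEAD is an arbitrary illustrative distance, and the prefetch uses the GCC/Clang __builtin_prefetch builtin with locality 0 as a stand-in for prefetchnta. Each inner loop uses its own index, so the lighting pass starts again at the beginning of the strip.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, nx, ny, nz, tu, tv; } Vertex;

/* Non-temporal prefetch hint; a no-op where the builtin is unavailable. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_NTA(p) __builtin_prefetch((p), 0, 0)
#else
#define PREFETCH_NTA(p) ((void)0)
#endif

#define PREFETCH_AHEAD 4  /* assumed prefetch distance, in vertices */

/* Placeholder for the TRANSFORMATION code: a uniform scale of x, y, z. */
static void transform_vertex(Vertex *v, float scale)
{
    v->x *= scale; v->y *= scale; v->z *= scale;
}

/* Placeholder for the POINT LIGHTING code: unnormalized dot product of the
   normal with the vector from the vertex to the light, clamped at zero. */
static float light_vertex(const Vertex *v, const float light_pos[3])
{
    float lx = light_pos[0] - v->x;
    float ly = light_pos[1] - v->y;
    float lz = light_pos[2] - v->z;
    float d = v->nx * lx + v->ny * ly + v->nz * lz;
    return d > 0.0f ? d : 0.0f;
}

/* Process `count` vertices in strips of `strip` vertices: transform one
   strip, then light the same strip while its data is still cached. */
static void process_strip_mined(Vertex *v, float *intensity, size_t count,
                                size_t strip, float scale,
                                const float light_pos[3])
{
    assert(strip > 0);
    for (size_t base = 0; base < count; base += strip) {
        size_t end = (count - base < strip) ? count : base + strip;

        /* Pass 1: prefetch ahead within the strip and transform. */
        for (size_t i = base; i < end; ++i) {
            if (i + PREFETCH_AHEAD < end)
                PREFETCH_NTA(&v[i + PREFETCH_AHEAD]);
            transform_vertex(&v[i], scale);
        }

        /* Pass 2: x, y, z were just touched, so no prefetch is issued. */
        for (size_t i = base; i < end; ++i)
            intensity[i] = light_vertex(&v[i], light_pos);
    }
}
```

Whether the second pass actually hits in the cache depends on choosing a strip size that fits the target cache, which has to be measured on the processor in question.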
