Capacity Limits and Aliasing in Caches, Example 2-20 – Intel ARCHITECTURE IA-32 User Manual

General Optimization Guidelines

If for some reason it is not possible to align the stack to 64 bits, the
routine should access the parameter once and save it into a register or
known aligned storage, thus incurring the misalignment penalty only once.
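As a minimal sketch of that advice in C (the routine name `scaled_sum` and its parameters are hypothetical, not from the manual): the stack-passed `double` is read once into a local, which a compiler can typically keep in a register or an aligned slot, so the loop never touches the possibly misaligned parameter again. Many compilers do this on their own; the explicit copy only makes the intent visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine: 'scale' arrives on the stack and, on IA-32,
 * may not sit on an 8-byte boundary. */
double scaled_sum(const double *v, size_t n, double scale)
{
    assert(v != NULL || n == 0);

    /* Read the parameter once; later uses go through the local copy. */
    const double s = scale;

    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += v[i] * s;   /* uses the copy, not the stack parameter */
    return total;
}
```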

Capacity Limits and Aliasing in Caches

There are cases where addresses with a given stride will compete for
some resource in the memory hierarchy.

Typically, caches are implemented to have multiple ways of set
associativity, with each way consisting of multiple sets of cache lines (or
sectors in some cases). Multiple memory references that compete for the
same set of each way in a cache can cause a capacity issue. There are
aliasing conditions that apply to specific microarchitectures. Note that
first-level cache lines are 64 bytes. Thus the least significant 6 bits are
not considered in alias comparisons. For the Pentium 4 and Intel Xeon
processors, data is loaded into the second level cache in a sector of
128 bytes, so the least significant 7 bits are not considered in alias
comparisons.
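The bit arithmetic described above can be sketched in C. This is an illustration under stated assumptions, not a model of any specific processor: `line_index` and `same_set` are hypothetical helper names, and `num_sets` is a made-up cache parameter, since the text gives only the 64-byte line and 128-byte sector sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sizes quoted in the text. */
enum { L1_LINE_BYTES = 64, L2_SECTOR_BYTES = 128 };

/* Drop the offset bits within a line or sector: the low 6 bits for
 * 64 bytes, the low 7 bits for 128 bytes. line_bytes is a power of two. */
uint32_t line_index(uint32_t addr, uint32_t line_bytes)
{
    assert(line_bytes != 0 && (line_bytes & (line_bytes - 1)) == 0);
    return addr / line_bytes;
}

/* True when two addresses select the same set. num_sets is an assumed,
 * power-of-two set count; real values depend on the microarchitecture. */
bool same_set(uint32_t a, uint32_t b, uint32_t line_bytes, uint32_t num_sets)
{
    assert(num_sets != 0 && (num_sets & (num_sets - 1)) == 0);
    return (line_index(a, line_bytes) & (num_sets - 1)) ==
           (line_index(b, line_bytes) & (num_sets - 1));
}
```

With an assumed 32 sets of 64-byte lines, addresses 2048 bytes apart select the same set, which is the kind of stride that makes references compete for the same ways.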

Example 2-20 Dynamic Stack Alignment

prologue:
    subl    esp, 4              ; save frame ptr
    movl    [esp], ebp
    movl    ebp, esp            ; new frame pointer
    subl    ebp, 4              ; step below the saved frame ptr
    andl    ebp, 0xFFFFFFF8     ; aligned to 64 bits
    movl    [ebp], esp          ; save old stack ptr
    subl    esp, FRAMESIZE      ; allocate space, including the slot at [ebp]
    ; ... callee saves, etc.

epilogue:
    ; ... callee restores, etc.
    movl    esp, [ebp]          ; restore stack ptr
    movl    ebp, [esp]          ; restore frame ptr
    addl    esp, 4
    ret
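
The masking step in the prologue is ordinary round-down arithmetic. A minimal C sketch, with `align_down` as a hypothetical helper name: clearing the low three bits of an address (mask 0xFFFFFFF8) yields a 64-bit (8-byte) boundary, whereas clearing only the low two bits (mask 0xFFFFFFFC) yields 32-bit (4-byte) alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of 'alignment', which must be a power
 * of two. For alignment == 8 the mask is 0xFFFFFFF8. */
uint32_t align_down(uint32_t addr, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1u);
}
```

For example, `align_down(0x0012FF7C, 8)` gives `0x0012FF78`, while an address that is already a multiple of 8 is returned unchanged.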