Per-thread stack offset – Intel ARCHITECTURE IA-32 User Manual


IA-32 Intel® Architecture Optimization

Per-thread Stack Offset

To prevent private stack accesses in concurrent threads from thrashing
the first-level data cache, an application can use a per-thread stack offset
for each of its threads. Each offset should be a multiple of a common
base offset. The optimum choice of this common base offset may depend
on the memory access characteristics of the threads, but it should be a
multiple of 128 bytes.

One effective technique for choosing a per-thread stack offset in an
application is to add an equal amount of stack offset each time a new
thread is created in a thread pool. [7] Example 7-9 shows a code fragment
that implements per-thread stack offset for three threads using a
reference offset of 1024 bytes.
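
The following is not the manual's Example 7-9 but a minimal sketch of the same idea, assuming POSIX threads and the non-standard alloca() call (available on glibc and macOS). Each thread entry function reserves its own padding on the stack before calling the real work, so the work function's locals start at a different offset in each thread. The names stack_offset_for, thread_work, thread_entry and run_threads_with_offsets are hypothetical and chosen only for illustration.

```c
#include <alloca.h>   /* alloca(): non-standard, assumed available */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define BASE_OFFSET 1024u  /* common base offset; a multiple of 128 bytes */
#define NUM_THREADS 3u

/* Hypothetical helper: thread i gets (i + 1) * BASE_OFFSET bytes of padding. */
static size_t stack_offset_for(unsigned index)
{
    return ((size_t)index + 1u) * BASE_OFFSET;
}

struct thread_arg {
    size_t offset;                      /* padding reserved before the work */
    void (*work)(struct thread_arg *);  /* the thread's real work function  */
    int done;                           /* set to 1 once the work has run   */
};

/* Stand-in for real work; its stack frame sits below the padding. */
static void thread_work(struct thread_arg *arg)
{
    arg->done = 1;
}

static void *thread_entry(void *p)
{
    struct thread_arg *arg = p;

    /* Reserve the per-thread padding. Writing through a volatile pointer
       keeps the compiler from discarding the allocation. */
    volatile char *pad = alloca(arg->offset);
    pad[0] = 0;

    /* Called through a pointer so the work keeps its own frame. */
    arg->work(arg);
    return NULL;
}

/* Starts NUM_THREADS threads with distinct offsets; returns 0 on success. */
static int run_threads_with_offsets(void)
{
    pthread_t tid[NUM_THREADS];
    struct thread_arg args[NUM_THREADS];
    unsigned created = 0;
    int status = 0;

    assert(BASE_OFFSET % 128u == 0);

    for (; created < NUM_THREADS; created++) {
        args[created].offset = stack_offset_for(created);
        args[created].work = thread_work;
        args[created].done = 0;
        if (pthread_create(&tid[created], NULL, thread_entry,
                           &args[created]) != 0) {
            status = -1;
            break;
        }
    }
    /* Join every thread that was started, even after a failure. */
    for (unsigned i = 0; i < created; i++) {
        if (pthread_join(tid[i], NULL) != 0 || !args[i].done)
            status = -1;
    }
    return status;
}
```

With three threads the padding comes out as 1024, 2048 and 3072 bytes. A thread pool could apply the same scheme by passing the index of each newly created thread to stack_offset_for.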

User/Source Coding Rule 35. (H impact, M generality) Adjust the private
stack of each thread in an application so that the spacing between these stacks
is not offset by multiples of 64 KB or 1 MB to prevent unnecessary cache line
evictions (when using IA-32 processors supporting Hyper-Threading
Technology).
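
As a rough way to check the rule, the sketch below tests whether two stack addresses are separated by an exact multiple of 64 KB. Because 1 MB is itself a multiple of 64 KB, the same test covers both cases. The helper name stacks_alias is hypothetical, and the check is a simplification: it looks only at exact multiples, not at the processor's actual aliasing behavior.

```c
#include <stdbool.h>
#include <stdint.h>

#define ALIAS_64KB ((uintptr_t)64 * 1024)

/* Hypothetical helper: true when the two addresses are an exact multiple
   of 64 KB apart (this includes identical addresses and 1 MB spacing). */
static bool stacks_alias(uintptr_t stack_a, uintptr_t stack_b)
{
    uintptr_t spacing = stack_a > stack_b ? stack_a - stack_b
                                          : stack_b - stack_a;
    return spacing % ALIAS_64KB == 0;
}
```

For example, stacks placed exactly 64 KB apart fail the check, while adding a 1024-byte per-thread offset to one of them makes the spacing 64 KB + 1024 bytes, which passes.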

7. For parallel applications written to run with OpenMP, the OpenMP runtime library in
Intel KAP/Pro Toolset automatically provides the stack offset adjustment for each thread.
