Intel ARCHITECTURE IA-32 User Manual

IA-32 Intel® Architecture Optimization

Index-6

R

reciprocal instructions, 5-2

rounding control option, A-6

S

sampling
    event-based, A-10

Self-modifying code, 2-47

SFENCE Instruction, 6-15, 6-16

signed unpack, 4-7

SIMD integer code, 4-2

SIMD-floating-point code, 5-1

simplified 3D geometry pipeline, 6-22

simplified clipping to an arbitrary signed range, 4-28

single-pass versus multi-pass execution, 6-41

smart cache, 1-31

SoA format, 3-29

software write-combining, 6-43

spin loops, 9-9

spread prefetch, 6-33

Stack Alignment
    Example of dynamic, 2-43

Stack alignment, 2-42

stack alignment, 3-22

stack frame, D-2

stack frame optimization, D-9

state transitions, 9-2

static branch prediction algorithm, 2-20

static power, 9-1

static prediction, 2-19

static prediction algorithm, 2-19

streaming stores
    coherent requests, 6-13
    non-coherent requests, 6-13

strip mining, 3-32, 3-34

strip-mining, 6-37, 6-38

Structs
    Aligning, 2-39

swizzling data. See data swizzling.

System Bus Optimization, 7-33

T

targeting a processor option, A-3

time-based sampling, A-9

time-consuming innermost loops, 6-7

TLB. See translation lookaside buffer

transcendental functions, 2-72

transfer latency, E-7, E-9

translation lookaside buffer, 6-47

U

unpack instructions, 4-11

unsigned unpack, 4-6

using MMX code for copy or shuffling functions, 5-17

V

vector class library, 3-17

vectorization, 3-12

vectorized code, 3-18

vectorizer switch options, A-5

vertical versus horizontal computation, 5-5

VTune analyzer, 3-10, A-1

VTune Performance Analyzer, 3-10

W

write-combining buffer, 6-43

write-combining memory, 6-43
