Intel ARCHITECTURE IA-32 User Manual

Page 10



Hardware Prefetch ......................................................................................................... 6-19
Example of Effective Latency Reduction with H/W Prefetch .......................................... 6-20
Example of Latency Hiding with S/W Prefetch Instruction ............................................ 6-22
Software Prefetching Usage Checklist ........................................................................... 6-24
Software Prefetch Scheduling Distance ......................................................................... 6-25
Software Prefetch Concatenation .................................................................................. 6-26
Minimize Number of Software Prefetches ...................................................................... 6-29
Mix Software Prefetch with Computation Instructions .................................................... 6-32
Software Prefetch and Cache Blocking Techniques ...................................................... 6-34
Hardware Prefetching and Cache Blocking Techniques ................................................ 6-39
Single-pass versus Multi-pass Execution ....................................................................... 6-41

Memory Optimization using Non-Temporal Stores ............................................................... 6-43

Non-temporal Stores and Software Write-Combining .................................................... 6-43
Cache Management ....................................................................................................... 6-44

Video Encoder .......................................................................................................... 6-45
Video Decoder .......................................................................................................... 6-45
Conclusions from Video Encoder and Decoder Implementation .............................. 6-46
Optimizing Memory Copy Routines .......................................................................... 6-46
TLB Priming .............................................................................................................. 6-47
Using the 8-byte Streaming Stores and Software Prefetch ...................................... 6-48
Using 16-byte Streaming Stores and Hardware Prefetch ......................................... 6-50
Performance Comparisons of Memory Copy Routines ............................................ 6-52

Deterministic Cache Parameters .......................................................................................... 6-53

Cache Sharing Using Deterministic Cache Parameters ................................................ 6-55
Cache Sharing in Single-core or Multi-core ................................................................... 6-55
Determine Prefetch Stride Using Deterministic Cache Parameters ............................... 6-56

Chapter 7

Multi-Core and Hyper-Threading Technology

Performance and Usage Models ............................................................................................ 7-2

Multithreading ................................................................................................................... 7-2
Multitasking Environment ................................................................................................. 7-4

Programming Models and Multithreading ............................................................................... 7-6

Parallel Programming Models .......................................................................................... 7-7

Domain Decomposition .............................................................................................. 7-7

Functional Decomposition ................................................................................................ 7-8
Specialized Programming Models .................................................................................... 7-8

Producer-Consumer Threading Models ................................................................... 7-10

Tools for Creating Multithreaded Applications ................................................................ 7-14

Optimization Guidelines ........................................................................................................ 7-16

Key Practices of Thread Synchronization ...................................................................... 7-16
