Use available performance tools, Optimize performance across processor generations – Intel ARCHITECTURE IA-32 User Manual


IA-32 Intel® Architecture Optimization


Use Available Performance Tools

Current-generation compiler, such as the Intel C++ Compiler:

— Set this compiler to produce code for the target processor
implementation.

— Use the compiler switches for optimization and/or
profile-guided optimization. These features are summarized in
the “Intel® C++ Compiler” section. For more detail, see the
Intel® C++ Compiler User’s Guide.

Current-generation performance monitoring tools, such as VTune™
Performance Analyzer:

— Identify performance issues using event-based sampling, the
code coach, and other analysis resources.

— Measure workload characteristics such as instruction
throughput, data traffic locality, and memory traffic
characteristics.

— Characterize the performance gain.

Optimize Performance Across Processor Generations

Use a cpuid dispatch strategy to deliver optimum performance for
all processor generations.
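
A cpuid dispatch strategy can be sketched as follows. This is a minimal illustration, not the manual's own code: it assumes a GCC- or Clang-style toolchain that provides `<cpuid.h>`, and the names `sum_generic`, `sum_sse2_standin`, `select_sum` and `sum_dispatch` are hypothetical. The "tuned" path is written in plain C as a stand-in for a real SSE2 routine so that the sketch builds without special compiler flags.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang helper header (toolchain assumption) */
#endif

typedef float (*sum_fn)(const float *v, size_t n);

/* Generic path: safe on every processor generation. */
static float sum_generic(const float *v, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i) s += v[i];
    return s;
}

/* Hypothetical stand-in for a path tuned for SSE2-capable processors.
   It keeps four partial sums; a real version would use SSE2 code. */
static float sum_sse2_standin(const float *v, size_t n) {
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];     s1 += v[i + 1];
        s2 += v[i + 2]; s3 += v[i + 3];
    }
    for (; i < n; ++i) s0 += v[i];  /* leftover elements */
    return (s0 + s1) + (s2 + s3);
}

/* Returns 1 if cpuid leaf 1 reports SSE2 (EDX bit 26), else 0. */
static int cpu_has_sse2(void) {
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((edx >> 26) & 1u);
#else
    return 0;  /* not an x86 build: always use the generic path */
#endif
}

/* Pick the best implementation the running processor supports. */
static sum_fn select_sum(void) {
    return cpu_has_sse2() ? sum_sse2_standin : sum_generic;
}

/* Public entry point: queries cpuid once, then reuses the choice. */
static float sum_dispatch(const float *v, size_t n) {
    static sum_fn impl = NULL;
    if (impl == NULL) impl = select_sum();
    assert(impl != NULL);
    return impl(v, n);
}
```

Callers only ever use `sum_dispatch`, so the feature check runs once and every later call goes straight to the selected routine. The cached function pointer is not synchronized here; a multi-threaded program would likely want to perform the selection during start-up.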

Use the deterministic cache parameters leaf of cpuid to deliver
scalable performance that is transparent across processor families
with different cache sizes.
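
As an illustration of the deterministic cache parameters leaf, the sketch below reads cpuid leaf 4 and derives a blocking size from the reported cache size instead of hard-coding one. It assumes a GCC- or Clang-style `<cpuid.h>` that provides `__get_cpuid_count`; the function names and the "use half the cache" heuristic are hypothetical choices made for this example. Processors that do not report leaf 4 data fall back to a caller-supplied default.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>  /* GCC/Clang helper header (toolchain assumption) */
#endif

/* Decode a cache size from the EBX and ECX values of cpuid leaf 4.
   Each field is stored as "value minus one". */
static size_t cache_size_from_regs(unsigned int ebx, unsigned int ecx) {
    size_t line_size  = (ebx & 0xFFFu) + 1u;          /* EBX[11:0]  */
    size_t partitions = ((ebx >> 12) & 0x3FFu) + 1u;  /* EBX[21:12] */
    size_t ways       = ((ebx >> 22) & 0x3FFu) + 1u;  /* EBX[31:22] */
    size_t sets       = (size_t)ecx + 1u;             /* ECX[31:0]  */
    return ways * partitions * line_size * sets;
}

/* Size in bytes of the data or unified cache at the given level,
   or 0 if the processor does not report it through leaf 4. */
static size_t data_cache_bytes(unsigned int level) {
#if defined(__i386__) || defined(__x86_64__)
    for (unsigned int sub = 0; sub < 16; ++sub) {
        unsigned int eax, ebx, ecx, edx;
        if (!__get_cpuid_count(4, sub, &eax, &ebx, &ecx, &edx)) break;
        unsigned int type = eax & 0x1Fu;       /* EAX[4:0] */
        if (type == 0) break;                  /* no more caches */
        unsigned int lvl = (eax >> 5) & 0x7u;  /* EAX[7:5] */
        if (lvl == level && (type == 1 || type == 3))
            return cache_size_from_regs(ebx, ecx);
    }
#else
    (void)level;  /* not an x86 build: nothing to query */
#endif
    return 0;
}

/* Hypothetical heuristic: size a working block to half the cache at
   the given level, or use the fallback when the size is unknown. */
static size_t pick_block_bytes(unsigned int level, size_t fallback) {
    assert(fallback > 0);
    size_t bytes = data_cache_bytes(level);
    return bytes ? bytes / 2 : fallback;
}
```

A blocked loop could then call, for example, `pick_block_bytes(2, 128 * 1024)` once at start-up and size its tiles from the result, so the same binary adapts to processors with different second-level cache sizes.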

Use a compatible code strategy to deliver optimum performance for
the current generation of the IA-32 processor family and for future
IA-32 processors.

Use a low-overhead threading strategy so that a multi-threaded
application delivers optimal multi-processor scaling performance
when executing on processors that have hardware multi-threading
support, or delivers nearly identical single-processor scaling when
executing on a processor without hardware multi-threading support.
