6 instruction tlb efficiency mode, 7 data tlb efficiency mode, 2 multiple performance monitoring run statistics – Intel NETWORK PROCESSOR IXP2800 User Manual


Intel® IXP2800 Network Processor Hardware Reference Manual: Intel XScale® Core

3.8.1.6 Instruction TLB Efficiency Mode

PMN0 totals the number of instructions that were executed. This count does not include
instructions that were translated by the instruction TLB but never executed. This can happen
when a branch instruction changes the program flow: the instruction TLB may translate the next
sequential instructions after the branch before it receives the target address of the branch.

PMN1 counts the number of instruction TLB table-walks, which occur when there is a TLB miss.
If the instruction TLB is disabled, PMN1 will not increment.

Statistics derived from these two events:

Instruction TLB miss-rate. This is derived by dividing PMN1 by PMN0.

The average number of cycles it took to execute an instruction, commonly referred to as
cycles-per-instruction (CPI). CPI can be derived by dividing CCNT by PMN0, where CCNT was
used to measure total execution time.
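
As a minimal sketch of the two derivations above, the following Python snippet computes the instruction TLB miss-rate and CPI from counter values that have already been read out. The function names and the sample counts are hypothetical, chosen only for illustration. Reading PMN0, PMN1, and CCNT from the core is not shown.

```python
def itlb_miss_rate(pmn0_instructions, pmn1_table_walks):
    """Instruction TLB miss-rate: table-walks (PMN1) per executed instruction (PMN0)."""
    if pmn0_instructions == 0:
        raise ValueError("PMN0 is zero: no executed instructions were counted")
    return pmn1_table_walks / pmn0_instructions


def cycles_per_instruction(ccnt_cycles, pmn0_instructions):
    """CPI: clock cycles (CCNT) divided by executed instructions (PMN0)."""
    if pmn0_instructions == 0:
        raise ValueError("PMN0 is zero: no executed instructions were counted")
    return ccnt_cycles / pmn0_instructions


# Hypothetical counter values, not measurements from real hardware.
pmn0 = 2_000_000  # instructions executed
pmn1 = 500        # instruction TLB table-walks
ccnt = 3_000_000  # clock cycles over the same interval

miss_rate = itlb_miss_rate(pmn0, pmn1)
cpi = cycles_per_instruction(ccnt, pmn0)
print(f"Instruction TLB miss-rate: {miss_rate:.6f}")
print(f"CPI: {cpi:.2f}")
```

Both ratios are only meaningful if CCNT, PMN0, and PMN1 were collected over the same interval.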

3.8.1.7 Data TLB Efficiency Mode

PMN0 totals the number of data cache accesses, which includes cacheable and non-cacheable
accesses, mini-data cache accesses, and accesses made to locations configured as data RAM.

Note that STM and LDM will each count as several accesses to the data TLB, depending on the
number of registers specified in the register list. LDRD will register two accesses.

PMN1 counts the number of data TLB table-walks, which occur when there is a TLB miss. If the
data TLB is disabled, PMN1 will not increment.

The statistic derived from these two events is:

Data TLB miss-rate. This is derived by dividing PMN1 by PMN0.
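
The data TLB miss-rate can be sketched the same way. Because STM, LDM, and LDRD each count as more than one access, the denominator is a count of accesses rather than of instructions. The function name and the sample values below are hypothetical, and this sketch returns None instead of raising an error when nothing was counted.

```python
def data_tlb_miss_rate(pmn0_accesses, pmn1_table_walks):
    """Data TLB miss-rate: table-walks (PMN1) per data access (PMN0).

    Returns None when PMN0 is zero, since the rate is then undefined.
    """
    if pmn0_accesses == 0:
        return None
    return pmn1_table_walks / pmn0_accesses


# Hypothetical counter values, not measurements from real hardware.
rate = data_tlb_miss_rate(pmn0_accesses=800_000, pmn1_table_walks=1_200)
print(f"Data TLB miss-rate: {rate * 100:.2f}% of accesses")
```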

3.8.2 Multiple Performance Monitoring Run Statistics

Even though only two events can be monitored at any given time, multiple performance monitoring
runs can be done, capturing different events from different modes. For example, the first run could
monitor the number of writeback operations (PMN1 in Stall/Writeback mode) and the second run
could monitor the total number of data cache accesses (PMN0 in Data Cache Efficiency mode).
From the results, the percentage of writeback operations relative to the total number of data
accesses can be derived.
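
A minimal sketch of combining two such runs is shown below. It assumes that both runs execute the same workload, so that the two counts are comparable. That assumption, the function name, and the sample values are illustrative and do not come from the manual.

```python
def writeback_percentage(writebacks, data_accesses):
    """Writeback operations as a percentage of total data cache accesses.

    writebacks:    PMN1 from a run in Stall/Writeback mode
    data_accesses: PMN0 from a separate run in Data Cache Efficiency mode
    """
    if data_accesses == 0:
        raise ValueError("no data cache accesses were counted")
    return 100.0 * writebacks / data_accesses


# Hypothetical results from two runs of the same workload.
run1_pmn1_writebacks = 12_500   # first run: Stall/Writeback mode
run2_pmn0_accesses = 500_000    # second run: Data Cache Efficiency mode

pct = writeback_percentage(run1_pmn1_writebacks, run2_pmn0_accesses)
print(f"Writebacks: {pct:.1f}% of data cache accesses")
```

If the workload behaves differently from one run to the next, the result is only an approximation.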

3.9 Performance Considerations

This section describes relevant performance considerations that compiler writers, application
programmers, and system designers need to be aware of to efficiently use the Intel XScale® core.
Performance numbers discussed here include interrupt latency, branch prediction, and instruction
latencies.
