Interrupt Latency, Branch Prediction – Intel IXP2800 Network Processor User Manual



Hardware Reference Manual

Intel® IXP2800 Network Processor

Intel XScale® Core

3.9.1 Interrupt Latency

Minimum Interrupt Latency is defined as the minimum number of cycles from the assertion of any interrupt signal (IRQ or FIQ) to the execution of the instruction at the vector for that interrupt. The point at which the assertion begins is TBD. This number assumes best case conditions exist when the interrupt is asserted, e.g., the system isn’t waiting on the completion of some other operation.

A useful number to work with is the Maximum Interrupt Latency. This is typically a complex calculation that depends on what else is going on in the system at the time the interrupt is asserted.

Some examples of conditions that can adversely affect interrupt latency are:

• The instruction currently executing could be a 16-register LDM.
• The processor could fault just when the interrupt arrives.
• The processor could be waiting for data from a load, doing a page table walk, etc.
• The core-to-system (bus) clock ratio could be high.
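One rough way to reason about the factors listed above is to add the worst-case stall that each can contribute to the minimum latency, converting any stall counted in bus clocks into core clocks. The Python sketch below does this. The function name, its parameters, and every number in it are illustrative assumptions, not figures from this manual; a real bound needs measured or documented cycle counts.

```python
def worst_case_interrupt_latency(min_latency, core_stalls, bus_stalls, core_to_bus_ratio):
    """Conservative upper bound on interrupt latency, in core clock cycles.

    min_latency:       best-case latency in core cycles (hypothetical input)
    core_stalls:       worst-case stalls already expressed in core cycles
    bus_stalls:        worst-case stalls expressed in bus clock cycles
    core_to_bus_ratio: number of core clocks per bus clock
    """
    # A stall of N bus clocks costs N * ratio core clocks, so a higher
    # core-to-bus ratio inflates every bus-bound stall.
    bus_stall_core_cycles = sum(bus_stalls) * core_to_bus_ratio
    return min_latency + sum(core_stalls) + bus_stall_core_cycles


# Placeholder values only; none of these are IXP2800 measurements.
low_ratio = worst_case_interrupt_latency(10, [5], [20], core_to_bus_ratio=2)
high_ratio = worst_case_interrupt_latency(10, [5], [20], core_to_bus_ratio=4)
print(low_ratio, high_ratio)
```

Summing the contributions is deliberately pessimistic: it assumes every stall can occur back to back before the vector is fetched, which may overstate the true maximum.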

Maximum Interrupt Latency can be reduced by:

• Ensuring that the interrupt vector and interrupt service routine are resident in the instruction cache. This can be accomplished by locking them down into the cache.
• Removing or reducing the occurrences of hardware page table walks. This can also be accomplished by locking down the application’s page table entries into the TLBs, along with the page table entry for the interrupt service routine.

3.9.2 Branch Prediction

The Intel XScale® core implements dynamic branch prediction for the ARM* instructions B and BL and for the Thumb instruction B. Any instruction that specifies the PC as the destination is predicted as not taken. For example, an LDR or a MOV that loads or moves directly to the PC will be predicted not taken and incur a branch latency penalty.

These instructions (ARM B, ARM BL, and Thumb B) enter the branch target buffer when they are “taken” for the first time. (A branch is “taken” when its condition evaluates to true.) Once an instruction is in the branch target buffer, the Intel XScale® core dynamically predicts its outcome based on previous outcomes. Table 26 shows the branch latency penalty when these instructions are correctly predicted and when they are not. A penalty of 0 for correct prediction means that the Intel XScale® core can execute the next instruction in the program flow in the cycle following the branch.
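The allocation and prediction behavior described above can be sketched as a small Python model. This is only an illustration: the class and function names are hypothetical, and the 2-bit saturating counter and its initial state are assumptions, since this section says only that predictions are "based on previous outcomes".

```python
class BranchTargetBuffer:
    """Toy model of the allocation and prediction behavior described above."""

    def __init__(self):
        # Branch address -> 2-bit counter (0..3). The counter scheme is an
        # assumption; the text only says "based on previous outcomes".
        self.counters = {}

    def predict(self, addr):
        # A branch that is not in the buffer is treated as not taken.
        return self.counters.get(addr, 0) >= 2

    def update(self, addr, taken):
        if addr not in self.counters:
            if taken:
                # An entry is created the first time the branch is taken.
                # Starting at 2 ("weakly taken") is an assumed initial state.
                self.counters[addr] = 2
            return
        step = 1 if taken else -1
        self.counters[addr] = max(0, min(3, self.counters[addr] + step))


def count_mispredictions(btb, addr, outcomes):
    """Replay the actual outcomes of one branch and count wrong predictions."""
    wrong = 0
    for taken in outcomes:
        if btb.predict(addr) != taken:
            wrong += 1
        btb.update(addr, taken)
    return wrong


# A loop-closing branch: taken nine times, then falls through once.
btb = BranchTargetBuffer()
mispredicted = count_mispredictions(btb, 0x100, [True] * 9 + [False])
print(mispredicted)
```

Under these assumptions the loop branch is mispredicted on its first taken execution, before it has an entry, and again on the final fall-through.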

Table 26. Branch Latency Penalty

Core Clock Cycles
ARM*    Thumb    Description
+0      +0       Predicted correctly. The instruction is in the branch target buffer and is correctly predicted.
+4      +5       Mispredicted. There are three occurrences of branch misprediction, all of which incur the branch delay penalty shown (4 cycles for ARM; the Thumb column lists 5):
                 1. The instruction is in the branch target buffer and is predicted not-taken, but is actually taken.
                 2. The instruction is not in the branch target buffer and is a taken branch.
                 3. The instruction is in the branch target buffer and is predicted taken, but is actually not-taken.
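As a worked illustration of Table 26, the three misprediction cases can be written as a small Python function. The function and constant names are hypothetical, and treating a not-taken branch that has no buffer entry as penalty-free is an assumption, since the table does not list that case explicitly.

```python
# Misprediction penalty in core clock cycles, taken from Table 26.
MISPREDICT_PENALTY = {"arm": 4, "thumb": 5}


def branch_penalty(in_btb, predicted_taken, taken, mode="arm"):
    """Return the branch latency penalty in core clock cycles."""
    if in_btb:
        # Cases 1 and 3: the prediction disagrees with the actual outcome.
        mispredicted = predicted_taken != taken
    else:
        # Case 2: no entry exists and the branch is taken.
        # Assumption: no entry and not taken costs nothing.
        mispredicted = taken
    return MISPREDICT_PENALTY[mode] if mispredicted else 0


print(branch_penalty(True, False, True))                # case 1, ARM
print(branch_penalty(False, False, True))               # case 2, ARM
print(branch_penalty(True, True, False, mode="thumb"))  # case 3, Thumb
print(branch_penalty(True, True, True))                 # correctly predicted
```

Multiplying the returned penalty by the number of mispredictions in a code path gives a first estimate of the cycles lost to branches in that path.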
