Integer execution unit—clusters 0 and 1 – Compaq 21264 User Manual

Alpha 21264/EV67 Hardware Reference Manual

Internal Architecture

21264/EV67 Microarchitecture

Figure 2–6 Integer Execution Unit—Clusters 0 and 1

Most instructions have a 1-cycle latency for consumers that execute within the same
cluster. Producing a value in one cluster and consuming it in the other cluster adds
1 more cycle of delay. The instruction issue queue minimizes the performance effect of
this cross-cluster delay. The Ebox contains the following resources:

Four 64-bit adders that are used to calculate results for integer add instructions
(located in U0, U1, L0, and L1)

The adders in the lower subclusters that are used to generate the effective virtual
address for load and store instructions (located in L0 and L1)

Four logic units

Two barrel shifters and associated byte logic (located in U0 and U1)

Two sets of conditional branch logic (located in U0 and U1)

Two copies of an 80-entry register file

One pipelined multiplier (located in U1) with 7-cycle latency for all integer multiply
operations

One fully pipelined unit (located in U0), with 3-cycle latency, that executes the
following instructions:

CTLZ, CTPOP, CTTZ

PERR, MINxxx, MAXxxx, UNPKxx, PKxx
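The resource list above can be read as a lookup table from resource to subcluster. The following is a minimal sketch of that table, not anything taken from the manual: the key names and the two helper functions are hypothetical labels chosen for illustration. The list does not say where the four logic units sit, so placing one in each subcluster is an assumption. The two register file copies are omitted because the list gives no location for them.

```python
# Which subclusters hold each Ebox resource, transcribed from the list above.
# The key names are labels invented for this sketch, not terms from the manual.
EBOX_RESOURCES = {
    "adder": {"U0", "U1", "L0", "L1"},
    "address_generation": {"L0", "L1"},   # uses the lower-subcluster adders
    # Assumption: "four logic units" means one per subcluster; the list
    # does not state their location.
    "logic": {"U0", "U1", "L0", "L1"},
    "shifter": {"U0", "U1"},              # barrel shifter and byte logic
    "conditional_branch": {"U0", "U1"},
    "multiplier": {"U1"},
    # CTLZ, CTPOP, CTTZ, PERR, MINxxx, MAXxxx, UNPKxx, PKxx
    "count_and_multimedia": {"U0"},
}

# Latencies (in cycles) that the list states explicitly.
STATED_LATENCY = {"multiplier": 7, "count_and_multimedia": 3}


def subclusters_for(resource):
    """Return the subclusters that contain the given resource, sorted by name."""
    return sorted(EBOX_RESOURCES[resource])


def resources_in(subcluster):
    """Return the resources available in the given subcluster, sorted by name."""
    return sorted(name for name, where in EBOX_RESOURCES.items()
                  if subcluster in where)
```

Under these assumptions, `subclusters_for("multiplier")` shows that an integer multiply can only be placed in U1, while `resources_in("L0")` shows that the lower subclusters handle adds, logic operations, and address generation.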

[Figure 2–6 diagram labels: subclusters L0, U0, L1, and U1; two Register blocks; signals iop_wr, eff_VA, and Load/Store Data. Drawing reference: FM-05643.AI4]
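The latency rule stated earlier in this section (1 cycle for a consumer in the same cluster, plus 1 more cycle when the value crosses clusters) can be sketched as a small function. This is an illustrative model only. The mapping of subclusters to clusters assumes that the digit in each subcluster name is its cluster number, and `consumer_latency` and its constants are hypothetical names, not part of any Alpha tool or document.

```python
# Assumption: the digit in a subcluster name is its cluster number,
# so L0 and U0 form cluster 0 and L1 and U1 form cluster 1.
CLUSTER_OF = {"L0": 0, "U0": 0, "L1": 1, "U1": 1}

SAME_CLUSTER_LATENCY = 1   # cycles; applies to "most instructions" per the text
CROSS_CLUSTER_PENALTY = 1  # additional cycle when the consumer is in the other cluster


def consumer_latency(producer, consumer):
    """Cycles before a consumer can use a result from a 1-cycle producer.

    `producer` and `consumer` are subcluster names such as "U0" or "L1".
    """
    crosses = CLUSTER_OF[producer] != CLUSTER_OF[consumer]
    return SAME_CLUSTER_LATENCY + (CROSS_CLUSTER_PENALTY if crosses else 0)
```

For example, a result produced in L0 and consumed in U0 stays within one cluster and costs 1 cycle in this model, while a result produced in U0 and consumed in U1 costs 2 cycles. The model covers only the 1-cycle case described in the text; it does not attempt to model how the issue queue steers instructions to hide the penalty.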
