7 floating-point operate (fpop) instructions, 8 implementation-dependent instructions – FUJITSU Implementation Supplement Fujitsu SPARC64 V User Manual



SPARC JPS1 Implementation Supplement: Fujitsu SPARC64 V • Release 1.0, 1 July 2002

SPARC64 V implements JMPL and CALL return prediction hardware in the form of a
special stack, called the Return Address Stack (RAS). Whenever a CALL or a JMPL that
writes to %o7 (r[15]) occurs, SPARC64 V “pushes” the return address (PC+8) onto
the RAS. When either of the synthetic instructions retl (JMPL [%o7+8]) or ret
(JMPL [%i7+8]) is subsequently executed, the return address is predicted to be the
address stored on the top of the RAS, and the RAS is “popped.” If the prediction in
the RAS is incorrect, SPARC64 V backs up and starts issuing instructions from the
correct target address. This backup takes a few extra cycles.
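The push and pop behavior described above can be illustrated with a small software model. This is only a sketch, not a description of the hardware: the class name, the unbounded stack depth, and the treatment of an empty stack as a misprediction are all assumptions made for illustration.

```python
class ReturnAddressStack:
    """Toy software model of a return address stack (hypothetical, unbounded)."""

    def __init__(self):
        self.entries = []

    def on_call(self, pc):
        # A CALL, or a JMPL that writes %o7, pushes the return address PC+8.
        self.entries.append(pc + 8)

    def on_return(self, actual_target):
        # ret/retl: the top entry is used as the prediction and popped.
        # Assumption: an empty stack simply counts as a misprediction.
        predicted = self.entries.pop() if self.entries else None
        return predicted == actual_target


ras = ReturnAddressStack()
ras.on_call(0x1000)  # outer call, pushes 0x1008
ras.on_call(0x2000)  # nested call, pushes 0x2008

inner_ok = ras.on_return(0x2008)  # standard return to PC+8: prediction is correct
outer_ok = ras.on_return(0x1010)  # nonstandard return target: prediction is wrong
print(inner_ok, outer_ok)
```

In this model the second return targets an address other than PC+8 of its call, so the popped entry does not match and the processor would have to back up.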

Programming Note – For maximum performance, software and compilers must
take into account how the RAS works. For example, tricks that do nonstandard
returns in hopes of boosting performance may require more cycles if they cause the
wrong RAS value to be used for predicting the address of the return. Heavily nested
calls can also cause earlier entries in the RAS to be overwritten by newer entries,
since the RAS only has a limited number of entries. Eventually, some return
addresses will be mispredicted because of the overflow of the RAS.
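The overflow effect in the note can be sketched with a bounded stack. The text does not give the number of RAS entries, so the depth of 4 used here is purely hypothetical, as are the function name and the drop-oldest replacement policy (real hardware may behave differently once it overflows).

```python
from collections import deque


def count_mispredictions(nesting_depth, ras_entries):
    """Count mispredicted returns for a chain of nested calls.

    Assumptions: the RAS drops its oldest entry when full, and a return
    that finds the RAS empty counts as a misprediction.
    """
    ras = deque(maxlen=ras_entries)  # appending to a full deque discards the oldest entry
    return_addresses = []
    for level in range(nesting_depth):
        pc = 0x1000 * (level + 1)    # arbitrary call-site address per nesting level
        ras.append(pc + 8)           # each call pushes PC+8
        return_addresses.append(pc + 8)

    misses = 0
    for actual in reversed(return_addresses):  # unwind innermost call first
        predicted = ras.pop() if ras else None
        if predicted != actual:
            misses += 1
    return misses


shallow = count_mispredictions(nesting_depth=3, ras_entries=4)
deep = count_mispredictions(nesting_depth=6, ras_entries=4)
print(shallow, deep)
```

With these assumed parameters, the shallow chain fits in the stack and every return is predicted correctly, while the deep chain loses its two oldest return addresses and mispredicts the last two returns.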

6.3.7 Floating-Point Operate (FPop) Instructions

The complete conditions for generating an fp_exception_other exception with
FSR.ftt = unfinished_FPop are described in Section B.6, Floating-Point Nonstandard
Mode, on page 61.

The SPARC64 V-specific FMADD and FMSUB instructions (described below) are also
floating-point operations. They require the floating-point unit to be enabled;
otherwise, an fp_disabled trap is generated. They also affect the FSR, like FPop
instructions. However, these instructions are not included in the FPop category and,
hence, reserved encodings in these opcodes generate an illegal_instruction exception, as
defined in Section 6.3.9 of Commonality.

6.3.8 Implementation-Dependent Instructions

SPARC64 V uses the IMPDEP2 instruction to implement the Floating-Point
Multiply-Add/Subtract and Negative Multiply-Add/Subtract instructions; these have an
op3 field = 37₁₆ (IMPDEP2). See Floating-Point Multiply-Add/Subtract on page 50 for
fuller definitions of these instructions. Opcode space is reserved in IMPDEP2 for the
quad-precision forms of these instructions. However, SPARC64 V does not currently
implement the quad-precision forms, and the processor generates an illegal_instruction
exception if a quad-precision form is specified. Since these instructions are not part
of the required SPARC V9 architecture, the operating system does not supply
software emulation routines for the quad versions of these instructions.
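As an illustration of how the op3 field selects the implementation-dependent opcode space, the sketch below classifies a 32-bit instruction word. It relies on the SPARC V9 format-3 layout (op in bits 31:30, op3 in bits 24:19) and on the V9 opcode assignments of 36₁₆ for IMPDEP1 and 37₁₆ for IMPDEP2. The function name and sample word are hypothetical, and the remaining fields that select a particular multiply-add variant are deliberately left undecoded.

```python
OP_FORMAT3_ARITH = 0b10  # op field value for the opcode space containing IMPDEP1/IMPDEP2
OP3_IMPDEP1 = 0x36
OP3_IMPDEP2 = 0x37


def classify_impdep(word):
    """Return 'IMPDEP1', 'IMPDEP2', or None for a 32-bit instruction word."""
    op = (word >> 30) & 0x3    # bits 31:30
    op3 = (word >> 19) & 0x3F  # bits 24:19
    if op != OP_FORMAT3_ARITH:
        return None
    if op3 == OP3_IMPDEP1:
        return "IMPDEP1"
    if op3 == OP3_IMPDEP2:
        return "IMPDEP2"
    return None


# Hypothetical word with only op and op3 set; all other fields are zero.
sample = (OP_FORMAT3_ARITH << 30) | (OP3_IMPDEP2 << 19)
print(hex(sample), classify_impdep(sample))
```

A decoder built along these lines would still need to inspect the remaining fields to tell the implemented single- and double-precision forms from the reserved quad-precision encodings.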

SPARC64 V uses the IMPDEP1 instruction to implement the graphics acceleration
instructions.
