SPARC64 V implements return prediction hardware for JMPL and CALL in the form of a special stack, called the Return Address Stack (RAS). Whenever a CALL, or a JMPL that writes to %o7 (r[15]), occurs, SPARC64 V “pushes” the return address (PC+8) onto the RAS. When either of the synthetic instructions retl (JMPL [%o7+8]) or ret (JMPL [%i7+8]) is subsequently executed, the return address is predicted to be the address stored on the top of the RAS, and the RAS is “popped.” If the RAS prediction is incorrect, SPARC64 V backs up and starts issuing instructions from the correct target address. This backup takes a few extra cycles.
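The push and pop behavior described above can be sketched in a few lines of Python. This is only an illustrative software model, not a description of the hardware: the class name `ReturnAddressStack`, its method names, and the example PC values are all hypothetical.

```python
class ReturnAddressStack:
    """Toy software model of a return address stack (names are hypothetical)."""

    def __init__(self):
        self.entries = []

    def push_on_call(self, call_pc):
        # A CALL, or a JMPL that writes %o7, pushes the return address PC+8.
        self.entries.append(call_pc + 8)

    def predict_return(self):
        # ret/retl takes the top entry as the predicted target and pops it.
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack()
ras.push_on_call(0x1000)      # outer call, example PC
ras.push_on_call(0x2000)      # nested call, example PC
inner = ras.predict_return()  # prediction for the nested routine's return
outer = ras.predict_return()  # prediction for the outer routine's return
print(hex(inner), hex(outer))  # prints: 0x2008 0x1008
```

As long as calls and returns are properly paired, each predicted address matches the actual return target, so no backup is needed.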

Programming Note – For maximum performance, software and compilers must take into account how the RAS works. For example, tricks that perform nonstandard returns in the hope of boosting performance may instead cost more cycles if they cause the wrong RAS value to be used to predict the return address. Heavily nested calls can also cause earlier entries in the RAS to be overwritten by newer entries, because the RAS has only a limited number of entries. When the RAS overflows in this way, some return addresses will be mispredicted.
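The effect of overflow can be illustrated with a small Python model. It is a sketch under stated assumptions: the depth of 8 entries is a made-up value (this section does not give the real RAS size), the function name `count_mispredictions` is hypothetical, and an empty model stack is simply treated as a misprediction.

```python
from collections import deque


def count_mispredictions(nesting, ras_depth):
    """Count mispredicted returns for `nesting` nested calls (toy model)."""
    # A bounded deque drops its oldest entry when a new one is appended to a
    # full stack, standing in for older RAS entries being overwritten.
    ras = deque(maxlen=ras_depth)
    actual_returns = []
    for level in range(nesting):
        call_pc = 0x1000 * (level + 1)   # example PC for each nested call
        ras.append(call_pc + 8)          # return address PC+8 is pushed
        actual_returns.append(call_pc + 8)

    misses = 0
    while actual_returns:
        target = actual_returns.pop()    # where the return really goes
        predicted = ras.pop() if ras else None
        if predicted != target:
            misses += 1                  # hardware would back up here
    return misses


shallow = count_mispredictions(nesting=4, ras_depth=8)   # fits in the model RAS
deep = count_mispredictions(nesting=12, ras_depth=8)     # overflows the model RAS
print(shallow, deep)
```

In this model the innermost returns are still predicted correctly after an overflow; only the outermost returns, whose entries were lost, are mispredicted.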

6.3.7 Floating-Point Operate (FPop) Instructions

The complete conditions for generating an fp_exception_other exception with FSR.ftt = unfinished_FPop are described in Section B.6, Floating-Point Nonstandard Mode, on page 61.

The SPARC64 V-specific FMADD and FMSUB instructions (described below) are also floating-point operations. They require the floating-point unit to be enabled; otherwise, an fp_disabled trap is generated. They also affect the FSR, like FPop instructions. However, these instructions are not included in the FPop category and, hence, reserved encodings in these opcodes generate an illegal_instruction exception, as defined in Section 6.3.9 of Commonality.

6.3.8 Implementation-Dependent Instructions

SPARC64 V uses the IMPDEP2 instruction to implement the Floating-Point Multiply-Add/Subtract and Negative Multiply-Add/Subtract instructions; these have an op3 field = 37₁₆ (IMPDEP2). See Floating-Point Multiply-Add/Subtract on page 50 for fuller definitions of these instructions. Opcode space is reserved in IMPDEP2 for the quad-precision forms of these instructions. However, SPARC64 V does not currently implement the quad-precision forms, and the processor generates an illegal_instruction exception if a quad-precision form is specified. Since these instructions are not part of the required SPARC V9 architecture, the operating system does not supply software emulation routines for the quad versions of these instructions.

SPARC64 V uses the IMPDEP1 instruction to implement the graphics acceleration instructions.
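As an illustration of how the two implementation-dependent opcode spaces are told apart, the following Python sketch classifies a 32-bit instruction word by its op and op3 fields. It assumes the standard SPARC V9 format 3 layout (op in bits 31:30, op3 in bits 24:19) and the SPARC V9 value op3 = 36₁₆ for IMPDEP1, neither of which is stated in this section. The function names `classify` and `make_word` are hypothetical, and the remaining instruction fields are ignored.

```python
IMPDEP1_OP3 = 0x36  # SPARC V9 value; graphics instructions on SPARC64 V
IMPDEP2_OP3 = 0x37  # multiply-add/subtract instructions on SPARC64 V


def classify(instruction_word):
    """Return 'IMPDEP1', 'IMPDEP2', or 'other' for a 32-bit instruction word."""
    op = (instruction_word >> 30) & 0x3    # bits 31:30
    op3 = (instruction_word >> 19) & 0x3F  # bits 24:19
    if op != 2:
        return "other"
    if op3 == IMPDEP1_OP3:
        return "IMPDEP1"
    if op3 == IMPDEP2_OP3:
        return "IMPDEP2"
    return "other"


def make_word(op, op3):
    """Build a test word with only the op and op3 fields set (other bits zero)."""
    return (op << 30) | (op3 << 19)


print(classify(make_word(2, 0x37)))  # prints: IMPDEP2
print(classify(make_word(2, 0x36)))  # prints: IMPDEP1
```

A real decoder would also examine the remaining fields to select the individual instruction and to detect reserved or unimplemented encodings, such as the quad-precision forms.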

30 SPARC JPS1 Implementation Supplement: Fujitsu SPARC64 V • Release 1.0, 1 July 2002
