Compaq ECQD2KCTE manual Figure A-1 Branch-Format BSR and BR Opcodes

Models: ECQD2KCTE


branch-takens. If the infrequent case is rare (5%), put it far enough away that it never comes into the I-cache. If the infrequent case is extremely rare (error message code), put it on a page of rarely executed code and expect that page never to be paged in.

4. There are two functionally identical branch-format opcodes, BSR and BR, as shown in Figure A–1.

Figure A–1: Branch-Format BSR and BR Opcodes

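 31      26 25      21 20                       0
+----------+----------+--------------------------+
|   BSR    |    Ra    |       Displacement       |  Branch Format
+----------+----------+--------------------------+
|   BR     |    Ra    |       Displacement       |  Branch Format
+----------+----------+--------------------------+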

Compilers should use BSR for subroutine calls and BR for GOTOs. Some implementations may push a predicted return address onto a return-prediction stack for BSR and not for BR. Compiling the wrong opcode will result in mispredicted return addresses, and hence slow subroutine returns.
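As a rough illustration of the opcode choice, the following Python sketch encodes a branch-format instruction word from the field layout in Figure A–1. The opcode values 0x30 (BR) and 0x34 (BSR) and the signed longword displacement relative to the updated PC are assumptions taken from the Alpha instruction encoding, not from this page, and encode_branch is a hypothetical helper name.

```python
# Opcode values are assumptions drawn from the Alpha instruction set,
# not from this page.
OPCODE_BR = 0x30   # unconditional branch, used for GOTOs
OPCODE_BSR = 0x34  # branch to subroutine, used for calls


def encode_branch(is_call, ra, updated_pc, target):
    """Encode a branch-format word: opcode<31:26>, Ra<25:21>, disp<20:0>.

    Hypothetical helper; assumes the displacement is a signed count of
    longwords relative to the updated PC.
    """
    if updated_pc % 4 or target % 4:
        raise ValueError("addresses must be longword aligned")
    disp = (target - updated_pc) // 4
    if not -(1 << 20) <= disp < (1 << 20):
        raise ValueError("target out of range for a 21-bit displacement")
    # The only difference between the two encodings is the opcode, which
    # tells a predicting implementation whether a return address is expected.
    opcode = OPCODE_BSR if is_call else OPCODE_BR
    return (opcode << 26) | ((ra & 0x1F) << 21) | (disp & 0x1FFFFF)
```

A code generator built along these lines would pass is_call=True for every subroutine call site and is_call=False for every GOTO, so that the two cases never share an opcode.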

5. The memory-format JSR instruction, shown in Figure A–2, has 16 unused bits. These should be used by the compilers to communicate a hint about expected branch-target behavior (see Section 4.3).

Figure A–2: Memory-Format JSR Instruction

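 31                            16 15                       0
+----------+----------+----------+--------------------------+
|   JSR    |    Ra    |    Rb    |                          |  Memory Format
+----------+----------+----------+--------------------------+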

If the JSR is used for a computed GOTO or a CASE statement, compile bits <15:14> as 00, and bits <13:0> such that (updated PC+Instr<13:0>*4) <15:0> equals (likely_target_addr) <15:0>. In other words, pick the low 14 bits so that a normal PC+displacement*4 calculation will match the low 16 bits of the most likely target longword address. (Implementations will likely prefetch from the matching cache block.)

If the JSR is used for a computed subroutine call, compile bits <15:14> as 01, and bits <13:0> as above. Some implementations will prefetch the call target using the prediction and also push updated PC on a return-prediction stack.

If the JSR is used as a subroutine return, compile bits <15:14> as 10. Some implementations will pop an address off a return-prediction stack.

If the JSR is used as a coroutine linkage, compile bits <15:14> as 11. Some implementations will pop an address off a return-prediction stack and also push updated PC on the return-prediction stack.
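The four cases above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function and constant names are hypothetical, and because the text gives no value for bits <13:0> in the return and coroutine cases, the sketch sets them to zero.

```python
# Values for Instr<15:14>, as listed in the text above.
HINT_JUMP = 0b00       # computed GOTO or CASE statement
HINT_CALL = 0b01       # computed subroutine call
HINT_RETURN = 0b10     # subroutine return
HINT_COROUTINE = 0b11  # coroutine linkage


def target_hint(updated_pc, likely_target):
    """Return Instr<13:0> such that
    (updated_pc + 4 * hint)<15:0> == likely_target<15:0>.

    Both addresses are assumed to be longword aligned.
    """
    # Python's >> floors for negative values, and the mask then reduces
    # the longword distance modulo 2**14, so backward targets also work.
    return ((likely_target - updated_pc) >> 2) & 0x3FFF


def jsr_hint_field(kind, updated_pc=0, likely_target=0):
    """Build the 16-bit field Instr<15:0> (hypothetical helper)."""
    if kind in (HINT_JUMP, HINT_CALL):
        low = target_hint(updated_pc, likely_target)
    else:
        # Assumption: the text specifies no <13:0> value for returns
        # and coroutine linkage, so zero is used here.
        low = 0
    return (kind << 14) | low
```

For example, with an updated PC of 0x12004 and a likely target of 0x3456C, target_hint yields 0x095A, and 0x12004 + 4*0x095A = 0x1456C, whose low 16 bits (0x456C) match those of the target.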

Implementors should give first priority to executing straight-line code with no branch-takens as

Software Considerations A–3
