Prefetch Unit

5.2.2 Branch predictor

Branch prediction in the processor is dynamic and is based around a global history prediction scheme. In addition, there is extra logic to handle predictions that thrash and to predict the end of long loops.

The global history scheme is an adaptive predictor that learns the behavior of branches during execution, based on the historical pattern of behavior of the preceding branches. For each pattern of branch behavior, the history table holds a 2-bit hint value. The 2-bit hint indicates if the next branch must be predicted taken or predicted not-taken based on the behavior of previous branches. The history table contains 256 entries.
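As an illustration only, the global history scheme described above can be modelled in a few lines of C. This is a minimal sketch, not the processor's actual logic: it assumes that the 256-entry table is indexed directly by the outcomes of the last eight branches, that each 2-bit hint behaves as a saturating counter, and that the table resets to a weakly not-taken value. The names ghist_predictor, ghist_init, ghist_predict, and ghist_update are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define HISTORY_ENTRIES 256u  /* table size stated in the text */

typedef struct {
    uint8_t history;                /* last 8 outcomes, 1 = taken (assumed width) */
    uint8_t hint[HISTORY_ENTRIES];  /* 2-bit hints, stored as values 0..3 */
} ghist_predictor;

/* Reset the model. The initial hint value is an assumption. */
static void ghist_init(ghist_predictor *p)
{
    p->history = 0;
    for (unsigned i = 0; i < HISTORY_ENTRIES; i++)
        p->hint[i] = 1;  /* weakly not-taken */
}

/* Predict the next branch from the hint selected by the current history. */
static bool ghist_predict(const ghist_predictor *p)
{
    return p->hint[p->history] >= 2;
}

/* Train on a resolved branch: adjust the selected hint, then shift the
 * outcome into the history (assumed saturating-counter behavior). */
static void ghist_update(ghist_predictor *p, bool taken)
{
    uint8_t *h = &p->hint[p->history];

    if (taken && *h < 3)
        (*h)++;
    else if (!taken && *h > 0)
        (*h)--;

    p->history = (uint8_t)((p->history << 1) | (taken ? 1u : 0u));
}
```

In this model, a branch that strictly alternates between taken and not-taken becomes fully predictable once the history register has filled with the alternating pattern, because the two resulting history values select two different hints.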

For loops beyond a certain number of iterations, the branch history is not large enough to learn the history and predict the loop exit. The PFU includes logic to count the number of iterations (up to 31) of a loop, and thereby predict the not-taken branch that exits the loop. If the number of iterations taken exceeds 31, the loop branch is never predicted as not-taken.
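The loop-exit behavior can be sketched in the same way. The following C model is an assumption about one way such a counter could work, not a description of the PFU hardware: it learns how many times the loop branch was taken before the previous exit, and predicts the exit when the current run reaches that count. Only the limit of 31 comes from the text. The names loop_predictor, loop_predict_exit, and loop_update are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOOP_MAX_COUNT 31u  /* iteration limit stated in the text */

typedef struct {
    uint8_t learned;  /* taken count observed before the previous exit */
    uint8_t current;  /* taken count in the present run, saturates at 32 */
    bool    valid;    /* false until an exit within the limit has been seen */
} loop_predictor;

/* True when the model predicts that the next loop branch is the
 * not-taken branch that exits the loop. */
static bool loop_predict_exit(const loop_predictor *lp)
{
    return lp->valid && lp->current == lp->learned;
}

/* Train on a resolved loop branch. */
static void loop_update(loop_predictor *lp, bool taken)
{
    if (taken) {
        if (lp->current <= LOOP_MAX_COUNT)
            lp->current++;  /* 32 means "longer than the counter can track" */
    } else {
        /* Loop exit: remember the count only if it fits in the counter. */
        lp->valid = lp->current <= LOOP_MAX_COUNT;
        lp->learned = lp->current;
        lp->current = 0;
    }
}
```

With this model, a loop whose branch is taken 10 times per run has its exit predicted from the second run onwards, while a loop whose branch is taken 40 times per run never has its exit predicted, matching the behavior described for loops beyond 31 iterations.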

If multiple branch histories index into the same hint value, this can cause thrashing in the history table and reduce accuracy of the branch predictor. Logic in the branch predictor detects these cases and provides some hysteresis for the hint value.

For direct branches, the target address is calculated statically from the instruction encoding and the program counter. For indirect branches, the hint value predicts if the branch is taken or not-taken, and the return stack can sometimes be used to predict the target address. When the destination of a branch cannot be calculated statically or popped from the return stack, the PFU assumes the branch to be not-taken.
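The target selection rules above can be summarized in a short C sketch. It is illustrative only and makes several assumptions: the branch has already been decoded into one of three hypothetical kinds, the statically calculated target is passed in by the caller, and the return stack depth of 4 is a placeholder rather than a value taken from this section. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RETURN_STACK_DEPTH 4u  /* hypothetical depth, for illustration */

typedef enum { BR_DIRECT, BR_RETURN, BR_INDIRECT_OTHER } branch_kind;

typedef struct {
    uint32_t entry[RETURN_STACK_DEPTH];
    unsigned count;
} return_stack;

typedef struct {
    bool     taken;
    uint32_t target;  /* meaningful only when taken is true */
} prediction;

/* Push a return address, discarding the oldest entry when full. */
static void rs_push(return_stack *rs, uint32_t addr)
{
    if (rs->count == RETURN_STACK_DEPTH) {
        for (unsigned i = 1; i < RETURN_STACK_DEPTH; i++)
            rs->entry[i - 1] = rs->entry[i];
        rs->count--;
    }
    rs->entry[rs->count++] = addr;
}

/* Combine the taken/not-taken hint with whatever target is available.
 * If no target can be supplied, fall back to predicting not-taken. */
static prediction predict_target(branch_kind kind, uint32_t static_target,
                                 bool hint_taken, return_stack *rs)
{
    prediction p = { false, 0 };

    if (!hint_taken)
        return p;

    if (kind == BR_DIRECT) {
        p.taken = true;
        p.target = static_target;       /* computed from encoding and PC */
    } else if (kind == BR_RETURN && rs->count > 0) {
        p.taken = true;
        p.target = rs->entry[--rs->count];  /* popped from the return stack */
    }
    return p;
}
```

The final fall-through case models the last sentence of the paragraph: an indirect branch with no usable return stack entry is treated as not-taken, even when the hint says taken.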

The PFU updates the history for each occurrence of a branch when the DPU indicates how the branch was resolved.

Configuring the branch predictor

You can configure the branch predictor by setting bits in the Auxiliary Control Register:

Set bits [16:15] to b00 to enable prediction using the pattern history tables.

Set bits [16:15] to b01 to force branches to be always predicted taken.

Set bits [16:15] to b10 to force branches to be always predicted not-taken.

Set bit [21] to disable prediction using the dynamic branch predictor loop cache.

Set bit [20] to disable prediction using the dynamic branch predictor register extension cache.

For more information, see c1, Auxiliary Control Register on page 4-38.
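The bit settings listed above could be applied from privileged C code along the following lines. This is a sketch under stated assumptions: the macro, type, and function names are hypothetical, the accessors assume a GCC-compatible compiler targeting ARM, and any barrier or sequencing requirements around the register write are not shown and must be taken from the register description. The bit positions and field values are those given in this section.

```c
#include <stdint.h>

/* Field positions from the list above; the names are hypothetical. */
#define ACTLR_BP_SHIFT           15u
#define ACTLR_BP_MASK            (0x3u << ACTLR_BP_SHIFT)  /* bits [16:15] */
#define ACTLR_DIS_REG_EXT_CACHE  (1u << 20)                /* bit [20] */
#define ACTLR_DIS_LOOP_CACHE     (1u << 21)                /* bit [21] */

typedef enum {
    BP_DYNAMIC          = 0x0,  /* b00: use the pattern history tables */
    BP_ALWAYS_TAKEN     = 0x1,  /* b01: always predict taken */
    BP_ALWAYS_NOT_TAKEN = 0x2   /* b10: always predict not-taken */
} bp_mode;

/* Return a copy of an Auxiliary Control Register value with bits [16:15]
 * replaced by the requested mode. All other bits are preserved. */
static uint32_t actlr_set_bp_mode(uint32_t actlr, bp_mode mode)
{
    return (actlr & ~ACTLR_BP_MASK) | ((uint32_t)mode << ACTLR_BP_SHIFT);
}

#if defined(__arm__) && defined(__GNUC__)
/* CP15 access to the Auxiliary Control Register (c1, c0, opcode2 = 1).
 * Must run in a privileged mode. */
static inline uint32_t actlr_read(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 1" : "=r"(v));
    return v;
}

static inline void actlr_write(uint32_t v)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 1" : : "r"(v) : "memory");
}
#endif
```

A read-modify-write such as actlr_write(actlr_set_bp_mode(actlr_read(), BP_ALWAYS_NOT_TAKEN)) changes only the prediction mode field, and ORing in ACTLR_DIS_LOOP_CACHE or ACTLR_DIS_REG_EXT_CACHE sets bit [21] or bit [20] respectively.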

5.2.3 Incorrect predictions and correction

The DPU resolves dynamically predicted branches at the Wr-stage of the pipeline. See Figure 1-3 on page 1-17. A misprediction causes the PFU to flush the pipeline and fetch the correct instruction stream.

ARM DDI 0363E

Copyright © 2009 ARM Limited. All rights reserved.

ID013010

Non-Confidential, Unrestricted Access

 
