21264/EV67 Microarchitecture

Integer operate

Integer conditional branch

Unconditional branch – both displacement and memory format

Integer and floating-point load and store

PAL-reserved instructions: HW_MTPR, HW_MFPR, HW_LD, HW_ST, HW_RET

Integer-to-floating-point (ITOFx) and floating-point-to-integer (FTOIx)

Each queue entry asserts four request signals, one for each of the Ebox subclusters. A queue entry asserts a request to a subcluster when it contains an instruction that the subcluster can execute and the instruction’s operand register values are available within that subcluster.
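The request condition above can be sketched as a small behavioral model. This is an illustrative sketch, not the hardware logic: the function name `request_signals` and its two set-valued parameters are hypothetical, and the model assumes the four subclusters are named U0, U1, L0, and L1.

```python
from typing import Dict, Set

# The four Ebox subclusters: two upper, two lower.
SUBCLUSTERS = ("U0", "U1", "L0", "L1")

def request_signals(can_execute: Set[str], operands_ready: Set[str]) -> Dict[str, bool]:
    """Return the four request signals for one queue entry (hypothetical model).

    can_execute:    subclusters able to execute the entry's instruction.
    operands_ready: subclusters in which the operand register values are available.
    A request is asserted only when both conditions hold for that subcluster.
    """
    return {s: (s in can_execute) and (s in operands_ready) for s in SUBCLUSTERS}
```

For example, an instruction that any subcluster could execute, but whose operands are currently available only in the lower subclusters, asserts just the L0 and L1 requests.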

There are two arbiters: one for the upper subclusters and one for the lower subclusters. (Subclusters are described in Section 2.1.2.) Each arbiter picks two of the possible 20 requesters for service each cycle. A given instruction requests either the upper subclusters or the lower subclusters, never both. Because many instructions can execute in only one type of subcluster anyway, this restriction is not too limiting.

For example, load and store instructions can only go to lower subclusters and shift instructions can only go to upper subclusters. Other instructions, such as addition and logic operations, can execute in either upper or lower subclusters and are statically assigned before being placed in the IQ.

The IQ arbiters choose between simultaneous requesters of a subcluster based on the age of the request: older requests are given priority over newer requests. If a given instruction requests both lower subclusters, and no older instruction requests a lower subcluster, then the arbiter assigns subcluster L0 to the instruction. If a given instruction requests both upper subclusters, and no older instruction requests an upper subcluster, then the arbiter assigns subcluster U1 to the instruction. This asymmetry between the upper and lower subcluster arbiters is a circuit implementation optimization with negligible overall performance effect.
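The age-priority rule and the L0/U1 asymmetry can be illustrated with a short behavioral sketch. Only the oldest-first priority, the limit of two picks per arbiter, and the default assignments (L0 for the lower arbiter, U1 for the upper) come from the text. The `Entry` type, the `arbitrate` function, the use of a sequence number as the age, and the handling of the second pick are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

@dataclass
class Entry:
    seq: int            # hypothetical age tag: a lower value means an older instruction
    requests: Set[str]  # subclusters this entry currently requests

def arbitrate(entries: List[Entry], preferred: str, other: str) -> Dict[str, Entry]:
    """Model one arbiter (upper or lower): grant up to two entries, oldest first.

    An entry that requests both subclusters of the pair receives `preferred`
    if it is still free, otherwise `other` (the fallback is an assumption).
    """
    grants: Dict[str, Entry] = {}
    for entry in sorted(entries, key=lambda e: e.seq):  # oldest first
        for sub in (preferred, other):
            if sub in entry.requests and sub not in grants:
                grants[sub] = entry
                break
        if len(grants) == 2:  # each arbiter picks at most two requesters per cycle
            break
    return grants

# Lower arbiter: the oldest requester of both lower subclusters is assigned L0.
lower_grants = arbitrate(
    [Entry(2, {"L0", "L1"}), Entry(0, {"L0", "L1"}), Entry(1, {"L0", "L1"})],
    preferred="L0", other="L1",
)
# Upper arbiter: the oldest requester of both upper subclusters is assigned U1.
upper_grants = arbitrate([Entry(5, {"U0", "U1"})], preferred="U1", other="U0")
```

In this model, swapping the `preferred` and `other` arguments is the only difference between the two arbiters, which mirrors the asymmetry described above.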

2.1.1.7 Floating-Point Issue Queue

The 15-entry floating-point issue queue (FQ) associated with the Fbox issues the following instruction types:

Floating-point operates

Floating-point conditional branches

Floating-point stores

Floating-point register to integer register transfers (FTOIx)

Each queue entry has three request lines—one for the add pipeline, one for the multiply pipeline, and one for the two store pipelines. There are three arbiters—one for each of the add, multiply, and store pipelines. The add and multiply arbiters pick one requester per cycle, while the store pipeline arbiter picks two requesters per cycle, one for each store pipeline.
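The per-pipeline pick limits can be sketched as follows. The pick counts (one for add, one for multiply, two for store) come from the text. The function name `fq_arbitrate`, the `(seq, pipe)` request tuples, and the oldest-first selection order are assumptions, since this passage does not state how the FQ arbiters choose among requesters.

```python
from typing import Dict, List, Tuple

# Requesters picked per cycle by each FQ arbiter, as stated in the text.
PICKS_PER_CYCLE = {"add": 1, "mul": 1, "store": 2}

def fq_arbitrate(requests: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Grant FQ requests for one cycle (hypothetical model).

    requests: (seq, pipe) pairs, where a lower seq is assumed to mean an
    older instruction and pipe is "add", "mul", or "store".
    Returns the granted sequence numbers per pipeline type.
    """
    grants: Dict[str, List[int]] = {pipe: [] for pipe in PICKS_PER_CYCLE}
    for seq, pipe in sorted(requests):  # assumed oldest-first order
        if len(grants[pipe]) < PICKS_PER_CYCLE[pipe]:
            grants[pipe].append(seq)
    return grants

fq_grants = fq_arbitrate(
    [(3, "store"), (1, "add"), (2, "add"), (4, "store"), (5, "store"), (0, "mul")]
)
```

With these six requesters, one add, one multiply, and two stores are granted. The remaining add and store requests stay in the queue for a later cycle.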

Alpha 21264/EV67 Hardware Reference Manual

Internal Architecture 2–7
