6947ch08.fm Draft Document for Review April 7, 2004 6:15 pm
220 IBM eServer zSeries 990 Technical Guide
Figure 8-16 Two-book system logical view
In this example, a local access is done by the first PU on Book 0 to its local L2 cache, and a
remote access is done by the last PU on Book 1 to the L2 cache on Book 0. Because a remote
access takes longer than a local one, the performance of such a system would not be as
consistent as that of a single-MCM system; it would depend on the remote access rate. To
avoid this effect, the z990 implements several optimizations.
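
The dependence on the remote access rate can be illustrated with a simple weighted-average
model. This is only a sketch, not a formula from the z990 design: the symbols r (the fraction
of L2 accesses that go to a remote book), t_local, and t_remote are assumptions introduced
here for illustration.

```latex
t_{\mathrm{avg}} = (1 - r)\, t_{\mathrm{local}} + r\, t_{\mathrm{remote}},
\qquad 0 \le r \le 1, \quad t_{\mathrm{remote}} > t_{\mathrm{local}}
```

Under this model the average access time grows linearly with r, so a workload whose remote
access rate varies sees varying performance, while r = 0 reproduces the single-book case.
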
The L2 cache is implemented as a processor cache, not as a memory cache. This means that
data (and instructions) normally reside in the L2 cache on the book where they are being
used by a PU, and not on the book where the associated memory address resides. So in the
previous example, the L2 cache in Book 1 will hold the data/instructions used by its local PU
after the remote L2 cache access.
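
The processor-cache behavior described above can be sketched with a toy model. It is a
minimal illustration, not the z990 cache protocol: the function name access, the dictionary
layout, and the address 0x1000 are all hypothetical, and cache coherency, invalidation, and
capacity are not modeled.

```python
# Toy model of an L2 "processor cache": a line ends up in the L2 of the book
# whose PU uses it, regardless of which book owns the memory address.
# Illustrative only; names and structure are assumptions, not z990 interfaces.

def access(l2_by_book, book, address):
    """Classify an access by a PU on `book` and leave the line in its local L2.

    l2_by_book maps a book number to the set of addresses held in that
    book's L2 cache. Returns "local", "remote", or "memory".
    """
    if address in l2_by_book[book]:
        return "local"                      # hit in the requester's own L2
    in_other_l2 = any(
        address in lines for b, lines in l2_by_book.items() if b != book
    )
    # After the access the line resides in the requester's L2.
    l2_by_book[book].add(address)
    return "remote" if in_other_l2 else "memory"


# Book 0 already holds the line; a PU on Book 1 then uses it twice.
l2 = {0: {0x1000}, 1: set()}
first = access(l2, 1, 0x1000)    # fetched from Book 0's L2
second = access(l2, 1, 0x1000)   # now served from Book 1's own L2
```

In this sketch only the first access from Book 1 pays the remote cost; later accesses by the
same book are local.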
Along with the PU allocation and assignment algorithm used during IML, described in
“Processor unit characterization” on page 51, PR/SM plays a major role in z990 system
optimization.
PR/SM on the z990 has been changed to support the multi-book structure and to provide
optimal system performance. PR/SM is aware of the physical book structure, while the logical
partitions do not need to be aware of it. The PR/SM hypervisor manages and optimizes
allocation and dispatching for the underlying physical topology, providing a transparent
multi-book implementation to operating systems. The main objective of PR/SM is to allocate
all processors and storage for a logical partition to the same book, and to redispatch a
logical processor back to the same physical processor.
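
The dispatching objective stated above can be sketched as a preference order. This is a
hedged illustration only and does not represent the actual PR/SM algorithm: the names
LogicalProcessor, home_book, last_pu, and choose_pu are hypothetical, and the fallback to a
PU on another book is an assumption made so that the sketch always returns a result.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

PU = Tuple[int, int]  # (book number, PU index within the book); hypothetical


@dataclass
class LogicalProcessor:
    name: str
    home_book: int                 # book holding the partition's storage
    last_pu: Optional[PU] = None   # physical PU used at the previous dispatch


def choose_pu(lp: LogicalProcessor, idle: Set[PU]) -> Optional[PU]:
    """Pick an idle physical PU for a logical processor.

    Preference order (an assumption for illustration):
      1. the same physical PU as last time,
      2. any idle PU on the partition's home book,
      3. any idle PU at all.
    """
    if not idle:
        return None
    if lp.last_pu in idle:
        choice = lp.last_pu
    else:
        same_book = sorted(pu for pu in idle if pu[0] == lp.home_book)
        choice = same_book[0] if same_book else min(idle)
    lp.last_pu = choice
    return choice


lp = LogicalProcessor("LP1.CP0", home_book=1)
a = choose_pu(lp, {(0, 0), (1, 2), (1, 3)})   # no history: home book wins
b = choose_pu(lp, {(0, 0), (1, 2), (1, 3)})   # redispatched to the same PU
c = choose_pu(lp, {(0, 0), (1, 3)})           # last PU busy: stay on home book
```
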
This implementation gives the z990 server optimal performance and more linear scalability.
The results can be seen in the LSPR ITR values, from the uniprocessor to the 32-way server.

8.7.2 Superscalar processors

The z990 server is the first generation of zSeries servers that uses superscalar processors. A
superscalar processor can execute multiple instructions per cycle, potentially providing better
performance than a sequential processor running at the same cycle time (or processor
frequency).
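
The effect of issuing more than one instruction per cycle can be illustrated with a toy
in-order issue model. It is a sketch under simplifying assumptions and does not describe
the z990 pipeline: the function cycles_needed, the register names, the single-cycle
latency, and the issue widths used below are all hypothetical.

```python
# Toy in-order issue model. Each instruction is (dest_register, source_registers).
# Per cycle, up to `width` consecutive instructions issue; the group ends early
# when an instruction depends on a register written earlier in the same group.
# Every instruction is assumed to complete in one cycle (a simplification).

def cycles_needed(program, width):
    if width < 1:
        raise ValueError("width must be at least 1")
    cycles = 0
    i = 0
    while i < len(program):
        written = set()      # registers written by this cycle's group
        issued = 0
        while i < len(program) and issued < width:
            dest, sources = program[i]
            if dest in written or any(s in written for s in sources):
                break        # dependency on the current group: wait a cycle
            written.add(dest)
            issued += 1
            i += 1
        cycles += 1
    return cycles


program = [
    ("r1", ("r8",)),         # independent
    ("r2", ("r9",)),         # independent
    ("r3", ("r1", "r2")),    # needs r1 and r2
    ("r4", ("r3",)),         # needs r3
]
sequential = cycles_needed(program, width=1)
dual_issue = cycles_needed(program, width=2)
```

In this model the wider processor finishes the same program in fewer cycles, but the gain
is limited by the dependencies between instructions, which is why the benefit is described
as potential.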
[Figure 8-16 diagram: Book 0 and Book 1, each containing PUs with their L1 caches, an L2
cache, memory, and an MBA; the books are connected through a ring structure.]