6947ch02.fm Draft Document for Review April 7, 2004 6:15 pm
42 IBM eServer zSeries 990 Technical Guide
under way to update the C++ compiler and Java Virtual Machine for z/OS to better exploit the
z990 microprocessor superscalar implementation. The intent is to improve the performance
advantage for e-business workloads such as WebSphere and Java applications.
By the time the Java Virtual Machine (JVM) and compilers are available, more improvement in
the throughput of the superscalar processor is expected. Instruction grouping rules are
enforced to create instruction streams that are least affected by interlock situations and
therefore benefit most from the superscalar processor. It is
expected that e-business workloads will primarily benefit from this design since they tend to
use more computational instructions.
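The effect of instruction ordering on a two-wide decoder can be illustrated with a toy model. The sketch below is not the z990's actual grouping logic, which this text does not specify. It assumes one hypothetical rule: two adjacent instructions are decoded in the same cycle unless the second reads a register written by the first. The Instr type, register names, and count_decode_cycles function are invented for illustration.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Instr(NamedTuple):
    """A hypothetical instruction: one destination, some source operands."""
    name: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def count_decode_cycles(stream: Sequence[Instr]) -> int:
    """Count cycles for an in-order decoder that takes up to two
    instructions per cycle. Assumed rule (not from the z990 documentation):
    the second instruction is held back if it reads the first one's result.
    """
    cycles = 0
    i = 0
    while i < len(stream):
        cycles += 1
        first = stream[i]
        i += 1
        if i < len(stream):
            second = stream[i]
            interlock = first.dest is not None and first.dest in second.srcs
            if not interlock:
                i += 1  # both instructions are decoded in this cycle
    return cycles


load_a = Instr("load", "r1", ("mem_a",))
add_a = Instr("add", "r2", ("r1",))
load_b = Instr("load", "r3", ("mem_b",))
add_b = Instr("add", "r4", ("r3",))

# Each add directly follows the load it depends on.
dependent_order = [load_a, add_a, load_b, add_b]
# Same work, with the independent loads placed next to each other.
grouped_order = [load_a, load_b, add_a, add_b]

print(count_decode_cycles(dependent_order))
print(count_decode_cycles(grouped_order))
```

In this model the reordered stream needs fewer decode cycles than the dependent one, which is the kind of gain a compiler aware of the grouping rules aims for.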
A WebSphere Application Server workload environment that runs a mix of Java and DB2®
code will greatly benefit from the superscalar processor design of the z990. Measurements
already show a larger than 20% performance improvement for these types of workloads, on
top of the improvements attributed to the cycle time decrease from 1.09 ns on a z900 Turbo
model to 0.83 ns on a z990.
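As a rough worked example of how these two figures relate, the sketch below computes the clock ratio implied by the quoted cycle times and combines it with a 20% design gain. It assumes the two improvements compound multiplicatively, which is one reading of "on top of" and not a statement from the measurements themselves. The function names are illustrative.

```python
# Cycle times quoted in the text, in nanoseconds.
Z900_TURBO_CYCLE_NS = 1.09
Z990_CYCLE_NS = 0.83


def clock_speedup(old_cycle_ns: float, new_cycle_ns: float) -> float:
    """Ratio of clock frequencies implied by two cycle times."""
    return old_cycle_ns / new_cycle_ns


def combined_speedup(clock_ratio: float, design_gain: float) -> float:
    """Assumption: the clock and design gains multiply together."""
    return clock_ratio * (1.0 + design_gain)


clock_ratio = clock_speedup(Z900_TURBO_CYCLE_NS, Z990_CYCLE_NS)
total = combined_speedup(clock_ratio, 0.20)
print(f"clock ratio: {clock_ratio:.2f}x, combined: {total:.2f}x")
```

Under that assumption the cycle time decrease alone gives roughly 1.31x, and a further 20% brings the combined figure to roughly 1.58x for these workloads.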
The superscalar design of the z990 microprocessor means that some instructions are
processed immediately and that processing steps of other instructions may occur out of the
normal sequential order, a technique called “pipelining”. The superscalar design of the z990 offers:
• Decoding of two instructions per cycle
• Execution of three instructions per cycle (given that the oldest instruction is a branch)
• In-order execution
• Out-of-order operand fetching
Other features of the microprocessor, aimed at improving the performance of the emerging
e-business application environment, are:
• Floating point performance for IEEE Binary Floating Point arithmetic is improved to assist
further exploitation of Java application environments.
• A secondary cache for Dynamic Address Translation, called the Secondary level
Translation Lookaside Buffer (TLB), is provided for both L1 instruction and data caches,
increasing the number of buffer entries by a factor of eight.
• The CP Assist for Cryptographic Function (CPACF) accelerates the encryption and
decryption of SSL transactions and VPN encrypted data transfers. The assist function
uses five new instructions for symmetrical clear key cryptographic encryption and
decryption operations.
Asymmetric mirroring for error detection
Each PU in the z990 servers uses mirrored instruction execution as a simple error detection
mechanism. The mirroring depends on a dual instruction processor design with dual
I-units, E-units, and floating point function. It is asymmetric because the mirrored
execution is delayed from the actual operation. The benefit of the asymmetric design is that
the mirrored units do not have to be closely located to the units where the actual operation
takes place, thus allowing for optimization for performance (see Figure 2-13).
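The idea of a delayed, mirrored check can be sketched in software. The model below is a conceptual illustration only, not a description of the z990 hardware: the additions stand in for arbitrary operations, the delay is an arbitrary number of cycles, and the fault injection and function name are hypothetical.

```python
from collections import deque
from typing import List, Optional, Sequence, Tuple


def run_with_delayed_mirror(
    ops: Sequence[Tuple[int, int]],
    delay: int,
    fault_at: Optional[int] = None,
) -> List[int]:
    """Return the indices of operations whose mirrored result disagreed.

    The primary unit executes one addition per cycle. The mirrored unit
    repeats each addition `delay` cycles later and compares the results.
    """
    pending = deque()  # (index, operands, primary result) awaiting the mirror
    mismatches = []
    for cycle in range(len(ops) + delay):
        if cycle < len(ops):
            a, b = ops[cycle]
            result = a + b
            if cycle == fault_at:
                result ^= 1  # inject a single-bit error in the primary unit
            pending.append((cycle, (a, b), result))
        # The mirrored unit runs `delay` cycles behind the primary unit.
        if pending and pending[0][0] == cycle - delay:
            index, (a, b), primary = pending.popleft()
            if a + b != primary:
                mismatches.append(index)
    return mismatches


work = [(1, 2), (3, 4), (5, 6)]
print(run_with_delayed_mirror(work, delay=2))
print(run_with_delayed_mirror(work, delay=2, fault_at=1))
```

Because the comparison happens after the fact, the checking unit in this model does not sit on the primary unit's critical path, which mirrors the placement freedom the text attributes to the asymmetric design.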