Draft Document for Review April 7, 2004 6:15 pm


IEEE Floating Point

The IEEE Standard for Binary Floating Point Arithmetic (IEEE 754-1985) was included in S/390 to further enhance the value of the platform for floating-point calculation. The initial implementation added 121 floating-point instructions relative to prior S/390 CMOS models (Hexadecimal Floating Point had 54 instructions). Later, with the introduction of the 64-bit architecture, 12 more instructions were added for conversion between IEEE Binary Floating Point values and 64-bit integers.

The key point is that Java and C/C++ applications tend to use IEEE Binary Floating Point operations more frequently than legacy applications. This means that the better the hardware implementation of this set of instructions, the better the performance of e-business applications will be.

On earlier systems, the emphasis was on traditional hexadecimal floating point arithmetic. The z990 has a Binary Floating Point unit that matches the performance of the hexadecimal floating point unit by halving the number of cycles previously required.
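To make the difference between the two formats concrete, the following Python sketch decodes a 32-bit word as a short hexadecimal floating point value and as an IEEE 754 single-precision binary value. The function names are hypothetical and chosen for illustration only. The sketch shows the data layouts and says nothing about how the z990 hardware implements them.

```python
import struct

def decode_hfp_short(word: int) -> float:
    """Decode a 32-bit short hexadecimal floating point (HFP) word.

    Layout: 1 sign bit, 7-bit excess-64 exponent (base 16), 24-bit fraction.
    """
    sign = -1.0 if (word >> 31) & 1 else 1.0
    exponent = (word >> 24) & 0x7F       # power of 16, biased by 64
    fraction = word & 0xFFFFFF           # six hexadecimal digits, value < 1
    return sign * (fraction / 16**6) * 16.0 ** (exponent - 64)

def decode_bfp_short(word: int) -> float:
    """Decode a 32-bit IEEE 754 single-precision word (big-endian)."""
    return struct.unpack(">f", struct.pack(">I", word))[0]

# The same numeric value has a different bit pattern in each format.
print(decode_hfp_short(0x41100000), decode_bfp_short(0x3F800000))
print(decode_hfp_short(0xC276A000), decode_bfp_short(0xC2ED4000))
```

Both calls on each line decode to the same number, which illustrates why data must be converted, not merely reinterpreted, when moving between the two representations.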

Translation Lookaside Buffer

The Translation Lookaside Buffers (TLBs) in the instruction and data L1 caches now each have a secondary TLB to enhance performance. In addition, a translator unit has been added to translate addresses that miss in the secondary TLB.
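The lookup order described above (primary TLB, then secondary TLB, then the translator unit) can be sketched as a small software model. This is a minimal illustration only: the class name, the entry counts, the 4 KB page size, and the oldest-first replacement policy are all assumptions made for the sketch, not details of the z990 design.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

class TwoLevelTLB:
    """Toy model of a primary TLB backed by a secondary TLB and a translator."""

    def __init__(self, page_table, l1_entries=4, l2_entries=16):
        self.page_table = page_table   # virtual page -> physical frame
        self.l1 = {}                   # primary TLB
        self.l2 = {}                   # secondary TLB
        self.l1_entries = l1_entries
        self.l2_entries = l2_entries
        self.stats = {"l1_hit": 0, "l2_hit": 0, "translated": 0}

    @staticmethod
    def _install(tlb, capacity, page, frame):
        # Placeholder policy: evict the oldest inserted entry when full.
        if page not in tlb and len(tlb) >= capacity:
            tlb.pop(next(iter(tlb)))
        tlb[page] = frame

    def translate(self, vaddr):
        page, offset = divmod(vaddr, PAGE_SIZE)
        if page in self.l1:
            self.stats["l1_hit"] += 1
            frame = self.l1[page]
        elif page in self.l2:
            # Miss in the primary TLB, hit in the secondary TLB.
            self.stats["l2_hit"] += 1
            frame = self.l2[page]
            self._install(self.l1, self.l1_entries, page, frame)
        else:
            # Miss in both levels: the page table lookup stands in
            # for the translator unit.
            self.stats["translated"] += 1
            frame = self.page_table[page]
            self._install(self.l2, self.l2_entries, page, frame)
            self._install(self.l1, self.l1_entries, page, frame)
        return frame * PAGE_SIZE + offset
```

In this model, a page evicted from the small primary TLB can still be found in the secondary TLB, which avoids a full translation on the next access.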

Instruction fetching and instruction decode

The superscalar design of the z990 microprocessor allows for the decoding of up to two instructions per cycle and the execution of three instructions per cycle. Execution takes place in order, but storage accesses for instruction and operand fetching may occur out of sequence.

Instruction fetching

Instruction fetch in non-z990 models tries to get as far ahead of instruction decode and execution as possible because of the relatively large instruction buffers available. In the z990 microprocessor, smaller instruction buffers are used. The operation code is fetched from the I-cache and put in instruction buffers that hold pre-fetched data awaiting decode.

Instruction decoding

The processor can decode one or two instructions per cycle. The result of the decoding process is queued and subsequently used to form a group.

Instruction grouping

From the instruction queue, one simple branch instruction and up to two general instructions can be issued every cycle. The instructions are taken from the queue and assembled into a group according to the instruction grouping rules. A complete description of the rules is beyond the scope of this redbook.
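As a rough illustration of the issue limit stated above, the following Python sketch forms groups from a queue of decoded instructions, allowing at most one branch and at most two general instructions per group. It models only that one constraint. The actual grouping rules are not described here, and the function name, the tuple representation, and the greedy in-order policy are assumptions made for the sketch.

```python
from collections import deque

def form_groups(queue):
    """Split decoded instructions into issue groups, in program order.

    Each instruction is a (name, is_branch) tuple. A group holds at most
    one branch and at most two general instructions (simplified rule).
    """
    pending = deque(queue)
    groups = []
    while pending:
        group, branches, general = [], 0, 0
        while pending:
            name, is_branch = pending[0]
            # Close the group when the next instruction does not fit.
            if is_branch and branches == 1:
                break
            if not is_branch and general == 2:
                break
            pending.popleft()
            group.append(name)
            if is_branch:
                branches += 1
            else:
                general += 1
        groups.append(group)
    return groups

decoded = [("L", False), ("A", False), ("BRC", True),
           ("ST", False), ("LR", False), ("AR", False)]
print(form_groups(decoded))
```

With this input the first group carries two general instructions and one branch, while the three general instructions that follow need two more groups. This hints at why instruction selection and ordering affect how well code exploits the superscalar design.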

It is the compiler’s responsibility to select instructions that best fit with the z990 superscalar microprocessor and abide by the grouping rules to create code that best exploits the superscalar implementation.

Extended Translation Facility

The Extended Translation Facility adds 10 instructions to the zSeries instruction set. They enhance the performance for data conversion operations for data encoded in Unicode, making applications enabled for Unicode and/or Globalization more efficient. These data encoding formats are used in Web Services, Grid, and on demand environments where XML,

Chapter 2. System structure and design 45

