In a relational database environment, the physical separation of logically related data results in little locality of reference. Data-in-memory techniques also minimize the re-referencing of data on disk, because re-references are ideally satisfied from processor memory.

Write caching requires that data integrity be preserved. Applications assume that an update written to disk is safe. When cache memory is used to improve write performance, an additional level of protection is provided either by the NVS, which has battery protection, or by battery protection of the cache itself. In the first case, updates are written to both the cache and the NVS before the I/O is signaled complete to the application. The copy in the NVS is marked as available for overwriting once the update is destaged to disk. The function of caching writes is called DASD fast write (DFW).
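The ordering of these steps is what preserves integrity: the update must reach both protected copies before completion is signaled, and the NVS slot is freed only after a safe destage. The following sketch (illustrative Python; the class and names are assumptions, not an actual storage-controller interface) shows that sequence:

    # Minimal sketch of the DASD fast write sequence described above.
    # FastWriteSketch and its methods are illustrative assumptions,
    # not a real controller API.
    class FastWriteSketch:
        def __init__(self):
            self.cache = {}   # volatile cache: track id -> data
            self.nvs = {}     # battery-protected nonvolatile storage
            self.disk = {}    # backing DASD

        def write(self, track_id, data):
            # The update is placed in BOTH cache and NVS before the I/O
            # is signaled complete; the application never waits on disk.
            self.cache[track_id] = data
            self.nvs[track_id] = data
            return "I/O complete"   # signaled to the application here

        def destage(self, track_id):
            # Later, the update is destaged to disk; only then is the
            # NVS copy marked as available for overwriting.
            self.disk[track_id] = self.cache[track_id]
            del self.nvs[track_id]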

To maximize the efficiency of the cache, storage servers offer a variety of caching algorithms that reserve the cache for data with good caching characteristics and prevent poor cache candidates from swamping it. These caching algorithms are either invoked directly by the software or selected by the server itself.

Cache is managed with a least recently used (LRU) algorithm, in which the oldest data is made available to be overwritten by new data. A large cache improves the residence time even for a cache-unfriendly application. Caching is controlled by the hardware at the volume or extent level; through software, it is controlled at the subsystem level (with the IDCAMS SETCACHE command) and at the volume or data set level.
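As a rough illustration of LRU management (hypothetical Python, not controller code), the sketch below promotes an entry to "most recently used" on every re-reference and overwrites the oldest entry when the cache is full:

    from collections import OrderedDict

    class LRUCacheSketch:
        def __init__(self, capacity):
            self.capacity = capacity
            self.slots = OrderedDict()   # oldest entry first

        def reference(self, track_id, stage):
            if track_id in self.slots:               # cache hit
                self.slots.move_to_end(track_id)     # now most recently used
                return self.slots[track_id]
            if len(self.slots) >= self.capacity:
                self.slots.popitem(last=False)       # evict the LRU entry
            self.slots[track_id] = stage(track_id)   # stage from disk on a miss
            return self.slots[track_id]

Here stage stands in for whatever staging function fetches the data from disk; the point of the sketch is only the eviction order.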

9.3.1 Track Caching

Track caching assumes that once a record on a track has been accessed, another record on the same track will be accessed soon. This is the only caching algorithm used by the RVA.

When a track is accessed on disk, either the required record is passed back to the application and simultaneously copied into the cache while the remainder of the track is staged from disk, or, in the case of the RVA, the whole compressed and compacted track is staged into the cache.
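A simplified read path might look like the following sketch (illustrative Python; disk_read_track and the list-of-records track layout are assumptions). The whole track is staged on a miss, on the bet that neighboring records will be referenced next:

    def read_with_track_caching(cache, track_id, record_no, disk_read_track):
        track = cache.get(track_id)
        if track is not None:
            return track[record_no]         # cache hit: no disk access
        track = disk_read_track(track_id)   # stage the WHOLE track,
        cache[track_id] = track             # betting that neighboring
        return track[record_no]             # records are referenced next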

Good performance for track caching depends on good locality of reference. Random workloads often result in poor cache hit rates; that is, data is staged into the cache but never re-referenced. Unproductive staging results in:

Keeping the disk busy while the track is staged into cache

Keeping the paths busy while staging

Using up space in the cache

A poor cache hit rate is one below about 50-60% for reads. To gain the benefits of DFW, data with a poor cache hit rate requires a large cache.

9.3.2 Read Record Caching

Read record caching is suitable for data that has a poor cache hit rate and is therefore subject to unproductive staging. Where read record caching algorithms are invoked, the required record is returned to the application and copied into the cache, but the remainder of the track is not staged. Record caching therefore avoids degrading the performance of good cache candidates.
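By contrast with the track-caching sketch in 9.3.1, a record-caching read path caches only the requested record, as in this illustrative sketch (the helper disk_read_record is an assumption):

    def read_with_record_caching(cache, track_id, record_no, disk_read_record):
        key = (track_id, record_no)
        record = cache.get(key)
        if record is not None:
            return record                    # hit on this exact record
        record = disk_read_record(track_id, record_no)
        cache[key] = record                  # cache ONLY this record; the
        return record                        # rest of the track stays on disk

Because each miss consumes only one record's worth of cache, a random workload cannot flood the cache with whole tracks it will never re-reference.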
