Recovery from failures associated with the coupling facility

This topic deals with recovery from failures that arise from the use of the coupling facility and that affect CICS units of work.

It covers:

• SMSVSAM cache structure failures

• SMSVSAM lock structure failures (lost locks)

• Connection failure to a coupling facility cache structure

• Connection failure to a coupling facility lock structure

• MVS system recovery and sysplex recovery

Cache failure support

This type of failure affects only data sets opened in RLS mode.

SMSVSAM supports cache set definitions that allow you to define multiple cache structures within a cache set across one or more coupling facilities. To guard against a cache structure failure, use at least two coupling facilities and define each cache structure within the cache set on a different coupling facility.
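The placement rule above can be expressed as a simple check. The following Python sketch is illustrative only: the function name, the mapping of structure names to coupling facility names, and the sample names CACHE1, CF01, and so on are all assumptions made for this example and do not correspond to any SMS or CFRM interface.

```python
from collections import Counter


def check_cache_set_placement(structures):
    """Check a cache set against the placement recommendation.

    structures: hypothetical mapping of cache structure name -> coupling
    facility name. Returns a list of problem descriptions (empty if none).
    """
    problems = []
    per_cf = Counter(structures.values())
    # Recommendation 1: use at least two coupling facilities.
    if len(per_cf) < 2:
        problems.append("cache set uses fewer than two coupling facilities")
    # Recommendation 2: each structure on a different coupling facility.
    for cf, count in per_cf.items():
        if count > 1:
            names = sorted(n for n, c in structures.items() if c == cf)
            problems.append(f"{', '.join(names)} share coupling facility {cf}")
    return problems


# Hypothetical cache set that follows the recommendation.
print(check_cache_set_placement({"CACHE1": "CF01", "CACHE2": "CF02"}))
```

An empty result means that the hypothetical cache set has no single coupling facility whose loss would remove every cache structure in the set.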

If a cache structure fails, SMSVSAM attempts to rebuild the structure. If the rebuild fails, SMSVSAM switches data sets that were using the failed structure to use another cache structure in the cache set. If SMSVSAM succeeds in either rebuilding the structure or switching to another cache structure, processing continues normally, and the failure is transparent to CICS regions. Because the cache is used as a store-through cache, no committed data is lost.
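The order of the recovery attempts described above (rebuild first, then switch to another structure in the cache set) can be modeled as follows. This is a minimal Python sketch of the decision sequence only, not SMSVSAM code: the class names, the try_rebuild callable, and the structure and coupling facility names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheStructure:
    name: str
    coupling_facility: str
    failed: bool = False


@dataclass
class CacheSet:
    structures: list


def recover_from_cache_failure(cache_set, failed, try_rebuild):
    """Return (outcome, structure) for a failed cache structure.

    try_rebuild is a hypothetical callable that stands in for the rebuild
    attempt and returns True if the rebuild succeeds.
    """
    failed.failed = True
    # Step 1: attempt to rebuild the failed structure.
    if try_rebuild(failed):
        failed.failed = False
        return "rebuilt", failed
    # Step 2: switch to another usable structure in the same cache set.
    for candidate in cache_set.structures:
        if candidate is not failed and not candidate.failed:
            return "switched", candidate
    # Step 3: neither worked; in the real system the error is reported
    # to CICS when it next accesses a data set bound to the failed cache.
    return "unrecovered", None


# Hypothetical two-structure cache set in which the rebuild fails.
cs1 = CacheStructure("CACHE1", "CF01")
cs2 = CacheStructure("CACHE2", "CF02")
outcome, target = recover_from_cache_failure(CacheSet([cs1, cs2]), cs1, lambda s: False)
print(outcome, target.name)  # prints "switched CACHE2"
```

In this model, only the "unrecovered" outcome is visible to CICS; the other two correspond to the cases in which the failure is transparent.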

The support for rebuilding cache structures enables coupling facility storage to be used effectively. It is not necessary to reserve space for a rebuild to recover from a cache structure failure, because SMSVSAM uses any available space.

If RLS is unable to recover from the cache failure for any reason, the error is reported to CICS when it tries to access a data set that is bound to the failed cache, and CICS issues message DFHFC0162 followed by DFHFC0158. CICS defers any activity on data sets bound to the failed cache by abending units of work that attempt to access the data sets. When “cache failed” responses are encountered during dynamic backout of the abended units of work, CICS invokes backout failure support (see “Backout-failed recovery” on page 79). RLS open requests for data sets that must bind to the failed cache, and RLS record access requests for open data sets that are already bound to the failed cache, receive error responses from SMSVSAM.

When either the failed cache becomes available again, or SMSVSAM is able to connect to another cache in a data set’s cache set, CICS is notified by the SMSVSAM quiesce protocols. CICS then retries all backouts that were deferred because of cache failures.

Whenever CICS is notified that a cache is available, it also drives backout retries for other types of backout failure, because this notification provides an opportunity to complete backouts that might have failed because of some transient condition.
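The deferral and retry behavior described above can be pictured as a queue of deferred backouts that is retried in full whenever a cache-available notification arrives. The following Python sketch is a simplified model under that assumption, not the CICS implementation: the class, its methods, the retry_backout callable, the cause labels, and the unit of work identifiers are all hypothetical.

```python
from collections import defaultdict


class BackoutRetryQueue:
    """Toy model of deferred backouts, grouped by the cause of the failure."""

    def __init__(self):
        # cause -> list of unit-of-work identifiers whose backout was deferred
        self._deferred = defaultdict(list)

    def defer(self, uow_id, cause):
        self._deferred[cause].append(uow_id)

    def on_cache_available(self, retry_backout):
        """Retry every deferred backout, whatever its original cause.

        retry_backout(uow_id) is a hypothetical callable that returns True
        if the backout completes on this attempt.
        """
        completed = []
        for cause in list(self._deferred):
            still_failed = []
            for uow_id in self._deferred[cause]:
                if retry_backout(uow_id):
                    completed.append(uow_id)
                else:
                    still_failed.append(uow_id)
            # Keep only the backouts that are still failing.
            if still_failed:
                self._deferred[cause] = still_failed
            else:
                del self._deferred[cause]
        return completed

    def pending(self):
        return {cause: list(ids) for cause, ids in self._deferred.items()}


queue = BackoutRetryQueue()
queue.defer("UOW1", "cache_failed")
queue.defer("UOW2", "other_transient_failure")
# One notification drives retries for both kinds of deferred backout.
print(queue.on_cache_available(lambda uow_id: True))  # prints ['UOW1', 'UOW2']
```

The point of the model is that the retry is not filtered by cause: a single notification gives every deferred backout another chance to complete.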

1. Cache structure. One of three types of coupling facility data structure supported by MVS. SMSVSAM uses its cache structure to perform buffer pool management across the sysplex. This enables SMSVSAM to ensure that the data in the VSAM buffer pools in each MVS image remains valid.

88 CICS TS for z/OS 4.1: Recovery and Restart Guide

IBM SC34-7012-01 manual Recovery from failures associated with the coupling facility, Cache failure support