In general, forward recovery is applicable to data set failures, or failures in similar data resources, which cause data to become unusable because it has been corrupted or because the physical storage medium has been damaged.

Minimizing the effect of failures

An online system should limit the effect of any failure. Where possible, a failure that affects only one user, one application, or one data set should not halt the entire system.

Furthermore, if processing for one user is forced to stop prematurely, it should be possible to back out any changes made to any data sets as if the processing had not started.

If processing for the entire system stops, there may be many users whose updating work is interrupted. On a subsequent startup of the system, only those data set updates in process (in-flight) at the time of failure should be backed out. Backing out only the in-flight updates makes restart quicker and reduces the amount of data that users must reenter.
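
The idea of backing out only in-flight work can be illustrated with a minimal Python sketch. It is not how CICS implements restart; the `LogRecord` layout, the `back_out_in_flight` function, and the use of a dictionary as the data set are all assumptions made for illustration. The sketch treats any unit of work without a commit record as in-flight and restores its before-images in reverse order.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    uow: str               # unit-of-work identifier (hypothetical field)
    kind: str              # "update" or "commit"
    key: str = ""          # key of the updated record
    before: object = None  # before-image; None means the record did not exist


def back_out_in_flight(data, log):
    """Undo updates made by units of work that never logged a commit."""
    committed = {rec.uow for rec in log if rec.kind == "commit"}
    backed_out = set()
    # Walk the log backwards so that the oldest before-image is restored last.
    for rec in reversed(log):
        if rec.kind != "update" or rec.uow in committed:
            continue  # committed work is left untouched
        if rec.before is None:
            data.pop(rec.key, None)      # the update was an add: remove it
        else:
            data[rec.key] = rec.before   # restore the earlier value
        backed_out.add(rec.uow)
    return backed_out


# State of the data at the time of failure: U1 committed, U2 was in-flight.
data = {"A": 150, "B": 75, "C": 1}
log = [
    LogRecord("U1", "update", "A", 100),
    LogRecord("U1", "commit"),
    LogRecord("U2", "update", "B", 50),
    LogRecord("U2", "update", "C", None),
]
in_flight = back_out_in_flight(data, log)
```

Here only U2's changes are reversed; the committed update to record A survives, so that work does not have to be reentered.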

Ideally, it should be possible to restore the data to a consistent, known state following any type of failure, with minimal loss of valid updating activity.

The role of CICS

The CICS recovery manager and the log manager perform the logging functions necessary to support automatic backout. Automatic backout is provided for most CICS resources, such as databases, files, and auxiliary temporary storage queues, either following a transaction failure or during an emergency restart of CICS.

If the backout of a VSAM file fails, CICS backout failure processing ensures that all locks on the backout-failed records are retained, and the backout-failed parts of the unit of work (UOW) are shunted to await retry. The VSAM file remains open for use. For an explanation of shunted units of work and retained locks, see “Shunted units of work” on page 13.
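
The behavior described above can be pictured with a toy Python model. This is a sketch under stated assumptions, not the CICS implementation: the `ToyFile` class, its `unwritable` set (standing in for a damaged area of the data set), and the `retry` method are hypothetical names. The model shows the three effects named in the text: the lock on the failed record is retained, the failed part of the unit of work is shunted, and the rest of the file stays usable.

```python
class RecordLocked(Exception):
    """Raised when a request touches a record protected by a retained lock."""


class ToyFile:
    def __init__(self, records):
        self.records = dict(records)
        self.unwritable = set()      # keys whose backout currently fails
        self.retained_locks = set()  # locks kept after a backout failure
        self.shunted = {}            # uow id -> undo entries awaiting retry

    def read(self, key):
        if key in self.retained_locks:
            raise RecordLocked(key)
        return self.records[key]

    def back_out(self, uow, undo_entries):
        """Restore before-images; shunt whatever cannot be backed out."""
        failed = []
        for key, before in reversed(undo_entries):
            if key in self.unwritable:
                self.retained_locks.add(key)   # keep the record protected
                failed.append((key, before))
            else:
                self.records[key] = before
        if failed:
            failed.reverse()                   # keep original log order
            self.shunted[uow] = failed
        return not failed

    def retry(self, uow):
        """Retry the shunted part of a unit of work after the fault is fixed."""
        entries = self.shunted.pop(uow, [])
        for key, _ in entries:
            self.retained_locks.discard(key)
        return self.back_out(uow, entries)


f = ToyFile({"A": 2, "B": 20})
f.unwritable.add("B")                          # simulate localized damage
ok = f.back_out("U1", [("A", 1), ("B", 10)])   # B cannot be backed out
```

After the failed backout, record A is readable with its restored value, a read of record B raises `RecordLocked`, and U1 waits in `shunted` until `retry("U1")` is called once the fault has been repaired.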

If the cause of the backout failure is a physically damaged data set, and provided the damage affects only a localized section of the data set, you can choose a time when it is convenient to take the data set offline for recovery. You can then use the forward recovery log with a forward recovery utility, such as CICS VSAM Recovery, to restore the data set and re-enable it for CICS use.

Note: In many cases, a data set failure also causes a processing failure. In this event, forward recovery must be followed by backward recovery.

You do not need to shut CICS down to perform these recovery operations. For data sets accessed by CICS in VSAM record-level sharing (RLS) mode, you can quiesce the data set to allow you to perform the forward recovery offline. On completion of forward recovery, setting the data set to unquiesced causes CICS to perform the backward recovery automatically.

For files accessed in non-RLS mode, you can issue a SET DSNAME RETRY command after the forward recovery, which causes CICS to perform the backward recovery online.
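
The ordering of the two recovery phases can be sketched in Python. This is an illustrative model only, assuming a forward recovery log of after-images and an undo log of before-images; the function names, the tuple layouts, and the `committed` set are hypothetical and do not represent the interface of any recovery utility. Forward recovery rebuilds the data set from a backup copy, and backward recovery then removes the updates of units of work that never completed.

```python
def forward_recover(backup, forward_log):
    """Reapply logged after-images to a copy of the backup."""
    data = dict(backup)
    for _uow, key, after in forward_log:
        data[key] = after
    return data


def backward_recover(data, undo_log, committed):
    """Back out updates of units of work that did not commit."""
    for uow, key, before in reversed(undo_log):
        if uow not in committed:
            data[key] = before
    return data


backup = {"A": 100, "B": 50}                          # last good copy
forward_log = [("U1", "A", 150), ("U2", "B", 75)]     # after-images
undo_log = [("U1", "A", 100), ("U2", "B", 50)]        # before-images
committed = {"U1"}                                    # U2 was in-flight

restored = forward_recover(backup, forward_log)
consistent = backward_recover(dict(restored), undo_log, committed)
```

In this example forward recovery alone would leave U2's uncommitted update to record B in place, which is why the backward recovery step follows it.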

CICS TS for z/OS 4.1: Recovery and Restart Guide
