Chapter 1. Recovery and restart facilities

Problems that occur in a data processing system could be failures with communication protocols, data sets, programs, or hardware. These problems are potentially more severe in online systems than in batch systems, because the data is processed in an unpredictable sequence from many different sources.

Online applications therefore require a system with special mechanisms for recovery and restart that batch systems do not require. These mechanisms ensure that each resource associated with an interrupted online application returns to a known state so that processing can restart safely. Together with suitable operating procedures, these mechanisms should provide automatic recovery from failures and allow the system to restart with the minimum of disruption.

The two main recovery requirements of an online system are:

- To maintain the integrity and consistency of data

- To minimize the effect of failures

CICS provides a facility, called the recovery manager, to meet these two requirements. The CICS recovery manager provides the recovery and restart functions that are needed in an online system.

Maintaining the integrity of data

Data integrity means that the data is in the form you expect and has not been corrupted. The objective of recovery operations on files, databases, and similar data resources is to maintain and restore the integrity of the information.

Recovery must also ensure that related changes are consistent: either all of them are made or none of them is. (The term resources, as used in this book, refers to data resources unless stated otherwise.)

Logging changes

One way of maintaining the integrity of a resource is to keep a record, or log, of all the changes made to a resource while the system is executing normally. If a failure occurs, the logged information can help recover the data.

An online system can use the logged information in two ways:

1. It can be used to back out incomplete or invalid changes to one or more resources. This is called backward recovery, or backout. For backout, it is necessary to record the contents of a data element before it is changed. These records are called before-images. In general, backout is applicable to processing failures that prevent one or more transactions (or a batch program) from completing.

2. It can be used to reconstruct changes to a resource, starting with a backup copy of the resource taken earlier. This is called forward recovery. For forward recovery, it is necessary to record the contents of a data element after it is changed. These records are called after-images.
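
The two uses of logged information can be illustrated with a minimal sketch. It is not how CICS implements logging or recovery. It is a toy in-memory model, and every name in it (LogRecord, LoggedStore, back_out, forward_recover) is hypothetical and chosen only for illustration. It assumes that each change is logged with both a before-image and an after-image, and that a backup is a plain copy of the data taken before the logged changes began.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Sentinel meaning "the data element did not exist before the change".
_ABSENT = object()


@dataclass
class LogRecord:
    """One logged change (hypothetical structure, not a CICS log format)."""
    key: str
    before: Any  # before-image: contents before the change, or _ABSENT
    after: Any   # after-image: contents after the change


class LoggedStore:
    """Toy resource that logs every change before applying it."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self.data: Dict[str, Any] = dict(data)
        self.log: List[LogRecord] = []

    def write(self, key: str, value: Any) -> None:
        before = self.data.get(key, _ABSENT)
        self.log.append(LogRecord(key, before, value))
        self.data[key] = value


def back_out(store: LoggedStore, from_index: int) -> None:
    """Backward recovery: undo the changes logged at or after from_index,
    newest first, by restoring each before-image."""
    for record in reversed(store.log[from_index:]):
        if record.before is _ABSENT:
            store.data.pop(record.key, None)
        else:
            store.data[record.key] = record.before
    del store.log[from_index:]


def forward_recover(backup: Dict[str, Any],
                    log: List[LogRecord]) -> Dict[str, Any]:
    """Forward recovery: start from a backup copy and reapply each
    after-image in the order in which it was logged."""
    restored = dict(backup)
    for record in log:
        restored[record.key] = record.after
    return restored
```

In this sketch, a caller would note len(store.log) at the start of a unit of work and pass that index to back_out if the work fails part-way. If the resource itself is lost, forward_recover(backup, store.log) rebuilds it from the backup copy and the after-images of the changes that were not backed out.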

© Copyright IBM Corp. 1982, 2010

IBM SC34-7012-01 manual Recovery and restart facilities, Maintaining the integrity of data, Logging changes