Callback Failures

Figure 61 Callback Retraction Example

Appendix D Quality of Service Guide

Callbacks

The FSM must handle the case where a client does not respond to a callback within the specified timeout period (RtTokenTimeout). If a client does not respond, the FSM must assume the worst: that the client is a rogue that could wreak havoc on real-time I/O. The FSM must retract the tokens it just issued and return to the previous state.

As mentioned earlier, the original requestor will receive an error (EREMOTE) and the IP address of the first client that did not respond to the callback. The FSM enters the token retraction state, and will not honor any real-time or token requests until it has received positive acknowledgement from all clients to which it originally sent the callbacks.

In Figure 61, Client A requests some amount of rtio as in Figure 60. However, assume that Client C did not respond to the initial callback in time (step 7). The FSM will return a failure to Client A for the initial rtio request, then send out callbacks to all clients indicating the stripe group is no longer real-time (steps 11-14). In the example, Client C responds to the second callback, so the FSM will not send out any more callbacks. The stripe group is back in non-real-time mode.
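The retraction behavior described above can be pictured with a small model. The following Python sketch is illustrative only and is not StorNext code: the class ToyFsm, its method names, the "REFUSED" result, and the 192.0.2.x addresses are all hypothetical, and it assumes that callbacks go to every client other than the requestor. A client that is missing from the responsive set stands in for one that did not answer within RtTokenTimeout.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Mode(Enum):
    NON_REALTIME = auto()
    REALTIME = auto()
    RETRACTING = auto()  # token retraction state


@dataclass
class ToyFsm:
    """Hypothetical model of the FSM's callback bookkeeping."""
    clients: dict                       # client name -> IP address
    mode: Mode = Mode.NON_REALTIME
    previous: Mode = Mode.NON_REALTIME
    awaiting_ack: set = field(default_factory=set)

    def request_rtio(self, requestor, responsive):
        """Handle an rtio request.

        `responsive` is the set of clients that answer the callback in time.
        Returns (status, ip_of_first_silent_client_or_None).
        """
        if self.mode is Mode.RETRACTING:
            # Assumption: the source only says such requests are not honored.
            return ("REFUSED", None)
        targets = [c for c in self.clients if c != requestor]
        silent = [c for c in targets if c not in responsive]
        if not silent:
            self.mode = Mode.REALTIME
            return ("OK", None)
        # A client did not respond: retract and remember the state to restore.
        self.previous = self.mode
        self.mode = Mode.RETRACTING
        self.awaiting_ack = set(targets)
        return ("EREMOTE", self.clients[silent[0]])

    def ack_retraction(self, client):
        """Record a positive acknowledgement of the retraction callback."""
        self.awaiting_ack.discard(client)
        if self.mode is Mode.RETRACTING and not self.awaiting_ack:
            self.mode = self.previous


# Scenario loosely following Figure 61: Client C misses the first callback.
fsm = ToyFsm({"A": "192.0.2.1", "B": "192.0.2.2", "C": "192.0.2.3"})
print(fsm.request_rtio("A", responsive={"B"}))  # ('EREMOTE', '192.0.2.3')
fsm.ack_retraction("B")
fsm.ack_retraction("C")   # C answers the second (retraction) callback
print(fsm.mode.name)      # NON_REALTIME
```

In this model, the FSM refuses new requests until every client that received the original callback has acknowledged the retraction, and only then returns to the state it held before the failed request.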

Note that this can have interesting repercussions with file systems that are soft mounted by default (such as Windows). When the caller times out because other clients are not responding and then gives up and returns an error to the application, if at some point the FSM is able to process the rtio request it may result in the stripe group being put into

StorNext 3.1.3 Installation Guide
