Cluster File System Architecture

About CFS

Table 2-1

Primary and Secondary Mount Options

                    Secondary:   Secondary:   Secondary:
                    ro           rw           ro, crw
Primary: ro         X
Primary: rw                      X            X
Primary: ro, crw                 X            X

Mounting the primary node with only the -o cluster,ro option prevents the secondary nodes from mounting in read-write mode. Mounting the primary node with the rw option implies read-write capability throughout the cluster.
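
The combinations in Table 2-1 can be expressed as a small lookup. The following is an illustrative sketch only: it assumes that an X in the table marks a permitted combination, and the names ALLOWED_SECONDARY, normalize, and secondary_mount_allowed are hypothetical and not part of CFS or of any mount utility.

```python
# Hypothetical encoding of Table 2-1; this is not a CFS interface.
# Keys are the primary's mount mode; values are the secondary modes
# marked with an X in the table.
ALLOWED_SECONDARY = {
    "ro": {"ro"},
    "rw": {"rw", "ro,crw"},
    "ro,crw": {"rw", "ro,crw"},
}


def normalize(mode: str) -> str:
    """Strip spaces so that 'ro, crw' and 'ro,crw' compare equal."""
    return ",".join(part.strip() for part in mode.split(","))


def secondary_mount_allowed(primary: str, secondary: str) -> bool:
    """Return True if Table 2-1 marks the (primary, secondary) pair."""
    allowed = ALLOWED_SECONDARY.get(normalize(primary), set())
    return normalize(secondary) in allowed
```

For example, secondary_mount_allowed("ro", "rw") returns False, which matches the statement that a read-only primary prevents read-write secondaries.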

Parallel I/O

Some distributed applications read and write to the same file concurrently from one or more nodes in the cluster; for example, a distributed application in which one thread appends to a file while one or more threads read from various regions of the file. High-performance computing (HPC) applications that perform concurrent I/O on the same file can also benefit from this feature. Applications do not require any changes to use the parallel I/O feature.

Traditionally, the entire file is locked to perform I/O to a small region. To support parallel I/O, CFS locks ranges in a file that correspond to an I/O request. Two I/O requests conflict if at least one is a write request and its I/O range overlaps the I/O range of the other request.
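
The conflict rule above can be written as a short predicate. This is a minimal sketch of the rule as stated, not the CFS implementation; the IORequest type and the function names are hypothetical, and ranges are assumed to be half-open byte ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IORequest:
    offset: int      # starting byte offset in the file
    length: int      # number of bytes covered by the request
    is_write: bool   # True for a write, False for a read

    @property
    def end(self) -> int:
        """Exclusive end offset of the request's range."""
        return self.offset + self.length


def ranges_overlap(a: IORequest, b: IORequest) -> bool:
    """Half-open ranges overlap if each starts before the other ends."""
    return a.offset < b.end and b.offset < a.end


def conflicts(a: IORequest, b: IORequest) -> bool:
    """Two requests conflict if at least one writes and the ranges overlap."""
    return (a.is_write or b.is_write) and ranges_overlap(a, b)
```

Under this rule two overlapping reads never conflict, and a write conflicts only with requests whose byte ranges it actually touches.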

The parallel I/O feature enables multiple threads to perform I/O to a file concurrently, as long as the requests do not conflict. Threads issuing concurrent I/O requests can execute on the same node or on different nodes in the cluster.
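
The access pattern described here can be sketched with ordinary positional I/O. The example below runs on any POSIX file system and only illustrates threads on one node issuing non-conflicting requests to disjoint ranges of a single file; it uses no CFS-specific calls, and BLOCK, write_region, and run_demo are names chosen for the example.

```python
import os
import tempfile
import threading

BLOCK = 4096  # hypothetical region size chosen for the example


def write_region(fd: int, index: int) -> None:
    """Write one BLOCK-sized region at its own offset using positional I/O."""
    data = bytes([index]) * BLOCK
    os.pwrite(fd, data, index * BLOCK)


def run_demo(path: str, workers: int = 4) -> bytes:
    """Have several threads write disjoint regions of one file, then read it back."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        threads = [threading.Thread(target=write_region, args=(fd, i))
                   for i in range(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return os.pread(fd, workers * BLOCK, 0)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = run_demo(os.path.join(tmp, "shared.dat"))
        print(len(result))
```

Because each thread writes its own byte range, no two requests overlap, so none of them conflict under the rule given earlier.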

An I/O request that requires allocation is not executed concurrently with other I/O requests. Note that when a writer is extending the file and readers are lagging behind, block allocation is not necessarily done for each extending write.

If the file size can be predetermined, the file can be preallocated to avoid block allocations during I/O. This improves the concurrency of applications performing parallel I/O to the file. Parallel I/O also avoids unnecessary page cache flushes and invalidations using range locking, without compromising the cache coherency across the cluster.
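
Preallocation of a file whose size is known in advance might look like the following. This sketch assumes the generic POSIX posix_fallocate call as exposed by Python; the file system may offer its own preallocation tools, which are not shown here. The preallocate helper is hypothetical and is intended for newly created files, because its fallback path writes zeros over the range.

```python
import os


def preallocate(fd: int, size: int, chunk: int = 1 << 20) -> None:
    """Reserve `size` bytes for a new file so later writes need no new blocks."""
    if hasattr(os, "posix_fallocate"):
        try:
            os.posix_fallocate(fd, 0, size)
            return
        except OSError:
            pass  # fall through to the portable path below
    # Portable fallback: write zeros over the whole range.
    # This overwrites existing data, so use it only on new files.
    zeros = bytes(chunk)
    offset = 0
    while offset < size:
        n = min(chunk, size - offset)
        offset += os.pwrite(fd, zeros[:n], offset)
```

An application would call preallocate once, right after creating the file, and before the parallel readers and writers start.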

For applications that update the same file from multiple nodes, the -nomtime mount option provides further concurrency. With this option, the modification and change times of the file are not synchronized across the cluster, which eliminates the overhead of increased I/O and locking. The timestamps seen for these files from a node may not reflect updates that happened in the last 60 seconds.
