2 Background
To understand the continuing popularity of portable storage, it is useful to review the strengths and weaknesses of portable storage and distributed file systems. While there is considerable variation in the designs of distributed file systems, there is also a substantial degree of commonality across them. Our discussion below focuses on these common themes.
Performance: A portable storage device offers uniform performance at all locations, independent of factors such as network connectivity, initial cache state, and temporal locality of references. Except for a few devices such as floppy disks, the access times and bandwidths of portable devices are comparable to those of local disks. In contrast, the performance of a distributed file system is highly variable. With a warm client cache and good locality, performance can match local storage. With a cold cache, poor connectivity and low locality, performance can be intolerably slow.
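The gap between the warm-cache and cold-cache cases can be illustrated with the standard weighted-average model of access time. The sketch below is only an illustration: the function name and the latency figures are hypothetical placeholders, not measurements from any system discussed here.

```python
def expected_access_ms(hit_ratio, local_ms, remote_ms):
    """Average access time when a fraction `hit_ratio` of references
    is served from the client cache and the rest go to the server."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * local_ms + (1.0 - hit_ratio) * remote_ms


# Hypothetical latencies, chosen only to make the contrast visible.
LOCAL_MS = 10.0    # assumed cost of a cache hit (local disk)
REMOTE_MS = 500.0  # assumed cost of a miss over a poor network

warm = expected_access_ms(1.0, LOCAL_MS, REMOTE_MS)   # every reference hits
mixed = expected_access_ms(0.5, LOCAL_MS, REMOTE_MS)  # half the references miss
cold = expected_access_ms(0.0, LOCAL_MS, REMOTE_MS)   # every reference misses
```

Under this model a fully warm cache behaves like local storage, while performance degrades linearly toward the remote latency as the hit ratio falls.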
Availability: If you have a portable storage device in hand, you can access its data. Short of device failure, which is very rare, no other common failures prevent data access. In contrast, distributed file systems are susceptible to network failure, server failure, and a wide range of operator errors.
Robustness: A portable storage device can easily be lost, stolen or damaged. Data on the device becomes permanently inaccessible after such an event. In contrast, data in a distributed file system continues to be accessible even if a particular client that uses it is lost, stolen or damaged. For added robustness, the operational staff of a distributed file system perform regular backups and typically keep some of the backups off site to allow recovery after catastrophic site failure. Backups also help recovery from user error: if a user accidentally deletes a critical file, he can recover a
Sharing/Collaboration: The existence of a common name space simplifies sharing of data and collaboration between the users of a distributed file system. This is much harder if done by physical transfers of devices. If one is restricted to sharing through physical devices, a system such as PersonalRAID can be valuable in managing complexity.
Consistency: Without explicit user effort, a distributed file system presents the latest version of a file when it is accessed. In contrast, a portable device has to be explicitly kept up to date. When multiple users can update a file, it is easy to get into situations where a portable device has stale data without its owner being aware of this fact.
Capacity: Any portable storage device has finite capacity. In contrast, the client of a distributed file system can access virtually unlimited amounts of data spread across multiple file servers. Since local storage on the client is merely a cache of server data, its size only limits working set size rather than total data size.
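The distinction between working set size and total data size can be made concrete with a small simulation. The following is a minimal sketch, assuming an LRU replacement policy and a hypothetical in-memory "server"; the class and variable names are illustrative and do not describe any particular distributed file system.

```python
from collections import OrderedDict


class ClientCache:
    """Fixed-capacity LRU cache standing in for a client's local disk cache."""

    def __init__(self, capacity, fetch_from_server):
        self.capacity = capacity
        self.fetch = fetch_from_server
        self.entries = OrderedDict()  # least recently used entry first
        self.misses = 0

    def read(self, name):
        if name in self.entries:
            self.entries.move_to_end(name)  # mark as most recently used
            return self.entries[name]
        self.misses += 1
        data = self.fetch(name)             # fall back to the server
        self.entries[name] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data


# Hypothetical server holding far more files than the cache can store.
server = {f"file{i}": f"contents of file{i}" for i in range(1000)}
cache = ClientCache(capacity=4, fetch_from_server=server.__getitem__)

# Every file remains reachable despite the tiny cache.
for i in range(1000):
    assert cache.read(f"file{i}") == server[f"file{i}"]
cold_misses = cache.misses

# A working set of 4 files fits: only the first pass misses.
for _ in range(10):
    for i in range(4):
        cache.read(f"file{i}")
small_ws_misses = cache.misses - cold_misses

# A working set of 5 files exceeds capacity and keeps missing.
before = cache.misses
for _ in range(10):
    for i in range(5):
        cache.read(f"file{i}")
large_ws_misses = cache.misses - before
```

In this sketch the cache size never restricts which files can be read; it only determines how often a read must go back to the server.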
Security: The privacy and integrity of data on portable storage devices rely primarily on physical security. A further level of safety can be provided by encrypting the data on the device, and by requiring a password to mount it. These can be valuable as a second layer of defense in case physical security fails. Denial of service is impossible if a user has a portable storage device in hand. In contrast, the security of data in a distributed file system is based on more fragile assumptions. Denial of service may be possible through network attacks. Privacy depends on encryption of network traffic.
Ubiquity: A distributed file system requires operating system support. In addition, it may require environmental support such as Kerberos authentication and specific firewall configuration. Unless a user is at a client that meets all of these requirements, he cannot access his data in a distributed file system. In contrast, portable storage only depends on widely-supported
3 Lookaside Caching
Our goal is to exploit the performance and availability advantages of portable storage to improve these same attributes in a distributed file system. The resulting design should preserve all other characteristics of the underlying distributed file system. In particular,