Integrating Portable and Distributed Storage

Niraj Tolia†‡, Jan Harkes, Michael Kozuch, M. Satyanarayanan†‡

Carnegie Mellon University, Intel Research Pittsburgh

Abstract

We describe a technique called lookaside caching that combines the strengths of distributed file systems and portable storage devices, while negating their weaknesses. In spite of its simplicity, this technique proves to be powerful and versatile. By unifying distributed storage and portable storage into a single abstraction, lookaside caching allows users to treat devices they carry as merely performance and availability assists for distant file servers. Careless use of portable storage has no catastrophic consequences.

1 Introduction

Floppy disks were the sole means of sharing data across users and computers in the early days of personal computing. Although they were trivial to use, considerable discipline and foresight were required of users to ensure data consistency and availability, and to avoid data loss: if you did not have the right floppy at the right place and time, you were in trouble! These limitations were overcome by the emergence of distributed file systems such as NFS [17], Netware [8], LanManager [24], and AFS [7]. In such a system, responsibility for data management is delegated to the distributed file system and its operational staff.

Personal storage has come full circle in the recent past. There has been explosive growth in the availability of USB- and Firewire-connected storage devices such as flash memory keychains and portable disk drives. Although very different from floppy disks in capacity, data transfer rate, form factor, and longevity, their usage model is no different. In other words, they are just glorified floppy disks and suffer from the same limitations mentioned above. Why then are portable storage devices in such demand today? Is there a way to use them that avoids the messy mistakes of the past, where a user was often awash in floppy disks trying to figure out which one had the latest version of a specific file? If loss, theft or destruction of a portable storage device occurs, how can one prevent catastrophic data loss? Since human attention grows ever more scarce, can we reduce the data management demands on attention and discipline in the use of portable devices?

We focus on these and related questions in this paper. We describe a technique called lookaside caching that combines the strengths of distributed file systems and portable storage devices, while negating their weaknesses. In spite of its simplicity, this technique proves to be powerful and versatile. By unifying “storage in the cloud” (distributed storage) and “storage in the hand” (portable storage) into a single abstraction, lookaside caching allows users to treat devices they carry as merely performance and availability assists for distant file servers. Careless use of portable storage has no catastrophic consequences.

Lookaside caching has very different goals and design philosophy from a PersonalRAID system [18], the only previous research that we are aware of on usage models for portable storage devices. Our starting point is the well-entrenched base of distributed file systems in existence today. We assume that these are successful because they offer genuine value to their users. Hence, our goal is to integrate portable storage devices into such a system in a manner that is minimally disruptive of its existing usage model. In addition, we make no changes to the native file system format of a portable storage device; all we require is that the device be mountable as a local file system at any client of the distributed file system. In contrast, PersonalRAID takes a much richer view of the role of portable storage devices. It views them as first-class citizens rather than as adjuncts to a distributed file system. It also uses a customized storage layout on the devices. Our design and implementation are much simpler, but also more limited in functionality.

We begin in Section 2 by examining the strengths and weaknesses of portable storage and distributed file systems. In Sections 3 and 4, we describe the design and implementation of lookaside caching. We quantify the performance benefit of lookaside caching in Section 5, using three different benchmarks. We explore broader use of lookaside caching in Section 6, and conclude in Section 7 with a summary.

Intel IRP-TR-03-10