Chapter 2 HPSS Planning
66 September 2002 HPSS Installation Guide
Release 4.5, Revision 2
storage class reaches the threshold configured in the purge policy for that storage class. Remember
that simply adding migration and purge policies to a storage class will cause MPS to begin running
against the storage class, but it is also critical that the hierarchies to which that storage class belongs
be configured with proper migration targets in order for migration and purge to perform as
expected.
The purpose of disk migration is to make one or more copies of data stored in a disk storage class
to lower levels in the storage hierarchy. BFS uses a metadata queue to pass migration records to MPS.
When a disk file needs to be migrated (because it has been created, modified, or undergone a class
of service change), BFS places a migration record on this queue. During a disk migration run on a
given storage class, MPS uses the records on this queue to identify files which are migration
candidates. Migration records on this queue are ordered by storage hierarchy, file family, and record
create time, in that order. This ordering determines the order in which files are migrated.
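The three-level ordering described above can be illustrated with a short Python sketch. It is not HPSS code: the MigrationRecord class and its field names are hypothetical stand-ins for the real migration record metadata, and the sketch only shows how sorting on (hierarchy, file family, create time) groups the files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationRecord:
    # Hypothetical fields for illustration; not the real HPSS metadata layout.
    file_id: str
    hierarchy_id: int
    family_id: int
    create_time: float


def migration_order(records):
    """Order records by hierarchy, then file family, then record create time."""
    return sorted(
        records,
        key=lambda r: (r.hierarchy_id, r.family_id, r.create_time),
    )


records = [
    MigrationRecord("d", hierarchy_id=2, family_id=1, create_time=10.0),
    MigrationRecord("a", hierarchy_id=1, family_id=2, create_time=5.0),
    MigrationRecord("b", hierarchy_id=1, family_id=1, create_time=9.0),
    MigrationRecord("c", hierarchy_id=1, family_id=1, create_time=3.0),
]
ordered_ids = [r.file_id for r in migration_order(records)]
print(ordered_ids)
```

In this sketch, files "c" and "b" (hierarchy 1, family 1) come first in create-time order, followed by "a" (hierarchy 1, family 2) and then "d" (hierarchy 2).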
MPS allows disk storage classes to be used atop multiple hierarchies (to avoid fragmenting disk
resources). To avoid unnecessary tape mounts, it is desirable to migrate all of the files in one
hierarchy before moving on to the next. At the beginning of each run MPS selects a starting
hierarchy. This is stored in the MPS checkpoint metadata between runs. The starting hierarchy
alternates to ensure that, when errors are encountered or the migration target is not 100 percent, all
hierarchies are served equally. For example, if a disk storage class is being used in three hierarchies,
1, 2, and 3, successive runs will migrate the hierarchies in the following order: 1-2-3, 3-1-2, 2-3-1,
1-2-3, etc. A migration run ends when either the migration target is reached or all of the eligible files
in every hierarchy are migrated. Files are ordered by file family for the same reason, although
families are not checkpointed as hierarchies are. Finally, the record create time is simply the time at
which BFS adds the migration record to the queue, and so files in the same storage class, hierarchy,
and family tend to migrate in the order in which they are written (actually the order in which the write
completes).
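The rotation in the three-hierarchy example can be reproduced with a small Python sketch. The function name and the use of a run counter are assumptions made for illustration. The guide states only that the starting hierarchy is kept in the MPS checkpoint metadata and alternates between runs, so this is one way to generate the sequence shown above and not a description of the MPS implementation.

```python
def hierarchy_order(hierarchies, run_number):
    """Return the order in which hierarchies are visited on a given run.

    Hypothetical helper: the start position moves back by one on each
    successive run, which reproduces 1-2-3, 3-1-2, 2-3-1, 1-2-3, ...
    """
    n = len(hierarchies)
    start = (-run_number) % n
    return hierarchies[start:] + hierarchies[:start]


orders = [hierarchy_order([1, 2, 3], run) for run in range(4)]
for order in orders:
    print("-".join(str(h) for h in order))
```

Over any three consecutive runs, each hierarchy is visited first exactly once, which is the sense in which the hierarchies are served equally when a run stops early.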
When a migration run for a given storage class starts work on a hierarchy, it sets a pointer in the
migration record queue to the first migration record for the given hierarchy and file family.
Following this, migration attempts to build lists of 256 migration candidates. Each migration record
read is evaluated against the values in the migration policy. If the file in question is eligible for
migration its migration record is added to the list. If the file is not eligible, it is skipped and it will
not be considered again until the next migration run. When 256 eligible files are found, MPS stops
reading migration records and does the actual work to migrate these files. This cycle continues until
either the migration target is reached or all of the migration records for the hierarchy in question
are exhausted.
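The read, evaluate, and migrate cycle described above can be sketched in Python as follows. All names here (run_hierarchy, is_eligible, migrate_batch, target_reached) are hypothetical callbacks introduced for illustration. The sketch also assumes that the migration target is checked only after each list of 256 has been migrated, which the text does not specify.

```python
BATCH_SIZE = 256  # list size given in the text


def run_hierarchy(records, is_eligible, migrate_batch, target_reached):
    """Migrate eligible records for one hierarchy in lists of BATCH_SIZE.

    Returns the number of files migrated. Ineligible records are skipped
    and are not revisited during this run.
    """
    batch = []
    migrated = 0
    for record in records:
        if not is_eligible(record):
            continue  # not considered again until the next run
        batch.append(record)
        if len(batch) == BATCH_SIZE:
            migrate_batch(batch)
            migrated += len(batch)
            batch = []
            if target_reached():  # assumption: checked between lists
                return migrated
    if batch:  # records exhausted with a partial list remaining
        migrate_batch(batch)
        migrated += len(batch)
    return migrated


batch_sizes = []
total = run_hierarchy(
    records=range(600),
    is_eligible=lambda r: r % 2 == 0,  # toy policy: even ids are eligible
    migrate_batch=lambda b: batch_sizes.append(len(b)),
    target_reached=lambda: False,
)
print(total, batch_sizes)
```

With 600 records, of which 300 are eligible under the toy policy, the sketch migrates one full list of 256 followed by a final partial list of 44.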
The purpose of disk purge is to maintain a given amount of free space in a disk storage class by
removing data of which copies exist at lower levels in the hierarchy. BFS uses another metadata
queue to pass purge records to MPS. A purge record is created for any disk file which may be
removed from a given level in the hierarchy (because it has been migrated or staged). During a disk
purge run on a given storage class, MPS uses the records on this queue to identify files which are
purge candidates. The order in which purge records are sorted may be configured on the purge
policy, and this determines the order in which files are purged. It should be noted that all of the
options except purge record create time require additional metadata updates and can impose extra
overhead on SFS. Also, unpredictable purge behavior may be observed if the purge record ordering
is changed with existing purge records in the system until these existing records are cleared. Purge
operates strictly on a storage class basis, and makes no consideration of hierarchies or file families.
MPS builds lists of 32 purge records, and each file is evaluated for purge at the point when its purge
record is read. If a file is deemed to be ineligible, it will not be considered again until the next purge run.
A purge run ends when either the supply of purge records is exhausted or the purge target is
reached.
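The two ways a purge run can end are shown in the following Python sketch. It rests on several assumptions that should be read as illustrative only: purge records are modeled as (file_id, size) pairs already sorted in the configured order, the purge target is simplified to a number of bytes to free, and each list of 32 is taken to mean 32 records read (the text does not say whether ineligible records count toward the list).

```python
PURGE_LIST_SIZE = 32  # list size given in the text


def purge_run(purge_records, is_eligible, bytes_to_free):
    """Purge eligible files until the target is met or records run out.

    purge_records: list of (file_id, size) pairs, already in purge order.
    bytes_to_free: hypothetical stand-in for the configured purge target.
    Returns (purged_file_ids, bytes_freed).
    """
    purged = []
    freed = 0
    for start in range(0, len(purge_records), PURGE_LIST_SIZE):
        purge_list = purge_records[start:start + PURGE_LIST_SIZE]
        for file_id, size in purge_list:
            # Each file is evaluated at the point its record is read.
            if not is_eligible(file_id):
                continue  # not considered again until the next purge run
            purged.append(file_id)
            freed += size
            if freed >= bytes_to_free:
                return purged, freed  # purge target reached
    return purged, freed  # supply of purge records exhausted


records = [(i, 10) for i in range(100)]
purged_a, freed_a = purge_run(records, lambda f: True, bytes_to_free=355)
purged_b, freed_b = purge_run(records, lambda f: True, bytes_to_free=10_000)
print(len(purged_a), freed_a, len(purged_b), freed_b)
```

The first call stops as soon as the target is met, and the second runs until the purge records are exhausted without reaching the target.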