The Role of the RMS
Scheduling    deciding when and where to run parallel jobs
Audit         maintaining an audit trail of system state changes
From the user’s point of view, RMS provides tools for:
Information   querying the resources of the system
Execution     loading and running parallel programs on a given set of resources
Monitoring    monitoring the execution of parallel programs
2.3.1 The Structure of the RMS
RMS is implemented as a set of UNIX commands and daemons, programmed in C and
C++, using sockets for communications. All of the details of the system (its
configuration, its current state, usage statistics) are maintained in a SQL database, as
shown in Figure 2.3. See Section 2.3.4 for an overview and
Chapter 10 (The RMS Database) for details of the database.
2.3.2 The RMS Daemons
A set of daemons provide the services required for managing the resources of the system.
To do this, the daemons both query and update the database (see Section 2.3.4).
The Database Manager, msqld, provides SQL database services.
The Machine Manager, mmanager, monitors the status of nodes in an RMS system.
The Partition Manager, pmanager, controls the allocation of resources to users and
the scheduling of parallel programs.
The Switch Network Manager, swmgr, supervises the operation of the Compaq
AlphaServer SC Interconnect, monitoring it for errors and collecting performance
data.
The Event Manager, eventmgr, runs handlers in response to system incidents and
notifies clients who have registered an interest in them.
The Transaction Log Manager, tlogmgr, instigates database transactions that have
been requested in the Transaction Log. All client transactions are made through this
mechanism. This ensures that changes to the database are serialized and an audit
trail is kept.
The Process Manager, rmsmhd, runs on each node in the system. It starts the other
RMS daemons.
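The Transaction Log Manager entry above describes a pattern that may be easier to follow as code: clients only append requests to a log, and a single manager applies them one at a time, so changes are serialized and every change leaves an audit record. The following is a minimal Python sketch of that pattern only. It is not the RMS implementation (RMS is written in C and C++ and uses a SQL database). The class names, the dictionary standing in for the database, and the example keys and values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A request is (client, key, new_value). All names are illustrative only.
Request = Tuple[str, str, str]
# An audit record is (client, key, old_value, new_value).
AuditRecord = Tuple[str, str, Optional[str], str]


@dataclass
class TransactionLog:
    """Hypothetical stand-in for the transaction log: an ordered queue of requests."""
    pending: List[Request] = field(default_factory=list)

    def request(self, client: str, key: str, value: str) -> None:
        # Clients never modify the database directly; they only append a request.
        self.pending.append((client, key, value))


class LogManager:
    """Hypothetical single writer that applies logged requests in order."""

    def __init__(self, log: TransactionLog) -> None:
        self.log = log
        self.database: Dict[str, str] = {}        # stands in for the SQL database
        self.audit_trail: List[AuditRecord] = []  # one record per applied change

    def run_pending(self) -> int:
        """Apply every pending request in arrival order; return how many ran."""
        applied = 0
        while self.log.pending:
            client, key, value = self.log.pending.pop(0)
            old_value = self.database.get(key)
            self.database[key] = value
            self.audit_trail.append((client, key, old_value, value))
            applied += 1
        return applied


# Two clients request conflicting changes; the manager applies them in order.
log = TransactionLog()
log.request("alice", "node3.status", "offline")
log.request("bob", "node3.status", "online")
manager = LogManager(log)
applied = manager.run_pending()
```

Because only the manager writes to the database, the two requests cannot interleave. The later request wins, and the audit trail records both changes together with the value each one replaced.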