Chapter 2: Capacity and Availability Management
Decreasing Single Points of Failure
You can maintain availability in Exchange 2000, even in the event of a failure, provided
the component that fails is not a single point of failure. In some areas, such as database
corruption, it is not possible to eliminate single points of failure, but in many cases you can guard
against individual failures and still maintain reliability. An obvious example is the direc-
tory. By having multiple domain controllers and Global Catalog servers available in any
part of your network, you maintain availability of Exchange even in the event of failure of
a particular domain controller or Global Catalog server. Having local domain controllers
or Global Catalog servers keeps Exchange available in the event of a non-local network
failure.
Using front-end servers is another way to avoid single points of failure. The failure of a single
front-end server will have no effect on the availability of Exchange to non-MAPI clients.
The clients will simply be rerouted to another front-end server, with no loss of service.
Exchange 2000 routing can be modified to minimize single points of failure. In particular,
you can modify Routing Group connectors to ensure that there are multiple bridgeheads
available, and thus maintain delivery from one part of the organization to another. You
can also set up Routing Group meshes, which consist of a series of fully interconnected
Routing Groups with multiple possible routes between them.
Multiple messaging routes between servers are useless if they all rely on the same net-
work connections and the network goes down. You should therefore ensure that there are
multiple network paths (using differing technologies) that Exchange and Windows 2000
can use.
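One way to reason about the routing and network redundancy described above is to model the topology as a graph and look for links whose loss alone would cut part of the organization off. The following is a minimal sketch in Python, not an Exchange tool: the routing group names and link lists are hypothetical, and the same check could be applied to Routing Group connectors or to the network links beneath them.

```python
from collections import deque

def is_connected(nodes, links):
    """Return True if every node can reach every other node over the links."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary node
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes

def single_point_links(nodes, links):
    """List the links whose loss alone would disconnect the topology."""
    critical = []
    for i, link in enumerate(links):
        remaining = links[:i] + links[i + 1:]  # drop only this one link
        if not is_connected(nodes, remaining):
            critical.append(link)
    return critical

# Hypothetical routing groups, used for illustration only.
groups = ["RG-A", "RG-B", "RG-C", "RG-D"]
chain = [("RG-A", "RG-B"), ("RG-B", "RG-C"), ("RG-C", "RG-D")]
full_mesh = [(a, b) for i, a in enumerate(groups) for b in groups[i + 1:]]

print("chain:", single_point_links(groups, chain))
print("mesh: ", single_point_links(groups, full_mesh))
```

In the chain, every link is reported as a single point of failure; in the fully interconnected mesh, none is. Because only one copy of a link is removed at a time, listing the same pair twice models two independent connectors between the same routing groups.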
One of the most significant single points of failure is a mailbox server. Its failure can affect very
large numbers of users, depending on how many mailboxes it hosts. Mailbox servers can be clustered to ensure
their continued high availability. If you are running Exchange 2000 on Windows 2000
Advanced Server, you can cluster over two nodes and you have two possible ways to cluster
the servers: active/passive and active/active. Active/passive clustering is the currently
recommended clustering implementation for Exchange. If you choose to implement active/active
clustering, you should realize that it requires careful planning to ensure that Exchange can
fail over correctly to the other node. With Service Pack 1 of Exchange 2000 and Windows
2000 Datacenter Server, you can have four nodes in your cluster. In this implementation,
consider active/active/active/passive clustering.
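The planning concern behind these clustering choices can be illustrated with a rough capacity check: if any one node fails, can the surviving nodes take over its load? The sketch below is a simplification written in Python. The node names, the per-node capacity of 1000 users, and the load figures are all hypothetical, and it assumes a failed node's load can be spread freely across the survivors, which a real cluster may not allow.

```python
def can_absorb_single_failure(load, capacity):
    """Return True if, for every node failing alone, the surviving nodes
    have enough combined spare capacity to take over that node's load."""
    for failed in load:
        spare = sum(capacity[n] - load[n] for n in load if n != failed)
        if load[failed] > spare:
            return False
    return True

# Hypothetical figures: every node is assumed able to serve 1000 users.
capacity2 = {"node1": 1000, "node2": 1000}
active_passive = {"node1": 800, "node2": 0}
active_active_full = {"node1": 800, "node2": 800}
active_active_half = {"node1": 400, "node2": 400}

capacity4 = {f"node{i}": 1000 for i in range(1, 5)}
three_active_one_passive = {"node1": 800, "node2": 800, "node3": 800, "node4": 0}

scenarios = [
    ("active/passive", active_passive, capacity2),
    ("active/active, heavily loaded", active_active_full, capacity2),
    ("active/active, lightly loaded", active_active_half, capacity2),
    ("active/active/active/passive", three_active_one_passive, capacity4),
]
for name, load, cap in scenarios:
    print(f"{name}: {can_absorb_single_failure(load, cap)}")
```

Under these assumed figures, only the heavily loaded active/active pair fails the check, which reflects why active/active clustering needs headroom planned on each node.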
In a standard clustered environment, however, the disk array is still a single point of
failure, so you should think seriously about using a storage area network (SAN) to
maximize the availability of all your servers running Exchange.
If you are creating truly redundant Exchange 2000 servers, you shouldn’t stop at the disk
subsystem. Your servers should be equipped with redundant RAID controllers, network
interface cards (NICs), and power supplies. In fact, you should aim to have redundancy
everywhere.
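The benefit of duplicating components can be estimated with the standard series and parallel availability formulas. The sketch below, in Python, is illustrative only: the 0.99 availability figures are invented rather than measured, and the calculation assumes that component failures are independent, which shared causes such as a site-wide power loss would violate.

```python
from math import prod

def series(*availabilities):
    """Every component must work, so the availabilities multiply."""
    return prod(availabilities)

def parallel(availability, copies):
    """At least one of `copies` identical, independent components must work."""
    return 1 - (1 - availability) ** copies

# Illustrative availability figures (assumptions, not vendor data).
power_supply = 0.99
nic = 0.99
raid_controller = 0.99

# One of each component: any single failure takes the server down.
single = series(power_supply, nic, raid_controller)

# Two of each component: the server survives one failure of each type.
redundant = series(
    parallel(power_supply, 2),
    parallel(nic, 2),
    parallel(raid_controller, 2),
)

print(f"single components:    {single:.4f}")     # 0.9703
print(f"redundant components: {redundant:.4f}")  # 0.9997
```

With these assumed figures, duplicating each component raises the estimated availability from roughly 97.0 percent to roughly 99.97 percent.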
Single points of failure can also be created by improper maintenance of systems. For
example, if you are using a RAID 5 array on a server running Exchange with a hot spare,