Microsoft Exchange 2000 Operations — Version 1.0
Of course, just because you have allowed for a certain amount of downtime per server per
month, this does not mean that you have to use it, and in most cases you will not. On the
other hand, just because you haven’t performed offline maintenance one month does not
mean that the hours can be carried over to the following month. Your user community will
be very unhappy if you take a system down for 2 days, even if it has been up solidly for 2
years!
You might wish to define different service hours for the different services available in
Exchange (mail, public folders, etc.). This would depend on the amount of offline
maintenance that is typically required for each service. For example, you might determine
that your SMTP bridgehead servers and firewall servers never require offline maintenance,
and so set the service hours for mail delivery significantly higher than those for mailbox
access. If you are prepared to invest appropriately in resources, it is entirely possible
to achieve extremely low levels of scheduled downtime, and this can be reflected in your
SLA.
Service Availability
Service availability is a measure of how available your Exchange services are during the
service hours you have defined. In other words, it defines the level of unscheduled
downtime you can tolerate within your organization. Typically, enterprise SLAs specify
availability levels between 99.9 and 99.999 percent. Assuming round-the-clock service
hours, this corresponds to between roughly 525 minutes (at 99.9 percent) and roughly
5 minutes (at 99.999 percent) of downtime per service per year.
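The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function name allowed_downtime_minutes is hypothetical, and the default assumes service hours that run 24 hours a day, 365 days a year. If your SLA defines shorter service hours, pass the corresponding number of minutes instead.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes round-the-clock service hours


def allowed_downtime_minutes(availability_percent, service_minutes=MINUTES_PER_YEAR):
    """Return the unscheduled downtime budget, in minutes, for an availability target.

    Hypothetical helper for illustration; not part of any Exchange tooling.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The downtime budget is the fraction of service time NOT covered by the target.
    return service_minutes * (100 - availability_percent) / 100


for target in (99.9, 99.99, 99.999):
    budget = allowed_downtime_minutes(target)
    print(f"{target}% availability allows about {budget:.1f} minutes per year")
```

Each additional "nine" cuts the yearly budget by a factor of ten, which is why the upper end of the range leaves only about five minutes for all unscheduled outages combined.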
Of course, ANY unscheduled downtime is inconvenient at best, and very costly at worst,
so you need to do your best to minimize it.
To ensure high levels of availability, you need to consider two key questions:
How often, on average, is there downtime for a service?
How long does it take to recover the service if there is downtime?
Once you have considered these questions, you can set about minimizing the number of
times a service fails and the time taken to recover that service.
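These two questions map onto two quantities commonly called mean time between failures (MTBF) and mean time to recover (MTTR). Those terms, and the standard steady-state relationship availability = MTBF / (MTBF + MTTR), are introduced here as an assumption for illustration; the figures in the sketch below are hypothetical, not measurements from any Exchange deployment.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: the fraction of time the service is up.

    mtbf_hours: average running time between failures (how often it fails).
    mttr_hours: average time needed to restore service (how long recovery takes).
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 2,000 hours, 4 hours to recover.
baseline = availability(2000, 4)
fewer_failures = availability(4000, 4)   # failures happen half as often
faster_recovery = availability(2000, 2)  # recovery takes half as long

for label, value in (
    ("baseline", baseline),
    ("fewer failures", fewer_failures),
    ("faster recovery", faster_recovery),
):
    print(f"{label}: {value * 100:.3f}% available")
```

Under this model, halving the failure rate and halving the recovery time improve availability by the same amount, so both are worth pursuing.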
Availability management is intrinsically linked with capacity management. If capacity
is not managed properly, then overloaded servers running Exchange might fail, causing
availability problems. A classic example of this would be running out of disk space on a
server running Exchange, which would result in the databases shutting down and in users
losing a number of services.
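A simple capacity check of this kind can be sketched with Python's standard shutil.disk_usage call. The path and the 15 percent threshold are assumptions chosen for illustration; in practice you would point the check at the volumes that hold your Exchange databases and transaction logs and pick a threshold that suits your growth rate.

```python
import shutil


def free_space_fraction(path):
    """Return the fraction of the volume containing `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        raise ValueError(f"volume containing {path!r} reports zero capacity")
    return usage.free / usage.total


def is_low_on_space(path, min_free_fraction=0.15):
    """True if free space on the volume has dropped below the chosen threshold."""
    return free_space_fraction(path) < min_free_fraction


# Hypothetical usage: "." stands in for a database or log volume.
volume = "."
fraction = free_space_fraction(volume)
status = "LOW" if is_low_on_space(volume) else "ok"
print(f"{volume}: {fraction * 100:.1f}% free ({status})")
```

Run on a schedule, a check like this gives warning before a volume fills up, rather than after the databases have already shut down.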
Minimizing System Failures
To minimize the frequency of failure in Exchange 2000, you need to do the following:
Decrease single points of failure.
Increase the reliability of Exchange 2000 itself.