High availability overview, Availability requirements, Availability evaluation – H3C Technologies H3C SR8800 User Manual

High availability overview

Communication interruptions can seriously affect widely deployed value-added services such as IPTV and video conferencing. Therefore, the basic network infrastructure must provide high availability.

There are three effective ways to improve availability:

Increasing fault tolerance

Speeding up fault recovery

Reducing impact of faults on services

Availability requirements

Availability requirements fall into three levels based on purpose and implementation.

Table 1 Availability requirements

Level  Purpose                               Implementation
1      Decrease system software and          Hardware: simplifying circuit design, enhancing
       hardware faults                       production techniques, and performing reliability tests.
                                             Software: reliability design and test.
2      Protect system functions from being   Device and link redundancy and deployment of switchover
       affected if faults occur              strategies.
3      Enable the system to recover as fast  Providing fault detection, diagnosis, isolation, and recovery
       as possible                           technologies.

The level 1 availability requirement should be considered during the design and production of network devices. The level 2 availability requirement should be considered during network design. The level 3 availability requirement should be considered during network deployment, according to the network infrastructure and service characteristics.

Availability evaluation

Typically, Mean Time Between Failures (MTBF) and Mean Time to Repair (MTTR) are used to evaluate the availability of a network.

MTBF

MTBF is the predicted elapsed time between inherent failures of a system during operation. It is typically expressed in hours. A higher MTBF means higher availability.

MTTR

MTTR is the average time required to repair a failed system. MTTR in a broad sense also involves spare parts management and customer services.

MTTR = fault detection time + hardware replacement time + system initialization time + link recovery time + routing time + forwarding recovery time

The smaller each of these items is, the smaller the MTTR and the higher the availability.
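
The relationship between the two metrics can be illustrated with a short Python sketch. It sums the MTTR components listed above and then applies the commonly used steady-state formula, availability = MTBF / (MTBF + MTTR). This manual does not state that formula, so it is an assumption here. The class and function names, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RepairTimes:
    """MTTR components in hours; field names mirror the terms of the MTTR sum."""
    fault_detection: float
    hardware_replacement: float
    system_initialization: float
    link_recovery: float
    routing: float
    forwarding_recovery: float

    def mttr(self) -> float:
        # MTTR is the sum of every component time.
        return sum(getattr(self, f.name) for f in fields(self))


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability (assumed formula): MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, not taken from any product data sheet.
times = RepairTimes(
    fault_detection=0.25,
    hardware_replacement=1.0,
    system_initialization=0.5,
    link_recovery=0.125,
    routing=0.0625,
    forwarding_recovery=0.0625,
)
mttr = times.mttr()  # 2.0 hours for the values above
print(f"MTTR = {mttr} h")
print(f"Availability = {availability(100_000, mttr):.6f}")
```

Under this formula, raising the MTBF or shortening any single repair component raises the resulting availability, which matches the qualitative statements in the MTBF and MTTR descriptions.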
