High Availability Overview
Communication interruptions can seriously affect widely deployed value-added services such as IPTV and video conferencing. The underlying network infrastructure must therefore provide high availability.
Effective ways to improve availability include the following:
· Increasing fault tolerance
· Speeding up fault recovery
· Reducing impact of faults on services
Availability requirements
Availability requirements fall into three levels based on purpose and implementation, as shown in Table 1.
Table 1 Availability requirements
Level | Requirement | Solution |
---|---|---|
1 | Decrease system software and hardware faults | Hardware: simplifying circuit design, enhancing production techniques, and performing reliability tests. Software: reliability design and testing. |
2 | Protect system functions from being affected if faults occur | Device and link redundancy and deployment of switchover strategies |
3 | Enable the system to recover as fast as possible | Fault detection, diagnosis, isolation, and recovery technologies |
The level 1 availability requirement should be considered during the design and production process of network devices. The level 2 availability requirement should be considered during network design. The level 3 availability requirement should be considered during network deployment, according to the network infrastructure and service characteristics.
Availability evaluation
Mean Time Between Failures (MTBF) and Mean Time to Repair (MTTR) are used to evaluate the availability of a network.
MTBF
MTBF is the predicted elapsed time between inherent failures of a system during operation. It is typically expressed in hours. A higher MTBF means higher availability.
MTTR
MTTR is the average time required to repair a failed system. MTTR in a broad sense also involves spare parts management and customer services.
MTTR = fault detection time + hardware replacement time + system initialization time + link recovery time + routing time + forwarding recovery time. A smaller value of each item means a smaller MTTR and a higher availability.
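The MTTR sum above can be sketched in a few lines of Python. The per-phase figures and the MTBF value below are hypothetical, chosen only for illustration. The availability ratio MTBF / (MTBF + MTTR) is the commonly used steady-state definition; it is not stated in this document and is included here as an assumption.

```python
SECONDS_PER_HOUR = 3600

def total_mttr_seconds(phases):
    """Add up the repair phases (in seconds) that make up MTTR."""
    return sum(phases.values())

def availability(mtbf_hours, mttr_hours):
    """Assumed steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical per-phase times in seconds, one entry per term of the MTTR sum.
phases = {
    "fault_detection": 1,
    "hardware_replacement": 3600,
    "system_initialization": 120,
    "link_recovery": 5,
    "routing": 30,
    "forwarding_recovery": 4,
}

mttr_s = total_mttr_seconds(phases)
mtbf_h = 100_000  # hypothetical MTBF in hours
print(f"MTTR: {mttr_s} s")
print(f"Availability: {availability(mtbf_h, mttr_s / SECONDS_PER_HOUR):.6%}")
```

Shrinking any single entry in `phases` lowers the total MTTR and raises the computed availability, which is the effect the high availability technologies in this document aim for.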
High availability technologies
Increasing MTBF or decreasing MTTR enhances the availability of a network. The high availability technologies described in this section address the level 3 availability requirement by decreasing MTTR.
High availability technologies can be classified as fault detection technologies or protection switchover technologies.
Fault detection technologies
Fault detection technologies enable detection and diagnosis of network faults. CFD, DLDP, and Ethernet OAM are data link layer fault detection technologies, and NQA is used to diagnose and evaluate network quality. See Table 2 for details of these technologies.
Table 2 Fault detection technologies
Technology | Introduction | Reference |
---|---|---|
CFD | Connectivity Fault Detection (CFD), which conforms to IEEE 802.1ag Connectivity Fault Management (CFM) and ITU-T Y.1731, is an end-to-end per-VLAN link layer Operations, Administration and Maintenance (OAM) mechanism used for link connectivity detection, fault verification, and fault location. | CFD configuration in the High Availability Configuration Guide |
DLDP | The Device Link Detection Protocol (DLDP) deals with unidirectional links that may occur in a network. On detecting a unidirectional link, DLDP can, depending on its configuration, shut down the related port automatically or prompt users to take action to avoid network problems. | DLDP configuration in the High Availability Configuration Guide |
Ethernet OAM | As a tool for monitoring Layer 2 link status, Ethernet OAM is mainly used to address common link-related issues on the "last mile". You can monitor the status of the point-to-point link between two directly connected devices by enabling Ethernet OAM on them. | Ethernet OAM configuration in the High Availability Configuration Guide |
NQA | Network Quality Analyzer (NQA) analyzes network performance, services, and service quality by sending test packets, and provides network performance and service quality parameters such as jitter, TCP connection delay, FTP connection delay, and file transfer rate. | NQA configuration in the Network Management and Monitoring Configuration Guide |
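One of the parameters listed for NQA, TCP connection delay, can be illustrated with a minimal Python sketch. This is not how NQA is implemented on a device; it only shows the idea of timing a TCP handshake with a test connection. The function names are hypothetical, and the demo connects to a temporary listener on the loopback address so that it does not depend on any external host.

```python
import socket
import time

def tcp_connect_delay(host, port, timeout=2.0):
    """Return the time in seconds taken to establish a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        return time.perf_counter() - start

def demo_on_loopback():
    """Measure the connection delay to a temporary local listener."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        host, port = server.getsockname()
        return tcp_connect_delay(host, port)

if __name__ == "__main__":
    print(f"TCP connection delay: {demo_on_loopback() * 1000:.3f} ms")
```

To probe a real service, call `tcp_connect_delay` with that service's address and port. Repeating the measurement gives a series of samples from which variation in delay can be observed.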
Protection switchover technologies
Protection switchover technologies aim to recover from network faults. They back up hardware, link, routing, and service information so that a switchover can take place when a fault occurs, ensuring continuity of network services. For more information about protection switchover technologies, see Table 3.
Table 3 Protection switchover technologies
Technology | Introduction | Reference |
---|---|---|
Ethernet Link Aggregation | Ethernet link aggregation, most often simply called link aggregation, aggregates multiple physical Ethernet links into one logical link to increase link bandwidth beyond the limit of any single link. This logical link is an aggregate link. It allows for link redundancy because the member physical links can dynamically back up one another. | Ethernet link aggregation configuration in the Layer 2 – LAN Switching Configuration Guide |
Smart Link | Smart Link is a feature developed to address the slow convergence issue with STP. It provides link redundancy as well as fast convergence in a dual uplink network, allowing the backup link to take over quickly when the primary link fails. | Smart Link configuration in the High Availability Configuration Guide |
MSTP | As a Layer 2 management protocol, the Multiple Spanning Tree Protocol (MSTP) eliminates Layer 2 loops by selectively blocking redundant links in a network and, at the same time, allows for link redundancy. | Spanning tree configuration in the Layer 2 – LAN Switching Configuration Guide |
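The dual-uplink behavior described for Smart Link in Table 3 can be sketched as a small Python model. This is an illustration of the primary/backup idea only, not the protocol itself. The class, method, and port names are hypothetical, and the sketch assumes that traffic returns to the primary link as soon as it recovers, which may differ from how a real device is configured.

```python
class DualUplink:
    """Toy model of a primary/backup uplink pair."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        # Both links start in the up state.
        self.link_up = {primary: True, backup: True}

    def set_link_state(self, port, up):
        """Record a link-up or link-down event for one of the two ports."""
        if port not in self.link_up:
            raise KeyError(f"unknown port: {port}")
        self.link_up[port] = up

    def active_port(self):
        """Return the port that forwards traffic, or None if both are down."""
        if self.link_up[self.primary]:
            return self.primary  # assumption: the primary is always preferred
        if self.link_up[self.backup]:
            return self.backup
        return None

uplinks = DualUplink("uplink1", "uplink2")
uplinks.set_link_state("uplink1", False)  # simulate a primary link failure
print(uplinks.active_port())              # prints "uplink2"
```

The same fallback idea applies to the member links of an aggregate link, where the remaining members carry traffic when one member fails.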
A single availability technology cannot solve all problems. Therefore, a combination of availability technologies, chosen on the basis of detailed analysis of network environments and user requirements, should be used to enhance network availability. For example, access-layer devices should be connected to distribution-layer devices over redundant links, and core-layer devices should be fully meshed. Also, network availability should be considered during planning prior to building a network.