- Released At: 07-04-2025
ERPS Technology White Paper
Copyright © 2025 New H3C Technologies Co., Ltd. All rights reserved.
No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of New H3C Technologies Co., Ltd.
Except for the trademarks of New H3C Technologies Co., Ltd., any trademarks that may be mentioned in this document are the property of their respective owners.
This document provides generic technical information, some of which might not be applicable to your products.
Contents
Technical implementation of the ERPS
Single-ring operating principle
Single-ring detection and switch-back mechanism
Single-ring multiple instance load sharing
Multi-ring operating principle
Multi-ring detection and switch-back mechanism
Multi-ring multiple instance load sharing
Manual configuration mechanism
Connecting multiple subrings to the primary ring
Overview
Technical background
In a Layer 2 Ethernet network, redundant links such as ring topologies are commonly used to back up connections and improve network reliability. However, redundant links can form loops, which cause broadcast storms and other instability that degrades or even interrupts communication.
STP is a commonly used loop prevention mechanism, but it converges slowly and its convergence time depends on the network topology, which makes it unsuitable for services with strict transmission quality requirements. To address this issue, ITU-T introduced Ethernet Ring Protection Switching (ERPS). ERPS shortens convergence time, keeps convergence largely independent of network scale, and improves the reliability and stability of Ethernet rings.
Benefits
· Fast convergence: ERPS typically converges in under 50 ms. When it detects a link fault or a switchover request on the ring, it quickly switches traffic to the backup path, which minimizes service interruption.
· Multi-ring topology support: ERPS supports multi-ring structures in which a change in one ring does not affect the others. This keeps data transmission stable and allows more complex and flexible network designs that suit different network sizes and service needs.
· Load sharing: ERPS supports load sharing on a ring network, which makes full use of the bandwidth of the physical links.
· High compatibility: ERPS is a standard protocol defined by ITU-T, so it can be deployed on existing network infrastructure without large-scale device replacement or network reconstruction. Existing devices can support ERPS through software upgrades or configuration changes, which saves upgrade costs and simplifies network design.
· Simple network management: Compared with STP, ERPS is simpler and more intuitive to configure, which reduces complexity and operating costs for network administrators.
Technical implementation of the ERPS
Concepts
ERPS ring
As shown in Figure 1, ERPS uses a ring network topology.
Figure 1 Basic networking model of ERPS
An Ethernet network topology with a ring connection is called an ERPS ring. ERPS rings are classified as primary rings and subrings (also called secondary rings). By default, an ERPS ring is a primary ring, but you can manually configure it as a subring. An ERPS network can consist of a single primary ring or of multiple rings, which can include multiple primary rings and multiple subrings. As shown in Figure 1, the closed loop formed by Device A, Device B, Device C, and Device D is the primary ring. The subring consists of three links (Device C to Device E, Device E to Device F, and Device F to Device D) and forms an open loop.
ERPS instances
In an ERPS network, one ring can support multiple instances, with each instance representing a logical ring. Each instance has its own protocol channel, data channel, and owner node. Each instance acts as an independent protocol entity, maintaining its own state and data.
Messages with different ring IDs are distinguished by the destination MAC address, where the last byte represents the ring ID. Messages with the same ring ID are differentiated by the VLAN ID, which identifies the corresponding ERPS instance. The combination of ring ID and VLAN ID uniquely identifies an instance.
You can configure multiple ERPS instances on the same ring network so that different instances carry traffic for different VLANs. Each VLAN's data flow then follows its own topology, which achieves load sharing.
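The identification rule above can be sketched in a few lines of Python. This is only an illustration: it assumes the R-APS destination MAC address has the form 01-19-A7-00-00-xx used by ITU-T G.8032, where xx is the ring ID, and the function name instance_key is hypothetical.

```python
# First five bytes of the R-APS destination MAC (assumed from ITU-T G.8032).
RAPS_MAC_PREFIX = bytes.fromhex("0119a70000")


def instance_key(dst_mac: str, vlan_id: int) -> tuple:
    """Return (ring ID, VLAN ID), the pair that identifies an ERPS instance."""
    raw = bytes.fromhex(dst_mac.replace("-", "").replace(":", ""))
    if len(raw) != 6 or raw[:5] != RAPS_MAC_PREFIX:
        raise ValueError(f"not an R-APS destination MAC: {dst_mac}")
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"invalid VLAN ID: {vlan_id}")
    # The last byte of the destination MAC carries the ring ID.
    return raw[5], vlan_id
```

Two messages with the same ring ID but different VLAN IDs produce different keys, so they belong to different instances on the same ring.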
Control VLAN
The control VLAN transmits ERPS protocol messages. Each ERPS instance has its own control VLAN.
Protected VLAN
The protected VLAN transmits data packets. Each ERPS instance has its own protected VLAN.
Node role
Each device on the ERPS ring is called a node. The node role depends on the user's configuration and includes the following types:
· Owner node: The owner node blocks and unblocks its RPL port to prevent loops and to perform link switchover. The link between the owner node and the neighbor node is the Ring Protection Link (RPL). In Figure 1, the link between Device A and Device B and the link between Device E and Device F are RPLs.
· Neighbor node: The neighbor node connects to the owner node over the RPL. It works with the owner node to block and unblock the ports on the RPL during switchover.
· Interconnection node: An interconnection node connects multiple rings in a multi-ring model. Interconnection nodes belong to a subring; the primary ring has no interconnection nodes. On the link between the interconnection nodes of a subring, the subring's protocol messages terminate at the interconnection nodes, while data packets do not.
· Normal node: Any node that is not one of the three types above. A normal node receives and forwards protocol messages and data packets on the ring.
Each node has two member ports, port0 and port1. These ports function identically and transmit protocol messages and data packets on the ERPS ring.
As shown in Figure 1, in the primary ring, Device A is the Owner node, and Device B is the Neighbor node. In the secondary ring, Device E is the Owner node, and Device F is the Neighbor node. Device C and Device D are interconnection nodes.
Port type
The ERPS ring member port type depends on user configuration and includes the following three types:
· RPL port: A port at either end of the RPL. When the ring is healthy, the RPL ports are blocked.
· Interconnection port: A port on an interconnection node that connects the subring to the primary ring.
· Normal port: The default port type. A normal port receives and forwards protocol messages and data packets and has no special function.
As shown in Figure 1, Port A1, Port B1, Port E1, and Port F1 (on Device A, Device B, Device E, and Device F, respectively) are RPL ports. Port C3 on Device C and Port D3 on Device D are interconnection ports. The remaining ports are normal ports.
ERPS protocol packets
Ring Automatic Protection Switching (R-APS) messages are the protocol messages for ERPS.
R-APS message format
Figure 2 R-APS message format
Table 1 describes the fields in the R-APS message.
Table 1 Fields of the ERPS protocol message
Field | Length | Description |
MEL | 3 bits | R-APS message level, which is configurable. A node does not process R-APS messages whose level is higher than its own. All nodes in the same instance on the same ring must use the same R-APS message level. |
Version | 5 bits | Version number of the ERPS protocol. The current value is 0x01. |
OpCode | 8 bits | Identifies the message as an R-APS message. The value is fixed at 0x28. |
Flags | 8 bits | The value is fixed at 0x00. |
TLV Offset | 8 bits | Offset of the TLV in the message. The value is fixed at 0x20. |
Request/State | 4 bits | R-APS message type: · 1101: FS (Forced Switch) · 1110: Flush · 1011: SF (Signal Fail) · 0111: MS (Manual Switch) · 0000: NR (No Request) · Other values: reserved |
Sub-code | 4 bits | Currently all zeros. |
Status | 8 bits | Status information of the R-APS message: · RB (RPL Blocked), 1 bit: 1 indicates that the RPL is blocked, and 0 indicates that it is not blocked. Nodes other than the owner node always set this bit to 0. · DNF (Do Not Flush), 1 bit: 1 tells the receiver not to refresh its MAC address entries upon receiving this message, and 0 tells it to refresh them. · BPR (Blocked Port Reference), 1 bit: 0 indicates that port 0 is blocked, and 1 indicates that port 1 is blocked. An interconnection node of a subring sets this bit to 0. · Status Reserved, 5 bits: reserved. |
Node ID | 48 bits | MAC address of the sending node. |
Reserved | 192 bits | Reserved field. |
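As an illustration of the layout in Table 1, the following Python sketch decodes the fields from a raw byte string. It is a sketch under stated assumptions rather than a reference decoder: it assumes the fields are packed in table order, most significant bit first, and that RB, DNF, and BPR occupy the top three bits of the Status byte in that order. The names RapsPdu and parse_raps are hypothetical.

```python
import struct
from dataclasses import dataclass

# Request/State codes as listed in Table 1.
REQUEST_NAMES = {0b1101: "FS", 0b1110: "Flush", 0b1011: "SF", 0b0111: "MS", 0b0000: "NR"}


@dataclass
class RapsPdu:
    mel: int
    version: int
    opcode: int
    flags: int
    tlv_offset: int
    request: str
    sub_code: int
    rb: bool
    dnf: bool
    bpr: int
    node_id: str


def parse_raps(pdu: bytes) -> RapsPdu:
    """Decode the 36 bytes (288 bits) described in Table 1."""
    if len(pdu) < 36:
        raise ValueError("R-APS PDU too short")
    first, opcode, flags, tlv_offset, req_sub, status = struct.unpack_from("!6B", pdu, 0)
    return RapsPdu(
        mel=first >> 5,                  # upper 3 bits
        version=first & 0x1F,            # lower 5 bits
        opcode=opcode,
        flags=flags,
        tlv_offset=tlv_offset,
        request=REQUEST_NAMES.get(req_sub >> 4, "reserved"),
        sub_code=req_sub & 0x0F,
        rb=bool(status & 0x80),          # assumed bit position
        dnf=bool(status & 0x40),         # assumed bit position
        bpr=(status >> 5) & 1,           # assumed bit position
        node_id="-".join(f"{b:02x}" for b in pdu[6:12]),
    )
```

The field lengths in Table 1 add up to 288 bits, which is why the sketch requires at least 36 bytes before decoding.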
R-APS message type
Table 2 describes the R-APS message types and their functions.
Table 2 R-APS message types and their functions
Message type | Description |
(NR, RB) (No Request, RPL Blocked) message | Sent by the owner node in Idle state to notify the other nodes that the RPL port is blocked. Upon receiving an (NR, RB) message, the other nodes unblock their fault-free ports and update their MAC address entries. While the ring is stable in Idle state, the owner node sends (NR, RB) messages periodically. |
NR (No Request) message | Sent by the node on which a faulty port has recovered. The owner node starts the WTR timer upon receiving the NR message. The recovered node stops sending NR messages when it receives an (NR, RB) message. |
SF (Signal Fail) message | Sent by the node with the faulty port when a link fails. Upon receiving the SF message, the owner node and neighbor node unblock their RPL ports. Until the fault is cleared, the node with the faulty port sends SF messages periodically. |
MS (Manual Switch) message | Sent by a node on which MS is configured. That node blocks the port configured with MS. Upon receiving the MS message, the other nodes unblock their fault-free ports and update their MAC address entries. In MS state, the node sends MS messages periodically. |
FS (Forced Switch) message | Sent by a node on which FS is configured. That node blocks the port configured with FS. Upon receiving the FS message, the other nodes unblock all their ports and update their MAC address entries. In FS state, the node sends FS messages periodically. |
Flush message | When the subring topology changes, the interconnection nodes broadcast Flush messages to notify the primary ring to refresh its MAC address entries. |
NOTE:
· Primary ring and subring protocol messages are transmitted only within their own rings. Except for Flush messages, subring protocol messages terminate at the interconnection nodes.
· Data packets from the subring can pass through to the primary ring.
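The termination rule in this note can be expressed as a small decision function. The sketch below is illustrative only; the function name crosses_to_primary_ring and the string labels for frame types are hypothetical.

```python
# Subring protocol messages that terminate at the interconnection node.
TERMINATED = {"NR", "NR,RB", "SF", "MS", "FS"}


def crosses_to_primary_ring(frame_type: str) -> bool:
    """Decide whether a frame arriving from the subring at an
    interconnection node is passed on to the primary ring."""
    if frame_type in ("data", "Flush"):
        return True      # data packets and Flush messages pass through
    if frame_type in TERMINATED:
        return False     # other subring R-APS messages stop here
    raise ValueError(f"unknown frame type: {frame_type}")
```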
ERPS protocol status
ERPS defines the following six states:
· Init state: A node enters Init state when it has fewer than two ring member ports (non-interconnection node) or fewer than one ring member port (interconnection node).
· Idle state: The stable state after the ring initializes. When the owner node enters Idle state, the other nodes also enter Idle state. The RPL ports of the owner node and neighbor node are blocked, so the RPL does not forward traffic, and the owner node periodically sends (NR, RB) messages.
· Protection state: The stable state reached after a switchover caused by a link failure on the ring. The RPL ports of the owner node and neighbor node are unblocked so that the whole ring remains connected. When one node on the ring enters Protection state, the other nodes also enter Protection state.
· MS state: The state in which the traffic forwarding path has been switched manually. After an MS operation is performed on one node on the ring, the other nodes also enter MS state.
· FS state: The state in which the traffic forwarding path has been switched forcibly. After an FS operation is performed on one node on the ring, the other nodes also enter FS state.
· Pending state: An unstable, transitional state that a node passes through during state changes.
When the ring is healthy, it stays in Idle state. After a link failure, it switches to Protection state.
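The six states can be modeled as a small enumeration. The sketch below is a simplified reading of the descriptions above, not the full protocol state machine: it assumes that the stable state follows directly from the most recent R-APS message, and that any other message (for example a plain NR) leaves the node in Pending state. The names ErpsState, in_init, and state_after are hypothetical.

```python
from enum import Enum


class ErpsState(Enum):
    INIT = "Init"
    IDLE = "Idle"
    PROTECTION = "Protection"
    MS = "MS"
    FS = "FS"
    PENDING = "Pending"


def in_init(member_ports: int, interconnection_node: bool) -> bool:
    """A node is in Init state while it has too few ring member ports."""
    required = 1 if interconnection_node else 2
    return member_ports < required


# Simplified mapping from the last received R-APS message to a stable state.
_STABLE = {
    "NR,RB": ErpsState.IDLE,
    "SF": ErpsState.PROTECTION,
    "MS": ErpsState.MS,
    "FS": ErpsState.FS,
}


def state_after(message: str) -> ErpsState:
    """Return the state implied by a message; anything else is transitional."""
    return _STABLE.get(message, ErpsState.PENDING)
```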
ERPS timers
Hold-off timer
This timer starts when a port detects a link fault and delays fault reporting. The fault is reported only if it still exists when the hold-off timer expires. The delay gives the service layer a chance to repair the link and prevents unnecessary fault reports. The timer length affects how quickly link faults are reported and therefore the switchover performance when a fault occurs.
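The effect of the hold-off timer can be shown with a short calculation. This is a minimal sketch using a logical clock in milliseconds; the function name fault_report_time and the sample values are hypothetical.

```python
from typing import Optional


def fault_report_time(down_at_ms: int, up_at_ms: Optional[int], hold_off_ms: int) -> Optional[int]:
    """Return the time at which the fault is reported, or None if the
    link recovers before the hold-off timer expires."""
    expiry = down_at_ms + hold_off_ms
    if up_at_ms is not None and up_at_ms <= expiry:
        return None      # fault cleared during hold-off: nothing to report
    return expiry        # fault still present at expiry: report it now
```

With a hold-off of 500 ms, a fault that starts at 0 ms and clears at 200 ms is never reported, while a fault that persists is reported at 500 ms. A longer hold-off filters more transient faults but delays the switchover by the same amount.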
Guard timer
This timer starts when a port detects link recovery. It prevents unnecessary network flapping caused by outdated R-APS messages that are still in transit because of forwarding delays. Until the timer expires, the port does not process any R-APS messages. The timer length affects link recovery performance when multiple points fail.
WTR timer
In switch-back mode, this timer starts when the owner node in Protection state receives an NR message. It prevents frequent flapping caused by intermittent link faults on the ring. Before the timer expires, the RPL continues forwarding while the switch-back point remains temporarily blocked. If the owner node receives an SF message during this period, a faulty link still exists on the ring, so the timer stops immediately and the RPL continues forwarding. Otherwise, when the timer expires, the owner node blocks the RPL port, refreshes its MAC address entries, and sends an (NR, RB) message, which notifies the switch-back node to unblock the temporarily blocked port.
WTB timer
In switch-back mode, this timer starts when the owner node in MS or FS state receives an NR message. It prevents the RPL ports on the ring from being repeatedly blocked and unblocked because of network flapping. Before the timer expires, the RPL continues forwarding and the faulty node sends NR messages. If the owner node receives an SF message during this period, a faulty link exists on the ring, so the timer stops immediately and the RPL continues forwarding. Otherwise, when the timer expires, the owner node blocks the RPL, refreshes its MAC address entries, and sends an (NR, RB) message, which notifies the temporary blocking point to unblock.
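The WTR and WTB timers follow the same pattern on the owner node: start on NR, cancel on SF, and block the RPL on expiry. The sketch below models that pattern with a logical clock instead of real timers. The class name RevertiveOwner, its attributes, and the 300-second value used in the usage note are hypothetical.

```python
class RevertiveOwner:
    """Owner node in switch-back mode with a single wait timer."""

    def __init__(self, wait_seconds: float):
        self.wait_seconds = wait_seconds
        self.rpl_blocked = False   # RPL is forwarding after a switchover
        self.deadline = None       # expiry time, None while the timer is idle
        self.sent = []             # R-APS messages the owner would send

    def on_nr(self, now: float) -> None:
        # An NR message starts the timer if it is not already running.
        if not self.rpl_blocked and self.deadline is None:
            self.deadline = now + self.wait_seconds

    def on_sf(self, now: float) -> None:
        # An SF message means a fault still exists: stop the timer.
        self.deadline = None

    def tick(self, now: float) -> None:
        # On expiry, block the RPL and announce it with (NR, RB).
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.rpl_blocked = True
            self.sent.append("NR,RB")
```

For example, with wait_seconds set to 300, an NR at time 0 followed by tick(300) blocks the RPL, whereas an SF at time 10 cancels the timer and the RPL keeps forwarding.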
Single-ring operating principle
NOTE: When an RPL port is blocked:
· In the primary ring, the port forwards neither ERPS protocol messages nor data packets.
· In a subring, the port forwards only ERPS protocol messages and does not forward data packets.
Single-ring detection and switch-back mechanism
Ring network health monitoring and management mechanism
As shown in Figure 3, the specific process of the ring network health monitoring and handling mechanism is as follows:
1. The owner node sends (NR, RB) messages to other nodes every 5 seconds.
2. When a normal node receives the (NR, RB) message, it forwards the message to the adjacent node.
3. When all nodes have received the (NR, RB) message, the ring network is in a healthy state.
When the ring network is healthy, the RPL ports remain blocked while the other ports forward data traffic normally.
Figure 3 ERPS ring in a healthy state
Ring network failure detection and handling mechanism
When a node in the link detects that any of its ports belonging to the ERPS ring is down, it blocks the faulty port, refreshes the MAC address entries, and immediately sends an SF message to notify other nodes in the link about the failure. Upon receiving this message, other nodes unblock non-faulty ports and refresh their MAC address entries. As shown in Figure 4, the specific process for detecting and handling network failures in a ring network is as follows:
1. When Device C and Device D detect a link fault, they block the faulty port, refresh the MAC address entries, and immediately send three SF messages continuously. Then, they periodically send SF messages at 5-second intervals.
2. When a normal node receives an SF message, it refreshes the MAC address entry and forwards the message to adjacent nodes.
3. When the Owner node and Neighbor node receive the SF message, they unblock their RPL ports and refresh their MAC address entries.
Figure 4 Network diagram of link fault handling process
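The outcome of the fault handling process can be checked with a small connectivity model. The sketch below assumes the four-node primary ring from Figure 1 with the RPL between Device A and Device B; it ignores message exchange and only compares the forwarding links before and after a fault. The function names are hypothetical.

```python
RING = ["A", "B", "C", "D"]
LINKS = [frozenset(p) for p in (("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"))]
RPL = frozenset(("A", "B"))


def forwarding_links(failed=None):
    """Healthy ring: every link except the RPL forwards.
    After a fault: the RPL is unblocked and the failed link is blocked."""
    if failed is None:
        return [link for link in LINKS if link != RPL]
    return [link for link in LINKS if link != failed]


def reachable(start, links):
    """Return the set of nodes reachable from start over the given links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen
```

In both cases three of the four links forward traffic, so every node stays reachable and no loop forms.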
Ring network fault recovery mechanism
When the failed link recovers, the nodes at both ends first keep the previously failed ports blocked. They then start the Guard timer and send NR messages to notify the Owner node that the link has recovered. The Owner node starts the WTR timer upon receiving the NR message. If it receives no SF message before the timer expires, it blocks the RPL port and periodically sends (NR, RB) messages. Upon receiving the (NR, RB) messages, the switch-back nodes unblock the temporarily blocked ports and the Neighbor node blocks its RPL port, which restores the ring to its original state. As shown in Figure 5, the link recovery process is as follows:
1. When Device C and Device D detect that their link has recovered, they temporarily block the previously faulty ports, start the Guard timer, and send NR messages.
2. When a normal node receives an NR message, it forwards the message to the adjacent node.
3. Device A (Owner node) starts the WTR timer upon receiving the NR message. When the timer expires, Device A blocks the RPL port and sends (NR, RB) messages.
4. Upon receiving the (NR, RB) message, Device C and Device D unblock the temporarily blocked ports.
5. Device B (Neighbor node) blocks the RPL port after receiving the (NR, RB) message.
6. The ring returns to the state it was in before the failure.
Figure 5 Network diagram of the link recovery process
The owner node handles link recovery in one of two modes:
· Switch-back (revertive) mode: After the owner node receives an NR message indicating that the fault has cleared, it starts the WTR or WTB timer. If the owner node receives no SF message before the timer expires, then on expiry it blocks the RPL port, clears its MAC address entries, sends (NR, RB) messages, and enters Idle state. The other nodes unblock their non-faulty ports and clear their MAC address entries.
· Non-revertive mode: After the owner node receives the NR message, it keeps the current port states and takes no further action.
Single-ring multiple instance load sharing
If data flows from multiple VLANs share the same ring network, you can configure multiple ERPS instances. Each ERPS instance forwards traffic for a different set of VLANs, known as its protected VLANs. Data flows of different VLANs then take different forwarding paths around the ring, which achieves load sharing.
Figure 6 Single-ring multiple instance load sharing networking
As shown in Figure 6, Instance 1 and Instance 2 are two instances configured on one ERPS ring, each with a different RPL. The link between Device A and Device B is the RPL of Instance 1, and Device A is the owner node of Instance 1. The link between Device C and Device D is the RPL of Instance 2, and Device C is the owner node of Instance 2. Because the RPLs of the two instances block different VLANs, traffic is load shared on the single ring.
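To make the load sharing effect concrete, the following sketch computes the forwarding path for a VLAN on the four-node ring described above. The RPL placement follows the text, but the VLAN ranges assigned to each instance and the function names are hypothetical.

```python
RING = ["A", "B", "C", "D"]  # ring order: A-B-C-D-A

INSTANCES = {
    1: {"vlans": range(1, 101), "rpl": ("A", "B")},    # hypothetical VLAN range
    2: {"vlans": range(101, 201), "rpl": ("C", "D")},  # hypothetical VLAN range
}


def instance_for_vlan(vlan: int) -> int:
    """Return the ID of the instance whose protected VLANs include vlan."""
    for inst_id, inst in INSTANCES.items():
        if vlan in inst["vlans"]:
            return inst_id
    raise KeyError(vlan)


def path(src: str, dst: str, vlan: int) -> list:
    """Walk the ring from src to dst in the direction that does not
    cross the blocked RPL of the VLAN's instance."""
    rpl = set(INSTANCES[instance_for_vlan(vlan)]["rpl"])
    n = len(RING)
    for step in (1, -1):
        i, hops = RING.index(src), [src]
        while hops[-1] != dst:
            nxt = RING[(i + step) % n]
            if {hops[-1], nxt} == rpl:
                break            # this direction is blocked by the RPL
            hops.append(nxt)
            i += step
        else:
            return hops
    raise RuntimeError("no path")
```

Traffic from Device A to Device B in a VLAN of Instance 1 takes the long way around (A, D, C, B), while the same traffic in a VLAN of Instance 2 uses the direct link (A, B), so both directions of the ring carry data.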
Multi-ring operating principle
NOTE: When an RPL port is blocked:
· In the primary ring, the port forwards neither ERPS protocol messages nor data packets.
· In a subring, the port forwards only ERPS protocol messages and does not forward data packets.
Multi-ring detection and switch-back mechanism
Ring network health monitoring and management mechanism
As shown in Figure 7, the specific process for ring network health monitoring and handling is as follows:
1. The owner nodes of the primary ring and secondary ring send (NR, RB) messages to other nodes in their respective rings every 5 seconds.
2. The primary ring and secondary ring normal nodes receive (NR, RB) messages and forward them to adjacent nodes.
3. When the Interconnection node receives the (NR, RB) message, it terminates the message and does not forward it to the primary ring.
4. When all nodes have received the (NR, RB) message, the ring network is in a healthy state.
When the ring network is healthy, the RPL ports remain blocked while the other ports forward data traffic normally.
Figure 7 ERPS rings in a healthy state
Ring network failure detection and handling mechanism
As shown in Figure 8, the specific process for detecting and handling subring faults is as follows:
1. When Device F and Device D in the subring detect a link fault, they block the faulty port, refresh the MAC address entries, and immediately send three SF messages continuously. After that, they periodically send SF messages at 5-second intervals.
2. When a Normal node receives an SF message, it refreshes the MAC address entry and forwards the message to the adjacent node.
3. When the Owner node and Neighbor node receive the SF message, they unblock their RPL ports and refresh their MAC address entries.
4. When Device C and Device D (interconnection nodes) detect the link fault, they send Flush messages to the primary ring to announce the topology change in the subring.
5. The primary ring node receives the Flush message and refreshes the MAC address entry.
Figure 8 Network diagram of the sub-ring link fault handling process
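The SF transmission pattern in step 1 (three messages immediately, then one every 5 seconds) can be written as a schedule. The sketch below is illustrative: the spacing between the three initial messages is not given here, so burst_gap_s is a hypothetical parameter that defaults to zero, and the function name sf_send_times is also hypothetical.

```python
def sf_send_times(duration_s: float, burst: int = 3,
                  burst_gap_s: float = 0.0, period_s: float = 5.0) -> list:
    """Return the send times (in seconds after fault detection) of SF
    messages for a fault that lasts duration_s seconds."""
    # Initial burst sent as soon as the fault is detected.
    times = [i * burst_gap_s for i in range(burst)]
    # Periodic messages until the fault clears.
    t = times[-1] + period_s
    while t <= duration_s:
        times.append(t)
        t += period_s
    return times
```

For a fault lasting 12 seconds, the node sends the initial burst at time 0 and periodic messages at 5 and 10 seconds.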
Ring network failure detection and recovery mechanism
As shown in Figure 9, the specific process for subring switch-back detection and handling is as follows:
1. When Device F and Device D detect that their link has recovered, they temporarily block the previously faulty ports, start the Guard timer, and send NR messages within the subring.
2. When a normal node receives an NR message, it forwards the message to the adjacent node.
3. Device E (Owner node) starts the WTR timer upon receiving the NR message. When the timer expires, Device E blocks the RPL port and sends (NR, RB) messages.
4. When a normal node receives the (NR, RB) message, it refreshes its MAC address table and forwards the message to the adjacent node.
5. Upon receiving the (NR, RB) message, Device F and Device D unblock the temporarily blocked ports, stop sending NR messages, and refresh their MAC address tables. Device F (neighbor node) also blocks its RPL port.
6. When Device C and Device D (interconnection nodes) detect the link recovery, they send Flush messages to the primary ring to advertise the topology change in the subring.
7. The primary ring nodes refresh their MAC address entries upon receiving the Flush message.
8. The subring returns to the state it was in before the failure.
Figure 9 Network diagram of the recovery process for sub-ring links
The owner node handles link recovery in one of two modes:
· Switch-back (revertive) mode: After the owner node receives an NR message indicating that the fault has cleared, it starts the WTR or WTB timer. If the owner node receives no SF message before the timer expires, then on expiry it blocks the RPL port, clears its MAC address entries, sends (NR, RB) messages, and enters Idle state. The other nodes unblock their non-faulty ports and clear their MAC address entries.
· Non-revertive mode: After the owner node receives the NR message, it keeps the current port states and takes no further action.
Multi-ring multiple instance load sharing
The working mechanism is similar to a single ring. For more information about multiple instance load sharing, see "Single-ring multiple instance load sharing."
Manual configuration mechanism
ERPS supports two manual switching modes: MS (Manual Switch) and FS (Forced Switch).
· MS: You can select a ring member port in the current ring instance as the blocked port. After you execute the erps switch manual command on the node where the port resides, the node sends MS messages. Upon receiving the MS message, the other nodes unblock their ERPS ring member ports, so that eventually only the port configured with MS remains blocked on the ring. The MS state responds to link events. If another link fails while the ring is in MS state, the affected node sends an SF message. The other nodes then unblock their ERPS ring member ports, including the port blocked by MS, and the ring switches to Protection state as usual.
· FS: FS works like MS, except that in FS state nodes do not respond to link fault events and always remain in FS state.
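One way to read the two bullets above is as a priority order between requests: a link fault (SF) overrides MS, but FS overrides a link fault. The sketch below encodes that reading. It is an interpretation of the behavior described here, not the device implementation, and the names PRIORITY and winning_request are hypothetical.

```python
# Higher value wins. The order reflects the behavior described above:
# FS ignores link faults, while MS yields to them.
PRIORITY = {"FS": 3, "SF": 2, "MS": 1, "NR": 0}


def winning_request(active: str, incoming: str) -> str:
    """Return the request that governs the ring after a new request
    arrives while another one is active."""
    return incoming if PRIORITY[incoming] > PRIORITY[active] else active
```

With this ordering, winning_request("MS", "SF") returns "SF", so the ring leaves MS state and enters Protection state, while winning_request("FS", "SF") returns "FS", so the ring remains in FS state.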
Application scenarios
The normal operation of ERPS depends on users configuring it correctly. Here are a few typical networking methods.
Single ring
As shown in Figure 10, the network topology has only one ring, so define one ERPS ring.
Primary ring with a single subring
As shown in Figure 11, the network topology contains two rings with two common nodes between them. Select one ring as the primary ring and the other as the subring.
Figure 11 Network diagram of a primary ring with a single subring
Connecting multiple subrings to the primary ring
As shown in Figure 12, the network topology contains three or more rings, with each sub-ring sharing two common nodes with the primary ring.
Figure 12 Network diagram of multiple subrings connecting the primary ring
Connecting multiple subrings to a single subring
As shown in Figure 13, the network topology has three or more rings. Building on the topology of a primary ring with a single subring, each additional subring shares two common nodes with subring 1.
Figure 13 Single subring network diagram
Connecting a subring to multiple rings
As shown in Figure 14 and Figure 15, the network topology contains three or more rings, and each subring shares a common node with at least two other rings. The two figures show the two possible cases.
ERPS function under M-LAG networking
To enhance device reliability in an Ethernet ring network, you can deploy the Ethernet ring on devices in an M-LAG group. The aggregation-layer devices on the ring are virtualized to achieve cross-device link aggregation, which provides device-level redundancy protection and traffic load sharing.
Figure 16 Network diagram of ERPS under M-LAG