- Released At: 18-12-2024
GIR Technology White Paper
Copyright © 2024 New H3C Technologies Co., Ltd. All rights reserved.
No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of New H3C Technologies Co., Ltd.
Except for the trademarks of New H3C Technologies Co., Ltd., any trademarks that may be mentioned in this document are the property of their respective owners.
The content in this article is general technical information, some of which may not be applicable to the product you have purchased.
Contents
Mechanism for switching to GIR mode
Overview of GIR mode switchover
Switching from normal mode to maintenance mode
Switching back from maintenance mode to normal mode
Restrictions on GIR application
Overview
Technical background
As IP networks move toward cloud and virtualization, network device configuration has become more diverse and complex, and new upper-layer application protocols are emerging faster. The growing number of upper-layer applications poses a major challenge to the network devices that carry basic data forwarding services such as routing protocols, so the reliability and availability of these devices are increasingly important.
Currently, network services on a device cannot be maintained while the device undergoes maintenance such as upgrading or debugging, so the device must be isolated from the network. However, traditional isolation methods have various drawbacks. For example, routing features such as GR or NSR must be configured separately for each service module, which involves numerous steps. A strict "active/standby switchover" approach isolates all service traffic on the device and lacks flexibility when only some services need to be isolated. These limitations make it difficult to deploy new services quickly in continually growing networks.
These circumstances created an urgent need for a new device isolation method that is simple to configure, has minimal impact on network services, and allows flexible selection of isolation operations. Graceful Insertion and Removal (GIR) was developed to meet this need. With GIR, users need to configure only a single command to isolate a device from the network for maintenance. Users can also customize the operations of each service module through a custom-profile. In networks with redundant paths, service modules that support GIR switch traffic to the redundant path before the device is isolated, which prevents service interruption and reduces the impact on the network. After upgrades and other maintenance operations are complete, users can use the GIR switchback function to put the device back into service and restore normal service processing.
Benefits
Compared to traditional isolation methods, device isolation through GIR offers the following advantages:
· Simple and automated device isolation and restoration
Users need to configure only a single command to isolate a device or to reintegrate it into the network. After the GIR switchover or switchback command is executed, GIR automatically issues the isolation or restoration commands to the relevant service modules, eliminating the cumbersome process of configuring each module separately.
· High flexibility
With a custom-profile, users can freely select the service modules to be isolated and the actions to be executed for each module.
GIR technology implementation
NOTE: In this document, switching the device from normal mode to maintenance mode is called "switchover," and switching from maintenance mode back to normal mode is called "switchback."
GIR system mode
A device operates in one of the following two modes:
· Normal mode: The state of the device before a switchover to maintenance mode is executed. In this mode, the device operates normally and forwards and processes traffic as usual.
· Maintenance mode: The state of the device after a GIR mode switchover is executed. In this mode, the device cannot forward traffic normally, and traffic is switched to other devices. Users can perform maintenance or upgrade operations on the device.
With the GIR mode switchover command, you can send isolation commands for various protocols to the corresponding service modules in one step. Each service module then redirects traffic to the redundant path, after which the device enters maintenance mode. Once maintenance or upgrade operations are complete, the user can make the device exit maintenance mode by using the same mode switchover command. Before the device returns to normal mode, GIR sends commands to each service module to cancel isolation so that the modules resume normal operation. The maintained device is then reintegrated into the active forwarding path.
NOTE: Currently, the service modules supported by GIR are BGP, IS-IS, OSPF, OSPFv3, VRRP, M-LAG (traffic isolation for M-LAG interfaces is achieved by issuing aggregation traffic isolation commands under the LACP protocol), and S-Trunk.
Mechanism for switching to GIR mode
Overview of GIR mode switchover
Table 1 lists the GIR mode switchover types and the methods that the device supports for each type.
Table 1 Switchover types and methods for GIR mode
Switchover type | Switchover method
Normal mode to maintenance mode | Isolate method (default), shutdown method, or custom-profile method. The isolate and shutdown methods execute fixed actions for each service module. The custom-profile method lets users define the commands issued to each service module.
Maintenance mode to normal mode | If no switchback method is specified, the device returns to normal mode by reversing the operations of the method used for the switchover. If the custom-profile method was used for the switchover, the reverse operations of the isolate method are used. Users can switch back manually by executing a command, or configure a delay after which the device automatically reverts to normal mode. Alternatively, users can specify the custom-profile method.
The basic principles of the three switchover methods are as follows:
· Isolate method: During the switchover, the specified service modules are isolated by means such as increasing the route metric of the routing protocol. During the switchback, isolation is canceled by means such as restoring the route metric of the routing protocol.
· Shutdown method: During the switchover, isolation is achieved by tearing down the neighbor relationships of the routing protocols and shutting down all physical interfaces on the device except the management interface, which blocks traffic heading toward the isolated device. During the switchback, the routing protocol processes are restored and the physical interfaces are brought back up, which cancels the isolation of the service modules.
· Custom-profile method: The user customizes the isolation and restoration configuration for the specified service modules in the custom-profile provided by GIR. When a GIR mode switchover is executed with the custom-profile method, the device applies the actions written in the corresponding custom-profile.
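The method selection rules in Table 1 can be sketched in a few lines of Python. This is only an illustrative model: the function names and the method strings are assumptions made for this example and are not device commands.

```python
from typing import Optional

# Method names used only inside this sketch (hypothetical labels).
METHODS = ("isolate", "shutdown", "custom-profile")


def resolve_switchover_method(requested: Optional[str] = None) -> str:
    """Pick the switchover method; the isolate method is the default."""
    method = requested if requested is not None else "isolate"
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    return method


def resolve_switchback_method(switchover_method: str,
                              requested: Optional[str] = None) -> str:
    """Pick the switchback method according to the rules in Table 1."""
    if requested is not None:
        if requested not in METHODS:
            raise ValueError(f"unknown method: {requested}")
        return requested
    # No method specified: reverse the method used for the switchover,
    # except that a custom-profile switchover is reversed with the
    # isolate method's reverse operations.
    if switchover_method == "custom-profile":
        return "isolate"
    return switchover_method


print(resolve_switchback_method("shutdown"))
print(resolve_switchback_method("custom-profile"))
```

The second call shows the special case: a switchover performed with a custom-profile falls back to the isolate method's reverse operations when no switchback method is given.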
Switching from normal mode to maintenance mode
Mechanism of the isolate method
Table 2 shows how each service module on the device operates when the device enters maintenance mode through the isolate method.
Table 2 Mechanism of the isolate method
Protocol type | Operating mechanism
BGP | Withdraw all routes that the device has advertised to its peers, except locally imported direct routes.
IS-IS | Set the overload bit in advertised LSPs and set the link cost of IS-IS interfaces to the maximum value.
OSPF/OSPFv3 | Increase the metric of advertised links and routes. In advertised Type-1 LSAs, the metric of links with link type 3 remains unchanged, and the metric of links with link type 1, 2, or 4 is set to the maximum value of 65535. In advertised Type-3 LSAs, the metric is increased to 16711680. In advertised external LSAs, the metric is increased to 16711680.
VRRP | Set the priority of the device to 0 in all VRRP groups.
M-LAG | Use the LACP traffic isolation function to keep all member ports of the M-LAG interfaces on the device in the Unselected state.
S-Trunk | Set the link state of all S-Trunk group member interfaces on the device to Down(GIR).
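The OSPF metric rules in Table 2 are easy to express as code. The following Python sketch applies them to a few sample links; the function names and the sample metric of 10 are assumptions for illustration. The link type numbers follow the router-LSA definition in RFC 2328 (1 point-to-point, 2 transit network, 3 stub network, 4 virtual link).

```python
ROUTER_LINK_MAX_METRIC = 65535       # 0xFFFF, used for Type-1 LSA links
SUMMARY_EXTERNAL_METRIC = 16711680   # 0xFF0000, used for Type-3 and external LSAs


def router_link_metric_in_isolation(link_type: int, metric: int) -> int:
    """Metric advertised for one Type-1 LSA link while the device is isolated."""
    if link_type == 3:
        # Stub network links keep their configured metric.
        return metric
    if link_type in (1, 2, 4):
        return ROUTER_LINK_MAX_METRIC
    raise ValueError(f"unknown Type-1 LSA link type: {link_type}")


def lsa_metric_in_isolation(lsa_kind: str) -> int:
    """Metric advertised in Type-3 and external LSAs while isolated."""
    if lsa_kind in ("type-3", "external"):
        return SUMMARY_EXTERNAL_METRIC
    raise ValueError(f"unsupported LSA kind: {lsa_kind}")


# Four sample links, one per link type, each with a configured metric of 10.
links = [(1, 10), (2, 10), (3, 10), (4, 10)]
adjusted = [router_link_metric_in_isolation(t, m) for t, m in links]
print(adjusted)  # [65535, 65535, 10, 65535]
```

Because neighbors still see the device in the topology but at the highest possible cost, they prefer any available alternative path, which is what lets traffic move away before maintenance starts.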
Mechanism of the shutdown method
Table 3 shows how each service module on the device operates when the device enters maintenance mode through the shutdown method.
Table 3 Mechanism of the shutdown method
Protocol type | Operating mechanism
BGP | Tear down all BGP sessions and shut down all physical interfaces except the management interface.
IS-IS | Tear down all neighbor relationships and shut down all physical interfaces except the management interface.
OSPF/OSPFv3 | Tear down all neighbor relationships and shut down all physical interfaces except the management interface.
VRRP | Shut down all physical interfaces except the management interface.
M-LAG | Shut down all physical interfaces except the management interface.
S-Trunk | Shut down all physical interfaces except the management interface.
NOTE: When the device enters maintenance mode through the shutdown method, it shuts down all physical interfaces except the management interface even if none of the service modules supported by GIR are running.
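The interface selection performed by the shutdown method can be pictured with the following Python sketch. The Interface class, its fields, and the interface names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interface:
    name: str                 # hypothetical interface name
    physical: bool            # True for physical ports, False for logical ones
    management: bool = False  # True for the management interface


def interfaces_to_shut_down(interfaces: List[Interface]) -> List[str]:
    """Names of all physical interfaces except the management interface."""
    return [i.name for i in interfaces if i.physical and not i.management]


inventory = [
    Interface("mgmt0", physical=True, management=True),
    Interface("port1", physical=True),
    Interface("port2", physical=True),
    Interface("loopback0", physical=False),
]
print(interfaces_to_shut_down(inventory))  # ['port1', 'port2']
```

Keeping the management interface up is what allows the administrator to stay connected to the device and carry out the maintenance work.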
Mechanism of the custom-profile method
Before the device enters maintenance mode through the custom-profile method, users must enter custom-profile view and customize the configuration of the device, specifying which protocols are isolated and which interfaces are shut down in maintenance mode. The custom-profile used at this stage is called the switchover profile. It records, in order, all the commands that the system must issue before entering maintenance mode.
Users can typically write the following content in the switchover profile to customize service module isolation:
· Isolation commands identical to those of the isolate method. Unlike the isolate method, the set of service modules that receive the commands can be a superset or a subset of the modules covered by the isolate method.
· Isolation commands identical to those of the shutdown method. Unlike the shutdown method, the set of service modules that receive the commands can be a superset or a subset of the modules covered by the shutdown method.
· A combination of isolation commands from both the isolate method and the shutdown method. For example, issue isolate-style commands to the service modules that need the strongest protection, and issue shutdown-style commands to the other service modules to save system resources.
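A mixed custom profile of the kind described in the last bullet can be modeled as an ordered list of entries. The Python sketch below is a conceptual model only: the function name is an assumption, and the (module, style) pairs are placeholders, not actual profile commands.

```python
from typing import List, Sequence, Tuple

ProfileEntry = Tuple[str, str]  # (service module, isolation style)


def build_switchover_profile(isolate_style: Sequence[str],
                             shutdown_style: Sequence[str]) -> List[ProfileEntry]:
    """Build an ordered profile; each module gets exactly one isolation style."""
    overlap = set(isolate_style) & set(shutdown_style)
    if overlap:
        raise ValueError(f"modules listed twice: {sorted(overlap)}")
    profile = [(module, "isolate") for module in isolate_style]
    profile += [(module, "shutdown") for module in shutdown_style]
    return profile


# Example: protect routing protocols with isolate-style actions and
# handle the remaining modules with shutdown-style actions.
profile = build_switchover_profile(
    isolate_style=["BGP", "OSPF"],
    shutdown_style=["VRRP"],
)
for module, style in profile:
    print(f"{module}: {style}")
```

The list preserves order because the profile records commands sequentially, so the entries would be issued in the order in which the user wrote them.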
Switching back from maintenance mode to normal mode
Mechanism of the isolate method
Table 4 shows how each service module on the device operates when the device returns to normal mode through the isolate method.
Table 4 Switchback mechanism of the isolate method
Protocol type | Operating mechanism
BGP | Readvertise all routes that were withdrawn during the switchover.
IS-IS | Clear the overload bit in advertised LSPs and restore the link cost of the interfaces.
OSPF/OSPFv3 | Restore the metric values of the advertised links and routes.
VRRP | Restore the priority of the device to its original value in all VRRP groups.
M-LAG | Return all M-LAG interfaces to the Selected state.
S-Trunk | Restore the link state of all member interfaces in the S-Trunk group to UP.
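The isolate method is reversible because the original values are kept and put back at switchback. The following Python sketch models that behavior for the VRRP entry; the VrrpGroup class and the sample priority of 120 are assumptions for illustration and do not describe the device's internal implementation.

```python
from typing import Optional


class VrrpGroup:
    """Toy model of one VRRP group on the device under maintenance."""

    def __init__(self, priority: int) -> None:
        self.priority = priority
        self._saved: Optional[int] = None

    def isolate(self) -> None:
        # Remember the configured priority once, then drop to 0.
        if self._saved is None:
            self._saved = self.priority
        self.priority = 0

    def restore(self) -> None:
        # Put back the priority that was in effect before isolation.
        if self._saved is not None:
            self.priority = self._saved
            self._saved = None


group = VrrpGroup(priority=120)
group.isolate()
print(group.priority)  # 0
group.restore()
print(group.priority)  # 120
```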
Mechanism of the shutdown method
Table 5 shows how each service module on the device operates when the device returns to normal mode through the shutdown method.
Table 5 Switchback mechanism of the shutdown method
Protocol type | Operating mechanism
BGP | Restore the BGP process, allow BGP sessions to be re-established, and bring up all physical interfaces except the management interface.
IS-IS | Restore the IS-IS process, allow neighbor relationships to be re-established, and bring up all physical interfaces except the management interface.
OSPF/OSPFv3 | Restore the OSPF process, allow neighbor relationships to be re-established, and bring up all physical interfaces except the management interface.
VRRP | Bring up all physical interfaces except the management interface.
M-LAG | Bring up all physical interfaces except the management interface.
S-Trunk | Bring up all physical interfaces except the management interface.
NOTE: When the device returns to normal mode through the shutdown method, it brings up all physical interfaces except the management interface even if none of the service modules supported by GIR are running.
Mechanism of the custom-profile method
When the device returns to normal mode through the custom-profile method, it switches back by using the commands that the user configured in the switchback profile. The switchback profile is the custom-profile used during the switchback. It records, in order, all the commands that the system must issue before entering normal mode.
Users can typically write the following content in the switchback profile to customize how the isolation of service modules is canceled:
· Isolation cancellation commands identical to those of the isolate method. Unlike the isolate method, the set of service modules that receive the commands can be a superset or a subset of the modules covered by the isolate method.
· Isolation cancellation commands identical to those of the shutdown method. Unlike the shutdown method, the set of service modules that receive the commands can be a subset of, or the same as, the set covered by the shutdown method.
· A combination of isolation cancellation commands from both the isolate method and the shutdown method.
GIR snapshot
During maintenance of the device, configuration verification can be achieved through GIR snapshots. GIR snapshots are data files that save the state information of each service module. Users can manually create snapshots at any time during normal system operation to record the state of the device at various times. In addition, the device will automatically generate snapshots before and after mode switching.
· When switching the device from normal mode to maintenance mode, it first records a GIR snapshot prior to entering maintenance mode, then isolates the device traffic. The generated snapshot is named "before_maintenance".
· When switching from maintenance mode back to normal mode, the device first restores the protocols and then records a GIR snapshot after exiting maintenance mode. The generated snapshot is named "after_maintenance".
GIR provides the command display gir snapshot compare to display snapshot comparison information. Users can validate the configuration differences of the device before and after the switchover using this command.
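Conceptually, comparing two snapshots means reporting every state item whose value differs between them. The Python sketch below shows that idea on two small dictionaries. The field names and values are invented for illustration and do not reflect the actual snapshot file format or the output of the command.

```python
from typing import Any, Dict, Tuple


def compare_snapshots(before: Dict[str, Any],
                      after: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {item: (before_value, after_value)} for every item that differs."""
    diff = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            diff[key] = (old, new)
    return diff


# Hypothetical state items recorded in the two automatic snapshots.
before_maintenance = {"bgp_peers_established": 4, "ospf_neighbors_full": 3}
after_maintenance = {"bgp_peers_established": 4, "ospf_neighbors_full": 2}

print(compare_snapshots(before_maintenance, after_maintenance))
# {'ospf_neighbors_full': (3, 2)}
```

An empty result would indicate that the recorded state after switchback matches the state before maintenance. Any remaining entry, such as the missing OSPF neighbor here, points to something to check before the device is considered fully restored.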
Restrictions on GIR application
· GIR relies on existing redundant or backup paths in the network to achieve smooth traffic switchover. If the network has no redundant or backup paths, using GIR to isolate service modules still results in traffic loss.
· If a command fails to be issued to a service module during a mode switchover, the device displays a failure message or generates a log, aborts the mode switchover, and stays in the original mode. However, the commands that have already been issued are not rolled back.
· With the isolate method, protocols keep running after the switchover, which consumes more system resources but allows a fast switchback to normal mode without traffic loss. With the shutdown method, protocols use fewer system resources, but the switchback is relatively slow, and traffic might be lost because shutting down interfaces directly breaks the forwarding path. Select the GIR isolation method according to your specific situation.
· In M-LAG scenarios, make sure the M-LAG interface is not in the "M-LAG MAD DOWN" state when the device fails. Otherwise, traffic forwarding through the local M-LAG interface cannot be restored.
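The failure behavior in the second restriction can be illustrated with a short Python sketch. The switch_mode function, the command strings, and the fake_issue helper are all hypothetical; they only model the described sequence of events.

```python
from typing import Callable, List, Tuple


def switch_mode(current_mode: str, target_mode: str, commands: List[str],
                issue: Callable[[str], bool]) -> Tuple[str, List[str]]:
    """Issue commands in order; abort on the first failure without rollback."""
    issued: List[str] = []
    for command in commands:
        if not issue(command):
            # Report the failure and stay in the original mode. The commands
            # already issued remain in effect (no rollback).
            print(f"mode switch aborted: failed to issue {command!r}")
            return current_mode, issued
        issued.append(command)
    return target_mode, issued


def fake_issue(command: str) -> bool:
    """Stand-in for a service module; pretends the OSPF command fails."""
    return command != "isolate OSPF"


mode, applied = switch_mode(
    "normal", "maintenance",
    ["isolate BGP", "isolate OSPF", "isolate VRRP"],
    fake_issue,
)
print(mode, applied)  # normal ['isolate BGP']
```

In this run the device stays in normal mode, yet the BGP isolation is already in effect. After such a failure, the administrator therefore needs to check the log and clean up partially applied actions manually.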
Typical applications
In a network with redundant routes, GIR with the isolate method isolates a device that requires maintenance by influencing how other network devices select traffic paths. As shown in Figure 1, Leaf 1 and Leaf 2, as well as Leaf 3 and Leaf 4, form M-LAG systems that provide access services for user virtual machines VM 1 and VM 2. Spine 1 and Spine 2 provide data aggregation services and connect to the external network through Border devices. Routing protocols direct data forwarding between the Spine and Leaf layers and between the Spine layer and the Border devices, creating load-sharing paths from the Border devices to the Spine layer and from the Spine devices to the M-LAG system of each user virtual machine. In this network, Spine and Leaf devices have multiple routes for upstream and downstream traffic. When a Spine or Leaf device on one route is under maintenance, GIR can steer all traffic toward another Spine or Leaf device on a different route.
For instance, the process of maintaining Spine 1 and Leaf 1 at the same time is as follows:
1. Configure GIR on Spine 1 so that GIR issues the isolation operations. Spine 1 withdraws its advertised routes or increases their metric. Spine 1 then enters maintenance mode, and all upstream and downstream traffic at the Spine layer migrates to Spine 2.
2. Configure GIR on Leaf 1 so that GIR issues the isolation operations. Leaf 1 withdraws its advertised routes or increases their metric, so all downstream traffic from the Spine layer migrates to Leaf 2.
3. All M-LAG interfaces on Leaf 1 enter the Unselected state. When a VM sends traffic through its aggregate interface, all traffic goes to Leaf 2, which has the Selected interface. Leaf 1 then enters maintenance mode, and neither upstream nor downstream traffic passes through Leaf 1.
4. The network administrator begins maintenance on Spine 1 and Leaf 1.
5. After the maintenance of Spine 1 and Leaf 1 is complete, the user configures a GIR switchback, which allows traffic to pass through Spine 1 and Leaf 1 again.
Figure 1 Typical network application of GIR
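The effect of step 1 on path selection can be sketched with a simple lowest-cost comparison. In the Python example below, the cost values are assumptions chosen for illustration (10 for a normal path, 65535 for the isolated device), and the function models only the general idea of equal-cost load sharing.

```python
from typing import Dict, List


def best_next_hops(costs: Dict[str, int]) -> List[str]:
    """Return every next hop that shares the lowest path cost."""
    lowest = min(costs.values())
    return sorted(name for name, cost in costs.items() if cost == lowest)


# Path costs toward the Spine layer as seen by a neighboring device.
normal_operation = {"Spine 1": 10, "Spine 2": 10}
spine1_isolated = {"Spine 1": 65535, "Spine 2": 10}

print(best_next_hops(normal_operation))  # ['Spine 1', 'Spine 2']
print(best_next_hops(spine1_isolated))   # ['Spine 2']
```

Before the switchover both Spine devices share the load. Once Spine 1 advertises the raised metric, only Spine 2 remains on the lowest-cost path, so traffic moves away from Spine 1 without any link going down.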