IS-IS GR Technology White Paper

 

Keywords: IS-IS, GR, LSDB

Abstract: As a high-availability technology, Graceful Restart (GR) enables a device to keep forwarding data without interruption while it performs an active/standby switchover or restarts a routing protocol, thus ensuring the continuity of key services. GR is now widely used for active/standby switchovers and system upgrades of routers. This document describes the implementation and application scenarios of IS-IS GR.

Note: The term “router” in this document refers to a router in a generic sense or a Layer 3 switch running a routing protocol.

Acronyms:

Acronym     Full spelling
IS-IS       Intermediate System-to-Intermediate System intra-domain routing information exchange protocol
GR          Graceful Restart
PDU         Protocol Data Unit
IIH PDU     Intermediate System-to-Intermediate System Hello PDU
LSP         Link State Protocol Data Unit
LSDB        Link State Database
SNP         Sequence Numbers PDU
PSNP        Partial Sequence Numbers PDU
CSNP        Complete Sequence Numbers PDU
RR          Restart Request
RA          Restart Acknowledgement
SA          Suppress adjacency advertisement
DIS         Designated Intermediate System
RIB         Routing Information Base
FIB         Forwarding Information Base

 



1  Overview

Graceful Restart (GR) enables a device to keep forwarding data without interruption while it performs an active/standby switchover or restarts a routing protocol. When a GR-capable device restarts a routing protocol, it notifies its neighbors of the event, and the neighbors maintain their adjacencies with the device and the routing information learned from it for a specified interval. After the protocol restarts, the device retrieves the information maintained by the GR-capable protocol (topology, routing, and session information) from the neighbors and restores the state it had before the restart. During the restart, no route flapping occurs and no forwarding path changes, so the system operates continuously.

IS-IS GR ensures the service continuity of an IS-IS-enabled router during active/standby switchover or IS-IS restart.

1.1  Background

Without GR, a router that restarts IS-IS sends hello packets to discover neighbors. Upon receiving such a hello packet, each neighbor removes its former adjacency with the router and notifies other routers of the change. The restarting router and its neighbors then have to reestablish adjacencies and exchange routing information again. This process causes route flapping and service interruption, which are unacceptable on a network requiring high reliability.

How can this problem be solved?

A distributed device has its control plane separated from its forwarding plane: the main control board controls and manages the entire device, including protocol operation and route calculation, while the interface boards forward packets. Data forwarding can therefore continue during an active/standby switchover or a routing protocol restart. Meanwhile, if the neighbors keep their adjacencies with the restarting device, the device can retrieve routing information from them and quickly restore the routes it maintained before the restart.

To implement these ideas, the IETF introduced IS-IS GR, which effectively avoids route flapping and service interruption during an active/standby switchover or IS-IS restart.

1.2  Benefits

IS-IS GR ensures service continuity and avoids route flapping during an active/standby switchover or IS-IS restart. The restarting control plane is therefore no longer a single point of failure, which enhances network reliability.

2  IS-IS GR Implementation

2.1  Terminology

•  GR Restarter: A GR-capable device that restarts a routing protocol.

•  GR Helper: A neighbor of a GR Restarter that helps the GR Restarter retrieve its routing information during a GR process.

•  GR session: The GR capability negotiation performed when two routers establish an IS-IS neighbor relationship. If both routers are GR-capable, they perform GR when the protocol restarts.

 

A distributed device can be configured as a GR Restarter and GR Helper, while a centralized device can be configured only as a GR Helper that helps the GR Restarter to complete a GR process.

 

2.2  Operating Mechanism

To support GR capability, IS-IS is extended as follows:

•  The Restart TLV (TLV 211) is added to IS-IS hello packets.

•  Three timers, T1, T2, and T3, are added.

2.2.1  Restart TLV

To allow the GR Restarter to notify GR Helpers of a restart, the Restart TLV (Type 211) is added to hello packets. The Length field gives the number of bytes in the Value field; it depends on which fields are present and must be in the range of 1 to 3 + ID Length. The Value format is illustrated in the following figure.

Figure 1 Structure of the Restart TLV

1. Flags

The Flags field contains necessary state flag bits and has a length of 1 byte. Its format is illustrated in the following figure.

Figure 2 Flags field of the Restart TLV

Currently, only the three low-order bits are used as flag bits; the other five bits are reserved.

(1)        RR/RA

•  RR: Restart Request flag bit. If set to 1, it indicates that the sending router has just restarted.

•  RA: Restart Acknowledgement flag bit. If set to 1, it indicates that the packet is an acknowledgement sent to the GR Restarter.

After a GR Restarter restarts, it sets the RR flag bit to 1 in the hello packets it sends on each interface, notifying the GR Helpers of the restart event. Upon receiving such a hello packet, each GR Helper replies with a hello packet that has the RA flag bit set to 1.

(2)        SA

The Suppress adjacency advertisement (SA) flag bit is an optional bit that aims to avoid black hole routes. If the neighbors advertise their adjacency with the GR Restarter and traffic is forwarded to it while it cannot yet forward correctly (for example, before it has resynchronized its LSDB), a black hole occurs, resulting in severe packet loss. To avoid this, the GR Restarter sets the SA bit to 1 in the hello packets it sends. Upon receiving such packets, GR Helpers do not include the GR Restarter in the LSPs they advertise. The GR Restarter is thus hidden from the network for a period, during which no device forwards packets to it.
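The three flag bits described above can be sketched in Python. This is a minimal illustration rather than code from any router implementation: the class name `RestartFlags` and its methods are hypothetical, and the bit positions (RR as the lowest-order bit, then RA, then SA) assume that Figure 2 follows the layout defined in RFC 3847.

```python
from dataclasses import dataclass

# Assumed bit masks for the Flags byte (RR is the least significant bit).
RR_MASK = 0x01  # Restart Request
RA_MASK = 0x02  # Restart Acknowledgement
SA_MASK = 0x04  # Suppress adjacency advertisement


@dataclass(frozen=True)
class RestartFlags:
    """Hypothetical container for the three flag bits of the Restart TLV."""
    rr: bool = False
    ra: bool = False
    sa: bool = False

    def to_byte(self) -> int:
        """Pack the flags into one byte; the five reserved bits stay 0."""
        value = 0
        if self.rr:
            value |= RR_MASK
        if self.ra:
            value |= RA_MASK
        if self.sa:
            value |= SA_MASK
        return value

    @classmethod
    def from_byte(cls, value: int) -> "RestartFlags":
        """Unpack one byte, ignoring the reserved high-order bits."""
        return cls(
            rr=bool(value & RR_MASK),
            ra=bool(value & RA_MASK),
            sa=bool(value & SA_MASK),
        )
```

For example, `RestartFlags(rr=True, sa=True).to_byte()` yields the byte a restarting router would send when it requests a restart and asks its neighbors to suppress the adjacency advertisement.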

2. Remaining Time

The Remaining Time field indicates the time (in seconds) remaining before the neighbor ages out the adjacency, that is, the maximum time for which the neighbor can act as a GR Helper. When this time expires, the adjacency between the GR Restarter and the GR Helper is torn down. When a GR Helper receives a hello packet with the RR bit set to 1 from the GR Restarter, it replies with a hello packet that has the RA bit set to 1 and the Remaining Time field filled in.

3. Restarting Neighbor ID

The Restarting Neighbor ID field carries the system ID of a GR Restarter. Upon receiving a hello packet with the RR bit set to 1, a GR Helper copies the sender's system ID from that packet into the Restarting Neighbor ID field of its RA hello packet and then sends the RA packet. In this way, the RA packet clearly identifies its intended receiver. If multiple GR Restarters receive the RA packet, each compares the Restarting Neighbor ID with its own system ID, and only the GR Restarter whose system ID matches processes the RA packet.
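The field layout described in this section can be illustrated with a short encoder and decoder. This is a sketch under stated assumptions, not a vendor implementation: all function and type names are hypothetical, the system ID is assumed to be 6 bytes long, the Remaining Time field is assumed to be 2 bytes in network byte order (as in RFC 3847), the flag masks assume RR = 0x01 and RA = 0x02, and treating an RA without a Restarting Neighbor ID as addressed to the receiver is a simplification made only for this example.

```python
import struct
from typing import NamedTuple, Optional

RESTART_TLV_TYPE = 211
SYSTEM_ID_LEN = 6   # assumption: the common 6-byte system ID
RA_MASK = 0x02      # assumption: RA is the second-lowest bit of Flags


class RestartTlv(NamedTuple):
    flags: int
    remaining_time: Optional[int] = None          # seconds, 2 bytes
    restarting_neighbor_id: Optional[bytes] = None


def encode_restart_tlv(tlv: RestartTlv) -> bytes:
    """Build Type, Length and Value; optional fields are appended in order."""
    if tlv.restarting_neighbor_id is not None and tlv.remaining_time is None:
        raise ValueError("neighbor ID requires the Remaining Time field")
    value = bytes([tlv.flags])
    if tlv.remaining_time is not None:
        value += struct.pack("!H", tlv.remaining_time)
    if tlv.restarting_neighbor_id is not None:
        if len(tlv.restarting_neighbor_id) != SYSTEM_ID_LEN:
            raise ValueError("system ID must be 6 bytes in this sketch")
        value += tlv.restarting_neighbor_id
    return bytes([RESTART_TLV_TYPE, len(value)]) + value


def decode_restart_tlv(data: bytes) -> RestartTlv:
    """Parse one Restart TLV; this sketch accepts Length 1, 3 or 3 + ID Length."""
    if len(data) < 3 or data[0] != RESTART_TLV_TYPE:
        raise ValueError("not a Restart TLV")
    length = data[1]
    value = data[2:2 + length]
    if len(value) != length or length not in (1, 3, 3 + SYSTEM_ID_LEN):
        raise ValueError("unexpected Restart TLV length")
    remaining = struct.unpack("!H", value[1:3])[0] if length >= 3 else None
    neighbor = value[3:] if length > 3 else None
    return RestartTlv(value[0], remaining, neighbor)


def build_ra_reply(restarter_system_id: bytes, remaining_time: int) -> bytes:
    """GR Helper side: acknowledge an RR hello and name the intended receiver."""
    return encode_restart_tlv(
        RestartTlv(RA_MASK, remaining_time, restarter_system_id))


def ra_is_for_me(tlv_bytes: bytes, my_system_id: bytes) -> bool:
    """GR Restarter side: process an RA only if it names this router."""
    tlv = decode_restart_tlv(tlv_bytes)
    if not tlv.flags & RA_MASK:
        return False
    # Assumption: an RA without a neighbor ID is treated as addressed to us.
    return tlv.restarting_neighbor_id in (None, my_system_id)
```

With these helpers, a GR Helper that receives an RR hello from system ID `0000.0000.000a` could answer with `build_ra_reply(bytes.fromhex("00000000000a"), 30)`, and only the restarter that owns that system ID would accept the reply.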

2.2.2  Timers

To support GR, IS-IS defines three timers, T1, T2, and T3.

•  Similar to the IIH timer of IS-IS, a T1 timer runs on each interface and defines the interval for sending hello packets with the RR flag bit set. After a restart, the device creates a T1 timer on each interface and periodically sends hello packets with the RR flag bit set. An interface cancels its T1 timer after it receives a hello acknowledgement packet with the RA flag bit set and all CSNP packets. If an interface has no neighbor, or its neighbor is not GR-capable, it never receives such an acknowledgement and so could never cancel the T1 timer. To avoid this, IS-IS GR specifies a maximum number of T1 expirations. When the number of times the T1 timer has expired exceeds this value, the T1 timer is cancelled automatically.

•  A T2 timer defines the maximum time to wait for an LSDB to synchronize after the device restarts. Each LSDB has its own T2 timer. For example, a Level-1-2 router has two T2 timers: one for Level-1 LSDB synchronization and the other for Level-2 LSDB synchronization. When the LSDB synchronization of a level completes, the corresponding T2 timer is cancelled. If LSDB synchronization does not complete within the T2 timer interval, the T2 timer is cancelled and the GR process fails.

•  The T3 timer defines the maximum duration of a GR process on a device. An IS-IS router has only one T3 timer, which has an initial value of 65535 seconds. During a GR process, however, the T3 timer is set to the minimum Remaining Time value among the hello acknowledgement packets with the RA flag bit set that are received on all interfaces. If LSDB synchronization has not completed when the T3 timer expires, the T3 timer is cancelled and the GR process fails.
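The timer rules above can be modelled without real clocks by counting events. The following sketch is illustrative only: the class names, method names, and the threshold `T1_MAX_EXPIRATIONS = 3` are hypothetical, and the T1 timer is cancelled here as soon as the expiration count reaches the threshold, which is one possible reading of the rule.

```python
from dataclasses import dataclass, field
from typing import Dict

T3_INITIAL_SECONDS = 65535   # initial T3 value stated in the text
T1_MAX_EXPIRATIONS = 3       # hypothetical threshold, chosen for illustration


@dataclass
class T1Timer:
    """Per-interface T1 state; expirations are counted instead of timed."""
    expirations: int = 0
    running: bool = True
    ra_received: bool = False
    csnps_complete: bool = False

    def expire(self) -> bool:
        """Handle one T1 expiry; return True if an RR hello must be resent."""
        if not self.running:
            return False
        self.expirations += 1
        if self.expirations >= T1_MAX_EXPIRATIONS:
            self.running = False   # give up: nobody acknowledged the restart
            return False
        return True

    def note(self, ra: bool = False, csnps_complete: bool = False) -> None:
        """Record an RA hello and/or a full CSNP set; cancel T1 once both arrived."""
        self.ra_received = self.ra_received or ra
        self.csnps_complete = self.csnps_complete or csnps_complete
        if self.ra_received and self.csnps_complete:
            self.running = False


@dataclass
class RestartTimers:
    """T2 (one per level) and T3 (one per router) on the GR Restarter."""
    t2_running: Dict[int, bool] = field(
        default_factory=lambda: {1: True, 2: True})   # a Level-1-2 router
    t3_seconds: int = T3_INITIAL_SECONDS
    t3_running: bool = True
    failed: bool = False

    def on_ra_remaining_time(self, seconds: int) -> None:
        """T3 tracks the smallest Remaining Time seen in any RA hello."""
        self.t3_seconds = min(self.t3_seconds, seconds)

    def on_lsdb_synchronized(self, level: int) -> None:
        """Cancel the T2 of this level; cancel T3 when no T2 is left."""
        self.t2_running[level] = False
        if not any(self.t2_running.values()):
            self.t3_running = False   # GR completed successfully

    def on_timeout(self) -> None:
        """A T2 or T3 expired before synchronization finished: GR fails."""
        if not self.t3_running:
            return
        self.t2_running = {level: False for level in self.t2_running}
        self.t3_running = False
        self.failed = True
```

Keeping T1 state per interface and T2/T3 state per router mirrors the scoping in the text: an interface without a GR-capable neighbor stops retrying on its own, while the outcome of the GR process is decided only by the T2 and T3 timers.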

2.2.3  Work Process

As shown in Figure 3, both Router A and Router B run IS-IS.

Figure 3 Network diagram for IS-IS

Suppose Router A is GR-capable and maintains a stable IS-IS adjacency with Router B. After Router A restarts, it exchanges routing information with Router B as illustrated in Figure 4.

Figure 4 IS-IS GR work process

The work process of IS-IS GR is as follows:

(1)        After Router A restarts IS-IS globally, it starts the T2 and T3 timers. When Ethernet 1/1 on Router A comes up and IS-IS is enabled on it again, the interface starts the T1 timer and sends a hello packet that has the RR flag bit in the Flags field of the Restart TLV set to 1.

(2)        Upon receiving the hello packet, Router B keeps the adjacency with Router A, and returns a hello packet with the RA flag bit set. Then, Router B sends CSNP and LSP packets to Router A for LSDB synchronization.

(3)        If Ethernet 1/1 on Router A receives the hello packet with the RA flag bit set and all CSNP packets, it cancels the T1 timer. Otherwise, it keeps sending hello packets with the RR flag bit set each time the T1 timer expires, and cancels the T1 timer only when it receives a hello packet with the RA flag bit set and all CSNP packets from the peer, or when the number of T1 expirations reaches the threshold.

(4)        When the LSDB synchronization completes, Router A cancels the T2 timer.

(5)        When all T2 timers are cancelled, the T3 timer is cancelled and the GR process completes. Then, the normal IS-IS process starts: all interfaces start their IIH timers and periodically send normal hello packets (with all fields in the Restart TLV set to 0).

(6)        After recollecting all routing information, Router A performs route calculation and refreshes the FIB table.
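Steps (1) through (6) can be replayed as a message trace between Router A and Router B. The sketch below is a simplified illustration with one interface and one LSDB level; the function name `simulate_gr` and its parameters are hypothetical, and the failure branch assumes that a GR Restarter which never receives an acknowledgement eventually fails on T2 expiry.

```python
from typing import List


def simulate_gr(helper_replies: bool = True,
                t1_max_expirations: int = 3) -> List[str]:
    """Return the ordered events of one GR attempt on Router A."""
    trace = [
        "A: restart IS-IS, start T2 and T3",       # step (1)
        "A: Ethernet 1/1 up, start T1",
    ]
    expirations = 0
    while True:
        trace.append("A: send hello with RR=1")
        if helper_replies:
            # Step (2): the GR Helper keeps the adjacency and answers.
            trace.append("B: keep adjacency, send hello with RA=1")
            trace.append("B: send CSNPs and LSPs")
            # Step (3): both conditions are met, so T1 is cancelled.
            trace.append("A: RA and all CSNPs received, cancel T1")
            break
        expirations += 1   # T1 expired without an acknowledgement
        if expirations >= t1_max_expirations:
            trace.append("A: T1 expiration limit reached, cancel T1")
            trace.append("A: T2 expired before LSDB synchronization, GR failed")
            return trace
    trace.append("A: LSDB synchronized, cancel T2")                      # step (4)
    trace.append("A: all T2 timers cancelled, cancel T3, GR complete")   # step (5)
    trace.append("A: send normal hellos on the IIH timer")
    trace.append("A: recalculate routes and refresh FIB")                # step (6)
    return trace
```

Calling `print("\n".join(simulate_gr()))` lists the successful exchange in order, while `simulate_gr(helper_replies=False)` shows the retries that end once the T1 expiration limit is reached.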

3  Application Scenarios

3.1  Network Diagram

Figure 5 Network diagram for IS-IS GR configuration

3.2  Network Requirements

•  As shown in Figure 5, all routers run IS-IS. Router A and Router B are backbone nodes. Router F, Router G, Router H, Router I, Router J, and Router K are branch nodes, which are connected to the backbone nodes through the core nodes Router C, Router D, and Router E.

•  Enable GR to ensure service continuity on the backbone nodes and core nodes when the protocol restarts, and to avoid route flapping.

•  The backbone nodes and core nodes act as GR Restarters (which also serve as GR Helpers by default), and the branch nodes serve as GR Helpers. If a backbone node performs an active/standby switchover or restarts IS-IS, the core nodes serve as GR Helpers for LSDB synchronization and ensure service continuity. If a core node performs an active/standby switchover or restarts IS-IS, both the backbone nodes and the branch nodes serve as GR Helpers for LSDB synchronization to ensure service continuity.
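The role assignment above amounts to "every IS-IS neighbor of the restarting node acts as its GR Helper". The sketch below makes that explicit. Because Figure 5 is not reproduced here, the link list is a hypothetical wiring invented for illustration only, and the names `LINKS`, `GR_RESTARTERS`, and `helpers_for` are likewise assumptions.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical wiring: Figure 5 is not available, so these links are assumed.
LINKS: List[Tuple[str, str]] = [
    ("A", "C"), ("A", "D"), ("B", "D"), ("B", "E"),   # backbone to core
    ("C", "F"), ("C", "G"), ("D", "H"), ("D", "I"),   # core to branch
    ("E", "J"), ("E", "K"),
]

# Backbone and core nodes are configured as GR Restarters in the scenario.
GR_RESTARTERS: Set[str] = {"A", "B", "C", "D", "E"}


def helpers_for(restarting: str,
                links: Iterable[Tuple[str, str]] = LINKS) -> List[str]:
    """Return the neighbors that act as GR Helpers for a restarting node."""
    if restarting not in GR_RESTARTERS:
        raise ValueError(f"Router {restarting} is only a GR Helper here")
    neighbors: Set[str] = set()
    for left, right in links:
        if left == restarting:
            neighbors.add(right)
        elif right == restarting:
            neighbors.add(left)
    return sorted(neighbors)
```

Under the assumed wiring, a restart of backbone node A is helped by the core nodes it connects to, and a restart of core node D is helped by both backbone and branch nodes, which matches the two cases described in the requirements.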

4  References

•  ISO 10589: ISO IS-IS Routing Protocol

•  RFC 1195: Use of OSI IS-IS for Routing in TCP/IP and Dual Environments

•  RFC 3847: Restart Signaling for Intermediate System to Intermediate System (IS-IS)

 

Copyright ©2008 Hangzhou H3C Technologies Co., Ltd. All rights reserved.

No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of Hangzhou H3C Technologies Co., Ltd.

The information in this document is subject to change without notice.