MSTP Technology White Paper (VLAN Ignore not Supported)

Key words: STP, RSTP, MSTP, rapid transition, multiple instances, redundancy loop, redundancy link, load sharing

Abstract: This article introduces basic MSTP terms, MSTP algorithm and implementation, MSTP implementations delivered by Comware, and typical MSTP applications.

Acronyms:

Acronym    Full Spelling
STP        Spanning Tree Protocol
RSTP       Rapid Spanning Tree Protocol
MSTP       Multiple Spanning Tree Protocol
CST        Common Spanning Tree
IST        Internal Spanning Tree
CIST       Common and Internal Spanning Tree
MSTI       Multiple Spanning Tree Instance

 



Overview

1.1  Background

In a Layer-2 switched network, loops can cause proliferation and infinite cycling of packets, which incurs broadcast storms. Broadcast storms can cause all network bandwidth to be occupied, making the whole network unavailable.

To address the problem, the Spanning Tree Protocol (STP) was introduced. STP operates on Layer 2. It eliminates Layer 2 loops in a network by selectively blocking specific links. STP also enables link redundancy.

Like other protocols, STP has evolved with the development of network technologies. STP first took its form in IEEE 802.1D, from which IEEE 802.1w (RSTP) and IEEE 802.1s (MSTP) are derived.

1.1.1  IEEE 802.1D STP

The idea of STP is to eliminate loops in a network by pruning the network into a loop-free tree. To achieve this, the concepts of root bridge, root port, designated port, and path cost are introduced. They also help achieve link redundancy and path optimization. The algorithm used by STP to create a tree-shaped topology is called the spanning tree algorithm.

Bridges implement STP by sending messages between them. These messages carry information needed for spanning tree calculation and are encapsulated in bridge protocol data units (BPDUs). STP BPDUs are Layer 2 packets received and processed by all STP-enabled bridges. They carry the destination MAC address 01-80-C2-00-00-00, a multicast MAC address.

STP operates as follows:

•  The STP-enabled bridges in the network elect the root bridge. Each bridge has a bridge ID, which consists of the bridge priority and the bridge MAC address. The bridge with the lowest bridge ID is elected the root bridge. All the ports of the root bridge are designated ports. The devices attached to the root bridge are downstream bridges.

•  Each downstream bridge selects the path with the lowest path cost to the root bridge. The port on that path becomes the root port of the downstream bridge. When all the bridges in the network have determined their root ports and designated ports, a spanning tree is established.

•  After the spanning tree converges (usually 30 seconds after the spanning tree is established), the designated ports and the root ports enter the forwarding state and all other ports are blocked.

•  The bridges periodically send STP BPDUs through designated ports to maintain the link state. When the network topology changes, the spanning tree is recalculated, which causes port state changes.
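The root bridge election described above can be sketched as follows. This is a minimal illustration, assuming the bridge ID is compared as the pair (priority, MAC address) with the numerically lower value preferred; the `BridgeId` type, the `elect_root` helper, and the MAC addresses are hypothetical and do not come from any device implementation.

```python
from typing import NamedTuple


class BridgeId(NamedTuple):
    priority: int  # bridge priority; a numerically lower value is preferred
    mac: str       # bridge MAC address as 12 hex digits (hypothetical values below)


def elect_root(bridges):
    """Return the bridge with the lowest bridge ID (priority first, then MAC)."""
    return min(bridges, key=lambda b: (b.priority, int(b.mac, 16)))


bridges = [
    BridgeId(32768, "00e0fc000003"),
    BridgeId(4096, "00e0fc000002"),
    BridgeId(4096, "00e0fc000001"),
]
root = elect_root(bridges)
print(root.priority, root.mac)
```

When two bridges share the same priority, the MAC address acts as the tie-breaker, which is why the third entry wins in this sketch.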

Despite all its benefits, STP has some drawbacks, one of which is slow convergence.

There is a delay before configuration BPDUs can propagate throughout the network. The delay is known as the forward delay. With STP, it defaults to 15 seconds. During this period, transient loops may exist because some ports that need to be blocked may be still in the forwarding state. To eliminate transient loops, an intermediate state, that is, the learning state, is introduced. In this state, a port learns MAC addresses but does not forward packets. Because each state transition must undergo a period the same as the forward delay, the mechanism eliminates transient loops that may occur when network topology changes. A convergence delay that is at least two times the forward delay, however, is intolerable for real-time services such as voice and video.

1.1.2  IEEE 802.1w RSTP

To overcome the slow convergence of STP, the IEEE released IEEE 802.1w in 2001. IEEE 802.1w defines RSTP. RSTP makes the following three improvements over STP, which shorten the convergence time to as little as one second.

(1)          Two port roles, Alternate port and Backup port, are introduced for root ports and designated ports. That is, when a root port fails, its alternate port becomes the new root port and transits to the forwarding state without delay; when a designated port fails, its backup port becomes the new designated port and transits to the forwarding state without delay.

(2)          For a point-to-point link connecting to only two switching ports, only one handshake is needed between the designated port and the downstream bridge for the former to transit without delay to the forwarding state. For a shared link connecting to three or more bridges, the downstream bridges do not respond to the handshake requests from the upstream bridge. In this case, a downstream bridge must undergo a period two times the Forward Delay to transit from the blocked state to the forwarding state.

(3)          The edge port is introduced, which refers to the port directly connected to a terminal rather than a bridge. An edge port can transit to the forwarding state without any delay. As a bridge cannot determine whether a port is directly connected to a terminal, edge ports must be specified manually.

RSTP is compatible with STP. You can employ STP and RSTP in the same network. However, as RSTP results in a single spanning tree (SST) in a network just like STP, it suffers from the same drawbacks as any SST protocol does, as described below.

(1)          In a large switched network, adopting a single spanning tree causes a relatively long convergence time.

(2)          With a single spanning tree adopted, all the VLANs in the network share the same spanning tree. You need to make sure that data communications in each VLAN can be carried out along the spanning tree.

(3)          Blocked links do not forward traffic and therefore, do not participate in load balancing. This causes inefficient use of bandwidth.

These drawbacks limit the application of SST protocols and result in the emergence of MSTP, which takes VLANs into account.

1.2  Benefits of MSTP

MSTP is defined in IEEE 802.1s. MSTP has the following advantages over STP and RSTP:

•  MSTP introduces the concept of a region. MSTP divides a switched network into multiple regions, each containing multiple spanning trees independent of one another. MSTP uses the CIST to exchange information between regions and thus prevents loops in the network.

•  MSTP introduces the concept of an instance. In MSTP, instances are independent spanning trees, each corresponding to a group of VLANs. By mapping multiple VLANs to one instance, you reduce the communication overhead and the network resources required. Load balancing is achieved by forwarding the traffic of different VLANs along different instances.

•  MSTP allows for rapid port state transition, just like RSTP.

•  MSTP is compatible with STP and RSTP.

MSTP Implementation

2.1  Concepts

Assume that all the switches in Figure 1  are running MSTP. This section explains some basic concepts in MSTP based on the figure.

Figure 1  Basic concepts in MSTP

(1)          MST region

An MST region consists of multiple devices in a switched network and the network segments between them. These devices have the following characteristics:

•  All are MSTP-enabled.

•  They have the same region name.

•  They have the same VLAN-to-instance mapping configuration.

•  They have the same MSTP revision level configuration.

•  They are physically linked with one another.

In Figure 1 , A0 is an MST region.

(2)          VLAN-to-instance mapping table

As an attribute of an MST region, the VLAN-to-instance mapping table describes the mapping relationships between VLANs and MSTIs. In Figure 1 , for example, the VLAN-to-instance mapping table of region A0 is: VLAN 1 is mapped to MSTI 1, VLAN 2 to MSTI 2, and the rest to CIST.

(3)          IST

An internal spanning tree (IST) is a spanning tree that runs in an MST region. ISTs in all MST regions and the common spanning tree (CST) jointly constitute the common and internal spanning tree (CIST) of the entire network. An IST is the section of the CIST in an MST region, as shown in Figure 1 .

(4)          CST

The CST is a single spanning tree that connects all MST regions in a switched network. If you regard each MST region as a “device”, the CST is a spanning tree calculated by these devices through STP or RSTP. For example, the red lines in Figure 1  represent the CST.

(5)          CIST

Jointly constituted by ISTs and the CST, the CIST is a single spanning tree that connects all devices in a switched network.

As shown in Figure 1 , the ISTs in all MST regions plus the inter-region CST constitute the CIST of the entire network.

(6)          MSTI

MSTP can generate multiple spanning trees in an MST region, each independent of the others and each corresponding to a specified set of VLANs. Each such spanning tree is referred to as a multiple spanning tree instance (MSTI), as shown in Figure 1.

(7)          Boundary port

A boundary port is a port that is located on an MST region boundary and is used to connect an MST region to another MST region, or to a single spanning tree region running STP/RSTP.

(8)          Bridge ID

A bridge ID consists of the priority of a bridge and the MAC address of the bridge.

(9)          Common root bridge

The common root bridge is the root bridge of the CIST.

(10)      External root path cost

External root path cost refers to the cost of the shortest path for a packet to travel to the common root bridge.

(11)      Regional root

The root bridge of the IST or an MSTI within an MST region is the regional root bridge of the IST or the MSTI. Based on the topology, different spanning trees in an MST region may have different regional roots.

(12)      Internal root path cost

Internal root path cost refers to the cost of the shortest path for a packet to travel to the regional root bridge.

(13)      Designated bridge ID

A designated bridge ID consists of the priority of a designated bridge and the MAC address of the designated bridge.

(14)      Designated port ID

A designated port ID consists of the priority of a designated port and the port number.

(15)      Port role

MSTP calculation involves these port roles: root port, designated port, master port, alternate port, and backup port. A port can play different roles in different MSTIs. Figure 2  helps understand these concepts.

Figure 2  Port roles

•  Root port: a port responsible for forwarding data to the root bridge.

•  Designated port: a port responsible for forwarding data to the downstream network segment or device.

•  Master port: a port on the shortest path from the current region to the common root bridge, connecting the MST region to the common root bridge.

•  Alternate port: the standby port of a root port or a master port. When the root port or master port is blocked, the alternate port becomes the new root port or master port.

•  Backup port: the backup port of a designated port. When the designated port is blocked, the backup port becomes the new designated port and starts forwarding data without delay. Because two interconnected ports on the same MSTP device can cause a loop, the device blocks one of the two ports. The blocked port is the backup port.

(16)      Port state

In MSTP, a port stays in one of the following three states depending on whether it learns MAC addresses and forwards user traffic:

•  Forwarding: the port learns MAC addresses and forwards user traffic.

•  Learning: the port learns MAC addresses but does not forward user traffic.

•  Discarding: the port neither learns MAC addresses nor forwards user traffic.
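The three port states listed under Port state can be captured in a small data model. The sketch below is illustrative only; the `PortState` enumeration and its `learns` and `forwards` properties are hypothetical names chosen for this example.

```python
from enum import Enum


class PortState(Enum):
    # Each value is the pair (learns MAC addresses, forwards user traffic).
    FORWARDING = (True, True)
    LEARNING = (True, False)
    DISCARDING = (False, False)

    @property
    def learns(self):
        return self.value[0]

    @property
    def forwards(self):
        return self.value[1]


for state in PortState:
    print(state.name, state.learns, state.forwards)
```

Modeling the state as two independent flags makes it easy to see that no state forwards traffic without also learning MAC addresses.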

2.2  Algorithm Implementation

2.2.1  Initial state

In this state, each port on each device generates the configuration BPDUs of its own, with the root bridge being the local device, the common root and regional root being the local bridge ID, the internal/external root path cost being 0, the designated bridge ID being the local bridge ID, the designated port being the local port, and the BPDU receiving port being 0.

2.2.2  Port Role Selection Rules

The following table shows the port role selection rules:

Port role          Selection rule
Root port          The port priority vector of the port is superior to its designated priority vector, and the root priority vector of the device is the same as the root path priority vector of the port.
Designated port    The designated priority vector of the port is superior to the port priority vector.
Master port        The role of a root port at the region boundary is a master port on an MSTI.
Alternate port     The port priority vector of the port is superior to its designated priority vector, but the root priority vector of the device is not the same as the root path priority vector of the port.
Backup port        The port priority vector of the port is superior to its designated priority vector, but the designated bridge ID in the port priority vector is the local bridge ID.

 

Note:

•  Port role selection rules involve multiple types of priority vectors. For detailed information about these priority vectors, refer to Determining the Priority Vectors.

•  Once the message priority vector received by a port is superior to the port priority vector, all types of priority vectors will be recalculated and the role of each port involved will be recalculated.

 

2.2.3  Determining the Priority Vectors

The MSTP role of each bridge is calculated based on the information carried in BPDUs. The most important information carried in BPDUs is the spanning tree priority vector. The following part introduces how to calculate the CIST priority vectors and MSTI priority vectors.

1. Determining the CIST priority vectors

The CIST priority vector consists of common root bridge, external root path cost, regional root, internal root path cost, designated bridge ID, designated port ID, and the BPDU-receiving port ID.

For the ease of description, we assume that:

•  In the initial state, the BPDUs that port PB of bridge B sends out carry common root bridge RB, external root path cost ERCB, regional root RRB, internal root path cost IRCB, designated bridge ID B, designated port ID PB, and the BPDU-receiving port ID PB, that is, the value set of {RB, ERCB, RRB, IRCB, B, PB, PB}.

•  The BPDUs that port PB of bridge B receives from port PD of bridge D carry common root bridge RD, external root path cost ERCD, regional root RRD, internal root path cost IRCD, designated bridge ID D, designated port ID PD, and the BPDU-receiving port ID PB, that is, the value set of {RD, ERCD, RRD, IRCD, D, PD, PB}.

•  The BPDUs that port PB of bridge B receives from port PD of bridge D are of higher priority.

The following part describes how to calculate each priority vector based on the assumptions above.

(1)          Message priority vector

Message priority vectors are those carried in MSTP BPDUs. According to the assumptions, the message priority vector of port PB on bridge B is: {RD : ERCD : RRD : IRCD : D : PD : PB}. If bridge B and bridge D are in different regions, the internal root path cost is insignificant to bridge B and will be set to 0.

(2)          Port priority vector

In the initial state, the port priority vector takes the local port as the root. The port priority vector of port PB on bridge B is: {RB  :  ERCB  :  RRB  :  IRCB  :  B  :  PB  :  PB}.

The port priority vector is updated as new BPDUs are received on the port. If the port receives BPDUs superior to the BPDUs that it generates, the port updates its port priority vector according to the received BPDUs; otherwise, the port priority vector of the port does not change. As the priority vector of the BPDUs received on port PB is superior to the port priority vector, the port priority vector is updated into {RD  :  ERCD  :  RRD  :  IRCD  :  D  :  PD  :  PB}.

(3)          Determining the root path priority vector

Root path priority vector is derived from the port priority vector.

•  If the port priority vector received is from another region, then the external root path cost of the root path priority vector is the sum of the path cost of the port and the external root path cost of the port priority vector; the regional root of the root path priority vector is the local regional root; the internal root path cost is 0. Suppose the path cost of port PB is PCPB. Then, the root path priority vector of port PB on bridge B is {RD : ERCD + PCPB : B : 0 : D : PD : PB}.

•  If the port priority vector received is from the same region, then the internal root path cost of the root path priority vector is the sum of the path cost of the port and the internal root path cost of the port priority vector. Thus, the root path priority vector of port PB on bridge B is {RD : ERCD : RRD : IRCD + PCPB : D : PD : PB}.

(4)          Bridge priority vector

In the bridge priority vector, the common root ID, regional root ID, and designated bridge ID are all the local bridge ID, and the external root path cost, internal root path cost, designated port ID, and receiving port ID are all 0. Thus, the bridge priority vector of bridge B is {B : 0 : B : 0 : B : 0 : 0}.

(5)          Root priority vector

The root priority vector is the optimal one among bridge priority vector and all the root path priority vectors with their designated bridge ID not the same as the local bridge ID. If the local bridge priority vector is the optimal one, the local bridge is the root of the CIST. Suppose that the bridge priority vector of bridge B is optimal. Then, the root priority vector of bridge B is {B :  0 :  B :  0 :  B :  0 :  0}.

(6)          Designated priority vector

By setting the designated bridge ID and designated port ID of the root priority vector to B and PB, you can obtain the designated priority vector of port PB on bridge B, that is, {B :  0 :  B :  0 :  B :  PB :  0}.
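The derivation of the root path priority vector in item (3) can be sketched in a few lines. This is an illustrative model, not the Comware implementation: the `CistVector` type and the `root_path_vector` helper are hypothetical, bridge and port IDs are represented as plain strings, and the sample values are taken from the Switch B and Switch C walk-through in section 2.2.4.

```python
from typing import NamedTuple


class CistVector(NamedTuple):
    root: str               # common root bridge
    ext_cost: int           # external root path cost
    regional_root: str      # regional root
    int_cost: int           # internal root path cost
    designated_bridge: str  # designated bridge ID
    designated_port: str    # designated port ID
    rx_port: str            # BPDU-receiving port ID


def root_path_vector(port_vec, port_cost, same_region, local_bridge):
    """Derive the root path priority vector from a port priority vector."""
    if same_region:
        # Same region: only the internal root path cost grows.
        return port_vec._replace(int_cost=port_vec.int_cost + port_cost)
    # Another region: the external cost grows, the regional root becomes
    # the local bridge, and the internal root path cost restarts at 0.
    return port_vec._replace(
        ext_cost=port_vec.ext_cost + port_cost,
        regional_root=local_bridge,
        int_cost=0,
    )


# BP2 on Switch B hears Switch A over a link of cost 10 (same region).
bp2 = root_path_vector(CistVector("A", 0, "A", 0, "A", "AP2", "BP2"), 10, True, "B")
# CP2 on Switch C hears Switch A over a link of cost 4 (another region).
cp2 = root_path_vector(CistVector("A", 0, "A", 0, "A", "AP1", "CP2"), 4, False, "C")
print(tuple(bp2))
print(tuple(cp2))
```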

2. Determining the MSTI priority vectors

The way to determine MSTI priority vectors is the same as the way to determine the CIST priority vectors except that:

•  An MSTI priority vector only contains the regional root, internal root path cost, designated bridge ID, designated port ID, and BPDU-receiving port ID, without the common root or external root path cost.

•  Message priority vectors are processed only if they are from the same region.

2.2.4  Determining MSTP Roles

This section describes how to calculate the CIST. In the network shown in Figure 3 , assume that the bridge priority of Switch A is higher than that of Switch B; and the bridge priority of Switch B is higher than that of Switch C. The costs of the links in the network are 4, 5, and 10, as shown in the figure. Switch A and Switch B are in the same region; Switch C is in another region.

Figure 3  A network with MSTP employed

Table 1  lists the initial message priority vectors of the involved ports.

Table 1  Initial message priority vectors of the involved ports

Device      Port    Initial message priority vector
Switch A    AP1     {A : 0 : A : 0 : A : AP1 : 0}
Switch A    AP2     {A : 0 : A : 0 : A : AP2 : 0}
Switch B    BP1     {B : 0 : B : 0 : B : BP1 : 0}
Switch B    BP2     {B : 0 : B : 0 : B : BP2 : 0}
Switch C    CP1     {C : 0 : C : 0 : C : CP1 : 0}
Switch C    CP2     {C : 0 : C : 0 : C : CP2 : 0}

 

Note that in the initial state, the port priority vector and the message priority vector of a port are the same.

In the initial state, all the ports are designated ports. They propagate message priority vectors of their own, with the root bridge being the local device.

1. Role selection on Switch A

Ports AP1 and AP2 on Switch A receive packets from Switch B and Switch C and process the priority vectors carried in the packets. As the port priority vectors of AP1 and AP2 are superior to the message priority vectors received, AP1 and AP2 remain designated ports; Switch A acts as the common root bridge and the regional root of the region to which Switch A and Switch B belong. From then on, the ports send messages periodically, with the local device as the root bridge.

 

Note:

The port priority vector is compared with the message priority vector following these steps:

•  Compare the elements of the port priority vector with the corresponding elements of the message priority vector in order, from the first element to the last. At the first element that differs, the vector with the smaller value is superior. If each element in the port priority vector is equal to the corresponding one in the message priority vector, the two vectors are equal.

•  When the message priority vector is superior to the port priority vector or when the designated bridge MAC address and designated port ID of the message priority vector are equal to those in the port priority vector, the message priority vector replaces the port priority vector.
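The comparison steps in this note map naturally onto element-by-element tuple comparison. The sketch below assumes that each vector is a 7-element tuple in the order used in this document and that a smaller element is better; the helper names `is_superior` and `should_replace` are hypothetical, and the designated bridge ID stands in for the designated bridge MAC address.

```python
def is_superior(a, b):
    """True if vector a is superior to vector b (first differing element decides)."""
    return a < b  # Python compares tuples element by element, left to right


def should_replace(port_vec, msg_vec):
    """Decide whether the message priority vector replaces the port priority vector."""
    # Indexes 4 and 5 hold the designated bridge ID and designated port ID.
    same_sender = msg_vec[4:6] == port_vec[4:6]
    return is_superior(msg_vec, port_vec) or same_sender


# BP2 on Switch B: its own initial vector versus the message from AP2.
port_vec = ("B", 0, "B", 0, "B", "BP2", "BP2")
msg_vec = ("A", 0, "A", 0, "A", "AP2", "BP2")
print(should_replace(port_vec, msg_vec))

# AP2 on Switch A: the message from BP2 is inferior and comes from another sender.
print(should_replace(("A", 0, "A", 0, "A", "AP2", "AP2"),
                     ("B", 0, "B", 0, "B", "BP2", "AP2")))
```

The second condition lets a port accept worse information from the bridge it already listens to, which is how a degraded path is noticed.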

 

2. Role selection on Switch B

After port BP1 on Switch B receives packets from port CP1 on Switch C, Switch B compares the received message priority vector with the port priority vector of BP1. As the port priority vector is superior to the message priority vector, the role of BP1 remains unchanged.

Switch B operates as follows when receiving packets through port BP2 from port AP2 on Switch A:

(1)          Compares port priority vector of BP2 with the message priority vector received. As the message priority vector is superior to the port priority vector, the message priority vector {A:0:A:0:A:AP2:BP2} is used as the port priority vector of BP2.

(2)          Determines the root path priority vector of BP2. As Switch A and Switch B are in the same region, the root path priority vector of BP2 is {A : 0 : A : 10 : A : AP2 : BP2}.

(3)          Determines the root priority vector of Switch B. As the root path priority vector of port BP2, which is {A : 0 : A : 10 : A : AP2 : BP2} (as determined in step 2), is superior to the bridge priority vector of Switch B, the root priority vector of Switch B is {A : 0 : A : 10 : A : AP2 : BP2}.

(4)          Determines designated priority vectors. The designated priority vector of port BP1 is {A : 0 : A : 10 : B : BP1 : BP2}, and that of BP2 is {A : 0 : A : 10 : B : BP2 : BP2}.

(5)          Determines port roles. Through designated priority vector and port priority vector comparison, BP1 operates as a designated port and sends messages periodically with Switch A as the common root and the regional root; BP2 operates as the root port.

3. Role selection on Switch C

In the beginning, Switch C receives from port BP1 the message priority vector {B : 0 : B : 0 : B : BP1 : CP1} on port CP1 and from port AP1 the message priority vector {A : 0 : A : 0 : A : AP1 : CP2} on port CP2. As the two message priority vectors are superior to the port priority vectors of the receiving ports, the port priority vectors are updated. As Switch C is not in the region where Switch A and Switch B reside, the root path priority vector of port CP1 is updated as {B : 5 : C : 0 : B : BP1 : CP1}, and that of port CP2 is updated as {A : 4 : C : 0 : A : AP1 : CP2}. Accordingly, the root priority vector is {A : 4 : C : 0 : A : AP1 : CP2}, and the designated priority vectors of port CP1 and CP2 are {A : 4 : C : 0 : C : CP1 : CP2} and {A : 4 : C : 0 : C : CP2 : CP2}. As a result, port CP1 is selected as a designated port, and port CP2 as the root port.

When port CP1 receives the updated message priority vector {A : 0 : A : 10 : B : BP1 : CP1}, it replaces the original port priority vector with the message priority vector because the latter is superior and calculates that the root path priority vector is {A : 5 : C : 0 : B : BP1 : CP1}. As the message priority vector received on CP2 does not change, the root path priority vector of CP2 remains {A : 4 : C : 0 : A : AP1 : CP2}. As the root path priority vector of CP2 is superior to that of CP1, the root priority vector remains {A : 4 : C : 0 : A : AP1 : CP2}. The designated priority vectors of CP1 and CP2 are {A : 4 : C : 0 : C : CP1 : CP2} and {A : 4 : C : 0 : C : CP2 : CP2}. The port priority vector of CP1 is now superior to its designated priority vector, but the root priority vector is not derived from the root path priority vector of CP1. Therefore, CP1 becomes an alternate port and CP2 remains the root port.
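The two rounds of role selection on Switch C can be reproduced with a short sketch that applies the rules from section 2.2.2. It is a simplified model under stated assumptions: vectors are plain tuples, bridge and port IDs are strings whose alphabetical order mirrors their priority, master ports are not modeled, and the helper names `root_path_vector` and `select_roles` are hypothetical.

```python
def root_path_vector(vec, cost, same_region, local):
    root, ext, rroot, internal, dbridge, dport, rx = vec
    if same_region:
        return (root, ext, rroot, internal + cost, dbridge, dport, rx)
    return (root, ext + cost, local, 0, dbridge, dport, rx)


def select_roles(local, ports):
    """ports maps port name -> (port priority vector, path cost, same_region)."""
    bridge_vec = (local, 0, local, 0, local, "", "")
    root_paths = {
        name: root_path_vector(vec, cost, same, local)
        for name, (vec, cost, same) in ports.items()
    }
    # The root priority vector is the best of the bridge priority vector and
    # the root path vectors whose designated bridge is not the local bridge.
    candidates = [v for v in root_paths.values() if v[4] != local]
    root_vec = min([bridge_vec] + candidates)
    roles = {}
    for name, (vec, _cost, _same) in ports.items():
        designated_vec = root_vec[:4] + (local, name, root_vec[6])
        if root_vec != bridge_vec and root_paths[name] == root_vec:
            roles[name] = "root"
        elif designated_vec <= vec:
            roles[name] = "designated"
        elif vec[4] == local:
            roles[name] = "backup"
        else:
            roles[name] = "alternate"
    return roles


# Round 1: CP1 still hears Switch B claiming to be the root.
first = select_roles("C", {
    "CP1": (("B", 0, "B", 0, "B", "BP1", "CP1"), 5, False),
    "CP2": (("A", 0, "A", 0, "A", "AP1", "CP2"), 4, False),
})
# Round 2: Switch B now advertises Switch A as the common root.
second = select_roles("C", {
    "CP1": (("A", 0, "A", 10, "B", "BP1", "CP1"), 5, False),
    "CP2": (("A", 0, "A", 0, "A", "AP1", "CP2"), 4, False),
})
print(first)
print(second)
```

In this model CP1 is a designated port in the first round and an alternate port in the second, while CP2 is the root port in both rounds, matching the walk-through above.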

2.2.5  Calculation Result

As the role of each device and port has been determined, the whole tree topology is established. The traffic forwarding path is as shown in Figure 4 .

Figure 4  Resulting traffic forwarding path

MSTP Implementation in Comware

3.1  MSTP Work Modes

MSTP and RSTP are mutually compatible and thus can recognize each other’s BPDUs. STP, however, is unable to recognize MSTP packets. For hybrid networking with legacy STP devices and for full interoperability with RSTP-enabled devices, MSTP supports three work modes: STP-compatible mode, RSTP mode, and MSTP mode.

•  In STP-compatible mode, all ports of the device send out STP BPDUs.

•  In RSTP mode, all ports of the device send out RSTP BPDUs. When the device detects that it is connected to a legacy STP device, the connecting port automatically migrates to the STP-compatible mode.

•  In MSTP mode, all ports of the device send out MSTP BPDUs. When the device detects that it is connected to a legacy STP device, the connecting port automatically migrates to the STP-compatible mode.

In a switched network, when a port on the device running MSTP (or RSTP) is connected to a device running STP, this port automatically migrates to the STP-compatible mode. However, when the device running STP is removed, the port cannot automatically migrate back to the MSTP (or RSTP) mode and will remain working in STP-compatible mode. In this case, you can perform an mCheck operation to force the port to migrate to the MSTP (or RSTP) mode.

You can create multiple instances on devices operating in STP-compatible mode or RSTP mode. In this case, the state of a port in an MSTI is the same as that in the CIST. To avoid excessive CPU utilization, do not create multiple instances on devices that operate in STP-compatible mode or RSTP mode.

3.2  Calculating the Default Path Cost

The MSTP implementation in Comware supports the default path cost calculation methods defined in IEEE 802.1D-1998, IEEE 802.1T, and the Comware-proprietary standard.

For how the default path cost is calculated in IEEE 802.1D-1998 and IEEE 802.1T, refer to the two standards. This section describes the Comware-proprietary default path cost calculation method and the extensions to the IEEE 802.1D-1998 and IEEE 802.1T standards.

(1)          Extension to the IEEE 802.1D-1998 standard

In IEEE 802.1D-1998, default path cost calculation for an aggregate link is the same as that for a single link. It does not take into account the number of links in the aggregation group.

(2)          Extension to the IEEE 802.1T standard

In IEEE 802.1T, the default path cost is determined by this expression: 20,000,000,000/Link Speed (Kbps), where, for a link aggregation group, Link Speed is the sum of the speeds of the selected ports in the aggregation group.

(3)          Comware-proprietary standard

The Comware-proprietary standard for determining the default path cost is outlined as follows:

The speed of a link aggregation group is the sum of the rates of the unblocked ports in the link aggregation group.
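The IEEE 802.1T expression in item (2) can be worked through with a small calculation. This is a sketch under assumptions: the function names are hypothetical, the result is truncated to an integer, and any upper or lower bounds a device may apply to the result are not modeled.

```python
def path_cost_8021t(link_speed_kbps):
    """Default path cost per the expression 20,000,000,000 / Link Speed (Kbps)."""
    return 20_000_000_000 // link_speed_kbps


def aggregate_path_cost(selected_port_speeds_kbps):
    """For an aggregation group, Link Speed is the sum of the selected ports' speeds."""
    return path_cost_8021t(sum(selected_port_speeds_kbps))


print(path_cost_8021t(100_000))                     # 100 Mbps link
print(path_cost_8021t(1_000_000))                   # 1 Gbps link
print(aggregate_path_cost([1_000_000, 1_000_000]))  # two selected 1 Gbps ports
```

With these inputs the sketch yields 200000, 20000, and 10000, so adding a second selected port of the same speed halves the path cost of the aggregate link.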

3.3  Timeout Time Factor

The protocol specifies that the timeout time of a received STP PDU on a port (referred to as rcvdinfowhile) be less than or equal to three times the hello time. If no new STP PDU is received before rcvdinfowhile expires, the port initiates STP calculation. In practice, ports failing to receive STP PDUs before rcvdinfowhile expires is a common cause of topology instability. To address the problem, the timeout time factor is introduced. It allows you to adjust the timeout time for STP PDUs according to the performance of your network to improve network stability.

 

Note:

•  Timeout time = timeout factor × 3 × hello time.

•  Normally, we recommend that you set the timeout factor to 5, 6, or 7 in a stable network.
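The formula in this note can be evaluated for the recommended factors. The sketch assumes a hello time of 2 seconds, which is a commonly used default and not a value stated in this document; the function name `timeout_time` is hypothetical.

```python
def timeout_time(timeout_factor, hello_time_s=2):
    """Timeout time = timeout factor x 3 x hello time (in seconds)."""
    return timeout_factor * 3 * hello_time_s


for factor in (1, 5, 6, 7):
    print(factor, timeout_time(factor))
```

Under the assumed hello time, a factor of 1 gives 6 seconds, and the recommended factors 5, 6, and 7 give 30, 36, and 42 seconds.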

 

3.4  Specifying Root Bridge and Secondary Root Bridge

Normally, the root bridge is determined by STP. You can also specify the root bridge at the CLI.

Usually, when the root bridge of an instance fails or is shut down, a secondary root bridge (if configured) becomes the root bridge for the instance. However, if you specify a new root bridge for the instance, the secondary root bridge will not become the root bridge. With multiple secondary root bridges configured for STP, the one with the lowest MAC address becomes the root bridge when the existing root bridge fails. When both the root bridge and the configured secondary root bridges fail, STP re-elects a root bridge through calculation.

In MSTP, you can specify a switch as the root bridge or the secondary root bridge of a specified spanning tree instance. A switch can play different roles in different spanning tree instances. For example, it can be the root bridge in a spanning tree instance while a secondary root bridge in another spanning tree instance. In the same spanning tree instance, however, a switch cannot be both the root bridge and a secondary root bridge.

3.5  BPDU Guard

Normally, access ports of the devices operating on the access layer are directly connected to user terminals such as PCs or file servers. These ports are usually configured as edge ports to implement rapid transition and are not expected to receive configuration BPDUs. If an edge port receives configuration BPDUs, the switch reconfigures it as a non-edge port and starts spanning tree calculation. Attackers may exploit this behavior by sending deliberately fabricated BPDUs to edge ports, causing network topology instability. To prevent this type of attack, you can use the BPDU guard function.

With this function enabled, a switch shuts down edge ports that receive configuration BPDUs and then reports these events to the administrator. These ports can be restored only by the network administrator. We recommend that you use the function on switches configured with edge ports.
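The difference between an edge port with and without BPDU guard can be sketched as a tiny state model. The `EdgePort` class, its attributes, and the returned strings are hypothetical and serve only to illustrate the behavior described above.

```python
class EdgePort:
    def __init__(self, name, bpdu_guard):
        self.name = name
        self.bpdu_guard = bpdu_guard
        self.edge = True  # configured as an edge port
        self.up = True    # administratively and operationally up

    def receive_config_bpdu(self):
        """Handle a configuration BPDU arriving on this port."""
        if not self.up:
            return "ignored"
        if self.edge and self.bpdu_guard:
            # BPDU guard: shut the port down; only the administrator restores it.
            self.up = False
            return "shutdown"
        if self.edge:
            # No guard: the port loses its edge status and recalculation starts.
            self.edge = False
            return "recalculate"
        return "processed"

    def admin_restore(self):
        self.up = True


guarded = EdgePort("GigabitEthernet1/0/1", bpdu_guard=True)
unguarded = EdgePort("GigabitEthernet1/0/2", bpdu_guard=False)
print(guarded.receive_config_bpdu(), guarded.up)
print(unguarded.receive_config_bpdu(), unguarded.edge)
```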

3.6  Root Guard

The root bridge in a network may receive configuration BPDUs with higher priority because of configuration errors or network attacks. This can cause a new root bridge to be elected, causing network topology instability. The topology change may cause traffic traveling a high-speed link to be switched over to a low-speed link, resulting in congestion. To avoid the situation, you can use the root guard function.

Ports with this function enabled can only be kept as designated ports. Once a port of this type receives configuration BPDUs with higher priority, it changes to the listening state and stops forwarding any packets as if the port was disconnected. The port resumes the normal state if it does not receive any configuration BPDUs with higher priority for a specified period.

In MSTP, this function applies to all the instances.

3.7  Loop Guard

A switch maintains the state of each port by receiving and processing BPDUs from the upstream switch. These BPDUs may get lost because of network congestion or unidirectional link failures. If a switch does not receive BPDUs from the upstream switch for a certain period, the switch selects a new root port; the original root port becomes a designated port; and the blocked ports turn to the forwarding state. All these events result in loops in the network. The loop guard function suppresses loops caused by these events. With this function enabled, a root port enters the discarding state when its role changes and remains in that state without forwarding packets. Thus, loops are prevented.

In MSTP, this function is only applicable to root ports, alternate ports, and backup ports.

Following is an example for implementing loop guard.

Figure 5  An example for implementing loop guard function

Both Switch A and Switch B are Comware switches. Suppose Switch A is the root switch. On Switch B, GigabitEthernet 2/1 is the root port and GigabitEthernet 2/2 is an alternate port. If the root port (that is, GigabitEthernet 2/1) on Switch B fails to receive BPDUs from Switch A within a specific period, Switch B recalculates the roles of its ports, setting GigabitEthernet 2/1 to be a designated port in the forwarding state and GigabitEthernet 2/2 to be the root port in the forwarding state. Because both links between Switch A and Switch B can then forward packets, a broadcast packet can result in a broadcast storm. You can prevent this problem by employing loop guard on GigabitEthernet 2/1. With the loop guard function enabled, GigabitEthernet 2/1 turns to the Discarding state if it receives no BPDU from Switch A. At the same time, GigabitEthernet 2/2 transitions to the Forwarding state to forward packets. Thus, loops are prevented.
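The two outcomes in this example can be written down as a small lookup. The sketch below only restates the scenario for Switch B in Python; the function names are hypothetical and the model covers nothing beyond the two ports in the figure.

```python
def states_after_bpdu_timeout(loop_guard_on_ge21: bool) -> dict:
    """Port states on Switch B after GigabitEthernet 2/1 stops receiving BPDUs."""
    return {
        # With loop guard the former root port is held in discarding;
        # without it the port becomes a forwarding designated port.
        "GigabitEthernet 2/1": "discarding" if loop_guard_on_ge21 else "forwarding",
        # The alternate port takes over as root port and forwards in both cases.
        "GigabitEthernet 2/2": "forwarding",
    }


def forms_loop(states: dict) -> bool:
    """Both parallel links forwarding at once means a Layer 2 loop."""
    return all(state == "forwarding" for state in states.values())
```

Calling forms_loop on each result shows that only the unguarded case leaves both links between Switch A and Switch B forwarding.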

3.8  TC BPDU Attack Guard

When receiving TC-BPDUs (BPDUs used to notify topology changes), the switch flushes its forwarding address entries. If someone forges TC-BPDUs to attack the switch, the switch will receive a large number of TC-BPDUs within a short time and be kept busy flushing forwarding address entries. This affects network stability.

With the TC-BPDU guard function, you can set the maximum number of immediate forwarding address entry flushes that the switch can perform within 10 seconds after receiving the first TC-BPDU. For TC-BPDUs received in excess of the limit, the switch performs forwarding address entry flush only when the 10-second timer expires. This prevents frequent deletion of forwarding address entries.
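The limiting logic can be sketched as a simple windowed counter. This is an illustrative model only: the class TcGuard and its members are hypothetical, and it assumes that all TC-BPDUs in excess of the limit are covered by a single flush when the 10-second timer expires.

```python
class TcGuard:
    """Hypothetical model of TC-BPDU guard; not the Comware implementation."""

    WINDOW = 10.0  # seconds, counted from the first TC-BPDU

    def __init__(self, max_flushes: int):
        self.max_flushes = max_flushes  # configured limit of immediate flushes
        self.window_start = None
        self.flushes_in_window = 0
        self.deferred = False
        self.flush_log = []  # times at which a flush was performed

    def _expire(self, now: float) -> None:
        if self.window_start is not None and now - self.window_start >= self.WINDOW:
            if self.deferred:
                # Assumption: one flush at timer expiry covers all excess TC-BPDUs.
                self.flush_log.append(self.window_start + self.WINDOW)
            self.window_start = None
            self.flushes_in_window = 0
            self.deferred = False

    def on_tc_bpdu(self, now: float) -> None:
        self._expire(now)
        if self.window_start is None:
            self.window_start = now  # first TC-BPDU starts the timer
        if self.flushes_in_window < self.max_flushes:
            self.flushes_in_window += 1
            self.flush_log.append(now)  # immediate flush
        else:
            self.deferred = True  # over the limit: wait for the timer

    def tick(self, now: float) -> None:
        self._expire(now)
```

With a limit of 2, four TC-BPDUs arriving in the first three seconds cause two immediate flushes and one further flush at the 10-second mark.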

3.9  Digest Snooping

According to IEEE 802.1s, two connected switches can communicate with each other in an MSTI only when their MST region-related configurations (that is, region name, revision level, and VLAN-to-MSTI mapping) are the same. With MSTP employed, interconnected switches determine whether or not they are in the same MST region by checking the configuration IDs in the BPDUs they exchange. The configuration ID comprises the region name, revision level, and configuration digest. The configuration digest is a 16-byte signature obtained from the VLAN-to-MSTI mapping by using the HMAC-MD5 algorithm.
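A digest of this kind can be computed with the Python standard library, as in the sketch below. Two details are assumptions that should be verified against the standard before relying on the output: the table layout (4096 two-byte entries, one per VLAN ID, with VLAN IDs 0 and 4095 encoded as 0) and the signature key value. The function name configuration_digest is hypothetical.

```python
import hashlib
import hmac

# Assumption: signature key as recalled from the IEEE standard; verify before use.
DIGEST_KEY = bytes.fromhex("13AC06A62E47FD51F95D2BA243CD0346")


def configuration_digest(vlan_to_msti: dict) -> bytes:
    """Return a 16-byte HMAC-MD5 digest of a VLAN-to-MSTI mapping."""
    table = bytearray()
    for vid in range(4096):
        # Assumed layout: VLAN IDs 0 and 4095 are reserved and encoded as 0;
        # VLANs without an explicit mapping belong to instance 0.
        msti = 0 if vid in (0, 4095) else vlan_to_msti.get(vid, 0)
        table += msti.to_bytes(2, "big")
    return hmac.new(DIGEST_KEY, bytes(table), hashlib.md5).digest()
```

Two switches with identical mappings produce identical digests under this scheme, while any change to the mapping changes the digest, which is what lets the digest stand in for the whole mapping table in a BPDU.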

In the MSTP implementations of different vendors, the way the configuration digest is calculated may differ. It is therefore possible that a Comware switch and a switch of another vendor are not considered to be in the same region even if they have the same region configuration. In this case, they can communicate with each other only in the CIST, not in an MSTI.

For the sake of compatibility, the digest snooping function was developed in Comware to enable switches that obtain the configuration digest in different ways to communicate with each other in an MSTI.

To enable a Comware switch to communicate with a directly connected switch of another vendor in an MSTI, you need to first make sure that the region configuration of the switches is the same and then enable the digest snooping function on the port connecting the Comware switch to the other vendor’s switch. With digest snooping enabled, the port regards the MSTP packets received from the other vendor’s switch as coming from the same region and stores the configuration digest carried in the packets. The stored configuration digest is then inserted into the packets sent through the port. When the other vendor’s switch receives the packets, it regards them as coming from its own region. In this way, the Comware switch and switches of other vendors are able to communicate with each other in an MSTI.

 

  Caution:

•      The digest snooping function requires that the interconnected switches involved have the same region configuration. Otherwise, broadcast storms may occur due to different VLAN-to-instance mappings.

•      To enable a Comware switch to communicate with switches of other vendors, you need to enable the digest snooping function on each port directly connecting the Comware switch to a switch of another vendor. Note that you cannot enable the digest snooping function on edge ports.

•      When the digest snooping function is enabled, do not modify the region configuration of the Comware switch or of the switches of other vendors directly connected to it. To modify the region configuration, disable digest snooping first to prevent possible broadcast storms caused by inconsistent VLAN-to-instance mappings.

•      The digest snooping function is not needed in regions that contain only Comware switches.

 

Following is an example for implementing digest snooping.

Figure 6  An example for implementing digest snooping

Switch A and Switch B are Comware switches, while Switch C is a switch of another vendor that adopts a non-standard algorithm to calculate the configuration digest. All three switches are MSTP-enabled and have the same region configuration.

To enable both Switch A and Switch B to communicate with Switch C in the region, you need to enable the digest snooping function on GigabitEthernet 1/1 of Switch A and GigabitEthernet 1/2 of Switch B. As Switch A and Switch B are Comware switches, the function is unnecessary on the ports that connect them.

3.10  No Agreement Check

As defined in standard MSTP, a designated port transitions rapidly to the forwarding state when it receives a packet carrying the agreement flag from the downstream root port. A root port, however, sends packets carrying the agreement flag only after it receives a packet carrying the agreement flag from the upstream designated port. This can cause rapid transition to fail on the designated port of an upstream switch that does not send packets with the agreement flag (an RSTP-enabled switch, for example) when that switch is connected to a downstream MSTP-enabled Comware switch. Because the upstream switch never sends packets with the agreement flag, the downstream switch never replies with one, and the designated port on the upstream switch cannot receive the acknowledgement it is waiting for. You can solve this problem by enabling the no agreement check function on the port that uplinks your Comware switch to an RSTP-enabled switch or to another vendor’s switch that uses a different implementation.

Following is an example for implementing no agreement check.

Figure 7  An example for implementing no agreement check

Switch A is the root switch with RSTP enabled. GigabitEthernet 2/1 on Switch B is the root port. To make rapid transition available on GigabitEthernet 1/1 of Switch A, you need to enable no agreement check on GigabitEthernet 2/1 of Switch B.
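The dependency between the two agreement flags reduces to a small truth table, sketched below in Python. The function name and parameters are hypothetical, and the model ignores everything except whether each side sends the agreement flag.

```python
def designated_port_transitions_rapidly(upstream_sends_agreement: bool,
                                        no_agreement_check: bool) -> bool:
    """Return True if the upstream designated port can transition rapidly."""
    # Standard behavior: the downstream root port replies with the agreement
    # flag only after receiving the agreement flag from upstream.
    # With no agreement check enabled, the root port replies without waiting.
    downstream_sends_agreement = upstream_sends_agreement or no_agreement_check
    # The designated port transitions rapidly once it receives that reply.
    return downstream_sends_agreement
```

For the scenario in the figure, Switch A never sends the agreement flag, so rapid transition on GigabitEthernet 1/1 depends entirely on no agreement check being enabled on Switch B.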

3.11  Standard MSTP Packet Compatibility

This function enables Comware switches to communicate with devices adopting the standard MSTP protocol packet format. It ensures that spanning trees can be determined correctly in a network containing both devices adopting the standard protocol packet format and devices adopting proprietary protocol packet formats.

By default, a Comware device can recognize the format of each received MSTP packet and sends packets in the format of the received MSTP packet. You can also specify the packet format to have the device receive/send packets that are only of the specific format. A Comware device can communicate with other devices in the CIST even if it operates in RSTP-/STP-compatible mode.

3.12  VLAN Ignore Function

Normally, on switches with MSTP enabled, each VLAN is mapped to an MSTI and the forwarding states of the ports in the VLAN are determined by MSTP. In a network with a relatively complex topology, parts of a VLAN may be isolated from the rest of the VLAN by the spanning tree, causing transmission failures in the VLAN. The VLAN ignore function is designed to solve this problem. With this function enabled in a VLAN, each port in the VLAN remains in the forwarding state regardless of the state determined by MSTP. If you disable this function in the VLAN, the states of the ports are again determined by MSTP.

Application Scenarios

MSTP enables packets of different VLANs to be transmitted over different spanning trees, and thus implements per-VLAN load balancing and link redundancy.

 

Figure 8  Network diagram for an MSTP implementation

As shown in the above figure, Switch A and Switch B operate on the distribution layer. Switch C and Switch D operate on the access layer.

To achieve proper load balancing, you can perform configurations on the devices to:

•      Assign all devices to the same MST region.

•      Forward packets of VLAN 10 over MSTI 1, whose root bridge is Switch A.

•      Forward packets of VLAN 20 over MSTI 2, whose root bridge is Switch B.

•      Forward packets of VLAN 30 over MSTI 3, whose root bridge is Switch C.

•      Forward packets of VLAN 40 over MSTI 4, whose root bridge is Switch D.

After MSTP calculation is completed, packets of different VLANs are forwarded over different paths, as shown in Figure 9. This forwarding topology not only balances traffic across links but also achieves redundancy, decreasing data loss by providing a backup link for each VLAN.

Figure 9  Traffic forwarding topology
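The plan for this scenario can be captured as two lookup tables, sketched below in Python. The names VLAN_TO_MSTI, MSTI_ROOT, and root_bridge_for_vlan are hypothetical; the sketch describes the intended mapping only and is not device configuration.

```python
# VLAN-to-MSTI mapping shared by every switch in the MST region.
VLAN_TO_MSTI = {10: 1, 20: 2, 30: 3, 40: 4}

# Root bridge chosen for each MSTI in this scenario.
MSTI_ROOT = {1: "Switch A", 2: "Switch B", 3: "Switch C", 4: "Switch D"}


def root_bridge_for_vlan(vlan_id: int):
    """Return the root bridge of the spanning tree that carries `vlan_id`."""
    # VLANs without an explicit mapping fall into instance 0 (the CIST),
    # whose root is not specified in this scenario, so None is returned.
    msti = VLAN_TO_MSTI.get(vlan_id, 0)
    return MSTI_ROOT.get(msti)
```

Because each VLAN resolves to a different root bridge, the four spanning trees block different links, which is what spreads the traffic across the redundant paths.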

Summary

MSTP overcomes the drawbacks of STP and RSTP. It achieves fast convergence and enables packets of different VLANs to travel separate paths, allowing for a better load balancing mechanism for redundant links. MSTP is flexible and suitable for complex networks. Moreover, it is easy to configure. In the simplest cases, you only need to enable MSTP for it to operate normally; if needed, you can select a path in a VLAN for traffic transmission by configuring bridge priority, region settings, and port path cost.

References

IEEE 802.1D, Spanning Tree Protocol

IEEE 802.1w, Rapid Spanning Tree Protocol

IEEE 802.1s, Multiple Spanning Tree Protocol

 

 

Copyright ©2008 Hangzhou H3C Technologies Co., Ltd. All rights reserved.

No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of Hangzhou H3C Technologies Co., Ltd.

The information in this document is subject to change without notice.