VRRP Troubleshooting Manual

1.1  VRRP Overview

Virtual Router Redundancy Protocol (VRRP) combines a group of routers on a LAN (one master and one or more backups) into a virtual router called a VRRP group. By using an election mechanism, VRRP ensures that only the master forwards packets and that a backup takes over automatically if the master fails.

As a fault-tolerant protocol, VRRP improves network reliability and availability.

1.2  Problems

1.2.1   Problem 1: Multiple Masters Are Present in a VRRP Group

I. Symptoms

Hosts on a LAN cannot communicate with the external network. When you view the state of each router in the VRRP group by using the display vrrp command or through SNMP, multiple VRRP routers are in the master state.

II. Problem location

Follow these steps to locate the problem:

1)         Check if the VRRP configurations on the routers are consistent.

In a VRRP group, the routers must have the same VRRP configurations, including virtual IP address, advertisement interval, authentication mode and authentication key.

2)         Check the interoperability of ports.

Check whether the interconnected ports are in the up state. Check the configurations of the interconnected ports: if the ports are trunk ports or hybrid ports, check whether their PVIDs are consistent, whether the VLAN to which the VRRP group belongs is permitted on the ports, and whether the 802.1x protocol is configured on the ports. Check whether the ports are blocked because of the STP, RRPP, Smart-link, or LACP protocol. Use the display interface command to view whether a large number of error packets are present.

3)         Check the transmission of VRRP packets

Enable VRRP debugging to check whether packets are transmitted normally. If the VRRP packet debugging information is not displayed, enable IP packet debugging to view the information.

4)         Check the CPU usage

Use the display cpu-usage command to view whether the CPU usage is high on the service boards and main boards where VRRP protocol packets are transmitted.

You can also use the display interface command to view the traffic on the ports and check whether there is a broadcast storm on the network. If so, VRRP packets may fail to reach the CPU, and the state of the VRRP group becomes abnormal.
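The consistency check in step 1) can be sketched as follows. This is a minimal illustration, assuming the relevant settings have already been collected from each router (for example, from the display vrrp output) into dictionaries; the key names and sample values are hypothetical and are not part of any device output format.

```python
from typing import Dict, Optional

# Settings that must be identical on every router in a VRRP group,
# as listed in step 1). The key names are hypothetical.
REQUIRED_MATCH = ("virtual_ip", "advert_interval", "auth_mode", "auth_key")


def find_mismatches(
    configs: Dict[str, Dict[str, str]]
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return {setting: {router: value}} for every setting that differs."""
    mismatches = {}
    for setting in REQUIRED_MATCH:
        values = {router: cfg.get(setting) for router, cfg in configs.items()}
        if len(set(values.values())) > 1:
            mismatches[setting] = values
    return mismatches


if __name__ == "__main__":
    # Hypothetical values for two routers in the same group.
    group = {
        "RouterA": {"virtual_ip": "192.168.1.254", "advert_interval": "1",
                    "auth_mode": "simple", "auth_key": "abc"},
        "RouterB": {"virtual_ip": "192.168.1.254", "advert_interval": "3",
                    "auth_mode": "simple", "auth_key": "abc"},
    }
    for setting, values in find_mismatches(group).items():
        print(setting, values)
```

An empty result means that the four settings agree on all routers; any setting that is reported should be corrected as described in the solution.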

III. Solution

1)         VRRP configurations on routers are not consistent

Use the display this command in interface view to display the configuration that is currently in effect. Make sure that the configuration on one router is correct, and then modify the configuration on the other routers to make the configurations consistent.

2)         Ports are not interoperable

•  If the VLAN to which the VRRP group belongs is not permitted on the ports or the PVIDs are not consistent, modify the related configurations.

•  If the VRRP packets cannot be transmitted because the ports are blocked by protocols such as STP, modify the STP priority configurations on the ports to ensure that the interconnected ports can forward VRRP packets normally.

•  If there are a large number of error packets on the ports, check the links, for example, whether the optical attenuation values on both ends of a link are in the correct range. If not, change the fiber.

3)         The transmission of VRRP packets is abnormal

If the ports are interoperable but the VRRP packet debugging information still cannot be displayed, the VRRP packets are probably being discarded. If the cause is the rate limit on packets sent to the CPU, you can reduce the number of VRRP routers or increase the interval at which the master in the VRRP group sends VRRP advertisements.

4)         The CPU usage is high

Disable unnecessary services. If the CPU usage is still very high, contact our technical support personnel.

1.2.2  The State of VRRP Routers is Unstable

I. Symptoms

The state of one or more VRRP routers is unstable, and state switchover, such as backup->master->backup or master->initialized->backup, occurs frequently.

II. Problem location

Follow these steps to locate the problem:

1)         Check the link state of the flapping VRRP routers

•  Check the log information to see whether the link state changed (up or down) before the VRRP state changed. Because the link state influences the state of a VRRP router, an unstable link causes VRRP flapping.

•  Pay attention to the link state of the interface tracked by VRRP. Because the priority of a VRRP router changes with the link state of the tracked interface, an unstable tracked interface causes VRRP flapping.

2)         Check the interoperability of the network where the VRRP routers reside

On a VRRP router, ping the actual IP address of the interface at the peer end to check the connectivity between the VRRP routers. If the ping fails or packets are lost intermittently, check whether there is a loop on the link.

3)         Check the protocols like STP

Execute the display stp brief command and check whether the STP state is normal. Frequent STP state changes cause frequent VRRP state changes, so make sure that the STP state of the link on which the VRRP packets are transmitted is stable. Check whether the ports are blocked because of the STP, RRPP, Smart-link, or LACP protocol.

4)         Check the transmission of VRRP packets

Refer to 1.2.1  II. 3) for details. Observe the VRRP packet debugging information. On a congested network, backup routers may receive VRRP advertisements only after their timers expire. In that case, they consider the master to have failed, take over as masters, and send VRRP advertisements, which causes frequent master/backup switchover.

5)         Check the CPU usage

Refer to 1.2.1  II. 4) for details.

6)         Ask for further help

If the problem cannot be located, contact our technical support personnel.

III. Solution

•  If a link fails, check the networking.

•  If ports are blocked because of protocols like STP, refer to the corresponding configuration manuals to modify the configuration, or disable the protocol as needed.

•  If the transmission of packets is abnormal, refer to 1.2.1  III. 3).

•  On a congested network, increase the value of the VRRP preemption delay timer of the backups.

•  If the CPU usage is high, refer to 1.2.1  III. 4).
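To see why delayed advertisements on a congested network lead to the switchover described in this section, the timer on a backup can be sketched as follows. This is an illustration only, assuming the routers follow the IPv4 VRRP timer definitions of RFC 3768 (Skew_Time = (256 - Priority) / 256 seconds, Master_Down_Interval = 3 x Advertisement_Interval + Skew_Time); the function names are hypothetical.

```python
from typing import Iterable


def skew_time(priority: int) -> float:
    """Skew time in seconds, as defined in RFC 3768."""
    return (256 - priority) / 256.0


def master_down_interval(advert_interval: float, priority: int) -> float:
    """Time a backup waits without advertisements before it takes over."""
    return 3 * advert_interval + skew_time(priority)


def backup_times_out(advert_gaps: Iterable[float],
                     advert_interval: float,
                     priority: int) -> bool:
    """Return True if any gap between two received advertisements
    (in seconds) is longer than the master down interval."""
    limit = master_down_interval(advert_interval, priority)
    return any(gap > limit for gap in advert_gaps)


if __name__ == "__main__":
    # A backup with priority 100 and a 1-second advertisement interval.
    print(master_down_interval(1, 100))
    # Advertisements delayed by congestion: one gap of 4 seconds.
    print(backup_times_out([1.0, 1.2, 4.0], 1, 100))
```

Under these assumptions, a single advertisement gap longer than about 3.6 seconds is enough for a backup with priority 100 to declare the master down.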

1.2.3  The VRRP Group Failed to Perform Layer 3 Forwarding

I. Symptoms

The host which uses the virtual IP address of the VRRP group as its default gateway cannot access the external network.

II. Problem location

Follow these steps to locate the problem:

1)         Check whether the VRRP group state is stable

Check whether the VRRP group has one master and multiple backups. If there is more than one master in the VRRP group, refer to 1.2.1  Problem 1: Multiple Masters Are Present in a VRRP Group; if the state of the VRRP group is unstable, refer to 1.2.2  The State of VRRP Routers is Unstable.

2)         Check the interoperability of ports

Check the configuration information on the ports through which the traffic passes, determine whether 802.1x or ACL is enabled on the ports, and whether the VLAN to which the VRRP group belongs is permitted on the ports.

Use the display stp brief command to view if the STP state of the ports through which the traffic passes is normal and if the state is stable. Check whether the ports are blocked because of the STP, RRPP, Smart-link, or LACP protocol.

3)         Check the ARP entries and MAC address entries

When the network topology changes, the ARP entries and MAC address entries may not be updated promptly, and as a result the packets cannot be forwarded normally.

Check the following items one by one:

•  Use the display arp command to check whether the master has learned the ARP entries of the hosts on the internal network.

•  Check the ARP entries learned by the hosts on the internal network. The association between the virtual IP address and the MAC address learned by the hosts should be consistent with that configured for the VRRP group. That is, in real MAC mode, the virtual IP address is associated with the real MAC address of the interface on the master; in virtual MAC mode, the virtual IP address is associated with the virtual MAC address of the VRRP group.

•  Check the MAC address entries learned by the switch connecting to the routers. Make sure that the MAC address (real MAC mode) or virtual MAC address (virtual MAC mode) of the master is associated with the correct switch interface.

4)         Check Layer 3 interoperability

From each end of the VRRP service, ping the other end. If the ping operations succeed, Layer 3 forwarding is normal, and you need to check whether the service packets sent by the hosts are correct.

5)         Check route entries

If the ping operation fails, check whether the route entries of the VRRP routers are correct, whether a correct static route is configured, and whether the routing protocol works normally.

6)         Check the transmission of packets

Enable IP packet debugging, observe the fields like TTL, and determine whether the packets are discarded because the fields are filled incorrectly.
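For the ARP and MAC address checks in step 3), it helps to know which virtual MAC address to expect in virtual MAC mode. The sketch below assumes the standard VRRP virtual MAC format (00-00-5E-00-01-{VRID} for IPv4 as defined in RFC 3768, and 00-00-5E-00-02-{VRID} for IPv6) and prints it in the xxxx-xxxx-xxxx notation used in this manual. The function name is hypothetical.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Return the standard virtual MAC address of a VRRP group."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    # The fifth octet distinguishes IPv4 (01) from IPv6 (02) groups;
    # the last octet is the VRID.
    octets = [0x00, 0x00, 0x5E, 0x00, 0x02 if ipv6 else 0x01, vrid]
    hexstr = "".join(f"{octet:02x}" for octet in octets)
    return "-".join(hexstr[i:i + 4] for i in range(0, 12, 4))


if __name__ == "__main__":
    print(vrrp_virtual_mac(1))    # 0000-5e00-0101
    print(vrrp_virtual_mac(50))   # 0000-5e00-0132
```

In virtual MAC mode, the ARP entry for the virtual IP address on a host and the MAC address entry on the switch can be compared against this value.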

III. Solution

1)         VRRP status is unstable

Refer to the related parts in 1.2.1  Problem 1: Multiple Masters Are Present in a VRRP Group and 1.2.2  The State of VRRP Routers is Unstable.

2)         Incorrect port configuration or STP configuration

Modify the PVID configuration on the trunk or hybrid ports, delete useless ACL rules and 802.1x configurations, and modify STP priority configurations to ensure the Layer 2 interoperability.

3)         Route problem

Correct the configurations of the static route or dynamic routing protocol.

4)         ARP learning problem

Refer to ARP Troubleshooting.

5)         The transmission of packets is abnormal.

Check whether the service packets sent are correct. If the packets are correct, contact our technical support personnel.

1.2.4  The Virtual IP Address of the VRRP Group Cannot Be Pinged Successfully

I. Symptoms

Only one master exists in the VRRP group, but the host cannot successfully ping the virtual IP address of the VRRP group.

II. Problem location

Follow these steps to locate the problem:

1)         Check whether the host can successfully ping the actual IP address of the interface on the master.

2)         Execute the display ip routing-table command and check whether the host route corresponding to the virtual IP address is present.

III. Solution

1)         If the actual IP address of the interface on the master cannot be pinged successfully, check whether the link is connected correctly and whether the transmission of packets is normal. For details, refer to 1.2.1  Problem 1: Multiple Masters Are Present in a VRRP Group.

2)         Check the routing table for the route that corresponds to the virtual IP address of the VRRP group. If the route is not present, add it by using a static route or a dynamic routing protocol. Then check whether the corresponding host route is present. The absence of the host route indicates that the interaction between VRRP and the routing module has failed; in this case, contact our technical support personnel.

<Sysname> display ip routing-table

Routing Tables: Public
         Destinations : 5        Routes : 5

Destination/Mask    Proto  Pre  Cost         NextHop         Interface

127.0.0.0/8         Direct 0    0            127.0.0.1       InLoop0
127.0.0.1/32        Direct 0    0            127.0.0.1       InLoop0
172.1.0.0/16        Static 60   0            192.168.1.2     GE0/0
192.168.1.0/24      Direct 0    0            192.168.1.1     GE0/0
192.168.1.1/32      Direct 0    0            127.0.0.1       InLoop0
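The host route check can also be done on a saved copy of the display ip routing-table output. The following sketch assumes the column layout shown in the sample output above; the function name is hypothetical, and the addresses passed in are for illustration only.

```python
SAMPLE_OUTPUT = """\
Destination/Mask    Proto  Pre  Cost         NextHop         Interface
127.0.0.0/8         Direct 0    0            127.0.0.1       InLoop0
127.0.0.1/32        Direct 0    0            127.0.0.1       InLoop0
172.1.0.0/16        Static 60   0            192.168.1.2     GE0/0
192.168.1.0/24      Direct 0    0            192.168.1.1     GE0/0
192.168.1.1/32      Direct 0    0            127.0.0.1       InLoop0
"""


def has_host_route(table_text: str, ip_address: str) -> bool:
    """Return True if the table contains a /32 route for ip_address."""
    wanted = f"{ip_address}/32"
    for line in table_text.splitlines():
        fields = line.split()
        # The first column of each route line is Destination/Mask.
        if fields and fields[0] == wanted:
            return True
    return False


if __name__ == "__main__":
    print(has_host_route(SAMPLE_OUTPUT, "192.168.1.1"))
    print(has_host_route(SAMPLE_OUTPUT, "192.168.1.254"))
```

Pass the virtual IP address of the VRRP group as the second argument; a False result corresponds to the missing host route described in step 2).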

1.2.5  VRRP Tracking Fails

I. Symptoms

VRRP is configured to track an uplink. However, when the uplink goes down, no router with a higher priority in the VRRP group becomes the master; or the priority of the master is reduced even though the uplink is up.

II. Problem location

Follow these steps to locate the problem:

1)         Execute the display vrrp verbose command to check whether the priority of the master is reduced by a specified value.

2)         Check the VRRP configurations. If VRRP is configured to track a track object instead of an interface (the detection protocols currently supported are BFD and NQA), check the following: when the master monitors an uplink, the value by which the priority decreases must be specified by using the vrrp vrid track reduced command; when a backup monitors the master, the backup must be configured to work in switchover mode by using the vrrp vrid track switchover command.

3)         If VRRP is configured to monitor an interface and the VRRP priority is always reduced, check the protocol state of the tracked interface.

III. Solution

1)         If the priority of the master is reduced but no backup in the VRRP group takes over as the master, the priority configuration is incorrect. Increase the priority reduction value so that, after the reduction, the priority of the master is lower than that of the backup. If the track object is misconfigured, modify the configuration.

2)         The protocol to be tracked depends on the type of the monitored interface, so make sure that VRRP monitors the correct protocol. For an upper layer protocol, IPv4-based VRRP should monitor IPv4 protocols, and IPv6-based VRRP should monitor IPv6 protocols. If VRRP is configured to track a track object, check the state of the track object.
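The priority check in solution 1) is simple arithmetic and can be sketched as follows. The function names and sample priorities are hypothetical, and the lower bound of 1 for a reduced priority is an assumption made for this sketch, not documented device behavior.

```python
def effective_priority(configured: int, reduction: int, tracked_up: bool) -> int:
    """Priority of the master, taking the tracked uplink into account."""
    if tracked_up:
        return configured
    # Assumption: the reduced priority never drops below 1.
    return max(configured - reduction, 1)


def min_reduction(master_priority: int, backup_priority: int) -> int:
    """Smallest reduction that makes the master's priority strictly
    lower than the backup's priority."""
    return max(master_priority - backup_priority + 1, 0)


def switchover_expected(master_priority: int, backup_priority: int,
                        reduction: int) -> bool:
    """True if the backup outranks the master once the uplink is down."""
    reduced = effective_priority(master_priority, reduction, tracked_up=False)
    return reduced < backup_priority


if __name__ == "__main__":
    # Master 110, backup 100: a reduction of 10 is not enough.
    print(switchover_expected(110, 100, 10))
    print(min_reduction(110, 100))
```

With a master priority of 110 and a backup priority of 100, a reduction of 10 leaves both routers at 100, so the reduction has to be at least 11 for the master's priority to fall below the backup's.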

1.3  Troubleshooting Procedures

1.3.1  Troubleshooting Flow

 

Figure 1-1 VRRP troubleshooting flow

1.3.2  Troubleshooting Procedures

I. Check whether the VRRP configurations are correct

Execute the display this command in interface view and check whether the configurations on the VRRP routers are consistent and whether the priority settings (including the configured priority and the priority reduction value for the tracked interface) are reasonable. VRRP requires configuration consistency among the routers in a VRRP group; that is, the virtual IP address, advertisement interval, authentication mode, and authentication key must be the same on these routers. In addition, the number of virtual MAC addresses that can be delivered is limited. If the number of VRRP groups on a device exceeds this limit, some virtual MAC addresses cannot be delivered, and the corresponding VRRP groups return to the initialize state.

II. Check the interoperability of the link

Check configurations on the ports, including VLAN configuration, 802.1x, STP state, and so on. Ping the actual IP address of the VRRP interface to see if you can ping it successfully. Check the configurations on the interconnected ports: if the ports are trunk ports or hybrid ports, check whether their PVIDs are consistent, whether the VLAN to which the VRRP group belongs is permitted on the ports, and whether the 802.1x protocol is configured on the ports. Execute the display stp brief command and observe if the STP state is normal. Check whether the ports are blocked because of the STP, RRPP, Smart-link, or LACP protocol.

III. Check the transmission of VRRP packets

Enable VRRP packet debugging and check whether the packets are transmitted normally on the specified interface.

Enable Ethernet packet debugging by using the debugging ethernet packet command and IP packet debugging by using the debugging ip packet command. Then check whether the destination MAC address of the Ethernet frames is 0100-5e00-0012, which is the multicast MAC address used by IPv4 VRRP advertisements, and whether the IP protocol number of the VRRP packets is 112. If packets do not reach the ingress interface, check the peer end.
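The destination MAC address 0100-5e00-0012 follows from the standard mapping of an IPv4 multicast address to an Ethernet address: the prefix 01-00-5E followed by the low-order 23 bits of the group address, which for VRRP is 224.0.0.18. The sketch below reproduces this mapping and the two checks described above; the function names are hypothetical.

```python
import ipaddress

VRRP_GROUP_ADDRESS = "224.0.0.18"  # IPv4 multicast group used by VRRP
VRRP_IP_PROTOCOL = 112             # IP protocol number of VRRP


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast address to its Ethernet MAC address,
    formatted as xxxx-xxxx-xxxx."""
    address = ipaddress.IPv4Address(group)
    if not address.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Prefix 01-00-5E plus the low-order 23 bits of the group address.
    mac = (0x01005E << 24) | (int(address) & 0x7FFFFF)
    hexstr = f"{mac:012x}"
    return "-".join(hexstr[i:i + 4] for i in range(0, 12, 4))


def looks_like_vrrp(dst_mac: str, ip_protocol: int) -> bool:
    """Check the two fields mentioned in the text above."""
    return (dst_mac.lower() == multicast_mac(VRRP_GROUP_ADDRESS)
            and ip_protocol == VRRP_IP_PROTOCOL)


if __name__ == "__main__":
    print(multicast_mac(VRRP_GROUP_ADDRESS))  # 0100-5e00-0012
```

A captured packet whose destination MAC address or protocol number differs from these values is not an IPv4 VRRP advertisement.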

IV. Check ARP and route

Check the ARP entries and MAC address entries on the access switch (check the association between the MAC address, port, and VLAN). On each host, check whether the ARP entry corresponding to the virtual IP address is correct.

If the master router cannot be pinged, check whether the host route corresponding to the virtual IP address exists on the master router. If a packet forwarding problem occurs, check the forwarding route. If a route is missing or incorrect, use the route troubleshooting methods to check whether the routing protocol operates normally and whether the static route is configured correctly.

V. Check the CPU usage

Execute the display cpu-usage command to check whether the CPU usage of the device is high. Disable unnecessary services to reduce the CPU usage.

1.4  Troubleshooting Cases

1.4.1  Network Interruption Caused by VRRP Master and Backup Switchover

I. Network environment

Figure 1-2 Network interruption caused by VRRP master and backup switchover

II. Symptoms

As shown in Figure 1-2, 50 VRRP groups that use the virtual MAC mode are configured on both Router A and Router B. Router A is the master for the 50 VRRP groups, and Router B is a backup for the 50 VRRP groups. On Switch, the port associated with the virtual MAC addresses is Ethernet 1/0. Execute the shutdown command and then immediately the undo shutdown command on port Ethernet 1/0 to shut down the port and bring it up again. After that, some virtual MAC addresses correspond to port Ethernet 1/1 instead of Ethernet 1/0.

III. Troubleshooting procedures

Check the MAC address entries of Switch and find out that the virtual MAC address of the VRRP group corresponds to port Ethernet 1/1, which is the port connecting to Router B, the backup router, after the switchover.

•  One possible reason is that Router A failed to send gratuitous ARP messages. To check this, execute the debugging arp packet command to enable ARP packet debugging on Router A and repeat the switchover procedure. The debugging information shows that Router A has sent gratuitous ARP messages, so this reason is ruled out.

•  Another possible reason is that the MAC address entries of Switch are not updated. Configuring too many VRRP groups causes a large number of ARP packets to be sent, which exceeds the processing capability of Switch, so some ARP packets are not processed. Reduce the number of VRRP groups and repeat the shutdown and undo shutdown operations. The problem is solved.

1.4.2  Network Traffic Was Interrupted Twice When an Interface Was Down

I. Network environment

Figure 1-3 Network interruption caused by interface down

II. Symptoms

As shown in Figure 1-3, Router A and Router B form a VRRP group. Router A has a higher priority and functions as the master. Router A is configured to monitor the uplink interface Ethernet 1/1. OSPF runs on Router A, Router B, and Router C.

Ping Router C from Switch, and execute the shutdown command on interface Ethernet 1/1 of Router A.

<Switch> ping 100.100.100.100

    Reply from 100.100.100.100: bytes=56 Sequence=188 ttl=254 time=3 ms
    Reply from 100.100.100.100: bytes=56 Sequence=189 ttl=254 time=2 ms
    Reply from 100.100.100.100: bytes=56 Sequence=190 ttl=254 time=2 ms
    Request time out
    Request time out
    Reply from 100.100.100.100: bytes=56 Sequence=193 ttl=254 time=3 ms
    Request time out
    Reply from 100.100.100.100: bytes=56 Sequence=195 ttl=254 time=3 ms
    Reply from 100.100.100.100: bytes=56 Sequence=196 ttl=254 time=2 ms

When the status of interface Ethernet 1/1 changes from up to down, the priority of Router A is decreased and Router B becomes the master. Theoretically, the traffic should be interrupted only once, after the interface goes down and before the master/backup switchover is completed; however, the traffic was actually interrupted twice.
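The number of interruptions can be read directly from the ping output: each run of consecutive timeouts that is followed by a reply counts as one interruption. The sketch below counts these runs in a saved copy of the output, assuming the "Reply from" and "Request time out" line formats shown above; the function name is hypothetical.

```python
from typing import Iterable

PING_OUTPUT = """\
Reply from 100.100.100.100: bytes=56 Sequence=188 ttl=254 time=3 ms
Reply from 100.100.100.100: bytes=56 Sequence=189 ttl=254 time=2 ms
Reply from 100.100.100.100: bytes=56 Sequence=190 ttl=254 time=2 ms
Request time out
Request time out
Reply from 100.100.100.100: bytes=56 Sequence=193 ttl=254 time=3 ms
Request time out
Reply from 100.100.100.100: bytes=56 Sequence=195 ttl=254 time=3 ms
Reply from 100.100.100.100: bytes=56 Sequence=196 ttl=254 time=2 ms
"""


def count_interruptions(lines: Iterable[str]) -> int:
    """Count runs of consecutive timeouts in ping output."""
    interruptions = 0
    in_gap = False
    for line in lines:
        text = line.strip().lower()
        if text.startswith("request time out"):
            if not in_gap:
                interruptions += 1  # A new run of timeouts starts here.
                in_gap = True
        elif text.startswith("reply from"):
            in_gap = False
    return interruptions


if __name__ == "__main__":
    print(count_interruptions(PING_OUTPUT.splitlines()))  # 2
```

For the output above, the function returns 2: one interruption covering sequence numbers 191 and 192, and a second one at sequence number 194.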

III. Troubleshooting procedures

1)         Enable VRRP packet debugging and VRRP state debugging, and repeat the above operations. No VRRP state flapping occurs, and no problem is found during the whole process: the tracked interface goes down, the priority of the master is decreased, the master sends a VRRP advertisement, the backup, which now has the higher priority, quickly becomes the master, and the new master sends gratuitous ARP messages.

2)         Check the ARP entries on Switch, the egress port corresponding to the virtual IP address of the VRRP group, and the STP state of each port. No problem is found.

3)         Check the transmission of packets on Router C. The debugging information shows that packet receiving on Router C was interrupted only once. This interruption occurred because the link of interface Ethernet 1/1 on Router A failed, so the packets forwarded by Router A did not reach Router C. The other interruption must therefore have occurred in the return direction: after the switchover, the packets forwarded by Router B reached Router C, but the response packets of Router C did not reach Switch.

4)         Check the routing information on Router C. Because OSPF is running on Router C, check OSPF routing information.

<RouterC> display ospf routing

 

          OSPF Process 1 with Router ID 5.5.5.30
                   Routing Tables

 Routing for Network
 Destination        Cost     Type    NextHop         AdvRouter       Area
 100.100.100.100/32 0        Stub    100.100.100.100 5.5.5.30        0.0.0.0
 5.5.5.0/24         2        Transit 2.2.2.2         10.10.10.2      0.0.0.0
 5.5.5.0/24         2        Transit 10.10.10.2      10.10.10.2      0.0.0.0
 2.2.2.0/24         1        Transit 2.2.2.1         5.5.5.10        0.0.0.0
 10.10.10.0/24      1        Transit 10.10.10.1      5.5.5.30        0.0.0.0

 Total Nets: 5
 Intra Area: 5  Inter Area: 0  ASE: 0  NSSA: 0

From the OSPF routing information, you can see that there are two equal cost routes to the 5.5.5.0/24 segment, and the next-hop of one route is 2.2.2.2 (Router A), and that of the other route is 10.10.10.2 (Router B).
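Equal cost routes such as these can be picked out of a saved copy of the display ospf routing output by grouping the route lines by destination and cost. The sketch below assumes the six-column layout shown above; the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

OSPF_ROUTES = """\
 Destination        Cost     Type    NextHop         AdvRouter       Area
 100.100.100.100/32 0        Stub    100.100.100.100 5.5.5.30        0.0.0.0
 5.5.5.0/24         2        Transit 2.2.2.2         10.10.10.2      0.0.0.0
 5.5.5.0/24         2        Transit 10.10.10.2      10.10.10.2      0.0.0.0
 2.2.2.0/24         1        Transit 2.2.2.1         5.5.5.10        0.0.0.0
 10.10.10.0/24      1        Transit 10.10.10.1      5.5.5.30        0.0.0.0
"""


def equal_cost_routes(text: str) -> Dict[str, List[str]]:
    """Return {destination: [next hops]} for destinations that have
    more than one route with the same cost."""
    by_destination = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Route lines have six columns and a Destination/Mask first field.
        if len(fields) != 6 or "/" not in fields[0]:
            continue
        destination, cost, next_hop = fields[0], fields[1], fields[3]
        by_destination[(destination, cost)].append(next_hop)
    return {dest: hops
            for (dest, _cost), hops in by_destination.items()
            if len(hops) > 1}


if __name__ == "__main__":
    print(equal_cost_routes(OSPF_ROUTES))
```

For the output above, the only destination reported is 5.5.5.0/24, with next hops 2.2.2.2 and 10.10.10.2.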

Because there are two equal cost routes, Router C shares the load of the response packets between interfaces Ethernet 1/1 and Ethernet 1/2. After interface Ethernet 1/1 on Router A fails and the master/backup switchover is completed, both equal cost routes remain in the routing table for a period of time (OSPF has not yet deleted the route through Router A). During this period, Router C still forwards packets out of both interfaces, the packets sent toward Router A are lost, and the traffic is interrupted a second time.

The root cause of this problem is that OSPF takes time to detect the change in the two equal cost routes. To avoid the problem, you can configure a static route destined to Router B.

1.5  Troubleshooting Commands

1.5.1  VRRP Packet Debugging

•  debugging vrrp packet

•  debugging vrrp ipv6 packet

1.5.2  VRRP State Debugging

•  debugging vrrp state

•  debugging vrrp ipv6 state

Copyright ©2009 Hangzhou H3C Technologies Co., Ltd. All rights reserved.

No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of Hangzhou H3C Technologies Co., Ltd.

The information in this document is subject to change without notice.