H3C Super Controller Troubleshooting Guide-5W101

Contents

Introduction

General guidelines

Collecting diagnosis log messages

Contacting technical support

Troubleshooting login to the super controller

Troubleshooting the cluster nodes

Cluster node server hardware failure (the failure cannot be recovered and the node server must be replaced)

Symptom

Solution

Troubleshooting the tenant module

Failure to deploy the logical network topology (a message that "Cannot reach sites" is prompted)

Symptom

Solution

Troubleshooting the system module

Internal error prompted when you enter the log settings page

Symptom

Solution

Data inconsistency after the diagnosis log settings page is refreshed multiple times

Symptom

Solution

Out-of-order log messages

Symptom

Solution

Failure to export diagnosis logs

Symptom

Solution

Introduction

This document provides information about troubleshooting common problems with the H3C super controller.

General guidelines

To help identify the cause of the problem, collect system and configuration information, including:

· H3C super controller version and operating system version.

· Symptom, time of failure, and configuration.

· Network topology information, including the network diagram, port connections, and points of failure.

· Log messages and diagnostic information. For more information, see "Collecting diagnosis log messages."

· Steps you have taken and their effects.

Collecting diagnosis log messages

1. Enter the URL of the super controller in the address bar of a browser (for example, Chrome) to open the super controller login page.

The URL is in the format of https://controller_ip_address/suc/ui/.

2. On the login page, enter the username and password, and click Log in.

3. From the navigation pane, select System > Log > Diagnosis Log to enter the diagnosis log page, as shown in Figure 1.

4. Select a time range and click Export to export the diagnosis logs within the time range and save them locally.

If you do not select a time range, all diagnosis logs are exported.

Figure 1 Diagnosis log page
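
Before opening a browser, the URL format above can be checked from a shell. The following is a minimal sketch, assuming curl is available on the workstation. The function names and the example address 192.0.2.10 are illustrative and are not part of the product.

```shell
# Build the super controller login URL from the controller IP address,
# following the format https://controller_ip_address/suc/ui/.
suc_login_url() {
    printf 'https://%s/suc/ui/\n' "$1"
}

# Print the HTTP status code returned by the login page.
# -k skips certificate verification (assumption: the controller might use a
# self-signed certificate), -s hides the progress meter, -o discards the body.
suc_login_status() {
    curl -k -s -o /dev/null -w '%{http_code}\n' --max-time 10 "$(suc_login_url "$1")"
}
```

For example, run suc_login_status 192.0.2.10 and record the printed status code together with the other information you collect for H3C Support.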

Contacting technical support

If you cannot resolve a problem after using the troubleshooting procedures in this document, contact H3C Support.

The following is the contact information for H3C Support:

· Telephone number—400-810-0504.

· E-mail—service@h3c.com.

Troubleshooting login to the super controller

This section provides troubleshooting information for common super controller login problems.

Login to the super controller depends on the Web and AAA microservices. To log in successfully, make sure both microservices are running.

Login failure (a 404 error is prompted)

Symptom

When a user logs in to the super controller through a browser, the login page does not open, and a 404 error is prompted.

Solution

A possible reason is that the Web microservice has been deleted.

To resolve the problem:

1. Log in to the H3C Matrix GUI. From the navigation pane, select Deploy > Application to identify whether the Web microservice is installed.

○ If the Web microservice is installed, proceed with step 3.

○ If the Web microservice is not installed, proceed with the next step.

2. Re-install the Web microservice.

For more information, see H3C Super Controller Installation Guide.

3. If the problem persists, contact H3C Support.
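
The 404 symptom can also be confirmed from a shell. The sketch below maps an HTTP status code to a short hint. It assumes the code was obtained separately, for example with curl -k -s -o /dev/null -w '%{http_code}' against the login URL. The function name and the hint wording are illustrative only.

```shell
# Map an HTTP status code returned by the login URL to a troubleshooting hint.
# curl reports 000 when it receives no HTTP response at all.
login_hint() {
    case "$1" in
        404) echo "404: check whether the Web microservice is installed (Deploy > Application)" ;;
        2??|3??) echo "$1: the login page is being served" ;;
        000) echo "000: no HTTP response, check connectivity to the controller" ;;
        *) echo "$1: unexpected status, collect diagnosis logs" ;;
    esac
}
```

Only the 404 case corresponds to a procedure in this document. The other branches are generic placeholders.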

Login failure (a message that "Login timed out" is prompted)

Symptom

When a user logs in to the super controller through a browser, the user fails to log in after entering the correct username and password and clicking Log in. The system prompts a message that "Login timed out."

Solution

A possible reason is that the AAA microservice has been deleted.

To resolve the problem:

1. Log in to the H3C Matrix GUI. From the navigation pane, select Deploy > Application to identify whether the AAA microservice is installed.

○ If the AAA microservice is installed, proceed with step 3.

○ If the AAA microservice is not installed, proceed with the next step.

2. Re-install the AAA microservice.

For more information, see H3C Super Controller Installation Guide.

3. If the problem persists, contact H3C Support.

Login failure (a message that "The system is restoring the configuration. Please try again later..." is prompted)

Symptom

When a user logs in to the super controller through a browser, the user fails to log in after entering the correct username and password and clicking Log in. The system prompts a message that "The system is restoring the configuration. Please try again later..."

Solution

Possible reasons are:

· The AAA microservice has just been installed and is synchronizing the configuration. Therefore, the AAA microservice has not started to run properly.

· The cluster hosts are being rebooted.

To resolve the problem:

1. Log in to the super controller again after the system has restored the configuration. If login still fails, proceed with the next step.

2. If the problem persists, contact H3C Support.

Troubleshooting the cluster nodes

This section provides troubleshooting information for common cluster node problems.

Cluster node server hardware failure (the failure cannot be recovered and the node server must be replaced)

Symptom

The hardware failure of a node in the Matrix cluster cannot be recovered. The node server must be replaced.

Solution

A possible reason is that the hardware of a node in the Matrix cluster fails. As a result, the node server cannot operate properly, and cannot be recovered.

To resolve this problem:

1. On a master node server that is operating properly, manually execute the following script to release the IP addresses occupied by the containers on the failed node (matrix02 in this example).

[root@matrix01 ~]# sh /opt/matrix/k8s/disaster-recovery/recovery.sh matrix02

2. Replace the node server. Make sure the new node server has the same IP address, username, and password as the failed node.

3. Copy the /opt/matrix/app/install directory on the primary master node of the cluster to the corresponding directory on the new server.

4. Install the Matrix software package on the new node server. For more information, see H3C Matrix Containerized Application Deployment Platform Installation Guide.

5. Log in to the Matrix platform. On the Deploy > Cluster page, click the icon at the upper right corner of the failed node, and select Disable to disable the node. After the node is disabled, click Enable to enable the node. After the node is enabled, the node server replacement is complete.
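
The CLI parts of steps 1 and 3 can be prepared for review before anything is executed. The sketch below only prints the commands. It assumes root SSH access to the new server and uses scp for the copy, which is one possible tool that the procedure itself does not mandate. The function name and the example address are hypothetical.

```shell
# Print (do not run) the commands for steps 1 and 3 so they can be reviewed.
# $1: name of the failed node, $2: IP address of the replacement server.
print_recovery_commands() {
    # Step 1: release the IP addresses held by containers on the failed node.
    echo "sh /opt/matrix/k8s/disaster-recovery/recovery.sh $1"
    # Step 3: copy the install directory from the primary master node.
    # Assumption: /opt/matrix/app/ already exists on the new server.
    echo "scp -r /opt/matrix/app/install root@$2:/opt/matrix/app/"
}
```

For example, print_recovery_commands matrix02 192.0.2.12 prints both commands for the failed node matrix02 used in the example above.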

Troubleshooting the tenant module

This section provides troubleshooting information for common tenant module problems.

Failure to deploy the logical network topology (a message that "Cannot reach sites" is prompted)

Symptom

On the Tenant > Tenant Management > Logical Network page, a user fails to deploy the logical network, the icon of the logical resource becomes red, and the system prompts a message that "Cannot reach sites."

Solution

A possible reason is that the super controller cannot communicate with sites.

To resolve the problem:

1. Verify that the super controller can communicate with sites properly.

2. Re-deploy the logical network.

3. If the problem persists, contact H3C Support.
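
Step 1 can be approximated from the CLI of a cluster node. The following sketch assumes that ICMP echo is permitted between the super controller host and the sites, and that the site addresses are known to the operator. If ICMP is filtered, replace the probe with a check that suits the environment. The function names are illustrative.

```shell
# Probe one site address with three echo requests and a 2-second reply
# timeout (Linux iputils ping options).
probe_site() {
    ping -c 3 -W 2 "$1" >/dev/null 2>&1
}

# Report reachability for each site address passed as an argument.
# Returns non-zero if at least one site is unreachable.
check_sites() {
    rc=0
    for site in "$@"; do
        if probe_site "$site"; then
            echo "$site reachable"
        else
            echo "$site UNREACHABLE"
            rc=1
        fi
    done
    return "$rc"
}
```

A successful ping shows only basic IP connectivity. The controller might still be unable to reach a site for other reasons, so treat the result as a first filter.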

Troubleshooting the system module

This section provides troubleshooting information for common system module problems.

Internal error prompted when you enter the log settings page

Symptom

When a user enters the System > Settings > Log Settings page, the user cannot view or modify the settings, and an internal error is prompted.

Solution

A possible reason is that the db microservice operates improperly because the cluster hosts have rebooted multiple times.

To resolve the problem:

1. Log in to the H3C Matrix GUI. From the navigation pane, select Deploy > Backup & recovery. In the Backup history area, identify whether history backup files exist.

○ If history backup files do not exist, proceed with step 8.

○ If history backup files exist, proceed with the next step.

2. Log in to the CLI of any master node in the cluster. Execute the kubectl exec -n vcf-system $(kubectl get pod -n vcf-system | grep -im1 pxc | awk '{print $1}') service mysql status command to identify whether the database is running properly.

○ If the database status is displayed as running, the database is running properly. Proceed with step 8.

○ If the database status is displayed as stopped, the database is running improperly. Proceed with the next step.

3. Log in to the CLI of the three master nodes separately. For each master node, execute the rm -rf /opt/matrix/app/data/SuperController/db/ command to delete the database directory of the node.

4. Log in to the H3C Matrix GUI. From the navigation pane, select Deploy > Application. In the Operation column of the application list, click the icon for the db microservice to delete the db microservice.

5. Re-install the db microservice.

For more information, see H3C Super Controller Installation Guide.

6. From the navigation pane, select Deploy > Backup & recovery. In the Operation column of the backup history area, click the icon for the specified backup file. On the dialog box that opens, click the microservice list tab, select SuperController/db, and click OK to start data recovery. After data recovery is complete, log in to the super controller again.

7. Execute the kubectl delete pod -n super-controller $(kubectl get pod -n super-controller -o wide | egrep -vi 'aaa|default|nginx|name' | awk '{print $1}' | tr '\n' ' ') command on any master node to restart the microservices that depend on the db microservice.

8. If the problem persists, contact H3C Support.
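
The decision in step 2 can be scripted. The sketch below classifies the text printed by the status command. It assumes the output contains the word "running" or "stopped", as the step describes. The exact wording on a given release might differ, and the function name is illustrative.

```shell
# Classify the output of "service mysql status", read from stdin.
# Negative wording is checked first so that "not running" is not
# mistaken for "running".
classify_db_status() {
    status_text=$(cat)
    if printf '%s\n' "$status_text" | grep -qiE 'stopped|not running'; then
        echo "stopped"
    elif printf '%s\n' "$status_text" | grep -qi 'running'; then
        echo "running"
    else
        echo "unknown"
    fi
}

# Usage on a master node, piping the command from step 2 into the function:
#   kubectl exec -n vcf-system $(kubectl get pod -n vcf-system | grep -im1 pxc | awk '{print $1}') service mysql status | classify_db_status
```

A result of unknown means that the output matched neither pattern. In that case, read the raw output before you continue.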

Data inconsistency after the diagnosis log settings page is refreshed multiple times

Symptom

After a user sets the diagnosis log level on the System > Settings > Log Settings > Diagnosis Log page, the displayed log level values might change from one refresh to the next if the user refreshes the page multiple times.

Solution

Possible reasons are:

· The mq microservice operates improperly because the hosts have rebooted multiple times.

· The mq microservice operates improperly because the network between hosts is unreachable.

To resolve the problem:

1. Log in to the H3C Matrix GUI. From the navigation pane, select Deploy > Application. In the Operation column of the application list, click the icon for the mq microservice to delete the mq microservice.

2. Re-install the mq microservice.

For more information, see H3C Super Controller Installation Guide.

3. Log in to the CLI of any master node. Execute the kubectl delete pod -n super-controller $(kubectl get pod -n super-controller -o wide | egrep -vi 'aaa|default|nginx|name' | awk '{print $1}' | tr '\n' ' ') command to restart the microservices that depend on the mq microservice.

4. If the problem persists, contact H3C Support.
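
Because the restart command deletes every pod that its filter selects, it can help to preview the selection first. The sketch below reuses the same filter as a function that reads the pod listing from stdin. The function name is illustrative, and grep -E is used in place of egrep, which behaves the same way here.

```shell
# List the pod names that the restart command would delete, given
# "kubectl get pod -n super-controller -o wide" output on stdin.
# Lines containing aaa, default, nginx, or name (the NAME header row) are
# dropped case-insensitively, and the first column of the rest is kept.
select_pods_to_restart() {
    grep -Evi 'aaa|default|nginx|name' | awk '{print $1}' | tr '\n' ' '
}

# Usage on a master node:
#   kubectl get pod -n super-controller -o wide | select_pods_to_restart
```

The filter matches anywhere on a line, not only in the pod name column, so a pod whose row contains one of the excluded words in any column is also skipped.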

Out-of-order log messages

Symptom

In the operation logs, system logs, and exported diagnosis logs, the log messages are out of order.

Solution

A possible reason is that the system time has been modified. Logs are displayed in the order of the system time, so they appear out of order after the system time is modified.

To resolve the problem:

1. After modifying the system time to the correct time, do not modify the system time again. If the newly generated logs are still out of order, proceed with the next step.

2. If the problem persists, contact H3C Support.

Failure to export diagnosis logs

Symptom

When the diagnosis logs are exported, a message that the export failed appears behind the progress bar on the page, and the exported file cannot be opened.

Solution

A possible reason is that the directory mounted for logs has been deleted on the cluster nodes.

To resolve the problem:

1. Log in to the CLI of the cluster node where log export failed. Execute the ls -al /opt/matrix/app/data/SuperController/log/log-tmp command to identify whether the directory mounted for logs exists.

○ If the directory does not exist, proceed with the next step.

○ If the directory exists on all cluster nodes, proceed with step 3.

2. On the cluster node where the directory mounted for logs does not exist, execute the docker ps | grep log | grep super-controller | grep -v "POD" | awk '{cmd="docker stop "$1;system(cmd)}' command to restart the container. Wait 1 to 2 minutes. Then, log in to the super controller GUI and export the diagnosis logs again.

3. If the problem persists, contact H3C Support.
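
The directory check in step 1 can be wrapped in a small function so that the same test runs on each cluster node. This is a minimal sketch. The function name is illustrative, and the default path is the one given in the procedure.

```shell
# Report whether the directory mounted for logs exists on this node.
# Pass a different path as the first argument to override the default.
log_dir_state() {
    dir=${1:-/opt/matrix/app/data/SuperController/log/log-tmp}
    if [ -d "$dir" ]; then
        echo "present: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}
```

The function returns non-zero when the directory is missing, so a node that needs step 2 can be detected with a simple if statement.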
