Cluster high availability (HA) depends on shared storage and dynamic migration technologies to provide simple and efficient HA services for applications running on all cloud hosts in the cluster. It reduces service interruption caused by host hardware failure. Cluster HA is applicable to scenarios that require service continuity.
CVM virtualizes a group of hosts into a cluster that uses a shared resource pool. After you enable HA for the cluster, CVM monitors the running state of all hosts and cloud hosts in the cluster.
When a host fails, CVM migrates the cloud hosts on the host to available hosts in the cluster.
When a cloud host fails, CVM restarts the cloud host. If the restart succeeds, CVM does not migrate the cloud host. If the restart fails, CVM migrates the cloud host to another host and restarts it there.
When the network between a host and the shared storage fails, CVM migrates the cloud hosts on the host to available hosts in the cluster.
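The three failure cases above can be summarized as a small decision function. This is only an illustrative sketch: the `Failure` enum, the `ha_actions` function, and the action names are hypothetical and are not part of any CVM interface.

```python
from enum import Enum


class Failure(Enum):
    HOST = "host"                  # the physical host failed
    CLOUD_HOST = "cloud_host"      # a single cloud host failed
    STORAGE_LINK = "storage_link"  # the host lost its network path to shared storage


def ha_actions(failure: Failure, restart_succeeded: bool = False) -> list[str]:
    """Return the ordered actions for a failure (hypothetical action names)."""
    if failure in (Failure.HOST, Failure.STORAGE_LINK):
        # All cloud hosts on the affected host move to available hosts in the cluster.
        return ["migrate_all_cloud_hosts"]
    # A failed cloud host is always restarted in place first.
    if restart_succeeded:
        return ["restart_in_place"]
    # Only a failed restart leads to migration, followed by a restart on the target host.
    return ["restart_in_place", "migrate_cloud_host", "restart_on_target"]
```

For example, `ha_actions(Failure.CLOUD_HOST, restart_succeeded=True)` yields only the in-place restart, which matches the rule that a successfully restarted cloud host is not migrated.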
Automatically monitors the running state of hosts and cloud hosts, and migrates a failed cloud host, or the cloud hosts on a failed host, to other hosts in the cluster.
Reserves enough resources for cloud hosts to restart if hosts fail.
Automatically migrates cloud hosts between hosts to ensure service continuity in case of hardware failure.
Automatically selects suitable hosts for cloud hosts on a failed host based on the resource usage if you enable both HA and DRS for the cluster.
All hosts in an HA-enabled cluster must have the same vSwitch configuration, including vSwitch quantity, name, and forwarding mode.
To ensure that cloud hosts in an HA-enabled cluster can migrate between hosts in the cluster, make sure the image files of all cloud hosts in the cluster are stored on shared storage. As a best practice, do not enable HA or DRS if the cloud hosts use local storage.
In an HA-enabled cluster, all hosts must use CPUs from the same manufacturer.
To prevent cloud host name conflicts, make sure no hosts in the cluster are in an abnormal state before you disable HA for the cluster. If a cloud host name conflict occurs, enable HA for the cluster again.
While HA is being enabled or disabled for a cluster, do not start, deploy, or migrate cloud hosts, and do not restart or shut down hosts in the cluster.
To reinstall the CVK component for a host in an HA-enabled cluster, first delete the host from the cluster, reinstall the CVK component, and then add the host to the cluster again.
Before you enable HA for a cluster, make sure all hosts in the cluster have reserved sufficient system resources so that the cloud hosts can migrate between the hosts.
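Two of the restrictions above (identical vSwitch configuration and a single CPU manufacturer) are easy to check against a host inventory before you enable HA. The following is a minimal sketch that assumes you have already collected the inventory yourself. The `Host` and `VSwitch` types and the `ha_precheck` function are hypothetical and do not correspond to any CVM API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VSwitch:
    name: str
    forwarding_mode: str


@dataclass
class Host:
    name: str
    cpu_vendor: str
    vswitches: list[VSwitch]


def ha_precheck(hosts: list[Host]) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems: list[str] = []
    if not hosts:
        return problems
    reference = hosts[0]  # every other host is compared with the first one
    for host in hosts[1:]:
        if host.cpu_vendor != reference.cpu_vendor:
            problems.append(f"{host.name}: CPU manufacturer differs from {reference.name}")
        # Same quantity, names, and forwarding modes are all required.
        same_count = len(host.vswitches) == len(reference.vswitches)
        same_items = set(host.vswitches) == set(reference.vswitches)
        if not (same_count and same_items):
            problems.append(f"{host.name}: vSwitch configuration differs from {reference.name}")
    return problems
```

The sketch does not cover the shared storage or resource reservation requirements, which depend on data that only the platform can report.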
On the top navigation bar, click Resources.
From the left navigation pane, select Virtualization.
Click the name of a cluster.
Click HA.
Enable HA as needed.
If you enable HA for the cluster, select a default startup priority for cloud hosts in the cluster.
Enable service network HA and HA access control as needed.
If you enable HA access control, specify the minimum number of nodes, select a failover host, or set the reserved CPU and memory percentages.
Click OK.
| Parameter | Description |
|---|---|
| Startup Priority | Select a default startup priority for the cloud hosts in the cluster. You can set the startup priority for a cloud host when you add or edit the cloud host. After a host fails, the system migrates the cloud hosts on the host based on their startup priorities until all the cloud hosts are migrated or the cluster does not have any available resources. |
| Enable Service HA | Select whether to enable service network HA. After you enable this feature, a cloud host is migrated to another host if the service network of the cloud host fails or is disconnected. HA failure detection is not supported on a vSwitch that uses the management network, uses VXLAN forwarding mode, or is not bound to physical NICs. |
| Enable HA Access Control | Select whether to enable HA access control. If you enable this feature, configure the Min Nodes, Failover Host, or HA Resource Reservation parameter. |
| HA Access Control Settings: Min Nodes | Specify the minimum number of hosts for HA to take effect on the cluster. If the number of hosts that are operating correctly in the cluster is smaller than the specified minimum node number, HA cannot take effect on the cluster. To avoid migration failure caused by inaccurate resource calculation, make sure all hosts in the cluster have the same CPU quantity and memory size. |
| HA Access Control Settings: Failover Host | The failover hosts must use the same shared storage as the service hosts. |
| HA Access Control Settings: HA Resource Reservation | Set the reserved CPU and memory percentages. When the remaining resources in the cluster are less than the specified percentage of resources, you cannot start new cloud hosts, set the cloud hosts to running or suspended state, or migrate running cloud hosts to the cluster. |
| Action | Select the action to take on related cloud hosts when the shared storage fails. This parameter is available when the Shared Storage Fault Action parameter on the system parameter page is Do Not Restart Host and the HA state is changed from disabled to enabled. |
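The Min Nodes, HA Resource Reservation, and Startup Priority parameters interact in a way that a short model can make concrete. The sketch below is not CVM's implementation. All function names are hypothetical, and it assumes that a shortfall in either CPU or memory blocks admission, that a lower number means a higher startup priority, that capacity is measured in memory only, and that placement stops at the first cloud host that does not fit.

```python
def ha_in_effect(healthy_hosts: int, min_nodes: int) -> bool:
    """HA takes effect only while enough hosts are operating correctly."""
    return healthy_hosts >= min_nodes


def can_admit(free_cpu_pct: float, free_mem_pct: float,
              reserved_cpu_pct: float, reserved_mem_pct: float) -> bool:
    """Allow a start or inbound migration only if the reservation stays intact.

    Assumption: falling below either reserved percentage blocks the operation.
    """
    return free_cpu_pct >= reserved_cpu_pct and free_mem_pct >= reserved_mem_pct


def plan_migration(cloud_hosts: list[tuple[str, int, int]], free_mem_mb: int) -> list[str]:
    """Place cloud hosts in startup-priority order until capacity runs out.

    cloud_hosts holds (name, priority, mem_mb) tuples.
    Assumption: a lower priority number is migrated first.
    """
    placed: list[str] = []
    for name, _priority, mem_mb in sorted(cloud_hosts, key=lambda c: c[1]):
        if mem_mb > free_mem_mb:
            break  # simplification: stop once the next cloud host does not fit
        placed.append(name)
        free_mem_mb -= mem_mb
    return placed
```

With 2048 MB free, `plan_migration([("a", 2, 512), ("b", 1, 1024), ("c", 3, 4096)], 2048)` places `b` and then `a`, and leaves `c` unplaced because the remaining capacity is exhausted.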