The PXE NIC that obtains a DHCP address during device detection is the NIC used for deployment. After device detection, do not move the network connection to another NIC, and make sure the network is not interrupted. If the network connection is switched, perform device detection again. Otherwise, the device might enter the device detection phase again during deployment.
Follow these restrictions and guidelines during bare metal address allocation:
The management interface does not support NIC bonding. You must manually configure an address for the interface.
The service interface supports NIC bonding on both fiber ports and copper ports. When NIC bonding is used, you must manually configure addresses. When NIC bonding is not used, addresses can be obtained through DHCP.
If you select a large disk when creating the bare metal image, the available space on the compute node might be insufficient, and deployment fails. To resolve this issue, re-create the image with a smaller disk or expand the disk of the compute node.
The following is an example of the error information in the /var/log/ironic/ironic-conductor.log file of the compute node:
2019-07-30 19:57:23.011 5701 INFO ironic.conductor.task_manager [req-ab37450c-fd75-4520-b55f-5d8e6acf52d1 - - - - -] Node b3d6be6e-2139-4629-ac52-cf12a54bd922 moved to provision state "deploy failed" from state "wait call-back"; target provision state is "active"
2019-07-30 19:57:23.015 5701 ERROR ironic.conductor.utils [req-ab37450c-fd75-4520-b55f-5d8e6acf52d1 - - - - -] Timeout reached while waiting for callback for node b3d6be6e-2139-4629-ac52-cf12a54bd922
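To check whether other nodes hit the same callback timeout, the conductor log can be filtered for the error message shown above. The following is a minimal sketch, not part of the product tooling. The function name timed_out_nodes is hypothetical, and the sketch assumes that the node UUID is always the last field of the message, as it is in the excerpt.

```shell
#!/bin/sh
# Print the UUID of every node that timed out while waiting for the
# deploy callback, one UUID per line, without duplicates.
# Reads log text from the files given as arguments, or from stdin.
# Assumption: the UUID is the last whitespace-separated field.
timed_out_nodes() {
    grep 'Timeout reached while waiting for callback for node' "$@" |
        awk '{ print $NF }' |
        sort -u
}

# Example (log path taken from the text above):
# timed_out_nodes /var/log/ironic/ironic-conductor.log
```

Each UUID in the output identifies a node to investigate.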
This issue is usually caused by the following network errors:
The small image system cannot ping the master node or the bare metal compute node during deployment.
Multiple DHCP servers are deployed in the network where the system resides, causing inconsistent IP address, TFTP server, and DHCP server settings.
To determine whether a device is in the detection phase or the deployment phase, open the console and check whether the highlighted information shown in Figure-1 is displayed. If the information is displayed, the device is in the detection phase.
Figure-1 Information displayed during node detection
Figure-2 Information displayed on server console during node detection
Figure-3 Information displayed on server console during node deployment
This issue might be caused by the following errors:
The monitor script contains logic errors and deletes the config file in the uuid folder in the /tftpboot directory. To resolve this issue, disable the openstack-compute-monitor.service service on the compute node. Alternatively, edit the /opt/tools/shell/openstack-compute-monitor.sh script used by this service to comment out the call to monitor_ironic_service, and then restart the service.
The current system version does not support the UEFI boot mode, but the server boot mode is UEFI. To resolve this issue, set the server boot mode to Legacy BIOS, and then perform node detection and node deployment again.
A front-end processing error occurs and the deployed port information is incorrect. In this example, the correct port MAC address is 28:80:23:a5:00:d8. To resolve this issue, execute the following command to rename the file so that its name matches the correct MAC address:
mv 01-00-11-0a-67-b8-00 01-28-80-23-a5-00-d8
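The file names in the command above appear to follow the PXELINUX per-host naming convention: the prefix 01 (the Ethernet hardware type), followed by the MAC address in lowercase with hyphens instead of colons. The following sketch derives the expected file name from a MAC address so that it can be compared with the file that actually exists. The function name pxe_config_name is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Convert a MAC address (colon- or hyphen-separated, any case) to the
# PXELINUX-style file name: "01-" followed by the lowercase MAC with
# hyphens as separators.
pxe_config_name() {
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f' | tr ':' '-')
    # Reject anything that is not six two-digit hexadecimal groups.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9a-f]{2}-){5}[0-9a-f]{2}$'; then
        echo "invalid MAC address: $1" >&2
        return 1
    fi
    printf '01-%s\n' "$mac"
}

# Prints the target file name used in the mv command above.
pxe_config_name 28:80:23:a5:00:d8
```

If the printed name differs from the existing file name, the port information was recorded incorrectly and the file needs to be renamed.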
Figure-4 Port information
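For the monitor script error listed above, the edit to /opt/tools/shell/openstack-compute-monitor.sh can be scripted. The following is a sketch only. The contents of that script are not shown in this document, so the sketch assumes that monitor_ironic_service is called on a line of its own with no arguments. The function name disable_ironic_monitor_call is hypothetical. Review the script manually if the call looks different.

```shell
#!/bin/sh
# Comment out bare calls to monitor_ironic_service in the given script.
# A backup copy is kept next to the script with a .bak suffix.
# Assumption: the call sits alone on its line with no arguments. The
# function definition "monitor_ironic_service() {" is left untouched.
disable_ironic_monitor_call() {
    script=$1
    cp -p "$script" "$script.bak" || return 1
    sed 's/^\([[:space:]]*\)\(monitor_ironic_service[[:space:]]*\)$/\1# \2/' \
        "$script.bak" > "$script"
}

# Example, followed by a service restart (run as root on the compute node):
# disable_ironic_monitor_call /opt/tools/shell/openstack-compute-monitor.sh
# systemctl restart openstack-compute-monitor.service
```

Writing the result back into the existing file, rather than replacing it, keeps the original file permissions, and the .bak copy allows the change to be reverted.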
This log indicates a deployment failure caused by missing ramdisk small image files. The small image files have been deleted from the storage, even though the output from the OpenStack command glance image-list indicates that the files still exist. The small image files (ironic-deploy-new.initramfs and ironic-deploy-new.kernel) in the /tftpboot directory of the compute node are identical to those in Glance. To resolve this issue, upload the deleted files to Glance again.
Figure-5 Conductor log