Released At: 20-01-2024

H3C HDM System Log Messages Reference-6W105


H3C HDM
System Log Messages Reference

No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of New H3C Technologies Co., Ltd.

Except for the trademarks of New H3C Technologies Co., Ltd., any trademarks that may be mentioned in this document are the property of their respective owners.

The information in this document is subject to change without notice.

Contents

Introduction

Obtaining system log messages

System log severity level

Using this document

Applicable products

Event log messages

Temperature

Dropped below the lower minor threshold

Dropped below the lower major threshold

Dropped below the lower critical threshold

Exceeded the upper minor threshold

Exceeded the upper major threshold

Exceeded the upper critical threshold

Abnormal Temperature

Voltage

State Asserted

Dropped below the lower major threshold

Exceeded the upper major threshold

Current

State Asserted

Exceeded the upper minor threshold

Exceeded the upper major threshold

Exceeded the upper critical threshold

Fan

Transition to Running

Fully Redundant

Non-redundant:Sufficient Resources from Redundant

Transition to Off Line

Non-redundant:Insufficient Resources

Transition to Degraded

Install Error

Cooling device

Liquid Cooler is not present

Liquid Cooler is leakage

Physical security

General Chassis Intrusion

CPU Critical Temperature

Thermal Trip

FRB1/BIST failure

Processor Presence detected

Processor Automatically Throttled

Machine Check Exception

triggered an uncorrectable error

Machine Check Error

Machine Check Error---CPU core errors

triggered a correctable error

Correctable Machine Check Error

Correctable Machine Check Error---CPU UPI errors

Correctable Machine Check Error---IOH UPI errors

Correctable Machine Check Error---IOH core errors

Correctable Machine Check Error---VT-d errors

Correctable Machine Check Error---CPU core errors

Correctable Machine Check Error---Cbo error

Configuration Error---System is operating in KTI Link Slow Speed Mode

Power supply

Fully Redundant

Fully Redundant

Presence detected

Redundancy Lost

Power Supply Failure detected

Power Supply Predictive Failure---PSU Self Check Failed

Power Supply Predictive Failure

Power Supply input lost (AC/DC)

Power Supply input lost or out-of-range

Power Supply input out-of-range - but present

Configuration error ---Vendor mismatch

Configuration error---Power supply rating mismatch

Exceeded the upper minor threshold

Power Supply Inactive/standby state

Interlock Power Down

Power Supply Pwok abnormal

Power limit is exceeded over correction time limit

Power limit is exceeded over correction time limit

Memory

Correctable ECC or other correctable memory error

CPU triggered a correctable error

Uncorrectable ECC or other uncorrectable memory error

triggered an uncorrectable error

Parity

Parity---Memory Training Faulty Part Tracking Uncorrectable Error

Parity---Memory Receive Enable Training Error

Parity---Memory Write Leveling Training Error

Parity---Memory Write DqDqs Training Error

Parity---Memory Sense Amp Training Error

Parity---Warning Command Clock Training Error

Parity---An uncorrectable error occurs during the memory test phase

Parity---Memory Training Error

Parity---The number of correctable memory errors reached the error logging threshold

Parity---An error occurred on the DIMM slot

Parity---CMD eye width is too small

Parity---The command is not in the FNv table

Parity---CTL is not consistent with clock in timing, and the channel is isolated

Parity---Memory write flyby failed

Parity---Timing error occurred during signal line adjustment for memory write leveling training

Parity---Memory read DqDqs training failed

Parity---Memory receive enable training failed

Parity---Memory write leveling training failed

Parity---Memory write DqDqs training failed

Parity---An error occurs during memory test, and the rank is disabled

Parity---Failed to find the RxVref for data eye training

Parity---LRDIMM RCVEN training failed

Parity---RCVEN CYCLE training failed

Parity---Read delay training failed

Parity---Memory write leveling training failed

Parity---Coarse write leveling training failed

Parity---Write delay training failed

Parity---QxCA_CLK_NO_EYE training failed

Parity---mapped out because failed critical mask test at cold boot

Parity---Invalid SPD contents

Memory Device Disabled

Memory Device Disabled---the DIMM is disabled

Memory Device Disabled---the rank is disabled

Memory Device Disabled---Pmem Media disabled

Correctable ECC or other memory error limit reached

Presence detected

Configuration error---RDIMMs are installed on the server that supports only UDIMMs

Configuration error---UDIMMs are installed on the server that supports only RDIMMs

Configuration error---SODIMMs are installed on the server that supports only RDIMMs

Configuration error---The number of ranks per channel can be only 1, 2, or 4

Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported

Configuration error---The number of ranks in the channel exceeds 8

Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server

Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V

Configuration error---The CPU is not compatible with 3DS DIMMs

Configuration error---NVDIMMs with stepping lower than 0x10 are not supported

Configuration error---The CPU is not compatible with 16-GB single-rank DIMMs

Configuration error---The CPU is not compatible with the DIMMs

Configuration error---The frequency of the DIMM is not supported on the server

Configuration error---NVDIMMs are not compatible with the CPU

Configuration error---DCPMMs are not supported

Configuration error---Memory LockStep Disable Error

Configuration error---Memory Mirror Disable Error

Configuration error---Failed to enable the full mirror mode

Configuration error---The memory interleaving configuration cannot meet the requirements of the server

Configuration error---The memory interleaving configuration cannot meet the requirements of the server

Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent

Configuration error---Memory Rank Sparing Error

Configuration error---Failed to enable patrol scrubbing

Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty

Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel

Configuration error---The DDR-T memory module is installed in the white slot

Configuration error---2LM IMC memory Mismatch

Configuration error---ODT configuration errorThe channel is isolated

Configuration error---Failed to enable ADDDC

Configuration error---Failed to enable SDDC

Configuration error---DCPMM firmware version not supported

Configuration error---DCPMM firmware version not supported

Configuration error---NVMCTRL_MEDIA_NOTREADY

Configuration error---The DDR-T memory modules of the unexpected model are installed

Configuration error---Failed to set the VDD voltage of the DIMM

Configuration error---Too many RIR rules

Configuration error---The DIMMs for the CPU exceeded the limit

Drive slot

Drive Presence

Drive Fault

Predictive Failure

Consistency Check / Parity Check in progress. System Source Monitor: Hard Disk usage exceeds the threshold

Consistency Check / Parity Check in progress. System Source Monitor: Relieve resource alarm about Hard Disk Usage

In Critical Array

In Failed Array

Rebuild/Remap in progress

The disk triggered a media error

The disk triggered an uncorrectable error

The disk is missing

System firmware Progress

System Firmware Error (POST Error)---CPU matching failure

System Firmware Error (POST Error)---Firmware (BIOS) ROM corruption detected

System Firmware Error (POST Error)---Load microcode failed

System Firmware Error (POST Error)---No system memory or invalid memory configuration

System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image is unsigned or Certificate is invalid

System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate not found in Authorized database(db)

System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate is found in Forbidden database(dbx)

System Firmware Error (POST Error)---Memory Population Rule Error

System firmware error (POST error)---DIMM installation or compatibility error occurred

System firmware error (POST error)---No Memory Usable

System firmware error (POST error)---No DDR Memory Error

System firmware error (POST error)---DIMM Compatible Error(LRDIMM and RDIMM are installed)

System Firmware Error (POST Error)---No DIMMs present

System Firmware Error (POST Error)---No DDR memory in the system

System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation

System Firmware Error (POST Error)---Different DIMM types detected

System Firmware Error (POST Error)---DIMM population error

System Firmware Error (POST Error)---A maximum of two quad-rank DIMMs can be populated per channel

System Firmware Error (POST Error)---The third DIMM slot with green release tabs does not support UDIMMs or SODIMMs

System Firmware Error (POST Error)---DIMM voltage error

System Firmware Error (POST Error)---DDR3 and DDR4 DIMMs cannot be mixed

System Firmware Error (POST Error)---256-byte and 512-byte SPD devices cannot be mixed

System Firmware Error (POST Error)---3DS and non-3DS LRDIMMs cannot be mixed

System Firmware Error (POST Error)---DDR-T memory modules and UDIMMs cannot be mixed

System Firmware Error (POST Error)---Memory Unrecognized Initialization Error

System Firmware Hang---Unspecified

System firmware hang-----No DDR Memory Error

System firmware hang---DIMM Compatible Error(LRDIMM and RDIMM are installed)

System firmware hang---Memory Unrecognized Initialization Error

System Firmware Progress---Current Memory Ras Mode

System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM symmetry on the socket

System Firmware Error (POST Error)---No DIMMs installed for CPU

Event Logging Disabled

Log Area Reset/Cleared

SEL Full

SEL Almost Full

Watchdog1

BIOS Watchdog Reset

OS Watchdog NMI/Diagnostic Interrupt

OS Watchdog pre-timeout Interrupt-non-NMI

System Event

Timestamp Clock Synch---event is $1 of pair---SEL Timestamp Clock updated

Timestamp clock synch---BMC Time SYNC succeed

Critical Interrupt

Transition to Non-Critical from OK

PCI: PCIE Hot Plug PCIe Pull Out

PCI: PCIE Hot Plug PCIe Insert

PCI SERR

Bus Uncorrectable Error

Bus Fatal Error

Button/Switch

Power Button pressed---Physical button---Button pressed

Power Button pressed---Physical button---Button released

Power Button pressed---Virtual button---Power cycle command

Power Button pressed---Virtual button---Power off command

Power Button pressed---Virtual button---Power on command

Power Button pressed---Virtual button---Soft off command

Reset Button pressed---Virtual button---Reset command

FRU service request button---Physical button---Uid button pressed

Module/Board

Transition to Critical from less severe

Transition to Non-Recoverable from less severe

Monitor---Board found PSU output can't be enabled

Add-in Card

Transition to OK

Transition to Critical from less severe

Chassis

Transition to OK

State asserted

Transition to Critical from less severe

Transition to Non-recoverable from less severe

System Boot/Restart Initiated

Initiated by power up

Initiated by hard reset

Initiated by warm reset

System restart---due to fan error:power off

System Restart

System Restart---due to fan error:power reset

System Restart---due to fan error:power cycle

Boot Error

No bootable media

OS_BOOT

C: boot completed

PXE boot completed

OS Stop/Shutdown

Run-time Critical Stop

OS Graceful Stop

OS Graceful Shutdown

Slot/Connector

Device disabled: PCIe module information not obtained

triggered an uncorrectable error

triggered a correctable error

Slot/Connector Device installed/attached

Transition to on line

Transition to off line

Transition to Non-Critical from OK

System ACPI Power State

S0/G0 "working"

S5/G2 "soft-off"

LPC Reset occurred

Watchdog2

Watchdog overflowAction:Timer expired

Watchdog overflowAction:Hard Reset

Watchdog overflowAction:Power Down

Watchdog overflowAction:Power Cycle

Watchdog overflowAction:Timer interrupt

Management subsystem health

Management controller off-line

Management controller off-line---BMC reset

Management controller off-line---HDM cold reboot

Management controller off-line---BMC WDT timeout event happened

Management controller off-line---BMC service restart

Management controller unavailable

Management controller unavailable---Adapter $1 RAID-P460-B4 is in a fault condition

Sensor access degraded or unavailable--- Adapter $1 RAID-P460-B4 has no response for 2 minutes in $2 slot

Sensor access degraded or unavailable--- Adapter $1 has no response for 5 minutes in $2 slot

Sensor failure---Adapter $1 has no response for 4 minutes in $2 slot

Sensor failure--- Adapter $1 has no response for 10 minutes in $2 slot

Battery

Battery low (predictive failure)

Battery failed

Battery presence detected

ME status

Management controller unavailable

OEM Record

System Source Monitor:Mem usage exceeds the threshold

System Source Monitor: Relieve resource alarm about Mem Usage

System Source Monitor:Cpu usage exceeds the threshold

System Source Monitor: Relieve resource alarm about Cpu Usage

Memory is not certified

Numbering CPUs

Introduction

This document describes the HDM log messages that report the occurrence and clearing of system exceptions detected by sensors in the server. Use this document to look up message details and recommended actions for server maintenance.

Obtaining system log messages

You can obtain system log messages through the following methods:

· HDM Web interface—Access the HDM Web interface and click Remote O&M > Log > Log Download. On the Log Download tab, select to download the entire log or log entries for a period.

· Alert emails—Complete alert email settings to obtain log messages.

· Third-party platform—Complete SNMP settings to connect HDM to a third-party management platform, and obtain log messages from the platform.

· Redfish event subscription—If a remote subscription server is configured, Redfish uploads received log messages to the remote subscription server.

· IPMI commands—Use IPMItool commands to access the IPMI interface for HDM and enter commands to obtain event log messages.
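
As a rough illustration of the IPMI method, the following Python sketch builds an ipmitool command line for reading the System Event Log over the network and splits one line of output into fields. The host address, user name, and password are placeholders. The sample line is an assumed example of the pipe-separated `sel elist` output, not a capture from an HDM system, and the function names are hypothetical.

```python
import subprocess

def build_sel_command(host, user, password):
    """Return the argv list for reading the SEL over IPMI-over-LAN."""
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, "sel", "elist"]

def split_sel_line(line):
    """Split one pipe-separated SEL output line into stripped fields."""
    return [field.strip() for field in line.split("|")]

def read_sel(host, user, password):
    """Run ipmitool and return the SEL entries as lists of fields."""
    result = subprocess.run(build_sel_command(host, user, password),
                            capture_output=True, text=True, check=True)
    return [split_sel_line(l) for l in result.stdout.splitlines() if l.strip()]

# Assumed sample line, for illustration only.
sample = ("1 | 01/20/2024 | 10:15:42 | Temperature Inlet_Temp | "
          "Upper Critical going high | Asserted")
fields = split_sel_line(sample)
```

Calling `read_sel` requires ipmitool to be installed and IPMI access to be enabled on the HDM side. The two helper functions can be exercised without any hardware.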

System log severity level

Table 1 System log message severity levels

Severity	Description
Critical	The target module might be powered off or the system might become unavailable. Actions must be taken immediately.
Major	The system or service modules, including computing, storage, communication, and data security, might fail to operate correctly and service interruption might occur.
Minor	Actions must be taken to prevent failure escalation, if necessary.
Info	Informational message. For example, a normal state change happened or an alarm is removed. No action is required.
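
When filtering downloaded log entries by severity, it can help to treat the four levels in Table 1 as an ordered scale. The following Python sketch shows one possible encoding. The numeric ranks and the `at_least` helper are illustrative assumptions, not values used by HDM.

```python
from enum import IntEnum

class Severity(IntEnum):
    # Numeric ranks are illustrative; only the ordering matters.
    INFO = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3

def parse_severity(text):
    """Map a severity name from Table 1 to a Severity member."""
    return Severity[text.strip().upper()]

def at_least(entries, minimum):
    """Keep (severity_name, message) pairs at or above the minimum level."""
    return [(s, m) for s, m in entries if parse_severity(s) >= minimum]

entries = [
    ("Info", "Transition to Running"),
    ("Major", "Exceeded the upper major threshold."),
    ("Critical", "State Asserted"),
]
urgent = at_least(entries, Severity.MAJOR)
```

Here `urgent` keeps only the Major and Critical entries, which are the levels Table 1 associates with possible service interruption.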

Using this document

This document explains messages in tables. Table 2 describes information provided in these tables.

Table 2 Message explanation table contents

Item	Description	Example
Event code	A hexadecimal code that uniquely represents a log message. The parity of the last character in the event code represents the alarm type: · Even: An alarm was generated. · Odd: An alarm was removed.	0x02900002
Message text	Presents the message description. The same message description might be reported by different types of sensors.	Exceeded the upper major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	Briefly describes the variable fields in the order that they appear in the message text. The variable fields are numbered in the "$Number" form to help you identify their location in the message text.	· $1: Current reading of the voltage sensor. · $2: Major overvoltage threshold of the voltage sensor.
Severity level	Provides the severity level of the message.	Major
Example	Log example.	Exceeded the upper major threshold.---Current reading:2.58---Threshold reading:2.56
Cause	Explains the message, including the event or error cause.	The total input voltage exceeds the major overvoltage alarm threshold. To locate the alarm triggering component, see the sensor name on the Event Log page from the HDM Web interface.
Recommended action	Provides recommended actions. If the issue persists after the recommended actions have been taken, contact Technical Support.	1. Verify that the external power supply is operating correctly. 2. Access the HDM Web interface and verify that the power supply is operating correctly. 3. If the issue persists, contact Technical Support.
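
Two conventions from Table 2 lend themselves to a small example: the parity of the last event code character, and the "$Number" variable fields in the message text. The following Python sketch applies both to the sample values in the table. The function names are hypothetical and the code is a minimal illustration, not part of HDM.

```python
import re

def alarm_type(event_code):
    """Classify an event code by the parity of its last hexadecimal digit."""
    last_digit = int(event_code.strip()[-1], 16)
    return "generated" if last_digit % 2 == 0 else "removed"

def render_message(template, *values):
    """Replace $1, $2, ... in a message text with the given values."""
    def substitute(match):
        index = int(match.group(1)) - 1
        return str(values[index])
    return re.sub(r"\$(\d+)", substitute, template)

template = ("Exceeded the upper major threshold."
            "---Current reading:$1---Threshold reading:$2")
kind = alarm_type("0x02900002")
text = render_message(template, 2.58, 2.56)
```

With the Table 2 values, `kind` is "generated" because the code ends in 2, and `text` matches the log example in the table.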

Applicable products

This document applies to the following product models:

· H3C UniServer B5700 G3

· H3C UniServer B5700 G5

· H3C UniServer B5800 G3

· H3C UniServer B7800 G3

· H3C UniServer E3200 G3

· H3C UniServer R2700 G3

· H3C UniServer R2900 G3

· H3C UniServer R4100 G3

· H3C UniServer R4300 G3

· H3C UniServer R4300 G5

· H3C UniServer R4330 G5

· H3C UniServer R4330 G5 H3

· H3C UniServer R4400 G3

· H3C UniServer R4500 G3

· H3C UniServer R4700 G3

· H3C UniServer R4700 G5

· H3C UniServer R4700LC G5

· H3C UniServer R4900 G3

· H3C UniServer R4900 G5

· H3C UniServer R4900LC G5

· H3C UniServer R4930 G5

· H3C UniServer R4930 G5 H3

· H3C UniServer R4930LC G5 H3

· H3C UniServer R4950 G3

· H3C UniServer R4950 G5

· H3C UniServer R5300 G3

· H3C UniServer R5300 G5

· H3C UniServer R5500 G5

· H3C UniServer R5500 INTEL liquid cooling

· H3C UniServer R6700 G3

· H3C UniServer R6900 G3

· H3C UniServer R6900 G5

· H3C UniServer R8900 G3

Event log messages

This section contains event log messages.

Temperature

Dropped below the lower minor threshold

Event code	0x01000002
Message text	Dropped below the lower minor threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor. $2: Value of the lower minor temperature alarm threshold.
Severity level	Minor
Example	Dropped below the lower minor threshold.---Current reading:2---Threshold reading:10
Impact	Device components might suffer performance degradation and unstable operation if the temperature is too low. If the temperature does not rise and the alarm persists, the temperature might drop further and trigger major-level alarms. Identify the cause of a low-temperature alarm as early as possible to prevent escalation.
Cause	The temperature is too low.
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Log in to HDM, access the Fans page, and check whether the fan speed is too high. If it is, adjust the fan speed mode or fan speed level. 3. If the issue persists, contact Technical Support.

Dropped below the lower major threshold

Event code	0x01200002
Message text	Dropped below the lower major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor. $2: Value of the lower major temperature alarm threshold.
Severity level	Major
Example	Dropped below the lower major threshold.---Current reading:2---Threshold reading:5
Impact	Device components might suffer performance degradation and unstable operation if the temperature is too low. If the temperature does not rise and the alarm persists, the temperature might drop further and trigger critical-level alarms. Identify the cause of a low-temperature alarm as early as possible to prevent escalation.
Cause	The temperature is too low.
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Log in to HDM, access the Fans page, and check whether the fan speed is too high. If it is, adjust the fan speed mode or fan speed level. 3. If the issue persists, contact Technical Support.

Dropped below the lower critical threshold

Event code	0x01400002
Message text	Dropped below the lower critical threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor. $2: Value of the lower critical temperature alarm threshold.
Severity level	Critical
Example	Dropped below the lower critical threshold.---Current reading:2---Threshold reading:3
Impact	Operating devices in ultra-low temperature environments can reduce device performance, shorten device lifespan, disrupt business operations, and lead to system downtime.
Cause	The temperature is too low.
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Log in to HDM, access the Fans page, and check whether the fan speed is too high. If it is, adjust the fan speed mode or fan speed level. 3. If the issue persists, contact Technical Support.

Exceeded the upper minor threshold

Event code	0x01700002
Message text	Exceeded the upper minor threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor. $2: Value of the minor overtemperature alarm threshold.
Severity level	Minor
Example	Exceeded the upper minor threshold.---Current reading:100---Threshold reading:80
Impact	Device components might suffer performance degradation and unstable operation if the temperature is too high. If the temperature does not decrease and the alarm persists, the temperature might rise further and trigger major-level alarms. Identify the cause of a high-temperature alarm as early as possible to prevent escalation.
Cause	High ambient temperature, blocked air intake or exhaust, or low fan speed.
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Verify that the server's air inlet and outlet are not blocked. 3. Log in to HDM, access the Fans page, and verify that all fans are operating correctly and that the fan speed is not too low. If the fan speed is low, adjust the fan speed mode or fan speed level. 4. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x01900002
Message text	Exceeded the upper major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor. $2: Value of the major overtemperature alarm threshold.
Severity level	Major
Example	Exceeded the upper major threshold.---Current reading:100---Threshold reading:85
Impact	Device components might suffer performance degradation and unstable operation if the temperature is too high. If the temperature does not decrease and the alarm persists, the temperature might rise further and trigger critical-level alarms. Identify the cause of a high-temperature alarm as early as possible to prevent escalation.
Cause	High ambient temperature, blocked air intake or exhaust, or low fan speed.
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Verify that the server's air inlet and outlet are not blocked. 3. Log in to HDM, access the Fans page, and verify that all fans are operating correctly and that the fan speed is not too low. If the fan speed is low, adjust the fan speed mode or fan speed level. 4. If the issue persists, contact Technical Support.

Exceeded the upper critical threshold

Event code	0x01b00002
Message text	Exceeded the upper critical threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor. $2: Value of the critical overtemperature alarm threshold.
Severity level	Critical
Example	Exceeded the upper critical threshold.---Current reading:100---Threshold reading:90
Impact	Operating devices in high-temperature environments can reduce device performance, shorten device lifespan, increase energy consumption, disrupt business operations, and cause system crashes.
Cause	High ambient temperature, blocked air intake or exhaust, or low fan speed.
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Verify that the server's air inlet and outlet are not blocked. 3. Log in to HDM, access the Fans page, and verify that all fans are operating correctly and that the fan speed is not too low. If the fan speed is low, adjust the fan speed mode or fan speed level. 4. If the issue persists, contact Technical Support.
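
The six temperature threshold messages above can be pictured as a ladder of limits around the normal operating range. The following Python sketch classifies a reading against such a ladder. The threshold values are borrowed from the sample messages in this section and are assumptions, because real thresholds are sensor-specific. Whether HDM compares strictly or inclusively is also an assumption, and the sketch reports only the most severe threshold crossed.

```python
# Example limits taken from the sample messages above (assumptions).
# Each list is ordered from most severe to least severe.
LOWER = [(3, "lower critical"), (5, "lower major"), (10, "lower minor")]
UPPER = [(90, "upper critical"), (85, "upper major"), (80, "upper minor")]

def classify(reading):
    """Return the most severe threshold message for a reading, or None."""
    for limit, name in LOWER:
        if reading < limit:
            return "Dropped below the " + name + " threshold"
    for limit, name in UPPER:
        if reading > limit:
            return "Exceeded the " + name + " threshold"
    return None

results = {value: classify(value) for value in (2, 8, 25, 82, 100)}
```

With these sample limits, a reading of 25 falls inside the normal range and produces no message, while 82 crosses only the upper minor threshold.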

Abnormal Temperature

Event code	0x011000de
Message text	Abnormal Temperature---GPU Card Temperature Error---Register location:$1--- GPU location:$2
Variable fields	$1: State register. $2: GPU slot number.
Severity level	Major
Example	Abnormal Temperature---GPU Card Temperature Error---Register location:0x6--- GPU location:11
Impact
Cause
Recommended action	1. Verify that the temperature of the equipment room is as required. 2. Verify that the server's air inlet and outlet are not blocked. 3. Log in to HDM, access the Fans page, and verify that all fans are operating correctly and that the fan speed is not too low. If the fan speed is low, adjust the fan speed mode or fan speed level. 4. If the issue persists, contact Technical Support.

Voltage

State Asserted

Event code	0x02100006
Message text	State Asserted
Variable fields	N/A
Severity level	Critical
Example	State Asserted
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too high.
Cause	Overvoltage was detected on the system board. To locate the alarm triggering component, see the sensor name on the Event Log page from the HDM Web interface.
Recommended action	1. Power off and then restart the server. 2. If the issue persists, contact Technical Support.

Dropped below the lower major threshold

Event code	0x02200002
Message text	Dropped below the lower major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current value of the total input voltage. $2: Lower major voltage alarm threshold.
Severity level	Major
Example	Dropped below the lower major threshold.---Current reading:2.58---Threshold reading:2.60
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too low.
Cause	Abnormal board voltage.
Recommended action	1. Verify that the external power supply is operating correctly. 2. Log in to HDM and verify that the power supply is operating correctly. 3. Power off and then restart the server. 4. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x02900002
Message text	Exceeded the upper major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current value of the total input voltage. $2: Upper major voltage alarm threshold.
Severity level	Major
Example	Exceeded the upper major threshold.---Current reading:2.58---Threshold reading:2.56
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too high.
Cause	Abnormal board voltage.
Recommended action	1. Verify that the external power supply is operating correctly. 2. Log in to HDM and verify that the power supply is operating correctly. 3. Power off and then restart the server. 4. If the issue persists, contact Technical Support.
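
Threshold messages such as the voltage examples above follow a regular layout: a description, then fields separated by "---", each in "name:value" form. The following Python sketch splits one such message back into its parts. The function name and the returned dictionary keys are hypothetical, and the sketch assumes both readings are numeric.

```python
def parse_threshold_message(message):
    """Split a threshold log message into description, reading, and threshold."""
    description, *fields = message.split("---")
    values = {}
    for field in fields:
        key, _, value = field.partition(":")
        values[key.strip()] = float(value)
    return {
        "description": description.rstrip("."),
        "reading": values["Current reading"],
        "threshold": values["Threshold reading"],
    }

parsed = parse_threshold_message(
    "Exceeded the upper major threshold."
    "---Current reading:2.58---Threshold reading:2.56"
)
overshoot = round(parsed["reading"] - parsed["threshold"], 2)
```

For the sample message, `overshoot` is 0.02, the amount by which the reading exceeds the major overvoltage threshold.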

Current

State Asserted

Event code	0x03100006
Message text	State Asserted
Variable fields	N/A
Severity level	Critical
Example	State Asserted
Impact	The system might be shut down and powered off.
Cause	Overcurrent was detected for a component on the system board.
Recommended action	1. Log in to HDM, access the Logs page, and verify that no alarm is present for the power supply or system board. 2. Verify that power can be supplied to the server correctly and the voltage is within the normal range. 3. If the issue persists, contact Technical Support.

Exceeded the upper minor threshold

Event code	0x03700002
Message text	Exceeded the upper minor threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Real-time current value. $2: Value for the minor current alarm threshold.
Severity level	Minor
Example	Exceeded the upper minor threshold.---Current reading:20---Threshold reading:18
Impact	Performance degradation and unstable operation might occur on the device components if the current is too high.
Cause	The current of the corresponding component is abnormal.
Recommended action	1. Verify that the threshold has a reasonable value. 2. Verify that the system is not overloaded according to the server rated power. 3. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x03900002
Message text	Exceeded the upper major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Real-time current value. $2: Value for the major current alarm threshold.
Severity level	Major
Example	Exceeded the upper major threshold.---Current reading:25---Threshold reading:22
Impact	Performance degradation and unstable operation might occur on the device components if the current is too high.
Cause	Abnormal board current.
Recommended action	1. Verify that the threshold has a reasonable value. 2. Verify that the system is not overloaded according to the server rated power. 3. If the issue persists, contact Technical Support.

Exceeded the upper critical threshold

Event code	0x03b00002
Message text	Exceeded the upper critical threshold.---Current reading:$1---Threshold reading:$2
Variable fields	$1: Real-time current value. $2: Value for the critical current alarm threshold.
Severity level	Critical
Example	Exceeded the upper critical threshold.---Current reading:30---Threshold reading:25
Impact	Overcurrent might damage components and cause a system crash.
Cause	Abnormal board current.
Recommended action	1. Verify that the threshold has a reasonable value. 2. Verify that the system is not overloaded according to the server rated power. 3. If the issue persists, contact Technical Support.
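
The event codes in the Temperature, Voltage, and Current tables show a visible pattern: they begin with 0x01, 0x02, and 0x03 respectively. The following Python sketch groups codes by that leading byte. Treat the mapping as an observation drawn only from the codes listed in these tables, not as a documented encoding rule. The function name is hypothetical.

```python
# Leading byte -> category, as observed in the tables above (assumption).
OBSERVED_PREFIXES = {0x01: "Temperature", 0x02: "Voltage", 0x03: "Current"}

def category_of(event_code):
    """Return the category suggested by the first byte of an event code."""
    value = int(event_code, 16)          # accepts the "0x" prefix
    leading_byte = (value >> 24) & 0xFF  # top byte of the 32-bit code
    return OBSERVED_PREFIXES.get(leading_byte, "unknown")

codes = ["0x01b00002", "0x02900002", "0x03700002"]
categories = [category_of(code) for code in codes]
```

Codes whose leading byte is not in the observed table are reported as "unknown" rather than guessed.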

Fan

Transition to Running

Event code	0x04000014
Message text	Transition to Running.
Variable fields	N/A
Severity level	Info
Example	Transition to Running
Impact	No negative impact.
Cause	The fan is operating correctly.
Recommended action	1. Verify that the fan is present. 2. Re-install the fan. 3. If the issue persists, contact Technical Support.

Fully Redundant

Event code	0x04000017
Message text	Fully Redundant.
Variable fields	N/A
Severity level	Major
Example	Fully Redundant
Impact	Depending on the severity of redundancy loss, it might affect the normal heat dissipation of the server.
Cause	A fan redundancy error is present because a fan is absent, or a fan was removed or failed.
Recommended action	1. Re-install the removed fans. 2. Remove and re-install the fans, and make sure the fans are in good contact. 3. If a fan status sensor reports an error, replace the faulty fan. 4. If the issue persists, contact Technical Support.

Non-redundant:Sufficient Resources from Redundant

Event code	0x04300016
Message text	Non-redundant:Sufficient Resources from Redundant
Variable fields	N/A
Severity level	Major
Example	Non-redundant:Sufficient Resources from Redundant
Impact	This issue does not affect system heat dissipation.
Cause	The fan is invalid or is absent.
Recommended action	1. Re-install the removed fans. 2. Remove and re-install the fans, and make sure the fans are in good contact. 3. If a fan status sensor reports an error, replace the faulty fan. 4. If the issue persists, contact Technical Support.

Transition to Off Line

Event code	0x04400014
Message text	Transition to Off Line.
Variable fields	N/A
Severity level	Info
Example	Transition to Off Line
Impact	This affects system heat dissipation and reduces the performance of the system board components.
Cause	The fan module has been unplugged or the fan module and the system board have poor contact.
Recommended action	1. Re-install the removed fans. 2. Remove and re-install the fans, and make sure the fans are in good contact. 3. If a fan status sensor reports an error, replace the faulty fan. 4. If the issue persists, contact Technical Support.

Non-redundant:Insufficient Resources

Event code	0x04500016
Message text	Non-redundant:Insufficient Resources
Variable fields	N/A
Severity level	Major
Example	Non-redundant:Insufficient Resources
Impact	This affects system heat dissipation, causing the system to overheat and automatically shut down.
Cause	The fan is invalid or is absent.
Recommended action	1. Re-install the removed fans. 2. If a fan status sensor reports an error, replace the faulty fan. 3. Remove and re-install the fans, and make sure the fans are in good contact. 4. If the issue persists, contact Technical Support.

Transition to Degraded

Event code	0x04600014
Message text	Transition to Degraded.
Variable fields	N/A
Severity level	Major
Example	Transition to Degraded
Impact	This affects system heat dissipation and reduces the performance of the system board components.
Cause	The fan speed is abnormal.
Recommended action	1. Log in to HDM and view the fan speed. If the speed is low, a fan might have aged. If the speed is almost zero, a fan might be blocked or have failed. 2. Verify that the fans are not blocked. 3. If a fan status sensor reports an error, replace the faulty fan. 4. Replace the aged fans. 5. If the issue persists, contact Technical Support.

Install Error

Event code	0x04800014
Message text	Install Error.
Variable fields	N/A
Severity level	Minor
Example	Install Error
Impact	The system might fail to be powered on.
Cause	The fan was incorrectly installed.
Recommended action	1. Verify that the fans are installed as instructed. For more information about the installation principles, see the user guide for the server. 2. If the issue persists, contact Technical Support.
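
The fan event codes and severity levels listed above can be collected into a lookup table for use in monitoring scripts. The following Python sketch is an illustration only. The names FAN_EVENTS and describe_fan_event are hypothetical, and the table contains only the codes documented in this section.

```python
# Fan event codes, message texts, and severity levels copied from the entries above.
FAN_EVENTS = {
    0x04000014: ("Transition to Running", "Info"),
    0x04000017: ("Fully Redundant", "Major"),
    0x04300016: ("Non-redundant:Sufficient Resources from Redundant", "Major"),
    0x04400014: ("Transition to Off Line", "Info"),
    0x04500016: ("Non-redundant:Insufficient Resources", "Major"),
    0x04600014: ("Transition to Degraded", "Major"),
    0x04800014: ("Install Error", "Minor"),
}


def describe_fan_event(code):
    """Return (message, severity) for a fan event code, or None if unknown.

    The code may be an int or a hexadecimal string such as "0x04600014".
    """
    if isinstance(code, str):
        code = int(code, 16)
    return FAN_EVENTS.get(code)


if __name__ == "__main__":
    print(describe_fan_event("0x04600014"))
```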

Cooling device

Liquid Cooler is not present

Event code	0x0a5000de
Message text	Liquid Cooler is not present.
Variable fields	N/A
Severity level	Minor
Example	Liquid Cooler is not present
Impact	This affects system heat dissipation and system performance.
Cause	The liquid-cooled module is installed incorrectly. This message is available only for liquid-cooled servers.
Recommended action	1. Verify that the liquid-cooled module is present. 2. Verify that the liquid leakage sensor is installed correctly. 3. Replace the liquid-cooled module. 4. If the issue persists, contact Technical Support.

Liquid Cooler is leakage

Event code	0x0a6000de
Message text	Liquid Cooler is leakage.
Variable fields	N/A
Severity level	Critical
Example	Liquid Cooler is leakage
Impact	This might cause system crashes.
Cause	A liquid leak occurred in the liquid-cooled module. This message is available only for liquid-cooled servers.
Recommended action	1. Verify that the liquid-cooled module is operating correctly and liquid leakage does not occur. 2. Replace the liquid-cooled module. 3. If the issue persists, contact Technical Support.

Liquid Cooler is leakage

Event code	0x0a7000de
Message text	Liquid Cooler is leakage.
Variable fields	N/A
Severity level	Critical
Example	Liquid Cooler is leakage
Impact	This might cause system crashes.
Cause	A liquid leak occurred in the liquid-cooled module. This message is available only for liquid-cooled servers.
Recommended action	1. Verify that the liquid-cooled module is operating correctly and liquid leakage does not occur. 2. Replace the liquid-cooled module. 3. If the issue persists, contact Technical Support.

Physical security

General Chassis Intrusion

Event code	0x050000de
Message text	General Chassis Intrusion.
Variable fields	N/A
Severity level	Minor
Example	General Chassis Intrusion
Impact	No negative impact.
Cause	The access panel was removed from the server.
Recommended action	1. Verify that the access panel was removed. 2. Verify that the access panel is installed correctly. 3. Verify that the chassis-open alarm module has a good contact with the chassis ear. 4. If the issue persists, contact Technical Support.

LAN Leash Lost

Event code	0x054000de
Message text	LAN Leash Lost.
Variable fields	N/A
Severity level	Info
Example	LAN Leash Lost
Impact	No negative impact.
Cause	BMC NCSI channel detection found that the network is disconnected at the physical layer.
Recommended action	1. Verify that an Ethernet adapter was disabled in the OS. 2. Verify that the message was reported on a power-on or power-off operation. 3. Verify that the Ethernet cable is connected correctly to the shared network port. 4. Disable the shared network port if the shared network port is not necessary. 5. If the issue persists, contact Technical Support.

Processor

IERR

Event code	0x070000de
Message text	Intel: $1 $2 err---Socket $3 AMD: GMI/xGMI err---Socket$1 Die$2 LinkID$3
Variable fields	Intel: $1: Signal type. Options include MSMI and CATERR. $2: Error type. Options include IERR and MCERR. $3: CPU number. AMD: $1: CPU number. $2: Die number. $3: Link number.
Severity level	Critical
Example	Intel: CATERR IERR err---Socket 1 AMD: GMI/xGMI err---Socket1 Die1 LinkID1
Impact	This can cause system crashes. By default, the system will then automatically restart.
Cause	A processor internal error, such as a Package Control Unit (PCU) uncorrectable error, occurred.
Recommended action	1. Upgrade the BIOS and HDM firmware to the latest version. 2. Review logs to troubleshoot the issue as instructed. 3. If the issue persists, contact Technical Support.
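
The IERR message uses a different layout on Intel and AMD platforms. As an illustration only, the following Python sketch tries both layouts in turn. It assumes that the "Intel:" and "AMD:" labels in the table above identify the platform and are not part of the logged text. The function name parse_ierr is hypothetical.

```python
import re

# Intel layout: "$1 $2 err---Socket $3"
INTEL_RE = re.compile(
    r"^(?P<signal>MSMI|CATERR) (?P<error>IERR|MCERR) err---Socket\s*(?P<socket>\d+)$"
)
# AMD layout: "GMI/xGMI err---Socket$1 Die$2 LinkID$3"
AMD_RE = re.compile(
    r"^GMI/xGMI err---Socket\s*(?P<socket>\d+) Die\s*(?P<die>\d+) LinkID\s*(?P<link>\d+)$"
)


def parse_ierr(text):
    """Return a dict of the variable fields, or None if neither layout matches."""
    text = text.strip()
    match = INTEL_RE.match(text)
    if match:
        return {
            "vendor": "Intel",
            "signal": match["signal"],
            "error": match["error"],
            "socket": int(match["socket"]),
        }
    match = AMD_RE.match(text)
    if match:
        return {
            "vendor": "AMD",
            "socket": int(match["socket"]),
            "die": int(match["die"]),
            "link": int(match["link"]),
        }
    return None


if __name__ == "__main__":
    print(parse_ierr("CATERR IERR err---Socket 1"))
    print(parse_ierr("GMI/xGMI err---Socket1 Die1 LinkID1"))
```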

State Asserted

Event code	0x07100006
Message text	State Asserted.
Variable fields	N/A
Severity level	Major
Example	State Asserted
Impact	This might cause system crashes.
Cause	A processor was overheated.
Recommended action	1. Log in to HDM and verify that the fans are operating correctly. 2. If low-speed alarms are present, re-install or replace the faulty fans. 3. View the resource summary to identify system service loads. If the system is overloaded, close noncritical services to reduce service loads. 4. Verify that the ambient temperature of the server is within the normal operation range. 5. Verify that the air inlets and outlets are not blocked. 6. Power off the server, and verify that the processor heatsink makes good contact with the processor. Apply thermal grease to the heatsink, re-install the heatsink, and power on the server. 7. If the issue persists, contact Technical Support.

CPU Critical Temperature

Event code	0x071000de
Message text	CPU Critical Temperature.
Variable fields	N/A
Severity level	Critical
Example	CPU Critical Temperature
Impact	This might cause system crashes.
Cause	The temperature of a processor exceeded the critical overtemperature alarm threshold.
Recommended action	1. Log in to HDM and verify that the fans are operating correctly. 2. If a low-speed alarm is present, re-install or replace the faulty fan. 3. View the resource summary to identify system service loads. If the system is overloaded, close noncritical services to reduce service loads. 4. Verify that the temperature in the equipment room is within the normal range. 5. Verify that the air inlets and outlets are not blocked. 6. Power off the server, and verify that the processor heatsink makes good contact with the processor. Apply thermal grease to the heatsink, re-install the heatsink, and power on the server. 7. If the issue persists, contact Technical Support.

Thermal Trip

Event code	0x071000de
Message text	Thermal Trip
Variable fields	N/A
Severity level	Critical
Example	Thermal Trip
Impact	This might cause system crashes.
Cause	A processor was overheated, which might cause system power-off.
Recommended action	1. Log in to HDM and verify that all fans are operating correctly. 2. If a low-speed alarm is present, re-install or replace the faulty fan. 3. View the resource summary to identify system service loads. If the system is overloaded, close noncritical services to reduce service loads. 4. Verify that the temperature in the equipment room is within the normal range. 5. Verify that the air inlets and outlets are not blocked. 6. Power off the server, and verify that the processor heatsink makes good contact with the processor. Apply thermal grease to the heatsink, re-install the heatsink, and power on the server. 7. If the issue persists, contact Technical Support.

FRB1/BIST failure

Event code	0x072000de
Message text	FRB1/BIST failure
Variable fields	N/A
Severity level	Minor
Example	FRB1/BIST failure
Impact	The operating system might fail to start up correctly, and hardware might be downgraded.
Cause	The processor core BIST failed.
Recommended action	1. Power off and power on the server to clear the alarm. 2. If the issue persists, replace the processors. 3. If the issue persists, contact Technical Support.

Processor Presence detected

Event code	0x077000df
Message text	Processor Presence detected.
Variable fields	N/A
Severity level	Info/Critical
Example	Processor Presence detected
Impact	The system cannot start up if the primary processor is absent.
Cause	The system detected the absence or misinstallation of the primary processor.
Recommended action	1. Verify that the primary processor is installed correctly. 2. Replace the faulty primary processor. 3. If the issue persists, contact Technical Support.

Processor Automatically Throttled

Event code	0x07a000de
Message text	Processor Automatically Throttled---due to fan error.
Variable fields	N/A
Severity level	Minor
Example	Processor Automatically Throttled---due to fan error
Impact	CPU underclocking causes a decrease in system performance.
Cause	The processor was underclocked because a fan failed.
Recommended action	1. Verify that the heat dissipation setting meets the requirements of the running services. 2. Verify that the temperature in the equipment room is within the normal range and the air inlet and outlet are not blocked. 3. Verify that the fans are not blocked and are operating correctly. 4. Replace the faulty fans. 5. If the issue persists, contact Technical Support.

Machine Check Exception

Event code	0x07b000de
Message text	Machine Check Exception---$1---$2---Location: Socket:$3
Variable fields	$1: Error type. $2: Indicates whether the error occurred during this system boot. Options include Current Boot Error and Last Boot Error. $3: CPU number.
Severity level	Critical
Example	Machine Check Exception---SMN---Last Boot Error---Location: Socket:1
Impact	The system stops responding.
Cause	An uncorrectable error occurred.
Recommended action	1. Upgrade the BIOS and HDM firmware to the latest version. 2. Review event logs to locate the failed processor or other components. 3. Reboot the device. 4. Verify that the processor and memories are operating correctly. 5. If the issue persists, contact Technical Support.
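
The Machine Check Exception message separates its three variable fields with "---". As an illustration only, the following Python sketch extracts the error type ($1), the boot indicator ($2), and the CPU number ($3). The function name parse_machine_check_exception is hypothetical. The sketch assumes that the boot indicator is always one of the two documented options.

```python
import re

# Mirrors "Machine Check Exception---$1---$2---Location: Socket:$3".
# Assumption: $2 is exactly "Current Boot Error" or "Last Boot Error".
MCE_RE = re.compile(
    r"^Machine Check Exception---(?P<error_type>.+)"
    r"---(?P<boot>Current Boot Error|Last Boot Error)"
    r"---Location: Socket:(?P<socket>\d+)$"
)


def parse_machine_check_exception(text):
    """Return a dict of the variable fields, or None if the text does not match."""
    match = MCE_RE.match(text.strip())
    if match is None:
        return None
    return {
        "error_type": match["error_type"],
        "current_boot": match["boot"] == "Current Boot Error",
        "socket": int(match["socket"]),
    }


if __name__ == "__main__":
    sample = "Machine Check Exception---SMN---Last Boot Error---Location: Socket:1"
    print(parse_machine_check_exception(sample))
```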

triggered an uncorrectable error

Event code	0x07b000de
Message text	CPU $1 triggered an uncorrectable error.
Variable fields	$1: CPU number.
Severity level	Critical
Example	CPU 1 triggered an uncorrectable error.
Impact	The system stops responding.
Cause	An IERR or MCERR error occurred. The BMC diagnostic result is "CPU uncorrectable error."
Recommended action	1. Upgrade the BIOS and HDM firmware to the latest version. 2. Review event logs to locate the failed memory, PCIe module, or processor. 3. Power off the server, and replace the failed component. 4. Replace the system board. 5. If the issue persists, contact Technical Support.

Machine Check Error

Event code	0x07b100de
Message text	Machine Check Error ---Location: Processor:$1 ---IIO Stack number:$2 ---$3---$4
Variable fields	$1: CPU number. $2: IIO Stack number (IIO port number). $3: Current boot or last boot. $4: Error type.
Severity level	Critical
Example	Machine Check Exception---Location: Processor:1 ---IIO Stack number:1 --Last Boot---ITC Error:ECC uncorrectable error in the ITC dat_dword RF
Impact	The system stops responding.
Cause	Internal uncorrectable errors were detected on the processor, such as VT-d errors, ITC errors, OTC errors, DMA errors, IRP errors, and Ring errors. This error also triggers other alarms.
Recommended action	1. Review event logs. 2. If the issue persists, contact Technical Support.

Machine Check Error---CPU core errors

Event code	0x07b150de
Message text	Machine Check Error ---CPU core errors --- ErrorType:$1---Location: Processor:$2 core MCA bank: $3
Variable fields	$1: General error type. $2: CPU number. $3: Error type.
Severity level	Critical
Example	Machine Check Exception---CPU core errors--ErrorType:Unknow--Fatal Error--Last Boot---Location: Processor:1 core MCA bank: instruction fetch unit
Impact	The system stops responding.
Cause	Internal uncorrectable errors were detected on the processor, such as CPU core errors.
Recommended action	1. Review event logs. 2. If the issue persists, contact Technical Support.

triggered a correctable error

Event code	0x07c000de
Message text	CPU $1 triggered a correctable error.
Variable fields	$1: CPU number.
Severity level	Minor
Example	CPU 1 triggered a correctable error.
Impact	No negative impact.
Cause	An IERR or MCERR error occurred. The BMC diagnostic result is "CPU correctable error."
Recommended action	1. Upgrade the BIOS and HDM firmware to the latest version. 2. Review event logs to locate the failed CPU or other components. 3. Power off the server, and replace the failed component. 4. Replace the system board. 5. If the issue persists, contact Technical Support.

Correctable Machine Check Error

Event code	0x07c100de
Message text	Correctable Machine Check Error ---location: Processor:$1 ---IIO Stack number:$2 ---$3---$4
Variable fields	$1: CPU number. $2: IIO Stack number. $3: Current boot or last boot. $4: Error type.
Severity level	Minor
Example	Correctable Machine Check Error---Location: Processor:1 ---IIO Stack number:1 --Last Boot---DMA Error:Descriptor Count Error
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as VT-d errors, ITC errors, OTC errors, DMA errors, IRP errors, and Ring errors.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.

Correctable Machine Check Error---CPU UPI errors

Event code	0x07c110de
Message text	Correctable Machine Check Error ---CPU UPI errors ---Location: Processor:$1 UPI port number:$2
Variable fields	$1: CPU number. $2: UPI port.
Severity level	Minor
Example	Correctable Machine Check Error---CPU UPI errors---Location: Processor:2 UPI port number:0x1
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as CPU UPI errors.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.
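
In the CPU UPI errors message, the CPU number is decimal and the UPI port number is hexadecimal (0x1 in the example above). As an illustration only, the following Python sketch converts both fields to integers. The function name parse_cpu_upi_error is hypothetical. The pattern tolerates optional spaces around "---" because the message text and the example above differ in spacing.

```python
import re

# Mirrors "Correctable Machine Check Error ---CPU UPI errors
# ---Location: Processor:$1 UPI port number:$2".
# Assumption: the port number always carries a 0x prefix, as in the example.
UPI_RE = re.compile(
    r"^Correctable Machine Check Error\s*---\s*CPU UPI errors\s*---\s*"
    r"Location: Processor:(?P<cpu>\d+) UPI port number:(?P<port>0x[0-9a-fA-F]+)$"
)


def parse_cpu_upi_error(text):
    """Return (cpu_number, upi_port) as integers, or None if the text does not match."""
    match = UPI_RE.match(text.strip())
    if match is None:
        return None
    return int(match["cpu"]), int(match["port"], 16)


if __name__ == "__main__":
    sample = (
        "Correctable Machine Check Error---CPU UPI errors"
        "---Location: Processor:2 UPI port number:0x1"
    )
    print(parse_cpu_upi_error(sample))
```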

Correctable Machine Check Error---IOH UPI errors

Event code	0x07c120de
Message text	Correctable Machine Check Error ---IOH UPI errors ---Location: Processor:$1 UPI port number:$2 ---Coherent interface (IRP) local group error code:$3
Variable fields	$1: CPU number. $2: UPI port. $3: Error code.
Severity level	Minor
Example	Correctable Machine Check Error---IOH UPI errors---Location: Processor:1 UPI port number:0x1---Coherent interface (IRP) local group error code:0x6
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as IOH UPI errors.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.

Correctable Machine Check Error---IOH core errors

Event code	0x07c130de
Message text	Correctable Machine Check Error ---IOH core errors ---Location:Processor:$1 ---IIO core local group error code:$2
Variable fields	$1: CPU number. $2: Error code.
Severity level	Minor
Example	Correctable Machine Check Error---IOH core errors---Location: Processor:2---IIO core local group error code:0x6
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as IOH core errors.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.

Correctable Machine Check Error---VT-d errors

Event code	0x07c140de
Message text	Correctable Machine Check Error ---VT-d errors ---Location: Processor:$1 ---VT-d local group error code:$2
Variable fields	$1: CPU number. $2: Error code.
Severity level	Minor
Example	Correctable Machine Check Error---VT-d errors---Location: Processor:2---VT-d local group error code:0x6
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as VT-d errors.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.

Correctable Machine Check Error---CPU core errors

Event code	0x07c150de
Message text	Correctable Machine Check Error ---CPU core errors ---ErrorType:$1 ---Location: Processor:$2 core MCA bank: $3
Variable fields	$1: General error type. $2: CPU number. $3: Error type.
Severity level	Minor
Example	Correctable Machine Check Error---CPU core errors--ErrorType:Unknow--Current Boot---Location: Processor:2 core MCA bank: mid level cache
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as CPU core errors.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.

Correctable Machine Check Error---Cbo error

Event code	0x07c160de
Message text	Correctable Machine Check Error ---Cbo error--location: CPU core ID:$1 thread ID:$2 caching agent MCA bank: Cbo$3
Variable fields	$1: Core number. $2: Thread number. $3: Cbo number.
Severity level	Minor
Example	Correctable Machine Check Error---Cbo error---Location: CPU core ID:0x0 thread ID:0x0 caching agent MCA bank: Cbo0
Impact	No negative impact.
Cause	Internal correctable errors were detected on the processor, such as Cbo error.
Recommended action	1. Review event logs and troubleshoot the present errors. 2. If the issue persists, contact Technical Support.

Configuration Error---System is operating in KTI Link Slow Speed Mode

Event code	0x075d7010
Message text	Configuration Error---System is operating in KTI Link Slow Speed Mode- Location:CPU:$1
Variable fields	$1: Core number.
Severity level	Minor
Example	Configuration Error---System is operating in KTI Link Slow Speed Mode- Location:CPU:1
Impact	No negative impact.
Cause	The system is operating in Keizer Technology Interconnect (KTI) low speed mode.
Recommended action	1. Verify that the processors are installed correctly as instructed. For more information about the installation principles, see the user guide for the server. 2. If the issue persists, contact Technical Support.

Power supply

Fully Redundant

Event code	0x08000016
Message text	Fully Redundant
Variable fields	N/A
Severity level	Info
Example	Fully Redundant
Impact	No negative impact.
Cause	The power supplies are fully redundant.
Recommended action	No action is required.

Fully Redundant

Event code	0x08100017
Message text	Fully Redundant
Variable fields	N/A
Severity level	Major
Example	Fully Redundant
Impact	Power redundancy failure reduces the reliability of device power supply.
Cause	The power supply redundancy was lost.
Recommended action	1. Verify that the environment is normal. 2. Verify that no power supply is removed. 3. Verify that the power supplies have good contacts with the power cords. 4. Verify that all power supplies are operating correctly. 5. If the issue persists, contact Technical Support.

Presence detected

Event code	0x080000df
Message text	Presence detected.
Variable fields	N/A
Severity level	Info
Example	Presence detected
Impact	No negative impact.
Cause	0x080000de: This event is triggered when a power supply is inserted, indicating a transition from absent to present. 0x080000df: This event is cleared when a power supply is removed, indicating a transition from present to absent.
Recommended action	1. Verify that the power module is not removed. 2. Verify that the power module is installed correctly. 3. If the issue persists, contact Technical Support.

Redundancy Lost

Event code	0x08100016
Message text	Redundancy Lost.
Variable fields	N/A
Severity level	Major
Example	Redundancy Lost
Impact	Power redundancy failure reduces the reliability of device power supply.
Cause	The power supply redundancy was lost.
Recommended action	1. Verify that the environment is normal. 2. Verify that no power supply is removed. 3. Verify that the power supplies have good contacts with the power cords. 4. Verify that all power supplies are operating correctly. 5. If the issue persists, contact Technical Support.

Power Supply Failure detected

Event code	0x081000de
Message text	Power Supply Failure detected.
Variable fields	N/A
Severity level	Major
Example	Power Supply Failure detected
Impact	This affects the system power supply and might result in abnormal system power-off.
Cause	A power supply fault was detected.
Recommended action	1. Verify that the power supply fans are operating correctly. 2. Re-install the power supplies. 3. Verify that the input voltage of the power supply is normal. 4. Replace the faulty power supply. 5. If the issue persists, contact Technical Support.

Power Supply Predictive Failure---PSU Self Check Failed

Event code	0x082000de
Message text	Power Supply Predictive Failure---PSU Self Check Failed---Id: $1
Variable fields	$1: Number of a power supply.
Severity level	Minor
Example	Power Supply Predictive Failure---PSU Self Check Failed---Id: 1
Impact	The power supply may have malfunctions that affect system power supply.
Cause	Power supply self-check failed.
Recommended action	1. Verify that the power supply LED is operating correctly. 2. Verify that the power supply fans are operating correctly. 3. Verify that the power supply is compatible with the server. 4. If the issue persists, contact Technical Support.

Power Supply Predictive Failure

Event code	0x082000de
Message text	Power Supply Predictive Failure.
Variable fields	N/A
Severity level	Minor
Example	Power Supply Predictive Failure
Impact	The power supply may have malfunctions that affect system power supply.
Cause	A predictive power supply fault is detected.
Recommended action	1. Verify that the power supply LED is operating correctly. 2. Verify that the power supply fans are operating correctly. 3. Verify that the input voltage of the power supply is normal. 4. If the issue persists, contact Technical Support.

Power Supply input lost (AC/DC)

Event code	0x083000de
Message text	Power Supply input lost (AC/DC).
Variable fields	N/A
Severity level	Major
Example	Power Supply input lost (AC/DC)
Impact	This might cause the server to power off abnormally.
Cause	The AC power cable of the power supply is unplugged or the AC input is abnormal.
Recommended action	1. Verify that all power cords are not damaged and are correctly connected. 2. Verify that all power supplies are correctly installed. 3. Verify that the power supply fans are operating correctly. 4. Verify that the power input is normal. 5. If the issue persists, contact Technical Support.

Power Supply input lost or out-of-range

Event code	0x084000de
Message text	Power Supply input lost or out-of-range.
Variable fields	N/A
Severity level	Major
Example	Power Supply input lost or out-of-range
Impact	This might cause the server to be powered off abnormally.
Cause	The input voltage of the power supply exceeded the rated range.
Recommended action	1. Verify that the power supply has not been cut off manually. 2. Verify that the input voltage of the power supply is normal. 3. Verify that the power cords and power modules are installed correctly. 4. Re-install the power supplies. Make sure the power supplies have a good contact. 5. Verify that the power supply fans are operating correctly. 6. If the issue persists, contact Technical Support.

Power Supply input out-of-range - but present

Event code	0x085000de
Message text	Power Supply input out-of-range - but present.
Variable fields	N/A
Severity level	Major
Example	Power Supply input out-of-range - but present
Impact	Abnormal power input beyond the supported range might cause the server to be powered off.
Cause	The input voltage was too high.
Recommended action	1. Verify that the input voltage of the power supply is normal. 2. Verify that the power cords and power modules are installed correctly. 3. Re-install the power supplies. Make sure the power supplies have a good contact. 4. Verify that the power supply fans are operating correctly. 5. If the issue persists, contact Technical Support.

Configuration error ---Vendor mismatch

Event code	0x086000de
Message text	Configuration error ---Vendor mismatch.
Variable fields	N/A
Severity level	Minor
Example	Configuration error ---Vendor mismatch
Impact	An unknown risk exists due to the use of non-originally certified components.
Cause	Non-originally certified power supplies are installed.
Recommended action	1. Verify that all power supplies are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---Power supply rating mismatch

Event code	0x086000de
Message text	Configuration error---Power supply rating mismatch:PSU$1,POUT:$2
Variable fields	$1: Power supply ID. $2: Output power of the power supply.
Severity level	Minor
Example	Configuration error---Power supply rating mismatch:PSU1,POUT:2000
Impact	This might result in unstable power supply and abnormal system shutdown.
Cause	Originally certified power supplies are installed, but the models of the two power supplies do not match.
Recommended action	1. If the rated power of the installed power supplies is consistent, remove and install the power supplies in sequence. 2. If the rated power of the installed power supplies is inconsistent, replace the power supplies to make sure they are of the same rated power. 3. If the issue persists, contact Technical Support.

Exceeded the upper minor threshold

Event code	0x08700002
Message text	Exceeded the upper minor threshold. ---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading. $2: Total power alarm threshold.
Severity level	Minor
Example	Exceeded the upper minor threshold.---Current reading:2030---Threshold reading:493
Impact	The power exceeded the upper limit and will cause system shutdown.
Cause	The power exceeded the threshold.
Recommended action	1. Log in to HDM and verify that the alarm threshold is appropriate. 2. Log in to HDM and verify that the total power is not too high. 3. Verify that the total power of the power supplies can meet service requirements. 4. If the issue persists, contact Technical Support.

Power Supply Inactive/standby state

Event code	0x087000df
Message text	Power Supply Inactive/standby state.
Variable fields	N/A
Severity level	Info
Example	Power Supply Inactive/standby state
Impact	No negative impact.
Cause	A power supply exited the cold standby state. If power redundancy is configured, a standby power supply automatically exits the cold standby state and supplies power to the server when the system power consumption is too high.
Recommended action	1. Log in to HDM and verify whether the total power of the server is too high. 2. If the issue persists, contact Technical Support.

Interlock Power Down

Event code	0x093000de
Message text	Interlock Power Down
Variable fields	N/A
Severity level	Critical
Example	Interlock Power Down
Impact	This might lead to a system crash.
Cause	Fluctuations in the power grid can cause an AC momentary interruption.
Recommended action	1. Verify that the external power supply environment of the server is in normal state. 2. Press and hold the power button until the UID LED stops flashing. 3. If the issue persists, contact Technical Support.

Power Supply Pwok abnormal

Event code	0x08a000de
Message text	Power Supply Pwok abnormal
Variable fields	N/A
Severity level	Major
Example	Power Supply Pwok abnormal
Impact	This might affect the system power supply and eventually lead to system crashes.
Cause	Power output is normal but the Pwok signal on the system board is abnormal and the health LED is on.
Recommended action	1. Verify that the power input is correct. 2. Verify that the system board is operating correctly. 3. Verify that the power supplies are connected correctly to the system board. 4. If the issue persists, contact Technical Support.

Power limit is exceeded over correction time limit

Event code	0x095000de
Message text	Power limit is exceeded over correction time limit---Current Power: $1W.
Variable fields	$1: Current power threshold value.
Severity level	Minor
Example	Power limit is exceeded over correction time limit---Current Power: 2000W.
Impact	The specified action will be taken if power capping fails.
Cause	This alarm is generated if the power exceeds the cap value for a period of time.
Recommended action	1. Adjust the power cap value or the server work load. 2. If the issue persists, contact Technical Support.

Power limit is exceeded over correction time limit

Event code	0x095010de
Message text	Power limit is exceeded over correction time limit---GPU Current Power: $1W.
Variable fields	$1: Configured power threshold value.
Severity level	Minor
Example	Power limit is exceeded over correction time limit---GPU Current Power: 2000W.
Impact	The specified action will be taken if power capping fails.
Cause	This alarm is generated if the power exceeds the cap value for a period of time.
Recommended action	1. Adjust the power cap value or the GPU work load. 2. If the issue persists, contact Technical Support.
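
The two power limit messages above differ only in the optional "GPU" prefix before "Current Power". As an illustration only, the following Python sketch tells the two apart and extracts the power value ($1) in watts. The function name parse_power_cap_alarm and the scope labels "system" and "gpu" are hypothetical.

```python
import re

# Mirrors "Power limit is exceeded over correction time limit---Current Power: $1W."
# and the variant with "GPU Current Power". The trailing period is optional here.
POWER_CAP_RE = re.compile(
    r"^Power limit is exceeded over correction time limit"
    r"---(?P<gpu>GPU )?Current Power: (?P<value>\d+)W\.?$"
)


def parse_power_cap_alarm(text):
    """Return (scope, watts), or None if the text does not match.

    scope is "gpu" for the GPU variant and "system" otherwise (labels are assumptions).
    """
    match = POWER_CAP_RE.match(text.strip())
    if match is None:
        return None
    scope = "gpu" if match["gpu"] else "system"
    return scope, int(match["value"])


if __name__ == "__main__":
    print(parse_power_cap_alarm(
        "Power limit is exceeded over correction time limit---GPU Current Power: 2000W."
    ))
```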

Memory

Correctable ECC or other correctable memory error

Event code	0x0c0000de
Message text	Correctable ECC or other correctable memory error--$1-Location:CPU:$2 MEM CTRL:$3 CH:$4 DIMM:$5 $6
Variable fields	$1: Indicates whether the error occurred during this system boot. Options include Current Boot Error and Last Boot Error. $2: CPU number. $3: Memory controller number. $4: Channel number. $5: DIMM number. $6: DIMM mark.
Severity level	Minor
Example	Correctable ECC or other correctable memory error---Current Boot Error-Location:CPU:1 MEM CTRL:1 CH:1 DIMM:0 A1
Impact	No negative impact.
Cause	A correctable memory error occurred.
Recommended action	No action is required.
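
For scripted log analysis, a message in this format can be split into its variable fields. The following Python sketch is illustrative only: the function name `parse_memory_location` and the regular expression are assumptions based on the message text and example in this entry, not an HDM interface. The pattern accepts any run of hyphens before the boot-error field.

```python
import re

# Hypothetical helper, not part of HDM. The pattern follows the documented
# layout: <text>-<boot field>-Location:CPU:n MEM CTRL:n CH:n DIMM:n <mark>
_MEM_LOC = re.compile(
    r"-+(?P<boot>Current Boot Error|Last Boot Error)"
    r"-Location:CPU:(?P<cpu>\d+) MEM CTRL:(?P<mem_ctrl>\d+)"
    r" CH:(?P<channel>\d+) DIMM:(?P<dimm>\d+) (?P<mark>\S+)$"
)


def parse_memory_location(message):
    """Return the variable fields of a memory error message, or None."""
    match = _MEM_LOC.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric location fields from strings to integers.
    for key in ("cpu", "mem_ctrl", "channel", "dimm"):
        fields[key] = int(fields[key])
    return fields
```

The same pattern should also apply to the other entries in this section that end with the `MEM CTRL` location layout, because only the leading text differs.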

CPU triggered a correctable error

Event code	0x0c0000de
Message text	CPU $1 $2 triggered a correctable error
Variable fields	$1: CPU number. $2: DIMM number.
Severity level	Minor
Example	CPU 1 A0 triggered a correctable error
Impact	No negative impact.
Cause	An IERR or MCERR error was triggered. The error was identified by HDM as a correctable error.
Recommended action	No action is required.

Uncorrectable ECC or other uncorrectable memory error

Event code	0x0c1000de
Message text	Uncorrectable ECC or other uncorrectable memory error---$1-Location:CPU:$2 MEM CTRL:$3 CH:$4 DIMM:$5 $6
Variable fields	$1: Indicates whether the error occurred during this system boot. Options include: · Current Boot Error. · Last Boot Error. $2: CPU number. $3: Memory controller number. $4: Channel number. $5: DIMM number. $6: DIMM mark.
Severity level	Major
Example	Uncorrectable ECC or other uncorrectable memory error---Current Boot Error-Location:CPU:1 MEM CTRL:1 CH:1 DIMM:0 A1
Impact	It can cause the system to stop sending responses, unless the memory is in certain RAS modes, such as mirror or MCA recovery.
Cause	A non-correctable (multiple bit flip) ECC error has occurred.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

triggered an uncorrectable error

Event code	0x0c1000de
Message text	CPU$1 $2 triggered an uncorrectable error
Variable fields	$1: CPU number. $2: DIMM number.
Severity level	Major
Example	CPU1 A0 triggered an uncorrectable error
Impact	It can cause the system to restart or stop sending responses.
Cause	An IERR or MCERR error was triggered. The error was identified by the BMC as an uncorrectable memory error.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the CPU socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity

Event code	0x0c2000de
Message text	Parity---$1-Location:CPU:$2 MEM CTRL:$3 CH:$4 DIMM:$5 $6
Variable fields	$1: Indicates whether the error occurred during this system boot. Options include: · Current Boot Error. · Last Boot Error. $2: CPU number. $3: Memory controller number. $4: Channel number. $5: DIMM number. $6: DIMM mark.
Severity level	Minor
Example	Parity---Current Boot Error-Location:CPU:1 MEM CTRL:1 CH:1 DIMM:0 A0
Impact	No negative impact.
Cause	This error message is generated when a failure occurs in data parity on the command/address lines while the system is reading the memory cell data, resulting in abnormal data access to the memory.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory Training Faulty Part Tracking Uncorrectable Error

Event code	0x0c201310
Message text	Parity---Memory Training Faulty Part Tracking Uncorrectable Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Training Faulty Part Tracking Uncorrectable Error-Location:CPU:2 CH:1 DIMM:B1 Rank:0
Impact	No negative impact.
Cause	A Faulty Parts Tracking error occurred because of an uncorrectable error.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.
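
The messages that end with a `Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4` suffix share one layout, so a single pattern can separate the leading text from the location fields. The Python sketch below is a minimal illustration under that assumption. The function name `parse_rank_location` and the field names `sensor` and `description` are hypothetical labels chosen for this example and are not defined by HDM.

```python
import re

# Hypothetical helper, not part of HDM. Layout assumed from the documented
# message text: <sensor>---<description>-Location:CPU:n CH:n DIMM:x Rank:n
_RANK_LOC = re.compile(
    r"^(?P<sensor>[^-]+)---(?P<description>.+)"
    r"-Location:CPU:(?P<cpu>\d+) CH:(?P<channel>\d+)"
    r" DIMM:(?P<dimm>\S+) Rank:(?P<rank>\d+)$"
)


def parse_rank_location(message):
    """Split a rank-level memory message into its parts, or return None."""
    match = _RANK_LOC.match(message)
    if match is None:
        return None
    fields = match.groupdict()
    # The DIMM field is kept as a string because the examples show values
    # such as "A1" and "B1".
    for key in ("cpu", "channel", "rank"):
        fields[key] = int(fields[key])
    return fields
```

Grouping parsed results by the CPU, channel, and DIMM fields is one way to see whether repeated training errors point at the same DIMM before replacing it.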

Parity---Memory Receive Enable Training Error

Event code	0x0c204140
Message text	Parity---Memory Receive Enable Training Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Receive Enable Training Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	A Faulty Parts Tracking error occurred because memory receive enable training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory Write Leveling Training Error

Event code	0x0c205150
Message text	Parity---Memory Write Leveling Training Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Write Leveling Training Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	A Faulty Parts Tracking error occurred because memory write leveling training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory Write DqDqs Training Error

Event code	0x0c206160
Message text	Parity---Memory Write DqDqs Training Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Write DqDqs Training Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory write Dq and Dqs training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory Sense Amp Training Error

Event code	0x0c2072f0
Message text	Parity---Memory Sense Amp Training Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Sense Amp Training Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Sense Amp Training failed because a voltage input error occurred.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Warning Command Clock Training Error

Event code	0x0c208260
Message text	Parity---Warning Command Clock Training Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Warning Command Clock Training Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	An error occurred for command clock training.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---An uncorrectable error occurs during the memory test phase

Event code	0x0c20b1c0
Message text	Parity---An uncorrectable error occurs during the memory test phase-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---An uncorrectable error occurs during the memory test phase-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	An uncorrectable error occurred during the memory test phase.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Memory Training Error

Event code	0x0c20c290
Message text	Parity---Memory Training Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Training Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	A memory training error occurred on the DIMM during the POST phase.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---The number of correctable memory errors reached the error logging threshold

Event code	0x0c21f010
Message text	Parity---The number of correctable memory errors reached the error logging threshold-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---The number of correctable memory errors reached the error logging threshold-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	The number of correctable memory errors reached the error logging threshold.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---An error occurred on the DIMM slot

Event code	0x0c21f020
Message text	Parity---An error occurred on the DIMM slot-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---An error occurred on the DIMM slot-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	An error occurred on the DIMM slot.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---CMD eye width is too small

Event code	0x0c226010
Message text	Parity---CMD eye width is too small-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---CMD eye width is too small-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	CMD eye width was too small.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---The command is not in the FNv table

Event code	0x0c228000
Message text	Parity---The command is not in the FNv table-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---The command is not in the FNv table-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	The command was not in the FNv table.
Recommended action	1. Update BIOS and DCPMM controller firmware to the latest version. 2. If the issue persists, contact Technical Support.

Parity---CTL is not consistent with clock in timing, and the channel is isolated

Event code	0x0c229020
Message text	Parity---CTL is not consistent with clock in timing, and the channel is isolated-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---CTL is not consistent with clock in timing, and the channel is isolated-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	CTL was inconsistent with Clock in timing and the channel was isolated.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory write flyby failed

Event code	0x0c231000
Message text	Parity---Memory write flyby failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory write flyby failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory write flyby failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Timing error occurred during signal line adjustment for memory write leveling training

Event code	0x0c231010
Message text	Parity---Timing error occurred during signal line adjustment for memory write leveling training-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Timing error occurred during signal line adjustment for memory write leveling training-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Timing error occurred during signal line adjustment for write leveling training.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory read DqDqs training failed

Event code	0x0c231130
Message text	Parity---Memory read DqDqs training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory read DqDqs training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory read Dq and Dqs training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory receive enable training failed

Event code	0x0c231140
Message text	Parity---Memory receive enable training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory receive enable training failed-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory Faulty Parts Tracking failed, causing the memory Receive Enable signal to fail to train the corresponding timing.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory write leveling training failed

Event code	0x0c231150
Message text	Parity---Memory write leveling training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory write leveling training failed-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory write leveling training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory write DqDqs training failed

Event code	0x0c231160
Message text	Parity---Memory write DqDqs training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory write DqDqs training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory write Dq and Dqs training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---An error occurs during memory test, and the rank is disabled

Event code	0x0c2311c0
Message text	Parity---An error occurs during memory test, and the rank is disabled-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---An error occurs during memory test, and the rank is disabled-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	An error occurred during the memory test phase. The rank is disabled.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Failed to find the RxVref for data eye training

Event code	0x0c231250
Message text	Parity---Failed to find the RxVref for data eye training-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Failed to find the RxVref for data eye training-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory parity check failed and LRDIMM RCVEN training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---LRDIMM RCVEN training failed

Event code	0x0c231260
Message text	Parity---LRDIMM RCVEN training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---LRDIMM RCVEN training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	LRDIMM RCVEN training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---RCVEN CYCLE training failed

Event code	0x0c231270
Message text	Parity---RCVEN CYCLE training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---RCVEN CYCLE training failed-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	RCVEN CYCLE training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Read delay training failed

Event code	0x0c231280
Message text	Parity---Read delay training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Read delay training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Read delay training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Memory write leveling training failed

Event code	0x0c231290
Message text	Parity---Memory write leveling training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Memory write leveling training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory parity check error occurred and memory write leveling training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Coarse write leveling training failed

Event code	0x0c2312a0
Message text	Parity---Coarse write leveling training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Coarse write leveling training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory parity check error occurred and coarse write leveling training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---Write delay training failed

Event code	0x0c2312b0
Message text	Parity---Write delay training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Write delay training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Write delay training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---QxCA_CLK_NO_EYE training failed

Event code	0x0c2312c0
Message text	Parity---QxCA_CLK_NO_EYE training failed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---QxCA_CLK_NO_EYE training failed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Memory parity check error occurred and QxCA_CLK_NO_EYE training failed.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Parity---mapped out because failed critical mask test at cold boot

Event code	0x0c28c020
Message text	Parity---mapped out because failed critical mask test at cold boot-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---mapped out because failed critical mask test at cold boot-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	The memory failed the critical mask test during cold boot and was mapped out.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Invalid SPD contents

Event code	0x0c2ed090
Message text	Parity---Invalid SPD contents-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Parity---Invalid SPD contents-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance.
Cause	Invalid SPD contents.
Recommended action	1. Verify that the ambient temperature and humidity are as required. 2. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Verify that the pins in the processor socket are not bent. If any pins are bent, replace the system board. 4. Replace the DIMM. 5. If the issue persists, contact Technical Support.

Memory Device Disabled

Event code	0x0c4000de
Message text	Memory Device Disabled---Location:CPU:$1 Channel:$2 Dimm:$3 $4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: DIMM mark.
Severity level	Major
Example	Memory Device Disabled---Location:CPU:1 Channel:1 Dimm:1 A1
Impact	This might lead to a decrease in system performance.
Cause	The system detected a memory error during startup.
Recommended action	1. Verify if the DIMM is disabled from the BIOS. If yes, enable the DIMM from the BIOS. 2. Verify that the DIMM channel is not faulty. 3. If the issue persists, contact Technical Support.

Memory Device Disabled---the DIMM is disabled

Event code	0x0c40a040
Message text	Memory Device Disabled---The DIMM is disabled-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Major
Example	Memory Device Disabled---The DIMM is disabled-Location:CPU:2 CH:1 DIMM:B1 Rank:1
Impact	This might lead to a decrease in system performance.
Cause	The DIMM is disabled.
Recommended action	1. Verify if the DIMM is disabled from the BIOS. If yes, enable the DIMM from the BIOS. 2. Verify that the DIMM channel is not faulty. 3. If the issue persists, contact Technical Support.

Memory Device Disabled---the rank is disabled

Event code	0x0c40a030
Message text	Memory Device Disabled---The rank is disabled-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Major
Example	Memory Device Disabled---The rank is disabled-Location:CPU:2 CH:1 DIMM:B1 Rank:1
Impact	This might lead to a decrease in system performance.
Cause	A rank was disabled.
Recommended action	1. Verify if the DIMM is disabled from the BIOS. If yes, enable the DIMM from the BIOS. 2. Verify that the DIMM channel is not faulty. 3. If the issue persists, contact Technical Support.
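
For scripts that post-process these logs, the location suffix shared by the Memory Device Disabled messages can be split into its fields with a regular expression. The following is a minimal sketch, assuming the text appears exactly as in the examples above. The function name parse_location and the dictionary keys are illustrative and are not part of HDM.

```python
import re

# Pattern for the "Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4" suffix.
# Assumption: fields are separated by single spaces, as in the examples.
LOCATION_RE = re.compile(
    r"Location:CPU:(?P<cpu>\S+) CH:(?P<channel>\S+) "
    r"DIMM:(?P<dimm>\S+) Rank:(?P<rank>\S+)"
)


def parse_location(message):
    """Return the location fields as a dict of strings, or None if absent."""
    match = LOCATION_RE.search(message)
    return match.groupdict() if match else None


example = ("Memory Device Disabled---The rank is disabled-"
           "Location:CPU:2 CH:1 DIMM:B1 Rank:1")
print(parse_location(example))
```

Values are kept as strings because the DIMM field can be alphanumeric (for example, B1).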

Memory Device Disabled---Pmem Media disabled

Event code	0x0c484030
Message text	Memory Device Disabled---Pmem Media disabled-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Major
Example	Memory Device Disabled---Pmem Media disabled-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might lead to a decrease in system performance. If critical system components exist in PMem, the system might fail to function properly.
Cause	An error was detected during PMem initialization, which disabled the PMem media. The PMem can still be reached and managed in-band, but it is non-functional and its data cannot be accessed.
Recommended action	1. Replace the faulty DIMM. 2. If the issue persists, contact Technical Support.

Correctable ECC or other memory error limit reached

Event code	0x0c5000de
Message text	Correctable ECC or other memory error limit reached---$1-Location:CPU:$2 MEM CTRL:$3 CH:$4 DIMM:$5 $6
Variable fields	$1: Indicates whether the error occurred during this system boot. Options include: · Current Boot Error. · Last Boot Error. $2: CPU number. $3: Memory controller number. $4: Channel number. $5: DIMM number. $6: DIMM mark.
Severity level	Minor
Example	Correctable ECC or other memory error limit reached---Current Boot Error-Location:CPU:1 MEM CTRL:1 CH:1 DIMM:0 A1
Impact	This might result in a restart or stop the system from responding.
Cause	The number of correctable memory errors reached the logging threshold. A correctable memory error might occur if a DIMM is installed incorrectly or an internal memory error occurs. If the memory RAS mode is set, the system performs the specified operation. In memory repair mode, the system still generates the message if the logging threshold is exceeded.
Recommended action	1. Re-install the target DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 2. Verify that the ambient temperature and humidity are as required. 3. Access the BIOS setup utility, and verify that the correctable error threshold setting is proper. 4. If the issue persists, contact Technical Support.
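
The variable fields of this message can be extracted as sketched below, for example to count threshold events per DIMM. This is an illustrative sketch only: the function name parse_ecc_limit and the key names are assumptions, and the pattern deliberately tolerates two or three hyphens before the boot indicator.

```python
import re

# $1 is one of the two documented options; $2..$6 follow the Location label.
# Assumption: the separator before $1 is two or three hyphens.
ECC_RE = re.compile(
    r"^Correctable ECC or other memory error limit reached-{2,3}"
    r"(?P<boot>Current Boot Error|Last Boot Error)-"
    r"Location:CPU:(?P<cpu>\S+) MEM CTRL:(?P<mem_ctrl>\S+) "
    r"CH:(?P<channel>\S+) DIMM:(?P<dimm>\S+) (?P<mark>\S+)$"
)


def parse_ecc_limit(message):
    """Return the message fields as a dict, or None if the text does not match."""
    match = ECC_RE.match(message.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the boot indicator into a boolean for easier filtering.
    fields["current_boot"] = fields.pop("boot") == "Current Boot Error"
    return fields


sample = ("Correctable ECC or other memory error limit reached---"
          "Current Boot Error-Location:CPU:1 MEM CTRL:1 CH:1 DIMM:0 A1")
print(parse_ecc_limit(sample))
```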

Presence detected

Event code	0x0c6000de/0x0c6000df
Message text	Presence detected.
Variable fields	N/A
Severity level	Info/Minor
Example	Presence detected
Impact	If the memory module is present, this alarm has no negative impact. If the memory module is absent, this might reduce the system performance.
Cause	0x0c6000de: The system detected the presence of a DIMM. 0x0c6000df: The system detected the absence of a DIMM.
Recommended action	1. Access the BIOS setup utility, and verify if the server starts up with the minimum configuration. If yes, components that are not started are isolated by the BIOS and cannot be detected by HDM. 2. Install or re-install DIMMs. Make sure the gold contacts on the DIMMs are not contaminated, and DIMM slots do not contain any foreign objects. 3. If the issue persists, contact Technical Support.
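
Because the message text is the same for both event codes, a monitoring script has to use the event code to tell presence from absence. A minimal sketch of that lookup follows. The mapping comes from the Cause row above, while the function name describe_presence and the returned strings are hypothetical.

```python
# Event codes and meanings taken from the Cause row of this entry.
PRESENCE_EVENTS = {
    0x0C6000DE: "DIMM present",
    0x0C6000DF: "DIMM absent",
}


def describe_presence(event_code):
    """Accept an int or a hex string such as '0x0c6000df'."""
    code = int(event_code, 16) if isinstance(event_code, str) else event_code
    return PRESENCE_EVENTS.get(code, "unknown event code")


print(describe_presence("0x0c6000df"))
```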

Configuration error---RDIMMs are installed on the server that supports only UDIMMs

Event code	0x0c701010
Message text	Configuration error---RDIMMs are installed on the server that supports only UDIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---RDIMMs are installed on the server that supports only UDIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	RDIMMs are installed for a processor platform that supports only UDIMMs.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---UDIMMs are installed on the server that supports only RDIMMs

Event code	0x0c702010
Message text	Configuration error---UDIMMs are installed on the server that supports only RDIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---UDIMMs are installed on the server that supports only RDIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	UDIMMs are installed on a server that supports only RDIMMs.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---SODIMMs are installed on the server that supports only RDIMMs

Event code	0x0c703010
Message text	Configuration error---SODIMMs are installed on the server that supports only RDIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---SODIMMs are installed on the server that supports only RDIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	SODIMMs are installed on a server that supports only RDIMMs.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---The number of ranks per channel can be only 1, 2, or 4

Event code	0x0c707020
Message text	Configuration error---The number of ranks per channel can be only 1, 2, or 4-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The number of ranks per channel can be only 1, 2, or 4-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The number of ranks per channel does not meet the requirements of the processor platform. The processor platform supports only 1, 2, or 4 ranks.
Recommended action	1. Verify that the number of ranks is as required. If not, replace the DIMMs. 2. If the issue persists, contact Technical Support.

Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported

Event code	0x0c707040
Message text	Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, or LRDIMMs are not supported.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---The number of ranks in the channel exceeds 8

Event code	0x0c707050
Message text	Configuration error---The number of ranks in the channel exceeds 8-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The number of ranks in the channel exceeds 8-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The number of ranks in the channel exceeded 8 (maximum supported number).
Recommended action	1. Verify that the number of ranks in the channel does not exceed the upper limit. 2. If the issue persists, contact Technical Support.

Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server

Event code	0x0c707090
Message text	Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Support for ECC on the DIMMs is inconsistent with support for ECC on the server.
Recommended action	1. Identify the DIMM type. Log in to HDM and view ECC support details. If the inconsistency is confirmed, replace the DIMMs. 2. If the issue persists, contact Technical Support.

Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V

Event code	0x0c7070a0
Message text	Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The current voltage cannot meet the requirement of the present DIMMs. A DDR4 DIMM requires 1.2 V and a DDR5 DIMM requires 1.1 V (shown as 12V and 11V in the message text).
Recommended action	1. Replace with DIMMs compatible with the current voltage. 2. If the issue persists, contact Technical Support.

Configuration error---The CPU is not compatible with 3DS DIMMs

Event code	0x0c707100
Message text	Configuration error---The CPU is not compatible with 3DS DIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The CPU is not compatible with 3DS DIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The processor is not compatible with 3DS DIMMs.
Recommended action	1. Replace the DIMMs. 2. If the issue persists, contact Technical Support.

Configuration error---NVDIMMs with stepping lower than 0x10 are not supported

Event code	0x0c707110
Message text	Configuration error---NVDIMMs with stepping lower than 0x10 are not supported-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---NVDIMMs with stepping lower than 0x10 are not supported-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The configuration is incorrect. NVDIMMs with stepping lower than 0x10 are not supported.
Recommended action	1. Access the BIOS setup utility and verify that the DIMMs are supported by the processor. If not, replace the DIMMs. 2. If the issue persists, contact Technical Support.

Configuration error---The CPU is not compatible with 16-GB single-rank DIMMs

Event code	0x0c707120
Message text	Configuration error---The CPU is not compatible with 16-GB single-rank DIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The CPU is not compatible with 16-GB single-rank DIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The processor is not compatible with 16-GB single-rank DIMMs.
Recommended action	1. Examine whether the DIMM is a 16-GB single-rank DIMM. If yes, replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---The CPU is not compatible with the DIMMs

Event code	0x0c707140
Message text	Configuration error---The CPU is not compatible with the DIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The CPU is not compatible with the DIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The processor is not compatible with the DIMMs.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---The frequency of the DIMM is not supported on the server

Event code	0x0c707150
Message text	Configuration error---The frequency of the DIMM is not supported on the server-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The frequency of the DIMM is not supported on the server-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The frequency of the DIMM is not supported on the server.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. Access the BIOS setup utility and verify that Enforce POR is enabled. 3. If the issue persists, contact Technical Support.

Configuration error---NVDIMMs are not compatible with the CPU

Event code	0x0c7071a0
Message text	Configuration error---NVDIMMs are not compatible with the CPU-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---NVDIMMs are not compatible with the CPU-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	DCPMMs are not compatible with the processor.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---DCPMMs are not supported

Event code	0x0c7071d0
Message text	Configuration error---DCPMMs are not supported-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---DCPMMs are not supported-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a decrease in system performance.
Cause	DCPMMs are not supported.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---Memory LockStep Disable Error

Event code	0x0c709090
Message text	Configuration error---Memory LockStep Disable Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Memory LockStep Disable Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a decrease in system performance.
Cause	Failed to enable the LockStep mode. The mode was degraded to independent.
Recommended action	1. Verify that the installed DIMMs meet the requirements of the LockStep mode. For the DIMM installation requirements, see the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---Memory Mirror Disable Error

Event code	0x0c70a0c0
Message text	Configuration error---Memory Mirror Disable Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Memory Mirror Disable Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	A memory error that BIOS cannot identify occurred. The memory installation does not meet the requirements of the mirror mode.
Recommended action	1. Verify that the installed DIMMs meet the requirements of the Mirror mode. For the DIMM installation requirements, see the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable the full mirror mode

Event code	0x0c70c010
Message text	Configuration error---Failed to enable the full mirror mode
Variable fields	N/A
Severity level	Minor
Example	Configuration error---Failed to enable the full mirror mode
Impact	This might result in a restart or stop the system from responding.
Cause	Failed to enable the Full Mirror RAS mode. The mirror configuration degraded.
Recommended action	1. Verify that the installed DIMMs meet the requirements of the Mirror mode. For the DIMM installation requirements, see the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---The memory interleaving configuration cannot meet the requirements of the server

Event code	0x0c70e030
Message text	Configuration error---The memory interleaving configuration cannot meet the requirements of the server-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The memory interleaving configuration cannot meet the requirements of the server-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Memory configuration is incorrect. The memory interleaving configuration cannot meet the requirements of the server.
Recommended action	1. Access the BIOS setup utility, and verify that the memory interleaving configuration (such as NUMA and interleave) can meet the server requirements. 2. Upgrade the BIOS firmware to the latest version. 3. If the issue persists, contact Technical Support.

Configuration error---The memory interleaving configuration cannot meet the requirements of the server

Event code	0x0c70e080
Message text	Configuration error---The memory interleaving configuration cannot meet the requirements of the server-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The memory interleaving configuration cannot meet the requirements of the server-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Memory configuration is incorrect. The memory interleaving configuration cannot meet the requirements of the server.
Recommended action	1. Access the BIOS setup utility, and verify that the memory interleaving configuration (such as NUMA and interleave) can meet the server requirements. 2. Upgrade the BIOS firmware to the latest version. 3. If the issue persists, contact Technical Support.

Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent

Event code	0x0c710010
Message text	Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Failed to enable the Rank Sparing mode. The memory RAS mode has degraded to independent mode.
Recommended action	1. Verify that the installed DIMMs meet the requirements of the Rank Sparing mode. For the DIMM installation requirements, see the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---Memory Rank Sparing Error

Event code	0x0c710100
Message text	Configuration error---Memory Rank Sparing Error-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Memory Rank Sparing Error-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a decrease in system performance.
Cause	The memory rank sparing configuration does not take effect.
Recommended action	1. Access the BIOS setup utility and verify that Rank Sparing is enabled. 2. Verify that the installed DIMMs meet the requirements of the Rank Sparing mode. For the DIMM installation requirements, see the user guide for the server. 3. If the issue persists, contact Technical Support.

Configuration error---Failed to enable patrol scrubbing

Event code	0x0c711000
Message text	Configuration error---Failed to enable patrol scrubbing-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable patrol scrubbing-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Failed to enable memory patrol scrubbing.
Recommended action	1. Identify RAS features supported by the processor specifications as instructed in H3C G3 Servers RAS Technology White Paper. If Patrol Scrub is not supported, disable Patrol Scrub. 2. If the issue persists, contact Technical Support.

Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty

Event code	0x0c717010
Message text	Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The memory installation is incorrect. Make sure memory installation follows these restrictions: · Populate the DIMM with more ranks in the white slot in each channel. · Populate DIMMs first in white slots.
Recommended action	1. Re-install DIMMs as required in the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel

Event code	0x0c717030
Message text	Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Two DCPMM memory modules cannot be installed in the same channel.
Recommended action	1. Re-install DIMMs as required in the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---The DDR-T memory module is installed in the white slot

Event code	0x0c717050
Message text	Configuration error---The DDR-T memory module is installed in the white slot-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The DDR-T memory module is installed in the white slot-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The DCPMM memory module is installed in the white slot.
Recommended action	1. Re-install DIMMs as required in the user guide for the server. 2. If the issue persists, contact Technical Support.

Configuration error---2LM IMC memory Mismatch

Event code	0x0c7170c0
Message text	Configuration error---2LM IMC memory Mismatch-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---2LM IMC memory Mismatch-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The memory installation did not meet the requirement for single Integrated Memory Controller (IMC) installation in 2LM mode.
Recommended action	1. Verify that DIMMs are installed as required in 2LM mode. Make sure each IMC contains a minimum of one DDR and one DCPMM whose available capacity is larger than 0. 2. If the issue persists, contact Technical Support.

Configuration error---ODT configuration error The channel is isolated

Event code	0x0c729030
Message text	Configuration error---ODT configuration error The channel is isolated-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---ODT configuration error The channel is isolated-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Memory ODT is configured incorrectly, and the channel is isolated.
Recommended action	1. Re-install the DIMM. Make sure the gold contacts on the DIMM and the DIMM slot are clean. 2. Replace the DIMM. 3. If the issue persists, contact Technical Support.

Configuration error---Failed to enable ADDDC

Event code	0x0c73a010
Message text	Configuration error---Failed to enable ADDDC
Variable fields	N/A
Severity level	Minor
Example	Configuration error---Failed to enable ADDDC
Impact	This might result in a restart or stop the system from responding.
Cause	Failed to enable ADDDC.
Recommended action	1. Access the BIOS setup utility and verify that the memory configuration meets the ADDDC requirements. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable SDDC

Event code	0x0c73b020
Message text	Configuration error---Failed to enable SDDC
Variable fields	N/A
Severity level	Minor
Example	Configuration error---Failed to enable SDDC
Impact	This might result in a decrease in system performance.
Cause	Memory configuration is incorrect. Failed to enable SDDC.
Recommended action	1. Access the BIOS setup utility and verify that the memory configuration meets the SDDC requirements. 2. If the issue persists, contact Technical Support.

Configuration error---DCPMM firmware version not supported

Event code	0x0c73c000
Message text	Configuration error---DCPMM firmware version not supported-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---DCPMM firmware version not supported-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might result in a decrease in system performance.
Cause	Memory configuration is incorrect. The DCPMM firmware version is not supported.
Recommended action	1. Update the DCPMM firmware to the latest version. 2. If the issue persists, contact Technical Support.

Configuration error---DCPMM firmware version not supported

Event code	0x0c73c010
Message text	Configuration error---DCPMM firmware version not supported-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---DCPMM firmware version not supported-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might result in a decrease in system performance.
Cause	Memory configuration is incorrect. The DCPMM firmware version is not supported.
Recommended action	1. Update the DCPMM firmware to the latest version. 2. If the issue persists, contact Technical Support.

Configuration error---NVMCTRL_MEDIA_NOTREADY

Event code	0x0c784020
Message text	Configuration error---NVMCTRL_MEDIA_NOTREADY-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---NVMCTRL_MEDIA_NOTREADY-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	The DCPMM firmware medium is not ready.
Recommended action	1. Update the DCPMM firmware to the latest version. 2. Replace the DIMM. 3. If the issue persists, contact Technical Support.

Configuration error---The DDR-T memory modules of the unexpected model are installed

Event code	0x0c7ed0c0
Message text	Configuration error---The DDR-T memory modules of the unexpected model are installed-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The DDR-T memory modules of the unexpected model are installed-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Memory configuration is incorrect. The DCPMMs are incompatible with the server.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to set the VDD voltage of the DIMM

Event code	0x0c7f0010
Message text	Configuration error---Failed to set the VDD voltage of the DIMM
Variable fields	N/A
Severity level	Minor
Example	Configuration error---Failed to set the VDD voltage of the DIMM
Impact	This might result in a restart or stop the system from responding.
Cause	Memory configuration is incorrect. Failed to set the DIMM VDD voltage.
Recommended action	1. Replace the DIMMs. 2. Replace the system board. 3. If the issue persists, contact Technical Support.

Configuration error---Too many RIR rules

Event code	0x0c7f9010
Message text	Configuration error---Too many RIR rules
Variable fields	N/A
Severity level	Minor
Example	Configuration error---Too many RIR rules
Impact	This might result in a restart or stop the system from responding.
Cause	Memory configuration is incorrect. Too many RIR rules.
Recommended action	1. Upgrade the BIOS to the latest version. 2. Verify that the DIMMs and processors are installed correctly according to the user guide for the server. 3. Access the BIOS setup utility and verify that the memory interleaving and NUMA settings are correct. 4. If the issue persists, contact Technical Support.

Configuration error---The DIMMs for the CPU exceeded the limit

Event code	0x0c7fa010
Message text	Configuration error---The DIMMs for the CPU exceeded the limit-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Configuration error---The DIMMs for the CPU exceeded the limit-Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	This might result in a restart or stop the system from responding.
Cause	Memory configuration is incorrect. The DIMMs for the processor exceeded the limit.
Recommended action	1. Verify that the memory configuration is supported by the processor specifications. 2. If the issue persists, contact Technical Support.

Drive slot

Drive Presence

Event code	0x0d0000df
Message text	Drive Presence --- $1: $2, HDD Slot: $3.
Variable fields	$1: Drive bay slot in HDD bay deployment or JBOD slot in a cabinet server. $2: · When $1 is a drive bay slot, this parameter represents the drive bay slot number, which can be 1, 2, 5, 6, 9, 10, 13, or 14. · When $1 is a JBOD slot, this parameter represents the JBOD slot number in the range of 1 to 8. $3: · When $1 is a drive bay slot, this parameter represents the drive identifier in the range of 0 to 39. · When $1 is a JBOD slot, this parameter represents the drive slot number in the range of 0 to 22.
Severity level	Info
Example	Drive Presence --- Bay Slot: 1, HDD Slot: 2
Impact	The drive presence changed.
Cause	The drive presence changed.
Recommended action	No action is required.

Drive Fault

Event code	0x0d1000de
Message text	Drive Fault --- $1: $2, HDD Slot: $3
Variable fields	$1: Drive bay slot in HDD bay deployment or JBOD slot in a cabinet server. $2: · When $1 is a drive bay slot, this parameter represents the drive bay slot number, which can be 1, 2, 5, 6, 9, 10, 13, or 14. · When $1 is a JBOD slot, this parameter represents the JBOD slot number in the range of 1 to 8. $3: · When $1 is a drive bay slot, this parameter represents the drive identifier in the range of 0 to 39. · When $1 is a JBOD slot, this parameter represents the drive slot number in the range of 0 to 22.
Severity level	Major
Example	Drive Fault --- Bay Slot: 1, HDD Slot: 2
Impact	Data loss might occur due to the drive fault.
Cause	The drive was faulty.
Recommended action	1. Log in to HDM, view drive information, and verify that all drives in the logical drive are identified correctly. If a drive cannot be identified, re-install the drive. If the drive cannot be identified after re-installation, replace the drive. 2. View drive information and verify that the status of the drive is Unconfigured Good. 3. View drive information and verify that the drive can be identified and is normal, and the drive number on HDM is consistent with the drive number in the message. If the drive number on HDM is different from the drive number in the message, verify that the drive cables are connected correctly. 4. If multiple drives are absent, examine if the data cables and RAID controllers are faulty. If multiple drives are in place but not displayed, examine if the signal cables and the drive backplane are faulty. 5. Verify that drive LEDs are normal, and the drive can be identified and is accessible in the OS. If a drive LED is orange, the drive is faulty. Replace the faulty components, if any. 6. Verify that the storage controller is in normal state. 7. If the issue persists, contact Technical Support.
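
The Drive Presence and Drive Fault messages share one template, so a log collector could extract the variable fields with a single pattern. The following Python sketch is illustrative only and is not part of HDM. The function name parse_drive_event, the optional whitespace around the dashes, and the assumption that a JBOD label begins with "JBOD" are hypothetical. The slot ranges are copied from the Variable fields rows above.

```python
import re

# Pattern follows the documented template "<event> --- $1: $2, HDD Slot: $3".
# Whitespace around the dashes and the trailing period are treated as optional
# (an assumption, because the examples and templates differ slightly).
DRIVE_EVENT = re.compile(
    r"^(?P<event>Drive Presence|Drive Fault)\s*---\s*"
    r"(?P<slot_type>[^:]+):\s*(?P<slot>\d+),\s*HDD Slot:\s*(?P<drive>\d+)\.?$"
)

# Documented drive bay slot numbers.
BAY_SLOTS = {1, 2, 5, 6, 9, 10, 13, 14}


def parse_drive_event(message):
    """Return the fields of a drive slot message, or None if it does not match."""
    m = DRIVE_EVENT.match(message.strip())
    if m is None:
        return None
    slot_type = m.group("slot_type").strip()
    slot = int(m.group("slot"))
    drive = int(m.group("drive"))
    # Assumption: the label for a JBOD slot starts with "JBOD".
    if slot_type.lower().startswith("jbod"):
        in_range = 1 <= slot <= 8 and 0 <= drive <= 22
    else:
        in_range = slot in BAY_SLOTS and 0 <= drive <= 39
    return {
        "event": m.group("event"),
        "slot_type": slot_type,
        "slot": slot,
        "drive": drive,
        "in_range": in_range,
    }


if __name__ == "__main__":
    print(parse_drive_event("Drive Fault --- Bay Slot: 1, HDD Slot: 2"))
```

The in_range flag only reports whether the numbers fall inside the documented ranges. A value outside those ranges more likely indicates a different server model or message variant than a parsing error.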

Predictive Failure

Event code	0x0d2000de
Message text	Predictive Failure---Bay Slot: $1, HDD Slot: $2
Variable fields	$1: Bay slot number, which can be 1, 2, 5, 6, 9, 10, 13, or 14. $2: Drive identifier in the range of 0 to 39.
Severity level	Minor
Example	Predictive Failure---Bay Slot: 1, HDD Slot: 2
Impact	The decrease in drive reliability might affect the operating system's storage performance and service operations.
Cause	The RAID controller reports a predictive failure, which can be a storage medium reserved block alarm, drive lifetime alarm, Prefail alarm, or bad sector alarm.
Recommended action	1. Log in to HDM to verify that the drive is in normal state. 2. Replace the drive. 3. If the issue persists, contact Technical Support.

Consistency Check / Parity Check in progress. System Source Monitor: Hard Disk usage exceeds the threshold

Event code	0x0d4000de
Message text	Linux: Consistency Check / Parity Check in progress. System Source Monitor: Hard Disk usage exceeds the threshold---OS:Linux/Unix,See disk details about Logical disk name, Threshold $1: ---Current usage $2 Windows: Consistency Check / Parity Check in progress. System Source Monitor: Hard Disk usage exceeds the threshold---OS:Windows, Logical disk $1:---Current usage $2
Variable fields	Linux: · $1: Drive space usage threshold. · $2: Current drive space usage. Windows: · $1: Drive letter. · $2: Current drive space usage.
Severity level	Info
Example	Linux: Consistency Check / Parity Check in progress. System Source Monitor: Hard Disk usage exceeds the threshold --OS:Linux/Unix,See disk details about Logical disk name, Threshold 75%: ---Current usage 80% Windows: Consistency Check / Parity Check in progress. System Source Monitor: Hard Disk usage exceeds the threshold ---OS:Windows, Logical disk d: ---Current usage 80%
Impact	A high usage will result in decreased performance, task backlog, decreased system stability, and data loss or corruption.
Cause	Drive usage exceeded the threshold. You can configure the processor usage, memory usage, and drive usage thresholds from HDM. During operation, FIST SMS obtains system resource usage information, and sends the information to HDM through IPMI commands. HDM generates this message if a threshold is exceeded.
Recommended action	1. Use the HDM system resource monitoring feature to monitor the drive usage. If the usage is abnormal, contact Technical Support. 2. If the drive usage is normally high, back up data and expand the drive capacity.

Consistency Check / Parity Check in progress. System Source Monitor: Relieve resource alarm about Hard Disk Usage

Event code	0x0d4000df
Message text	Linux: Consistency Check / Parity Check in progress. System Source Monitor: System Source Monitor: Relieve resource alarm about Hard Disk Usage ---OS:Linux/Unix,See disk details about Logical disk name, Threshold $1: ---Current usage $2 Windows: Consistency Check / Parity Check in progress. System Source Monitor: System Source Monitor: Relieve resource alarm about Hard Disk Usage ---OS:Windows, Logical disk $1:---Current usage $2
Variable fields	Linux: · $1: Drive space usage threshold. · $2: Current drive space usage. Windows: · $1: Drive letter. · $2: Current drive space usage.
Severity level	Info
Example	Linux: Consistency Check / Parity Check in progress. System Source Monitor: Relieve resource alarm about Hard Disk Usage ---OS:Linux/Unix,See disk details about Logical disk name, Threshold 80%: ---Current usage 75% Windows: Consistency Check / Parity Check in progress. System Source Monitor: Relieve resource alarm about Hard Disk Usage ---OS:Windows, Logical disk d: ---Current usage 80%
Impact	Performance decrease, system crashes, data corruption, and security issues might occur.
Cause	This message is generated when the system resource usage drops below the alarm threshold. This is an alarm removal log for event 0x0d4000de. You can configure the processor usage, memory usage, and drive usage thresholds from HDM. During operation, FIST SMS obtains system resource usage information, and sends the information to HDM through IPMI commands. HDM generates this message when the drive usage drops back below the threshold.
Recommended action	No action is required.
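
Because event 0x0d4000df is the alarm removal log for event 0x0d4000de, a monitoring script can treat the two codes as a raise/clear pair. The following Python sketch is illustrative only. The function names are hypothetical, and the patterns assume that usage and threshold values are always rendered as percentages, as in the examples above.

```python
import re

ALARM_CODE = 0x0D4000DE  # drive usage exceeded the threshold
CLEAR_CODE = 0x0D4000DF  # alarm removal log for 0x0d4000de

# "Current usage 80%" appears in both the Linux and Windows forms.
USAGE = re.compile(r"Current usage\s*(?P<usage>\d+(?:\.\d+)?)%")
# Linux form only: "Threshold 75%:" (capital T, unlike "the threshold" earlier in the text).
THRESHOLD = re.compile(r"Threshold\s*(?P<threshold>\d+(?:\.\d+)?)%")
# Windows form only: "OS:Windows, Logical disk d:".
WIN_DISK = re.compile(r"OS:Windows,\s*Logical disk\s*(?P<disk>[^:\s]+)\s*:")


def parse_usage_message(message):
    """Extract usage, threshold (Linux), and drive letter (Windows) when present."""
    usage = USAGE.search(message)
    threshold = THRESHOLD.search(message)
    disk = WIN_DISK.search(message)
    return {
        "usage": float(usage.group("usage")) if usage else None,
        "threshold": float(threshold.group("threshold")) if threshold else None,
        "disk": disk.group("disk") if disk else None,
    }


def alarm_active(event_codes):
    """Given event codes in time order, report whether the usage alarm is still raised."""
    active = False
    for code in event_codes:
        if code == ALARM_CODE:
            active = True
        elif code == CLEAR_CODE:
            active = False
    return active


if __name__ == "__main__":
    msg = ("Consistency Check / Parity Check in progress. System Source Monitor: "
           "Hard Disk usage exceeds the threshold ---OS:Windows, Logical disk d: "
           "---Current usage 80%")
    print(parse_usage_message(msg))
    print(alarm_active([ALARM_CODE, CLEAR_CODE]))
```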

In Critical Array

Event code	0x0d5000de
Message text	In Critical Array---$1:$2$3 :$4.
Variable fields	$1: Drive bay slot or PCIe slot. $2: When $1 is a drive bay slot, this parameter represents the drive bay slot number, which can be 1, 2, 5, 6, 9, 10, 13, or 14. When $1 is a PCIe slot, this parameter represents the slot number of the storage controller that manages the logical drive. $3: HDD slot or LDDevno. $4: When $3 represents HDD slot, this parameter represents the drive identifier in the range of 0 to 39. When $3 represents LDDevno, this parameter represents the logical drive number.
Severity level	Major
Example	In Critical Array---PCIe slot:1---LDDevno :2
Impact	The RAID array is degraded, which will affect data reliability.
Cause	A drive in a logical drive was removed or failed and the logical drive degraded.
Recommended action	1. Verify that the drive is not removed. If the drive is removed, re-install the drive and recreate the RAID array. 2. Log in to HDM, view drive information, and verify that all drives in the logical drive are identified correctly. If a drive cannot be identified, re-install the drive. If the drive cannot be identified after re-installation, replace the drive. 3. Log in to HDM, view drive information, and verify that the status of the drive is Unconfigured Good. 4. After the drive is identified correctly, recreate the RAID array. 5. If the issue persists, contact Technical Support.

In Failed Array

Event code	0x0d6000de
Message text	In Failed Array---$1:$2$3 :$4.
Variable fields	$1: Drive bay slot or PCIe slot. $2: When $1 is a drive bay slot, this parameter represents the drive bay slot number. When $1 is a PCIe slot, this parameter represents the slot number of the storage controller that manages the logical drive. $3: HDD slot or LDDevno. $4: When $3 represents HDD slot, this parameter represents the drive identifier. When $3 represents LDDevno, this parameter represents the logical drive number.
Severity level	Major
Example	In Failed Array---PCIe slot:1---LDDevno :2
Impact	The RAID array becomes invalid, which causes the loss of the offline data.
Cause	A drive in a logical drive was removed or failed and the logical drive was totally corrupted.
Recommended action	1. Verify that the drive is not removed. If the drive is removed, re-install the drive and recreate the RAID array. 2. Log in to HDM, view drive information, and verify that all drives in the logical drive are identified correctly. If a drive cannot be identified, re-install the drive. If the drive cannot be identified after re-installation, replace the drive. 3. Log in to HDM, view drive information, and verify that the status of the drive is Unconfigured Good. 4. After the drive is identified correctly, verify that the RAID array is normal. If the RAID array is faulty, recreate the RAID array. 5. If the issue persists, contact Technical Support.

Rebuild/Remap in progress

Event code	0x0d7000de
Message text	Rebuild/Remap in progress---Bay Slot: $1, HDD Slot: $2.
Variable fields	$1: Drive bay number, including 1, 2, 5, 6, 9, 10, 13, and 14. $2: Drive number on a drive bay in the range of 0 to 39.
Severity level	Info
Example	Rebuild/Remap in progress---Bay Slot: 1, HDD Slot: 2
Impact	No negative impact.
Cause	The message is generated when a drive is installed during RAID rebuilding.
Recommended action	No action is required.

The disk triggered a media error

Event code	0x0da000de
Message text	The disk triggered a media error--$1.
Variable fields	$1: Drive location.
Severity level	Info
Example	The disk triggered a media error--Front 1
Impact	Data loss might occur due to the occurrence of media errors on the storage media.
Cause	The number of media errors exceeded the threshold.
Recommended action	1. Upgrade the firmware of the drive. 2. Replace the drive. 3. If the issue persists, contact Technical Support.

The disk triggered an uncorrectable error

Event code	0x0db000de
Message text	The disk triggered an uncorrectable error--$1.
Variable fields	$1: Drive location.
Severity level	Minor
Example	The disk triggered an uncorrectable error--Front 1
Impact	Data loss might occur due to the occurrence of uncorrectable errors on the storage media.
Cause	The number of uncorrectable errors exceeded the threshold.
Recommended action	1. Upgrade the firmware of the drive. 2. Replace the drive. 3. If the issue persists, contact Technical Support.

The disk is missing

Event code	0x0dc000de
Message text	The disk is missing.
Variable fields	N/A
Severity level	Major
Example	The disk is missing
Impact	The reliability of the storage system might be affected due to the removal or incorrect installation of a drive.
Cause	This message is generated when the storage system fails to identify this drive or a cable connection error occurs.
Recommended action	1. Log in to HDM, view drive information, and verify that all drives in the logical drive are identified correctly. 2. Verify that the drive data cables, power cords, and signal cables are connected correctly. 3. Re-install the drive. 4. Replace the drive. 5. Verify that the storage controller is in normal state. 6. If the issue persists, contact Technical Support.

System firmware Progress

System Firmware Error (POST Error)---CPU matching failure

Event code	0x0f0000de
Message text	System Firmware Error (POST Error)---CPU matching failure.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---CPU matching failure
Impact	The system might fail to start up correctly.
Cause	The BIOS detected a CPU frequency, microcode, or UPI matching error at POST.
Recommended action	1. Verify that the processors are installed correctly as required in the user guide for the server. 2. Verify that the CPUs have the same model. 3. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---Firmware (BIOS) ROM corruption detected

Event code	0x0f0000de
Message text	System Firmware Error (POST Error)---Firmware (BIOS) ROM corruption detected.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---Firmware (BIOS) ROM corruption detected
Impact	The system cannot start up correctly.
Cause	The BIOS detected ROM corruption at POST. The BIOS firmware is damaged when this message is generated.
Recommended action	1. Upgrade the BIOS firmware. 2. Upgrade the BIOS and restore the factory defaults (if any) or the default BIOS settings during the upgrade. 3. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---Load microcode failed

Event code	0x0f0000de
Message text	System Firmware Error (POST Error)---Load microcode failed.
Variable fields	N/A
Severity level	Minor
Example	System Firmware Error (POST Error)---Load microcode failed
Impact	The system might fail to start up correctly.
Cause	The BIOS detected errors at POST because CPU microcode failed to be loaded, but the system did not hang.
Recommended action	1. Power off and then power on the server. 2. Upgrade HDM and the BIOS firmware to the latest version. 3. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---No system memory or invalid memory configuration

Event code	0x0f0000de
Message text	System Firmware Error (POST Error)---No system memory or invalid memory configuration.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No system memory or invalid memory configuration
Impact	The system might fail to start up correctly.
Cause	No DIMM was detected during BIOS startup. This symptom might occur if DIMMs are installed incorrectly.
Recommended action	1. Verify that the DIMMs are installed correctly as required in the user guide of the server. Re-install all DIMMs if needed. 2. If the issue persists, contact Technical Support.

System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image is unsigned or Certificate is invalid

Event code	0x0f0000de
Message text	System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image is unsigned or Certificate is invalid.
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image is unsigned or Certificate is invalid
Impact	The system cannot start up correctly.
Cause	The BIOS detected ROM corruption at POST.
Recommended action	1. Verify that the BIOS boot mode meets the requirements of secure boot. If not, change the boot mode to UEFI. 2. Verify that the BIOS firmware is upgraded successfully. 3. Upgrade the BIOS and restore the factory defaults (if any) or the default BIOS settings during the upgrade. 4. If the issue persists, contact Technical Support.

System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate not found in Authorized database(db)

Event code	0x0f0000de
Message text	System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate not found in Authorized database(db).
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate not found in Authorized database(db)
Impact	The system might fail to start up correctly.
Cause	The BIOS detected ROM corruption at POST.
Recommended action	1. Verify that the BIOS boot mode meets the requirements of secure boot. If not, change the boot mode to UEFI. 2. Verify that the BIOS firmware is upgraded successfully. 3. Upgrade the BIOS and restore the factory defaults (if any) or the default BIOS settings during the upgrade. 4. If the issue persists, contact Technical Support.

System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate is found in Forbidden database(dbx)

Event code	0x0f0000de
Message text	System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate is found in Forbidden database(dbx).
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---Firmware (BIOS) ROM corruption detected:Image Certificate is found in Forbidden database(dbx)
Impact	The system might fail to start up correctly.
Cause	The BIOS detected ROM corruption at POST.
Recommended action	1. Verify that the BIOS boot mode meets the requirements of secure boot. If not, change the boot mode to UEFI. 2. Verify that the BIOS firmware is upgraded successfully. 3. Upgrade the BIOS and restore the factory defaults (if any) or the default BIOS settings during the upgrade. 4. If the issue persists, contact Technical Support.
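
Several POST error messages above share event code 0x0f0000de, so the event code alone does not identify the entry or its severity level. The following Python sketch is illustrative only. It shows one possible way to distinguish the entries by the detail text after the "---" separator. The function name classify_post_error and the case-insensitive matching are assumptions. The severity values are copied from the entries above.

```python
POST_ERROR_CODE = 0x0F0000DE

# Detail text (lowercased, trailing period removed) -> severity level,
# copied from the entries in this section that use event code 0x0f0000de.
SEVERITY_BY_DETAIL = {
    "cpu matching failure": "Major",
    "firmware (bios) rom corruption detected": "Major",
    "load microcode failed": "Minor",
    "no system memory or invalid memory configuration": "Major",
    "firmware (bios) rom corruption detected:"
    "image is unsigned or certificate is invalid": "Major",
    "firmware (bios) rom corruption detected:"
    "image certificate not found in authorized database(db)": "Major",
    "firmware (bios) rom corruption detected:"
    "image certificate is found in forbidden database(dbx)": "Major",
}


def classify_post_error(code, message):
    """Return the documented severity, or None if the entry is not recognized."""
    if code != POST_ERROR_CODE:
        return None
    # Split at the first "---"; everything after it is the detail text.
    _prefix, sep, detail = message.partition("---")
    if not sep:
        return None
    key = detail.strip().rstrip(".").lower()
    return SEVERITY_BY_DETAIL.get(key)


if __name__ == "__main__":
    print(classify_post_error(
        0x0F0000DE,
        "System Firmware Error (POST Error)---Load microcode failed.",
    ))
```

Matching is case-insensitive because the headings in this section use both "System Firmware Error (POST Error)" and "System firmware error (POST error)".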

System Firmware Error (POST Error)---Memory Population Rule Error

Event code	0x0f002170
Message text	System Firmware Error (POST Error)---Memory Population Rule Error
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---Memory Population Rule Error
Impact	The system might fail to start correctly, or the system performance might decrease.
Cause	DIMM Faulty Parts Tracking error occurred because of incorrect DIMM population.
Recommended action	1. Verify that DIMMs are installed correctly based on the user guide of the server. Re-install the DIMMs if needed. 2. If the issue persists, contact Technical Support.

System firmware error (POST error)---DIMM installation or compatibility error occurred

Event code	0x0f003070
Message text	System firmware error (POST error)---DIMM installation or compatibility error occurred.
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---DIMM installation or compatibility error occurred
Impact	The system might fail to start correctly, or the system performance might decrease.
Cause	DIMMs were installed incorrectly.
Recommended action	1. Log in to HDM, access the Memory page, and verify that no faulty DIMMs exist. 2. Verify that DIMMs are installed correctly as required in the user guide for the server. 3. Verify that a minimum of one DIMM operates correctly for each processor. 4. If the issue persists, contact Technical Support.

System firmware error (POST error)---No Memory Usable

Event code	0x0f003e80
Message text	System firmware error (POST error)---No Memory Usable
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---No Memory Usable
Impact	The system cannot start up correctly.
Cause	No memory was available.
Recommended action	1. Verify that the DIMMs are installed as required in the user guide for the server. 2. Upgrade the BIOS and HDM firmware to the latest version. 3. Power off the server, reconnect all power cords, and then power on the server. Make sure the server is completely powered off before powering on the server. 4. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 5. Replace the DIMM, and then restart the server. 6. If the issue persists, contact Technical Support.

System firmware error (POST error)---No DDR Memory Error

Event code	0x0f0082a0
Message text	System firmware error (POST error)---No DDR Memory Error
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---No DDR Memory Error
Impact	The system cannot start up correctly.
Cause	No DDR memory module was available.
Recommended action	1. Verify that the DIMMs are installed as required in the user guide for the server. 2. Upgrade the BIOS and HDM firmware to the latest version. 3. Power off the server, reconnect all power cords, and then power on the server. Make sure the server is completely powered off before powering on the server. 4. Re-install the DIMM correctly. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 5. Replace the DIMM, and then restart the server. 6. If the issue persists, contact Technical Support.

System firmware error (POST error)---DIMM Compatible Error(LRDIMM and RDIMM are installed)

Event code	0x0f00bed0
Message text	System firmware error (POST error)---DIMM Compatible Error(LRDIMM and RDIMM are installed)
Variable fields	N/A
Severity level	Major
Example	System firmware error (POST error)---DIMM Compatible Error(LRDIMM and RDIMM are installed)
Impact	The system cannot start up correctly.
Cause	Both LRDIMMs and RDIMMs are installed on the same server.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. Install DIMMs as required in the user guide for the server. 3. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---No DIMMs present

Event code	0x0f02a010
Message text	System Firmware Error (POST Error)---No DIMMs present
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No DIMMs present
Impact	The system cannot start up correctly.
Cause	No DIMM is available for the G5 server.
Recommended action	1. Verify that the DIMMs are installed correctly as required in the user guide for the server. 2. Upgrade the BIOS and HDM firmware to the latest version. 3. Power off the server, reconnect all power cords, and then power on the server. Make sure the server is completely powered off before powering on the server. 4. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 5. Replace the DIMM, and then restart the server. 6. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---No DDR memory in the system

Event code	0x0f02a040
Message text	System Firmware Error (POST Error)---No DDR memory in the system
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No DDR memory in the system
Impact	The system cannot start up correctly.
Cause	No DDR memory module is available for the G5 server.
Recommended action	1. Verify that the DIMM is installed correctly as required in the user guide for the server. 2. Upgrade the BIOS and HDM firmware to the latest version. 3. Power off the server, reconnect all power cords, and then power on the server. Make sure the server is completely powered off before powering on the server. 4. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 5. Replace the DIMM, and then restart the server. 6. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation

Event code	0x0f0e8020
Message text	System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation
Impact	The system performance might decrease.
Cause	No DIMM is available for memory mapping.
Recommended action	1. Log in to HDM, access the Memory page, and verify that available DIMMs exist. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---Different DIMM types detected

Event code	0x0f0ed010
Message text	System Firmware Error (POST Error)---Different DIMM types detected.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---Different DIMM types detected.
Impact	The system might fail to start up correctly.
Cause	Different DIMM types were detected.
Recommended action	1. Log in to HDM, access the Event Log page, and identify the slot of the faulty DIMM. 2. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 3. Use the memory configuration query tool on the website accessed at step 2 to verify that the DIMMs are correctly installed. 4. Re-install the DIMMs as required in the user guide for the server. 5. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---DIMM population error

Event code	0x0f0ed020
Message text	System Firmware Error (POST Error)---DIMM population error.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---DIMM population error.
Impact	The system might fail to start up correctly.
Cause	A memory compatibility error occurs.
Recommended action	1. Log in to HDM, access the Event Log page, and identify the slot of the faulty DIMM. 2. Re-install the DIMMs as required in the user guide for the server. 3. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---A maximum of two quad-rank DIMMs can be populated per channel

Event code	0x0f0ed030
Message text	System Firmware Error (POST Error)---A maximum of two quad-rank DIMMs can be populated per channel.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---A maximum of two quad-rank DIMMs can be populated per channel.
Impact	The system might fail to start up correctly.
Cause	A system firmware error (POST error) occurred. You can install a maximum of two quad-rank DIMMs per channel.
Recommended action	1. Re-install the DIMMs as required in the user guide for the server. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---The third DIMM slot with green release tabs does not support UDIMMs or SODIMMs

Event code	0x0f0ed040
Message text	System Firmware Error (POST Error)---The third DIMM slot with green release tabs does not support UDIMMs or SODIMMs.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---The third DIMM slot with green release tabs does not support UDIMMs or SODIMMs.
Impact	The system might fail to start up correctly.
Cause	A system firmware error (POST error) occurred. The third DIMM slot does not support UDIMMs or SODIMMs.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ to identify DIMMs compatible with the server. Replace the UDIMMs or SODIMMs with compatible DIMMs. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---DIMM voltage error

Event code	0x0f0ed050
Message text	System Firmware Error (POST Error)---DIMM voltage error.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---DIMM voltage error.
Impact	The system might fail to start up correctly.
Cause	A system firmware error (POST error) occurred. A DIMM voltage fault is present.
Recommended action	1. Log in to HDM, access the Event Log page, and identify the slot of the faulty DIMM. Cross-validate the DIMM with other DIMMs. If the issue persists, replace the system board because the error is caused by a faulty memory slot. If the issue does not re-occur, replace the faulty DIMM. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---DDR3 and DDR4 DIMMs cannot be mixed

Event code	0x0f0ed060
Message text	System Firmware Error (POST Error)---DDR3 and DDR4 DIMMs cannot be mixed.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---DDR3 and DDR4 DIMMs cannot be mixed.
Impact	The system cannot start up correctly.
Cause	A system firmware error (POST error) occurred. You cannot install both DDR3 DIMMs and DDR4 DIMMs on the same server.
Recommended action	1. Replace the DDR3 DIMMs or DDR4 DIMMs to make sure DIMMs installed on the server are of the same type. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---256-byte and 512-byte SPD devices cannot be mixed

Event code	0x0f0ed070
Message text	System Firmware Error (POST Error)---256-byte and 512-byte SPD devices cannot be mixed.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---256-byte and 512-byte SPD devices cannot be mixed.
Impact	The system might fail to start up correctly.
Cause	A system firmware error (POST error) occurred. You cannot use both 256-byte and 512-byte SPD devices at the same time.
Recommended action	1. Replace the 256-byte or 512-byte SPD devices to make sure devices installed on the server are of the same type. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---3DS and non-3DS LRDIMMs cannot be mixed

Event code	0x0f0ed080
Message text	System Firmware Error (POST Error)---3DS and non-3DS LRDIMMs cannot be mixed.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---3DS and non-3DS LRDIMMs cannot be mixed.
Impact	The system might fail to start up correctly.
Cause	A system firmware error (POST error) occurred. You cannot use both 3DS and non-3DS LRDIMMs on the same server.
Recommended action	1. Replace the 3DS or non-3DS LRDIMMs to make sure DIMMs installed on the server are of the same type. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---DDR-T memory modules and UDIMMs cannot be mixed

Event code	0x0f0ed0b0
Message text	System Firmware Error (POST Error)---DDR-T memory modules and UDIMMs cannot be mixed.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---DDR-T memory modules and UDIMMs cannot be mixed.
Impact	The system might fail to start up correctly.
Cause	A system firmware error (POST error) occurred. You cannot use both DDR-T DIMMs and UDIMMs on the same server.
Recommended action	1. Replace the DDR-T DIMMs or UDIMMs to make sure DIMMs installed on the server are of the same type. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---Memory Unrecognized Initialization Error

Event code	0x0f0ffff0
Message text	System Firmware Error (POST Error)---Memory Unrecognized Initialization Error.
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---Memory Unrecognized Initialization Error.
Impact	The system might fail to start up correctly.
Cause	An error was detected during the initialization of some memory modules.
Recommended action	1. Resolve the issue as instructed by the log reported simultaneously for the component. 2. If the issue persists, contact Technical Support.

System Firmware Hang---Unspecified

Event code	0x0f1000de
Message text	System Firmware Hang---Unspecified.
Variable fields	N/A
Severity level	Major
Example	System Firmware Hang---Unspecified.
Impact	The system cannot operate correctly.
Cause	The BIOS hangs during startup.
Recommended action	1. Resolve the issue based on other event logs reported simultaneously for the component. 2. If the issue persists, contact Technical Support.

System firmware hang-----No DDR Memory Error

Event code	0x0f103e80
Message text	System firmware hang-----No DDR Memory Error.
Variable fields	N/A
Severity level	Major
Example	System firmware hang-----No DDR Memory Error.
Impact	The system cannot operate correctly.
Cause	The operating system hung because no DDR DIMMs were available.
Recommended action	1. Make sure the device is installed with DDR DIMMs as required in the user guide for the server. 2. Upgrade the BIOS and HDM firmware to the latest version. 3. Power off the server, reconnect all power cords, and then power on the server. Make sure the server is completely powered off before powering on the server. 4. Re-install the DIMMs. Make sure that the gold plating is not contaminated, no foreign objects are in the memory slots, and the memory installation complies with the requirements. 5. Replace the DIMMs and power cycle the server again. 6. If the issue persists, contact Technical Support.

System firmware hang---DIMM Compatible Error(LRDIMM and RDIMM are installed)

Event code	0x0f10bed0
Message text	System firmware hang---DIMM Compatible Error(LRDIMM and RDIMM are installed).
Variable fields	N/A
Severity level	Major
Example	System firmware hang---DIMM Compatible Error(LRDIMM and RDIMM are installed).
Impact	The system cannot operate correctly.
Cause	Both LRDIMMs and RDIMMs are installed on the device.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. Re-install the DIMMs as required in the user guide for the server. 3. If the issue persists, contact Technical Support.

System firmware hang---Memory Unrecognized Initialization Error

Event code	0x0f1ffff0
Message text	System firmware hang---Memory Unrecognized Initialization Error.
Variable fields	N/A
Severity level	Critical
Example	System firmware hang---Memory Unrecognized Initialization Error.
Impact	The system cannot operate correctly.
Cause	A memory initialization error occurred on the DIMMs for the primary CPU. As a result, no DIMMs are available for the primary CPU, which causes the system to hang.
Recommended action	1. Resolve the issue as instructed by the logs reported simultaneously for the component. 2. If the issue persists, contact Technical Support.

System Firmware Progress---Current Memory Ras Mode

Event code	0x0f20eff0
Message text	System Firmware Progress---Current Memory Ras Mode.
Variable fields	N/A
Severity level	Info
Example	System Firmware Progress---Current Memory Ras Mode.
Impact	No negative impact.
Cause	The memory is in RAS mode.
Recommended action	No action is required.

System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM symmetry on the socket

Event code	0x0f017130
Message text	System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM symmetry on the socket
Variable fields	N/A
Severity level	Minor
Example	System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM symmetry on the socket
Impact	The system performance might decrease.
Cause	The DIMMs are installed incorrectly.
Recommended action	1. Access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/ and verify that the DIMMs are compatible with the server. 2. Re-install the DIMMs as required in the user guide for the server. 3. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---No DIMMs installed for CPU

Event code	0x0f017180
Message text	System Firmware Error (POST Error)---No DIMMs installed for CPU
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No DIMMs installed for CPU
Impact	The system cannot operate correctly.
Cause	No DIMM is installed.
Recommended action	1. Verify that DIMMs are installed correctly as required in the user guide for the server. 2. Upgrade the BIOS and HDM firmware to the latest version. 3. Power off the server, reconnect all power cords, and then power on the server. Make sure the server is completely powered off before powering on the server. 4. Re-install the DIMMs. Make sure that the gold plating is not contaminated, no foreign objects are in the memory slots, and the memory installation complies with the requirements. 5. Replace the DIMMs and power cycle the server again. 6. If the issue persists, contact Technical Support.

Event Logging Disabled

Log Area Reset/Cleared

Event code	0x102000de
Message text	Log Area Reset/Cleared.
Variable fields	N/A
Severity level	Info
Example	Log Area Reset/Cleared.
Impact	No negative impact.
Cause	This message is displayed if all event logs are cleared.
Recommended action	No action is required.

SEL Full

Event code	0x104000de
Message text	SEL Full.
Variable fields	N/A
Severity level	Minor
Example	SEL Full.
Impact	The system cannot continue recording event logs.
Cause	This message might be displayed if one of the following occurs: · The event log reaches its maximum size. The system stops logging new events, and the old logs might be overwritten. · A user disables event logging.
Recommended action	Log in to HDM, enter the Event Log page, and clear all event logs.

SEL Almost Full

Event code	0x105000de
Message text	SEL Almost Full.
Variable fields	N/A
Severity level	Minor
Example	SEL Almost Full.
Impact	No negative impact.
Cause	The log file is reaching its maximum size.
Recommended action	Log in to HDM, enter the Event Log page, and clear all event logs.

Watchdog1

BIOS Watchdog Reset

Event code	0x110000de
Message text	BIOS Watchdog Reset.
Variable fields	N/A
Severity level	Major
Example	BIOS Watchdog Reset.
Impact	The system restarts.
Cause	This message is displayed if one of the following occurs: · The BIOS startup time exceeds the threshold. · The BIOS gets stuck during startup.
Recommended action	1. Verify that the BIOS is operating correctly: a. Verify that the peripheral modules operate correctly and the BIOS settings are configured correctly. b. Verify that the BIOS debug mode is disabled. 2. If the issue persists, contact Technical Support.

OS Watchdog NMI/Diagnostic Interrupt

Event code	0x115000de
Message text	OS Watchdog NMI/Diagnostic Interrupt.
Variable fields	N/A
Severity level	Major
Example	OS Watchdog NMI/Diagnostic Interrupt.
Impact	If this message is manually triggered, the system might fail to start up correctly.
Cause	A non-maskable interrupt (NMI) was triggered after the OS watchdog was enabled.
Recommended action	1. Verify that the service software is operating correctly. 2. Disable the watchdog from the BIOS. Access the BIOS setup utility, and set OS Watchdog Timer to Disabled. 3. If the issue persists, contact Technical Support.

OS Watchdog pre-timeout Interrupt-non-NMI

Event code	0x117000de
Message text	OS Watchdog pre-timeout Interrupt-non-NMI.
Variable fields	N/A
Severity level	Major
Example	OS Watchdog pre-timeout Interrupt-non-NMI.
Impact	The system might fail to start up correctly.
Cause	The OS failed to start up after a long time, and a non-NMI interrupt was triggered by the OS watchdog pre-timeout.
Recommended action	1. Check the boot options for exceptions, and fix the OS boot environment if any exception is detected. 2. If the issue persists, contact Technical Support.

System Event

Timestamp Clock Synch---event is $1 of pair---SEL Timestamp Clock updated

Event code	0x125000de
Message text	Timestamp Clock Synch---event is $1 of pair---SEL Timestamp Clock updated.
Variable fields	$1: Options include: · first—Pre-synchronization event. · second—Post-synchronization event.
Severity level	Info
Example	Timestamp Clock Synch---event is first of pair---SEL Timestamp Clock updated.
Impact	No negative impact.
Cause	HDM synchronizes the time with the server each time the server is powered up. The first event is triggered before the synchronization, and the second event is triggered after the synchronization.
Recommended action	No action is required.

Timestamp clock synch---BMC Time SYNC succeed

Event code	0x125000de
Message text	Timestamp Clock Synch---BMC Time SYNC succeed.
Variable fields	N/A
Severity level	Info
Example	Timestamp Clock Synch---BMC Time SYNC succeed.
Impact	No negative impact.
Cause	BMC synchronized its time with the ME successfully.
Recommended action	No action is required.

Critical Interrupt

Transition to Non-Critical from OK

Event code	0x1300000e
Message text	Transition to Non-Critical from OK--- Single-bit ECC error---PCIe slot:$1
Variable fields	$1: Slot number.
Severity level	Major
Example	Transition to Non-Critical from OK--- Single-bit ECC error---PCIe slot: 2
Impact	An error occurred during the access to a PCIe module. This has no negative impact on the system operation.
Cause	The PCIe module in the slot is faulty.
Recommended action	This message is generated when an error is detected by PCIe hardware check. Review the related event log messages and replace the faulty PCIe module or contact Technical Support.

PCI: PCIE Hot Plug PCIe Pull Out

Event code	0x13000010
Message text	PCI: PCIE Hot Plug PCIe Pull Out---Slot number $1.
Variable fields	$1: Slot number.
Severity level	Info
Example	PCI: PCIE Hot Plug PCIe Pull Out---Slot number 34.
Impact	No negative impact.
Cause	A PCIe module was removed from the riser card on the operating server. This message is available only for an R8900 G3 server.
Recommended action	1. Verify that the removal operation has been performed. 2. If the PCIe module is not removed, ensure secure installation of the module. 3. If the issue persists, contact Technical Support.

PCI: PCIE Hot Plug PCIe Insert

Event code	0x13100010
Message text	PCI: PCIE Hot Plug PCIe Insert---Slot number $1.
Variable fields	$1: Slot number.
Severity level	Info
Example	PCI: PCIE Hot Plug PCIe Insert---Slot number 34.
Impact	No negative impact.
Cause	A PCIe module was inserted into the riser card on the operating server. This message is available only for an R8900 G3 server.
Recommended action	1. Verify that the insert operation has been performed. 2. Ensure secure installation of the module. 3. If the issue persists, contact Technical Support.

PCI SERR

Event code	0x135000de
Message text	PCI SERR ------Slot $1---PCIE Name: $2.
Variable fields	$1: PCIe device slot number. $2: PCIe device name.
Severity level	Major
Example	PCI SERR ------Slot 5---PCIE Name: EF-I20.
Impact	The system might crash.
Cause	An uncorrectable error occurred on the PCIe device.
Recommended action	1. If the message is reported several times during a period of time, ensure that the riser card is securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe device based on the slot number. 4. If the PCIe device is a removable component, perform the following actions: a. Verify that the PCIe device is installed correctly. b. Verify that the gold plating on the PCIe device is not contaminated. c. Install the PCIe device into another slot to verify whether the error is present on the PCIe device or the slot. d. If the error occurs on the PCIe device, upgrade firmware and drivers of the PCIe device. e. If the error occurs on the slot, verify that the gold plating on the riser card is not contaminated. f. Replace the PCIe device. 5. If the PCIe device is embedded on the system board, perform the following actions: a. Update the BIOS, firmware, and drivers. b. Replace the system board. 6. If the issue persists, contact Technical Support.
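
The variable fields in a message text ($1, $2, and so on) can be recovered from a logged message by turning the message text into a pattern. The following Python sketch illustrates the idea with the PCI SERR entry above. The function names `template_to_regex` and `extract_fields` are hypothetical helpers, not part of HDM, and the sketch assumes that each placeholder appears only once in a message text.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Compile a message text containing $1, $2, ... into a regex.

    Hypothetical helper for illustration; assumes each $N appears once.
    """
    # re.split with a capture group yields literal text at even indexes
    # and the placeholder digits at odd indexes.
    parts = re.split(r"\$(\d+)", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal message text
        else:
            pattern += f"(?P<v{part}>.+?)"      # variable field $N
    return re.compile(pattern)


def extract_fields(template: str, message: str):
    """Return a dict such as {"$1": ...}, or None if the message does not match."""
    match = template_to_regex(template).fullmatch(message)
    if match is None:
        return None
    return {f"${name[1:]}": value for name, value in match.groupdict().items()}


# Message text and example copied from the PCI SERR entry.
TEMPLATE = "PCI SERR ------Slot $1---PCIE Name: $2."
EXAMPLE = "PCI SERR ------Slot 5---PCIE Name: EF-I20."

fields = extract_fields(TEMPLATE, EXAMPLE)
print(fields)
```

Some entries in this document show small spacing differences between the message text and the example (for instance, around the "---" separators), so a production parser would likely need to tolerate optional whitespace instead of matching the text exactly.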

Bus Uncorrectable Error

Event code	0x138000de
Message text	Bus Uncorrectable Error ---Slot $1---PCIE Name:$2.
Variable fields	$1: PCIe device slot number. $2: PCIe device name.
Severity level	Major
Example	Bus Uncorrectable Error---Slot 3---PCIE Name: RAID-LSI-9361-8i.
Impact	An error occurred on a PCIe device. If the error is severe, it might become an error at the host system level.
Cause	An internal uncorrectable error occurred on the PCIe device.
Recommended action	1. If the message is reported several times during a period of time, ensure that the server components such as riser cards are securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe device based on the slot number. 4. If the PCIe device is a removable component, perform the following actions: a. Verify that the PCIe device is installed correctly. b. Verify that the gold plating on the PCIe device is not contaminated. c. Install the PCIe device into another slot to verify whether the error is present on the PCIe device or the slot. d. If the error occurs on the PCIe device, upgrade firmware and drivers of the PCIe device. e. If the error occurs on the slot, verify that the gold plating on the riser card or other components is not contaminated. f. Verify that the server component where the PCIe device is installed is normal. g. Replace the PCIe device. 5. If the PCIe device is embedded on the system board, perform the following actions: a. Update the BIOS, firmware, and drivers. b. Replace the system board. 6. If multiple GPU modules or multiple network adapters in the network adapter cage encounter errors, replace the SW module or system board. 7. If the issue persists, contact Technical Support.

Bus Fatal Error

Event code	0x13a000de
Message text	Bus Fatal Error ------Slot $1---PCIE Name: $2.
Variable fields	$1: PCIe device slot number. $2: PCIe device name.
Severity level	Major
Example	Bus Fatal Error---Slot 3---PCIE Name: RAID-LSI-9361-8i.
Impact	An access error occurred on a PCIe device. If the error is severe, it might become an error at the host system level.
Cause	An internal fatal error occurred on the PCIe device.
Recommended action	1. If the message is reported several times during a period of time, ensure that the server components such as riser cards are securely connected to the system board. 2. Reboot the server and verify that the message is not generated again. 3. Locate the failed PCIe device based on the slot number. 4. If the PCIe device is removable, perform the following actions: a. Verify that the PCIe device is installed correctly. b. Verify that the gold plating on the PCIe device is not contaminated. c. Install the PCIe device into another slot to determine whether the error is present on the PCIe device or the slot. d. If the error occurs on the PCIe device, upgrade firmware and drivers of the PCIe device. e. If the error occurs on the slot, verify that the gold plating on the riser card or components is not contaminated. f. Verify that the server component where the PCIe device is installed is normal. g. Replace the PCIe device. 5. If the PCIe device is embedded on the system board, perform the following actions: a. Update the BIOS, firmware, and drivers. b. Replace the system board. 6. If multiple GPU modules or multiple network adapters in the network adapter cage encounter errors, replace the SW module or system board. 7. If the issue persists, contact Technical Support.

Button/Switch

Power Button pressed---Physical button---Button pressed

Event code	0x140000de
Message text	Power Button pressed---Physical button---Button pressed.
Variable fields	N/A
Severity level	Info
Example	Power Button pressed---Physical button---Button pressed.
Impact	The system is powered on or powered off.
Cause	This message is displayed if the power button on the front panel of the server is pressed.
Recommended action	No action is required.

Power Button pressed---Physical button---Button released

Event code	0x140000de
Message text	Power Button pressed---Physical button---Button released.
Variable fields	N/A
Severity level	Info
Example	Power Button pressed---Physical button---Button released.
Impact	The system is powered on or powered off.
Cause	This message is displayed if the power button on the front panel of the server is released.
Recommended action	No action is required.

Power Button pressed---Virtual button---Power cycle command

Event code	0x140000de
Message text	Power Button pressed---Virtual button---Power cycle command.
Variable fields	N/A
Severity level	Info
Example	Power Button pressed---Virtual button---Power cycle command.
Impact	The server restarts.
Cause	This message is displayed if a power-cycle operation (Force System Cycle) is performed from HDM or a KVM console.
Recommended action	No action is required.

Power Button pressed---Virtual button---Power off command

Event code	0x140000de
Message text	Power Button pressed---Virtual button---Power off command.
Variable fields	N/A
Severity level	Info
Example	Power Button pressed---Virtual button---Power off command.
Impact	The server is powered off.
Cause	This message is generated when you press the physical power button on the front panel of the server or execute commands to forcedly power off the server, gracefully power off the server, or power cycle the server.
Recommended action	No action is required.

Power Button pressed---Virtual button---Power on command

Event code	0x140000de
Message text	Power Button pressed---Virtual button---Power on command.
Variable fields	N/A
Severity level	Info
Example	Power Button pressed---Virtual button---Power on command.
Impact	The server is powered on.
Cause	This message is generated when you press the physical power button on the front panel of the server or execute commands to forcedly power off the server, gracefully power off the server, or power cycle the server.
Recommended action	No action is required.

Power Button pressed---Virtual button---Soft off command

Event code	0x140000de
Message text	Power Button pressed---Virtual button---Soft off command.
Variable fields	N/A
Severity level	Info
Example	Power Button pressed---Virtual button---Soft off command.
Impact	The server is powered off.
Cause	This message is generated when you press the physical power button on the front panel of the server or execute commands to forcedly power off the server, gracefully power off the server, or power cycle the server.
Recommended action	No action is required.

Reset Button pressed---Virtual button---Reset command

Event code	0x142000de
Message text	Reset Button pressed---Virtual button---Reset command.
Variable fields	N/A
Severity level	Info
Example	Reset Button pressed---Virtual button---Reset command.
Impact	The server restarts.
Cause	This message is generated when one of the following conditions is met: · The reset command was executed. · An IERR error occurred.
Recommended action	1. Verify that the reset command was executed. If the command was executed, no action is required. 2. Check whether an IERR error occurred. 3. If the issue persists, contact Technical Support.

FRU service request button---Physical button---Uid button pressed

Event code	0x144000de
Message text	FRU service request button---Physical button---Uid button pressed.
Variable fields	N/A
Severity level	Info
Example	FRU service request button---Physical button---Uid button pressed.
Impact	No negative impact.
Cause	This message is displayed if the UID button is pressed.
Recommended action	No action is required.
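
Several Button/Switch entries above share the event code 0x140000de, so the event code alone does not identify a message. The following Python sketch illustrates one way to handle this by keying a lookup on both the event code and the message text. The `ENTRIES` list is copied from the entries above. The `by_code` index and the `is_known` helper are hypothetical names used only for illustration.

```python
from collections import defaultdict

# (event code, message text) pairs copied from the Button/Switch entries.
ENTRIES = [
    (0x140000DE, "Power Button pressed---Physical button---Button pressed."),
    (0x140000DE, "Power Button pressed---Physical button---Button released."),
    (0x140000DE, "Power Button pressed---Virtual button---Power cycle command."),
    (0x140000DE, "Power Button pressed---Virtual button---Power off command."),
    (0x140000DE, "Power Button pressed---Virtual button---Power on command."),
    (0x140000DE, "Power Button pressed---Virtual button---Soft off command."),
    (0x142000DE, "Reset Button pressed---Virtual button---Reset command."),
    (0x144000DE, "FRU service request button---Physical button---Uid button pressed."),
]

# Group message texts by event code to expose codes that are not unique.
by_code = defaultdict(list)
for code, text in ENTRIES:
    by_code[code].append(text)


def is_known(code: int, text: str) -> bool:
    """Return True if the (event code, message text) pair is in the table."""
    return text in by_code.get(code, [])


# Event codes that map to more than one message text.
ambiguous = sorted(code for code, texts in by_code.items() if len(texts) > 1)
print([hex(code) for code in ambiguous])
```

The same pattern appears elsewhere in this document, for example with 0x125000de and 0x1d7000de, so a tool that filters or routes alarms would probably need the message text as part of the key.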

Module/Board

Transition to Critical from less severe

Event code	0x1520000e
Message text	Transition to Critical from less severe.
Variable fields	N/A
Severity level	Major
Example	Transition to Critical from less severe.
Impact	An access error occurred on a PCIe BUS0 device. If the error is severe, it might become an error at the host system level.
Cause	An internal uncorrectable error occurred on a PCIe BUS0 device.
Recommended action	1. Verify that the system power is being supplied correctly. 2. Replace the component to verify if the component is faulty. 3. If the issue persists, contact Technical Support.

Transition to Non-Recoverable from less severe

Event code	0x1530000e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure on $1($2).
Variable fields	$1: Component, such as Motherboard, PDB, CMOD, and riser card. $2: Fault location, such as P5V, P5V_STBY, CPU1_PVCSA, and CPU2_PVCCIO.
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure on Motherboard(P5V).
Impact	The system might be powered off.
Cause	The voltage inside the card is abnormal.
Recommended action	1. Ignore this message if it is triggered by a system power-on or power-off event. 2. Reconnect power cords and verify whether the server can be powered on correctly. · If the server can be powered on, the message might have been generated because of interference with the detection signals. No action is required. · If the server cannot be powered on, review the SDS logs to locate the fault and replace the faulty component. 3. If the error occurs again, replace the component. 4. If the issue persists, contact Technical Support.

Monitor---Board found PSU output can't be enabled

Event code	0x1570000e
Message text	Monitor---Board found PSU output can't be enabled($1)
Variable fields	$1: Faulty module.
Severity level	Major
Example	Monitor---Board found PSU output can't be enabled(PSU2)
Impact	The system might be powered off.
Cause	This message is generated when the power supply fails to provide power to the system board.
Recommended action	1. Verify that the power supply LED is normal. · If the LED is abnormal, replace the power supply. · If the LED is normal, install the power supply to another normal slot to identify whether the power supply is operating correctly. If the issue persists, replace the system board because the error is caused by a faulty slot. If the issue does not recur, verify that the power supply is installed correctly. If the power supply has been installed correctly, replace the power supply. 2. If the issue persists, contact Technical Support.

Add-in Card

Transition to OK

Event code	0x1700000e
Message text	Transition to OK---PCIe slot: $1---LDDevno:$2.
Variable fields	$1: Slot number of the storage controller. $2: Logical drive number.
Severity level	Info
Example	Transition to OK---PCIe slot:1---LDDevno:0.
Impact	No negative impact.
Cause	This message is generated if the logical drive managed by the storage controller changes from abnormal to normal.
Recommended action	No action is required.

Transition to Critical from less severe

Event code	0x1720000e
Message text	Transition to Critical from less severe---PCIe slot: $1---LDDevno:$2
Variable fields	$1: Slot number of the storage controller. $2: Logical drive number.
Severity level	Major
Example	Transition to Critical from less severe---PCIe slot: 1---LDDevno:0
Impact	The system might be powered off.
Cause	This message is generated when the logical drive managed by the storage controller is degraded or faulty, or the backplane power is faulty.
Recommended action	1. Log in to HDM to verify whether the logical drive is degraded or faulty. 2. If the logical drive is degraded, perform the following actions: · Verify that all member drives in the logical drive are operating correctly. · Re-install member drives to verify whether the drives can be correctly identified. · Access the BIOS to verify whether all member drives have been configured correctly. · Check the error logs for the drives. · Replace the faulty drives. · If the issue persists, contact Technical Support. 3. If the logical drive is faulty, perform the following tasks: · Verify that the drive has not been uninstalled. · Re-install the member drives and rebuild the RAID. · Replace the faulty drives and then restart the server. · If the issue persists, contact Technical Support.

Chassis

Transition to OK

Event code	0x1800000e
Message text	Transition to OK.
Variable fields	N/A
Severity level	Info
Example	Transition to OK.
Impact	No negative impact.
Cause	The chassis status changed from abnormal to normal.
Recommended action	If the event code is 0x1800000e, no action is required. If the event code is 0x1800000f, perform the following operations: 1. Review the logs to identify the failure reason and examine if any other component is faulty. 2. If the issue persists, contact Technical Support.

State asserted

Event code	0x18100006
Message text	State asserted.
Variable fields	N/A
Severity level	Major
Example	State asserted.
Impact	The impact depends on the component where the error occurs.
Cause	The system detected an error.
Recommended action	1. Review the event log reported simultaneously for the component to correct the error. 2. If the issue persists, contact Technical Support.

Transition to Critical from less severe

Event code	0x1820000e
Message text	Transition to Critical from less severe.
Variable fields	N/A
Severity level	Major
Example	Transition to Critical from less severe.
Impact	If the error is severe, it will become a host system-level error.
Cause	The chassis status changed from less severe to critical.
Recommended action	1. Verify that power is being supplied correctly. 2. Review the other logs and examine whether a component is faulty. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x1830000e
Message text	Transition to Non-recoverable from less severe.
Variable fields	N/A
Severity level	Major
Example	Transition to Non-recoverable from less severe.
Impact	The system will be powered off.
Cause	The chassis state changed from less severe to non-recoverable.
Recommended action	1. Verify that power is being supplied correctly. 2. Review the other logs and examine whether a component is faulty. 3. If the issue persists, contact Technical Support.

System Boot/Restart Initiated

Initiated by power up

Event code	0x1d0000de
Message text	Initiated by power up---$1 reset by $2.
Variable fields	$1: System or module. Options include: · BIOS. · BMC. $2: Reboot method. Options include: · Power up. · Power recycle. · Power reset.
Severity level	Info
Example	Initiated by power up---BIOS reset by power up.
Impact	No negative impact.
Cause	This alarm is triggered by system power-on. The message content after "---" is displayed only for a server that has a BIOS_Boot_Up sensor.
Recommended action	1. See related logs for more information. 2. If the issue persists, contact Technical Support.

Initiated by hard reset

Event code	0x1d1000de
Message text	Initiated by hard reset---$1 reset by $2.
Variable fields	$1: System or module. Options include: · BIOS. · BMC. $2: Reboot method. Options include: · Power up. · Power recycle. · Power reset.
Severity level	Info
Example	Initiated by hard reset---BIOS reset by power reset.
Impact	No negative impact.
Cause	This alarm is triggered by system restart. The message content after "---" is displayed only for a server that has a BIOS_Boot_Up sensor.
Recommended action	1. See related logs for more information. 2. If the issue persists, contact Technical Support.

Initiated by warm reset

Event code	0x1d2000de
Message text	Initiated by warm reset---$1 reset by $2.
Variable fields	$1: System or module. Options include: · BIOS. · BMC. $2: Reboot method. Options include: · Power up. · Power recycle. · Power reset.
Severity level	Info
Example	Initiated by warm reset---BIOS reset by power reset.
Impact	No negative impact.
Cause	This alarm is triggered by system warm restart. The message content after "---" is displayed only for a server that has a BIOS_Boot_Up sensor.
Recommended action	1. See related logs for more information. 2. If the issue persists, contact Technical Support.
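
The three "Initiated by" entries above share one layout: a reset kind, optionally followed by "---$1 reset by $2" on servers that have a BIOS_Boot_Up sensor. The following Python sketch illustrates how such a message might be split and checked against the listed options. The function name `parse_boot_event` is hypothetical. The sketch assumes that a message without the optional part still ends with a period, and it compares the reboot method case-insensitively because the examples use lowercase ("power reset") while the options are capitalized.

```python
import re

# Values taken from the entry titles and the Variable fields rows above.
RESET_KINDS = ("power up", "hard reset", "warm reset")
MODULES = ("BIOS", "BMC")
METHODS = ("power up", "power recycle", "power reset")

# The "---$1 reset by $2" part is optional (assumption: the trailing
# period is present in both forms).
PATTERN = re.compile(
    r"Initiated by (?P<kind>.+?)"
    r"(?:---(?P<module>.+?) reset by (?P<method>.+?))?\."
)


def parse_boot_event(message: str):
    """Return (kind, module, method), or None if the message is not recognized."""
    match = PATTERN.fullmatch(message)
    if match is None:
        return None
    kind = match.group("kind").lower()
    if kind not in RESET_KINDS:
        return None
    module, method = match.group("module"), match.group("method")
    if module is None:
        # Server without a BIOS_Boot_Up sensor: no module or method reported.
        return (kind, None, None)
    if module not in MODULES or method.lower() not in METHODS:
        return None
    return (kind, module, method.lower())


result = parse_boot_event("Initiated by warm reset---BIOS reset by power reset.")
print(result)
```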

System restart---due to fan error:power off

Event code	0x1d7000de
Message text	System restart---due to fan error:power off.
Variable fields	N/A
Severity level	Info
Example	System Restart---due to fan error:power off.
Impact	No negative impact.
Cause	The server was powered off because two or more fans at critical locations were absent or faulty.
Recommended action	1. Verify that the air inlets and outlets of the server are not blocked. 2. Log in to HDM, access the Fan page, and verify that all fans are working correctly. 3. Log in to HDM, access the Fan page, and verify that the fan speed is not too low. If the speed is low, adjust the fan speed mode or fan speed level as needed. 4. If the issue persists, contact Technical Support.

System Restart

Event code	0x1d7000de
Message text	System Restart---$1.
Variable fields	$1: Reason for the system restart. Possible options include: · Unknown cause. · Chassis control command—IPMI commands or power options from HDM. · Reset via pushbutton. · Power-up via power pushbutton. · Watchdog expiration. · AC lost.
Severity level	Info
Example	System Restart---Reset via pushbutton.
Impact	No negative impact.
Cause	The system restarted.
Recommended action	No action is required.

System Restart---due to fan error:power reset

Event code	0x1d7000de
Message text	System Restart---due to fan error:power reset.
Variable fields	N/A
Severity level	Info
Example	System Restart---due to fan error:power reset.
Impact	No negative impact.
Cause	The system was reset because two or more fans at critical locations were absent or faulty.
Recommended action	1. Verify that the air inlets and outlets of the server are not blocked. 2. Log in to HDM, access the Fan page, and verify that all fans are working correctly. 3. Log in to HDM, access the Fan page, and verify that the fan speed is not too low. If the speed is low, adjust the fan speed mode or fan speed level as needed. 4. If the issue persists, contact Technical Support.

System Restart---due to fan error:power cycle

Event code	0x1d7000de
Message text	System Restart---due to fan error:power cycle.
Variable fields	N/A
Severity level	Info
Example	System Restart---due to fan error:power cycle.
Impact	No negative impact.
Cause	The system was cold reset because two or more fans at critical locations were absent or faulty.
Recommended action	1. Verify that the air inlets and outlets of the server are not blocked. 2. Log in to HDM, access the Fan page, and verify that all fans are working correctly. 3. Log in to HDM, access the Fan page, and verify that the fan speed is not too low. If the speed is low, adjust the fan speed mode or fan speed level as needed. 4. If the issue persists, contact Technical Support.

Boot Error

No bootable media

Event code	0x1e0000de
Message text	No bootable media.
Variable fields	N/A
Severity level	Info
Example	No bootable media.
Impact	No negative impact.
Cause	No bootable media was found.
Recommended action	1. Specify an available boot device. 2. If the issue persists, contact Technical Support.

OS_BOOT

C: boot completed

Event code	0x1f1000de
Message text	C: boot completed.
Variable fields	N/A
Severity level	Info
Example	C: boot completed.
Impact	No negative impact.
Cause	The operating system booted from a hard drive. This event happens for most Windows OSs.
Recommended action	No action is required.

PXE boot completed

Event code	0x1f2000de
Message text	PXE boot completed.
Variable fields	N/A
Severity level	Info
Example	PXE boot completed.
Impact	No negative impact.
Cause	The operating system booted from a PXE boot device. This event happens for most Windows OSs.
Recommended action	No action is required.

OS Stop/Shutdown

Run-time Critical Stop

Event code	0x201000de
Message text	Run-time Critical Stop--$1.
Variable fields	$1: Reason for the OS crash. This field is optional.
Severity level	Critical
Example	Run-time Critical Stop--System Shut Down Cause by DFC Critical Warning.
Impact	The system crashes.
Cause	A critical error occurred during operating system operation, which caused system crash.
Recommended action	1. Verify that the installed system, drivers, firmware, and software do not have bugs and are compatible with the server. 2. Update the versions if bugs or compatibility issues exist. 3. Verify that the installed hardware options are compatible with the server. For more information about component and server compatibility, access the component compatibility query tool at http://www.h3c.com/cn/Service/Document_Software/Document_Center/Server/. 4. If the issue persists, contact Technical Support.

OS Graceful Stop

Event code	0x202000de
Message text	OS Graceful Stop.
Variable fields	N/A
Severity level	Info
Example	OS Graceful Stop.
Impact	The system is powered off.
Cause	The Windows OS was forcibly stopped.
Recommended action	No action is required.

OS Graceful Shutdown

Event code	0x203000de
Message text	OS Graceful Shutdown.
Variable fields	N/A
Severity level	Info
Example	OS Graceful Shutdown.
Impact	The system is powered off.
Cause	The Windows OS was shut down gracefully.
Recommended action	No action is required.

Slot/Connector

Device disabled: PCIe module information not obtained

Event code	0x21000012
Message text	Device disabled: PCIe module information not obtained---Slot $1.
Variable fields	$1: PCIe slot number.
Severity level	Minor
Example	Device disabled: PCIe module information not obtained---Slot 1.
Impact	The system performance might decrease due to the failure of PCIe module identification.
Cause	The PCIe module is faulty.
Recommended action	1. Verify that the server starts up with the minimum configuration. For more information, see H3C Servers Troubleshooting Guide. 2. Check whether the port is disabled in the BIOS. 3. Verify that the PCIe module is compatible with the server. 4. Verify that the PCIe module is installed correctly. 5. Install the PCIe module into another slot to verify that the PCIe module is not faulty. 6. If the issue persists, contact Technical Support.

triggered an uncorrectable error

Event code	0x210000de
Message text	$1 triggered an uncorrectable error.
Variable fields	$1: PCIe module type.
Severity level	Major
Example	NIC triggered an uncorrectable error.
Impact	An error occurred on a PCIe device. If the error is severe, it might become an error at the host system level.
Cause	An IERR or MCERR error was triggered. The error was identified as a PCIe uncorrectable error by SHD.
Recommended action	1. Use the slot number to locate the faulty PCIe module. 2. If the PCIe module is removable, perform the following tasks: a. Update the firmware and drivers for the PCIe module to the latest version. b. Verify that the PCIe module is installed correctly as required. c. Install the PCIe module into another slot to determine whether the error is present on the module or the slot. 3. If the PCIe module is embedded on the system board, perform the following tasks: a. Update the BIOS firmware and drivers to the latest version. b. Replace the system board. 4. If the issue persists, contact Technical Support.

triggered a correctable error

Event code	0x211000de
Message text	$1 triggered a correctable error.
Variable fields	$1: PCIe module type.
Severity level	Minor
Example	NIC triggered a correctable error.
Impact	An error occurred on a PCIe device. If the error is severe, it might become an error at the host system level.
Cause	An IERR or MCERR error was triggered. The error was identified as a PCIe correctable error by SHD.
Recommended action	1. Ignore this message if it is an occasional event. 2. If the same message is generated repeatedly, use the slot number to locate the faulty PCIe module. 3. If the PCIe module is removable, perform the following tasks: a. Upgrade firmware and drivers for the PCIe module. b. Verify that the PCIe module is installed correctly. c. Install the PCIe module into another slot to determine whether the error is present on the module or the slot. 4. If the PCIe module is embedded on the system board, perform the following tasks: a. Update the BIOS firmware and drivers. b. Replace the system board. 5. If the issue persists, contact Technical Support.

Slot/Connector Device installed/attached

Event code	0x212000de
Message text	Slot/Connector Device installed/attached.
Variable fields	N/A
Severity level	Info
Example	Slot/Connector Device installed/attached.
Impact	No negative impact.
Cause	A module was installed.
Recommended action	No action is required.

Transition to on line

Event code	0x21300014
Message text	Transition to on line.
Variable fields	N/A
Severity level	Info
Example	Transition to on line.
Impact	No negative impact.
Cause	The shared network port is connected to the network.
Recommended action	No action is required.

Transition to off line

Event code	0x21300015
Message text	Transition to off line.
Variable fields	N/A
Severity level	Info
Example	Transition to off line.
Impact	No negative impact.
Cause	The shared network port lost network connection.
Recommended action	Verify whether the network cable has been removed from the shared network port. If the cable is still connected, contact Technical Support.

Transition to Non-Critical from OK

Event code	0x2110000e
Message text	Transition to Non-Critical from OK---Slot $1
Variable fields	$1: Slot number of the network adapter.
Severity level	Major
Example	Transition to Non-Critical from OK---Slot 6
Impact	The system performance might decrease due to an error on the PCIe module.
Cause	This message is generated when the network adapter is disconnected unexpectedly.
Recommended action	1. Verify that the network adapter is not faulty. 2. Verify that the related links such as I2C or MCTP link are in correct status. 3. If the issue persists, contact Technical Support.

System ACPI Power State

S0/G0 "working"

Event code	0x220000de
Message text	S0/G0 "working".
Variable fields	N/A
Severity level	Info
Example	S0/G0 "working".
Impact	No negative impact.
Cause	The system is operating correctly. G(0-2) indicate the global states. S(0-5) indicate the sleep states. In G0 state, applications can operate. In S0 state, the system is operating correctly.
Recommended action	No action is required.

S5/G2 "soft-off"

Event code	0x225000de
Message text	S5/G2 "soft-off".
Variable fields	N/A
Severity level	Info
Example	S5/G2 "soft-off".
Impact	No negative impact.
Cause	The device is in the soft-off state. You cannot run applications or the operating system in this state. A soft shutdown shuts down the entire system except the main power supply unit, so almost no power is consumed. Rebooting the system after a soft shutdown takes longer than waking it from a sleep state.
Recommended action	No action is required.

LPC Reset occurred

Event code	0x22d000de
Message text	LPC Reset occurred.
Variable fields	N/A
Severity level	Info
Example	LPC Reset occurred.
Impact	No negative impact.
Cause	The server was reset. This message is available only for servers that use Intel processors.
Recommended action	No action is required.

Watchdog2

Watchdog overflow.Action:Timer expired

Event code	0x230000de
Message text	Watchdog overflow.Action:Timer expired - status only (no action and no interrupt)---interrupt type:$1---timer use at expiration:$2.
Variable fields	$1: Type of the interruption. Options include: none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Type of the watchdog timer. Options include: reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Info
Example	Watchdog overflow.Action:Timer expired - status only (no action and no interrupt)---interrupt type:none---timer use at expiration:BIOS FRB2.
Impact	The system cannot start up.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or OS operation. · The timeout action is set to no action.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to verify if software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Verify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Hard Reset

Event code	0x231000de
Message text	Watchdog overflow.Action:Hard Reset---interrupt type:$1---timer use at expiration:$2.
Variable fields	$1: Type of the interruption. Options include: none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Type of the watchdog timer. Options include: reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Major
Example	Watchdog overflow.Action:Hard Reset---interrupt type:none---timer use at expiration:BIOS FRB2.
Impact	The system cannot start up.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to hard reset.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to verify if software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Verify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Power Down

Event code	0x232000de
Message text	Watchdog overflow.Action:Power Down---interrupt type:$1---timer use at expiration:$2.
Variable fields	$1: Type of the interruption. Options include: none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Type of the watchdog timer. Options include: reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Major
Example	Watchdog overflow.Action:Power Down---interrupt type:none---timer use at expiration:BIOS FRB2.
Impact	The system cannot start up.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to power down. In this case, the watchdog forcibly powers off the system. Services are interrupted and unsaved data is lost.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to verify if software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Verify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Power Cycle

Event code	0x233000de
Message text	Watchdog overflow.Action:Power Cycle---interrupt type:$1---timer use at expiration:$2.
Variable fields	$1: Type of the interruption. Options include: none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Type of the watchdog timer. Options include: reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Major
Example	Watchdog overflow.Action:Power Cycle---interrupt type:none---timer use at expiration:BIOS FRB2.
Impact	The system cannot start up.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to power cycle.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to verify if software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Verify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Timer interrupt

Event code	0x238000de
Message text	Watchdog overflow.Action:Timer interrupt---interrupt type:$1---timer use at expiration:$2.
Variable fields	$1: Type of the interruption. Options include: none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Type of the watchdog timer. Options include: reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Minor
Example	Watchdog overflow.Action:Timer interrupt---interrupt type:none---timer use at expiration:BIOS FRB2.
Impact	The system cannot start up.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to timer interrupt.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to verify if software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Verify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.
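
The watchdog messages above share one layout: an action, an interrupt type ($1), and a timer use ($2), joined by "---". The following Python sketch shows one way to pull those fields out of a log line. The pattern is inferred from the message texts in this section and is not an official HDM grammar. The names WATCHDOG_RE and parse_watchdog are hypothetical.

```python
import re

# Pattern inferred from the "Message text" rows above (an assumption,
# not a format published by H3C).
WATCHDOG_RE = re.compile(
    r"^Watchdog overflow\.Action:(?P<action>.+?)"
    r"---interrupt type:(?P<interrupt>.+?)"
    r"---timer use at expiration:(?P<timer_use>.+?)\.?$"
)


def parse_watchdog(message):
    """Return a dict with action, interrupt, and timer_use, or None."""
    match = WATCHDOG_RE.match(message.strip())
    return match.groupdict() if match else None


# Example line copied from the Hard Reset entry above.
example = (
    "Watchdog overflow.Action:Hard Reset---interrupt type:none"
    "---timer use at expiration:BIOS FRB2."
)
fields = parse_watchdog(example)
print(fields["action"], fields["interrupt"], fields["timer_use"])
```

The timer_use value (for example, BIOS POST or OS Load) is the field that selects which step of the recommended action applies.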

Management subsystem health

Management controller off-line

Event code	0x282000de
Message text	Management controller off-line.
Variable fields	N/A
Severity level	Info
Example	Management controller off-line.
Impact	No negative impact.
Cause	HDM is offline. Possible reasons include HDM shutdown or no power input (AC power is lost).
Recommended action	1. Review operation logs and verify if a power-off operation was performed by a user. If yes, wait for HDM to restart. 2. If no power-off operation was performed or if the issue occurs again after HDM restarts, examine if an AC lost event has occurred or a power supply is operating incorrectly. 3. Replace the power supply. 4. If the issue persists, contact Technical Support.

Management controller off-line---BMC reset

Event code	0x282000de
Message text	Management controller off-line---BMC reset.
Variable fields	N/A
Severity level	Info
Example	Management controller off-line---BMC reset.
Impact	No negative impact.
Cause	This message is often generated when a user resets HDM.
Recommended action	1. Review operation logs and verify if a warm reset operation was performed by a user. If yes, wait for HDM to restart. 2. Verify that no system board or power supply error is present. 3. If the issue persists, contact Technical Support.

Management controller off-line---HDM cold reboot

Event code	0x282000de
Message text	Management controller off-line---HDM cold reboot.
Variable fields	N/A
Severity level	Info
Example	Management controller off-line---HDM cold reboot.
Impact	No negative impact.
Cause	This message is often generated when a user performs a cold HDM reboot.
Recommended action	1. Review operation logs and verify if a cold reboot operation was performed by a user. If yes, wait for HDM to restart. 2. Verify if an AC lost event has occurred or the power cord is disconnected or faulty. · If an AC lost event occurred or a power supply fails or is faulty, replace the power supply. · If the power cord is disconnected, reconnect the power cord. · If the power cord is faulty, replace the power cord. 3. If the issue persists, contact Technical Support.

Management controller off-line---BMC WDT timeout event happened

Event code	0x282000de
Message text	Management controller off-line---BMC WDT timeout event happened.
Variable fields	N/A
Severity level	Info
Example	Management controller off-line---BMC WDT timeout event happened.
Impact	No negative impact.
Cause	The management controller went offline because the system was restarted by the watchdog.
Recommended action	1. Review event logs to identify the cause of watchdog timeout. 2. Upgrade HDM to the latest version. 3. If the issue persists, contact Technical Support.

Management controller off-line---BMC service restart

Event code	0x282000de
Message text	Management controller off-line---BMC service restart.
Variable fields	N/A
Severity level	Info
Example	Management controller off-line---BMC service restart.
Impact	No negative impact.
Cause	HDM restarted proactively.
Recommended action	1. Verify if HDM restarted or was upgraded and if HDM is operating correctly. If the issue rarely occurs and HDM operates correctly, no action is required. 2. If the issue persists, contact Technical Support.
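
The five off-line messages above share event code 0x282000de and differ only in the text after "---". A minimal Python sketch, assuming that "---" always separates the base message from the detail, shows how a script could tell them apart. The function name split_detail and the OFFLINE_CAUSES mapping are hypothetical. The cause strings paraphrase the Cause rows above.

```python
# Causes paraphrased from the entries above, keyed by the detail suffix.
# None stands for the plain message without a suffix.
OFFLINE_CAUSES = {
    None: "HDM shutdown or no power input",
    "BMC reset": "a user reset HDM",
    "HDM cold reboot": "a user performed a cold HDM reboot",
    "BMC WDT timeout event happened": "the watchdog restarted the system",
    "BMC service restart": "HDM restarted proactively",
}


def split_detail(message):
    """Split 'base---detail.' into (base, detail); detail is None if absent."""
    base, sep, detail = message.strip().rstrip(".").partition("---")
    return base.strip(), (detail.strip() if sep else None)


base, detail = split_detail("Management controller off-line---BMC reset.")
print(base, "|", OFFLINE_CAUSES.get(detail, "unknown detail"))
```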

Management controller unavailable

Event code	0x283000de
Message text	Management controller unavailable.
Variable fields	N/A
Severity level	Major
Example	Management controller unavailable.
Impact	No negative impact.
Cause	The management controller is unavailable. Possible reasons include unavailable HDM or ME.
Recommended action	1. Wait for one or two minutes and then refresh the page. 2. Replace the system board. 3. If the issue persists, contact Technical Support.

Management controller unavailable---Adapter $1 is in a fault condition

Event code	0x283000de
Message text	Management controller unavailable---Adapter $1 is in a fault condition
Variable fields	$1: Storage controller model.
Severity level	Major
Example	Management controller unavailable---Adapter RAID-P460-B4 is in a fault condition.
Impact	The system might crash depending on the installation location of the system.
Cause	This message is generated when the PMC storage controller is abnormal.
Recommended action	1. Restart HDM and identify whether the alarm is cleared from the Event Log tab. 2. Restart the server and identify whether the alarm is cleared from the Event Log tab. 3. If the issue persists, contact Technical Support.

Sensor access degraded or unavailable--- Adapter $1 has no response for 2 minutes in $2 slot

Event code	0x280000de
Message text	Sensor access degraded or unavailable--- Adapter $1 has no response for 2 minutes in $2 slot.
Variable fields	$1: Storage controller model. $2: Slot of the storage controller where the alarm occurs.
Severity level	Minor
Example	Sensor access degraded or unavailable--- Adapter RAID-P460-B4 has no response for 2 minutes in 1 slot
Impact	Out-of-band identification is abnormal. If in-band identification is also abnormal, the system might crash.
Cause	This message is generated when HDM fails to identify the PMC storage controller in slot $2 within 2 minutes.
Recommended action	1. Restart HDM and identify whether the alarm is cleared from the Event Log tab. 2. Restart the server and identify whether the alarm is cleared from the Event Log tab. 3. If the issue persists, contact Technical Support.

Sensor access degraded or unavailable--- Adapter $1 has no response for 5 minutes in $2 slot

Event code	0x280000de
Message text	Sensor access degraded or unavailable--- Adapter $1 has no response for 5 minutes in $2 slot
Variable fields	$1: Storage controller model. $2: Slot of the storage controller where the alarm occurs.
Severity level	Minor
Example	Sensor access degraded or unavailable--- Adapter HBA-LAI-9300-8i-A1-X has no response for 5 minutes in 1 slot
Impact	Out-of-band identification is abnormal. If in-band identification is also abnormal, the system might crash.
Cause	This message is generated when HDM fails to identify the LSI storage controller in slot $2 within 5 minutes.
Recommended action	1. Restart HDM and identify whether the alarm is cleared from the Event Log tab. 2. Restart the server and identify whether the alarm is cleared from the Event Log tab. 3. If the issue persists, contact Technical Support.

Sensor failure---Adapter $1 has no response for 4 minutes in $2 slot

Event code	0x284000de
Message text	Management controller unavailable---Adapter $1 has no response for 4 minutes in $2 slot
Variable fields	$1: Storage controller model. $2: Slot of the storage controller where the alarm occurs.
Severity level	Major
Example	Management controller unavailable---Adapter RAID-P460-B4 has no response for 4 minutes in 1 slot
Impact	Out-of-band identification is abnormal. If in-band identification is also abnormal, the system might crash.
Cause	This message is generated when HDM fails to identify the PMC storage controller in slot $2 within 4 minutes.
Recommended action	1. Restart HDM and identify whether the alarm is cleared from the Event Log tab. 2. Restart the server and identify whether the alarm is cleared from the Event Log tab. 3. If the issue persists, contact Technical Support.

Sensor failure--- Adapter $1 has no response for 10 minutes in $2 slot

Event code	0x284000de
Message text	Management controller unavailable---Adapter $1 has no response for 10 minutes in $2 slot.
Variable fields	$1: Storage controller model. $2: Slot of the storage controller where the alarm occurs.
Severity level	Major
Example	Management controller unavailable---Adapter HBA-LAI-9300-8i-A1-X has no response for 10 minutes in 1 slot
Impact	Out-of-band identification is abnormal. If in-band identification is also abnormal, the system might crash.
Cause	This message is generated when HDM fails to identify the LSI storage controller in slot $2 within 10 minutes.
Recommended action	1. Restart HDM and identify whether the alarm is cleared from the Event Log tab. 2. Restart the server and identify whether the alarm is cleared from the Event Log tab. 3. If the issue persists, contact Technical Support.
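
The adapter no-response messages above carry the controller model ($1), the elapsed time in minutes, and the slot ($2) in a fixed phrase. The following Python sketch extracts those values, assuming the phrase "Adapter <model> has no response for <N> minutes in <slot> slot" appears as in the examples. NO_RESPONSE_RE and parse_no_response are hypothetical names.

```python
import re

# Phrase layout taken from the examples above (an assumption about the
# general format, based only on the messages in this section).
NO_RESPONSE_RE = re.compile(
    r"Adapter (?P<model>\S+) has no response for "
    r"(?P<minutes>\d+) minutes in (?P<slot>\S+) slot"
)


def parse_no_response(message):
    """Return model, minutes (int), and slot (str), or None if no match."""
    match = NO_RESPONSE_RE.search(message)
    if match is None:
        return None
    return {
        "model": match.group("model"),
        "minutes": int(match.group("minutes")),
        "slot": match.group("slot"),
    }


example = (
    "Sensor access degraded or unavailable--- Adapter RAID-P460-B4 "
    "has no response for 2 minutes in 1 slot"
)
print(parse_no_response(example))
```

Because the 2-minute and 5-minute messages are Minor and the 4-minute and 10-minute messages are Major, the minutes value can help a script track whether an alarm for the same slot has escalated.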

Battery

Battery low (predictive failure)

Event code	0x290000de
Message text	Battery low (predictive failure)---PCIe slot:$1.
Variable fields	$1: PCIe slot number of the storage controller.
Severity level	Minor
Example	Battery low (predictive failure)---PCIe slot:1
Impact	The reliability of the storage controller decreases, which might cause the decrease of the system performance.
Cause	The supercapacitor of the storage controller has a low charge, overtemperature, overvoltage, or overcurrent condition.
Recommended action	1. Power on the server to charge the supercapacitor. Log in to HDM to view the supercapacitor status and verify whether the alarm is cleared after a period of time. 2. Verify that the power fail safeguard module is installed correctly. 3. Replace the supercapacitor or corresponding flash card (if any), and then restart the server. 4. If the issue persists, contact Technical Support.

Battery failed

Event code	0x291000de
Message text	Battery failed---PCIe slot:$1.
Variable fields	$1: PCIe slot number of the storage controller.
Severity level	Minor
Example	Battery failed---PCIe slot:1.
Impact	The reliability of the storage controller decreases, which might cause the decrease of the system performance.
Cause	An internal error occurred on the power fail safeguard module of the storage controller. Possible reasons include: · The supercapacitor is exhausted or has expired. · The power fail safeguard module failed to be initialized. · The power fail safeguard module subsystem failed. · The supercapacitor failed to be charged. · The supercapacitor failed.
Recommended action	1. Log in to HDM and view the supercapacitor status. 2. Verify that the power fail safeguard module is installed correctly. 3. Replace the supercapacitor or corresponding flash card (if any), and then restart the server. 4. If the issue persists, contact Technical Support.

Battery presence detected

Event code	0x292000df
Message text	Battery presence detected---PCIe slot:$1.
Variable fields	$1: PCIe slot number of the storage controller.
Severity level	Info
Example	Battery presence detected---PCIe slot:1.
Impact	The reliability of the storage controller decreases, which might cause the decrease of the system performance.
Cause	The supercapacitor of the storage controller is not detected.
Recommended action	1. Log in to HDM and view the supercapacitor status. 2. Verify that the supercapacitor is installed correctly and the supercapacitor cable is connected correctly. 3. Replace the supercapacitor or the corresponding flash card (if any), and then restart the server. 4. If the issue persists, contact Technical Support.

ME status

Management controller unavailable

Event code	0xb03000de
Message text	Management controller unavailable.
Variable fields	N/A
Severity level	Minor
Example	Management controller unavailable.
Impact	No negative impact.
Cause	This message is displayed if the ME self-test fails.
Recommended action	1. Identify the current ME version and update the ME to the latest version if a new version is available. 2. Update the BIOS to the latest version. 3. If the issue persists, contact Technical Support.

OEM Record

System Source Monitor:Mem usage exceeds the threshold

Event code	0xe01000de
Message text	System Source Monitor:Mem usage exceeds the threshold---Current usage $1 Threshold $2.
Variable fields	$1: Memory usage. $2: Memory usage threshold.
Severity level	Info
Example	System Source Monitor:Mem usage exceeds the threshold---Current usage 100%, Threshold 80%.
Impact	The system might get stuck.
Cause	The memory usage exceeded the alarm threshold. This alarm is triggered by FIST SMS.
Recommended action	1. Verify that the memory usage threshold setting is reasonable. 2. Check the current memory usage. Adjust the running services to lower the memory usage or expand the memory capacity. 3. If the issue persists, contact Technical Support.

System Source Monitor: Relieve resource alarm about Mem Usage

Event code	0xe01000df
Message text	System Source Monitor: Relieve resource alarm about Mem Usage---Current usage $1 Threshold $2.
Variable fields	$1: Memory usage. $2: Memory usage threshold.
Severity level	Info
Example	System Source Monitor:Relieve resource alarm about Mem Usage---Current usage 80%, Threshold 100%.
Impact	No negative impact.
Cause	The memory usage dropped below the alarm threshold. This alarm is triggered by FIST SMS.
Recommended action	No action is required.

System Source Monitor:Cpu usage exceeds the threshold

Event code	0xe02000de
Message text	System Source Monitor:Cpu usage exceeds the threshold---Current usage $1 Threshold $2.
Variable fields	$1: CPU usage. $2: CPU usage threshold.
Severity level	Info
Example	System Source Monitor:Cpu usage exceeds the threshold---Current usage 100%, Threshold 80%.
Impact	The system performance might decrease.
Cause	The processor usage exceeded the alarm threshold. This alarm is triggered by FIST SMS.
Recommended action	1. Verify that the processor usage threshold setting is reasonable. 2. Check the current processor usage. Adjust the running services to lower the processor usage. 3. If the issue persists, contact Technical Support.

System Source Monitor: Relieve resource alarm about Cpu Usage

Event code	0xe02000df
Message text	System Source Monitor: Relieve resource alarm about Cpu Usage---Current usage $1 Threshold $2.
Variable fields	$1: CPU usage. $2: CPU usage threshold.
Severity level	Info
Example	System Source Monitor:Relieve resource alarm about Cpu Usage---Current usage 80%, Threshold 100%.
Impact	No negative impact.
Cause	The processor usage dropped below the alarm threshold. This alarm is triggered by FIST SMS.
Recommended action	No action is required.
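
The four System Source Monitor messages above report a resource (Mem or Cpu), the current usage, and the threshold. The following Python sketch reads those values from a log line. It assumes that the percentages are written as in the examples ("Current usage 100%, Threshold 80%"). USAGE_RE and parse_usage are hypothetical names.

```python
import re

# Layout inferred from the examples above (an assumption; the comma
# between the two values is treated as optional).
USAGE_RE = re.compile(
    r"System Source Monitor:\s*"
    r"(?P<relieve>Relieve resource alarm about\s+)?"
    r"(?P<resource>Mem|Cpu)\b.*?"
    r"Current usage\s+(?P<usage>\d+)%,?\s+Threshold\s+(?P<threshold>\d+)%"
)


def parse_usage(message):
    """Return resource, relieved flag, usage, and threshold, or None."""
    match = USAGE_RE.search(message)
    if match is None:
        return None
    return {
        "resource": match.group("resource"),
        "relieved": match.group("relieve") is not None,
        "usage": int(match.group("usage")),
        "threshold": int(match.group("threshold")),
    }


example = (
    "System Source Monitor:Mem usage exceeds the threshold"
    "---Current usage 100%, Threshold 80%."
)
print(parse_usage(example))
```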

Memory is not certified

Event code	0xe11000de
Message text	Memory is not certified---Location: CPU:$1 CH:$2 DIMM:$3 $4.
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: DIMM identifier.
Severity level	Minor
Example	Memory is not certified---Location:CPU:1 CH:1 DIMM:0 A1
Impact	No negative impact.
Cause	HDM performs anti-counterfeit verification on DIMMs every time the BIOS restarts. This message is displayed if a DIMM is not certified by H3C.
Recommended action	1. Log in to HDM, access the Memory page, and verify that all DIMMs are H3C certified. Using a non-H3C-certified DIMM can cause stability issues. 2. Verify that the DIMMs are correctly installed. 3. If the issue persists, contact Technical Support.

Numbering CPUs

CPUs are numbered starting from different numbers for different servers, as shown in Table 3.

Table 3 CPU numbering for different servers

Servers	Starting number for CPUs
· H3C UniServer B5700 G3 · H3C UniServer B5700 G5 · H3C UniServer B5800 G3 · H3C UniServer B7800 G3 · H3C UniServer E3200 G3 · H3C UniServer R2700 G3 · H3C UniServer R2900 G3 · H3C UniServer R4300 G3 · H3C UniServer R4300 G5 · H3C UniServer R4330 G5 · H3C UniServer R4400 G3 · H3C UniServer R4500 G3 · H3C UniServer R4700 G3 · H3C UniServer R4700 G5 · H3C UniServer R4900 G3 · H3C UniServer R4900 G5 · H3C UniServer R4900LC G5 · H3C UniServer R4930 G5 · H3C UniServer R4950 G5 · H3C UniServer R5300 G3 · H3C UniServer R5300 G5 · H3C UniServer R5500 G5 · H3C UniServer R6700 G3 · H3C UniServer R6900 G3 · H3C UniServer R6900 G5 · H3C UniServer R8900 G3	CPU 1
· H3C UniServer R4950 G3 (Hygon) · H3C UniServer R4950 G3 (Naples) · H3C UniServer R4950 G3 (Rome)	CPU 0
H3C UniServer R4100 G3	The server comes with only one CPU and does not require CPU numbering.
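
Because the starting CPU number differs by server model, a script that compares CPU locations across servers might first convert them to a common base. The following Python sketch does this for the "Memory is not certified" message. The CPU_START mapping copies only a few rows of Table 3 as an illustration and must be extended from the table. LOCATION_RE and zero_based_cpu are hypothetical names.

```python
import re

# Starting CPU number per server model, copied from a few rows of Table 3.
# This is a partial, illustrative mapping; unknown models return None.
CPU_START = {
    "H3C UniServer R4700 G3": 1,
    "H3C UniServer R4900 G5": 1,
    "H3C UniServer R4950 G3 (Hygon)": 0,
    "H3C UniServer R4950 G3 (Naples)": 0,
    "H3C UniServer R4950 G3 (Rome)": 0,
}

# Location layout taken from the "Memory is not certified" example.
LOCATION_RE = re.compile(r"CPU:(?P<cpu>\d+) CH:(?P<ch>\d+) DIMM:(?P<dimm>\d+)")


def zero_based_cpu(message, model):
    """Return the CPU index counted from 0, or None if it cannot be derived."""
    match = LOCATION_RE.search(message)
    start = CPU_START.get(model)
    if match is None or start is None:
        return None
    return int(match.group("cpu")) - start


example = "Memory is not certified---Location:CPU:1 CH:1 DIMM:0 A1"
print(zero_based_cpu(example, "H3C UniServer R4900 G5"))
print(zero_based_cpu(example, "H3C UniServer R4950 G3 (Rome)"))
```

With this mapping, the same "CPU:1" text refers to the first CPU on an R4900 G5 and to the second CPU on an R4950 G3 (Rome).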