H3C HDM2 System Log Messages Reference-6W102

H3C HDM2
System Log Messages Reference

No part of this manual may be reproduced or transmitted in any form or by any means without prior written consent of New H3C Technologies Co., Ltd.

Except for the trademarks of New H3C Technologies Co., Ltd., any trademarks that may be mentioned in this document are the property of their respective owners.

The information in this document is subject to change without notice.

Contents

Introduction

Use cases

Obtaining system log messages

System log severity level

Using this document

Applicable products

Event log messages

Temperature

Dropped below the lower minor threshold

Dropped below the lower major threshold

Dropped below the lower critical threshold

Exceeded the upper minor threshold

Exceeded the upper major threshold

Exceeded the upper critical threshold

Voltage

Dropped below the lower minor threshold

Dropped below the lower major threshold

Dropped below the lower critical threshold

Exceeded the upper minor threshold

Exceeded the upper major threshold

Exceeded the upper critical threshold

Transition to Non-recoverable from less severe· 11

Transition to Non-recoverable from less severe· 12

Transition to Non-recoverable from less severe· 13

Transition to Non-recoverable from less severe· 14

Transition to Non-recoverable from less severe· 15

Transition to Non-recoverable from less severe· 16

Transition to Non-recoverable from less severe· 17

Transition to Non-recoverable from less severe· 18

Transition to Non-recoverable from less severe· 19

Transition to Non-recoverable from less severe· 20

Transition to Non-recoverable from less severe· 21

Transition to Non-recoverable from less severe· 22

Transition to Non-recoverable from less severe· 23

Transition to Non-recoverable from less severe· 24

Transition to Non-recoverable from less severe· 25

Transition to Non-recoverable from less severe· 26

Transition to Non-recoverable from less severe· 27

Transition to Non-recoverable from less severe· 28

Transition to Non-recoverable from less severe· 29

Transition to Non-recoverable from less severe· 30

Transition to Non-recoverable from less severe· 31

Transition to Non-recoverable from less severe· 32

Transition to Non-recoverable from less severe· 33

Transition to Non-recoverable from less severe· 34

Transition to Non-recoverable from less severe· 35

Transition to Non-recoverable from less severe· 36

Transition to Non-recoverable from less severe· 37

Transition to Non-recoverable from less severe· 38

Transition to Non-recoverable from less severe· 39

Transition to Non-recoverable from less severe· 40

Transition to Non-recoverable from less severe· 41

Transition to Non-recoverable from less severe· 42

Transition to Non-recoverable from less severe· 43

Transition to Non-recoverable from less severe· 44

Transition to Non-recoverable from less severe· 45

Transition to Non-recoverable from less severe· 46

Transition to Non-recoverable from less severe· 47

Transition to Non-recoverable from less severe· 48

Transition to Non-recoverable from less severe· 49

Transition to Non-recoverable from less severe· 50

Transition to Non-recoverable from less severe· 51

Current

Transition to Critical from less severe

Exceeded the upper minor threshold

Exceeded the upper major threshold

Exceeded the upper critical threshold

Fan

Predictive Failure deasserted

Predictive Failure asserted

Transition to Running

Transition to Off Line

Transition to Degraded

Fully Redundant

Non-redundant:Sufficient Resources from Redundant

Non-redundant:Insufficient Resources

Physical Security

General Chassis Intrusion

FRB1/BIST failure

FRB2/Hang in POST failure

FRB3/Processor Startup/Initialization failure

Configuration Error

Processor Presence detected

Processor Automatically Throttled

Processor Automatically Throttled

Machine Check Exception

Triggered a uncorrectable error

Machine Check Exception

Triggered a correctable error

Correctable Machine Check Error

Machine Check Exception

Correctable Machine Check Error

Power Supply

Presence detected

Power Supply Failure detected

Power Supply Predictive Failure

Power Supply input lost (AC/DC)

Power Supply input lost or out-of-range

Power Supply input out-of-range - but present

Configuration error---Vendor mismatch

Configuration error---Power Supply rating mismatch

Configuration error---Power supply rating mismatch

Power Supply Inactive/standby state

PSU failure detected by CPLD

Redundancy Lost

Power Unit

Power limit is exceeded over correction time limit

Cooling Device

Transition to OK

Transition to Non-recoverable---Liquid leakage occurred

Transition to Non-recoverable from less severe

Transition to Non-Critical from OK---Liquid leakage detection cable is disconnected

Other Units-based Sensor

Exceeded the upper minor threshold

Memory

Correctable ECC or other correctable memory error

Correctable ECC or other correctable memory error

CPU triggered a correctable error

Uncorrectable ECC or other uncorrectable memory error

Uncorrectable ECC or other uncorrectable memory error

Triggered an uncorrectable error

Uncorrectable ECC or other uncorrectable memory error

Parity

Parity

Parity---An uncorrectable error occurs during the memory test phase

Parity---The memory interleaving configuration cannot meet the requirements of the server

Parity---The memory interleaving configuration cannot meet the requirements of the server

Parity---CMD eye width is too small

Parity---CmdPiGroup: No Eye width

Parity---The command is not in the FNv table

Parity---Memory read DqDqs training failed

Parity---Memory Receive Enable Training Error

Parity---Memory write DqDqs training failed

Parity---An error occurs during memory test, and the rank is disabled

Parity---LRDIMM RCVEN training failed

Parity---Read delay training failed

Parity---Write delay training failed

Parity---Mapped out because failed critical mask test at cold boot

Parity---Invalid SPD contents

Parity---The DCPMM memory modules of the unexpected model are installed

Parity---Failed to set the VDD voltage of the DIMM

Parity---Delay exceeded

Parity---Timing error occurred during signal line adjustment for memory write leveling training

Parity---CS is not consistent with clock in timing, and the channel is isolated

Parity---CA is not consistent with clock in timing, and the channel is isolated

Parity---LRDIMM external coarse training failed

Parity---LRDIMM external fine training failed

Parity---LRDIMM internal coarse training failed

Parity---LRDIMM internal fine training failed

Memory Device Disabled---The Rank is disabled

Memory Device Disabled---The DIMM is disabled

Memory Device Disabled

Memory Device Disabled

Correctable ECC or other memory error limit reached

Correctable ECC or other memory error limit reached

Presence detected

Memory patrol scrub CE occurred

Memory patrol scrub UCE occurred and degraded to CE

Configuration error---RDIMMs are installed on the server that supports only UDIMMs

Configuration error---UDIMMs are installed on the server that supports only RDIMMs

Configuration error---SODIMMs are installed on the server that supports only RDIMMs

Configuration error---The number of ranks per channel can be only 1, 2, or 4

Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported

Configuration error---The number of ranks in the channel exceeds 8

Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server

Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V

Configuration error---The CPU is not compatible with 3DS DIMMs

Configuration error---NVDIMMs with stepping lower than 0x10 are not supported

Configuration error---The CPU is not compatible with the DIMMs

Configuration error---The frequency of the DIMM is not supported on the server

Configuration error---24Gb or higher Capacity DRAMs not supported with this CPU

Configuration error---The CPU is not compatible with LRDIMMs

Configuration error---DCPMM + HBM config is not supported. Disable DCPMM populated channel

Configuration error---Failed to enable the lockstep mode The memory RAS mode has degraded to independent

Configuration error---Failed to enable the full mirror mode

Configuration error---Failed to enable the partial mirror mode The memory RAS mode degraded to independent

Configuration error---The memory interleaving configuration cannot meet the requirements of the server

Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent

Configuration error---Failed to enable patrol scrubbing

Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty

Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel

Configuration error---The DDR-T memory module is installed in the white slot

Configuration error---ODT configuration errorThe channel is isolated

Configuration error---REQ is not consistent with clock in timing

Configuration error---Failed to enable ADDDC

Configuration error---NVMCTRL_MEDIA_NOTREADY

Drive Slot

Drive Presence

Drive Fault

Drive Fault---The disk is missing

Predictive Failure

In Critical Array

In Failed Array

Rebuild/Remap in progress

The disk triggered an media error

The disk triggered an uncorrectable error

The disk is missing

System Firmware Progress

System Firmware Error (POST Error)---Run sense AMP HW FSM failed

System Firmware Error (POST Error)---Memory population enforcement mismatch, Please check the DIMM symmetry on the socket

System Firmware Error (POST Error)---No Dimm on socket$1

System Firmware Error (POST Error)---No memory found

System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation

System Firmware Error (POST Error)---DIMM population error

System Firmware Error (POST Error)---Some CPU links failed to train. UPI topology changed across reset

System Firmware Error (POST Error)---CPU stepping mismatch detected

System Firmware Error (POST Error)---KTI Topology Change Logged

System Firmware Error (POST Error)---CPU matching failure---CPU stepping is detected

System Firmware Error (POST Error)---CPU matching failure---CPU frequency is detected

System Firmware Error (POST Error)---CPU matching failure---CPU Microcode is detected

System Firmware Error (POST Error)---CPU matching failure---UPI Topology is detected

System Firmware Error(POST Error)---Unrecoverable video controller failure

System Firmware Hang

System software triggered an uncorrectable error

System software triggered a correctable error

System Firmware Progress---Video initialization---Detection unsuccessful

System Firmware Progress---Secondary processor(s) initialization---Detection unsuccessful

Event Logging Disabled

Log Area Reset/Cleared

SEL Full

SEL Almost Full

System Event

System Reconfigured---BIOS load default. CMOS cleared

Limit Exceeded---CPU usage exceeds the threshold

Limit Exceeded---Mem usage exceeds the threshold

Limit Exceeded---Network usage exceeds the threshold

Limit Exceeded---Hard disk usage exceeds the threshold

Timestamp clock synch---BMC Time SYNC succeed

Timestamp clock synch

Critical Interrupt

PCI PERR

PCI SERR

Bus Correctable Error

Bus Uncorrectable Error

Bus Uncorrectable Error

Bus Fatal Error

Bus Degraded

$1 triggered an uncorrectable error

$1 triggered a correctable error

Button / Switch

Power Button pressed---Physical button---Button pressed

Reset Button pressed

Module / Board

Transition to Non-Critical from OK

Transition to Critical from less severe

Transition to Non-Recoverable from less severe

Transition to Non-Critical from OK---System is operating in KTI Link Slow Speed Mode

Transition to Non-Critical from OK---Requested Link Speed is not supported. Defaulting to 18GT

Transition to Non-Critical from OK---One or more per Link option mismatch detected. Forcing to common setting

Transition to Non-Critical from OK---Some CPU has more than one link connecting to other CPU. Disable one of the Dual-Link

Transition to Non-Critical from OK---KTI Adaptation is in progress, or High Speed adaptation is failed

System board triggered an uncorrectable error

System board triggered a correctable error

Add-in Card

Transition to OK

Transition to Critical from less severe

Transition to Critical from less severe

Transition to Non-recoverable from less severe

ChipSet

Transition to Critical from less severe

Cable/Interconnect

Configuration Error - Incorrect cable connected / Incorrect interconnection

Configuration Error - Incorrect cable connected / Incorrect interconnection

System Boot / Restart Initiated

Initiated by power up

Initiated by hard reset

Initiated by warm reset

System restart

Boot Error

No bootable media

OS_BOOT

C: boot completed

Boot completed - boot device not specified

OS Stop / Shutdown

Run-time Critical Stop

OS Graceful Stop

OS Graceful Shutdown

Slot / Connector

Device disabled: PCIe module information not obtained

Fault Status asserted

Transition to Non-Critical from OK

System ACPI Power State

S0 / G0 "working"

S0 / G0 "working"

S5 / G2 "soft-off"

S5 / G2 "soft-off"

S4 / S5 soft-off, particular S4 / S5 state cannot be determined

LPC Reset occurred

Watchdog2

Watchdog overflowAction:Timer expired

Watchdog overflowAction:Hard Reset

Watchdog overflowAction:Power Down

Watchdog overflowAction:Power Cycle

Entity Presence

Entity Present---License is about to expire

Entity Disabled---License has expired

Management Subsystem Health

Controller access degraded or unavailable

Controller access degraded or unavailable

Battery

Battery low (predictive failure)

Battery failed

Battery presence detected

Version Change

Hardware incompatibility detected with associated Entity---Memory is not certified

Introduction

This document describes the HDM2 log messages that report the occurrence and clearing of system exceptions detected by sensors in the server. Use this document to look up message details and recommended actions for server maintenance.

HDM2 is an upgraded version of HDM. For convenience, the term "HDM" refers to HDM2 in this document.

Use cases

When a device fails or the system enters an abnormal operating state, the system generates alarms for the faulty module and records event log entries. After you obtain a log entry, use its fields to find the matching entry in this document. The entry explains what the message means and recommends actions for restoring normal server operation.

Obtaining system log messages

You can obtain system log messages through the following methods:

· HDM Web interface—Access the HDM Web interface and click Remote O&M > Log > Log Download. On the Log Download tab, select to download the entire log or log entries for a period.

· Alert emails—Complete alert email settings to obtain log messages.

· Third-party platform—Complete SNMP, SMTP, and SYSLOG settings to connect HDM to a third-party management platform, and obtain log messages from the platform.

· Redfish event subscription—If a remote subscription server is configured, Redfish uploads received log messages to the remote subscription server.

· IPMI commands—Use IPMItool commands to access the IPMI interface for BMC and enter commands to obtain event log messages.
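As a rough illustration of the IPMI method, the sketch below builds an `ipmitool` command line that lists the System Event Log over the LAN interface. The options used (`-I lanplus`, `-H`, `-U`, `-E`, `sel elist`) are standard `ipmitool` options. The host address, user name, and helper function names are placeholders chosen for this example, and HDM might need different settings in your environment.

```python
import subprocess


def build_sel_command(host: str, user: str) -> list[str]:
    """Build an ipmitool command line that lists the System Event Log (SEL)."""
    return [
        "ipmitool",
        "-I", "lanplus",  # IPMI v2.0 RMCP+ LAN interface
        "-H", host,       # BMC address (placeholder in the example below)
        "-U", user,       # BMC user name (placeholder in the example below)
        "-E",             # read the password from the IPMI_PASSWORD environment variable
        "sel", "elist",   # extended SEL listing
    ]


def fetch_sel(host: str, user: str) -> str:
    """Run ipmitool and return its raw output.

    Requires ipmitool to be installed and the BMC to be reachable.
    """
    result = subprocess.run(
        build_sel_command(host, user),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Only the command line is built here; nothing is sent to a BMC.
cmd = build_sel_command("192.0.2.10", "admin")
print(" ".join(cmd))
```

Call `fetch_sel` only from a host that can reach the BMC. The format of the returned text is determined by `ipmitool`, not by this document.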

System log severity level

Table 1 System log message severity levels

Severity	Description
Critical	One or more of the following conditions are present: a severe drop in the processing power of the system processing unit, a significant reduction in available system resources, a severe drop in service processing capability, widespread interruption of service modules, or unavailable storage devices. These conditions might cause server failure, system crashes, or service data loss. Immediate action is required.
Major	The event has significantly affected the system and might interrupt the normal operation of the system or service modules (computing, storage, communication, and user data security), leading to service interruption.
Minor	The event has not significantly affected the system but poses some risk. Monitor the related events and take action as needed to prevent the fault from escalating.
Info	Event logs generated during the normal operation of the server. Such events do not affect the normal operation of the server and do not require any action.
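When you process exported logs in a script, it can help to treat the four severity levels as an ordered scale. The sketch below is one way to do that. The numeric ranks and the function names are assumptions made for this example and are not defined by HDM.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from Table 1. The numeric ranks are an assumption."""
    INFO = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def parse_severity(label: str) -> Severity:
    """Convert a severity label such as 'Major' to a Severity member."""
    return Severity[label.strip().upper()]


def at_least(events, minimum: Severity):
    """Return the (event_code, severity_label) pairs at or above `minimum`."""
    return [event for event in events if parse_severity(event[1]) >= minimum]


# Event codes and severities taken from the tables in this document.
events = [
    ("0x01700002", "Minor"),
    ("0x02b00002", "Critical"),
    ("0x02900002", "Major"),
]
urgent = at_least(events, Severity.MAJOR)
print(urgent)
```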

Using this document

This document explains messages in tables. Table 2 describes information provided in these tables.

Table 2 Message explanation table contents

Item	Description	Example
Event code	A hexadecimal code that uniquely identifies a log message. The parity of the last hexadecimal digit in the event code indicates the alarm type: · Even: An alarm was generated. · Odd: An alarm was removed.	0x02900002
Message text	Presents the message description. The same message description might be reported by different types of sensors.	Exceeded the upper major threshold.---Current reading:$1---Threshold reading:$2
Variable fields	Briefly describes the variable fields in the order that they appear in the message text. The variable fields are numbered in the "$Number" form to help you identify their location in the message text.	· $1: Current reading of the voltage sensor. · $2: Major overvoltage threshold of the voltage sensor.
Severity level	Provides the severity level of the message.	Major
Example	Log example.	Exceeded the upper major threshold.---Current reading:2.58---Threshold reading:2.56
Impact	Explains the impact of the alarm event on the system.	Performance degradation and unstable operation might occur on the device components if the voltage is too high.
Cause	Explains why the log message was generated.	Abnormal board voltage.
Recommended action	Provides recommended actions. If the issue persists after you take the recommended actions, contact Technical Support.	1. Verify that the external power supply is operating correctly. 2. Access the HDM Web interface and verify that the power supply is operating correctly. 3. If the issue persists, contact Technical Support.
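The parity rule for event codes in Table 2 can be checked with a few lines of code. The sketch below is illustrative only. The function name is made up for this example, and the odd-numbered code in the usage lines is a hypothetical value used to show the "removed" case.

```python
def alarm_state(event_code: str) -> str:
    """Return 'generated' or 'removed' based on the last hexadecimal digit."""
    code = event_code.replace(" ", "").lower()
    if code.startswith("0x"):
        code = code[2:]
    last_digit = int(code[-1], 16)  # 'e' becomes 14, '2' becomes 2, and so on
    return "generated" if last_digit % 2 == 0 else "removed"


print(alarm_state("0x02900002"))  # code documented in this reference
print(alarm_state("0x1530200e"))  # code documented in this reference
print(alarm_state("0x02900003"))  # hypothetical odd code, for illustration only
```

Because the last digit is hexadecimal, the letters `a`, `c`, and `e` count as even and `b`, `d`, and `f` count as odd.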

Applicable products

This document applies to the following product models:

· H3C UniServer R4300 G6

· H3C UniServer R4700 G6

· H3C UniServer R4700LE G6

· H3C UniServer R4900 G6

· H3C UniServer R4900 G6 Ultra

· H3C UniServer R4900LE G6 Ultra

· H3C UniServer R4950 G6

· H3C UniServer R5300 G6

· H3C UniServer R5350 G6

· H3C UniServer R5500 G6

· H3C UniServer B5700 G6

· H3C UniServer R6700 G6

· H3C UniServer R6900 G6

Event log messages

Temperature

Dropped below the lower minor threshold

Event code	0x01000002
Message text	Dropped below the lower minor threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor $2: Threshold (in Celsius) for triggering a minor low-temperature notification.
Severity level	Minor
Example	Dropped below the lower minor threshold---Current reading:8---Threshold reading:10
Impact	Performance degradation and unstable operation might occur on the device components if the temperature is too low. If the temperature does not rise and the alarm persists, the temperature might drop further and trigger major-level alarms. Identify the cause of the low temperature as early as possible to prevent escalation.
Cause	The temperature is too low.
Recommended action	1. Adjust the temperature of the equipment room. 2. If the issue persists, contact Technical Support.
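Message text in this document uses `$1`, `$2`, and so on as placeholders for variable fields. To pull those fields out of a received log line, one option is to turn the message text into a regular expression. The sketch below assumes that fields are separated by `---`, as in the message text templates. The helper names are made up for this example.

```python
import re


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Turn a message template containing $1, $2, ... into a compiled regex."""
    # re.split with a capturing group alternates literal text and placeholder numbers.
    pieces = re.split(r"\$(\d+)", template)
    pattern = ""
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            pattern += re.escape(piece)       # literal message text
        else:
            pattern += f"(?P<v{piece}>.+?)"   # variable field $<piece>
    return re.compile(pattern)


def extract_fields(template: str, line: str):
    """Return a dict such as {'$1': '8', '$2': '10'}, or None if the line does not match."""
    match = template_to_regex(template).fullmatch(line)
    if match is None:
        return None
    return {f"${name[1:]}": value for name, value in match.groupdict().items()}


TEMPLATE = ("Dropped below the lower minor threshold"
            "---Current reading:$1---Threshold reading:$2")
LINE = ("Dropped below the lower minor threshold"
        "---Current reading:8---Threshold reading:10")
fields = extract_fields(TEMPLATE, LINE)
print(fields)
```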

Dropped below the lower major threshold

Event code	0x01200002
Message text	Dropped below the lower major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor $2: Threshold (in Celsius) for triggering a major low-temperature notification.
Severity level	Major
Example	Dropped below the lower major threshold---Current reading:4---Threshold reading:5
Impact	Performance degradation and unstable operation might occur on the device components if the temperature is too low. If the temperature does not rise and the alarm persists, the temperature might drop further and trigger critical-level alarms. Identify the cause of the low temperature as early as possible to prevent escalation.
Cause	The temperature is too low.
Recommended action	1. Adjust the temperature of the equipment room. 2. If the issue persists, contact Technical Support.

Dropped below the lower critical threshold

Event code	0x01400002
Message text	Dropped below the lower critical threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor $2: Threshold (in Celsius) for triggering a critical low-temperature notification.
Severity level	Critical
Example	Dropped below the lower critical threshold---Current reading:0---Threshold reading:1
Impact	Operating devices in ultra-low temperature environments can reduce device performance, impact device lifespan, disrupt business operations, and lead to system downtime.
Cause	The temperature is too low.
Recommended action	1. Adjust the temperature of the equipment room. 2. If the issue persists, contact Technical Support.

Exceeded the upper minor threshold

Event code	0x01700002
Message text	Exceeded the upper minor threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor $2: Threshold (in Celsius) for triggering a minor high-temperature notification.
Severity level	Minor
Example	Exceeded the upper minor threshold---Current reading:85---Threshold reading:80
Impact	Performance degradation and unstable operation might occur on the device components if the temperature is too high. If the temperature does not decrease and the alarm persists, the temperature might rise further and trigger major-level alarms. Identify the cause of the high temperature as early as possible to prevent escalation.
Cause	High ambient temperature, a blocked air inlet or outlet, or low fan speed.
Recommended action	1. Adjust the temperature of the equipment room. 2. Verify that the air inlet and outlet are not blocked. 3. Log in to HDM, and verify that the fans are running correctly. If abnormal fans exist, replace them. 4. Log in to HDM, access the fan management page, and verify that the fan speed is appropriate. 5. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x01900002
Message text	Exceeded the upper major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor $2: Threshold (in Celsius) for triggering a major high-temperature notification.
Severity level	Major
Example	Exceeded the upper major threshold---Current reading:90---Threshold reading:88
Impact	Performance degradation and unstable operation might occur on the device components if the temperature is too high. If the temperature does not decrease and the alarm persists, the temperature might rise further and trigger critical-level alarms. Identify the cause of the high temperature as early as possible to prevent escalation.
Cause	High ambient temperature, a blocked air inlet or outlet, or low fan speed.
Recommended action	1. Adjust the temperature of the equipment room. 2. Verify that the air inlet and outlet are not blocked. 3. Log in to HDM, and verify that the fans are running correctly. If abnormal fans exist, replace them. 4. Log in to HDM, access the fan management page, and verify that the fan speed is appropriate. 5. If the issue persists, contact Technical Support.

Exceeded the upper critical threshold

Event code	0x01b00002
Message text	Exceeded the upper critical threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the temperature sensor $2: Threshold (in Celsius) for triggering a critical high-temperature notification.
Severity level	Critical
Example	Exceeded the upper critical threshold---Current reading:95---Threshold reading:90
Impact	Operating devices in high-temperature environments can reduce device performance, impact device lifespan, increase energy consumption, disrupt business operations, and cause system crashes.
Cause	High ambient temperature, a blocked air inlet or outlet, or low fan speed.
Recommended action	1. Adjust the temperature of the equipment room. 2. Verify that the air inlet and outlet are not blocked. 3. Log in to HDM, and verify that the fans are running correctly. If abnormal fans exist, replace them. 4. Log in to HDM, access the fan management page, and verify that the fan speed is appropriate. 5. If the issue persists, contact Technical Support.
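The six temperature messages above form a ladder of thresholds on either side of the normal range. The sketch below shows how a reading could be mapped to the most severe matching message. The threshold values are taken from the examples in the tables above and are not real sensor settings. The strict comparisons and all names in the code are assumptions, since the firmware logic is not described in this document.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Thresholds:
    lower_critical: float
    lower_major: float
    lower_minor: float
    upper_minor: float
    upper_major: float
    upper_critical: float


def classify(reading: float, t: Thresholds) -> Optional[Tuple[str, str]]:
    """Return (message, severity) for the most severe threshold crossed, or None."""
    # Check the most severe thresholds first so they take precedence.
    if reading > t.upper_critical:
        return ("Exceeded the upper critical threshold", "Critical")
    if reading < t.lower_critical:
        return ("Dropped below the lower critical threshold", "Critical")
    if reading > t.upper_major:
        return ("Exceeded the upper major threshold", "Major")
    if reading < t.lower_major:
        return ("Dropped below the lower major threshold", "Major")
    if reading > t.upper_minor:
        return ("Exceeded the upper minor threshold", "Minor")
    if reading < t.lower_minor:
        return ("Dropped below the lower minor threshold", "Minor")
    return None  # reading is within the normal range


# Threshold values copied from the examples in the tables, for illustration only.
example = Thresholds(lower_critical=1, lower_major=5, lower_minor=10,
                     upper_minor=80, upper_major=88, upper_critical=90)

for reading in (0, 4, 8, 25, 85, 90, 95):
    print(reading, classify(reading, example))
```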

Voltage

Dropped below the lower minor threshold

Event code	0x02000002
Message text	Dropped below the lower minor threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a minor low-voltage notification.
Severity level	Minor
Example	Dropped below the lower minor threshold---Current reading:8---Threshold reading:10
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too low.
Cause	Abnormal board voltage.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the system board. 3. If the issue persists, contact Technical Support.

Dropped below the lower major threshold

Event code	0x02200002
Message text	Dropped below the lower major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a major low-voltage notification.
Severity level	Major
Example	Dropped below the lower major threshold---Current reading:4---Threshold reading:5
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too low.
Cause	Abnormal board voltage.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the system board. 3. If the issue persists, contact Technical Support.

Dropped below the lower major threshold

Event code	0x02220002
Message text	Dropped below the lower major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a major low-voltage notification.
Severity level	Major
Example	Dropped below the lower major threshold---Current reading:10---Threshold reading:2
Impact	Memory and system performance degradation might occur.
Cause	This alarm is generated when the PMIC voltage reading of the memory is lower than the low voltage major alarm threshold.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the DIMM. 3. If the issue persists, contact Technical Support.

Dropped below the lower critical threshold

Event code	0x02400002
Message text	Dropped below the lower critical threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a critical low-voltage notification.
Severity level	Critical
Example	Dropped below the lower critical threshold---Current reading:0---Threshold reading:1
Impact	The device is running at an ultra-low voltage, which affects the system power supply or causes a board to power off, leading to a system crash.
Cause	Abnormal board voltage.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the system board. 3. If the issue persists, contact Technical Support.

Exceeded the upper minor threshold

Event code	0x02700002
Message text	Exceeded the upper minor threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a minor high-voltage notification.
Severity level	Minor
Example	Exceeded the upper minor threshold---Current reading:85---Threshold reading:80
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too high.
Cause	Abnormal board voltage.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the system board. 3. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x02900002
Message text	Exceeded the upper major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a major high-voltage notification.
Severity level	Major
Example	Exceeded the upper major threshold---Current reading:90---Threshold reading:88
Impact	Performance degradation and unstable operation might occur on the device components if the voltage is too high.
Cause	Abnormal board voltage.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the system board. 3. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x02920002
Message text	Exceeded the upper major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a major high-voltage notification.
Severity level	Major
Example	Exceeded the upper major threshold---Current reading:10---Threshold reading:1
Impact	Memory and system performance degradation might occur.
Cause	This alarm is generated when the PMIC voltage of the memory is higher than the current major voltage alarm threshold.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the DIMM. 3. If the issue persists, contact Technical Support.

Exceeded the upper critical threshold

Event code	0x02b00002
Message text	Exceeded the upper critical threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the voltage sensor. $2: Threshold for triggering a critical high-voltage notification.
Severity level	Critical
Example	Exceeded the upper critical threshold---Current reading:95---Threshold reading:90
Impact	The device is operating in an ultra-high voltage environment, which affects the system's power supply, or causes a board to power off, resulting in a system crash.
Cause	Abnormal board voltage.
Recommended action	1. Verify whether the log was generated during device power-on or power-off. If it was, no action is required. 2. If the device was running correctly when the log was generated, replace the system board. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x1530200e
Message text	Transition to Non-recoverable from less severe
Variable fields	N/A
Severity level	Critical
Example	Transition to Non-recoverable from less severe
Impact	HDD Bay does not operate correctly, which impacts the reliability of the system.
Cause	The HDD Bay voltage is abnormal.
Recommended action	1. Reconnect the HDD Bay node. Make sure the node is completely powered off from the AC power source before powering on the node. 2. If the issue persists, replace the HDD Bay component. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0230a00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure ($1)
Variable fields	$1: AC lost
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure (AC lost)
Impact	The AC power supply is removed from the device.
Cause	The CPLD detected an ACFAIL signal from all PSUs.
Recommended action	1. Check the power supply network of the device for abnormalities, such as power grid fluctuations, PDU faults, or a poorly connected power cord. 2. Examine the PSUs for errors. If problems are detected, replace the PSUs. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0230d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure on $1
Variable fields	$1: Mezz1, Mezz2, or Mezz3
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure on Mezz3
Impact	The device immediately shuts down and enters power failure state. The LEDs on the chassis ear are rapidly flashing (the power LED flashes red, the UID LED flashes blue, the NIC LED flashes green, and the health LED flashes red), and the status is no longer controllable. The LED status recovers after the issue is resolved and the device is re-powered on.
Cause	CPLD detects PGD signal of MEZZ.
Recommended action	1. Reconnect the power cords. Verify that the server can be powered on correctly. If the server cannot be powered on, replace the corresponding MEZZ module. 2. If the issue persists, contact Technical Support.
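
The power supply failure messages in this section write the variable field in three ways: `failure ($1)`, `failure on $1`, and `failure($1)`. The following is a minimal sketch of a parser that accepts all three spellings and returns the value of `$1`. The function name `failed_power_item` is hypothetical and is not part of HDM2.

```python
import re

# Accepts the three spellings used in this section:
#   "... power supply failure (AC lost)"
#   "... power supply failure on Mezz3"
#   "... power supply failure(P12V)"
_POWER_FAILURE_RE = re.compile(
    r"^Transition to Non-recoverable from less severe"
    r"---System detected a power supply failure\s*"
    r"(?:on\s+(?P<on>.+)|\((?P<paren>.+)\))$"
)


def failed_power_item(text):
    """Return the $1 value of a power supply failure message, or None."""
    match = _POWER_FAILURE_RE.match(text.strip())
    if match is None:
        return None
    # Exactly one of the two named groups is set when the pattern matches.
    return match.group("on") or match.group("paren")
```

Because several event codes share the same `$1` value (for example, `P12V` and `P5V_STBY` each appear under two codes), a monitoring script would normally key on the event code and use the parsed value only for display.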

Transition to Non-recoverable from less severe

Event code	0x0230200e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure on $1
Variable fields	$1: CPU power supply failure type. Options include PVCCD_HV_CPU3, PVPP_HBM_CPU3, PVCCFA_EHV_CPU3, PVCCFA_EHV_FIVRA_CPU3, PVCCINFAON_CPU3, PVNN_MAIN_CPU3, PVCCIN_CPU3, PVCCD_HV_CPU4, PVPP_HBM_CPU4, PVCCFA_EHV_CPU4, PVCCFA_EHV_FIVRA_CPU4, PVCCINFAON_CPU4, PVNN_MAIN_CPU4, and PVCCIN_CPU4.
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure (PVCCD_HV_CPU3)
Impact	The device is immediately shut down and enters power failure state. The LEDs on the chassis ear are rapidly flashing (the power LED flashes red, the UID LED flashes blue, the NIC LED flashes green, and the health LED flashes red), and the status is no longer controllable. The LED status recovers after the issue is resolved and the device is re-powered on.
Cause	The internal power supply of the CPU experiences faults such as overcurrent, overvoltage, or undervoltage in the VR chip on the processor mezzanine board that corresponds to the CPU power supply.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the processor mezzanine board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0231d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure on $1
Variable fields	$1: RAID card
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure on RAID card
Impact	The device immediately shuts down and enters power failure state. The LEDs on the chassis ear are rapidly flashing (the power LED flashes red, the UID LED flashes blue, the NIC LED flashes green, and the health LED flashes red), and the status is no longer controllable. The LED status recovers after the issue is resolved and the device is re-powered on.
Cause	CPLD detects PGD signal of RAID controller.
Recommended action	1. Reconnect the power cords. Verify that the server can be powered on correctly. If the server cannot be powered on, replace the corresponding RAID controller. 2. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0231190e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_DIMM_PMIC_ERROR_1-6, P1_DIMM_AF_PMIC_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_DIMM_PMIC_ERROR_1-6)
Impact	System power-off might occur.
Cause	The power supply is abnormal.
Recommended action	1. Re-install the memory modules in slots 1 through 6 under CPU1. 2. Examine whether the CPU is securely fastened. 3. Examine the CPU socket for bent pins or foreign objects. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x02311a0e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_DIMM_PMIC_ERROR_7-12, P1_DIMM_GL_PMIC_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_DIMM_PMIC_ERROR_7-12)
Impact	System power-off might occur.
Cause	The power supply is abnormal.
Recommended action	1. Re-install the memory modules in slots 7 through 12 under CPU1. 2. Examine whether the CPU is securely fastened. 3. Examine the CPU socket for bent pins or foreign objects. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x02311b0e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_DIMM_PMIC_ERROR_1-6, P2_DIMM_AF_PMIC_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_DIMM_PMIC_ERROR_1-6)
Impact	System power-off might occur.
Cause	The power supply is abnormal.
Recommended action	1. Re-install the memory modules in slots 1 through 6 under CPU2. 2. Examine whether the CPU is securely fastened. 3. Examine the CPU socket for bent pins or foreign objects. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x02311c0e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_DIMM_PMIC_ERROR_7-12, P2_DIMM_GL_PMIC_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_DIMM_PMIC_ERROR_7-12)
Impact	System power-off might occur.
Cause	The power supply is abnormal.
Recommended action	1. Re-install the memory modules in slots 7 through 12 under CPU2. 2. Examine whether the CPU is securely fastened. 3. Examine the CPU socket for bent pins or foreign objects. 4. If the issue persists, contact Technical Support.
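
The four DIMM PMIC entries differ only in the CPU and the slot range to re-install. The following is a minimal sketch of a lookup from event code to that slot range, using the values copied from the entries above. The names `DIMM_PMIC_SLOTS` and `dimm_slots_to_reseat` are hypothetical and are not part of HDM2.

```python
# Event code -> (CPU, first slot, last slot), copied from the four entries above.
DIMM_PMIC_SLOTS = {
    0x0231190E: ("CPU1", 1, 6),
    0x02311A0E: ("CPU1", 7, 12),
    0x02311B0E: ("CPU2", 1, 6),
    0x02311C0E: ("CPU2", 7, 12),
}


def dimm_slots_to_reseat(event_code):
    """Return (cpu, [slot numbers]) for a DIMM PMIC event code, or None.

    The event code may be an int or a hexadecimal string such as "0x02311a0e".
    """
    if isinstance(event_code, str):
        event_code = int(event_code, 16)
    entry = DIMM_PMIC_SLOTS.get(event_code)
    if entry is None:
        return None
    cpu, first, last = entry
    return cpu, list(range(first, last + 1))
```

For example, `dimm_slots_to_reseat("0x02311a0e")` returns `("CPU1", [7, 8, 9, 10, 11, 12])`, which matches step 1 of the recommended action for that entry.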

Transition to Non-recoverable from less severe

Event code	0x0231500e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: OCP1 network card, OCP2 network card, or OCP3 network card
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(OCP1 network card)
Impact	System power-off might occur.
Cause	The power supply of the OCP network adapter is abnormal.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the OCP network adapter. 3. If the issue persists, replace the adapter. 4. If the issue persists, replace the system board. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0233000e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: BMC_network_PHY_P1V0, BMC_network_PHY_P1V8
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(BMC_network_PHY_P1V0)
Impact	System power-off might occur.
Cause	The power supply of the BMC card is abnormal.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the BMC card. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0233a00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: DSD card
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(DSD card)
Impact	System power-off might occur.
Cause	Abnormal DSD card voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the DSD card. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0233d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V)
Impact	System power-off might occur.
Cause	Abnormal P12V voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. Sequentially check the PSU, fans, RISER, drive backplane, and system board. 3. Replace the faulty component. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0233e00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P5V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P5V)
Impact	System power-off might occur.
Cause	Abnormal P5V voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the BMC card. 4. If the issue persists, replace the rear backplane. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0233f00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P5V_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P5V_STBY)
Impact	System power-off might occur.
Cause	Abnormal P5V voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234000e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_STBY)
Impact	System power-off might occur.
Cause	Abnormal P12V voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the OCP3 module. 4. If the issue persists, replace the fan module. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234100e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V Overcurrent
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V Overcurrent)
Impact	System power-off might occur.
Cause	Abnormal P12V signal current.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the fan module. 4. If the issue persists, replace the memory module. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234200e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCD_HV_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCD_HV_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234300e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVPP_HBM_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVPP_HBM_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234400e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCFA_EHV_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCFA_EHV_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234500e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCFA_EHV_FIVRA_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCFA_EHV_FIVRA_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234600e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCINFAON_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCINFAON_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234700e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVNN_MAIN_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVNN_MAIN_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234800e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCIN_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCIN_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234900e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCD_HV_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCD_HV_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234a00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVPP_HBM_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVPP_HBM_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234b00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCFA_EHV_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCFA_EHV_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234c00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCFA_EHV_FIVRA_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCFA_EHV_FIVRA_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCINFAON_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCINFAON_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234e00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVNN_MAIN_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVNN_MAIN_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0234f00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCIN_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCIN_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the CPU. 4. If the issue persists, contact Technical Support.
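
The 14 CPU power rail entries from 0x0234200e through 0x0234f00e list seven rails for CPU1 followed by the same seven rails for CPU2, with consecutive event codes 0x1000 apart. The following is a minimal sketch that rebuilds that table. The fixed spacing is an observation from the codes listed above, not a documented encoding rule, and the names `RAILS` and `CPU_RAIL_EVENTS` are hypothetical.

```python
# Rail names in the order the entries appear above.
RAILS = (
    "PVCCD_HV",
    "PVPP_HBM",
    "PVCCFA_EHV",
    "PVCCFA_EHV_FIVRA",
    "PVCCINFAON",
    "PVNN_MAIN",
    "PVCCIN",
)

FIRST_CODE = 0x0234200E  # PVCCD_HV_CPU1, the first entry in the run
CODE_STEP = 0x1000       # Assumption: spacing observed between listed codes

# Event code -> $1 value, CPU1 rails first, then CPU2 rails.
CPU_RAIL_EVENTS = {
    FIRST_CODE + index * CODE_STEP: f"{rail}_CPU{cpu}"
    for index, (cpu, rail) in enumerate(
        (cpu, rail) for cpu in (1, 2) for rail in RAILS
    )
}
```

A table like this can be checked against the entries above before use. Codes outside the run, such as 0x0235000e, are intentionally absent because they follow a different naming scheme.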

Transition to Non-recoverable from less severe

Event code	0x0235000e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P3V3_STBY_A
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P3V3_STBY_A)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235100e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P5V_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P5V_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the BMC card. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235200e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the PSUs. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235300e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the PSUs. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235400e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_1V8_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_1V8_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235500e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_3V3_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_3V3_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235600e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_1V8_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_1V8_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235700e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_3V3_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_3V3_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235800e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_VDDCR1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_VDDCR1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235900e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_VDDCR0
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_VDDCR0)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235a00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_VDDCR_SOC
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_VDDCR_SOC)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235b00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_VDDIO
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_VDDIO)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235c00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU1_1V1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU1_1V1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_VDDCR1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_VDDCR1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235e00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_VDDCR0
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_VDDCR0)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0235f00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_VDDCR_SOC
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_VDDCR_SOC)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236000e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_VDDIO
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_VDDIO)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236100e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU2_1V1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU2_1V1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236200e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: OCP1 network card
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(OCP1 network card)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the OCP1 network adapter. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236300e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: OCP2 network card
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(OCP2 network card)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the OCP2 network adapter. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236400e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: OCP3 network card
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(OCP3 network card)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the OCP3 network adapter. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236500e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: AC lost
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(AC lost)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the power supply. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236600e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236700e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236900e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: RISER_P12V_OCP
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(RISER_P12V_OCP)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the riser card. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236a00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU_DIMM_P12V_OCP
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU_DIMM_P12V_OCP)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If power has not been removed from the server, check whether any DIMM, processor, or system board alarms are present. If a component is reported as faulty, replace that component. 3. If no component alarms are present, replace the DIMM. 4. If the issue persists, replace the CPU. 5. If the issue persists, replace the system board. 6. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236b00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_BP_FRONT
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_BP_FRONT)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the front backplane. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236c00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_BP_REAR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_BP_REAR)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the rear backplane. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P5V_BP
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P5V_BP)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the backplane. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0236e00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V Overcurrent
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V Overcurrent)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the fan. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237100e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the riser card. 3. If the issue persists, replace the backplane. 4. If the issue persists, replace the system board. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237200e
Message text	Transition to Non-recoverable from less severe($1)
Variable fields	$1: CPU1_THERMTRIP, CPU2_THERMTRIP
Severity level	Critical
Example	Transition to Non-recoverable from less severe(CPU1_THERMTRIP)
Impact	The device hangs before this message is generated. The device is then powered off and enters standby mode.
Cause	The CPU actively lowers its frequency when its actual temperature exceeds the upper limit. If the CPU continues to overheat even after the frequency is lowered, the Thermtrip signal is triggered and the CPU stops running.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the CPU and CPU heatsink. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237300e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: REAR_4SFF_EFUSE, P12V_BP_REAR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(REAR_4SFF_EFUSE)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the 4SFF backplane. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237400e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: RISER2_GPU_EFUSE, P12V_SLOT_2_3
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(RISER2_GPU_EFUSE)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace riser card 2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237500e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: RISER1_GPU_EFUSE, P12V_SLOT_0_1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(RISER1_GPU_EFUSE)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace riser card 1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237600e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1) ---SW CpldReg 0x30:$2, 0x31:$3
Variable fields	$1: SW. $2: Value of register 0x30 in SW. $3: Value of register 0x31 in SW.
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(SW) ---SW CpldReg 0x30:0x01, 0x31:0x40
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the switch card. 3. If the issue persists, contact Technical Support.
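
When forwarding this event to Technical Support, it can help to capture the two register values as numbers. The sketch below is a minimal Python example, assuming the values are always printed in 0x-prefixed hexadecimal as in the Example row; parse_sw_registers is a hypothetical helper name, and the meaning of the individual register bits is not documented here, so the code only extracts the values.

```python
import re

# Matches the trailing register dump of the SW power supply failure message.
_SW_FAILURE = re.compile(
    r"power supply failure\(SW\)\s*---SW CpldReg "
    r"0x30:(0x[0-9a-fA-F]+), 0x31:(0x[0-9a-fA-F]+)"
)


def parse_sw_registers(message: str) -> dict:
    """Return the $2 and $3 register values as integers, keyed by register address."""
    match = _SW_FAILURE.search(message)
    if match is None:
        raise ValueError("not a SW power supply failure message")
    return {
        "0x30": int(match.group(1), 16),
        "0x31": int(match.group(2), 16),
    }


if __name__ == "__main__":
    example = (
        "Transition to Non-recoverable from less severe"
        "---System detected a power supply failure(SW) "
        "---SW CpldReg 0x30:0x01, 0x31:0x40"
    )
    print(parse_sw_registers(example))
```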

Transition to Non-recoverable from less severe

Event code	0x0237900e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: UART_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(UART_ERROR)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237c00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: SWCPLD_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(SWCPLD_ERROR)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the switch card. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237d00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P5V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P5V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the BMC card. 4. If the issue persists, replace the rear backplane. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237a00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the issue persists, replace the power supply. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237b00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: BMCCPLD_ERROR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(BMCCPLD_ERROR)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the BMC card. 3. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237e00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: RISER_P12V_PWR
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(RISER_P12V_PWR)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace the riser card. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0237f00e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCD_HV_CPU1, PVPP_HBM_CPU1, PVCCFA_EHV_CPU1, PVCCFA_EHV_FIVRA_CPU1, PVCCINFAON_CPU1, PVNN_MAIN_CPU1, PVCCIN_CPU1
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCD_HV_CPU1)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU1. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0238000e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: PVCCD_HV_CPU2, PVPP_HBM_CPU2, PVCCFA_EHV_CPU2, PVCCFA_EHV_FIVRA_CPU2, PVCCINFAON_CPU2, PVNN_MAIN_CPU2, PVCCIN_CPU2
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(PVCCD_HV_CPU2)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, replace CPU2. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0238100e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the system board. 3. If the issue persists, contact Technical Support.
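
The $1 value P12V appears under three event codes in this section (0x0236700e, 0x0237100e, and 0x0238100e), and the recommended actions differ between them, so the message text alone does not identify the entry. The Python sketch below illustrates keying a lookup on the event code instead. The table holds only those three entries, copied from this section; the names P12V_EVENTS and first_replacement are hypothetical.

```python
# Event code -> ($1 value, first component the recommended action says to replace).
# Entries are copied from the three P12V entries in this section.
P12V_EVENTS = {
    "0x0236700e": ("P12V", "system board"),
    "0x0237100e": ("P12V", "riser card"),
    "0x0238100e": ("P12V", "system board"),
}


def first_replacement(event_code: str) -> str:
    """Return the first replacement candidate for a P12V event code."""
    _rail, component = P12V_EVENTS[event_code.lower()]
    return component


if __name__ == "__main__":
    for code in P12V_EVENTS:
        print(code, first_replacement(code))
```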

Transition to Non-recoverable from less severe

Event code	0x0238400e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: FAN_P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(FAN_P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the fan board. 3. If the issue persists, replace the power board. 4. If the issue persists, replace the fan. 5. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0238500e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: USB_HUB_P1V2_STBY
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(USB_HUB_P1V2_STBY)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the upper and lower USB ports of the BMC card. 3. If the issue persists, replace the iFIST module. 4. If the issue persists, replace the system board. 5. If the issue persists, replace the internal USB. 6. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0238600e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: DIMM_P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(DIMM_P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the DIMM. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0238700e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: CPU_P12V
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(CPU_P12V)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the issue persists, replace the CPU. 3. If the issue persists, replace the system board. 4. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x0238800e
Message text	Transition to Non-recoverable from less severe---System detected a power supply failure($1)
Variable fields	$1: P12V_BP_REAR_GPU
Severity level	Critical
Example	Transition to Non-recoverable from less severe---System detected a power supply failure(P12V_BP_REAR_GPU)
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Identify whether the server is powered off. If the server is powered off, re-connect the power cord and identify whether the server can be powered on properly. 2. If the server is not powered off, replace the power board. 3. If the issue persists, replace the rear GPU. 4. If the issue persists, contact Technical Support.

Current

Transition to Critical from less severe

Event code	0x0320000e
Message text	Transition to Critical from less severe
Variable fields	N/A
Severity level	Major
Example	Transition to Critical from less severe
Impact	Powering off a module affects system operations.
Cause	The current of the corresponding component is abnormal.
Recommended action	1. Check for any abnormal alarms on the power supply and the system board through the HDM Web alarm page. 2. Make sure the power supply system is functioning properly and the voltage is stable. 3. If the issue persists, contact Technical Support.

Exceeded the upper minor threshold

Event code	0x03700002
Message text	Exceeded the upper minor threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the current sensor. $2: Threshold for triggering a minor current notification.
Severity level	Minor
Example	Exceeded the upper minor threshold---Current reading:85---Threshold reading:80
Impact	Performance degradation and unstable operation might occur on the device components if the current is too high.
Cause	The current of the corresponding component is abnormal.
Recommended action	1. Replace the component. 2. If the issue persists, contact Technical Support.
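
The threshold messages in this section share one layout, so a monitoring script can pull the reading and the threshold out of the text. The following is a minimal sketch, not an HDM tool: the function name `parse_threshold_event` is hypothetical, and it assumes that both values are plain decimal numbers without units, as in the examples.

```python
import re

# Pattern derived from the "Message text" rows of the threshold events.
THRESHOLD_RE = re.compile(
    r"^Exceeded the upper (?P<level>minor|major|critical) threshold"
    r"---Current reading:(?P<reading>-?\d+(?:\.\d+)?)"
    r"---Threshold reading:(?P<threshold>-?\d+(?:\.\d+)?)$"
)

def parse_threshold_event(message):
    """Return (level, reading, threshold), or None if the text does not match."""
    m = THRESHOLD_RE.match(message.strip())
    if m is None:
        return None
    return m.group("level"), float(m.group("reading")), float(m.group("threshold"))

print(parse_threshold_event(
    "Exceeded the upper minor threshold---Current reading:85---Threshold reading:80"
))
```

Returning None for unmatched text lets the caller pass every log line through the function and skip the ones that are not threshold events.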

Exceeded the upper major threshold

Event code	0x03900002
Message text	Exceeded the upper major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the current sensor. $2: Threshold for triggering a major current notification.
Severity level	Major
Example	Exceeded the upper major threshold---Current reading:90---Threshold reading:88
Impact	Performance degradation and unstable operation might occur on the device components if the current is too high.
Cause	Abnormal board current.
Recommended action	1. Replace the component. 2. If the issue persists, contact Technical Support.

Exceeded the upper major threshold

Event code	0x03920002
Message text	Exceeded the upper major threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the current sensor. $2: Threshold for triggering a major current notification.
Severity level	Major
Example	Exceeded the upper major threshold---Current reading:0.50---Threshold reading:0.20
Impact	Memory and system performance degradation might occur.
Cause	This alarm is triggered when the current reading of the PMIC for the memory exceeds the major alarm threshold.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Exceeded the upper critical threshold

Event code	0x03b00002
Message text	Exceeded the upper critical threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current reading of the current sensor. $2: Threshold for triggering a critical current notification.
Severity level	Critical
Example	Exceeded the upper critical threshold---Current reading:95---Threshold reading:90
Impact	This could potentially cause component damage, leading to a system crash.
Cause	Abnormal board current.
Recommended action	1. Replace the component. 2. If the issue persists, contact Technical Support.

Fan

Predictive Failure deasserted

Event code	0x04000008
Message text	Predictive Failure deasserted
Variable fields	N/A
Severity level	Info
Example	Predictive Failure deasserted
Impact	No negative impact.
Cause	The status of the power supply fan has returned to normal.
Recommended action	No action is required.

Predictive Failure asserted

Event code	0x04000008
Message text	Predictive Failure asserted
Variable fields	N/A
Severity level	Minor
Example	Predictive Failure asserted
Impact	Predictive failure.
Cause	The state of the power supply fan is abnormal.
Recommended action	1. If the power supply fan stops due to foreign objects in the power supply, remove the foreign objects. 2. If the issue persists, re-install the power supplies. 3. If the issue persists, replace the faulty power supply. 4. If the issue persists, contact Technical Support.

Transition to Running

Event code	0x04000014
Message text	Transition to Running
Variable fields	N/A
Severity level	Info
Example	Transition to Running
Impact	No negative impact.
Cause	The fan is operating correctly.
Recommended action	No action is required.

Transition to Off Line

Event code	0x04400014
Message text	Transition to Off Line
Variable fields	N/A
Severity level	Info
Example	Transition to Off Line
Impact	This affects system heat dissipation and reduces the performance of the system board components.
Cause	The fan module has been unplugged or the fan module and the system board have poor contact.
Recommended action	1. If the fan has been removed, reinstall the fan as a best practice. 2. Check if the pins of the fan and system board connector are normal. If an abnormality is present, replace the component. Otherwise, reinsert or reattach the fan to ensure proper contact. 3. Replace the fan. 4. If the issue persists, contact Technical Support.

Transition to Degraded

Event code	0x04600014
Message text	Transition to Degraded
Variable fields	N/A
Severity level	Major
Example	Transition to Degraded
Impact	This affects system heat dissipation and reduces the performance of the system board components.
Cause	The fan speed is abnormal.
Recommended action	1. Use the HDM Web page to check the fan speed and confirm the cause of the fan failure. If the speed is too low, it may be due to fan aging. If the speed is close to zero, it may be due to the fan being blocked by foreign objects or a fan failure. 2. Verify that the fan is not blocked. 3. Replace the fan. 4. If the issue persists, contact Technical Support.

Fully Redundant

Event code	0x04000016
Message text	Fully Redundant
Variable fields	N/A
Severity level	Info
Example	Fully Redundant
Impact	No negative impact.
Cause	All fan slots are equipped with fans.
Recommended action	No action is required.

Non-redundant:Sufficient Resources from Redundant

Event code	0x04300016
Message text	Non-redundant:Sufficient Resources from Redundant
Variable fields	N/A
Severity level	Major
Example	Non-redundant:Sufficient Resources from Redundant
Impact	This issue does not affect system heat dissipation.
Cause	The fan is invalid or is absent.
Recommended action	1. If the fan has been removed, reinstall the fan as a best practice. 2. Reinsert or reattach the fan to ensure proper contact. 3. If the fan status sensor reports a malfunction, it means that the fan has failed. Replace the fan. 4. If the issue persists, contact Technical Support.

Non-redundant:Insufficient Resources

Event code	0x04500016
Message text	Non-redundant:Insufficient Resources
Variable fields	N/A
Severity level	Critical
Example	Non-redundant:Insufficient Resources
Impact	This affects system heat dissipation, causing the system to overheat and automatically shut down.
Cause	The fan is invalid or is absent.
Recommended action	1. If the fan has been removed, reinstall the fan as a best practice. 2. Reinsert or reattach the fan to ensure proper contact. 3. If the fan status sensor reports a malfunction, it means that the fan has failed. Replace the fan. 4. If the issue persists, contact Technical Support.
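
The fan entries above can be collected into a lookup table. The sketch below is illustrative only: the names `FAN_EVENTS`, `fan_severity`, and `needs_attention` are hypothetical, and the severity ordering is an assumption. The table is keyed by both event code and message text because the two Predictive Failure entries list the same event code.

```python
# Severity per (event code, message text), copied from the fan entries above.
FAN_EVENTS = {
    (0x04000008, "Predictive Failure deasserted"): "Info",
    (0x04000008, "Predictive Failure asserted"): "Minor",
    (0x04000014, "Transition to Running"): "Info",
    (0x04400014, "Transition to Off Line"): "Info",
    (0x04600014, "Transition to Degraded"): "Major",
    (0x04000016, "Fully Redundant"): "Info",
    (0x04300016, "Non-redundant:Sufficient Resources from Redundant"): "Major",
    (0x04500016, "Non-redundant:Insufficient Resources"): "Critical",
}

# Assumed ordering, from least to most severe.
SEVERITY_RANK = {"Info": 0, "Minor": 1, "Major": 2, "Critical": 3}

def fan_severity(code, message):
    """Return the documented severity, or None for an unknown event."""
    return FAN_EVENTS.get((code, message))

def needs_attention(code, message):
    """True when the event is documented with a severity above Info."""
    severity = fan_severity(code, message)
    return severity is not None and SEVERITY_RANK[severity] > SEVERITY_RANK["Info"]

print(fan_severity(0x04600014, "Transition to Degraded"))
```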

Physical Security

General Chassis Intrusion

Event code	0x050000de
Message text	General Chassis Intrusion
Variable fields	N/A
Severity level	Minor
Example	General Chassis Intrusion
Impact	No negative impact.
Cause	The chassis access panel is removed.
Recommended action	1. Check if the access panel was removed manually. 2. Check if the access panel is installed properly. If necessary, open the access panel and then close it to see if the error log is cleared. 3. Check if the connection between the access-open alarm module and the chassis ear is normal. 4. If the issue persists, contact Technical Support.

LAN Leash Lost

Event code	0x054000de
Message text	LAN Leash Lost
Variable fields	N/A
Severity level	Info
Example	LAN Leash Lost
Impact	No negative impact.
Cause	The NCSI channel of the BMC detected a physical network disconnection.
Recommended action	1. Check if the network adapter is disabled in the operating system. If it is disabled, no action is required. 2. If the system reports this log during the power on/off phase, it can be ignored. 3. Check if the shared network port cable is properly connected. 4. If the shared network port is not needed, disable it. 5. If the issue persists, contact Technical Support.

Processor

IERR

Event code	0x070000de
Message text	$1 IERR err---Socket $2
Variable fields	$1: Signal type. Options include MSMI and CATERR. $2: Faulty CPU.
Severity level	Critical
Example	CATERR IERR err---Socket 1
Impact	The system might crash. By default, the system then restarts automatically.
Cause	A CPU internal error was detected. For example, if the Package Control Unit (PCU) encounters an unrecoverable error, this alarm will be triggered.
Recommended action	1. Upgrade the BIOS and HDM firmware to the up-to-date version. 2. Troubleshoot this event together with the component event logs reported at the same time as this log. 3. If the issue persists, contact Technical Support.

MCERR

Event code	0x070010de
Message text	$1 MCERR err---Socket $2
Variable fields	$1: Signal type. Options include MSMI and CATERR. $2: Faulty CPU number.
Severity level	Critical
Example	CATERR MCERR err---Socket 1
Impact	The system might crash.
Cause	A CPU internal error was detected. For example, if an uncorrectable error occurs on the memory, this alarm will be triggered.
Recommended action	1. Upgrade the BIOS and HDM firmware to the up-to-date version. 2. This log is generated when the CPU detects an internal error. Based on the description information, check the hardware information and sensor pages for errors or disabled components. 3. Check for memory, PCIe, and UPI failures using the contextual logs, and perform troubleshooting steps based on the corresponding recommended actions.
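
The IERR and MCERR messages above differ only in the error keyword, so one pattern can cover both. This is a minimal sketch under two assumptions: the socket field is a decimal integer, as in the examples, and the signal type is one of the two listed options. The function name `parse_cpu_error` is hypothetical.

```python
import re

# "$1 IERR err---Socket $2" and "$1 MCERR err---Socket $2"
CPU_ERR_RE = re.compile(
    r"^(?P<signal>MSMI|CATERR) (?P<kind>IERR|MCERR) err---Socket (?P<socket>\d+)$"
)

def parse_cpu_error(message):
    """Return signal type, error kind, and socket number, or None if no match."""
    m = CPU_ERR_RE.match(message.strip())
    if m is None:
        return None
    return {
        "signal": m.group("signal"),
        "kind": m.group("kind"),
        "socket": int(m.group("socket")),
    }

print(parse_cpu_error("CATERR MCERR err---Socket 1"))
```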

Thermal Trip

Event code	0x071000de
Message text	Thermal Trip
Variable fields	N/A
Severity level	Critical
Example	Thermal Trip
Impact	The host might power off.
Cause	This event is triggered when the CPU overheats, which might result in shutdown and power-off.
Recommended action	1. Log in to HDM, and verify that the fan is in normal state. 2. Re-install or replace the fan module with a speed alarm. 3. Identify whether the ambient temperature is too high. Keep the server operating within its normal temperature range. 4. Check for any blockages at the air inlet/outlet and remove any obstructions. 5. Power off the server, check for poor contact of the CPU heatsink, reapply the thermal grease, reinstall the heatsink, and power on the server again. 6. For a liquid-cooled server model, identify whether liquid-cooled component-related alarms have occurred. 7. If the issue persists, contact Technical Support.

FRB1/BIST failure

Event code	0x072000de
Message text	FRB1/BIST failure.
Variable fields	N/A
Severity level	Minor
Example	FRB1/BIST failure
Impact	The operating system might fail to start up and hardware downsizing applies.
Cause	This alarm is generated when the CPU self-check detects an error during system startup.
Recommended action	1. Power cycle the device. 2. If the issue persists, replace the CPU. 3. If the issue persists, contact Technical Support.

FRB2/Hang in POST failure

Event code	0x073000de
Message text	FRB2/Hang in POST failure
Variable fields	N/A
Severity level	Major
Example	FRB2/Hang in POST failure
Impact	The operating system might fail to start up.
Cause	The BIOS startup timed out.
Recommended action	1. Upgrade the BIOS. 2. If the issue persists, contact Technical Support.

FRB3/Processor Startup/Initialization failure

Event code	0x074000de
Message text	FRB3/Processor Startup/Initialization failure
Variable fields	N/A
Severity level	Minor
Example	FRB3/Processor Startup/Initialization failure
Impact	The operating system might fail to start up.
Cause	The BIOS startup timed out.
Recommended action	1. Upgrade the BIOS. 2. If the issue persists, contact Technical Support.

Configuration Error

Event code	0x075000de
Message text	Configuration Error---$1, ErrorType: $2,Severity: $3, Component: $4, IIO Stack: $5, Location: Socket: $6 or Configuration Error--- ErrorType: $2,Severity: $3, Failed Core: $7, Location: Socket: $6
Variable fields	$1: Time at which the error occurred. It can be Current Boot Error or Last Boot Error. $2: Fault type. It can be IIO Internal Error or Spare core Error. $3: Fault severity. $4: Faulty component. $5: I/O number. $6: CPU number. $7: Core number.
Severity level	Minor
Example	Configuration Error---Current Boot Error, ErrorType: IIO Internal Error,Severity:Correctable, Component:VTD, IIO Stack: 1, Location: Socket: 1
Impact	The operating system might fail to start up.
Cause	The main system CPU detected internal correctable error information during operation.
Recommended action	This log is generated when correctable internal errors are detected during server operation, such as IIO internal errors or CPU core errors. No action is required for correctable internal errors.
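
The Configuration Error message has two forms, one for IIO errors and one for failed cores. The sketch below tries each form in turn. It is an illustration only: the function name and dictionary keys are hypothetical, and the patterns tolerate optional spaces after each colon because the template and the example space the fields differently.

```python
import re

# Form 1: boot time, error type, severity, component, IIO stack, socket.
IIO_RE = re.compile(
    r"^Configuration Error---(?P<when>Current Boot Error|Last Boot Error),"
    r"\s*ErrorType:\s*(?P<error_type>[^,]+),"
    r"\s*Severity:\s*(?P<severity>[^,]+),"
    r"\s*Component:\s*(?P<component>[^,]+),"
    r"\s*IIO Stack:\s*(?P<iio_stack>[^,]+),"
    r"\s*Location:\s*Socket:\s*(?P<socket>\S+)$"
)

# Form 2: error type, severity, failed core, socket.
CORE_RE = re.compile(
    r"^Configuration Error---\s*ErrorType:\s*(?P<error_type>[^,]+),"
    r"\s*Severity:\s*(?P<severity>[^,]+),"
    r"\s*Failed Core:\s*(?P<failed_core>[^,]+),"
    r"\s*Location:\s*Socket:\s*(?P<socket>\S+)$"
)

def parse_configuration_error(message):
    """Return the named fields of whichever form matches, or None."""
    text = message.strip()
    for pattern in (IIO_RE, CORE_RE):
        m = pattern.match(text)
        if m is not None:
            return {key: value.strip() for key, value in m.groupdict().items()}
    return None

print(parse_configuration_error(
    "Configuration Error---Current Boot Error, ErrorType: IIO Internal Error,"
    "Severity:Correctable, Component:VTD, IIO Stack: 1, Location: Socket: 1"
))
```

Which keys are present in the result tells the caller which form matched: `component` and `iio_stack` for the first form, `failed_core` for the second.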

Processor Presence detected

Event code	0x077000df
Message text	Processor Presence detected
Variable fields	N/A
Severity level	Info/Critical
Example	Processor Presence detected
Impact	If the primary CPU is not in place, it may result in system startup failure.
Cause	This event log is triggered when the primary CPU is not in place or installed incorrectly.
Recommended action	1. Verify that the primary CPU is installed correctly. 2. If the primary CPU fails, replace the CPU. 3. If the issue persists, contact Technical Support.

Processor Automatically Throttled

Event code	0x07a000de
Message text	Processor Automatically Throttled---due to fan error
Variable fields	N/A
Severity level	Minor
Example	Processor Automatically Throttled---due to fan error
Impact	System performance decreases due to CPU throttling.
Cause	The CPU throttles due to fan failure.
Recommended action	1. Log in to HDM, and verify that the fans are running correctly. 2. Verify that the air conditioner in the equipment room is running correctly.

Processor Automatically Throttled

Event code	0x07a010de
Message text	Processor Automatically Throttled---prochot
Variable fields	N/A
Severity level	Minor
Example	Processor Automatically Throttled---prochot
Impact	System performance decreases due to CPU throttling.
Cause	CPU throttling might occur due to CPU overheating.
Recommended action	1. Log in to HDM, and verify that the fan is in normal state. 2. Re-install or replace the fan module with a speed alarm. 3. Identify whether the ambient temperature is too high. Keep the server operating within its normal temperature range. 4. Check for any blockages at the air inlet/outlet and remove any obstructions. 5. Power off the server, check for poor contact of the CPU heatsink, reapply the thermal grease, reinstall the heatsink, and power on the server again. 6. For a liquid-cooled server model, identify whether liquid-cooled component-related alarms have occurred. 7. If the issue persists, contact Technical Support.

Processor Automatically Throttled

Event code	0x07a020de
Message text	Processor Automatically Throttled---memhot
Variable fields	N/A
Severity level	Minor
Example	Processor Automatically Throttled---memhot
Impact	System performance decreases due to CPU throttling.
Cause	CPU throttling may occur due to memory overheating.
Recommended action	1. Log in to HDM, and verify that the fan is in normal state. 2. Re-install or replace the fan module with a speed alarm. 3. Identify whether the ambient temperature is too high. Keep the server operating within its normal temperature range. 4. Check for any blockages at the air inlet/outlet and remove any obstructions. 5. For a liquid-cooled server model, identify whether liquid-cooled component-related alarms have occurred. 6. If the issue persists, contact Technical Support.
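
The three Processor Automatically Throttled messages above are distinguished only by the text after the separator. A minimal sketch that maps that suffix to the documented cause follows. The names `THROTTLE_CAUSES` and `throttle_cause` are hypothetical, and the cause strings are paraphrased from the Cause rows.

```python
PREFIX = "Processor Automatically Throttled---"

# Suffix of the message text mapped to the cause given in each entry.
THROTTLE_CAUSES = {
    "due to fan error": "fan failure",
    "prochot": "CPU overheating",
    "memhot": "memory overheating",
}

def throttle_cause(message):
    """Return the documented throttling cause, or None for other messages."""
    text = message.strip()
    if not text.startswith(PREFIX):
        return None
    return THROTTLE_CAUSES.get(text[len(PREFIX):])

print(throttle_cause("Processor Automatically Throttled---memhot"))
```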

Machine Check Exception

Event code	0x07b000de
Message text	Machine Check Exception---$1---$2---Location: Socket:$3
Variable fields	$1: Fault type. $2: Specifies if the error occurred during the current boot or the previous boot. $3: Faulty CPU.
Severity level	Critical
Example	Machine Check Exception---PIE---Last Boot Error---Location: Socket:1
Impact	The system might stop responding.
Cause	On AMD models only, this event is triggered when the CPU generates an uncorrectable error.
Recommended action	1. Check if any corresponding faults exist in the operating system. 2. Check the CPU microcode to identify whether to upgrade the microcode. 3. Upgrade the BIOS and BMC to the latest versions. 4. If the issue persists, contact Technical Support.

Triggered a uncorrectable error

Event code	0x07b201de
Message text	CPU $1 triggered a uncorrectable error.
Variable fields	$1: CPU number.
Severity level	Critical
Example	CPU 1 triggered a uncorrectable error.
Impact	The system might stop responding.
Cause	An IERR or MCERR error was triggered, and HDM diagnosed it as a CPU uncorrectable error.
Recommended action	1. Upgrade the BIOS and HDM firmware to the up-to-date version. 2. Safely power off the server and replace the CPU to identify whether the alarm disappears. 3. If the issue persists, contact Technical Support.

Machine Check Exception

Event code	0x07b100de
Message text	Machine Check Exception---HBM error---Location: Socket:$1
Variable fields	$1: CPU number.
Severity level	Critical
Example	Machine Check Exception---HBM error---Location: Socket:1
Impact	The system might stop responding.
Cause	HBM failed.
Recommended action	1. Check if there are any corresponding faults in the operating system. 2. Check the CPU microcode and upgrade the BIOS and BMC to the latest versions. 3. Safely power off the server and replace the CPU to identify whether the alarm disappears. 4. If the issue persists, contact Technical Support.

Triggered a correctable error

Event code	0x07c201de
Message text	CPU $1 triggered a correctable error.
Variable fields	$1: CPU number.
Severity level	Minor
Example	CPU 1 triggered a correctable error.
Impact	No negative impact.
Cause	An IERR or MCERR error was triggered, and HDM diagnosed it as a CPU correctable error.
Recommended action	No action is required.

Correctable Machine Check Error

Event code	0x07c000de
Message text	Correctable Machine Check Error---$1---$2---Location: Socket:$3
Variable fields	$1: Fault type. $2: Specifies if the error occurred during the current boot or the previous boot. $3: Faulty CPU.
Severity level	Minor
Example	Correctable Machine Check Error---PIE---Current Boot Error---Location: Socket:1
Impact	No negative impact.
Cause	This alarm is generated only in AMD models when correctable errors such as TWIX, WAFL, or SMU occur.
Recommended action	No action is required.

Correctable Machine Check Error

Event code	0x07c100de
Message text	Correctable Machine Check Error---HBM error---Location: Socket:$1
Variable fields	$1: CPU number.
Severity level	Minor
Example	Correctable Machine Check Error---HBM error---Location: Socket:1
Impact	No negative impact.
Cause	A correctable error was detected on HBM.
Recommended action	No action is required.

Machine Check Exception

Event code	0x07b001de
Message text	Machine Check Exception---$1, Bank: $2,Severity:$3, Error Info:$4, Location: Socket: $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. $2: Fault bank. $3: Fault severity. $4: Fault information. $5: CPU number.
Severity level	Critical
Example	Machine Check Exception---Current Boot Error, Bank: IFU,Severity:FATAL, Error Info:Cache, Location: Socket: 1
Impact	The system might stop responding.
Cause	This event occurs when there is an internal fault in the CPU.
Recommended action	1. Check if there are any corresponding faults present in the operating system. 2. Check the CPU microcode and upgrade the BIOS and HDM to the latest versions. 3. If the issue persists, preliminarily determine the range of the fault based on the bank location and check if any other warning logs have been generated. 4. Power off the server safely and replace the CPU or peripheral with a known working one to see if the warning disappears. 5. Replace the system board.

Correctable Machine Check Error

Event code	0x07c001de
Message text	Correctable Machine Check Error---$1, Bank: $2,Severity:$3, Error Info:$4, Location: Socket: $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. $2: Fault bank. $3: Fault severity. $4: Fault information. $5: CPU number.
Severity level	Minor
Example	Correctable Machine Check Error---Current Boot Error, Bank: IFU,Severity:Corrected, Error Info:Cache, Location: Socket: 1
Impact	No negative impact.
Cause	An internal correctable error occurred in the CPU.
Recommended action	No action is required.

Power Supply

Presence detected

Event code	0x080000de
Message text	Presence detected
Variable fields	N/A
Severity level	Info
Example	Presence detected
Impact	No negative impact.
Cause	When the power supply is detected as being inserted, this event is triggered, indicating a transition from the power supply not being in place to being in place. When the power supply is detected as being removed, this event is cleared, indicating a transition from the power supply being in place to not being in place.
Recommended action	If the power supply is removed, install the power supply again.

Power Supply Failure detected

Event code	0x081000de
Message text	Power Supply Failure detected
Variable fields	N/A
Severity level	Major
Example	Power Supply Failure detected
Impact	It affects system power supply and may result in abnormal system power-off.
Cause	A power supply fault was detected.
Recommended action	1. Re-install the power supply. 2. If the issue persists, replace the power supply. 3. If the issue persists, contact Technical Support.

Power Supply Predictive Failure

Event code	0x082000de
Message text	Power Supply Predictive Failure
Variable fields	N/A
Severity level	Minor
Example	Power Supply Predictive Failure
Impact	The power supply may have malfunctions that affect system power supply.
Cause	A power supply fault was detected.
Recommended action	1. Identify whether any foreign objects have obstructed and stopped the power supply fan. If yes, remove the foreign objects. 2. If the issue persists, re-install the power supply. 3. If the issue persists, replace the power supply. 4. If the issue persists, contact Technical Support.

Power Supply input lost (AC/DC)

Event code	0x083000de
Message text	Power Supply input lost (AC/DC)
Variable fields	N/A
Severity level	Major
Example	Power Supply input lost (AC/DC)
Impact	It may cause the server to power off abnormally.
Cause	The AC power cable of the power supply is unplugged or there is an abnormal AC input.
Recommended action	1. Verify that the power input is normal. 2. Verify that all power cables are undamaged and properly connected. 3. Verify that all power supplies are correctly installed. 4. If the issue persists, contact Technical Support.

Power Supply input lost or out-of-range

Event code	0x084000de
Message text	Power Supply input lost or out-of-range
Variable fields	N/A
Severity level	Major
Example	Power Supply input lost or out-of-range
Impact	This may cause the server to power off abnormally.
Cause	The input voltage of the power supply exceeded the rated range.
Recommended action	1. Verify that the power input is normal. 2. Verify that all power cables are undamaged and properly connected. 3. Verify that all power supplies are correctly installed. 4. If the issue persists, contact Technical Support.

Power Supply input out-of-range - but present

Event code	0x085000de
Message text	Power Supply input out-of-range - but present
Variable fields	N/A
Severity level	Major
Example	Power Supply input out-of-range - but present
Impact	Abnormal power input beyond the supported range may cause the server to power off.
Cause	The input voltage of the power supply is too high.
Recommended action	1. Check if the input voltage of the power supply is normal. 2. Verify that the power cables and power supplies are installed correctly. 3. Unplug and re-plug the power supply to ensure a good power connection. 4. Check if the fans of the power supply are spinning. 5. If the issue persists, contact Technical Support.

Configuration error ---Vendor mismatch

Event code	0x086000de
Message text	Configuration error ---Vendor mismatch
Variable fields	N/A
Severity level	Minor
Example	Configuration error ---Vendor mismatch
Impact	An unknown risk exists due to non-original certified components.
Cause	Non-original certified power supplies are installed.
Recommended action	Install original certified power supplies.

Configuration error---Power Supply rating mismatch

Event code	0x086030de
Message text	Configuration error --- Power Supply rating mismatch
Variable fields	N/A
Severity level	Minor
Example	Configuration error --- Power Supply rating mismatch
Impact	This may result in unstable power supply and abnormal system shutdown.
Cause	Original certified power supplies are installed, but the models of the two power supplies do not match.
Recommended action	1. Make sure all the power supplies are of the same model. 2. If the issue persists, contact Technical Support.

Configuration error---Power supply rating mismatch

Event code	0x086200de
Message text	Configuration error---Power supply rating mismatch:PSU$1,POUT:$2W
Variable fields	$1: PSU ID, which can be 1 or 2. $2: Output power of the power supply.
Severity level	Minor
Example	Configuration error---Power supply rating mismatch:PSU1,POUT:2000W
Impact	This may result in unstable power supply and abnormal system shutdown.
Cause	The rated power of the installed power supplies may be inconsistent.
Recommended action	1. Make sure all the power supplies are of the same model. 2. If the issue persists, contact Technical Support.
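
The rating mismatch message above carries the PSU ID and its output power, which is enough to compare power supplies in a script. The sketch below is illustrative: the function names are hypothetical, and it assumes that the output power is a whole number of watts, as in the example.

```python
import re

# "Configuration error---Power supply rating mismatch:PSU$1,POUT:$2W"
PSU_RE = re.compile(
    r"^Configuration error---Power supply rating mismatch:"
    r"PSU(?P<psu>\d+),POUT:(?P<watts>\d+)W$"
)

def parse_psu_mismatch(message):
    """Return (psu_id, output_watts), or None if the text does not match."""
    m = PSU_RE.match(message.strip())
    if m is None:
        return None
    return int(m.group("psu")), int(m.group("watts"))

def ratings_differ(messages):
    """True when the parsed messages report more than one output power."""
    parsed = (parse_psu_mismatch(message) for message in messages)
    ratings = {item[1] for item in parsed if item is not None}
    return len(ratings) > 1

print(parse_psu_mismatch(
    "Configuration error---Power supply rating mismatch:PSU1,POUT:2000W"
))
```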

Power Supply Inactive/standby state

Event code	0x087000de
Message text	Power Supply Inactive/standby state
Variable fields	N/A
Severity level	Info
Example	Power Supply Inactive/standby state
Impact	No negative impact.
Cause	The power supply exits cold standby mode. When the standby power supply function is enabled and the device is running at high power, the standby power supply automatically exits cold standby mode and provides power to the device.
Recommended action	No action is required.

PSU failure detected by CPLD

Event code	0x088000de
Message text	PSU failure detected by CPLD
Variable fields	N/A
Severity level	Critical
Example	PSU failure detected by CPLD
Impact	This may result in unstable power supply and abnormal system shutdown.
Cause	The server has experienced an AC power failure.
Recommended action	1. Check for environmental issues such as high temperature or abnormal power supply fan. 2. Replug the power supply and check if the alarm disappears. 3. If the issue persists, replace the power supply.

Redundancy Lost

Event code	0x08100016
Message text	Redundancy Lost
Variable fields	N/A
Severity level	Major
Example	Redundancy Lost
Impact	Power redundancy failure reduces the reliability of device power supply.
Cause	Power redundancy was lost.
Recommended action	1. Check if the power supply environment is normal. 2. Check if any power supply has been removed. 3. Check for poor contact between power supplies and power cables. 4. Check for power-related fault alarm logs to determine if it is a power failure. 5. If the issue persists, contact Technical Support.

Power Unit

Power limit is exceeded over correction time limit

Event code	0x095010de
Message text	Power limit is exceeded over correction time limit---$1 Current Power: $2W.
Variable fields	$1: GPU. Not available for chassis power consumption. $2: Current power value.
Severity level	Minor
Example	GPU: Power limit is exceeded over correction time limit---GPU Current Power: 2000W. Chassis: Power limit is exceeded over correction time limit---Current Power: 2000W.
Impact	Power capping failed and the corresponding policy will be executed.
Cause	Power capping triggers this alarm after a certain amount of time elapsed when the power output exceeds the limit.
Recommended action	1. Adjust the power capping threshold or adjust the GPU workload. 2. If the issue persists, contact Technical Support.

Cooling Device

Transition to OK

Event code	0x0a00000e
Message text	Transition to OK
Variable fields	N/A
Severity level	Info
Example	Transition to OK
Impact	No negative impact.
Cause	The liquid-cooled module is in place and free of faults.
Recommended action	No action is required.

Transition to Non-recoverable---Liquid leakage occurred

Event code	0x0a60000e
Message text	Transition to Non-recoverable---Liquid leakage occurred
Variable fields	N/A
Severity level	Critical
Example	Transition to Non-recoverable---Liquid leakage occurred
Impact	For the server model that supports only processor liquid-cooled module, processor heat dissipation is affected. For the server model that supports processor liquid-cooled module and GPU liquid-cooled module, processor or GPU heat dissipation is affected.
Cause	This message is generated when liquid leakage occurs.
Recommended action	1. Check if the liquid cooling device is functioning properly or if there is any liquid leakage. 2. Replace the liquid-cooled module.

Transition to Non-recoverable from less severe

Event code	0x0a30000e
Message text	Transition to Non-recoverable from less severe--- Liquid Cooler not present
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-recoverable from less severe--- Liquid Cooler not present
Impact	Heat dissipation of the components in the liquid-cooled device on which the alarm is present is affected.
Cause	In a server that supports multiple liquid-cooled devices, one liquid-cooled device is not present (or the corresponding liquid leakage detection cable is not plugged in).
Recommended action	1. Examine if the liquid leakage detection cable of the liquid-cooled device is loose. If it is loose, remove the AC power and reconnect the liquid leakage detection cable. 2. If the issue persists, contact Technical Support.

Transition to Non-Critical from OK--- Liquid leakage detection cable is disconnected

Event code	0x0a10000e
Message text	Transition to Non-Critical from OK--- Liquid leakage detection cable is disconnected
Variable fields	N/A
Severity level	Major
Example	Transition to Non-Critical from OK--- Liquid leakage detection cable is disconnected
Impact	Unable to detect coolant leakage.
Cause	Liquid leakage sensor cannot be detected.
Recommended action	1. Check if the liquid cooling device is present. 2. Check if the liquid leakage sensor is installed correctly. 3. Replace the liquid-cooled module.

Other Units-based Sensor

Exceeded the upper minor threshold

Event code	0x0b700002
Message text	Exceeded the upper minor threshold---Current reading:$1---Threshold reading:$2
Variable fields	$1: Current power value. $2: Threshold for triggering a minor power notification.
Severity level	Minor
Example	Exceeded the upper minor threshold---Current reading:20---Threshold reading:18
Impact	Exceeding the maximum power limit will cause the system to shut down.
Cause	The power exceeds the limit.
Recommended action	1. Log in to HDM, and verify that the threshold value is appropriate. 2. Check if the total power consumption of the server is too high through the HDM web page. 3. Check if the total power consumption of the power supply meets the service requirements. 4. If the issue persists, contact Technical Support.

Memory

Correctable ECC or other correctable memory error

Event code	0x0c0000de
Message text	Correctable ECC or other correctable memory error--$1-Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Time at which the error occurred, Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Minor
Example	Correctable ECC or other correctable memory error---Current Boot Error-Location:CPU:1 CH:1 DIMM:0 A1
Impact	No negative impact.
Cause	Correctable memory errors.
Recommended action	No action is required.

Correctable ECC or other correctable memory error

Event code	0x0c0020de
Message text	Correctable ECC or other correctable memory error---$1---Location:CPU:$2 CH:$3 DIMM:$4
Variable fields	$1: Fault type, which can be ECC, Parity, CRC, or Other. $2: CPU number. $3: Channel number. $4: DIMM number.
Severity level	Minor
Example	Correctable ECC or other correctable memory error---CRC---Location:CPU:1 CH:1 DIMM:0
Impact	No negative impact.
Cause	A correctable error occurred on the memory.
Recommended action	No action is required.

Correctable ECC or other correctable memory error

Event code	0x0c0600de
Message text	Correctable ECC or other correctable memory error---$1---$2---Location:CPU$3 CH:$4 DIMM:$5
Variable fields	$1: Fault type, which can be ECC, Parity, or CRC. $2: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $3: CPU number. $4: Channel number. $5: DIMM number.
Severity level	Minor
Example	Correctable ECC or other correctable memory error---ECC---Current Boot Error---Location:CPU1 CH:8 DIMM:0
Impact	No negative impact.
Cause	A correctable error occurred on the memory.
Recommended action	No action is required.

CPU triggered a correctable error

Event code	0x0c0500de
Message text	CPU $1 $2 triggered a correctable error
Variable fields	$1: CPU number. $2: DIMM mark.
Severity level	Minor
Example	CPU 1 A0 triggered a correctable error
Impact	No negative impact.
Cause	An IERR or MCERR error was triggered, and the HDM diagnostic result shows correctable memory errors.
Recommended action	No action is required.

Uncorrectable ECC or other uncorrectable memory error

Event code	0x0c1000de
Message text	Uncorrectable ECC or other uncorrectable memory error--$1-Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Major
Example	Uncorrectable ECC or other uncorrectable memory error---Current Boot Error-Location:CPU:1 CH:1 DIMM:0 A1
Impact	The system might stop responding, unless the memory is in certain RAS modes, such as mirror or MCA recovery.
Cause	An uncorrectable (multiple bit flip) ECC error has occurred.
Recommended action	1. Verify that the temperature and humidity are appropriate. 2. Clean the memory slots and memory contacts, ensuring that there are no foreign objects in the memory slots and the contacts are not contaminated. Then, reinstall the corresponding DIMM. 3. Replace the DIMM. 4. If the issue persists, contact Technical Support.

Uncorrectable ECC or other uncorrectable memory error

Event code	0x0c1020de
Message text	Uncorrectable ECC or other uncorrectable memory error--$1-Location:CPU:$2 CH:$3 DIMM:$4
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number.
Severity level	Major
Example	Uncorrectable ECC or other uncorrectable memory error---Current Boot Error-Location:CPU:1 CH:1 DIMM:0
Impact	The system might stop responding.
Cause	An uncorrectable (multiple bit flip) ECC error has occurred.
Recommended action	1. Verify that the temperature and humidity are appropriate. 2. Clean the memory slots and memory contacts, ensuring that there are no foreign objects in the memory slots and the contacts are not contaminated. Then, reinstall the corresponding DIMM. 3. Replace the DIMM. 4. If the issue persists, contact Technical Support.

Triggered an uncorrectable error

Event code	0x0c1500de
Message text	CPU$1 $2 triggered an uncorrectable error
Variable fields	$1: CPU number. $2: DIMM mark.
Severity level	Major
Example	CPU1 A0 triggered an uncorrectable error
Impact	The system might restart or stop responding.
Cause	An IERR or MCERR error was triggered, and the BMC diagnostic result shows uncorrectable memory errors.
Recommended action	1. Verify that the temperature and humidity are appropriate. 2. Clean the memory slots and memory contacts, ensuring that there are no foreign objects in the memory slots and the contacts are not contaminated. Then, reinstall the corresponding DIMM. 3. If the issue persists, replace the DIMM. 4. If the issue persists, contact Technical Support.

Uncorrectable ECC or other uncorrectable memory error

Event code	0x0c1600de
Message text	Uncorrectable ECC or other uncorrectable memory error---$1---$2---Location:CPU$3 CH:$4 DIMM:$5
Variable fields	$1: Fault type, which can be ECC, Parity, or CRC. $2: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $3: CPU number. $4: Channel number. $5: DIMM number.
Severity level	Major
Example	Uncorrectable ECC or other uncorrectable memory error---ECC---Last Boot Error---Location:CPU1 CH:8 DIMM:0
Impact	The system might restart or stop responding.
Cause	An uncorrectable ECC error or other uncorrectable memory error occurred.
Recommended action	1. Verify that the temperature and humidity are appropriate. 2. Clean the memory slots and memory contacts, ensuring that there are no foreign objects in the memory slots and the contacts are not contaminated. Then, reinstall the corresponding DIMM. 3. If the issue persists, replace the DIMM. 4. If the issue persists, contact Technical Support.
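The correctable and uncorrectable ECC messages that carry both a fault type and a boot indicator share one layout, so a single pattern can split them into fields. The following is a minimal sketch based only on the two documented examples of that layout; the EccEvent type and parse_ecc_event function are hypothetical names used for illustration.

```python
import re
from typing import NamedTuple, Optional


class EccEvent(NamedTuple):
    correctable: bool
    fault_type: str   # ECC, Parity, or CRC
    boot: str         # Current Boot Error or Last Boot Error
    cpu: int
    channel: int
    dimm: int


# Layout: "<kind> ECC or other <kind> memory error---$1---$2---Location:CPU$3 CH:$4 DIMM:$5"
ECC_RE = re.compile(
    r"^(?P<kind>Correctable|Uncorrectable) ECC or other "
    r"(?:un)?correctable memory error"
    r"---(?P<fault>ECC|Parity|CRC)"
    r"---(?P<boot>Current Boot Error|Last Boot Error)"
    r"---Location:CPU(?P<cpu>\d+) CH:(?P<ch>\d+) DIMM:(?P<dimm>\d+)$"
)


def parse_ecc_event(text: str) -> Optional[EccEvent]:
    """Split a message into fields, or return None if the layout differs."""
    m = ECC_RE.match(text.strip())
    if m is None:
        return None
    return EccEvent(
        correctable=(m.group("kind") == "Correctable"),
        fault_type=m.group("fault"),
        boot=m.group("boot"),
        cpu=int(m.group("cpu")),
        channel=int(m.group("ch")),
        dimm=int(m.group("dimm")),
    )


uce = parse_ecc_event(
    "Uncorrectable ECC or other uncorrectable memory error"
    "---ECC---Last Boot Error---Location:CPU1 CH:8 DIMM:0"
)
print(uce)
```

The other ECC message variants in this section use a different separator or field order and are intentionally not matched by this pattern.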

Parity

Event code	0x0c2000de
Message text	Parity---$1---Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Minor
Example	Parity---Current Boot Error-Location:CPU:1 CH:1 DIMM:0 A0
Impact	No negative impact.
Cause	This error message is generated when there is a failure in data parity on the command/address lines while reading the memory cell data, resulting in abnormal data access to the memory.
Recommended action	No action is required.

Parity

Event code	0x0c2020de
Message text	Parity---Location:CPU:$1 CH:$2 DIMM:$3
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number.
Severity level	Minor
Example	Parity---Location:CPU:1 CH:1 DIMM:0
Impact	No negative impact.
Cause	This message is generated when there is a failure in data parity on the command/address lines while reading the memory cell data, resulting in abnormal data access to the memory. The SEL records the command/address parity error and logs the accessed DIMM.
Recommended action	No action is required.

Parity---An uncorrectable error occurs during the memory test phase

Event code	0x0c20b1c4
Message text	Parity---An uncorrectable error occurs during the memory test phase---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---An uncorrectable error occurs during the memory test phase---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	An uncorrectable error (UCE) occurred during the memory test phase.
Recommended action	1. Verify that the temperature and humidity are appropriate. 2. Clean the memory slots and memory contacts, ensuring that there are no foreign objects in the memory slots and the contacts are not contaminated. Then, reinstall the corresponding DIMM. 3. If the issue persists, replace the DIMM. 4. If the issue persists, contact Technical Support.

Parity---The memory interleaving configuration cannot meet the requirements of the server

Event code	0x0c20e014
Message text	Parity---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	Configuration error. Memory interleave configuration does not meet the requirements of the server.
Recommended action	Contact Technical Support.

Parity---The memory interleaving configuration cannot meet the requirements of the server

Event code	0x0c20e024
Message text	Parity---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	Configuration error. Memory interleave configuration does not meet the requirements of the server.
Recommended action	Contact Technical Support.

Parity---The memory interleaving configuration cannot meet the requirements of the server

Event code	0x0c20e0e4
Message text	Parity---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation or system startup failure might occur.
Cause	Configuration error. Memory interleave configuration does not meet the requirements of the server.
Recommended action	Contact Technical Support.

Parity---CMD eye width is too small

Event code	0x0c226014
Message text	Parity---CMD eye width is too small---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---CMD eye width is too small---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation or system startup failure might occur.
Cause	Memory parity error. CMD eye width is too small.
Recommended action	1. Identify the memory slot from the alarm information. 2. Check the gold contacts of the DIMM and the memory slot for foreign objects, and clean them if necessary. 3. If the issue persists, replace the DIMM. 4. If the issue persists, contact Technical Support.

Parity---CmdPiGroup: No Eye width

Event code	0x0c226024
Message text	Parity---CmdPiGroup: No Eye width---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---CmdPiGroup: No Eye width---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	CMD eye width does not exist.
Recommended action	1. Identify the memory slot from the alarm information. 2. Check the gold contacts of the DIMM and the memory slot for foreign objects, and clean them if necessary. 3. If the issue persists, replace the DIMM. 4. If the issue persists, contact Technical Support.

Parity---The command is not in the FNv table

Event code	0x0c228004
Message text	Parity---The command is not in the FNv table---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---The command is not in the FNv table---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	The command sent is not in the FNv table.
Recommended action	Contact Technical Support.

Parity---Memory read DqDqs training failed

Event code	0x0c231134
Message text	Parity---Memory read DqDqs training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Memory read DqDqs training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Memory read Dq or Dqs training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Memory Receive Enable Training Error

Event code	0x0c231144
Message text	Parity---Memory Receive Enable Training Error---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Memory Receive Enable Training Error---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	Memory Faulty Parts Tracking failure. The Receive Enable signal of the memory fails to train to the corresponding timing.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Memory write DqDqs training failed

Event code	0x0c231164
Message text	Parity---Memory write DqDqs training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Memory write DqDqs training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Memory write Dq or Dqs training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---An error occurs during memory test, and the rank is disabled

Event code	0x0c2311c4
Message text	Parity---An error occurs during memory test, and the rank is disabled---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---An error occurs during memory test, and the rank is disabled---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	An error occurred during memory testing, and that rank has been disabled.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---LRDIMM RCVEN training failed

Event code	0x0c231264
Message text	Parity---LRDIMM RCVEN training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---LRDIMM RCVEN training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	LRDIMM RCVEN training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Read delay training failed

Event code	0x0c231284
Message text	Parity---Read delay training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Read delay training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Read delay training has failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Write delay training failed

Event code	0x0c2312b4
Message text	Parity---Write delay training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Write delay training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Write delay training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Mapped out because failed critical mask test at cold boot

Event code	0x0c28c024
Message text	Parity---Mapped out because failed critical mask test at cold boot---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Mapped out because failed critical mask test at cold boot---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation or system startup failure might occur.
Cause	The DIMM failed the critical mask test during cold boot and was mapped out.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Invalid SPD contents

Event code	0x0c2ed094
Message text	Parity---Invalid SPD contents---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Invalid SPD contents---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Invalid SPD content.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---The DCPMM memory modules of the unexpected model are installed

Event code	0x0c2ed0c4
Message text	Parity---The DCPMM memory modules of the unexpected model are installed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---The DCPMM memory modules of the unexpected model are installed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Unsupported DCPMMs are installed.
Recommended action	1. Based on the alarm type, confirm the specifications of the DCPMM and replace the DCPMM. 2. If the issue persists, contact Technical Support.

Parity---Failed to set the VDD voltage of the DIMM

Event code	0x0c2f0014
Message text	Parity---Failed to set the VDD voltage of the DIMM---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Failed to set the VDD voltage of the DIMM---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	Software data structure abnormality.
Recommended action	1. Replace the DIMM. 2. If the issue persists, replace the system board. 3. If the issue persists, contact Technical Support.

Parity---Delay exceeded

Event code	0x0c214024
Message text	Parity---Delay exceeded---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Delay exceeded---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	The program execution timed out.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---Timing error occurred during signal line adjustment for memory write leveling training

Event code	0x0c215014
Message text	Parity---Timing error occurred during signal line adjustment for memory write leveling training---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---Timing error occurred during signal line adjustment for memory write leveling training---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	System performance degradation might occur.
Cause	A timing error occurred on the signal line during write leveling adjustment.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---CS is not consistent with clock in timing, and the channel is isolated

Event code	0x0c229044
Message text	Parity---CS is not consistent with clock in timing, and the channel is isolated---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---CS is not consistent with clock in timing, and the channel is isolated---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	A timing mismatch occurred between CS and clock.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---CA is not consistent with clock in timing, and the channel is isolated

Event code	0x0c229054
Message text	Parity---CA is not consistent with clock in timing, and the channel is isolated---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---CA is not consistent with clock in timing, and the channel is isolated---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	A timing mismatch occurred between CA and clock.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---LRDIMM external coarse training failed

Event code	0x0c231204
Message text	Parity---LRDIMM external coarse training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---LRDIMM external coarse training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	LRDIMM external coarse training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---LRDIMM external fine training failed

Event code	0x0c231214
Message text	Parity---LRDIMM external fine training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---LRDIMM external fine training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	LRDIMM external fine training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---LRDIMM internal coarse training failed

Event code	0x0c231224
Message text	Parity---LRDIMM internal coarse training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---LRDIMM internal coarse training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation might occur.
Cause	LRDIMM internal coarse training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Parity---LRDIMM internal fine training failed

Event code	0x0c231234
Message text	Parity---LRDIMM internal fine training failed---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Parity---LRDIMM internal fine training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	System performance degradation or system startup failure might occur.
Cause	LRDIMM internal fine training failed.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.
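All Parity messages that end in a Rank field follow the layout Parity---description---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4. The following is a minimal sketch of how such messages might be parsed and counted per DIMM, for example to spot a module that fails several training steps. The function names parse_parity_event and count_by_dimm are hypothetical, and the input list reuses example messages from this section.

```python
import re
from collections import Counter

# Layout: "Parity---<description>---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4".
# The description may itself contain commas or colons, so it is matched greedily
# up to the last "---Location:" in the message.
PARITY_RE = re.compile(
    r"^Parity---(?P<detail>.+)---Location:"
    r"CPU:(?P<cpu>\d+) CH:(?P<ch>\d+) DIMM:(?P<dimm>\S+) Rank:(?P<rank>\d+)$"
)


def parse_parity_event(text):
    """Return a dict of fields, or None if the message has no Rank-style location."""
    m = PARITY_RE.match(text.strip())
    if m is None:
        return None
    return {
        "detail": m.group("detail"),
        "cpu": int(m.group("cpu")),
        "channel": int(m.group("ch")),
        "dimm": m.group("dimm"),      # DIMM mark such as A0, kept as text
        "rank": int(m.group("rank")),
    }


def count_by_dimm(messages):
    """Count matching Parity events per (CPU, channel, DIMM mark)."""
    counts = Counter()
    for text in messages:
        event = parse_parity_event(text)
        if event is not None:
            counts[(event["cpu"], event["channel"], event["dimm"])] += 1
    return counts


log = [
    "Parity---Read delay training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0",
    "Parity---Write delay training failed---Location:CPU:1 CH:2 DIMM:A0 Rank:0",
    "Parity---Delay exceeded---Location:CPU:1 CH:1 DIMM:A1 Rank:0",
]
print(count_by_dimm(log))
```

Parity messages without a description or Rank field, such as event code 0x0c2020de, do not match this pattern and are skipped.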

Memory Device Disabled---The rank is disabled

Event code	0x0c40a034
Message text	Memory Device Disabled---The rank is disabled---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Memory Device Disabled---The rank is disabled---Location:CPU:2 CH:1 DIMM:B1 Rank:1
Impact	System performance degradation might occur. This does not affect normal use of the system.
Cause	One rank of the memory is disabled, but it does not affect the use of the remaining ranks.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Memory Device Disabled---The DIMM is disabled

Event code	0x0c40a044
Message text	Memory Device Disabled---The DIMM is disabled---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number. $4: Rank number.
Severity level	Minor
Example	Memory Device Disabled---The DIMM is disabled---Location:CPU:1 CH:1 DIMM:0 Rank:0
Impact	System performance degradation might occur.
Cause	The DIMM is disabled.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Memory Device Disabled

Event code	0x0c4000de
Message text	Memory Device Disabled--$1---Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Major
Example	Memory Device Disabled---Current Boot Error---Location:CPU:1 CH:1 DIMM:0 A1
Impact	The DIMM is disabled. System performance degradation might occur.
Cause	A memory fault is detected during the system startup process.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Memory Device Disabled

Event code	0x0c4020de
Message text	Memory Device Disabled---Location:CPU:$1 CH:$2 DIMM:$3
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number.
Severity level	Major
Example	Memory Device Disabled---Location:CPU:1 CH:1 DIMM:0
Impact	The DIMM is disabled. System performance degradation might occur.
Cause	A memory fault is detected during the system startup process.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Correctable ECC or other memory error limit reached

Event code	0x0c5000de
Message text	Correctable ECC or other memory error limit reached--$1---Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Minor
Example	Correctable ECC or other memory error limit reached---Current Boot Error---Location:CPU:1 CH:1 DIMM:0 A1
Impact	The system might restart or stop responding.
Cause	The memory may not be installed correctly or there could be an internal memory failure. The correctable errors in the memory have reached the set threshold, and when the corresponding Memory RAS mode is enabled, the corresponding RAS features will be executed without causing a system crash. Even in the memory repair mode, the errors still exceed the threshold.
Recommended action	1. Reinstall the corresponding DIMM. Make sure it is installed correctly, the gold contacts are not contaminated, no foreign objects exist in the memory slot, and the environmental temperature and humidity are normal. 2. Check whether the memory funnel threshold in the BIOS is too low. If so, adjust the funnel threshold value in the BIOS. 3. If the issue persists, contact Technical Support.

Correctable ECC or other memory error limit reached

Event code	0x0c5020de
Message text	Correctable ECC or other correctable memory error logging limit reached---$1 $2:$3---Location:CPU:$4 CH:$5 DIMM:$6
Variable fields	$1: MCA or UMC. This field is available only in case of CE Count Overflow. $2: Limit type, which can be CE Count Overflow, Memory CE Storm Threshold, or Memory CE Accumulation Threshold. $3: Threshold. $4: CPU number. $5: Channel number. $6: DIMM number.
Severity level	Minor
Example	Correctable ECC or other correctable memory error logging limit reached---MCA CE Count Overflow:8769---Location:CPU:1 CH:5 DIMM:0
Impact	The system might restart or stop responding.
Cause	The memory may not be installed correctly or there could be an internal memory failure. The correctable errors in the memory have reached the set threshold and will not cause a system crash. Even in the memory repair mode, the errors still exceed the threshold.
Recommended action	1. Reinstall the corresponding DIMM. Make sure it is installed correctly, the gold contacts are not contaminated, no foreign objects exist in the memory slot, and the environmental temperature and humidity are normal. 2. Check whether the memory funnel threshold in the BIOS is too low. If so, adjust the funnel threshold value in the BIOS. 3. If the issue persists, contact Technical Support.
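The limit-reached message above has an optional leading field ($1 appears only for CE Count Overflow), which makes it slightly harder to split than the other messages. The following is a minimal sketch of one way to handle it. The function name parse_ce_limit is hypothetical, and the exact spacing of the message when $1 is absent is an assumption, because the document gives an example only for the MCA case.

```python
import re

# Layout: "...logging limit reached---$1 $2:$3---Location:CPU:$4 CH:$5 DIMM:$6".
# $1 (MCA or UMC) is optional. Assumption: when $1 is absent, $2 follows the
# separator directly or after optional whitespace.
CE_LIMIT_RE = re.compile(
    r"^Correctable ECC or other correctable memory error logging limit reached"
    r"---(?:(?P<source>MCA|UMC) )?\s*"
    r"(?P<limit>CE Count Overflow|Memory CE Storm Threshold"
    r"|Memory CE Accumulation Threshold)"
    r":(?P<value>\d+)"
    r"---Location:CPU:(?P<cpu>\d+) CH:(?P<ch>\d+) DIMM:(?P<dimm>\d+)$"
)


def parse_ce_limit(text):
    """Return a dict of fields, or None if the text does not match the layout."""
    m = CE_LIMIT_RE.match(text.strip())
    if m is None:
        return None
    return {
        "source": m.group("source"),   # None when $1 is not present
        "limit": m.group("limit"),
        "value": int(m.group("value")),
        "cpu": int(m.group("cpu")),
        "channel": int(m.group("ch")),
        "dimm": int(m.group("dimm")),
    }


documented = parse_ce_limit(
    "Correctable ECC or other correctable memory error logging limit reached"
    "---MCA CE Count Overflow:8769---Location:CPU:1 CH:5 DIMM:0"
)
print(documented)
```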

Presence detected

Event code	0x0c6000de
Message text	Presence detected
Variable fields	N/A
Severity level	Info
Example	Presence detected
Impact	No negative impact.
Cause	A DIMM is detected as present.
Recommended action	No action is required.
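The event codes in the Memory section appear to follow a regular pattern: the first byte is 0x0c for every entry, and the next hexadecimal digit changes with the message family (for example, 6 for Presence detected). The following is a minimal sketch that splits a code along those lines. The field layout and the names split_event_code and MEMORY_GROUPS are assumptions inferred from the codes listed in this section; the document does not define the code structure.

```python
# Message families observed for each second-byte high digit in the Memory
# section of this document. This mapping is inferred, not documented.
MEMORY_GROUPS = {
    0: "Correctable ECC or other correctable memory error",
    1: "Uncorrectable ECC or other uncorrectable memory error",
    2: "Parity",
    3: "Memory patrol scrub",
    4: "Memory Device Disabled",
    5: "Correctable ECC or other memory error limit reached",
    6: "Presence detected",
    7: "Configuration error",
}


def split_event_code(code):
    """Split a code such as '0x0c6000de' into (first byte, next digit, remainder)."""
    value = int(code, 16)
    first_byte = (value >> 24) & 0xFF   # 0x0c for all Memory entries
    group = (value >> 20) & 0xF         # digit that varies with the message family
    remainder = value & 0xFFFFF         # remaining 20 bits, left uninterpreted
    return first_byte, group, remainder


first_byte, group, remainder = split_event_code("0x0c6000de")
print(hex(first_byte), MEMORY_GROUPS.get(group), hex(remainder))
```

The remaining bits are returned unchanged because this section gives no basis for interpreting them.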

Memory patrol scrub CE occurred

Event code	0x0c3010de
Message text	Memory patrol scrub CE occurred---$1---Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Minor
Example	Memory patrol scrub CE occurred---Current Boot Error---Location:CPU:1 CH:1 DIMM:0 A0
Impact	A check failed when memory data was read. No negative impact.
Cause	CE Inspection. This error message indicates that there was a data parity error during the read operation of a memory cell. The error occurred on the command/address lines, resulting in abnormal data retrieval from the memory. The error is recorded in the SEL, along with the DIMM that was accessed during the error.
Recommended action	No action is required.

Memory patrol scrub UCE occurred and degraded to CE

Event code	0x0c3020de
Message text	Memory patrol scrub UCE occurred and degraded to CE---$1---Location:CPU:$2 CH:$3 DIMM:$4 $5
Variable fields	$1: Specifies if the error occurred during the current boot or the previous boot. It can be Current Boot Error or Last Boot Error. $2: CPU number. $3: Channel number. $4: DIMM number. $5: DIMM mark.
Severity level	Minor
Example	Memory patrol scrub UCE occurred and degraded to CE---Current Boot Error---Location:CPU:1 CH:1 DIMM:0 A0
Impact	A check failed when memory data was read. No negative impact.
Cause	UCE Inspection: Degraded CE. This error message indicates that there was a data parity error during the read operation of a memory cell. The error occurred on the command/address lines, resulting in abnormal data retrieval from the memory. The error is recorded in the SEL, along with the DIMM that was accessed during the error.
Recommended action	No action is required.

Configuration error---RDIMMs are installed on the server that supports only UDIMMs

Event code	0x0c701014
Message text	Configuration error---RDIMMs are installed on the server that supports only UDIMMs---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---RDIMMs are installed on the server that supports only UDIMMs---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	An RDIMM was inserted into a CPU platform that only supports UDIMM.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---UDIMMs are installed on the server that supports only RDIMMs

Event code	0x0c702014
Message text	Configuration error---UDIMMs are installed on the server that supports only RDIMMs---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---UDIMMs are installed on the server that supports only RDIMMs---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	A UDIMM was inserted into a CPU platform that only supports RDIMM.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---SODIMMs are installed on the server that supports only RDIMMs

Event code	0x0c703014
Message text	Configuration error---SODIMMs are installed on the server that supports only RDIMMs-Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---SODIMMs are installed on the server that supports only RDIMMs-Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	An SODIMM was installed on a platform that supports only RDIMMs.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---The number of ranks per channel can be only 1, 2, or 4

Event code	0x0c707024
Message text	Configuration error---The number of ranks per channel can be only 1, 2, or 4---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The number of ranks per channel can be only 1, 2, or 4---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The number of ranks in the memory does not meet the requirements of the CPU platform. The current CPU platform supports a maximum of 4 ranks of memory.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported

Event code	0x0c707044
Message text	Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Columns, rows, or banks of the DIMM cannot meet the JEDEC standards, and LRDIMMs are not supported---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Unsupported memory type: · The memory design (columns, rows, or banks) does not comply with the JEDEC standards. · The LRDIMM is not on the server's supported list.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---The number of ranks in the channel exceeds 8

Event code	0x0c707054
Message text	Configuration error---The number of ranks in the channel exceeds 8---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The number of ranks in the channel exceeds 8---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The total number of ranks of all memory in the channel exceeds the maximum supported number of ranks (8).
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server

Event code	0x0c707094
Message text	Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Support for ECC on the DIMMs is not consistent with support for ECC on the server---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	ECC support on the DIMMs is not consistent with ECC support on the server.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V

Event code	0x0c7070a4
Message text	Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The voltage for a DDR4 DIMM must be 12V, and the voltage for a DDR5 DIMM must be 11V---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The current voltage does not match the voltage that the memory supports. · DDR4 memory supports a voltage of 1.2 V (shown as 12V in the message text). · DDR5 memory supports a voltage of 1.1 V (shown as 11V in the message text).
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---The CPU is not compatible with 3DS DIMMs

Event code	0x0c707104
Message text	Configuration error---The CPU is not compatible with 3DS DIMMs---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The CPU is not compatible with 3DS DIMMs---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The current CPU does not support memory with 3DS packaging.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---NVDIMMs with stepping lower than 0x10 are not supported

Event code	0x0c707114
Message text	Configuration error---NVDIMMs with stepping lower than 0x10 are not supported---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---NVDIMMs with stepping lower than 0x10 are not supported---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Configuration error. NVDIMMs with a step value lower than 16 are not supported.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---The CPU is not compatible with the DIMMs

Event code	0x0c707144
Message text	Configuration error---The CPU is not compatible with the DIMMs---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The CPU is not compatible with the DIMMs---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Configuration error: CPU and DIMM are not compatible.
Recommended action	Contact Technical Support.

Configuration error---The frequency of the DIMM is not supported on the server

Event code	0x0c707154
Message text	Configuration error---The frequency of the DIMM is not supported on the server---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The frequency of the DIMM is not supported on the server---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The current platform configuration does not support the frequency of the memory module.
Recommended action	1. Verify whether the Enforce Population POR/Enforce DDR Memory Frequency POR option in the Setup menu is enabled and whether the frequency of the memory module is within the supported range. 2. If the issue persists, contact Technical Support.

Configuration error---24Gb or higher Capacity DRAMs not supported with this CPU

Event code	0x0c7071f4
Message text	Configuration error---24Gb or higher Capacity DRAMs not supported with this CPU---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---24Gb or higher Capacity DRAMs not supported with this CPU---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The CPU does not support memory modules with a capacity of 24GB or above.
Recommended action	1. Identify the DIMM reported in the message and replace it with a DIMM of a supported capacity. 2. If the issue persists, contact Technical Support.

Configuration error---The CPU is not compatible with LRDIMMs

Event code	0x0c707214
Message text	Configuration error---The CPU is not compatible with LRDIMMs---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The CPU is not compatible with LRDIMMs---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The CPU does not support LRDIMMs.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error--- DCPMM + HBM config is not supported. Disable DCPMM populated channel

Event code	0x0c707224
Message text	Configuration error--- DCPMM + HBM config is not supported. Disable DCPMM populated channel---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error--- DCPMM + HBM config is not supported. Disable DCPMM populated channel---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	DCPMM and HBM cannot coexist. The channel populated with the DCPMM, as detected by the memory installation check, must be disabled.
Recommended action	1. Remove the DCPMM. 2. If the issue persists, contact Technical Support.

Configuration error--- Failed to enable the lockstep mode The memory RAS mode has degraded to independent

Event code	0x0c709014
Message text	Configuration error--- Failed to enable the lockstep mode The memory RAS mode has degraded to independent ---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable the lockstep mode The memory RAS mode has degraded to independent---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Lockstep mode cannot be enabled with the current memory configuration. The memory RAS mode is downgraded to independent mode.
Recommended action	1. Verify that the memory installation meets the requirements of lockstep mode. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable the full mirror mode

Event code	0x0c70c014
Message text	Configuration error---Failed to enable the full mirror mode---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable the full mirror mode---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Full mirror RAS mode cannot be enabled for the memory. The mirror configuration is downgraded.
Recommended action	1. Verify that the memory installation meets the requirements of mirror mode. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable the partial mirror mode The memory RAS mode degraded to independent

Event code	0x0c70d014
Message text	Configuration error--- Failed to enable the partial mirror mode The memory RAS mode degraded to independent---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error--- Failed to enable the partial mirror mode The memory RAS mode degraded to independent---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Partial mirror mode cannot be enabled. The memory RAS mode is downgraded to independent mode.
Recommended action	1. Verify that the memory installation meets the requirements of partial mirror mode. 2. If the issue persists, contact Technical Support.

Configuration error---The memory interleaving configuration cannot meet the requirements of the server

Event code	0x0c70e034
Message text	Configuration error---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The memory interleaving configuration cannot meet the requirements of the server---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Memory configuration error. The memory interleaving configuration does not meet the requirements of the server.
Recommended action	1. Check the memory interleaving configuration in the setup (NUMA and interleaving). 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent

Event code	0x0c710014
Message text	Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable the rank sparing mode The memory RAS mode has degraded to independent---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Rank sparing mode cannot be enabled. The memory RAS mode is downgraded to independent mode.
Recommended action	1. Verify that the memory installation meets the requirements of rank sparing mode. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable patrol scrubbing

Event code	0x0c711004
Message text	Configuration error---Failed to enable patrol scrubbing---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable patrol scrubbing---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Memory patrol/scrubbing cannot be enabled.
Recommended action	1. Check the RAS (Reliability, Availability, Serviceability) features in the CPU specifications to verify that the CPU supports patrol scrubbing. 2. If the issue persists, contact Technical Support.

Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty

Event code	0x0c717014
Message text	Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The number of ranks in the black slot is greater than that in the white slot, or the DIMM is installed in the black slot with the white slot empty---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	1. The DIMM with more ranks is not installed in the white slot of the channel. 2. A DIMM is installed in the black slot while the white slot is empty.
Recommended action	1. The memory installation is incorrect. Refer to the Intel PDG for DDR5/DCPMM and other relevant resources for proper memory installation guidelines. 2. If the issue persists, contact Technical Support.
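
The two conditions named in this message can be expressed as a small check. The following Python sketch is illustrative only. The function name and the convention of passing None for an empty slot are assumptions, and the sketch covers only the two rules stated in this entry, not the full population guidelines.

```python
def check_channel_population(white_ranks, black_ranks):
    """Check one channel against the two rules named in this message.

    Each argument is the number of ranks of the DIMM in that slot,
    or None if the slot is empty (an assumed convention for this sketch).
    Returns a list of violation descriptions; an empty list means no violation.
    """
    problems = []
    if black_ranks is not None and white_ranks is None:
        problems.append("DIMM installed in the black slot with the white slot empty")
    elif (
        black_ranks is not None
        and white_ranks is not None
        and black_ranks > white_ranks
    ):
        problems.append("more ranks in the black slot than in the white slot")
    return problems


# A dual-rank DIMM in the white slot and a single-rank DIMM in the black slot
# passes both rules, so the list is empty.
print(check_channel_population(2, 1))
```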

Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel

Event code	0x0c717034
Message text	Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---DIMM population error Two DDR-T memory modules cannot be installed in a channel---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	Two DCPMMs are installed in a channel, which does not meet DIMM installation requirements.
Recommended action	1. The DIMMs are installed incorrectly. For more information, see DDR5/DCPMM related information from Intel PDG. 2. If the issue persists, contact Technical Support.

Configuration error---The DDR-T memory module is installed in the white slot

Event code	0x0c717054
Message text	Configuration error---The DDR-T memory module is installed in the white slot---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---The DDR-T memory module is installed in the white slot---Location:CPU:1 CH:1 DIMM:A1 Rank:0
Impact	The system might restart or stop responding.
Cause	The DCPMM is installed in a white slot, which does not meet DIMM installation requirements.
Recommended action	1. The DIMMs are installed incorrectly. For more information, see DDR5/DCPMM related information from Intel PDG. 2. If the issue persists, contact Technical Support.

Configuration error---ODT configuration error The channel is isolated

Event code	0x0c729034
Message text	Configuration error---ODT configuration error The channel is isolated---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---ODT configuration error The channel is isolated---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	The system might restart or stop responding.
Cause	Memory ODT is configured incorrectly, and the channel is isolated.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---REQ is not consistent with clock in timing

Event code	0x0c729064
Message text	Configuration error---REQ is not consistent with clock in timing---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---REQ is not consistent with clock in timing---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	The system might restart or stop responding.
Cause	REQ and the clock input have inconsistent timing.
Recommended action	1. Replace the DIMM. 2. If the issue persists, contact Technical Support.

Configuration error---Failed to enable ADDDC

Event code	0x0c73a014
Message text	Configuration error---Failed to enable ADDDC---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---Failed to enable ADDDC---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	The system might restart or stop responding.
Cause	Failed to enable ADDDC due to incorrect memory configuration.
Recommended action	1. Verify that the memory configuration meets the ADDDC requirements. 2. If the issue persists, contact Technical Support.

Configuration error---NVMCTRL_MEDIA_NOTREADY

Event code	0x0c784024
Message text	Configuration error---NVMCTRL_MEDIA_NOTREADY---Location:CPU:$1 CH:$2 DIMM:$3 Rank:$4
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM mark. $4: Rank number.
Severity level	Minor
Example	Configuration error---NVMCTRL_MEDIA_NOTREADY---Location:CPU:1 CH:2 DIMM:A0 Rank:0
Impact	The system might restart or stop responding.
Cause	The DCPMM firmware medium is not ready.
Recommended action	1. Access the BIOS setup utility to identify the DCPMM status and update the DCPMM firmware. 2. Replace the DIMM. 3. If the issue persists, contact Technical Support.

Drive Slot

Drive Presence

Event code	0x0d0000de
Message text	Drive Presence
Variable fields	N/A
Severity level	Info
Example	Drive Presence
Impact	The drive presence changed.
Cause	The drive presence changed.
Recommended action	No action is required.

Drive Fault

Event code	0x0d1000de
Message text	Drive Fault HDDBay upper drive: Drive Fault --- Bay Slot: $1, HDD Slot: $2
Variable fields	$1: Bay slot number. $2: HDD slot number.
Severity level	Major
Example	Drive Fault HDDBay upper drive: Drive Fault --- Bay Slot: 1, HDD Slot: 2
Impact	The drive is faulty, which might cause data loss.
Cause	The drive is faulty.
Recommended action	1. Verify that the status of the drive is Unconfigured Good. 2. Verify that drive LEDs are normal, and the drive can be identified and is accessible in the OS. If a drive LED is orange, the drive is faulty. Replace the faulty components, if any. 3. Verify that the storage controller is in normal state. 4. If the issue persists, contact Technical Support.

Drive Fault---The disk is missing

Event code	0x0d1520de
Message text	Drive Fault---The disk is missing---Bay slot:$1---HDD slot:$2
Variable fields	$1: Bay slot number. $2: HDD slot number.
Severity level	Major
Example	Drive Fault---The disk is missing---Bay slot:14---HDD slot:37
Impact	The drive is removed or not installed correctly, which affects the stability of the storage system.
Cause	The drive cannot be identified by the storage controller or drive cables are connected incorrectly.
Recommended action	1. Log in to HDM, and verify that the drive can be identified successfully. 2. Re-install the drive. 3. Replace the drive. 4. If the issue persists, contact Technical Support.
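
The two Drive Fault messages above report the bay slot and HDD slot with different separators and capitalization. The following Python sketch shows one way to read both forms. The function name and the pattern are illustrative assumptions and are not part of HDM2.

```python
import re

# Hypothetical pattern: accepts "Bay Slot: 1, HDD Slot: 2" as well as
# "Bay slot:14---HDD slot:37", the two forms shown in the entries above.
SLOT_RE = re.compile(
    r"Bay slot:\s*(?P<bay>\d+)\s*(?:,|-+)\s*HDD slot:\s*(?P<hdd>\d+)",
    re.IGNORECASE,
)


def parse_drive_slots(message):
    """Return (bay_slot, hdd_slot) as integers, or None if the fields are absent."""
    match = SLOT_RE.search(message)
    if match is None:
        return None
    return int(match.group("bay")), int(match.group("hdd"))


print(parse_drive_slots("Drive Fault---The disk is missing---Bay slot:14---HDD slot:37"))
```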

Predictive Failure

Event code	0x0d2000de
Message text	Predictive Failure
Variable fields	N/A
Severity level	Minor
Example	Predictive Failure
Impact	The drive reliability decreases, which might impact the OS storage performance and service operation.
Cause	The RAID controller or NVMe SSD reports a predictive failure, which can be a storage medium reserved block alarm, drive lifetime alarm, Prefail alarm, or bad sector alarm.
Recommended action	1. Replace the drive. 2. If the issue persists, contact Technical Support.

In Critical Array

Event code	0x0d5000de
Message text	In Critical Array---PCIe slot:$1---LDDevno:$2
Variable fields	$1: PCIe slot where the logical drive resides. $2: Logical drive number.
Severity level	Major
Example	In Critical Array---PCIe slot:1---LDDevno:1
Impact	The logical drive degraded, which might impact data reliability.
Cause	A drive in a logical drive was removed or failed and the logical drive degraded.
Recommended action	1. Verify that no drive is removed. If a drive is removed, re-install the drive and recreate the RAID array. 2. Log in to HDM, view drive information from the storage page, and verify that all drives in the logical drive are identified correctly. If a drive cannot be identified, re-install the drive. If the drive cannot be identified after re-installation, replace the drive. 3. Log in to HDM, view drive information, and verify that the status of the drive is Unconfigured Good. 4. After the drive is identified correctly, recreate the RAID array. 5. If the issue persists, contact Technical Support.

In Failed Array

Event code	0x0d6000de
Message text	In Failed Array---PCIe slot:$1---LDDevno:$2
Variable fields	$1: PCIe slot where the logical drive resides. $2: Logical drive number.
Severity level	Major
Example	In Failed Array---PCIe slot:1---LDDevno:1
Impact	The RAID array becomes invalid, which might take its data offline or cause data loss.
Cause	A drive in a logical drive was removed or failed and the logical drive was totally corrupted.
Recommended action	1. Verify that no drive is removed. If a drive is removed, re-install the drive. 2. If the drive is installed correctly, log in to HDM. View drive information from the storage page, and verify that the drive can be identified correctly. If the drive cannot be identified, re-install the drive. If the drive cannot be identified after re-installation, replace the drive. 3. If the drive is installed correctly, log in to HDM, view drive information from the storage page, and verify that the status of the drive is Unconfigured Good. 4. After the drive is identified correctly, verify that the RAID array is normal. If the RAID array is faulty, recreate the RAID array. 5. If the issue persists, contact Technical Support.
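
The In Critical Array and In Failed Array messages use the same layout, so a monitoring script could handle both with a single pattern. The following Python sketch is illustrative. The function name and the returned dictionary keys are assumptions, not HDM2 identifiers.

```python
import re

# Hypothetical pattern covering the two logical drive messages above.
ARRAY_RE = re.compile(
    r"^In (?P<state>Critical|Failed) Array-+"
    r"PCIe slot:(?P<slot>\d+)-+LDDevno:(?P<ld>\d+)$"
)


def parse_array_event(message):
    """Return the array state, PCIe slot, and logical drive number, or None."""
    match = ARRAY_RE.match(message.strip())
    if match is None:
        return None
    return {
        "state": match.group("state").lower(),  # "critical" or "failed"
        "pcie_slot": int(match.group("slot")),
        "logical_drive": int(match.group("ld")),
    }


print(parse_array_event("In Critical Array---PCIe slot:1---LDDevno:1"))
```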

Rebuild/Remap in progress

Event code	0x0d7000de
Message text	Rebuild/Remap in progress
Variable fields	N/A
Severity level	Info
Example	Rebuild/Remap in progress
Impact	No negative impact.
Cause	This message is generated during RAID rebuilding after a drive is installed.
Recommended action	No action is required.

The disk triggered an media error

Event code	0x0da000de
Message text	The disk triggered an media error--$1
Variable fields	$1: Drive location.
Severity level	Info
Example	The disk triggered an media error--Front 1
Impact	A media error on the storage media might cause data loss.
Cause	The number of media errors exceeded the threshold.
Recommended action	1. Update the drive firmware. 2. Replace the drive. 3. If the issue persists, contact Technical Support.

The disk triggered an uncorrectable error

Event code	0x0db000de
Message text	The disk triggered an uncorrectable error--$1
Variable fields	$1: Drive location.
Severity level	Minor
Example	The disk triggered an uncorrectable error--Front 1
Impact	An uncorrectable error on the storage media might cause data loss.
Cause	The number of uncorrectable errors exceeded the threshold.
Recommended action	1. Update the drive firmware. 2. Replace the drive. 3. If the issue persists, contact Technical Support.

The disk is missing

Event code	0x0dc000de
Message text	The disk is missing
Variable fields	N/A
Severity level	Major
Example	The disk is missing
Impact	The drive is removed or not installed correctly, which affects the stability of the storage system.
Cause	The drive cannot be identified by the storage controller or drive cables are connected incorrectly.
Recommended action	1. Log in to HDM, and verify that the drive can be identified successfully. 2. Verify that the drive data cables, power cords, and signal cables are connected correctly. 3. Re-install the drive. 4. Replace the drive. 5. If the issue persists, contact Technical Support.
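
When many Drive Slot events arrive together, it can help to look at the most severe ones first. The following Python sketch copies the event codes and severity levels from the Drive Slot entries above into a lookup table. The numeric ordering of the severity levels (Info lowest, Critical highest) and the function name are assumptions made for illustration.

```python
# Assumed ordering for illustration, from least to most severe.
SEVERITY_RANK = {"Info": 0, "Minor": 1, "Major": 2, "Critical": 3}

# Event codes, message names, and severity levels from the Drive Slot entries.
DRIVE_SLOT_EVENTS = {
    0x0D0000DE: ("Drive Presence", "Info"),
    0x0D1000DE: ("Drive Fault", "Major"),
    0x0D1520DE: ("Drive Fault---The disk is missing", "Major"),
    0x0D2000DE: ("Predictive Failure", "Minor"),
    0x0D5000DE: ("In Critical Array", "Major"),
    0x0D6000DE: ("In Failed Array", "Major"),
    0x0D7000DE: ("Rebuild/Remap in progress", "Info"),
    0x0DA000DE: ("The disk triggered an media error", "Info"),
    0x0DB000DE: ("The disk triggered an uncorrectable error", "Minor"),
    0x0DC000DE: ("The disk is missing", "Major"),
}


def events_at_or_above(codes, minimum="Minor"):
    """Return the known codes whose severity is at least `minimum`, most severe first."""
    floor = SEVERITY_RANK[minimum]
    selected = [
        code
        for code in codes
        if code in DRIVE_SLOT_EVENTS
        and SEVERITY_RANK[DRIVE_SLOT_EVENTS[code][1]] >= floor
    ]
    # sorted() is stable, so codes of equal severity keep their input order.
    return sorted(
        selected,
        key=lambda code: SEVERITY_RANK[DRIVE_SLOT_EVENTS[code][1]],
        reverse=True,
    )


for code in events_at_or_above([0x0D7000DE, 0x0D2000DE, 0x0DC000DE]):
    print(hex(code), DRIVE_SLOT_EVENTS[code][0])
```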

System Firmware Progress

System Firmware Error (POST Error)---Run sense AMP HW FSM failed

Event code	0x0f0fe044
Message text	System Firmware Error (POST Error)---Run sense AMP HW FSM failed
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---Run sense AMP HW FSM failed
Impact	System startup failure might occur.
Cause	A memory configuration error occurred.
Recommended action	1. Update the BIOS firmware. 2. Verify that the CPUs and DIMMs are installed correctly. 3. Reduce interleaving configuration (memory interleaving and NUMA).

System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM symmetry on the socket

Event code	0x0f017134
Message text	System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM population rules--- Location: cpu $1
Variable fields	$1: CPU number.
Severity level	Major
Example	System Firmware Error (POST Error)--- Memory population enforcement mismatch, Please check the DIMM population rules--- Location: cpu 1
Impact	System performance degradation might occur.
Cause	The DIMM population is incorrect.
Recommended action	See the DIMM population schemes in the user guide for the corresponding product.

System Firmware Error (POST Error)---No Dimm on socket$1

Event code	0x0f017184
Message text	System Firmware Error (POST Error)---No Dimm on socket$1
Variable fields	$1: CPU number.
Severity level	Major
Example	System Firmware Error (POST Error)---No Dimm on socket1
Impact	If no DIMMs are installed for CPU 1, the system cannot start up. If no DIMMs are installed for other CPUs, system performance degradation might occur.
Cause	No DIMMs are installed for the CPU.
Recommended action	1. Verify that the DIMMs are installed correctly as required in the user guide for the server. Re-install all DIMMs if needed. 2. Re-install the DIMM. Verify that the gold contacts on the DIMM are not contaminated and the DIMM slot does not contain any foreign objects. 3. Replace the DIMM, and then power on the server. 4. If the issue persists, contact Technical Support.
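
The impact of this event depends on the CPU number in the message. The following Python sketch extracts the number and returns the impact described in this entry. The function name and the pattern are illustrative assumptions.

```python
import re

# Hypothetical pattern for the "No Dimm on socket$1" message.
NO_DIMM_RE = re.compile(r"No Dimm on socket(\d+)$")


def no_dimm_impact(message):
    """Return (cpu_number, impact) for a No Dimm message, or None otherwise."""
    match = NO_DIMM_RE.search(message.strip())
    if match is None:
        return None
    cpu = int(match.group(1))
    if cpu == 1:
        # Per this entry, the system cannot start up without DIMMs for CPU 1.
        return cpu, "The system cannot start up."
    return cpu, "System performance degradation might occur."


print(no_dimm_impact("System Firmware Error (POST Error)---No Dimm on socket1"))
```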

System Firmware Error (POST Error)---No memory found

Event code	0x0f0e8014
Message text	System Firmware Error (POST Error)---No memory found
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No memory found
Impact	The system cannot start up correctly.
Cause	No DIMMs are available.
Recommended action	Verify that the DIMMs are available in the system.

System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation

Event code	0x0f0e8024
Message text	System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---No DIMM is available for memory-mapping operation
Impact	System performance degradation might occur.
Cause	No DIMM is available for memory mapping.
Recommended action	1. Log in to HDM, access the memory page, and verify that available DIMMs exist. 2. If the issue persists, contact Technical Support.

System Firmware Error (POST Error)---DIMM population error

Event code	0x0f0ed024
Message text	System Firmware Error (POST Error)---DIMM population error
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---DIMM population error
Impact	System startup failure might occur.
Cause	A DIMM compatibility error occurred.
Recommended action	See the HDM maintenance guide for the server.

System Firmware Error (POST Error)---Some CPU links failed to train. UPI topology changed across reset

Event code	0x0f003ff4
Message text	System Firmware Error (POST Error)---Some CPU links failed to train. UPI topology changed across reset
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---Some CPU links failed to train. UPI topology changed across reset
Impact	System startup failure might occur.
Cause	A CPU error occurred.
Recommended action	Verify that CPUs are installed correctly.

System Firmware Error (POST Error)---CPU stepping mismatch detected

Event code	0x0f010ff4
Message text	System Firmware Error (POST Error)---CPU stepping mismatch detected
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---CPU stepping mismatch detected
Impact	System startup failure might occur.
Cause	The CPUs were installed incorrectly and CPU stepping mismatch occurred.
Recommended action	Verify that the CPU stepping is consistent between the installed CPUs.

System Firmware Error (POST Error)---KTI Topology Change Logged

Event code	0x0f0ffff4
Message text	System Firmware Error (POST Error)---KTI Topology Change Logged
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---KTI Topology Change Logged
Impact	System startup failure might occur.
Cause	A CPU error occurred.
Recommended action	Verify that the CPUs are installed correctly.

System Firmware Error (POST Error)---CPU matching failure---CPU stepping is detected

Event code	0x0f0d00de
Message text	System Firmware Error (POST Error)---CPU matching failure---CPU stepping is detected
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---CPU matching failure---CPU stepping is detected
Impact	System startup failure occurs.
Cause	A CPU stepping mismatch error occurred at POST.
Recommended action	1. Verify that the CPU has the same model as the primary CPU. 2. Verify that the stepping of the CPU matches that of the primary CPU.

System Firmware Error (POST Error)---CPU matching failure---CPU frequency is detected

Event code	0x0f0d10de
Message text	System Firmware Error (POST Error)---CPU matching failure---CPU frequency is detected
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---CPU matching failure---CPU frequency is detected
Impact	System startup failure might occur.
Cause	A CPU frequency mismatch error occurred at POST.
Recommended action	Verify that the CPU has the same model as the primary CPU.

System Firmware Error (POST Error)---CPU matching failure---CPU Microcode is detected

Event code	0x0f0d20de
Message text	System Firmware Error (POST Error)---CPU matching failure---CPU Microcode is detected
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---CPU matching failure---CPU Microcode is detected
Impact	System startup failure might occur.
Cause	A CPU microcode mismatch error occurred at POST.
Recommended action	Verify that the CPU has the same model as the primary CPU.

System Firmware Error (POST Error)---CPU matching failure---UPI Topology is detected

Event code	0x0f0d30de
Message text	System Firmware Error (POST Error)---CPU matching failure---UPI Topology is detected
Variable fields	N/A
Severity level	Major
Example	System Firmware Error (POST Error)---CPU matching failure---UPI Topology is detected
Impact	System startup failure might occur.
Cause	A CPU UPI mismatch error occurred at POST.
Recommended action	Verify that the CPU has the same model as the primary CPU.

System Firmware Error(POST Error)---Unrecoverable video controller failure

Event code	0x0f0090de
Message text	System Firmware Error(POST Error)---Unrecoverable video controller failure
Variable fields	N/A
Severity level	Minor
Example	System Firmware Error(POST Error)---Unrecoverable video controller failure
Impact	KVM video display is abnormal.
Cause	Two VGA screen captures are the same during the host startup process.
Recommended action	1. Replace the BMC card. 2. If the issue persists, contact Technical Support.

System Firmware Hang

Event code	0x0f1000de
Message text	System Firmware Hang
Variable fields	N/A
Severity level	Critical
Example	System Firmware Hang
Impact	System operation failure might occur.
Cause	The BIOS hangs during startup.
Recommended action	1. Resolve the issue based on other event logs reported simultaneously for the component. 2. If the issue persists, contact Technical Support.

System software triggered an uncorrectable error

Event code	0x0f1a00de
Message text	System software triggered an uncorrectable error
Variable fields	N/A
Severity level	Major
Example	System software triggered an uncorrectable error
Impact	An IERR or MCERR error is triggered, which may cause service unavailability.
Cause	An IERR or MCERR error is triggered, and the HDM diagnosis result shows that a system software uncorrectable error occurred.
Recommended action	Usually an IERR or MCERR error is triggered by an abnormality in the system or system software. Contact Technical Support.

System software triggered a correctable error

Event code	0x0f0a00de
Message text	System software triggered a correctable error
Variable fields	N/A
Severity level	Minor
Example	System software triggered a correctable error
Impact	An IERR or MCERR error is triggered, which may cause service unavailability.
Cause	An IERR or MCERR error is triggered, and the HDM diagnosis result shows that a system software correctable error occurred.
Recommended action	Usually an IERR or MCERR error is triggered by an abnormality in the system or system software. Contact Technical Support.

System Firmware Progress---Video initialization---Detection unsuccessful

Event code	0x0f2090de
Message text	System Firmware Progress---Video initialization---Detection unsuccessful
Variable fields	N/A
Severity level	Minor
Example	System Firmware Progress---Video initialization---Detection unsuccessful
Impact	No negative impact.
Cause	A video controller check failed.
Recommended action	Contact Technical Support.

System Firmware Progress---Secondary processor(s) initialization---Detection unsuccessful

Event code	0x0f2030de
Message text	System Firmware Progress---Secondary processor(s) initialization---Detection unsuccessful
Variable fields	N/A
Severity level	Minor
Example	System Firmware Progress---Secondary processor(s) initialization---Detection unsuccessful
Impact	No negative impact.
Cause	The TPM/TCM self-test signal is lost or a device access failure occurs.
Recommended action	Contact Technical Support.

Event Logging Disabled

Log Area Reset/Cleared

Event code	0x102000de
Message text	Log Area Reset/Cleared
Variable fields	N/A
Severity level	Info
Example	Log Area Reset/Cleared
Impact	No negative impact.
Cause	This message is generated when all event log entries are cleared.
Recommended action	No action is required.

SEL Full

Event code	0x104000de
Message text	SEL Full
Variable fields	N/A
Severity level	Minor
Example	SEL Full
Impact	The system stops logging new events.
Cause	This message is generated when one of the following occurs: · The event log reaches its maximum size and the system stops logging new events. · A user disables event logging.
Recommended action	Log in to HDM, enter the Event Log page, and clear all event logs.

SEL Almost Full

Event code	0x105000de
Message text	SEL Almost Full
Variable fields	N/A
Severity level	Minor
Example	SEL Almost Full
Impact	No negative impact.
Cause	The log file is approaching its maximum size and the logging policy is configured to stop logging when storage is full.
Recommended action	Log in to HDM, enter the Event Log page, and clear all event logs.

System Event

System Reconfigured---BIOS load default. CMOS cleared

Event code	0x120000de
Message text	System Reconfigured---BIOS load default. CMOS cleared
Variable fields	N/A
Severity level	Minor
Example	System Reconfigured---BIOS load default. CMOS cleared
Impact	The BIOS loads the default settings and the user-configured settings are lost.
Cause	The system board battery is abnormal.
Recommended action	1. Verify that the BIOS boot mode meets the requirements of secure boot. If not, change the boot mode to UEFI. 2. Verify that the BIOS firmware is upgraded successfully. 3. Upgrade the BIOS and restore the factory defaults (if any) or the BIOS default settings. 4. If the issue persists, contact Technical Support.

Limit Exceeded---CPU usage exceeds the threshold

Event code	0x1210100a
Message text	Limit Exceeded---CPU usage exceeds the threshold---Current usage $1, Threshold $2
Variable fields	$1: Current CPU usage. $2: CPU usage threshold.
Severity level	Info
Example	Limit Exceeded---Cpu usage exceeds the threshold---Current usage 82%, Threshold 80%
Impact	System performance degradation might occur.
Cause	The CPU usage exceeds the threshold.
Recommended action	No action is required.

Limit Exceeded---Mem usage exceeds the threshold

Event code	0x120200de
Message text	Limit Exceeded---Mem usage exceeds the threshold---Current usage $1, Threshold $2
Variable fields	$1: Current memory usage. $2: Memory usage threshold.
Severity level	Major
Example	Limit Exceeded---Mem usage exceeds the threshold---Current usage 81%, Threshold 80%
Impact	System performance degradation might occur.
Cause	The memory usage exceeds the threshold.
Recommended action	No action is required.

Limit Exceeded---Network usage exceeds the threshold

Event code	0x120300de
Message text	Limit Exceeded---Network usage exceeds the threshold---Current usage $1, Threshold $2
Variable fields	$1: Current network usage. $2: Network usage threshold.
Severity level	Major
Example	Limit Exceeded---Network usage exceeds the threshold---Current usage 81%, Threshold 80%
Impact	Network connectivity might be lost.
Cause	The network usage exceeds the threshold.
Recommended action	This message is triggered by FIST SMS according to the system resource usage.

Limit Exceeded---Hard disk usage exceeds the threshold

Event code	0x120400de
Message text	Limit Exceeded---Hard disk usage exceeds the threshold---OS:Linux/Unix,See disk details about Logical disk name,Current usage $1, Threshold $2
Variable fields	$1: Current drive usage. $2: Drive usage threshold.
Severity level	Major
Example	Limit Exceeded---Hard disk usage exceeds the threshold---OS:Linux/Unix,See disk details about Logical disk name,Current usage 81%, Threshold 80%
Impact	The drive reliability decreases, which might impact the storage performance and service operation of the OS.
Cause	The drive usage exceeds the threshold.
Recommended action	This message is triggered by FIST SMS according to the system resource usage.
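
The four Limit Exceeded messages above share the trailing "Current usage $1, Threshold $2" fields, so a monitoring script could extract them with one pattern. The following is a minimal sketch, not part of HDM or FIST SMS. The function name parse_limit_exceeded is hypothetical, and the pattern assumes that both values are whole-number percentages, as in the documented examples.

```python
import re

# Pattern derived from the "Limit Exceeded" message texts in this section.
# Assumption: usage and threshold are integers followed by "%".
LIMIT_RE = re.compile(
    r"^Limit Exceeded---(?P<resource>.+?) usage exceeds the threshold---"
    r".*?Current usage (?P<usage>\d+)%, Threshold (?P<threshold>\d+)%$"
)


def parse_limit_exceeded(message):
    """Return (resource, usage, threshold), or None if the text does not match."""
    match = LIMIT_RE.match(message.strip())
    if match is None:
        return None
    return (
        match.group("resource"),
        int(match.group("usage")),
        int(match.group("threshold")),
    )
```

The lazy `.*?` before "Current usage" skips the extra "OS:..." text that only the hard disk message carries.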

Timestamp clock synch---BMC Time SYNC succeed

Event code	0x125000de
Message text	Timestamp Clock Synch---BMC Time SYNC succeed.
Variable fields	N/A
Severity level	Info
Example	Timestamp Clock Synch---BMC Time SYNC succeed.
Impact	No negative impact.
Cause	BMC synchronized ME clock successfully.
Recommended action	No action is required.

Timestamp clock synch

Event code	0x128000de
Message text	Timestamp Clock Synch---event is $1 of pair---SEL Timestamp Clock updated
Variable fields	$1: In the format of first/second, where first represents the event before time synchronization and second represents the event after time synchronization.
Severity level	Info
Example	Timestamp Clock Synch---event is first of pair---SEL Timestamp Clock updated
Impact	No negative impact.
Cause	HDM synchronizes time with the server when the server is powered on. The first event is triggered before time synchronization and the second event is triggered after time synchronization.
Recommended action	No action is required.
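
Because the first event is logged before time synchronization and the second one after it, the two timestamps can be paired to see roughly how far the SEL clock moved. The following is a minimal sketch under stated assumptions: the input is a hypothetical list of (timestamp, message) tuples in log order, and the function name pair_clock_sync_events is illustrative only. HDM does not necessarily export logs in this shape.

```python
from datetime import datetime

# Fixed parts of the documented message text around the $1 field.
PREFIX = "Timestamp Clock Synch---event is "
SUFFIX = " of pair---SEL Timestamp Clock updated"


def pair_clock_sync_events(entries):
    """Pair 'first' and 'second' events.

    entries: iterable of (datetime, message) tuples in log order
    (an assumed format). Returns a list of (first_ts, second_ts) tuples.
    """
    pairs = []
    pending_first = None
    for timestamp, message in entries:
        if not (message.startswith(PREFIX) and message.endswith(SUFFIX)):
            continue  # not a clock synchronization event
        which = message[len(PREFIX):len(message) - len(SUFFIX)]
        if which == "first":
            pending_first = timestamp
        elif which == "second" and pending_first is not None:
            pairs.append((pending_first, timestamp))
            pending_first = None
    return pairs
```

The difference between the two timestamps in a pair approximates the clock adjustment plus the time that elapsed between the two events.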

Critical Interrupt

PCI PERR

Event code	0x134000de
Message text	PCI PERR ---Slot $1---PCIE Name:$2
Variable fields	$1: Slot number. $2: PCIe name.
Severity level	Major
Example	PCI PERR ---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An internal parity error occurs on the PCIe module. This message is generated when the PERR signal (parity check) on the PCIe module is abnormal.
Recommended action	1. If the message is reported several times during a period of time, ensure that the riser card is securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe module based on the slot number. 4. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Verify that the gold plating on the PCIe module is not contaminated. c. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. d. Update all firmware and drivers, including non-Intel components. e. If the error occurs on the PCIe slot, verify that the slot is normal and the gold plating on the riser card is not contaminated. f. Replace the PCIe module. 5. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and drivers. b. Replace the system board.

PCI SERR

Event code	0x13500000
Message text	PCI SERR---Slot $1---PCIE Name:$2
Variable fields	$1: Slot number. $2: PCIe name.
Severity level	Major
Example	PCI SERR---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An internal system error occurred on the PCIe module. This message is generated when the SERR signal on the PCIe module is abnormal. A system error includes an address parity error, data parity error within a period, and other fatal errors.
Recommended action	1. If the message is reported several times during a period of time, ensure that the riser card is securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe module based on the slot number. 4. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Verify that the gold plating on the PCIe module is not contaminated. c. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. d. Update all firmware and drivers, including non-Intel components. e. If the error occurs on the PCIe slot, verify that the slot is normal and the gold plating on the riser card is not contaminated. f. Replace the PCIe module. 5. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and drivers. b. Replace the system board.

Bus Correctable Error

Event code	0x137000de
Message text	Bus Correctable Error ---Slot $1---PCIE Name:$2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Minor
Example	Bus Correctable Error---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	If this message is generated occasionally, no negative impact occurs on the system. If this message is generated frequently, the PCIe module performance might be affected.
Cause	An internal correctable error occurred on the PCIe module.
Recommended action	1. Ignore this message if it is generated during access to the PCIe module. 2. If the same message is generated repeatedly, use the slot number to locate the faulty PCIe module. 3. If the PCIe module is removable, verify that the PCIe module is installed correctly or install the PCIe module to another slot to identify the cause. 4. Replace the PCIe module.

Bus Correctable Error

Event code	0x137800de
Message text	Bus Correctable Error ---Slot $1---PCIE Name:$2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Minor
Example	Bus Correctable Error---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	If this message is generated occasionally, no negative impact occurs on the system. If this message is generated frequently, the PCIe module performance might be affected.
Cause	An internal correctable error occurred on the PCIe module on an AMD model.
Recommended action	1. Ignore this message if it is generated during access to the PCIe module. 2. If the same message is generated repeatedly, use the slot number to locate the faulty PCIe module. 3. If the PCIe module is removable, verify that the PCIe module is installed correctly or install the PCIe module to another slot to identify the cause. 4. Replace the PCIe module.

Bus Uncorrectable Error

Event code	0x138000de
Message text	Bus Uncorrectable Error ---Slot $1---PCIE Name:$2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Major
Example	Bus Uncorrectable Error---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An internal uncorrectable error occurred on the PCIe module.
Recommended action	1. If the message is reported several times during a period of time, ensure that the riser card is securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe module based on the slot number. 4. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Verify that the gold plating on the PCIe module is not contaminated. c. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. d. If the error occurs on the PCIe module, update all firmware and drivers. e. If the error occurs on the slot, verify that the gold plating on the riser card is not contaminated. f. If the issue persists, replace the PCIe module. 5. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and drivers. b. Replace the system board. 6. If the issue persists, contact Technical Support.

Bus Uncorrectable Error

Event code	0x138800de
Message text	Bus Uncorrectable Error ---Slot $1---PCIE Name:$2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Major
Example	Bus Uncorrectable Error---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An internal uncorrectable error identified by SHD occurred on the PCIe module on an AMD model.
Recommended action	1. Locate the PCIe module based on the slot number. 2. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Install the PCIe module in another slot. c. Update the firmware and driver of the PCIe module. 3. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and driver. b. Replace the system board.

Bus Fatal Error

Event code	0x13a000de
Message text	Bus Fatal Error---Slot $1---PCIE Name: $2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Major
Example	Bus Fatal Error---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An internal fatal error occurred on the PCIe module.
Recommended action	1. If the message is reported several times during a period of time, ensure that the riser card is securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe module based on the slot number. 4. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Verify that the gold plating on the PCIe module is not contaminated. c. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. d. If the error occurs on the PCIe module, upgrade the firmware and drivers of the PCIe module. e. If the error occurs on the slot, verify that the gold plating on the riser card is not contaminated. f. If the issue persists, replace the PCIe module. 5. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and drivers. b. Replace the system board. 6. If the issue persists, contact Technical Support.

Bus Degraded

Event code	0x13b000de
Message text	Bus Degraded ---Slot $1---PCIE Name: $2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Major
Example	Bus Degraded ---Slot 3---PCIE Name: RAID-LSI-9361-8i
Impact	System performance degradation might occur.
Cause	The speed and bandwidth of the PCIe module decreased.
Recommended action	1. If the message is reported several times during a period of time, ensure that the riser card is securely connected to the system board. 2. Reboot the server. 3. Locate the PCIe module based on the slot number. 4. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Verify that the gold plating on the PCIe module is not contaminated. c. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. d. Update all firmware and drivers, including non-Intel components. e. If the error occurs on the PCIe slot, verify that the slot is normal and the gold plating on the riser card is not contaminated. f. Replace the PCIe module. 5. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and drivers. b. Replace the system board.
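
The PCI PERR, PCI SERR, and Bus messages above all carry the slot number and PCIe name in the same positions, which is what the recommended actions rely on to locate the module. The following is a minimal sketch of extracting both fields. The function name parse_pcie_event is hypothetical, and the pattern assumes a numeric slot number as in the documented examples. It tolerates optional spaces and runs of three or more hyphens between fields.

```python
import re

# Pattern derived from the "<event>---Slot $1---PCIE Name:$2" message texts.
# Assumption: the slot number is a decimal integer.
PCIE_RE = re.compile(
    r"^(?P<event>.+?)\s*-{3,}\s*Slot\s*(?P<slot>\d+)\s*-{3,}\s*"
    r"PCIE Name:\s*(?P<name>.+)$"
)


def parse_pcie_event(message):
    """Return a dict with event, slot, and name, or None if no match."""
    match = PCIE_RE.match(message.strip())
    if match is None:
        return None
    return {
        "event": match.group("event"),
        "slot": int(match.group("slot")),
        "name": match.group("name").strip(),
    }
```

Grouping the parsed results by slot makes it easier to tell an occasional correctable error from one that repeats on the same module.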

$1 triggered an uncorrectable error

Event code	0x138400de
Message text	$1 triggered an uncorrectable error
Variable fields	$1: PCIe module type.
Severity level	Major
Example	NIC triggered an uncorrectable error
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An IERR or MCERR error occurred, which is identified as a PCIe uncorrectable error by HDM.
Recommended action	1. Locate the PCIe module based on the slot number. 2. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. c. Update all firmware and drivers, including non-Intel components. 3. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and driver. b. Replace the system board.

$1 triggered a correctable error

Event code	0x137400de
Message text	$1 triggered a correctable error
Variable fields	$1: PCIe module type.
Severity level	Major
Example	NIC triggered a correctable error
Impact	An error occurred on the PCIe module, which might lead to a system-level failure if the error is severe enough.
Cause	An IERR or MCERR error occurred, which is identified as a PCIe correctable error by HDM.
Recommended action	1. Locate the PCIe module based on the slot number. 2. If the PCIe module is a removable component, perform the following operations: a. Verify that the PCIe module is installed correctly. b. Install the PCIe module to another slot to identify whether the error is present on the PCIe module or the slot. c. Update all firmware and drivers, including non-Intel components. 3. If the PCIe module is embedded on the system board, perform the following operations: a. Update the BIOS, firmware, and driver. b. Replace the system board.

Button / Switch

Power Button pressed---Physical button---Button pressed

Event code	0x140000de
Message text	Power Button pressed---$1---$2
Variable fields	$1: Button type, including Physical button and Virtual button. $2: Action, including Power off command, Power on command, and Soft off command.
Severity level	Info
Example	Power Button pressed---Physical button---Power off command
Impact	No negative impact.
Cause	This message is generated in the following conditions: · The physical power button on the front panel of the server is pressed. · A command is executed to forcibly power off, gracefully power off, or power cycle the server.
Recommended action	No action is required.

Reset Button pressed

Event code	0x142000de
Message text	Reset Button pressed---Virtual button---reset command
Variable fields	N/A
Severity level	Info
Example	Reset Button pressed---Virtual button---reset command
Impact	No negative impact.
Cause	This message is generated when one of the following conditions exists: · The reset command is executed. · An IERR event occurs.
Recommended action	No action is required.

Module / Board

Transition to Non-Critical from OK

Event code	0x1510000e
Message text	Transition to Non-Critical from OK
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-Critical from OK
Impact	No negative impact if this message is generated occasionally.
Cause	An internal correctable error occurred on the PCIe BUS0 device.
Recommended action	1. Verify that the power supply for the system is normal. 2. If the issue persists, contact Technical Support.

Transition to Critical from less severe

Event code	0x1520000e
Message text	Transition to Critical from less severe
Variable fields	N/A
Severity level	Major
Example	Transition to Critical from less severe
Impact	An error occurred on the PCIe BUS0 device, which might lead to a system-level failure if the error is severe enough.
Cause	An internal uncorrectable error occurred on the PCIe BUS0 device.
Recommended action	1. Verify that the power supply for the system is normal. 2. Verify that all components are operating correctly. 3. If the issue persists, contact Technical Support.

Transition to Non- Recoverable from less severe

Event code	0x1530000e
Message text	Transition to Non- Recoverable from less severe $1($2).
Variable fields	$1: Faulty component, such as the system board (---System detected a power supply failure on Motherboard), PDB (---System detected a power supply failure on PDB), compute module (---System detected a power supply failure on CMOD), and riser card (---System detected a power supply failure on Riser). If this alarm is not related to specific components, this field is empty. $2: Specific faulty component, such as P5V, P5V_STBY, CPU1_PVCSA, CPU2_PVCCIO.
Severity level	Critical
Example	Transition to Non- Recoverable from less severe---System detected a power supply failure on Motherboard(P5V).
Impact	System power-off might occur.
Cause	Abnormal board voltage.
Recommended action	1. Ignore this message if it is triggered by a system power-on or power-off event. 2. Reconnect power cords and identify whether the server can be powered on correctly. ○ If the server can be powered on, the message might be generated because the detection signals were subject to interference. No action is required. ○ If the server cannot be powered on, review the SDS logs to locate the fault and replace the faulty component. 3. If the message is generated again during the operation, replace the faulty component. 4. If the issue persists, contact Technical Support.
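
When this message names a component, the $1 and $2 fields identify the board and the voltage rail to inspect. The following is a minimal sketch of pulling both out of the message. The function name parse_power_failure is hypothetical, and the pattern is based only on the documented example. It returns None when the message carries no component information.

```python
import re

# Pattern derived from the documented $1($2) fields, for example
# "---System detected a power supply failure on Motherboard(P5V)."
POWER_FAIL_RE = re.compile(
    r"---System detected a power supply failure on "
    r"(?P<component>[^(]+)\((?P<rail>[^)]+)\)\.?$"
)


def parse_power_failure(message):
    """Return (component, rail), or None if the fields are absent."""
    match = POWER_FAIL_RE.search(message.strip())
    if match is None:
        return None
    return match.group("component").strip(), match.group("rail")
```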

Transition to Non-Critical from OK---System is operating in KTI Link Slow Speed Mode

Event code	0x15101ff4
Message text	Transition to Non-Critical from OK---System is operating in KTI Link Slow Speed Mode
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-Critical from OK---System is operating in KTI Link Slow Speed Mode
Impact	System startup failure might occur.
Cause	The system is operating in Keizer Technology Interconnect (KTI) low speed mode.
Recommended action	Verify that the signal quality and hardware parameters are correct.

Transition to Non-Critical from OK---Requested Link Speed is not supported. Defaulting to 18GT

Event code	0x15102ff4
Message text	Transition to Non-Critical from OK---Requested Link Speed is not supported. Defaulting to 18GT
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-Critical from OK---Requested Link Speed is not supported. Defaulting to 18GT
Impact	System startup failure might occur.
Cause	The link speed is not supported.
Recommended action	Verify that hardware parameters are correct.

Transition to Non-Critical from OK---One or more per Link option mismatch detected. Forcing to common setting

Event code	0x15104ff4
Message text	Transition to Non-Critical from OK---One or more per Link option mismatch detected. Forcing to common setting
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-Critical from OK---One or more per Link option mismatch detected. Forcing to common setting
Impact	System startup failure might occur.
Cause	Some CPU links are faulty.
Recommended action	Verify that the UPI configuration is correct on the BIOS setup utility.

Transition to Non-Critical from OK---Some CPU has more than one link connecting to other CPU. Disable one of the Dual-Link

Event code	0x15105ff4
Message text	Transition to Non-Critical from OK---Some CPU has more than one link connecting to other CPU. Disable one of the Dual-Link
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-Critical from OK---Some CPU has more than one link connecting to other CPU. Disable one of the Dual-Link
Impact	System startup failure might occur.
Cause	A UPI link error occurred.
Recommended action	Verify that the UPI link is connected as required.

Transition to Non-Critical from OK---KTI Adaptation is in progress, or High Speed adaptation is failed

Event code	0x15106ff4
Message text	Transition to Non-Critical from OK---KTI Adaptation is in progress, or High Speed adaptation is failed
Variable fields	N/A
Severity level	Minor
Example	Transition to Non-Critical from OK---KTI Adaptation is in progress, or High Speed adaptation is failed
Impact	System startup failure might occur.
Cause	KTI adaptation is in progress.
Recommended action	Verify that the signal quality and hardware parameters are correct.

System board triggered an uncorrectable error

Event code	0x1521000e
Message text	System board triggered an uncorrectable error
Variable fields	N/A
Severity level	Major
Example	System board triggered an uncorrectable error
Impact	An IERR or MCERR error occurred in the system, which causes services to become unavailable.
Cause	An IERR or MCERR error was triggered. The error was identified as an uncorrectable error on the system board (including backplanes) by HDM.
Recommended action	If the issue persists, contact Technical Support.

System board triggered a correctable error

Event code	0x1521000e
Message text	System board triggered a correctable error
Variable fields	N/A
Severity level	Minor
Example	System board triggered a correctable error
Impact	An IERR or MCERR error occurred in the system, which causes services to become unavailable.
Cause	An IERR or MCERR error was triggered. The error was identified as a correctable error on the system board (including backplanes) by HDM.
Recommended action	If the issue persists, contact Technical Support.

Add-in Card

Transition to OK

Event code	0x1700000e
Message text	Transition to OK---PCIe slot: $1---LDDevno:$2
Variable fields	$1: PCIe slot where the logical drive resides. $2: Logical drive number.
Severity level	Info
Example	Transition to OK---PCIe slot:1---LDDevno:0
Impact	No negative impact.
Cause	This message is generated if the logical drive managed by the storage controller changes from abnormal to normal.
Recommended action	No action is required.

Transition to Critical from less severe

Event code	0x1720000e
Message text	Transition to Critical from less severe
Variable fields	N/A
Severity level	Major
Example	Transition to Critical from less severe
Impact	System power-off might occur.
Cause	The backplane power supply is faulty.
Recommended action	1. Ignore this message if it is triggered by a system power-on or power-off event. 2. Reconnect power cords and identify whether the server can be powered on correctly. ○ If the server can be powered on, the message might be generated because the detection signals were subject to interference. No action is required. ○ If the server cannot be powered on, review the SDS logs to locate the fault and replace the faulty component. 3. If the issue persists, replace the faulty component. 4. If the issue persists, contact Technical Support.

Transition to Critical from less severe

Event code	0x172a000e
Message text	Transition to Critical from less severe---PCIe slot:$1---LDDevno:$2
Variable fields	$1: PCIe slot where the logical drive resides. $2: Logical drive number.
Severity level	Major
Example	Transition to Critical from less severe---PCIe slot: 1---LDDevno:0
Impact	The logical drive degraded, which might impact data reliability.
Cause	This message is generated when the logical drive managed by the storage controller is degraded or faulty.
Recommended action	1. Log in to HDM to identify whether the logical drive is degraded or faulty. 2. If the logical drive is degraded, perform the following operations: a. Verify that all member drives in the logical drive are operating correctly. b. Re-install member drives to identify whether the drives can be correctly identified. c. Access the BIOS to identify whether all member drives have been configured correctly. d. Check the error logs for the drives. e. Replace the faulty drive. f. If the issue persists, contact Technical Support. 3. If the logical drive is faulty, perform the following operations: a. Verify that the drive has not been uninstalled. b. Re-install the member drives and rebuild the RAID. c. Replace the faulty drive, and then reboot the server. d. If the issue persists, contact Technical Support.

Transition to Non-recoverable from less severe

Event code	0x1730000e
Message text	Transition to Non-recoverable from less severe
Variable fields	N/A
Severity level	Critical
Example	Transition to Non-recoverable from less severe
Impact	System power-off might occur.
Cause	The backplane power supply or the riser power supply is faulty.
Recommended action	1. Ignore this message if it is triggered by a system power-on or power-off event. 2. Reconnect power cords and identify whether the server can be powered on correctly. ○ If the server can be powered on, the message might be generated because the detection signals were subject to interference. No action is required. ○ If the server cannot be powered on, review the SDS logs to locate the fault and replace the faulty component. 3. If the issue persists, replace the faulty component. 4. If the issue persists, contact Technical Support.

ChipSet

Transition to Critical from less severe

Event code	0x1920000e
Message text	Transition to Critical from less severe
Variable fields	N/A
Severity level	Major
Example	Transition to Critical from less severe
Impact	System performance degradation might occur.
Cause	The PCH status was abnormal.
Recommended action	1. If this message is generated during the host restart process, ignore this message. 2. If this message is repeatedly generated during the operation, replace the system board. 3. If the issue persists, contact Technical Support.

Cable/Interconnect

Configuration Error - Incorrect cable connected / Incorrect interconnection

Event code	0x1b1000de
Message text	Configuration Error - Incorrect cable connected / Incorrect interconnection
Variable fields	N/A
Severity level	Minor
Example	Configuration Error - Incorrect cable connected / Incorrect interconnection
Impact	The network is abnormal, which might cause loss of network connectivity in the system.
Cause	Incorrect cable configuration.
Recommended action	1. Verify that the cables are connected to the correct interfaces. 2. Verify that the cables are connected properly for power connection.

Configuration Error - Incorrect cable connected / Incorrect interconnection

Event code	0x1b1800de
Message text	Configuration Error - Incorrect cable connected / Incorrect interconnection---$1
Variable fields	$1: Incorrect cable configuration.
Severity level	Minor
Example	Configuration Error - Incorrect cable connected / Incorrect interconnection---Incorrect SATA cable connection to the backplane
Impact	A communication exception might occur on the backplane.
Cause	Incorrect cable configuration.
Recommended action	1. Verify that the cables are connected to the correct interfaces. 2. Verify that the cables are connected properly for power connection.

Configuration Error - Incorrect cable connected / Incorrect interconnection

Event code	0x1b1400de
Message text	Configuration Error - Incorrect cable connected / Incorrect interconnection ($1)
Variable fields	$1: Cable connection location.
Severity level	Minor
Example	Configuration Error - Incorrect cable connected / Incorrect interconnection (FrontBackplane1)
Impact	A communication exception might occur on the backplane.
Cause	Incorrect cable configuration.
Recommended action	1. Verify that the cables are connected to the correct interfaces. 2. Verify that the cables are connected properly for power connection.
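
The three cable configuration messages above share the same base text and differ only in the trailing variable field. The following Python sketch shows one way a log consumer might tell them apart. The function name `parse_cable_error` and the returned dictionary keys are hypothetical and not part of HDM. The pattern tolerates optional spaces around the hyphen and before the parenthesis, on the assumption that spacing can vary between firmware outputs.

```python
import re

# Hypothetical pattern for the three "Configuration Error" variants listed above.
_CABLE_RE = re.compile(
    r"^Configuration Error\s*-\s*Incorrect cable connected / Incorrect interconnection"
    r"\s*(?:---(?P<detail>.+)|\((?P<location>[^)]+)\))?$"
)

def parse_cable_error(message):
    """Return the event code and variable field of a cable error, or None."""
    match = _CABLE_RE.match(message.strip())
    if match is None:
        return None
    detail = match.group("detail")
    location = match.group("location")
    if detail is not None:
        # Variant with "---$1": $1 describes the incorrect cable configuration.
        return {"event_code": "0x1b1800de", "detail": detail.strip(), "location": None}
    if location is not None:
        # Variant with "($1)": $1 is the cable connection location.
        return {"event_code": "0x1b1400de", "detail": None, "location": location.strip()}
    # Variant without variable fields.
    return {"event_code": "0x1b1000de", "detail": None, "location": None}
```

The event codes in the returned dictionary are copied from the three entries above. A message that does not match the base text returns `None`.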

System Boot / Restart Initiated

Initiated by power up

Event code	0x1d0000de
Message text	Initiated by power up
Variable fields	N/A
Severity level	Info
Example	Initiated by power up
Impact	No negative impact.
Cause	This event is triggered by a system power-on.
Recommended action	No action is required.

Initiated by hard reset

Event code	0x1d1000de
Message text	Initiated by hard reset
Variable fields	N/A
Severity level	Info
Example	Initiated by hard reset
Impact	No negative impact.
Cause	This event is triggered by a system restart.
Recommended action	No action is required.

Initiated by warm reset

Event code	0x1d2000de
Message text	Initiated by warm reset
Variable fields	N/A
Severity level	Info
Example	Initiated by warm reset
Impact	No negative impact.
Cause	This event is triggered by a system warm restart.
Recommended action	No action is required.

System restart

Event code	0x1d7000de
Message text	System Restart---$1:$2
Variable fields	$1: Reboot cause. $2: Power mode. Options include power off, power reset, and power cycle. This field might be empty.
Severity level	Info
Example	System Restart---due to power button pressed:power off
Impact	No negative impact.
Cause	The system restarts.
Recommended action	No action is required.
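
Because the power mode field ($2) might be empty, a script that reads this message needs to handle a missing value. The following Python sketch illustrates one approach. The helper name `parse_system_restart` is hypothetical, and the sketch assumes that the colon separator is still present when $2 is empty, which this reference does not state.

```python
import re

# Hypothetical pattern for "System Restart---$1:$2".
# The cause is split from the power mode at the last colon in the message.
_RESTART_RE = re.compile(r"^System Restart---(?P<cause>.*?):(?P<mode>[^:]*)$")

def parse_system_restart(message):
    """Return the reboot cause and power mode, or None if the text does not match."""
    match = _RESTART_RE.match(message.strip())
    if match is None:
        return None
    # An empty $2 is reported as None rather than as an empty string.
    mode = match.group("mode").strip() or None
    return {"cause": match.group("cause").strip(), "power_mode": mode}
```

For the example message above, the sketch returns the cause `due to power button pressed` and the power mode `power off`.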

Boot Error

No bootable media

Event code	0x1e0000de
Message text	No bootable media
Variable fields	N/A
Severity level	Info
Example	No bootable media
Impact	No negative impact.
Cause	This message indicates that no bootable media was found. It typically has no negative impact.
Recommended action	1. Specify an available boot device. 2. If the issue persists, contact Technical Support.

OS_BOOT

C: boot completed

Event code	0x1f1000de
Message text	C: boot completed
Variable fields	N/A
Severity level	Info
Example	C: boot completed
Impact	No negative impact.
Cause	The operating system booted from a hard drive. This event occurs on most Windows operating systems.
Recommended action	No action is required.

Boot completed - boot device not specified

Event code	0x1f6000de
Message text	Boot completed - boot device not specified
Variable fields	N/A
Severity level	Info
Example	Boot completed - boot device not specified
Impact	No negative impact.
Cause	This message is generated when the server exits the BIOS boot phase.
Recommended action	No action is required.

OS Stop / Shutdown

Run-time Critical Stop

Event code	0x201000de
Message text	Run-time Critical Stop
Variable fields	N/A
Severity level	Critical
Example	Run-time Critical Stop
Impact	The system crashes.
Cause	A critical error occurred during operating system operation.
Recommended action	1. Verify that the installed system, drivers, firmware, and software do not have bugs and are compatible with the server. 2. Update the versions if bugs or compatibility issues exist. 3. Verify that the installed hardware options are compatible with the server. For more information about component and server compatibility, access the component compatibility query tool at the official website. 4. If the issue persists, contact Technical Support.

OS Graceful Stop

Event code	0x202000de
Message text	OS Graceful Stop
Variable fields	N/A
Severity level	Info
Example	OS Graceful Stop
Impact	The system shut down.
Cause	The Windows OS was forcibly stopped.
Recommended action	No action is required.

OS Graceful Shutdown

Event code	0x203000de
Message text	OS Graceful Shutdown
Variable fields	N/A
Severity level	Info
Example	OS Graceful Shutdown
Impact	The system shut down.
Cause	The Windows OS was shut down gracefully.
Recommended action	No action is required.

Slot / Connector

Device disabled: PCIe module information not obtained

Event code	0x21000012
Message text	Device disabled: PCIe module information not obtained---Slot $1
Variable fields	$1: PCIe slot number.
Severity level	Major
Example	Device disabled: PCIe module information not obtained---Slot 1
Impact	The PCIe module cannot be identified, which decreases system performance.
Cause	The PCIe module is faulty.
Recommended action	1. Verify that the server starts up with the minimum configuration. For more information, see H3C Servers Troubleshooting Guide. 2. Identify whether the port is disabled in the BIOS. 3. Verify that the PCIe module is compatible with the server. 4. Verify that the PCIe module is installed correctly. 5. Install the PCIe module into another slot to verify that the PCIe module is not faulty. 6. If the issue persists, contact Technical Support.

Fault Status asserted

Event code	0x210000de
Message text	Fault Status asserted:---fan error in slot $1
Variable fields	$1: Slot number.
Severity level	Major
Example	Fault Status asserted:---fan error in slot 15
Impact	The system might crash due to a PCIe module error.
Cause	This message is generated when the OCP fan is absent or blocked.
Recommended action	1. Re-install the OCP fan. 2. If the issue persists, replace the OCP fan.

Transition to Non-Critical from OK

Event code	0x2110000e
Message text	Transition to Non-Critical from OK---slot $1----PCIe Name:$2
Variable fields	$1: PCIe slot number. $2: PCIe module name.
Severity level	Major
Example	Transition to Non-Critical from OK---slot 15----PCIe Name:NIC-620F-B2-25Gb-2P-1-X
Impact	The system might crash due to a PCIe module error.
Cause	This message is generated when the system fails to obtain information about network adapter connection.
Recommended action	1. Verify that the network adapter is not faulty. 2. Verify that the related links are operating correctly, for example, I2C or MCTP.
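
Message text fields in this reference use $1, $2, and so on as placeholders for variable fields. As a rough illustration of how such a template relates to a concrete log line, the following Python sketch converts a message text template into a regular expression and extracts the variable fields. The function names `template_to_regex` and `extract_fields` are hypothetical, and the sketch assumes that variable fields do not themselves contain the literal separators of the template.

```python
import re

def template_to_regex(template):
    """Build a compiled pattern from a message text template such as 'Slot $1'."""
    pattern = ""
    # Split the template into literal text and $N placeholders.
    for part in re.split(r"(\$\d+)", template):
        if re.fullmatch(r"\$\d+", part):
            # Each placeholder becomes a named group: $1 -> v1, $2 -> v2.
            pattern += "(?P<v%s>.+?)" % part[1:]
        else:
            pattern += re.escape(part)
    return re.compile("^" + pattern + "$")

def extract_fields(template, message):
    """Return a dict such as {'$1': '15'} or None if the message does not match."""
    match = template_to_regex(template).match(message)
    if match is None:
        return None
    return {"$" + name[1:]: value for name, value in match.groupdict().items()}
```

Applied to the message text and example of the entry above, the sketch yields `15` for $1 and `NIC-620F-B2-25Gb-2P-1-X` for $2.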

System ACPI Power State

S0 / G0 "working"

Event code	0x220000de
Message text	S0 / G0 "working"
Variable fields	N/A
Severity level	Info
Example	S0 / G0 "working"
Impact	No negative impact.
Cause	S0 / G0 indicates that the system is operating correctly. G(0-2) indicates the global states (G-states) and S(0-5) indicates the sleep states (S-states). G0 is the working state, in which you can run applications. S0 is the normal operating state.
Recommended action	No action is required.

S0 / G0 "working"

Event code	0x220800de
Message text	S0 / G0 "working"---$1
Variable fields	$1: Reason for a power-on operation, including: · due to virtual power button pressed · due to physical power button pressed · due to ipmi cmd · due to redfish cmd · due to AC lost · due to kvm button pressed · due to snmp cmd
Severity level	Info
Example	S0 / G0 "working"--- due to virtual power button pressed
Impact	No negative impact.
Cause	The system is powered on.
Recommended action	No action is required.

S5 / G2 "soft-off"

Event code	0x225000de
Message text	S5 / G2 "soft-off"
Variable fields	N/A
Severity level	Info
Example	S5 / G2 "soft-off"
Impact	No negative impact.
Cause	S5 / G2 indicates the soft-off state. You cannot run applications or the operating system in this state. A soft-off shuts down the entire system except the main power supply unit, and almost no power is consumed. Starting the system from soft-off requires a full reboot, so it takes longer than waking from a sleep state.
Recommended action	No action is required.

S5 / G2 "soft-off"

Event code	0x225000de
Message text	S5 / G2 "soft-off"---$1
Variable fields	$1: Reason for a power-off operation, including: · due to virtual power button pressed · due to physical power button pressed · due to ipmi cmd · due to redfish cmd · due to AC lost · due to kvm button pressed · due to snmp cmd
Severity level	Info
Example	S5 / G2 "soft-off"--- due to virtual power button pressed
Impact	No negative impact.
Cause	S5 / G2 indicates the soft-off state. You cannot run applications or the operating system in this state. A soft-off shuts down the entire system except the main power supply unit, and almost no power is consumed. Starting the system from soft-off requires a full reboot, so it takes longer than waking from a sleep state.
Recommended action	No action is required.
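
The S0 / G0 and S5 / G2 messages above appear both with and without a trailing reason. The following Python sketch shows one possible way to read the power state and the optional reason from such a message. The function name `parse_acpi_state` is hypothetical. The reason list is copied from the Variable fields rows above, and the pattern allows optional whitespace after the `---` separator because the examples above include a space there.

```python
import re

# Reasons listed in the Variable fields rows of the S0 / G0 and S5 / G2 entries.
KNOWN_REASONS = {
    "due to virtual power button pressed",
    "due to physical power button pressed",
    "due to ipmi cmd",
    "due to redfish cmd",
    "due to AC lost",
    "due to kvm button pressed",
    "due to snmp cmd",
}

# Hypothetical pattern: state, quoted label, then an optional "---reason".
_ACPI_RE = re.compile(
    r'^(?P<state>S\d / G\d) "(?P<label>[^"]+)"(?:---\s*(?P<reason>.+?))?\s*$'
)

def parse_acpi_state(message):
    """Return state, label, reason, and whether the reason is a listed one."""
    match = _ACPI_RE.match(message)
    if match is None:
        return None
    reason = match.group("reason")
    known = None if reason is None else (reason in KNOWN_REASONS)
    return {
        "state": match.group("state"),
        "label": match.group("label"),
        "reason": reason,
        "known_reason": known,
    }
```

A message without a reason returns `None` for both `reason` and `known_reason`, so callers can distinguish a missing reason from an unlisted one.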

S4 / S5 soft-off, particular S4 / S5 state cannot be determined

Event code	0x226000de
Message text	S4 / S5 soft-off, particular S4 / S5 state cannot be determined
Variable fields	N/A
Severity level	Info
Example	S4 / S5 soft-off, particular S4 / S5 state cannot be determined
Impact	No negative impact.
Cause	S4 / S5 indicates the soft-off state, but the particular state (S4 or S5) cannot be determined. S(0-5) indicates the sleep states (S-states). S4 state: · All components are powered off, including RAM. · Only the platform settings are retained, while other settings are saved in a special location on the drive. · After a successful switch to S4, the system shuts down. · Because almost all programs and configurations stop, the power consumption is less than 3 W. · Upon wake-up, the system must enter the BIOS boot sequence again. · No system restart is required. The system will continue with the S5 shutdown state.
Recommended action	No action is required.

LPC Reset occurred

Event code	0x22d000de
Message text	LPC Reset occurred
Variable fields	N/A
Severity level	Info
Example	LPC Reset occurred
Impact	No negative impact.
Cause	The server was reset. This message is available only for servers that use Intel processors.
Recommended action	No action is required.

Watchdog2

Watchdog overflow.Action:Timer expired

Event code	0x230000de
Message text	Watchdog overflow.Action:Timer expired - status only (no action and no interrupt)---interrupt type:$1---timer use at expiration:$2
Variable fields	$1: Interrupt type. Options include none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Watchdog. Options include reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Info
Example	Watchdog overflow.Action:Timer expired - status only (no action and no interrupt)---interrupt type:none---timer use at expiration:BIOS FRB2
Impact	System startup failure might occur.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires. · The timeout action is set to no action.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to identify whether software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Identify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Hard Reset

Event code	0x231000de
Message text	Watchdog overflow.Action:Hard Reset---interrupt type:$1---timer use at expiration:$2
Variable fields	$1: Interrupt type. Options include none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Watchdog. Options include reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Major
Example	Watchdog overflow.Action:Hard Reset---interrupt type:none---timer use at expiration:BIOS FRB2
Impact	System startup failure might occur.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to hard reset.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to identify whether software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Identify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Power Down

Event code	0x232000de
Message text	Watchdog overflow.Action:Power Down---interrupt type:$1---timer use at expiration:$2
Variable fields	$1: Interrupt type. Options include none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Watchdog. Options include reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Major
Example	Watchdog overflow.Action:Power Down---interrupt type:none---timer use at expiration:BIOS FRB2
Impact	System startup failure might occur.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to power down. The watchdog forcibly powered off the system. Services are interrupted and unsaved data will be lost.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to identify whether software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Identify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.

Watchdog overflow.Action:Power Cycle

Event code	0x233000de
Message text	Watchdog overflow.Action:Power Cycle---interrupt type:$1---timer use at expiration:$2
Variable fields	$1: Interrupt type. Options include none, SMI, NMI, Messaging Interrupt, and unspecified. $2: Watchdog. Options include reserved, BIOS FRB2, BIOS POST, OS Load, SMS OS, OEM, and unspecified.
Severity level	Major
Example	Watchdog overflow.Action:Power Cycle---interrupt type:none---timer use at expiration:BIOS FRB2
Impact	System startup failure might occur.
Cause	This message is generated when the following conditions are met: · The watchdog is enabled in the BIOS. · The watchdog timer expires during the BIOS POST, OS Load, or SMS/OS phase (indicated by the watchdog timer type). · The timeout action is set to power cycle.
Recommended action	1. For a BIOS POST watchdog timeout, review the event logs to identify hardware errors or BIOS startup errors, and troubleshoot the errors as instructed in the logs. 2. For an OS Load watchdog timeout, verify that no error is present in the system startup environment. If no error is present, proceed to step 5. 3. For an OS Running watchdog timeout, review the OS logs to identify whether software exceptions occurred and troubleshoot the exceptions as instructed in the logs. 4. Identify whether data storms have occurred. If yes, troubleshoot network errors. 5. If the issue persists, contact Technical Support.
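
The four watchdog messages above share one layout: the timeout action, the interrupt type ($1), and the timer use at expiration ($2). The following Python sketch shows how a script might split such a message and look up the event code and severity level listed in this section. The function name `parse_watchdog` and the dictionary keys are hypothetical.

```python
import re

# Event code and severity level per timeout action, copied from the entries above.
ACTION_INFO = {
    "Timer expired - status only (no action and no interrupt)": ("0x230000de", "Info"),
    "Hard Reset": ("0x231000de", "Major"),
    "Power Down": ("0x232000de", "Major"),
    "Power Cycle": ("0x233000de", "Major"),
}

# Hypothetical pattern for the shared watchdog message layout.
_WATCHDOG_RE = re.compile(
    r"^Watchdog overflow\.Action:(?P<action>.+?)"
    r"---interrupt type:(?P<interrupt_type>.+?)"
    r"---timer use at expiration:(?P<timer_use>.+)$"
)

def parse_watchdog(message):
    """Return the fields of a watchdog overflow message, or None if it does not match."""
    match = _WATCHDOG_RE.match(message.strip())
    if match is None:
        return None
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    # Unknown actions get (None, None) instead of raising an error.
    code, severity = ACTION_INFO.get(fields["action"], (None, None))
    fields["event_code"] = code
    fields["severity"] = severity
    return fields
```

The `timer_use` value (for example, `BIOS POST` or `OS Load`) indicates which step of the recommended action applies.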

Entity Presence

Entity Present---License is about to expire

Event code	0x250000de
Message text	Entity Present---License is about to expire
Variable fields	N/A
Severity level	Minor
Example	Entity Present---License is about to expire
Impact	No negative impact.
Cause	This message is generated when the remaining validity period of the license is less than 10 days.
Recommended action	The temporary license is about to expire. Purchase and activate the formal license.
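
The 10-day threshold in the Cause field can be expressed as a simple date comparison. The following Python sketch is an illustration only. The function name `license_about_to_expire` is hypothetical, and the sketch assumes that the remaining validity period is counted in whole calendar days, which this reference does not specify.

```python
from datetime import date

# Threshold stated in the Cause field: fewer than 10 days of validity remain.
EXPIRY_WARNING_DAYS = 10

def license_about_to_expire(expiry, today):
    """Return True if the license is still valid but has fewer than 10 days left."""
    remaining = (expiry - today).days
    # A negative value means the license has already expired, which is
    # reported by a different message (Entity Disabled).
    return 0 <= remaining < EXPIRY_WARNING_DAYS
```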

Entity Disabled---License has expired

Event code	0x252000de
Message text	Entity Disabled---$1
Variable fields	$1: License state: · License has expired. · License is unavailable.
Severity level	Minor
Example	Entity Disabled---License has expired
Impact	No negative impact.
Cause	The license has expired or is not available.
Recommended action	1. If the temporary license has expired, purchase and activate the formal license. 2. If the license is not available, re-install and activate the existing license or contact Technical Support.

Management Subsystem Health

Controller access degraded or unavailable

Event code	0x281000de
Message text	Controller access degraded or unavailable---$1
Variable fields	$1: Fault description. Options include: · Failed to access the SD card · SD card partitions are missing
Severity level	Major
Example	Controller access degraded or unavailable---Failed to access the SD card.
Impact	No negative impact.
Cause	SD card reading failed or the SD card was missing.
Recommended action	1. Restart HDM. 2. Reset the SD module for BMC. 3. If the issue persists, contact Technical Support.

Management controller off-line

Event code	0x282000de
Message text	Management controller off-line---$1
Variable fields	$1: BMC reboot cause.
Severity level	Info
Example	Management controller off-line---BMC reset
Impact	No negative impact.
Cause	BMC was restarted.
Recommended action	No action is required.

Battery

Battery low (predictive failure)

Event code	0x290000de
Message text	Battery low (predictive failure)---PCIe slot:$1
Variable fields	$1: PCIe slot number of the storage controller.
Severity level	Minor
Example	Battery low (predictive failure)---PCIe slot:1
Impact	The reliability of the RAID controller will degrade, which might cause system performance degradation.
Cause	The supercapacitor of the storage controller has a low charge, overtemperature, overvoltage, or overcurrent condition.
Recommended action	1. Power on the server to charge the supercapacitor. Log in to HDM, and verify that the supercapacitor of the RAID controller is in normal state and identify whether the alarm is cleared. 2. Verify that the power fail safeguard module is installed correctly. 3. Replace the corresponding components, including the battery, supercapacitor, or flash card (if any), and then restart the server. 4. If the issue persists, contact Technical Support.

Battery failed

Event code	0x291000de
Message text	Battery failed---PCIe slot:$1
Variable fields	$1: PCIe slot number of the storage controller.
Severity level	Minor
Example	Battery failed---PCIe slot:1
Impact	The reliability of the RAID controller will degrade, which might cause system performance degradation.
Cause	An internal error occurred on the power fail safeguard module of the storage controller. Possible reasons include: · The supercapacitor is exhausted or has expired. · The power fail safeguard module failed to be initialized. · The power fail safeguard module subsystem failed. · The supercapacitor failed to be charged. · The battery or supercapacitor failed.
Recommended action	1. Log in to HDM, and verify that the supercapacitor of the RAID controller is in normal state. 2. Verify that the power fail safeguard module is installed correctly. 3. Replace the corresponding components, including the battery, supercapacitor, or flash card (if any), and then restart the server. 4. If the issue persists, contact Technical Support.

Battery presence detected

Event code	0x292000de
Message text	Battery presence detected---PCIe slot:$1
Variable fields	$1: PCIe slot number of the storage controller.
Severity level	Info
Example	Battery presence detected---PCIe slot:1
Impact	The reliability of the RAID controller will degrade, which might cause system performance degradation.
Cause	The battery or supercapacitor of the RAID controller is absent.
Recommended action	1. Log in to HDM, and verify that the supercapacitor of the RAID controller is in normal state. 2. Verify that the supercapacitor is installed correctly and the supercapacitor cable is connected correctly. 3. Replace the corresponding components, including the battery, supercapacitor, or flash card (if any), and then restart the server. 4. If the issue persists, contact Technical Support.

Version Change

Hardware incompatibility detected with associated Entity---Memory is not certified

Event code	0x2b2000de
Message text	Hardware incompatibility detected with associated Entity---Memory is not certified---Location:CPU:$1 CH:$2 DIMM:$3
Variable fields	$1: CPU number. $2: Channel number. $3: DIMM number.
Severity level	Minor
Example	Hardware incompatibility detected with associated Entity---Memory is not certified---Location:CPU:1 CH:1 DIMM:0
Impact	No negative impact.
Cause	This message is generated when the DIMM is not certified.
Recommended action	1. Install H3C certified DIMMs. 2. If the issue persists, contact Technical Support.
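
The location suffix in this message identifies the DIMM by CPU, channel, and DIMM number. The following Python sketch shows one way to extract the three numbers so that the uncertified DIMM can be located. The function name `parse_dimm_location` is hypothetical, and the sketch assumes that all three values are decimal integers, as in the example above.

```python
import re

# Hypothetical pattern for the "Location:CPU:$1 CH:$2 DIMM:$3" suffix.
_DIMM_RE = re.compile(r"Location:CPU:(?P<cpu>\d+) CH:(?P<channel>\d+) DIMM:(?P<dimm>\d+)\s*$")

def parse_dimm_location(message):
    """Return (cpu, channel, dimm) as integers, or None if no location is found."""
    match = _DIMM_RE.search(message)
    if match is None:
        return None
    return (int(match.group("cpu")), int(match.group("channel")), int(match.group("dimm")))
```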