12-Network Management and Monitoring Configuration Guide

09-Process monitoring and maintenance configuration

Monitoring and maintaining processes

The system software of the device is a full-featured, modular, and scalable network operating system based on the Linux kernel. The system software features run the following types of independent processes:

·     User process—Runs in user space. Most system software features run user processes. Each process runs in an independent space so the failure of a process does not affect other processes. The system automatically monitors user processes. The system supports preemptive multithreading. A process can run multiple threads to support multiple activities. Whether a process supports multithreading depends on the software implementation.

·     Kernel thread—Runs in kernel space. A kernel thread executes kernel code. It has a higher security level than a user process. If a kernel thread fails, the system breaks down. You can monitor the running status of kernel threads.

Displaying and maintaining processes

Commands described in this section apply to both user processes and kernel threads. You can execute these commands in any view.

The system identifies a process that consumes excessive memory or CPU resources as an anomaly source.

(In standalone mode.) To display and maintain processes:

| Task | Command |
| --- | --- |
| Display memory usage. | display memory [ slot slot-number [ cpu cpu-number ] ] |
| Display process state information. | display process [ all \| job job-id \| name process-name ] [ slot slot-number [ cpu cpu-number ] ] |
| Display CPU usage for all processes. | display process cpu [ slot slot-number [ cpu cpu-number ] ] |
| Monitor process running state. | monitor process [ dumbtty ] [ iteration number ] [ slot slot-number [ cpu cpu-number ] ] |
| Monitor thread running state. | monitor thread [ dumbtty ] [ iteration number ] [ slot slot-number [ cpu cpu-number ] ] |

For more information about the display memory [ slot slot-number ] command, see Fundamentals Command Reference.
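As an illustration of the standalone-mode syntax above, the following session checks resource usage on one card. The device name Sysname and slot number 1 are placeholder assumptions; substitute values that exist on your device.

```text
<Sysname> display memory slot 1
<Sysname> display process all slot 1
<Sysname> display process cpu slot 1
<Sysname> monitor process dumbtty iteration 3 slot 1
```

The last command combines the optional dumbtty and iteration keywords from the syntax above. The iteration count of 3 is an arbitrary example value.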

(In IRF mode.) To display and maintain processes:

| Task | Command |
| --- | --- |
| Display memory usage. | display memory [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display process state information. | display process [ all \| job job-id \| name process-name ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display CPU usage for all processes. | display process cpu [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Monitor process running state. | monitor process [ dumbtty ] [ iteration number ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Monitor thread running state. | monitor thread [ dumbtty ] [ iteration number ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |

For more information about the display memory [ chassis chassis-number slot slot-number ] command, see Fundamentals Command Reference.
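In IRF mode, the same tasks take a chassis number before the slot number. The following sketch assumes a member device with chassis number 1 and a card in slot 2; both values and the device name Sysname are placeholders.

```text
<Sysname> display memory chassis 1 slot 2
<Sysname> display process cpu chassis 1 slot 2
<Sysname> monitor thread dumbtty iteration 1 chassis 1 slot 2
```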

Displaying and maintaining user processes

(In standalone mode.) Execute display commands in any view and other commands in user view.

| Task | Command | Remarks |
| --- | --- | --- |
| Display log information for all user processes. | display process log [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Display memory usage for all user processes. | display process memory [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Display heap memory usage for a user process. | display process memory heap job job-id [ verbose ] [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Display the addresses of memory blocks with a specified size used by a user process. | display process memory heap job job-id size memory-size [ offset offset-size ] [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Display memory content starting from a specified memory block for a user process. | display process memory heap job job-id address starting-address length memory-length [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Display context information for process exceptions. | display exception context [ count value ] [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Display the core file directory. | display exception filepath [ slot slot-number [ cpu cpu-number ] ] | N/A |
| Enable or disable a process to generate core files for exceptions and set the maximum number of core files. | process core { maxcore value \| off } { job job-id \| name process-name } [ slot slot-number [ cpu cpu-number ] ] | By default, a process generates a core file for the first exception and does not generate any core files for subsequent exceptions. |
| Specify the directory for saving core files. | exception filepath directory | The default directory is the root directory of the flash: file system on the active MPU. This command is supported only on the default MDC. |
| Clear context information for process exceptions. | reset exception context [ slot slot-number [ cpu cpu-number ] ] | N/A |
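The following standalone-mode session sketches how the user process commands above might be combined when investigating one process. The job ID 148, slot number 1, core file limit 5, and device name Sysname are all hypothetical; the limit is assumed to be within the range the device accepts.

```text
<Sysname> display process memory slot 1
<Sysname> display process memory heap job 148 verbose slot 1
<Sysname> process core maxcore 5 job 148 slot 1
<Sysname> display exception filepath slot 1
<Sysname> display exception context count 2 slot 1
<Sysname> reset exception context slot 1
```

Raising the core file limit lets the process keep generating core files after its first exception, which is the point at which the default behavior stops.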

(In IRF mode.) Execute display commands in any view and other commands in user view.

| Task | Command | Remarks |
| --- | --- | --- |
| Display log information for all user processes. | display process log [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Display memory usage for all user processes. | display process memory [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Display heap memory usage for a user process. | display process memory heap job job-id [ verbose ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Display the addresses of memory blocks with a specified size used by a user process. | display process memory heap job job-id size memory-size [ offset offset-size ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Display memory content starting from a specified memory block for a user process. | display process memory heap job job-id address starting-address length memory-length [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Display context information for process exceptions. | display exception context [ count value ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Display the core file directory. | display exception filepath [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |
| Enable or disable a process to generate core files for exceptions and set the maximum number of core files (which defaults to 1). | process core { maxcore value \| off } { job job-id \| name process-name } [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | By default, a process generates a core file for the first exception and does not generate any core files for subsequent exceptions. |
| Specify the directory for saving core files. | exception filepath directory | The default directory is the root directory of the flash: file system on the global active MPU. This command is supported only on the default MDC. |
| Clear context information for process exceptions. | reset exception context [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | N/A |

Monitoring kernel threads

This feature is supported only on the default MDC.

Tasks in this section help you quickly identify thread deadloop and starvation problems and their causes.

Configuring kernel thread deadloop detection

CAUTION:

As a best practice, use the default settings. Inappropriate configuration of kernel thread deadloop detection can cause service problems or system breakdown. Make sure you understand the impact of this configuration on your network before you configure kernel thread deadloop detection.

 

Kernel threads share resources. If a kernel thread monopolizes the CPU, other threads cannot run, resulting in a deadloop.

This feature enables the device to detect deadloops. If a thread occupies the CPU for a specific interval, the device considers that a deadloop has occurred and takes the specified deadloop protection action.
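As a sketch only, the following standalone-mode session shows how the deadloop detection commands in this section fit together. The caution above recommends keeping the default settings, so treat every value here as a hypothetical placeholder: slot number 1, an interval of 30 seconds, thread ID 15, and the device name Sysname. The interval and thread ID are assumed to fall within the ranges the device accepts.

```text
<Sysname> system-view
[Sysname] monitor kernel deadloop enable slot 1
[Sysname] monitor kernel deadloop time 30 slot 1
[Sysname] monitor kernel deadloop exclude-thread 15 slot 1
[Sysname] monitor kernel deadloop action record-only slot 1
```

In this sketch, a kernel thread other than thread 15 that occupies the CPU on slot 1 for 30 seconds would be treated as a deadloop, and the record-only action would log the event rather than reboot.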

(In standalone mode.) To configure kernel thread deadloop detection:

| Step | Command | Remarks |
| --- | --- | --- |
| 1. Enter system view. | system-view | N/A |
| 2. Enable kernel thread deadloop detection. | monitor kernel deadloop enable [ slot slot-number [ cpu cpu-number [ core core-number&<1-64> ] ] ] | By default, kernel thread deadloop detection is enabled. |
| 3. (Optional.) Set the interval for identifying a kernel thread deadloop. | monitor kernel deadloop time time [ slot slot-number [ cpu cpu-number ] ] | The default is 45 seconds. IMPORTANT: The undo monitor kernel deadloop time command sets the interval to 20 seconds. |
| 4. (Optional.) Disable kernel thread deadloop detection for a kernel thread. | monitor kernel deadloop exclude-thread tid [ slot slot-number [ cpu cpu-number ] ] | After it is enabled, kernel thread deadloop detection monitors all kernel threads by default. |
| 5. (Optional.) Specify the action to be taken in response to a kernel thread deadloop. | monitor kernel deadloop action { reboot \| record-only } [ slot slot-number [ cpu cpu-number ] ] | The default action is to log the kernel thread deadloop event. |

(In IRF mode.) To configure kernel thread deadloop detection:

| Step | Command | Remarks |
| --- | --- | --- |
| 1. Enter system view. | system-view | N/A |
| 2. Enable kernel thread deadloop detection. | monitor kernel deadloop enable [ chassis chassis-number slot slot-number [ cpu cpu-number [ core core-number&<1-64> ] ] ] | By default, kernel thread deadloop detection is enabled. |
| 3. (Optional.) Set the interval for identifying a kernel thread deadloop. | monitor kernel deadloop time time [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | The default is 45 seconds. IMPORTANT: The undo monitor kernel deadloop time command sets the interval to 20 seconds. |
| 4. (Optional.) Disable kernel thread deadloop detection for a kernel thread. | monitor kernel deadloop exclude-thread tid [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | After it is enabled, kernel thread deadloop detection monitors all kernel threads by default. |
| 5. (Optional.) Specify the action to be taken in response to a kernel thread deadloop. | monitor kernel deadloop action { reboot \| record-only } [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | The default action is to log the kernel thread deadloop event. |

Configuring kernel thread starvation detection

CAUTION:

The system detects whether or not kernel thread starvation occurs after the device is powered up. Inappropriate configuration of kernel thread starvation detection can cause service problems or system breakdown. Make sure you understand the impact of this configuration on your network before you configure kernel thread starvation detection.

 

Starvation occurs when a thread is unable to access shared resources.

Kernel thread starvation detection enables the system to detect and report thread starvation. If a thread is not executed within a specific interval, the system considers that a starvation has occurred, and generates a starvation message.

Thread starvation does not impact system operation. A starved thread can automatically run when certain conditions are met.
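The following standalone-mode session is a minimal sketch of the starvation detection commands in this section. Slot number 1, the 90-second interval, thread ID 15, and the device name Sysname are hypothetical values chosen for illustration, and the interval and thread ID are assumed to be within the ranges the device accepts.

```text
<Sysname> system-view
[Sysname] monitor kernel starvation enable slot 1
[Sysname] monitor kernel starvation time 90 slot 1
[Sysname] monitor kernel starvation exclude-thread 15 slot 1
```

With these settings, a kernel thread on slot 1 other than thread 15 that is not executed within 90 seconds would cause the system to generate a starvation message.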

(In standalone mode.) To configure kernel thread starvation detection:

| Step | Command | Remarks |
| --- | --- | --- |
| 1. Enter system view. | system-view | N/A |
| 2. Enable kernel thread starvation detection. | monitor kernel starvation enable [ slot slot-number [ cpu cpu-number ] ] | By default, the function is disabled. |
| 3. (Optional.) Set the interval for identifying a kernel thread starvation. | monitor kernel starvation time time [ slot slot-number [ cpu cpu-number ] ] | The default is 120 seconds. |
| 4. (Optional.) Disable kernel thread starvation detection for a kernel thread. | monitor kernel starvation exclude-thread tid [ slot slot-number [ cpu cpu-number ] ] | After it is enabled, kernel thread starvation detection monitors all kernel threads by default. |

(In IRF mode.) To configure kernel thread starvation detection:

| Step | Command | Remarks |
| --- | --- | --- |
| 1. Enter system view. | system-view | N/A |
| 2. Enable kernel thread starvation detection. | monitor kernel starvation enable [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | By default, the function is disabled. |
| 3. (Optional.) Set the interval for identifying a kernel thread starvation. | monitor kernel starvation time time [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | The default is 120 seconds. |
| 4. (Optional.) Disable kernel thread starvation detection for a kernel thread. | monitor kernel starvation exclude-thread tid [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] | After it is enabled, kernel thread starvation detection monitors all kernel threads by default. |

Displaying and maintaining kernel threads

(In standalone mode.) Execute display commands in any view and reset commands in user view.

| Task | Command |
| --- | --- |
| Display kernel thread deadloop information. | display kernel deadloop show-number [ offset ] [ verbose ] [ slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread deadloop detection configuration. | display kernel deadloop configuration [ slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread exception information. | display kernel exception show-number [ offset ] [ verbose ] [ slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread reboot information. | display kernel reboot show-number [ offset ] [ verbose ] [ slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread starvation information. | display kernel starvation show-number [ offset ] [ verbose ] [ slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread starvation detection configuration. | display kernel starvation configuration [ slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread deadloop information. | reset kernel deadloop [ slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread exception information. | reset kernel exception [ slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread reboot information. | reset kernel reboot [ slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread starvation information. | reset kernel starvation [ slot slot-number [ cpu cpu-number ] ] |
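As a short illustration of the standalone-mode commands above, the following session reviews the detection configuration, shows the most recent records, and then clears the deadloop records. Slot number 1, the show-number values 1 and 5, and the device name Sysname are placeholder assumptions.

```text
<Sysname> display kernel deadloop configuration slot 1
<Sysname> display kernel deadloop 1 verbose slot 1
<Sysname> display kernel starvation 5 slot 1
<Sysname> reset kernel deadloop slot 1
```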

(In IRF mode.) Execute display commands in any view and reset commands in user view.

| Task | Command |
| --- | --- |
| Display kernel thread deadloop information. | display kernel deadloop show-number [ offset ] [ verbose ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread deadloop detection configuration. | display kernel deadloop configuration [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread exception information. | display kernel exception show-number [ offset ] [ verbose ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread reboot information. | display kernel reboot show-number [ offset ] [ verbose ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread starvation information. | display kernel starvation show-number [ offset ] [ verbose ] [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Display kernel thread starvation detection configuration. | display kernel starvation configuration [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread deadloop information. | reset kernel deadloop [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread exception information. | reset kernel exception [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread reboot information. | reset kernel reboot [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |
| Clear kernel thread starvation information. | reset kernel starvation [ chassis chassis-number slot slot-number [ cpu cpu-number ] ] |

 
