02-System Management Configuration Guide

05-Process monitoring and maintenance configuration

Monitoring and maintaining processes

About monitoring and maintaining processes

The system software of the device is a full-featured, modular, and scalable network operating system based on the Linux kernel. The system software features run the following types of independent processes:

·     User process—Runs in user space. Most system software features run user processes. Each process runs in an independent space so the failure of a process does not affect other processes. The system automatically monitors user processes. The system supports preemptive multithreading. A process can run multiple threads to support multiple activities. Whether a process supports multithreading depends on the software implementation.

·     Kernel thread—Runs in kernel space. A kernel thread executes kernel code. It has a higher security level than a user process. If a kernel thread fails, the system breaks down. You can monitor the running status of kernel threads.

Restrictions and guidelines: Process monitoring and maintenance configuration

In the current software version, this switch series does not support CPU parameters.

Process monitoring and maintenance tasks at a glance

To monitor and maintain processes, perform the following tasks:

·     Monitoring and maintaining user processes and kernel threads

The commands in this section apply to both user processes and kernel threads.

·     Monitoring and maintaining user processes

The commands in this section apply only to user processes.

·     Monitoring and maintaining kernel threads

The commands in this section apply only to kernel threads.

Monitoring and maintaining user processes and kernel threads

About monitoring and maintaining user processes and kernel threads

The commands in this section apply to both user processes and kernel threads. You can use the commands for the following purposes:

·     Display the overall memory usage.

·     Display the running processes and their memory and CPU usage.

·     Locate abnormal processes.

If a process consumes excessive memory or CPU resources, the system identifies the process as an abnormal process.

·     If an abnormal process is a user process, troubleshoot the process as described in "Monitoring and maintaining user processes."

·     If an abnormal process is a kernel thread, troubleshoot the thread as described in "Monitoring and maintaining kernel threads."

Displaying memory usage of processes

To display memory usage of processes, execute the following command in any view:

display memory [ summary ]

For information about this command, see System Management Command Reference.

Displaying CPU usage of processes

To display CPU usage of processes, execute the following command in any view:

display process cpu

Monitoring process status

Execute the following commands in any view.

·     Display process status information.

display process [ all | job job-id | name process-name ]

·     Monitor process running status.

monitor process [ dumbtty ] [ iteration number ]

·     Monitor thread running status.

monitor thread [ dumbtty ] [ iteration number ]
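
As an illustration, the following sequence shows one way to narrow down an abnormal process using only the commands in this section. The <Sysname> prompt, the job ID 148, and the iteration count of 5 are placeholders chosen for this sketch, not values from a real device. Command output is omitted because it varies by device and software version.

```
# Check overall memory usage, and then check per-process CPU usage.
<Sysname> display memory summary
<Sysname> display process cpu
# Display status information for one suspect process (148 is a hypothetical job ID).
<Sysname> display process job 148
# Sample process running status five times in dumbtty mode.
<Sysname> monitor process dumbtty iteration 5
```

If the suspect turns out to be a user process, continue with the user process commands. If it is a kernel thread, continue with the kernel thread commands.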

Monitoring and maintaining user processes

About monitoring and maintaining user processes

Use this feature to monitor the operating status of user processes. If a user process is excessively busy or consumes excessive resources, use this feature to locate problems.

Configuring core dump

About this task

The core dump feature enables the system to generate a core dump file each time a process crashes until the maximum number of core dump files is reached. A core dump file stores information about the process. You can send the core dump files to H3C technical support staff to troubleshoot the problems.

Restrictions and guidelines

Core dump files consume storage resources. Enable core dump only for processes that might have problems.

Procedure

Execute the following commands in user view:

1.     (Optional.) Specify the directory for saving core dump files.

exception filepath directory

By default, the directory for saving core dump files is the root directory of the default file system. For more information about the default file system, see file system management in Fundamentals Configuration Guide.

2.     Enable core dump for a process and specify the maximum number of core dump files, or disable core dump for a process.

process core { maxcore value | off } { job job-id | name process-name }

By default, a process generates a core dump file for the first exception and does not generate any core dump files for subsequent exceptions.

Verifying the configuration

To display the core dump file directory, execute the following command in user view:

display exception filepath
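
The procedure above can be sketched as the following command sequence. This is a minimal example under assumptions: flash:/ is assumed to be the root directory of the default file system, 148 is a hypothetical job ID, and 5 is an arbitrary maximum number of core dump files. Check the command reference for the value range supported on your device.

```
# Save core dump files to the root directory of the flash file system (assumed path).
<Sysname> exception filepath flash:/
# Allow a maximum of 5 core dump files for the process with job ID 148 (hypothetical ID).
<Sysname> process core maxcore 5 job 148
# Verify the core dump file directory.
<Sysname> display exception filepath
# Disable core dump for the same process after troubleshooting is complete.
<Sysname> process core off job 148
```

Disabling core dump after troubleshooting follows the guideline that core dump files consume storage resources.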

Locating user process memory usage exceptions

Execute display commands in any view and other commands in user view.

·     Display memory usage for all user processes.

display process memory

·     Display heap memory usage for a user process.

display process memory heap job job-id [ tag [ tag-id ] | verbose ]

·     Display memory content starting from a specified memory block for a user process.

display process memory heap job job-id address starting-address length memory-length

·     Display the addresses of memory blocks with a specified size used by a user process.

display process memory heap job job-id [ tag tag-id ] size memory-size [ offset offset-size ]
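
The commands above are typically used in order, from a device-wide view down to individual memory blocks. The following sketch assumes a hypothetical job ID (148), block size (64), starting address (b7e30580), and length (128). In practice, take these values from the output of the preceding command in the sequence.

```
# Display memory usage for all user processes to find the suspect process.
<Sysname> display process memory
# Display detailed heap memory usage for the suspect process (hypothetical job ID).
<Sysname> display process memory heap job 148 verbose
# Display the addresses of the memory blocks of a given size used by the process.
<Sysname> display process memory heap job 148 size 64
# Display memory content starting from one of the addresses (hypothetical address).
<Sysname> display process memory heap job 148 address b7e30580 length 128
```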

Displaying log information for user processes

To display log information for user processes, execute the following command in any view:

display process log

Displaying context information for process exceptions

To display context information for process exceptions, execute the following command in any view:

display exception context [ count value ]

Clearing context information for process exceptions

To clear context information for process exceptions, execute the following command in user view:

reset exception context
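
For example, you might review the recorded exception context before clearing it. In this sketch, the count value of 2 is an arbitrary illustration.

```
# Display context information for two process exceptions.
<Sysname> display exception context count 2
# Clear the recorded context information after it has been collected.
<Sysname> reset exception context
```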

Monitoring and maintaining kernel threads

About monitoring and maintaining kernel threads

Use this feature to monitor the operating status of kernel threads. If a kernel thread is excessively busy or consumes excessive resources, use this feature to locate problems.

Detecting kernel thread deadloops

About this task

Kernel threads share resources. If a kernel thread monopolizes the CPU, other threads cannot run, resulting in a deadloop.

This feature enables the device to detect deadloops. If a thread occupies the CPU for longer than the configured threshold (10 seconds by default), the device determines that a deadloop has occurred and generates a deadloop message.

Restrictions and guidelines

Change kernel thread deadloop detection settings only under the guidance of H3C Support. Inappropriate configuration can cause system breakdown. As a best practice, leave the default settings unchanged.

Configuring kernel thread deadloop detection

1.     Enter system view.

system-view

2.     Enable kernel thread deadloop detection.

monitor kernel deadloop enable [ cpu cpu-number ]

By default, kernel thread deadloop detection is enabled.

3.     (Optional.) Set the threshold for identifying a kernel thread deadloop.

monitor kernel deadloop time time

By default, the threshold for identifying a kernel thread deadloop is 10 seconds.

4.     (Optional.) Exclude a kernel thread from kernel thread deadloop detection.

monitor kernel deadloop exclude-thread tid

When enabled, kernel thread deadloop detection monitors all kernel threads by default.
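
The steps above can be sketched as follows. The threshold of 20 seconds and the thread ID 15 are hypothetical values used only for illustration, and the cpu option is omitted because this switch series does not support CPU parameters. Apply such changes on a production device only under the guidance of H3C Support.

```
# Enter system view.
<Sysname> system-view
# Enable kernel thread deadloop detection (enabled by default).
[Sysname] monitor kernel deadloop enable
# Set the deadloop threshold to 20 seconds (illustrative value).
[Sysname] monitor kernel deadloop time 20
# Exclude the kernel thread with TID 15 from detection (hypothetical TID).
[Sysname] monitor kernel deadloop exclude-thread 15
```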

Displaying kernel thread deadloop detection information

Perform display tasks in any view.

·     Display kernel thread deadloop information.

display kernel deadloop show-number [ offset ] [ verbose ]

·     Display kernel thread deadloop detection configuration.

display kernel deadloop configuration

Clearing kernel thread deadloop information

To clear kernel thread deadloop information, execute the following command in user view:

reset kernel deadloop
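
As a brief illustration, the display and reset commands for deadloop detection can be combined as follows. This sketch assumes that show-number specifies how many deadloop records to display. The value 1 is an arbitrary choice.

```
# Display the current deadloop detection configuration.
<Sysname> display kernel deadloop configuration
# Display detailed information for one deadloop record.
<Sysname> display kernel deadloop 1 verbose
# Clear the deadloop information after it has been collected.
<Sysname> reset kernel deadloop
```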

Detecting kernel thread starvation

About this task

Starvation occurs when a thread is unable to access shared resources.

Kernel thread starvation detection enables the system to detect and report thread starvation. If a thread is not executed within the configured threshold (120 seconds by default), the system determines that starvation has occurred and generates a starvation message.

Thread starvation does not impact system operation. A starved thread can automatically run when certain conditions are met.

Restrictions and guidelines

CAUTION:

Configure kernel thread starvation detection only under the guidance of H3C Support. Inappropriate configuration can cause system breakdown.

Configuring kernel thread starvation detection

1.     Enter system view.

system-view

2.     Enable kernel thread starvation detection.

monitor kernel starvation enable

By default, kernel thread starvation detection is disabled.

3.     (Optional.) Set the threshold for identifying kernel thread starvation.

monitor kernel starvation time time

By default, the threshold for identifying kernel thread starvation is 120 seconds.

4.     (Optional.) Exclude a kernel thread from kernel thread starvation detection.

monitor kernel starvation exclude-thread tid

When enabled, kernel thread starvation detection monitors all kernel threads by default.
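
The steps above can be sketched as follows. The threshold of 240 seconds and the thread ID 15 are hypothetical values used only for illustration. As stated in the caution, configure this feature only under the guidance of H3C Support.

```
# Enter system view.
<Sysname> system-view
# Enable kernel thread starvation detection (disabled by default).
[Sysname] monitor kernel starvation enable
# Set the starvation threshold to 240 seconds (illustrative value).
[Sysname] monitor kernel starvation time 240
# Exclude the kernel thread with TID 15 from detection (hypothetical TID).
[Sysname] monitor kernel starvation exclude-thread 15
```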

Displaying kernel thread starvation information

Perform display tasks in any view.

·     Display kernel thread starvation detection configuration.

display kernel starvation configuration

·     Display kernel thread starvation information.

display kernel starvation show-number [ offset ] [ verbose ]

Clearing kernel thread starvation information

To clear kernel thread starvation information, execute the following command in user view:

reset kernel starvation

Displaying kernel thread exception information

To display kernel thread exception information, execute the following command in any view:

display kernel exception show-number [ offset ] [ verbose ]

Displaying kernel thread reboot information

To display kernel thread reboot information, execute the following command in any view:

display kernel reboot show-number [ offset ] [ verbose ]

Clearing kernel thread exception information

To clear kernel thread exception information, execute the following command in user view:

reset kernel exception

Clearing kernel thread reboot information

To clear kernel thread reboot information, execute the following command in user view:

reset kernel reboot
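
For example, the exception and reboot records can be reviewed and then cleared as follows. This sketch assumes that show-number specifies how many records to display. The value 1 is an arbitrary choice.

```
# Display detailed information for one kernel thread exception record.
<Sysname> display kernel exception 1 verbose
# Display detailed information for one kernel thread reboot record.
<Sysname> display kernel reboot 1 verbose
# Clear both sets of records after the information has been collected.
<Sysname> reset kernel exception
<Sysname> reset kernel reboot
```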

 
