Background
China Pacific Insurance (Group) Co., Ltd. (hereinafter referred to as "CPIC") is an insurance group formed on the basis of China Pacific Insurance Co., Ltd., which was established on May 13, 1991. Headquartered in Shanghai, it is a leading comprehensive insurance group in China and the first insurance company listed in Shanghai, Hong Kong and London.
In the digital financial era, CPIC, guided by its "Transformation 2.0" strategy, has built the CPIC cloud and has cooperated closely with H3C Group in fields such as financial cloud and infrastructure planning. Built on the cloud server, storage and network innovations provided by H3C, the CPIC cloud offers users integrated, self-service, full-stack cloud services with flexible resource management and high reliability. The first phase of the CPIC cloud covered two data centers, Chengdu M3 and Shanghai Luojing D3, with 180 ONEStor distributed storage nodes; the deployment had expanded to 494 nodes as of 2022. With large-scale services and a wide variety of applications, CPIC requires better storage scalability, reliability and technical excellence.
Challenges
- The cloud service systems are complex, ranging from the insurance production system to service systems such as development, O&M and website apps, and each has different storage requirements. They are large in scale and put constant pressure on storage capacity.
- The continuity of storage services is also important for customers. The storage system must have an enterprise-level, highly available architecture.
- The storage software must be decoupled from the hardware so that it can run on the multi-brand standard servers that the customer purchases centrally.
Solution
To ensure greater performance, CPIC cloud ONEStor groups every 15 nodes into one cluster. The 15 nodes are placed in three cabinets in the 5*3 mode. Taking the cabinet as the fault domain, the servers are distributed five per cabinet.
Divided by role: All nodes are divided into three roles: Handy * 2, MON * 7, and OSD * 15. To ensure high reliability, the nodes of each role are spread across different cabinets.
Unified O&M: ONEStor storage is connected to the CPIC O&M platform, which provides monitoring at second-level granularity and real-time reporting of system errors.
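The cabinet-as-fault-domain layout described above can be illustrated with a short sketch. It is not ONEStor's actual placement algorithm: the cabinet and node names, the `build_layout` and `place_replicas` helpers, and the choice of three copies per object are all assumptions made for illustration. The point is only that placing each copy in a different cabinet lets data survive the loss of a whole cabinet.

```python
import zlib

CABINETS = ["cab-A", "cab-B", "cab-C"]  # hypothetical cabinet names
NODES_PER_CABINET = 5                   # 5 nodes x 3 cabinets = 15 nodes


def build_layout():
    """Return {node_name: cabinet} for one 15-node cluster."""
    layout = {}
    for c_idx, cabinet in enumerate(CABINETS):
        for n in range(NODES_PER_CABINET):
            node_id = c_idx * NODES_PER_CABINET + n + 1
            layout[f"node-{node_id:02d}"] = cabinet
    return layout


def place_replicas(object_key, layout, copies=3):
    """Pick `copies` nodes so that no two copies share a cabinet.

    The replica count is an assumption; the source does not state it.
    """
    by_cabinet = {}
    for node, cabinet in sorted(layout.items()):
        by_cabinet.setdefault(cabinet, []).append(node)
    cabinets = sorted(by_cabinet)
    if copies > len(cabinets):
        raise ValueError("more copies requested than fault domains")

    # A stable hash keeps the placement deterministic for a given key.
    h = zlib.crc32(object_key.encode("utf-8"))
    start = h % len(cabinets)
    chosen = []
    for i in range(copies):
        nodes = by_cabinet[cabinets[(start + i) % len(cabinets)]]
        chosen.append(nodes[(h // len(cabinets)) % len(nodes)])
    return chosen


if __name__ == "__main__":
    layout = build_layout()
    for key in ("policy-001", "claim-042"):
        print(key, place_replicas(key, layout))
```

Because every copy lands in a different cabinet, powering off any single cabinet in this sketch still leaves two copies of each object reachable.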
Benefits
- Self-developed SCache cache acceleration algorithm to meet the performance requirements of multi-service storage:
ONEStor's self-developed SCache cache acceleration algorithm provides high-concurrency and high-performance storage services to meet the storage requirements of production services such as auto insurance and property insurance.
The cache algorithm is crucial to improving the performance of the SSD cache disk. Based on cache usage and foreground I/O load, the SCache algorithm dynamically adjusts the watermark and flush rate. It also provides a more intelligent mechanism for separating cold and hot data, which improves the cache hit ratio.
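To make the idea of load-aware flushing concrete, here is a minimal sketch. The SCache internals are not public in this text, so this is not the SCache algorithm: the `flush_rate` function, its thresholds, the linear scaling, and the 400 MB/s ceiling are all illustrative assumptions. It only shows how a flush rate can depend on both cache usage and foreground I/O load.

```python
def flush_rate(cache_usage, fg_load, max_rate_mbps=400.0,
               low_wm=0.3, high_wm=0.8):
    """Return a background flush rate in MB/s (hypothetical policy).

    cache_usage: fraction of the SSD cache holding dirty data, 0.0 to 1.0
    fg_load:     normalized foreground I/O load, 0.0 (idle) to 1.0 (saturated)
    """
    if not (0.0 <= cache_usage <= 1.0 and 0.0 <= fg_load <= 1.0):
        raise ValueError("cache_usage and fg_load must be within [0, 1]")

    # Above the high watermark the cache is close to full, so flush at
    # full speed regardless of foreground load.
    if cache_usage >= high_wm:
        return max_rate_mbps

    # A busy foreground raises the effective low watermark, so flushing
    # starts later and competes less with user I/O.
    effective_low = low_wm + (high_wm - low_wm) * 0.5 * fg_load
    if cache_usage <= effective_low:
        return 0.0

    # Between the watermarks, scale with cache pressure and hold back
    # part of the bandwidth in proportion to foreground load.
    pressure = (cache_usage - effective_low) / (high_wm - effective_low)
    return max_rate_mbps * pressure * (1.0 - 0.5 * fg_load)


if __name__ == "__main__":
    for usage, load in [(0.2, 0.0), (0.6, 0.0), (0.6, 1.0), (0.9, 1.0)]:
        print(f"usage={usage:.1f} load={load:.1f} -> "
              f"{flush_rate(usage, load):.1f} MB/s")
```

In this sketch the same cache usage produces a lower flush rate when the foreground is busy, while a nearly full cache always flushes at the maximum rate to avoid stalling writes.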
- Cluster fail-slow management mechanism to identify and isolate hardware risks in advance:
As a storage cluster grows, hardware faults become more frequent. The key to building CPIC cloud storage is to handle hardware faults effectively and tolerate them automatically.
ONEStor provides hard disk fail-slow detection (slow disk I/O access, disk I/O errors, buffer I/O errors), network fail-slow detection (excessive latency, jitter, packet loss or malformed packets), and other intelligent management mechanisms for excessive node CPU and memory usage. These mechanisms sense hardware faults in advance and trigger repair or isolation measures to keep storage services highly available.
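One common way to spot a fail-slow disk is to compare each disk's recent latency with that of its peers. The sketch below illustrates that general approach; it is not ONEStor's detection logic, and the `find_slow_disks` function, the `osd.N` identifiers, the 3x threshold, and the minimum sample count are assumptions chosen for the example.

```python
from statistics import median


def find_slow_disks(latency_ms, factor=3.0, min_samples=5):
    """Return the IDs of disks whose latency is far above the peer baseline.

    latency_ms: {disk_id: [recent I/O latencies in milliseconds]}
    A disk is flagged when its median latency exceeds `factor` times the
    median of all per-disk medians. Disks with too few samples are skipped.
    """
    per_disk = {
        disk: median(samples)
        for disk, samples in latency_ms.items()
        if len(samples) >= min_samples
    }
    # With fewer than three disks there is no meaningful peer baseline.
    if len(per_disk) < 3:
        return []

    baseline = median(per_disk.values())
    return sorted(d for d, m in per_disk.items() if m > factor * baseline)


if __name__ == "__main__":
    samples = {
        "osd.0": [2.0, 2.1, 1.9, 2.2, 2.0],
        "osd.1": [1.8, 2.0, 2.1, 1.9, 2.3],
        "osd.2": [2.2, 2.4, 2.0, 2.1, 2.2],
        "osd.3": [35.0, 40.0, 38.0, 42.0, 39.0],  # degraded but still answering
    }
    print(find_slow_disks(samples))
```

Using medians rather than means keeps a single latency spike from flagging a healthy disk, and comparing against peers avoids hard-coding an absolute latency limit for every hardware model. A flagged disk would then be a candidate for the repair or isolation measures described above.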
- Decoupling hardware from software to avoid hardware lock-in:
The ONEStor distributed storage software is compatible with the mainstream server brands in the industry. As the CPIC cloud expands, hardware models can be selected flexibly while maintaining storage performance and reliability, which avoids lock-in to a single hardware manufacturer.
- Online upgrade without service interruption:
As the data foundation of the CPIC cloud, the storage system can trigger a domino effect whenever it changes. If the storage has to be shut down for an upgrade, multiple departments must coordinate, which costs time and labor.
ONEStor provides an online upgrade function and supports parallel node upgrades, which shortens cluster upgrade time and reduces the workload of IT O&M personnel.