
REDnote Partners With H3C to Deploy the First DDC Architecture AI Computing Network Cluster Globally
As a leading community platform in the industry, REDnote has long been committed to the innovation and application of AI technology. The platform has deeply integrated AIGC technology into content recommendation and intelligent creation to continuously enhance user experience, and since 2023 it has actively pursued high-performance networking solutions. Balancing technological advancement with versatility, it promotes the large-scale deployment of efficient AI infrastructure. To address the new challenges that large-model development poses for computing networks, REDnote collaborated with H3C to complete the large-scale validation of an intelligent computing network based on the DDC architecture, achieving the first cluster deployment of its kind globally.
With the rapid development of large models such as DeepSeek, high-performance AI computing networks are currently facing unprecedented performance challenges. Traditional network architectures are increasingly revealing critical bottlenecks when handling AI models based on novel architectures like MoE:
First, the share of inter-machine communication has surged, raising the risk of network congestion. Under the MoE architecture, inter-machine communication is expected to account for up to 50% of total traffic. This high bandwidth occupancy makes network congestion a key bottleneck for training efficiency and significantly increases the complexity and cost of jointly optimizing the end hosts and the network.
Second, there is a dual challenge of low latency and high throughput. Dynamic traffic patterns are extremely sensitive to network latency, where any additional delay can affect the collaborative efficiency between GPUs. At the same time, ultra-high throughput must be guaranteed to avoid idle computing resources.
Third, dynamic traffic patterns render traditional tuning strategies ineffective. Unlike the periodic fixed traffic of traditional PP/DP (pipeline parallelism/data parallelism), the All-to-All communication pattern of MoE models is highly dynamic, leading to uneven network load. Traditional static tuning solutions struggle to adapt to this variability.
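To make the contrast with fixed, periodic traffic concrete, the following is a minimal sketch of why MoE routing produces an uneven and shifting load. It is only an illustration: the function `route_step`, the randomly redrawn "gate" weights, and all sizes are hypothetical stand-ins, not taken from any real model or from the deployment described here.

```python
import random

def route_step(num_tokens, num_experts, top_k, rng):
    """Return per-expert token counts for one training step."""
    # Hypothetical gate: expert popularity is redrawn every step,
    # standing in for a learned router whose choices depend on the data.
    weights = [0.05 + rng.random() for _ in range(num_experts)]
    load = [0] * num_experts
    for _ in range(num_tokens):
        chosen = set()
        while len(chosen) < top_k:
            chosen.add(rng.choices(range(num_experts), weights=weights)[0])
        for expert in chosen:
            load[expert] += 1
    return load

rng = random.Random(0)
for step in range(3):
    load = route_step(num_tokens=1000, num_experts=8, top_k=2, rng=rng)
    print(step, load, "max:", max(load), "min:", min(load))
```

If each expert sits on a different machine, the per-expert counts translate directly into per-destination traffic, so the busiest destination changes from step to step and a statically tuned network cannot anticipate it.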
To address these challenges, H3C, leveraging years of technical expertise, has redefined the DDC architecture (Diversified Dynamic-Connectivity) and introduced a new-generation lossless network solution. This solution effectively resolves the performance bottlenecks of traditional Ethernet networks in supporting large model training through distributed architecture design, promoting better adaptation of Ethernet technology to the demands of AI computing networks.
The DDC architecture adopts a distributed design, achieving 100% network load balancing through cell switching technology and significantly improving bandwidth utilization between the NCP (Network Cloud Packet, the access units) and NCF (Network Cloud Fabric, the fabric units) devices. Combined with VOQ+Credit intelligent traffic scheduling (virtual output queues with credit-based flow control), it not only ensures non-blocking forwarding within the training cluster network but also meets the stringent low-latency and zero-packet-loss requirements of large model training and inference workloads.
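The load-balancing benefit of cell switching can be illustrated with a small comparison against per-flow hashing. This is a simplified sketch under stated assumptions, not H3C's implementation: the 256-byte cell size, the CRC32 hash, the round-robin placement, and the flow names are all illustrative choices, and a real fabric must also reassemble cells in order at the egress side.

```python
import zlib

CELL_BYTES = 256  # assumed cell size, for illustration only

def hash_per_flow(flows, num_links):
    """ECMP-style: every byte of a flow follows the link chosen by a hash of its id."""
    load = [0] * num_links
    for flow_id, size in flows:
        link = zlib.crc32(flow_id.encode()) % num_links
        load[link] += size
    return load

def spray_cells(flows, num_links):
    """Cell-style: cut each flow into fixed-size cells and place them round-robin."""
    load = [0] * num_links
    next_link = 0
    for _, size in flows:
        remaining = size
        while remaining > 0:
            cell = min(CELL_BYTES, remaining)
            load[next_link] += cell
            next_link = (next_link + 1) % num_links
            remaining -= cell
    return load

# Three flows over four uplinks: two large "elephant" flows and one small one.
flows = [("gpu0->gpu9", 1 << 20), ("gpu1->gpu17", 1 << 20), ("gpu2->gpu5", 1 << 16)]
print("per-flow hash:", hash_per_flow(flows, 4))
print("cell spraying:", spray_cells(flows, 4))
```

With three flows and four links, per-flow hashing necessarily leaves at least one link idle, while spraying spreads the same bytes almost evenly regardless of how many flows there are or how large they are.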
The revolutionary aspect of the DDC architecture lies in integrating cell switching with the Ethernet protocol for the first time, building Ethernet-native global scheduling capabilities and fully decoupling the network from the end hosts. It is compatible with mainstream GPUs in the industry and also accommodates the trend toward domestically produced GPUs, allowing them to deliver their full performance. Moreover, the network itself requires no manual parameter tuning; the only adjustments needed are to parameters related to host-side collective communication testing, which significantly reduces the difficulty of deployment, operation, and maintenance.
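The idea behind credit-based scheduling with virtual output queues can be sketched as a toy simulation. It assumes a single egress port, a fixed credit budget per tick, and round-robin granting; the function name `run_credit_scheduler` and these policies are hypothetical simplifications rather than the scheduler actually used in the DDC products.

```python
def run_credit_scheduler(backlogs, egress_rate, ticks):
    """Simulate one egress port granting credits to ingress VOQs.

    backlogs: cells waiting in each ingress VOQ for this egress port.
    egress_rate: cells the egress port can drain per tick (its credit budget).
    Returns (cells delivered in each tick, remaining backlog per VOQ).
    """
    voqs = list(backlogs)
    delivered_per_tick = []
    pointer = 0  # round-robin position, kept across ticks for fairness
    for _ in range(ticks):
        credits = egress_rate
        sent = 0
        # Grant one credit at a time, only to VOQs that still hold cells.
        while credits > 0 and any(voqs):
            idx = pointer % len(voqs)
            pointer += 1
            if voqs[idx] > 0:
                voqs[idx] -= 1   # ingress sends exactly one cell per credit
                credits -= 1
                sent += 1
        delivered_per_tick.append(sent)
    return delivered_per_tick, voqs

delivered, remaining = run_credit_scheduler([5, 0, 3], egress_rate=2, ticks=5)
print(delivered, remaining)
```

Because an ingress queue transmits only after receiving a credit, the egress port never receives more than it can drain in a tick, so excess traffic waits at the ingress instead of being dropped inside the fabric. This is why such a design does not depend on host-side congestion-control tuning to stay lossless.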
Additionally, the architecture fully supports mainstream collective communications such as All-Reduce and All-to-All, providing stable and efficient network support for large models of different architectures, like Dense and MoE. Its open design ensures forward-looking compatibility with emerging AI training paradigms.
Rigorous Selection and Joint Validation
Given the characteristics of REDnote's intelligent computing business, such as high concurrency, large traffic volume, and complex traffic patterns, the technical teams of both parties engaged in multiple in-depth discussions. REDnote set clear requirements for the technical solution: first, it must achieve ultimate load balancing; second, it must eliminate the complex configuration and maintenance burdens of traditional lossless networks; third, the architecture design must possess good scalability to support future business development; and finally, rigorous testing must be conducted to ensure the solution's stability and reliability.
REDnote’s infrastructure network team and H3C have maintained long-term close cooperation in the field of high-performance networking. Centering around REDnote's high-performance intelligent computing network needs, they conducted multiple rounds of technical solution validation. After comprehensively comparing solutions from multiple vendors, REDnote ultimately selected H3C's DDC solution as the foundation for building its intelligent computing network.
In-Depth Acceptance and Efficient Delivery
During the acceptance testing phase, the two teams collaborated closely. As this was the first DDC cluster deployment project globally, there were no existing acceptance standards to reference. The team started from traditional RoCE network acceptance plans and combined them with the technical characteristics of DDC's lossless network implementation to jointly develop a targeted acceptance system. By systematically adjusting parameters such as collective communication libraries, QP, ECN, PFC ratio, Headroom, and PXN, they comprehensively validated the network performance of the DDC cluster. They also conducted in-depth testing of system redundancy based on real business scenarios, efficiently completing delivery validation and disaster recovery verification for the DDC cluster and clearing the final obstacles to its official launch and use for production workloads.
REDnote had extremely high requirements for project delivery efficiency. Before the project started, it coordinated repeatedly with H3C's supply chain and ultimately secured early delivery of the equipment. The first batch of equipment was racked and given its basic configuration on the day it arrived. To ensure the DDC cluster went online on schedule, the on-site team coordinated resources to prioritize debugging of the first batch. This not only produced standard process documentation for subsequent equipment deliveries but also significantly accelerated the overall launch. Throughout the delivery process, the two teams worked closely together, promptly resolved on-site issues, and completed all deployment work on time, in full, and to the required quality.
Test Plan and Data Results
Test Environment Configuration
The online testing was based on the deployed DDC intelligent computing network cluster. The training network validation environment used 2 NCF switches, 8 NCP switches, and 30 GPU servers. Each GPU server was configured with 8 GPU cards, connected to the 8 NCPs respectively. Each NCP was connected to two NCFs, ensuring an equal number of connections between each NCP and each NCF, forming the DDC cluster.
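The scale of this test environment can be checked with a short sketch that enumerates the connections described above. The device counts come from the text; the index-based mapping of GPU i on every server to NCP i is an assumption about how "respectively" is realized, and the NCP-to-NCF entries are logical adjacencies only, since the number of physical links per pair is not stated.

```python
from collections import Counter

NUM_NCF = 2
NUM_NCP = 8
NUM_SERVERS = 30
GPUS_PER_SERVER = 8

def build_topology():
    """Return (GPU-to-NCP links, NCP-to-NCF adjacencies) for the test cluster."""
    # Assumed mapping: GPU i of every server attaches to NCP i.
    gpu_links = [
        (server, gpu, gpu)  # (server index, GPU index, NCP index)
        for server in range(NUM_SERVERS)
        for gpu in range(GPUS_PER_SERVER)
    ]
    # Every NCP connects to both NCFs; the link count per pair is unspecified.
    fabric_pairs = [(ncp, ncf) for ncp in range(NUM_NCP) for ncf in range(NUM_NCF)]
    return gpu_links, fabric_pairs

gpu_links, fabric_pairs = build_topology()
downlinks_per_ncp = Counter(ncp for _, _, ncp in gpu_links)
print("GPUs in cluster:", len(gpu_links))
print("GPU ports per NCP:", dict(downlinks_per_ncp))
print("NCP-NCF adjacencies:", len(fabric_pairs))
```

Under these assumptions the cluster holds 30 × 8 = 240 GPUs, each NCP terminates 30 GPU-facing ports (one per server), and there are 8 × 2 = 16 NCP-to-NCF adjacencies.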
[Figure 1 Test Network Architecture Diagram]
Main Test Content
- RDMA baseline performance testing: Covering bandwidth and latency indicators.
- All-to-All and All-Reduce collective communication testing: Focusing on NCCL performance benchmarks.
- System disaster recovery testing: Validating fault response capabilities.
[Figure 2 GPU Single-Card Throughput vs. Data Size Diagram]
Analysis of Test Results
The tests showed that, with cell spraying load balancing and the VOQ+Credit traffic scheduling mechanism, the DDC architecture significantly improved network utilization and effectively avoided the latency and jitter caused by network congestion. In terms of performance, single-GPU throughput reached up to 381.83 Gbps in the All-to-All scenario and up to 385.98 Gbps in the All-Reduce scenario. Furthermore, the architecture responded quickly to various hardware failure scenarios (local and remote) and intelligently scheduled network bandwidth resources. While achieving these performance improvements, the architecture also offered plug-and-play capability and met its design goal of "no in-network parameter tuning," simplifying operations and maintenance.
These test results prove that the H3C S12500AI series intelligent computing switches based on the DDC architecture demonstrated exceptional value in practical deployment. They not only significantly enhanced the load capacity of large-scale intelligent computing networks and effectively reduced model training time but also helped REDnote deepen the integration of AI and its content ecosystem, embedding large model technology into every user note and every search. In the future, REDnote's technical team will continue to deepen cooperation with H3C, exploring areas such as content recommendation algorithm optimization, intelligent creation tool development, and real-time data analysis based on the AI acceleration capabilities of the DDC architecture, continuously improving user experience and creation efficiency.
[Figure 3 H3C S12500AI Series Intelligent Computing Switch]
Regarding this project, Cheng Junfeng, Head of Infrastructure Network at REDnote, stated: "We always adhere to network openness and will continue to explore high-performance networking solutions based on open Ethernet. This joint testing with H3C on the intelligent computing network solution based on the DDC architecture not only verified the technical feasibility of the new-generation network architecture but also laid a solid foundation for REDnote's subsequent innovative research on large model training network optimization. This solution strikes a good balance between advancement and universality, providing the industry with a new network choice that is high-performance, low-cost, and easy to deploy."
Qiao Yan, Senior Vice President of H3C Group and President of the Network Product Line, also stated: "We are very pleased to collaborate with REDnote on the validation testing of the DDC architecture. H3C has always been committed to innovative breakthroughs in intelligent computing network technology. The DDC architecture is our revolutionary network solution for the era of AI large models. The test results fully demonstrate the comprehensive advantages of DDC in terms of performance, tuning-free operation, and operational costs, providing a new option for the construction of large-scale intelligent computing centers. We look forward to continuing to deepen our cooperation with REDnote, jointly promoting innovation and development in AI infrastructure, and contributing to the prosperity of the large model ecosystem."
As large model technology continues to develop, such innovative architectures will become an important support for AI infrastructure, aiding the construction of the domestic large model ecosystem.