
Intelligent Computing | The DeepSeek Catalyst: Competing in the Age of AI Inference
The emergence of DeepSeek is injecting vitality into the tech industry and profoundly reshaping the industrial landscape and daily life. As an open-source and highly efficient AI model, DeepSeek has not only significantly reduced the cost of training and inference but has also driven the widespread adoption and application of AI technology. However, this technological breakthrough has not diminished the demand for computing power. Instead, it has further highlighted the rigid need for more accessible computing resources amid new application scenarios and a thriving ecosystem. This demand places greater emphasis on flexibility and scalability, prompting the industry to shift its strategy from a "computing power race" to an "inference-centric" approach. How enterprises can find their development path in this transformation has become a critical issue in today's AI industry.
Large-Scale Computing Power: Still a Rigid Demand for Enterprises
Although DeepSeek has significantly reduced the cost of individual training and inference sessions through open-source initiatives and technological innovation, the overall demand for computing power continues to grow with the proliferation of AI applications and the emergence of super-large models in multimodal and complex inference scenarios. According to IDC data, the global AI server market size reached $125.1 billion in 2024, is expected to increase to $158.7 billion in 2025, and is projected to exceed $222.7 billion by 2028.
Shift in Enterprise AI Focus: From Training-Centric to Inference-Centric
In the past, global AI competition primarily focused on the training phase. Although the importance of inference was widely recognized, its lengthy process and uncertainty made enterprises hesitant to invest. The emergence of DeepSeek has changed this landscape, rapidly shifting the industry's competitive focus to the inference domain.
In fact, both training and inference of large models are crucial: training forms the foundation of large models, determining their capability ceiling, while inference is key to applying models in real-world scenarios. The speed and accuracy of inference directly determine a model's performance in practical applications and user experience. According to IDC predictions, inference computing power is poised for explosive growth. In 2025, the workload share of inference in China will reach 67%, and by 2028, it will further increase to 73%.
The emergence of DeepSeek marks a shift in the AI industry from "stacking computing power" to algorithmic optimization, with the importance of the inference phase becoming increasingly prominent. It significantly lowers the barrier to training and further reduces inference costs through techniques like model quantization and distillation, enabling small and medium-sized enterprises (SMEs) and developers to access large models at low costs and promoting the application of AI technology in more fields. In the future, as AI applications become more widespread, inference costs will account for the vast majority of overall AI investments, prompting enterprises to pay greater attention to performance optimization and cost control on the inference side. Meanwhile, enterprises in vertical industries will shift resources from training to improving inference computing power and scenario development, driving AI applications toward greater richness and maturity and spurring more scenario-oriented solutions.
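To illustrate why quantization cuts inference cost, the sketch below applies symmetric int8 post-training quantization to a list of float weights. This is a minimal, self-contained illustration of the general technique, not DeepSeek's actual (unpublished) quantization pipeline; the function names and the round-trip structure are assumptions for exposition.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats into [-127, 127].

    Returns (quantized ints, scale) such that weights ~= q * scale.
    Storing int8 instead of float32 shrinks model memory roughly 4x,
    which is the main lever for lowering inference cost.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights for use at inference time."""
    return [x * scale for x in q]


weights = [0.12, -0.50, 0.33, 1.27, -1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The restored weights differ from the originals only by small
# rounding error, the accuracy/cost trade-off quantization accepts.
```

Production systems apply the same idea per-tensor or per-channel (and often to activations as well), but the memory-for-precision trade is identical to this toy round-trip.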
From Training to Inference: A Roadmap for AI Compute Transformation
As the AI industry pivots from training to inference, enterprises must now navigate distinct strategic paths based on their size and capabilities.
Leading Large-Sized Enterprises with Deep Training Needs
Enterprises with deep training needs will continue to dominate the training computing power domain. Although inference demand is growing rapidly, training remains the core of AI development. Currently, DeepSeek primarily focuses on text-based models, while other enterprises may launch differentiated products in multimodal domains such as image-text and video. To maintain their technological edge, these enterprises will continue to invest resources in optimizing training models and enhancing the network efficiency and computing power utilization of intelligent computing centers through optimized network architectures (e.g., adopting RoCE network technology). Simultaneously, they will build comprehensive ecosystems encompassing hardware, software, algorithms, and applications around their technological advantages to close the technology loop and maximize commercial value. In the future, more leading AI enterprises will join the open-source camp, attracting developers through open-source strategies to build their own ecosystems and gain a competitive edge.
Medium-Sized Enterprises with Limited Resources
For medium-sized enterprises with limited resources, open-source models are often a more economical choice. Enterprises can directly use open-source models for fine-tuning and deployment or reduce training costs through techniques like model pruning, quantization, and knowledge distillation. At the same time, enterprises need to strengthen their inference capabilities, optimize inference architectures, and combine on-premise deployment with cloud services. Additionally, the emergence of DeepSeek has prompted enterprises that were previously hesitant due to high training costs to consider building their own intelligent computing centers. Enterprises can flexibly set up small intelligent computing centers based on business needs, deploying 1–8 servers (within 100 cards) with 64-port 200G/400G switches or 8–16 servers (100-card scale) with 128-port 200G/400G switches. On this basis, enterprises need to pay greater attention to network optimization capabilities to improve the efficiency of inference operations. Customers seeking to quickly experience DeepSeek's capabilities can choose the H3C LinSeerCube DeepSeek all-in-one machine (single unit) to meet diverse computing power needs.
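The sizing guidance above reduces to a simple port-count check. The sketch below assumes each server exposes eight network ports to the compute fabric (one per accelerator card, a common but here hypothetical assumption); only the switch port counts and server ranges come from the text.

```python
def pick_switch(servers, nics_per_server=8):
    """Pick a leaf-switch size for a small intelligent computing center.

    Rough rule from the text: 1-8 servers (within 100 cards) fit a
    64-port 200G/400G switch; 8-16 servers (100-card scale) need a
    128-port switch. nics_per_server=8 is an illustrative assumption.
    """
    ports_needed = servers * nics_per_server
    if ports_needed <= 64:
        return "64-port 200G/400G switch"
    if ports_needed <= 128:
        return "128-port 200G/400G switch"
    return "multi-switch fabric (beyond single-switch scale)"


print(pick_switch(8))   # 8 servers x 8 ports = 64 -> 64-port switch
print(pick_switch(16))  # 16 servers x 8 ports = 128 -> 128-port switch
```

Beyond this single-switch scale, clusters move to leaf-spine fabrics, which is where the network-optimization capabilities mentioned above become decisive.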
Computing Power Leasing Service Providers, Lessee Enterprises, and Individual Developers
With the widespread application of DeepSeek's efficient inference models, the computing power leasing market has once again become highly active. It provides large enterprises with efficient computing power support and offers SMEs and startups a flexible, low-cost way to access computing power, enabling them to quickly launch AI projects without significant upfront investment. Meanwhile, competition in the computing power hardware market has intensified, reducing product differentiation advantages and making network optimization critical. Technologies such as end-network integration and path navigation can significantly enhance network performance and resource utilization, strengthening the competitiveness of computing power leasing services. DeepSeek's expert parallel technology places higher demands on network optimization, as efficient communication between multiple nodes relies on low-latency and high-bandwidth networks. H3C, with its strong network optimization capabilities, can meet DeepSeek's high requirements and provide efficient and reliable network support for computing power leasing. Additionally, the application scenarios for computing power leasing are continuously expanding, covering AI model training, inference deployment, and intelligent computing center operations, making it a vital part of the AI ecosystem.
For enterprises and individual developers leasing computing power, the leasing model supports rapid deployment of AI applications and allows flexible adjustment of computing resources based on business needs, significantly improving operational efficiency. Driven by DeepSeek, leasing costs have decreased. Enterprises can allocate more resources to technological R&D and business innovation, while individual developers and small teams can access computing power resources at lower costs and enjoy affordable API services.
In the future, intelligent computing centers may gradually transition from training-centric to inference-centric. This transformation will require adjustments to the architecture of intelligent computing centers, necessitating not only large-scale deployment of distributed systems to support efficient inference tasks but also flexible leasing services to allocate computing resources to enterprises and individual developers with genuine needs. However, in the short term, the architecture of intelligent computing centers will not undergo significant changes. For example, full-scale inference tasks for DeepSeek still rely on high-end GPU cards, indicating that high-performance hardware remains a critical support for inference tasks at this stage. It is worth noting that the overall demand for high computing power (including training and inference) is growing, a trend that will further drive industry investment in and optimization of computing power resources.
Overall, the transformation of AI computing power is in a critical transition phase from training-centric to inference-centric. In this process, enterprises of different scales and types need to accurately assess their needs and carefully choose appropriate development paths. Leading enterprises will continue to dominate the training computing power domain, medium-sized enterprises can reduce costs through open-source models and network optimization, and the computing power leasing market provides large enterprises, SMEs, and developers with flexible ways to access computing power. In the future, as intelligent computing centers transition to an inference-centric model, network optimization and flexible allocation of computing resources will become key.