AI chip market evolution: Cloud vs. edge in training and inference
Part 2: Edge training, inference, and market trends
The AI chip market is undergoing significant transformations, which can be understood through two key dimensions: deployment environments (cloud vs. edge) and market segments (training vs. inference).
Cloud-based training currently dominates the market and is expected to maintain strong growth. Training is critical for AI model development, requiring immense computational power to process vast amounts of data, which is why it is concentrated in cloud data centers. NVIDIA is the dominant player in this space, but competition from AWS, Google (with its TPUs), and Meta could reshape the market landscape in the coming years.
Alongside training, the cloud inference market is also experiencing rapid growth. Inference refers to applying trained AI models to real-world scenarios and making decisions based on real-time data. The increasing demand for inference is driven by advancements in AI models and emerging application scenarios. While inference requires lower computational power than training, it demands high accuracy and low latency, making cloud-based inference expansion essential for scaling AI applications.
On the edge, both edge training and edge inference cater to demands for low latency and data privacy. Compared to cloud-based training, edge training remains a smaller market but is expected to grow with the rise of smart devices and autonomous vehicles. Meanwhile, edge inference focuses on real-time, on-device processing, which is crucial for applications such as autonomous driving and other latency-sensitive use cases. While the edge market remains a niche segment, its importance will increase as these specialized applications continue to evolve.
In conclusion, training and inference in both cloud and edge environments present unique challenges and opportunities. Companies capable of delivering high-performance, cost-effective solutions in these areas will have the potential to challenge existing market leaders.
Following our previous discussion of the cloud AI training and inference market, this article focuses on the edge AI chip market for training and inference.
Our Analysis
Compared to the cloud market, edge AI solutions offer distinct advantages in low latency and data privacy. As emerging applications such as autonomous vehicles and smart devices grow, edge AI training and inference are expected to be key drivers of future market expansion. This article analyzes the trends shaping this segment and explores efficient, cost-effective solutions to its competitive challenges. Finally, we summarize the key findings from both articles in a comparative table and discuss the future trajectory of the AI chip market, highlighting strategic areas that potential competitors and market players should closely monitor.
3. Edge Training
In the AI chip market, edge training represents the smallest segment, accounting for approximately 3–5% of the market. Edge training requires high computational efficiency and is highly latency-sensitive: these applications must perform AI training at the edge rather than rely on cloud computing resources. Examples include (a brief training sketch follows the list):
- Autonomous vehicles that require real-time learning and adaptation of their perception systems
- Industrial machines and robots that train and adapt based on local data
- Wearable devices, such as smartwatches and health monitoring systems, that continuously learn and adjust based on user data in real time
- Smart city applications that require immediate data processing
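What edge training looks like in practice can be sketched in a few lines. The snippet below is a minimal, hypothetical PyTorch example of on-device adaptation: a small model takes a few gradient steps on locally buffered data, so raw samples never leave the device. The model shape, the random data, and the `adapt_on_device` helper are illustrative assumptions, not any vendor's actual API.

```python
# Minimal sketch of on-device adaptation (hypothetical model and data):
# a small model takes a few gradient steps on locally buffered samples,
# so raw sensor data never leaves the device.
import torch
import torch.nn as nn

# A small model sized for an embedded GPU or edge accelerator
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def adapt_on_device(local_batches):
    """Run a few local gradient steps on buffered on-device data."""
    model.train()
    for features, labels in local_batches:
        optimizer.zero_grad()
        loss = loss_fn(model(features), labels)
        loss.backward()
        optimizer.step()

# Stand-in for a buffer of recent on-device samples
batches = [(torch.randn(8, 16), torch.randint(0, 4, (8,))) for _ in range(4)]
adapt_on_device(batches)
```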
With growing concerns over data privacy and security, businesses are increasingly opting for edge training to ensure sensitive data remains protected. In this market, NVIDIA remains the dominant supplier, while AMD holds a smaller share. If emerging players like DeepSeek can provide cost-effective AI training solutions optimized for edge devices—especially amid growing demands for data privacy and real-time processing—they could become serious challengers to NVIDIA.
4. Edge Inference
Edge inference accounts for approximately 10–15% of the AI chip market, primarily serving enterprise inference needs by running trained models on endpoint devices or edge computing systems. These applications demand low latency and real-time responsiveness, with significantly lower computational requirements than AI training. Key use cases include the following (see the sketch after the list):
- Smart security: real-time image analysis for detecting suspicious behavior or individuals
- Smart home: rapid response to user commands and real-time environmental adjustments
- Smart transportation: traffic monitoring, autonomous driving, and intersection surveillance
- Drones: real-time image analysis for navigation and filming
- Healthcare monitoring: real-time data processing to assess user health conditions
- Industrial IoT: data collection and analysis to ensure smooth production operations
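As a rough illustration of the low-latency, on-device pattern these use cases share, here is a minimal sketch using PyTorch dynamic quantization; the model and input are hypothetical stand-ins, not a production pipeline.

```python
# Minimal sketch of local edge inference (hypothetical model and input):
# a trained model is quantized to int8 and run on-device, so no raw
# data has to be sent to the cloud.
import torch
import torch.nn as nn

trained = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
trained.eval()

# Dynamic int8 quantization shrinks the model and speeds up CPU inference,
# a common trade-off on resource-constrained edge hardware.
quantized = torch.quantization.quantize_dynamic(
    trained, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    sensor_frame = torch.randn(1, 128)  # stand-in feature vector from a sensor
    scores = quantized(sensor_frame)
    prediction = scores.argmax(dim=1)   # acted on locally, with low latency
```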
As concerns over data privacy and security grow, more businesses and institutions are shifting inference processing to local devices to prevent sensitive data leakage. This trend is especially prominent in industries requiring strict privacy protection, such as finance, healthcare, and government.
NVIDIA remains the leading supplier, while AMD continues to gain traction with advancements in its products. Competitors offering high-efficiency, low-power edge inference solutions—tailored to smart home, security, and industrial IoT applications—could challenge NVIDIA and AMD in this evolving market.
Conclusion
The AI chip market is undergoing rapid transformation, with different market segments showing varying growth trends and challenges, as shown in Table 1.
Table 1: AI chip market structure analysis and comparison

| Domain | Market share | Major companies | Potential competitors | Future development trends | Potential challenges |
|---|---|---|---|---|---|
| Cloud training | 50–70% | NVIDIA (H100, CUDA, TensorRT), Google (TPU) | AWS (Trainium), Meta, Anthropic, Tenstorrent, CSP ASICs | Market growth continues, but the growth rate may slow; technological optimization (e.g., sparsification, mixed-precision training) reduces costs; expansion of generative AI applications | High AI training costs; uncertainty in technological optimization and market demand; need for high-performance chips to support training demands |
| Cloud inference | 15–25% | NVIDIA, AMD, Intel (Habana Gaudi) | Qualcomm, Mythic, Cerebras, Groq, CSP ASICs, DeepSeek | Significant market growth; new applications (e.g., autonomous driving, financial risk assessment, medical diagnostics); lower inference costs and hardware innovations enhance efficiency | NVIDIA's market leadership challenged; increasing data privacy and security requirements; significant investment required for hardware infrastructure innovation |
| Edge training | 3–5% | NVIDIA, AMD | DeepSeek, Project Digits, internal enterprise demand | Increasing edge training demand driven by privacy protection; companies will emphasize real-time learning and enhanced perception systems | Higher hardware requirements and costs; need for low-latency, high-performance solutions |
| Edge inference | 10–15% | NVIDIA, AMD | DeepSeek, internal inference demand, self-built equipment | Increased edge inference demand due to growing data privacy and security concerns; growing applications in finance, healthcare, etc. | High-performance and low-latency requirements; privacy protection issues; data security requirements must meet regulations |
Source: Author's research and analysis
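Among the trends in Table 1, mixed-precision training is one of the concrete cost levers. As a hedged illustration, the sketch below uses PyTorch automatic mixed precision with a hypothetical model and random data; real training pipelines differ, but running eligible ops in float16 while scaling the loss is the core of the technique.

```python
# Minimal sketch of mixed-precision training (hypothetical model and data),
# one of the cost levers noted in Table 1.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(512, 512).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

inputs = torch.randn(32, 512, device=device)
targets = torch.randn(32, 512, device=device)

# autocast runs eligible ops in float16 on GPU, cutting memory and compute
with torch.autocast(device_type=device, enabled=(device == "cuda")):
    loss = nn.functional.mse_loss(model(inputs), targets)

scaler.scale(loss).backward()  # loss scaling avoids fp16 gradient underflow
scaler.step(optimizer)
scaler.update()
```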
The growth of the edge market and its divergence from the cloud market compel us to consider the potential impact of emerging competitors. Among these, DeepSeek’s technological innovations are particularly noteworthy, as they have the potential to disrupt the current market landscape. Although large companies like NVIDIA and Google currently dominate the market, DeepSeek’s rise—whether in hardware acceleration or breakthroughs in AI training—could significantly alter this dynamic.
Despite the ongoing emergence of competitors, including Cloud Service Providers (CSPs) developing their own ASICs and other custom ICs, we believe that NVIDIA continues to maintain a strong market position and technological advantage. The primary reasons are:
1. GPU Advantage
NVIDIA has long dominated the GPU market, with accelerators like the A100 and H100 becoming industry standards for AI training and inference. NVIDIA’s GPUs not only support training but also handle large-scale inference, playing a critical role in many AI applications. As a result, even with the competition from CSPs developing their own ASICs and ICs, NVIDIA maintains a strong advantage in applications requiring general-purpose and high-performance computing.
2. Robust Software Ecosystem
NVIDIA boasts a comprehensive developer ecosystem, including tools like CUDA, cuDNN, and TensorRT, making it easy for developers to build AI applications on NVIDIA hardware. In contrast, competitors like CSPs with in-house ASICs and DeepSeek need to invest significant time and resources in developing their own software ecosystems, giving NVIDIA a clear edge.
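To make this concrete, consider how little code it takes to target NVIDIA hardware from a mainstream framework. The sketch below is a hypothetical PyTorch example: the same model code runs on CPU or GPU, with CUDA and cuDNN providing the acceleration underneath.

```python
# Minimal sketch of the developer experience the CUDA ecosystem enables:
# moving a standard PyTorch model onto an NVIDIA GPU is a one-line change.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten())

if torch.cuda.is_available():           # CUDA runtime and driver detected
    model = model.to("cuda")            # same code path, now GPU-accelerated
    batch = torch.randn(4, 3, 32, 32, device="cuda")
else:
    batch = torch.randn(4, 3, 32, 32)   # CPU fallback, identical API

features = model(batch)
```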
3. Efficient AI Computing Platform
NVIDIA’s high-performance computing (HPC) and AI platforms offer highly optimized hardware and software integration, providing powerful acceleration for a variety of AI workloads, such as natural language processing and image recognition. The optimization of these platforms gives NVIDIA a performance advantage in processing large datasets and models, surpassing other competitors.
However, NVIDIA also faces significant challenges, primarily from two forces competing in specialized market segments:
1. CSP-Developed Chips
In an effort to reduce reliance on third-party chip suppliers and to gain cost control and hardware customization, many CSPs have opted to develop their own chips. For example, Google's TPUs accelerate neural network training and inference, while Amazon's Inferentia is optimized for inference workloads. These in-house chips offer more efficient, cost-competitive solutions for specific applications.
2. Breakthroughs in Specialized ASICs
Some competitors may pose a threat to NVIDIA by developing highly specialized ASICs that achieve breakthroughs in specific areas, such as low-latency inference or other specialized acceleration needs. Especially in cost-sensitive or niche markets, NVIDIA's high-end GPUs (such as the A100 and H100) may be less attractive than these specialized ASIC solutions due to their higher price points.
Therefore, NVIDIA currently maintains a significant competitive advantage in the general AI training and inference market. However, if CSPs aggressively develop in-house ASICs or competitors make breakthroughs in specialized areas, NVIDIA will face increased competitive pressure. Future competition will depend on whether these challengers can surpass NVIDIA products in terms of performance, efficiency, price, and ecosystem support.
In conclusion, training and inference needs in both the cloud and edge markets present distinct challenges and opportunities. The development of these four segments is interwoven, and the characteristics of each can shape the market's future direction. As AI technology evolves, businesses that can deliver higher-performance, cost-effective solutions in these areas will not only address current market challenges effectively but also capture future growth opportunities. Such solutions have the potential to challenge the current market leaders and carve out a strong position in these rapidly developing sectors.
This article is part of our Future Scenarios and Design series.
It explores how possible futures take shape through trend analysis, strategic foresight, and scenario thinking, including shifts in technology, consumption, infrastructure, and business models.