NVIDIA Archives | Researcher and Research

AI Strategy Shifts Among the Big Six: Four Core Trends from Compute Scale to Efficiency Competition

Jane Hsu — Wed, 06 Aug 2025 05:04:09 +0000

In less than three years, the AI race has moved through three phases. It began with a contest to build the largest and most capable models, moved into a rush to secure computing power, and has now arrived at a phase defined by efficiency, the rise of AI agents, and the first real tests of commercial viability. The most recent earnings calls of six leading technology companies (Microsoft, Amazon, Google, Meta, Apple, and Tesla) suggest that the next 12 to 18 months will revolve around four core trends shaping the AI landscape.

Optimizing AI infrastructure: Cloud-oriented companies are entering the multi-gigawatt data center era and focusing on improving tokens-per-GPU efficiency, energy use, and latency. Hardware-oriented players are deepening their on-device AI strategies and embedding AI into their products.

The era of AI agents: AI is moving from conversational tools to agents that can take initiative, connect to tools, and carry out tasks in daily workflows. Three main paths are emerging: purely digital enterprise agents, hardware-enabled agents, and physical-world automation.

The commercial validation phase: From the second half of 2025 through the first half of 2026, companies will face proof points in high-stakes arenas, including enterprise AI agents, autonomous driving and robotics, AI wearables, and AI-powered advertising and e-commerce.

Efficiency as the new battleground: Competition is shifting from sheer GPU volume to performance per unit of resource, spanning hardware architecture (Tesla’s “intelligence per GB”), model-level efficiency (Microsoft and Google’s tokens-per-GPU gains), and algorithmic optimization in applications (Meta and Amazon).

Cloud-oriented giants are competing fiercely in enterprise AI agents, infrastructure build-out, and efficiency gains, while hardware-oriented companies are seeking breakthroughs in consumer access points and real-world automation. The year 2026 will be a pivotal test of commercial viability. Success in high-commitment use cases could spark a second wave of enthusiasm. Failure may slow both investment and technological momentum. In the end, leadership will not be decided by who has the largest models or the most GPUs, but by who can integrate AI most effectively into daily life and industry, turning it into sustainable business value.

Introduction

In less than three years, the focus of the AI race has shifted through several distinct phases. It began with a competition to build ever-larger models, moved into a rush to secure computing power, and has now reached a stage defined by efficiency, the rise of AI agents, and the first tests of commercial viability.

From 2022 to 2023, the generative AI wave ignited by ChatGPT pushed technology leaders into a model-building contest. Companies raced to release larger, faster, and more capable large language models. Victory was often measured by parameter counts and benchmark scores. Yet this contest came at an extraordinary cost and lacked sufficient commercial grounding.

From 2023 through the first half of 2025, companies began to recognize that the real bottleneck in AI development lay in computing resources. This led to a phase of capacity accumulation. Microsoft, Google, Meta, and Amazon made massive purchases of NVIDIA GPUs, locking in multi-year supply agreements and building multi-gigawatt data centers to meet training and inference demands. But simply stacking more compute proved costly, and performance gains did not always match the scale of investment.

In the second half of 2025, attention began to turn toward efficiency, the deployment of AI agents, and the validation of commercial models. The focus shifted from adding more GPUs to finding ways to accomplish more with the same resources. This included improving tokens-per-GPU throughput and strengthening inference performance. At the same time, AI began to move beyond conversational formats toward agents capable of taking action, connecting to tools, and embedding themselves in daily workflows, ranging from enterprise operations and autonomous driving to AI-enabled eyewear and e-commerce advertising.

While Apple, Amazon, Google, Meta, Microsoft, and Tesla have pursued different paths in AI investment since the generative AI wave began, these differences were less apparent in previous quarters. This quarter, as deployment models take shape, investment priorities diverge, and commercialization timelines become clearer, those distinctions have come sharply into focus.

Cloud-oriented vs. Hardware-oriented AI Leaders

Looking at the AI strategies of the six major technology companies, it is clear that while all are investing heavily, their deployment models and investment structures follow two distinct paths. These differences did not emerge overnight; they reflect long-standing business foundations and competitive strengths.

1. Cloud-oriented leaders (Microsoft, Amazon, Google, Meta)

Their strength lies in global cloud computing platforms, large-scale data center networks, and robust software ecosystems. Their AI strategies focus on building massive computing capacity while continuously improving infrastructure efficiency. In recent years, they have introduced proprietary AI chips such as Microsoft’s Maia AI accelerator and Cobalt cloud CPU, Google’s TPU v5, and Amazon’s Trainium 2 and Inferentia 2. These chips operate alongside NVIDIA GPUs, balancing performance with cost while reducing supply chain dependence. Their business models center on subscriptions and API usage, with advertising serving as an important AI monetization channel.

2. Hardware-oriented leaders (Apple, Tesla)

Their strength lies in integrating hardware products, ecosystems, and specialized computing architectures. Their AI strategies lean toward embedding AI deeply into devices (on-device AI) or physical products such as autonomous driving systems and humanoid robots. This approach reduces reliance on cloud infrastructure while strengthening user experience and ecosystem stickiness. Their business models are driven primarily by hardware sales and value-added services, with AI features playing a central role in driving device upgrades and product adoption.

Table 1. Classification of Cloud-oriented and Hardware-oriented AI Leaders

Company Type	Representative Companies	Core Business Strengths	AI Strategic Focus	Commercialization Model
Cloud-oriented	Microsoft, Amazon, Google, Meta	Global cloud computing platforms, extensive data center networks, platform ecosystems	Build multi-GW data centers, develop proprietary AI chips (TPU, Trainium), provide cloud-based generative AI models and agent services (Copilot, Gemini, Bedrock, Business AI)	Enterprise AI subscriptions, API usage-based revenue, advertising monetization
Hardware-oriented	Apple, Tesla	Hardware products and ecosystems, specialized computing architectures	On-device AI (Apple Silicon), physical AI (FSD, Robotaxi, Optimus) to reduce cloud dependence and deeply integrate with hardware experiences	Hardware sales, value-added services, AI features driving hardware upgrades

These two distinct models mean that, on the road to AI commercialization, they will face very different validation timelines, capital expenditure structures, and return profiles. Understanding this distinction not only helps interpret the signals emerging from recent earnings calls but also offers a clearer view of how each is likely to compete in the AI market over the next one to two years.

Four Core AI Trends

As generative AI moves from technical exploration to the race for deployment, the strategies of the six leading technology companies are becoming more focused and increasingly distinct. Over the next 12 to 18 months, four core trends will shape the landscape:

Optimization of AI infrastructure
The rise of the agent era
The start of the commercial validation phase
Computing efficiency as the new battleground

The sequence of these trends reflects the full arc of AI development, from building the foundation to deployment, then to validation, and finally to long-term optimization.

Trend 1: From Stacking to Optimizing AI Infrastructure

Moving from the “compute accumulation” phase of 2023–2024 into 2025, the focus has shifted. The question is no longer simply who has more GPUs, but how to build infrastructure that is efficient and adaptable enough to support long-term AI commercialization.

For the cloud-oriented leaders (Microsoft, Amazon, Google, Meta), the past year has brought them into the multi-gigawatt data center era. Their priorities are moving from expanding GPU counts to improving tokens-per-GPU efficiency, reducing energy consumption, and lowering latency. At the same time, sovereign AI clouds, low-latency cloud services, and private deployments have become important directions, ensuring that key customers can run generative AI in secure and compliant environments.

For the hardware-oriented leaders (Apple, Tesla), Apple is pursuing an on-device AI plus private cloud architecture, keeping much of the AI processing on Apple Silicon devices to reduce cloud load and protect privacy. Tesla is embedding AI directly into its products, from Full Self-Driving (FSD) and Robotaxi to the Optimus humanoid robot, using physical AI as a core differentiator.

Trend 2: The Era of AI Agents

Over the past year, generative AI has largely taken the form of chatbots. Yet conversational AI often lacks stickiness, tending to remain in one-off interactions or experimental use. In contrast, AI agents can connect to tools, take initiative, and embed themselves in daily work and life. This ability to act within real workflows is central to their long-term commercial potential.

From the latest earnings calls, it is clear that the six major companies are shifting their focus toward AI agents capable of carrying out tasks, with three primary paths emerging in different market dimensions:

1. Cloud-native Enterprise Agents

These agents operate entirely in cloud environments, focusing on enterprise workflows and data processing without relying on specific hardware as an entry point.

Google Agentspace: A foundational enterprise agent platform that enables companies and developers to build their own corporate AI agents.
Microsoft Foundry Agent Service: Also a cloud-based enterprise agent platform, but deeply integrated with Microsoft 365 and Copilot to strengthen workflow capabilities within Microsoft’s ecosystem.
Amazon Bedrock Agent: A cloud-based agent with a more vertical focus, specializing in e-commerce, customer service, and logistics.

2. From Digital Agents to Consumer Hardware Entry Points

These agents retain the core capabilities of digital agents but rely on hardware devices as the main interface, making interactions more immediate and natural.

Meta Business AI: Essentially still an AI agent, but accessed through AI-enabled glasses, marking the first step from pure cloud to hardware-based entry.
Apple Personalized Siri: Also a hardware-enabled agent, deeply integrated with the iPhone and the broader Apple ecosystem, enhanced by Apple Intelligence to deliver personalized task handling.

3. Physical-world Automation

These agents do more than act in the digital realm; they can operate in the physical world, performing real-world tasks.

Tesla FSD and Robotaxi: AI agents in the transportation domain that can perceive their surroundings, make driving decisions, and carry out mobility services, representing a fundamentally different market dimension from digital agents.

Trend 3: The Commercial Validation Phase Begins

From the second half of 2025 through the first half of 2026, the six leading technology companies will enter a decisive period for AI commercialization. Over the past two years, they have committed unprecedented capital to infrastructure, model development, and product design. These investments must now begin to translate into measurable business returns, tracked through metrics such as return on investment (ROI).

The earliest results will emerge in a few high-stickiness application areas. We can rank them by their alignment with each company’s AI strategy, their maturity, and the urgency of market validation.

First, enterprise-grade AI agents are at the core of nearly every cloud-oriented company’s strategy. They represent the largest investment areas and are tightly integrated with existing enterprise cloud services. These will be the first to enter real-world usage and face evaluation, testing whether they can truly become indispensable daily work partners.

Second, autonomous driving, Robotaxi services, and the production of Optimus robots, led by Tesla, will be closely watched. Although they face significant regulatory and technical hurdles, success in scaling operations could create landmark commercialization cases.

Third, AI glasses and wearable devices, championed by Meta and Apple, have long-term potential for high user engagement but remain in the early adoption stage. Market acceptance, retention, and conversion to paid usage will require more time to observe.

Finally, AI-powered advertising and e-commerce, already widely applied in the ad and recommendation systems of Meta, Google, and Amazon, are primarily efficiency improvements within existing businesses. Their potential for transformative impact is lower than that of the other applications, so they have a lower priority for immediate validation.

The outcomes of this stage will directly determine the pace of future capital spending and product strategy. If commercial validation falls short, both investment enthusiasm and the speed of product expansion may slow significantly. If it succeeds, strong case studies will fuel the next wave of AI growth.

Trend 4: From Computing Scale to Computing Efficiency

Computing power remains the foundation of generative AI, yet the focus is shifting toward achieving more with fewer resources. AI cannot rely indefinitely on buying more GPUs to expand capacity, especially as power availability, cost, and supply chain constraints become pressing bottlenecks. In this context, efficiency is emerging as the sustainable basis for competition. This shift is a natural evolution from the “compute accumulation” era to a more mature stage.

In the latest strategies of the six leading companies, improvements in computing efficiency can be grouped into three layers. Together, they form a bottom-up chain of optimization that spans from hardware architecture to commercial applications.

1. Hardware and System Architecture Level

Tesla has introduced a new metric for measuring AI efficiency called “intelligence per GB,” which reflects how effectively AI systems use memory to deliver intelligence. This metric represents the most fundamental layer of efficiency measurement, focusing on improving the density of intelligence at the physical resource level.

2. Model Inference and Training Efficiency Level

One level higher, Microsoft and Google are working to improve tokens-per-GPU processing efficiency so that the same hardware can handle more generative tasks. This metric targets the optimization of generative AI model performance within existing hardware limits. Compared with Tesla’s metric, it sits closer to the application layer but still focuses on maximizing the use of core computing resources.
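To make the tokens-per-GPU idea concrete, the short Python sketch below computes throughput per GPU and the resulting serving cost per million tokens. The function names, the hourly GPU cost, and all figures are illustrative assumptions, not numbers reported by Microsoft or Google.

```python
def tokens_per_gpu_second(total_tokens: float, gpu_count: int, seconds: float) -> float:
    """Throughput normalized per GPU: tokens generated per GPU per second."""
    if gpu_count <= 0 or seconds <= 0:
        raise ValueError("gpu_count and seconds must be positive")
    return total_tokens / (gpu_count * seconds)


def cost_per_million_tokens(gpu_hour_cost: float, tokens_per_gpu_sec: float) -> float:
    """Serving cost per one million tokens, given the hourly cost of one GPU."""
    if tokens_per_gpu_sec <= 0:
        raise ValueError("tokens_per_gpu_sec must be positive")
    tokens_per_gpu_hour = tokens_per_gpu_sec * 3600
    return gpu_hour_cost / tokens_per_gpu_hour * 1_000_000


# Hypothetical workload: 7.2M tokens served by 8 GPUs in 10 minutes.
baseline = tokens_per_gpu_second(total_tokens=7_200_000, gpu_count=8, seconds=600)
# Assumed 50% throughput gain from software optimization on the same hardware.
improved = baseline * 1.5

GPU_HOUR_COST = 2.0  # assumed cost of one GPU-hour, in dollars

baseline_cost = cost_per_million_tokens(GPU_HOUR_COST, baseline)
improved_cost = cost_per_million_tokens(GPU_HOUR_COST, improved)

print(f"Baseline: {baseline:.0f} tokens/GPU/s, ${baseline_cost:.3f} per 1M tokens")
print(f"Improved: {improved:.0f} tokens/GPU/s, ${improved_cost:.3f} per 1M tokens")
```

Under these assumed numbers, a 50% throughput gain on the same hardware cuts the cost per million tokens by one third, which is why this metric matters more than raw GPU counts.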

3. Application and Algorithm Optimization Level

At the layer closest to business applications, Meta and Amazon are improving efficiency in algorithms and recommendation systems, such as reducing inference costs and speeding up ad-serving computations. Although these optimizations take place at the application level, they can significantly lower AI operating costs and directly enhance ROI in advertising and e-commerce.

Summary of the Four Core Trends

As shown in Table 2, these four trends together provide a framework for understanding how the six companies are shaping the AI landscape over the next one to two years. They also reveal the roles that different types of companies may play in this evolution. The next phase of AI infrastructure competition will not be decided by who has the most GPUs, but by who can achieve the highest performance and commercial efficiency with finite resources.

Table 2. Four Core AI Trends

AI Trend	Signals from Earnings Calls	Representative Companies
1. Optimization of AI Infrastructure	Expansion of multi-gigawatt data centers continues, but focus is shifting from sheer GPU counts to improving tokens-per-GPU efficiency and enabling flexible deployment. AI-first architectures, sovereign cloud, and low-latency cloud services are key directions.	Microsoft: Azure adopting AI-first architecture and efficiency gains Amazon: Trainium 2 used in Anthropic training Google: TPU development, expansion of enterprise cloud contracts Meta: Prometheus and Hyperion multi-GW clusters
2. The Era of AI Agents	AI moving from conversational tools to agents that can take initiative, connect to tools, and integrate into workflows. Agent applications span enterprise, consumer, and physical-world scenarios.	Google: Agentspace Microsoft: Foundry Agent Service Amazon: Bedrock Agent Meta: Business AI with AI-enabled glasses Apple: Personalized Siri Tesla: FSD/Robotaxi as transportation agents
3. The Commercial Validation Phase	High-stickiness AI applications begin testing ROI. Enterprise-grade agents show early adoption, while hardware-based AI still awaits large-scale rollout. Advertising and e-commerce will be the first testing grounds to deliver measurable results.	Microsoft / Google / Amazon: Growth in enterprise agent usage data Tesla: Robotaxi and Optimus require production scaling and regulatory approval Apple: 2026 Siri upgrade as potential upgrade driver Meta: Retention and monetization of AI glasses still uncertain Meta / Google / Amazon: AI in advertising and recommendation systems
4. Computing Efficiency as the New Battleground	New metrics emerging to measure AI efficiency (e.g., intelligence per GB, tokens per GPU). Focus on improving inference and training performance, reducing cost per unit of compute.	Tesla: Intelligence per GB metric Microsoft / Google: Tokens-per-GPU efficiency improvements Meta / Amazon: Algorithmic optimization for advertising and recommendation systems

Conclusion

Generative AI is moving beyond its early phase of model competition and compute accumulation into a new stage driven by efficiency and commercial validation. Optimization of AI infrastructure, the rise of AI agents, the start of the commercial validation phase, and computing efficiency as the new battleground will be the key trends shaping the industry over the next one to two years.

As shown in Table 3, cloud-oriented leaders are competing intensely in enterprise AI agents, infrastructure build-out, and efficiency gains. Hardware-oriented leaders are seeking breakthroughs in consumer access points and real-world automation. The success or failure of these different approaches will determine who can sustain leadership in the AI era.

Despite their varied strategies, the six companies share a clear consensus: AI is the primary arena for the next phase of competition. While the cloud-oriented and hardware-oriented paths are diverging, both sides are working to strengthen their positions in infrastructure and agent applications at the same time.

The year 2026 will serve as a defining year for commercial validation. If agents and hardware-based AI can prove their value in high-engagement scenarios, it could spark a second wave of AI enthusiasm. If not, the market may enter a period of narrative fatigue, slowing both investment and technological progress.

Over the next 12 to 18 months, the key developments to watch include:

Whether enterprise AI agents can become indispensable daily work tools
Whether autonomous driving and Robotaxi services can overcome regulatory and production hurdles
Whether AI wearables can achieve lasting engagement and paid adoption
Whether AI-powered advertising and e-commerce can deliver meaningful revenue growth

Ultimately, leadership in AI will be decided not by who has the largest models or the most GPUs, but by who can integrate AI most effectively into everyday life and industry, turning it into sustainable business value.

Table 3. AI Development Types and Trend Positioning of the Six Leaders

Company Type	Company	AI Focus Areas	Investment and Deployment Directions	Key Commercial Validation Points	Current Trend Positioning*
Cloud-oriented	Microsoft	Azure AI infrastructure, Copilot enterprise agents	Multi-gigawatt data centers, tokens-per-GPU efficiency improvements	Whether Copilot becomes an indispensable daily enterprise tool	Accelerating deployment
	Amazon	AWS Bedrock, AI-driven advertising monetization	Proprietary AI chips (Trainium 2 / Inferentia 2), Bedrock Agent	Sustained high demand for AWS AI, integration of DSP advertising	Accelerating deployment
	Google	Gemini, multimodal search agents	AI Overviews, Agentspace	Improvement in AI search performance and ad conversion rates	Accelerating deployment
	Meta	AI personal assistant (Business AI), AI glasses	Large-scale AI training clusters (Prometheus / Hyperion), Business AI	Retention and monetization model for AI glasses	High-expectation phase
Hardware-oriented	Apple	On-device AI, personalized Siri	Apple Silicon plus private cloud	2026 Siri upgrade driving hardware refresh cycle	Initial validation
	Tesla	Robotaxi, Optimus humanoid robot	FSD upgrades, autonomous driving agents	Geographic coverage and production scale of Robotaxi	Initial validation

*Definition of Current Trend Positioning

Accelerating Deployment: The product has completed core development and entered large-scale deployment, with adoption rates rising quickly and becoming part of regular daily use.
High-Expectation Phase: The market and the company hold high expectations for the product’s potential, but large-scale adoption and a proven business model have yet to be established.
Initial Validation: The product has completed core technical development and has entered small-scale pilot operations or regional rollout, with commercial viability and scalability still being tested.

This article is part of our Global Business Dynamics series. It explores how companies, industries, and ecosystems are responding to global forces such as supply chain shifts, geopolitical changes, cross-border strategies, and market realignments.

GPU Cloud Is Not Just a Compute Race but a Relay of Assets and Capital Belief

Jane Hsu — Mon, 07 Jul 2025 09:00:03 +0000

This article analyzes a key shift in GPU cloud platforms as they move from a technology-driven model to one powered by asset leverage. It highlights how asset-leveraged platforms are reshaping the competitive logic of the entire market. These platforms treat GPUs as financial assets and rent as cash flow, using strategies such as pre-lease contracts, installment-based procurement, and asset bundling to create an expansion model that closely resembles financial instruments. The focus of competition has shifted from who can run the fastest models to who can manage capital most efficiently. In this game, the real question is no longer who buys the GPU, but who is still willing to take the next handoff.

Introduction: The Four Operating Models of Cloud Infrastructure

Over the past few years, the core infrastructure of cloud computing has been dominated by three major providers: AWS, Google Cloud, and Microsoft Azure. These companies have built their services around large-scale, distributed data centers, offering stable and scalable computing power. This model, known as the hyperscaler approach, is driven by technical superiority and service completeness.

Since 2023, however, a new trend has begun to shift the rules of the game. Emerging GPU cloud platforms like Oracle and CoreWeave are not focused on innovating the cloud service itself. Instead, they are leveraging asset-based financing and rental models to turn high-cost hardware into financial assets. Their strength lies not in technology leadership, but in capital operations.

At the same time, a wave of startups such as Lambda Labs and Vast.ai has entered the market with a different approach. These companies specialize in high-performance, customized infrastructure for AI training. Rather than pursuing economies of scale like the hyperscalers, they differentiate through flexibility and operational efficiency.

As a result, four distinct operating models are now shaping the cloud landscape:

Traditional hyperscaler platforms: AWS, Google, and Microsoft offer stable, full-featured cloud services that serve both enterprises and developers.
Asset-leveraged platforms: Oracle and CoreWeave use GPU hardware as a capital leverage tool to accelerate deployment.
High-performance customized platforms: Lambda Labs and Vast.ai focus on adaptability and efficiency, targeting specific use cases.
Pure GPU rental platforms: A growing number of startups are emerging with a more flexible and financialized approach aimed at serving smaller AI developers.

Among these competing models, the second type, asset-leveraged platforms, deserves particular attention. Their rapid expansion is not only reshaping supply chain dynamics and capital flows, but also transforming cloud budgets from a form of technology investment into a belief-driven financial game.

The rest of this article will explore the operating logic behind these asset-leveraged platforms and examine how they are driving the current expansion of GPU cloud infrastructure, along with the risks that may follow.

1. Asset-Leveraged Cloud Platforms Operate More Like Asset Managers Than Tech Companies

We often assume that the core of a cloud platform business is selling compute. At first glance, it seems they convert GPUs into computing resources and rent them out to AI companies.

In reality, asset-leveraged cloud platforms are running an asset-driven business. They purchase expensive hardware and turn it into monthly rental streams by slicing, leasing, and redistributing the assets. In many cases, these assets are also used as collateral or repackaged for refinancing.

GPUs are treated as capital assets, and rental payments generate cash flow
Tenant contracts function like interest-bearing instruments, while full server racks serve as collateral
What appears to be cloud service delivery is actually a highly assetized and financialized capital model

At the core of this model is belief. As long as the market believes these compute resources will continue to be rented out consistently, capital will keep flowing in, and infrastructure will keep expanding. This belief rests not only on tenant demand forecasts; it is even more deeply rooted in investors’ expectations of stable cash flows.

2. This Business Runs More Like a Relay Race than a Cloud Service

Take Oracle and CoreWeave as examples. These GPU cloud platforms often rely on highly efficient capital strategies to scale rapidly:

They use pre-lease agreements to guide procurement. Instead of purchasing hardware upfront, these platforms first secure commitments or letters of intent from tenants. Once there is a forecast of future cash flow, these agreements can serve as the foundation for financing.
They use installment payments to reduce capital pressure. Platforms do not need to pay the full cost of hardware at once. Many purchases are structured through installment plans or supply chain financing, allowing for expansion without heavy upfront investment.
They bundle assets to generate liquidity. Some platforms package GPUs with the associated lease contracts and sell them to asset managers or financing partners. These bundles are treated as stable, income-generating assets and can sometimes be securitized or refinanced.

While these strategies may not be directly reflected in financial reports, we can piece together a clear capital model by observing CoreWeave’s expanding credit lines, its multibillion-dollar cloud deal with OpenAI, and Oracle’s procurement and deployment pace under its Stargate project with NVIDIA.

This is a highly asset-centric business model. It works by securing lease commitments before GPU purchases, using those long-term agreements as collateral, and then using new funds to expand infrastructure. Instead of the traditional buy-then-sell cycle, these platforms follow a lease-first, finance-next approach. Once the lease is secured and confidence is established, hardware and capital follow.
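The lease-first, finance-next logic can be sketched as a small calculation: a signed lease is valued as a stream of monthly payments, and a lender advances a fraction of that value. The Python below is a minimal illustration under stated assumptions. The function names, the 12% discount rate, the 80% advance rate, and the payment figures are all hypothetical, and real financing terms are not disclosed in this article.

```python
def lease_present_value(monthly_payment: float, months: int,
                        annual_discount_rate: float) -> float:
    """Present value of a level monthly lease payment stream (ordinary annuity)."""
    if months <= 0:
        raise ValueError("months must be positive")
    r = annual_discount_rate / 12  # simple monthly rate (an assumption)
    if r == 0:
        return monthly_payment * months
    return monthly_payment * (1 - (1 + r) ** -months) / r


def borrowing_capacity(present_value: float, advance_rate: float) -> float:
    """Loan size a lender might extend against the lease, at an assumed advance rate."""
    if not 0 <= advance_rate <= 1:
        raise ValueError("advance_rate must be between 0 and 1")
    return present_value * advance_rate


# Hypothetical pre-lease: $1M per month for 36 months, discounted at 12% a year.
pv = lease_present_value(monthly_payment=1_000_000, months=36, annual_discount_rate=0.12)
# Hypothetical lender advances 80% of the lease value.
loan = borrowing_capacity(pv, advance_rate=0.80)

print(f"Lease present value: ${pv:,.0f}")
print(f"Financing available for GPU purchases: ${loan:,.0f}")
```

The point of the sketch is the ordering: the financing amount is derived from the lease, so hardware purchases scale with signed commitments rather than with cash on hand.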

Consider this hypothetical scenario:

In Year One, the platform purchases a large number of GPUs. Market demand is strong, rental prices are high, and model performance is improving. Everything looks profitable.
In Year Two, demand cools and rental rates drop, just barely covering depreciation and operations.
In Year Three, aging GPUs can no longer generate enough income to offset costs, leading to potential losses.

At this point, the platform may not cut costs. Instead, it might buy newer, more powerful GPUs and rely on fresh rental contracts to offset losses from older equipment.

In this cycle, the entire cash flow model depends on the next handoff. If someone is still willing to take the next step, whether a tenant or a financier, the pressure from the previous round remains hidden.

This logic might sound familiar.

“If we keep expanding, the losses won’t materialize.” It is a belief cycle often seen in asset bubbles. As long as the market continues to believe this relay can go on, the model will stay intact until the next runner fails to show up.
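The hypothetical three-year scenario above can be put into rough numbers. The Python sketch below models the annual margin of one GPU as rental revenue minus straight-line depreciation and operating cost, then shows how adding new GPUs in Year Three can mask losses on the old ones. Every figure (purchase price, rental rates, utilization, operating cost) is an assumption chosen only to reproduce the shape of the scenario.

```python
HOURS_PER_YEAR = 8760


def annual_margin(hourly_rate: float, utilization: float, purchase_price: float,
                  life_years: int, annual_opex: float) -> float:
    """Rental revenue minus straight-line depreciation and operating cost, per GPU."""
    revenue = hourly_rate * utilization * HOURS_PER_YEAR
    depreciation = purchase_price / life_years
    return revenue - depreciation - annual_opex


# Assumed economics: $30,000 GPU, 3-year life, $3,000 per year to operate.
PRICE, LIFE, OPEX = 30_000, 3, 3_000

# (year, hourly rental rate, utilization): all hypothetical.
scenario = [(1, 3.00, 0.90), (2, 2.00, 0.75), (3, 1.20, 0.60)]

margins = {
    year: annual_margin(rate, util, PRICE, LIFE, OPEX)
    for year, rate, util in scenario
}
for year, margin in margins.items():
    print(f"Year {year}: margin per GPU = ${margin:,.0f}")

# The "next handoff": in Year Three the platform adds 100 new GPUs that are
# assumed to earn Year One economics, alongside 100 aging GPUs.
old_gpus, new_gpus = 100, 100
blended = old_gpus * margins[3] + new_gpus * margins[1]
print(f"Year 3 fleet margin with new GPUs: ${blended:,.0f}")
```

With these assumptions, the old fleet loses money in Year Three, yet the blended fleet still shows a positive margin. The loss only becomes visible if the new capacity fails to find tenants or financing.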

3. We Have Not Seen a Reversal Yet, but It Is Time to Start Asking Questions

So far, there are no clear signs of cancellations or collapse. GPUs remain in short supply, and demand for rentals and reservations is still strong. Asset-leveraged platforms like Oracle and CoreWeave continue to expand their cloud footprint, while leasing-focused startups are also entering the market. The overall industry is still in a phase of rapid expansion.

But what if this is only a transitional stage in a broader asset-leverage acceleration cycle?

What if this seemingly stable business model, which generates consistent rental income, is actually built on a deeper assumption that constant expansion is needed to sustain cash flow and asset efficiency? And what happens when that assumption starts to weaken?

This asset-driven model may also create structural pressure for other types of platforms. If over-invested GPU infrastructure begins to flood the market, it could trigger pricing and capital allocation effects that spill over to the three other models: hyperscalers, customized platforms, and pure GPU leasing providers.

We can begin with a few questions to guide our observations:

Can the current rental pricing structure truly sustain a three-year* depreciation and capital recovery cycle?
If tenants are concentrated in just a few large AI firms, is there hidden exposure to single-customer risk or credit tightening?
Is cloud infrastructure financing evolving into something closer to a financial product rather than a service model?
If GPU prices fall or rental rates decline, will asset-heavy platforms be forced to release inventory early, pushing the market into oversupply?
If the asset-leverage model cools down, could it shrink the margin space for other players and reshape competitive dynamics?

These questions are not meant to forecast a crash. They are meant to examine the logic of how this model actually works.

Because the more universally accepted something becomes, the more likely it is to be where a narrative break begins.

4. What If This Is Not Just a Technology Cycle but a Financial Narrative Taking Shape?

From 2023 to 2025, the story of GPU cloud has shifted. It is no longer just about who runs the fastest models or holds the most powerful compute.

Winning this race increasingly depends on who can secure GPUs early, deploy clusters quickly, and use capital leverage to gain market share. On the surface, it appears to be a competition over infrastructure. But beneath that, it is a contest of liquidity and asset deployment efficiency.

When supply is tight, rental rates are high, and capital is abundant, the strategy seems flawless. Prepaid contracts become purchase orders. Orders turn into server deployments. Servers convert into cash flows and future financing. Every step relies on a single assumption that someone will take the next handoff.

It is this assumption that entangles asset cycles, rental models, and capital markets into a structurally reflexive system. As long as the belief holds, expansion continues.

The rise of asset-leveraged platforms has not only introduced new competitors, it has also reshaped the rules of the game. Cloud platforms once centered on technical strength are now pressured to compete on capital efficiency.

For large-scale platforms, this structural risk appears manageable. Their diverse customer bases, multiple revenue streams, and more stable financials provide room to absorb shifts in demand or rental rates.

But for smaller players, the dynamics are different. When liquidity tightens, tenant appetite fades, or depreciation accelerates, GPUs once used as leverage can quickly become burdens. The expansion model built on belief and scale can reverse as soon as trust begins to crack.

From this perspective, the rise of asset-leveraged platforms is not simply a reflection of the AI wave. It represents a deeper evolution, one driven by financial narratives.

This narrative turns cloud budgets, once seen as technical investments, into an asset-centered competition. And it is quietly rewriting the competitive logic and risk structures that define this market.

Conclusion: Time to Start Watching

As GPU cloud platforms evolve beyond technical infrastructure into a combination of capital assets and belief systems, we may need to shift how we observe them. Some key questions to begin with include:

Are GPU rental prices starting to decline?
Is there a mismatch between the release cycle of next-generation GPUs and the readiness of tenants’ applications and real-world demand?
As capital enthusiasm cools, could that impact the timing of future deployments and procurement?
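
To make the first of these questions concrete, one can ask how much of the purchase cost is recovered over three years if rental income erodes steadily. This is a rough sketch under stated assumptions: the function, the starting income, and the monthly decline rates are all hypothetical and chosen only to show the shape of the effect.

```python
def recovered_fraction(capex, start_monthly_income, monthly_decline, months=36):
    """Share of capex recovered when net rental income shrinks by a fixed rate each month."""
    total = 0.0
    income = start_monthly_income
    for _ in range(months):
        total += income
        income *= 1.0 - monthly_decline  # next month's income after the price drop
    return total / capex


# Placeholder scenarios (assumptions): flat pricing vs. 1% and 3% monthly erosion.
for decline in (0.0, 0.01, 0.03):
    share = recovered_fraction(capex=36_000, start_monthly_income=1_000,
                               monthly_decline=decline)
    print(f"monthly decline {decline:.0%}: {share:.0%} of capex recovered in 36 months")
```

In this toy setup, flat pricing exactly recovers the asset in 36 months, so any sustained decline leaves a shortfall that has to be covered by longer asset life, refinancing, or new expansion.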

These questions do not necessarily signal imminent risk. But they remind us of a broader truth: the more stability is collectively assumed, the more likely reflexive tensions are quietly building underneath.

With the rise of asset-leveraged platforms, the logic of cloud infrastructure is being reshaped. The traditional hyperscaler model built around comprehensive enterprise-grade services is now being challenged by three distinct forces:

the efficiency-first approach of custom infrastructure startups,
the flexibility of pure GPU leasing platforms,
and the high-leverage capital strategies of asset-driven players.

Among them, asset-backed platforms are moving the center of gravity. Their speed in both capital deployment and hardware rollout is turning attention from pure technical superiority to financial operating strength. This change not only alters the rhythm of expansion and risk but may also compel other platforms to adapt, adopt asset-based logic, and rethink what “competitive advantage” means in this space.

In this relay of assets and belief, the real question has never been who buys the GPU. It is who is still willing to take the next handoff.

*We use a three-year time frame as a lens because it aligns with hardware depreciation cycles, contract terms, and potential turning points in capital tolerance.

This article is part of our Global Business Dynamics series.
It explores how companies, industries, and ecosystems are responding to global forces such as supply chain shifts, geopolitical changes, cross-border strategies, and market realignments.

From NVIDIA to the Rack: The Real AI Deployment Battle Is Just Beginning

Jane Hsu — Tue, 03 Jun 2025 12:53:30 +0000

When we talk about artificial intelligence (AI), the spotlight usually stays on models, compute power, and chips. But the most critical phase, which is deployment, is often left out of the conversation. Getting from NVIDIA’s chips to a fully operational rack in a data center takes far more than engineering. It requires navigating manufacturing logistics, capital pressure, thermal limits, geopolitical shifts, and a changing platform landscape.

This article delves into four often-ignored challenges in AI deployment:

ODMs as Financial and Risk-Bearing Partners: Original Design Manufacturers (ODMs) like Quanta, Foxconn, and Inventec have evolved from mere assemblers to key players bearing financial and supply chain risks.

Liquid Cooling as a Performance Ceiling: Efficient thermal management, particularly through liquid cooling, has become essential to maintain AI server performance and reliability.

Geopolitical Influences on Assembly Locations: The choice of assembly locations is increasingly driven by geopolitical strategies, impacting data sovereignty and security.

Full-Stack Delivery Redefining Platform Boundaries: The ability to deliver integrated systems is reshaping the control and influence within AI platforms.

For product leaders, infrastructure startups, and industry analysts, understanding these factors is crucial to navigating the evolving AI landscape.

The Overlooked Challenge in AI: Deployment

While NVIDIA’s Blackwell platform, OpenAI’s GPT-5, and AWS’s Trainium processors dominate discussions, the deployment phase remains underrepresented. Before AI systems become operational, they must undergo assembly, integration, cooling, testing, and delivery into data centers.

This journey starts with a chip from NVIDIA and culminates in a data center rack. Along the way, it involves Taiwanese manufacturing facilities, assembly lines in Vietnam and Mexico, liquid cooling module designs, yield coordination, and significant capital investments in pre-purchased components. These elements are fundamental to the seamless operation of AI models like GPT.

The future of AI is not solely defined by model architectures but also by the physical infrastructure, including motherboards, thermal modules, GPU preorders, and capital turnover cycles. Control over this infrastructure equates to influence over the next wave of AI platform power.

ODMs: From Assemblers to Strategic Partners

Companies such as Quanta, Foxconn, and Inventec were traditionally viewed as low-margin assemblers. Today, they play a pivotal role in ensuring timely AI system deliveries. These ODMs not only assemble full systems but also invest upfront to secure GPUs and CPUs under buy-and-sell arrangements, assuming capital pressures and supply risks.

This evolution signifies a shift from mere manufacturing to becoming financial backbones and deployment guarantors within the AI platform supply chain.

Thermal Management: The Hidden Bottleneck

As NVIDIA’s GB200 and GB300 gain prominence, market attention often centers on GPU performance and memory bandwidth. However, the primary obstacle to rapid AI server deployment lies in thermal management. Reliable and integrated liquid cooling systems have become top priorities for Cloud Service Providers (CSPs) when selecting suppliers.

Previously under-the-radar component manufacturers like Auras and Asia Vital Components are now essential to maintaining system stability, highlighting that effective thermal solutions are as critical as computational speed.

Geopolitical Considerations in Assembly Locations

The increasing assembly of AI servers in Vietnam, Mexico, or Tennessee is not merely a cost-driven decision. It reflects strategic moves by the United States to control computing locations and define data security boundaries. Manufacturers are adapting to a new kind of infrastructure competition, driven by the need for sovereign deployment.

For many CSPs, the origin of server assembly has become a focal point in assessing the risks associated with deploying AI models.

Integrated Delivery: Redefining Platform Control

With NVIDIA stepping back from directly shipping AI servers and ZT Systems being acquired by AMD, ODMs are now engaging directly with CSPs. The capability to deliver complete systems has transformed certain manufacturers into extensions of AI platforms themselves.

This shift underscores that those who can deliver entire racks effectively control the timelines and rhythms of AI platform operations.

Navigating the New AI Deployment Landscape

The forthcoming AI battleground is not about who trains models faster or deeper. It’s about who can reliably deliver fully integrated, stable, and financially backed systems on time. The real bottlenecks now lie in cooling systems, rack integration, working capital, and strategic manufacturing site selection.

This supply chain from NVIDIA to the rack signals a broader industrial transformation driven by deployment capacity, geopolitical decisions, capital constraints, and the redistribution of platform power.

If you are:

A Technical or Product Decision-Maker: This insight will help you understand the physical limitations of AI deployment and anticipate risks in future system designs.
An Infrastructure Startup or Systems Architect: This perspective will reshape your evaluation of platform partnerships, module reliability, and manufacturing alignment.
An Investor or Industry Analyst: This analysis offers a pathway to an often-overlooked value shift from chipmakers to server integrators, cooling specialists, and manufacturing hubs positioned for the next growth cycle.

The true battleground of AI lies not just in chips but within factories, financial strategies, and the yet-to-be-delivered racks awaiting deployment.

This article is part of our Taiwan Tech and Market Shifts series.
It explores how Taiwan’s tech industries are adapting to global shifts in supply chains, manufacturing, policy, and innovation.

NVIDIA’s Leadership in AI: Key Insights from Jensen Huang’s GTC Keynote

Jane Hsu — Fri, 28 Mar 2025 10:09:53 +0000

We’ve explored the evolution of AI, NVIDIA’s strategic positioning, and its impact at each stage. The breakthrough of the GeForce 5090 will drive the shift from Perceptual AI to Generative AI.

Next, Agentic AI will evolve into Physical AI, and these two will eventually merge, creating a profound real-world impact.

While NVIDIA has established itself as the dominant player in the AI ecosystem, the varying hardware needs across industries and use cases will spur competitors to find more cost-effective alternatives. As technology advances, this competition will only intensify.

Our Perspective

1. What Did We Learn from Jensen Huang’s Keynote?

Last year, NVIDIA made its return to in-person events, creating an atmosphere akin to a rock concert. Some even called it the “Woodstock of AI,” while this year’s event has been dubbed the “Super Bowl of AI.” As the GPU Technology Conference (GTC) continues to grow in scale and influence, AI has firmly positioned itself at the forefront of global technological advancement.

Drawing on Jensen Huang’s keynote, we mapped the evolution of AI, NVIDIA’s strategic positioning, and its industry impact at each stage. As Table 1 shows, AI development unfolds in four phases, and NVIDIA plays a crucial role in each by supplying essential computing power through its GPU technology.

Table 1 NVIDIA’s AI Strategy

AI Development Stage	Key Features	NVIDIA’s Strategy
Perceptual AI	Enables AI to understand the world using deep learning technologies to support developments in fields like computer vision, speech recognition, and natural language processing (NLP), including facial recognition, object detection, voice assistants (Siri, Alexa), text classification, and sentiment analysis.	Provides GPU computing resources to drive deep learning. Develops the CUDA platform and Tensor Core accelerators to speed up training, improve efficiency, and support research and innovation in deep learning.
Generative AI	AI not only understands but also creates content, such as transforming text into images (e.g., Stable Diffusion, DALL·E), text into videos (e.g., Sora, Runway Gen-2), and accelerating innovations in fields like biotechnology.	Provides AI training infrastructure like the DGX supercomputer, and develops dedicated AI chips (e.g., H100, B200) to support large-scale AI training. Utilizes CUDA and TensorRT to accelerate inference, enabling AI to generate diverse content rapidly.
Agentic AI	AI evolves from passive to active, becoming AI agents that can autonomously execute tasks, such as searching for information, organizing reports, reading articles, or watching videos to learn new knowledge, and integrating with tools to automate coding or data processing.	Provides model-building frameworks (e.g., NeMo) to help businesses create their own AI agents. Strengthens GPU AI inference capabilities with chips like the B200 and Grace Hopper. Promotes the AI software ecosystem to enable AI to control multiple software tools and systems.
Physical AI	AI understands the physical rules of the real world and applies them in areas such as intelligent robots (e.g., Amazon’s automated warehouse, Tesla’s Optimus robot), digital twin technologies (e.g., smart city simulations, robot training), autonomous driving, robot navigation, and AR/VR technologies.	Develops AI simulation platforms (e.g., Omniverse) to assist businesses in training robots and conducting digital twin simulations. Enhances the computational power of AI chips to enable real-time environmental perception and decision-making. Collaborates with industrial partners to apply AI in automated manufacturing and warehouse logistics.

Source: Researcher and Research

NVIDIA is evolving into the central supplier of the AI ecosystem, moving beyond its role as a GPU manufacturer. By actively integrating technologies such as CUDA, Omniverse, AI agents, and robotics, it is creating an irreplaceable software-hardware moat.

This shift suggests that Agentic AI could disrupt SaaS, enterprise software, marketing models, and decision-making processes, with significant implications for e-commerce strategies. More importantly, the next phase of AI development will extend beyond software, influencing hardware devices, automation applications, and physical-world AI technologies. For example, Physical AI is poised to transform industries like manufacturing, logistics, autonomous vehicles, and other automation sectors.

Next, we will delve deeper into these topics.

2. How NVIDIA Bridges Perceptual AI to Generative AI

In his keynote, Jensen Huang highlighted the technological advancements of the GeForce 5090, including a 30% reduction in size, a 30% improvement in cooling efficiency, and performance that far exceeds the 4090. We view this as a critical “computational infrastructure” for NVIDIA, bridging the development path from Perceptual AI to Generative AI.

Through advanced chip fabrication, packaging technologies, and optimized architectures, the 5090 significantly boosts GPU performance, accelerating AI computations across applications such as gaming rendering, 3D design, and AI-generated content.

NVIDIA has deeply integrated AI into the core of its GPUs, incorporating AI-assisted rendering techniques like 100% path tracing and AI-based pixel completion to demonstrate AI’s evolving role in graphics processing. This has the potential to disrupt workflows in the gaming and animation industries, while also transforming computational methods used across software and content creation sectors. This breakthrough is poised to make a lasting impact on game development, animation production, professional computing, and AI training.

3. Will the Next Wave Be Agentic AI or Physical AI?

The next phase in the evolution of Generative AI will likely move toward either Agentic AI or Physical AI. While there is no strict order, their relationship can be understood from a technological development perspective.

Agentic AI refers to AI systems with autonomous decision-making capabilities. These systems can perform tasks based on goals, environmental changes, and context, rather than simply responding to commands. Agentic AI primarily develops in the “digital world,” giving AI autonomy but no direct influence over the “physical world.”

In contrast, Physical AI involves AI systems capable of performing actions in the real world, interacting physically with their environment. Common applications include robots, autonomous vehicles, and smart factories. Physical AI depends on Agentic AI for decision-making and incorporates technologies like sensors and motion control to facilitate real-world interactions. Therefore, Physical AI can be seen as the natural evolution of Agentic AI.

Ultimately, these two areas will converge. The future of Physical AI will likely be driven by advanced Agentic AI, enabling robots and autonomous vehicles to make independent decisions and exert real-world influence. This is why NVIDIA is initially focusing on developing technologies related to Agentic AI before advancing into Physical AI.

3.1 Agentic AI Will Redefine Software Companies

Agentic AI represents the next evolution in AI, shifting from a passive “responder” to an active, autonomous decision-maker. In the future, AI will not only react to commands but will also perceive its environment, understand context, and engage in reasoning, planning, and action. It will even use tools to execute tasks. With the ability to browse the web, read texts, watch videos, and learn from these interactions, Agentic AI will evolve beyond relying on fixed datasets and will possess “self-learning” capabilities.

This transition will have a profound impact on industries like search engines, digital marketing, SEO, and e-commerce recommendation systems, as AI moves from being a supportive tool to becoming the ultimate decision-maker. For instance, AI could autonomously optimize marketing campaigns or adjust e-commerce strategies based on real-time data and insights.

Furthermore, this shift will challenge traditional SaaS systems, such as CRM and ERP software, which businesses currently rely on. AI will no longer be just an enhancement to these systems; businesses will require AI agents with decision-making abilities to directly execute tasks. Traditional software, which relies on manual inputs and predefined workflows, will no longer meet the demands, pushing companies toward more adaptive and intelligent solutions driven by Agentic AI.

3.2 Physical AI and the Robotics Revolution: A Turning Point for Manufacturing and Logistics

NVIDIA envisions Physical AI not only enabling AI to understand data but also allowing it to influence the physical world. This technological leap will drive advancements in robotics, transitioning from rigid, fixed production lines to more flexible, intelligent robots. Physical AI’s ability to understand concepts like friction, inertia, causality, and object permanence (i.e., objects don’t disappear, they are just temporarily hidden) demonstrates how AI is beginning to comprehend the physical world in a way that empowers robots to learn and adapt autonomously.

This evolution will significantly impact industries such as smart logistics, autonomous vehicles, warehouse management, and even urban planning. As AI becomes more adept at interacting with the real world, it will enhance systems reliant on precise physical interactions. Autonomous vehicles and robots within logistics and warehouse operations will be able to navigate complex environments, performing tasks with greater efficiency, safety, and flexibility.

Consequently, logistics and supply chain companies must closely track the development of robotics and AI computing solutions. These innovations are set to reshape operations and business models in industries that depend on physical tasks and movements.

4. Conclusion and Discussion

4.1 NVIDIA: From GPU Manufacturer to AI Ecosystem Leader

Jensen Huang highlighted how GeForce played a key role in promoting CUDA globally, catalyzing the rise of AI, and now, AI itself is revolutionizing the world of computer graphics. This signifies that NVIDIA has evolved beyond its roots as a graphics computing company, positioning AI computing as its central competitive advantage.

In the past, real-time graphics rendering computed each pixel mathematically. Today, with path tracing, only some pixels are rendered mathematically while AI infers the rest, so AI directly participates in graphics creation. This suggests that future GPUs will likely integrate AI computing even more deeply. The shift is not only transforming gaming but is poised to extend into fields like medical imaging, scientific simulations, and 3D design.

NVIDIA has fully integrated AI with its GPUs, creating a powerful synergy between hardware and software. Moving from CUDA to AI computing, and now to AI-enhanced graphics, NVIDIA has transitioned from being a graphics card company to a leader in AI computing. AI is no longer just influencing computation methods; it now plays an active role in decision-making and data processing, understanding context, generating responses, and retrieving information to enhance understanding.

NVIDIA’s software strategy—spanning CUDA, Omniverse, and generative AI models—has become its core strength, shifting away from a hardware-centric business model. Instead, NVIDIA has locked industries into its software ecosystem, creating a robust competitive advantage. Specifically:

CUDA ties developers’ AI workloads to NVIDIA chips, reinforcing its market dominance.
AI Agents enable businesses to deploy AI decision-making systems directly on NVIDIA platforms, enhancing operational efficiency.
Omniverse leads the charge in the industrial metaverse, making NVIDIA the exclusive leader in this space.
Physical AI opens new opportunities in robotics, positioning NVIDIA as a key player in the automation future.

Through this integrated hardware-software strategy, NVIDIA has evolved from simply “selling GPUs” to controlling the entire AI industry’s computational and development landscape, making it increasingly difficult for competitors to challenge its dominance.

4.2 How Will the AI Industry’s Competitive Landscape Evolve?

The AI industry has evolved into a core technology domain, transitioning from a specific application to a disruptive force across multiple sectors. Each wave of AI innovation presents three primary challenges: data, training, and scalability. These factors will play a pivotal role in shaping the future competitive landscape, particularly in areas like data acquisition, computing performance, and large-scale deployment.

NVIDIA has established a near-monopoly in the AI training market, even as industry leaders such as Google, Meta, and OpenAI pursue alternatives of their own in both AI training and inference. While NVIDIA maintains its dominance in AI training, the focus of future competition will shift toward inference: reducing costs, enhancing efficiency, and developing specialized chips. As ASIC (Application-Specific Integrated Circuit) technology progresses, competitors are intensifying efforts to create cost-effective solutions and secure breakthroughs in the inference market.

4.2.1 High-Performance Inference Chips (ASIC vs. GPU)

NVIDIA’s GPUs remain dominant in the high-performance computing sector. However, with the rise of specialized AI chips, particularly in cloud AI inference and edge computing, the cost advantages of ASIC technology are becoming more apparent. Notable examples include Google’s TPU, AWS’s Inferentia, Meta’s custom AI chips, and Tesla’s Dojo, all of which are increasingly challenging NVIDIA’s market leadership. As these competitors’ solutions mature, GPUs may face increasing challenges from ASICs, especially in inference applications, and potentially even in training.

4.2.2 Software-Hardware Integration and Ecosystem Development

Beyond hardware competition, the AI software ecosystem has become a crucial battleground. NVIDIA leverages its ecosystem, including CUDA and TensorRT, to strengthen its market position. However, Google, AWS, and Meta are actively investing in their own AI software stacks, including frameworks such as TensorFlow (Google) and PyTorch (originally from Meta), to reduce reliance on NVIDIA’s technology. In the future, as AI software and hardware become more tightly integrated, major companies will focus on building their own ecosystems while attempting to weaken NVIDIA’s influence in the software domain. The competition will center not only on hardware performance but also on the ability to establish seamlessly integrated software ecosystems.

4.2.3 Cloud vs. Edge AI Computing

Cloud computing remains the primary platform for AI training, but the rise of Edge AI is driving a significant shift. As applications like smart vehicles, automated factories, and IoT devices expand, the demand for edge computing continues to grow. Products such as NVIDIA Jetson and Tesla’s FSD chips are increasingly establishing strong footholds in the Edge AI market. As Edge AI progresses, these chips—designed for autonomous driving, smart devices, and industrial automation—will compete with large-scale cloud AI training platforms, collectively accelerating the adoption and evolution of AI technology.

4.3 Why Do So Many Companies Challenge NVIDIA Despite Its Unshakable Dominance?

NVIDIA’s unshakable dominance in AI stems from its highly integrated ecosystem, making it difficult for competitors to surpass in the short term. However, many companies continue to challenge its position, reflecting the complexity of the industry’s competitive landscape.

Technology evolves rapidly, and different applications have distinct hardware requirements. In some areas, specialized ASICs offer more cost-effective solutions than GPUs. Consequently, companies like Google, Meta, and AWS are developing their own custom chips to reduce reliance on NVIDIA’s products.

In summary, while NVIDIA’s leadership in AI remains formidable, competitors persist in challenging its position due to the diverse hardware demands across industries and applications, as well as the drive for more cost-efficient alternatives. This means NVIDIA will continue to face competition, particularly in the AI inference market.

This article is part of our Future Scenarios and Design series.
It explores how possible futures take shape through trend analysis, strategic foresight, and scenario thinking, including shifts in technology, consumption, infrastructure, and business models.

OpenAI’s Trademark Strategy: The Potential Move into the Hardware Market

Jane Hsu — Thu, 20 Mar 2025 08:52:34 +0000

OpenAI, now a focal point in the global AI tech sector, has recently registered trademarks in areas such as humanoid robots, VR headsets, AR glasses, smart jewelry, and smartwatches. These actions seem to hint at the company’s future growth trajectory. We believe that OpenAI’s trademark registrations are driven by several considerations: expanding its product line and market influence, protecting its brand in the face of competition, adapting to the trend of AI technology merging with hardware, exploring emerging fields and future technologies, and seeking collaboration and partnership opportunities.

However, we argue that these moves should be seen as strategic actions to strengthen OpenAI’s AI ecosystem rather than an indication of a full-scale entry into the consumer hardware market. The core objective is likely to maintain flexibility for future hardware ventures while enhancing the computational power of its AI models, thereby solidifying its competitive advantage in the global AI space.

As artificial intelligence (AI) continues to develop rapidly, OpenAI has become a central player in the global tech scene. Renowned for its cutting-edge AI technologies, such as the GPT series and deep learning capabilities, OpenAI has made significant strides in AI software. Recently, however, the company has taken steps in the hardware sector, registering multiple trademarks for products related to humanoid robots, virtual reality (VR) headsets, augmented reality (AR) glasses, smart jewelry, and smartwatches. These actions have sparked industry attention and suggest that OpenAI may be positioning itself to expand beyond software.

Our Perspective

1. Overview of OpenAI’s Trademark Registrations

According to reports from Forbes, OpenAI has recently registered trademarks related to hardware devices across several categories, including:

Humanoid Robots: OpenAI may be exploring how to integrate its AI systems into humanoid robots to enhance their intelligence and interactivity.
Virtual Reality (VR) Headsets and Augmented Reality (AR) Glasses: These devices rely on advanced computer vision and AI technologies to create immersive experiences. OpenAI could be planning to incorporate its AI technology into such devices to boost their computational performance and improve user interaction.
Smart Jewelry and Smartwatches: These wearables combine biometric sensors and health monitoring technologies. OpenAI’s trademark registrations suggest an interest in high-end wearables, potentially featuring AI-driven smart assistants.

2. Potential Considerations Behind OpenAI’s Trademark Registrations

Based on OpenAI’s recent trademark activities, we can infer several possible motivations behind these moves:

2.1 Expanding Product Line and Market Influence

While OpenAI has long been recognized for its AI software, these trademark registrations suggest a proactive move to extend its product line into hardware. This expansion could enhance its brand image, transitioning from a purely software-focused entity to a company deeply integrated with essential hardware in daily life.

2.2 Brand Protection and Market Competition

Trademark registrations primarily serve to protect a brand, preventing competitors from claiming similar market spaces. As more companies rush into the smart hardware market, OpenAI’s actions not only protect the market position of its future hardware products but also guard against potential infringement in the same areas.

2.3 The Trend of Merging AI Technology with Hardware Devices

OpenAI’s excellence in natural language processing and deep learning provides a solid foundation for its expansion into hardware. Products such as humanoid robots, VR/AR headsets, and smart wearables require advanced computational capabilities and intelligent interaction. If these hardware products successfully integrate OpenAI’s AI technology, they could capture significant market share and establish a strong competitive edge.

2.4 Exploration of Emerging Fields and Future Technologies

Recent trademarks related to VR and AR indicate that OpenAI is exploring the virtual and augmented reality sectors. As these fields develop, they are poised to become key directions for the tech industry. OpenAI’s trademark registrations demonstrate its keen interest in this area and possibly signal plans to develop VR/AR solutions integrated with AI technologies, opening up new markets.

2.5 Collaboration and Partnership Opportunities

OpenAI may seek partnerships with existing hardware manufacturers or tech companies to co-develop AI-integrated smart hardware devices. By registering these trademarks, OpenAI is protecting its brand while paving the way for future collaborations, creating opportunities for joint technology and product development.

2.6 Summary

OpenAI has recently registered trademarks across various hardware domains, ranging from humanoid robots to smart wearables, clearly showcasing its strong interest in emerging technological fields. These registrations not only indicate that OpenAI is exploring ways to integrate its powerful AI technology into hardware products, but also suggest that the company intends to expand its scope beyond software and become a comprehensive technology company. As AI technology continues to evolve, OpenAI has the potential to not only make strides in the software sector but also shine in the hardware market, potentially reshaping the future landscape of consumer electronics.

However, the question remains: does this indicate that OpenAI will actively enter the consumer hardware market, or is it simply a strategic move in its broader hardware strategy? To clarify this, we will analyze OpenAI’s intentions from a strategic perspective.

3. Discussion: Potential and Strategic Interpretation of OpenAI’s Hardware Strategy

From a strategic viewpoint, OpenAI’s recent trademark registrations suggest the company may have plans for further developments in the hardware space. But does this mean OpenAI will actively pursue the consumer hardware market? In this context, let us explore OpenAI’s hardware strategy, beginning with its collaboration with Broadcom, followed by its strategic alliance with Microsoft, and concluding with an overview of other AI companies’ hardware developments.

3.1 Collaboration with Broadcom to Develop ASIC Chips

OpenAI’s core strength lies in its advanced AI models (such as GPT-4 and GPT-5) rather than in hardware technology. Partnering with Broadcom to develop ASIC chips is aimed at enhancing the computational performance of AI models and reducing operational costs. ASICs (Application-Specific Integrated Circuits) are custom-designed chips that offer significant advantages in computational efficiency and energy consumption compared to GPUs, which is crucial for improving training and inference efficiency.

Currently, NVIDIA dominates the AI training and inference market, but OpenAI’s heavy reliance on NVIDIA GPUs exposes it to supply chain risks and price volatility. Developing its own ASIC chips could reduce dependence on NVIDIA and increase self-sufficiency, helping OpenAI gain a competitive edge over tech giants like Google (with its TPU) and Meta (with in-house AI chips).

3.2 Strategic Partnership with Microsoft

Microsoft is a key investor in OpenAI and provides powerful cloud infrastructure for the company. This partnership could influence whether OpenAI develops its own AI hardware independently. If OpenAI continues to collaborate with Microsoft, it is more likely to align its hardware development with Microsoft’s existing AI infrastructure (such as Azure and its proprietary AI chips) rather than launching standalone consumer hardware products. Microsoft is also actively developing AI hardware infrastructure, which could leave OpenAI increasingly dependent on Microsoft for hardware while it focuses on the development and innovation of AI models.

3.3 Hardware Strategies of Other AI Companies

Currently, major tech companies are actively developing AI hardware, with a focus on custom-designed hardware to enhance the computational efficiency and performance of AI models. By analyzing the strategies of these competitors, we can better understand whether OpenAI is likely to enter the consumer hardware market and the potential challenges and opportunities it might face:

3.3.1 Google: Focus on Developing TPU Chips and Expanding in AR/VR

Google’s hardware strategy began with its development of Tensor Processing Unit (TPU) chips, which are specifically designed to accelerate deep learning tasks and have been highly effective in Google Cloud. In addition to its hardware infrastructure, Google has also integrated TPU technology into consumer hardware products such as Pixel smartphones, AI PCs, and most recently, AR/VR devices. This has positioned Google as a leader in the AI hardware field, with ambitions to embed AI into everyday consumer products. Google’s expansion into AR/VR, particularly through products like Google Glass and other wearables, signals its commitment to the next generation of AI-driven hardware.

3.3.2 Meta: In-House AI Chips and Expansion into AR/VR Hardware

Meta focuses on developing its own AI hardware to reduce reliance on NVIDIA GPUs. The company has created several proprietary AI processors, which are deployed in its data centers and used for machine learning tasks. Concurrently, Meta is heavily investing in expanding its AR/VR hardware portfolio, including the Oculus VR headsets and related devices. This strategy positions Meta as a key player in the AI hardware market, particularly in virtual reality.

3.3.3 NVIDIA: Continued Leadership in the AI Market with Advanced Chips and AI Servers

As the current leader in the AI market, NVIDIA maintains its dominance in AI training and inference with its powerful GPU architecture. The company continues to release more advanced chip series optimized for large-scale data processing and AI model acceleration, further solidifying its leadership in AI training. Additionally, NVIDIA is actively expanding its AI server business, providing robust computational support to data centers worldwide, thus enhancing its influence in the AI hardware space. Like OpenAI, NVIDIA depends on efficient hardware to support AI models, but NVIDIA’s advantage lies in the hardware itself, and that advantage has been further strengthened.

3.3.4 Summary

Taken together, these companies’ hardware strategies suggest that, as AI technology continues to advance, the integration of hardware and software will become the primary area of competition. Compared to these companies, OpenAI’s current hardware strategy is more focused on improving the performance of AI training and inference infrastructure than on directly entering the consumer hardware market. OpenAI’s current strategy therefore appears to be a long-term plan aimed at enhancing its AI infrastructure competitiveness rather than an immediate push into consumer hardware.

4. Conclusion

OpenAI’s recent trademark registrations clearly demonstrate its interest in the hardware market. However, these moves should be viewed as strategic actions to strengthen its AI ecosystem, rather than a full-scale push into the consumer hardware market. The core objective of these efforts is to enhance the computational power of its AI models, reduce reliance on external hardware providers, and further develop more competitive AI technology.

In summary, while OpenAI is exploring the hardware space, its fundamental goal remains to maintain a competitive edge in the AI software and hardware sectors. Therefore, OpenAI’s hardware initiatives should be understood as steps to support its AI development, rather than a signal of its deep entry into the consumer hardware market. OpenAI is currently focused on enhancing its AI capabilities; however, as it consolidates its position in the AI field, the company may eventually include specialized hardware in its development strategy to further improve the performance of AI models and maintain global technological leadership in AI.

This article is part of our Global Business Dynamics series.
It explores how companies, industries, and ecosystems are responding to global forces such as supply chain shifts, geopolitical changes, cross-border strategies, and market realignments.

Exploring Weak Signals: Broadcom’s Perspective on AI Training ASICs

Jane Hsu — Fri, 14 Mar 2025 08:39:21 +0000

In the rapidly evolving AI hardware space, discussions often center around the competition between different chip architectures, particularly GPUs and ASICs (Application-Specific Integrated Circuits). While NVIDIA’s GPUs have traditionally dominated AI training, ASICs have generally been seen as more suitable for the AI inference stage. However, Broadcom’s recent commentary on AI training-specific ASICs during its earnings call has caught our attention, potentially signaling a subtle shift in the industry’s understanding of AI workloads and the role of custom chips.

Our Perspective

1. Broadcom’s Signal

Broadcom’s focus on the AI training market has sparked our reflection. It is widely assumed that ASICs are best suited for inference, which involves processing large volumes of low-latency, high-frequency operations during the deployment phase of AI models. In contrast, GPUs—especially NVIDIA’s GPUs—have been the dominant force in AI training, which requires significant computational power and data model adjustments.

However, Broadcom’s mention of AI training-specific ASICs during its earnings call seems to contradict this conventional wisdom. The company revealed that it has already developed custom accelerators for high-end clients (reportedly Google, Meta, ByteDance, and OpenAI, though unconfirmed) to train cutting-edge models. Broadcom emphasized that ASICs could scale and provide the necessary performance for training large models, indicating an untapped potential in the training market.

2. Weak Signal: Is Broadcom Noticing Future Trends?

Typically, we view NVIDIA’s GPUs as the dominant force in AI training, while ASICs are more effective in inference due to their ability to execute repetitive computations with custom design, minimizing costs and maximizing efficiency for specific models. After two years of AI training market growth, the market may shift toward inference in the coming years.

However, Broadcom suggests that its future serviceable addressable market (SAM) of $60B to $90B will largely focus on AI training, with large clients and partners expected to adopt ASICs for training. Broadcom’s extensive technological advantages in front-end IC design, key IP integration, back-end IC layout, and wafer fabrication, along with its capabilities at the system and rack level, provide a significant competitive edge in the ASIC space.

Broadcom’s view on AI training and ASICs could be a weak signal (defined as subtle or seemingly insignificant signs pointing to a larger, widely unrecognized trend), indicating that AI training and ASICs may integrate in ways the market has not widely anticipated.

3. Driving ASIC Adoption in AI Training

Several factors may drive ASIC adoption in AI training:

3.1 ASIC vs. GPU: Efficiency Advantage

While GPUs offer flexibility and robust computational power, ASICs’ customized designs can optimize specific tasks, significantly reducing power consumption and enhancing performance for particular model training. Deep learning training ASICs could outperform general-purpose GPUs, especially for fixed and predictable workloads, such as large language model training.

3.2 Customization Demand from Large Enterprises

Massive companies like Google, Meta, and OpenAI are increasingly looking to optimize hardware specifically for certain AI workloads. ASICs’ high degree of customization can be tailored to specific training tasks, greatly boosting performance. This is crucial for these companies, which are pushing the boundaries of AI and handling cutting-edge models with enormous computational needs.

3.3 Scalability Requirements

Broadcom’s interest in AI training ASICs aligns with the demand for computational scalability from large enterprises. These companies need not only single-accelerator solutions but also the ability to scale out to meet the demands of training large-scale models. The scalability of ASICs, especially for training large language models or other advanced AI systems, will help make these companies’ research more efficient.

4. Broadcom’s Target Market: Large Enterprises and Hyperscalers

Broadcom’s strategic focus is on large enterprises and hyperscale cloud companies, aligning with its strengths in large-scale enterprise hardware solutions. In fact, Broadcom has already secured multiple hyperscale companies as clients, which may indicate that large enterprises are open to transitioning to ASICs for model training. This suggests that AI infrastructure could undergo a transformation, shifting from GPU-dominated training workloads to a blend of ASIC and GPU solutions.

5. Future Impact: How Will the Market Respond?

If Broadcom’s forecast proves correct, the AI hardware market may undergo a transformation. The success of ASICs would represent a fundamental shift in AI training infrastructure, potentially moving away from GPUs as the primary hardware to a future where specialized accelerator chips are designed for different stages of AI development.

However, these are still early signs, and many questions remain:

Can ASICs truly deliver the efficiency advantages that GPUs cannot match?
Will large enterprises’ demand for ASICs continue to grow?
Can Broadcom maintain its leadership in this space?

These questions will shape the future of AI training and hardware customization, revealing more challenges in the market.

Summary

Broadcom’s perspective on AI training and ASICs may signal a shift in AI hardware design and deployment. Although this idea is not yet widely recognized in the market, if Broadcom’s view gains traction, AI hardware infrastructure may evolve toward more customized and specialized solutions. The competition and convergence between ASICs and GPUs will become critical. In the coming years, the use of ASICs in AI training could be a driving force that reshapes the AI infrastructure landscape. This would not only represent a technological advancement in AI hardware but also influence the strategies of large enterprises and cloud providers regarding computational demands, pushing AI training toward a more specialized and customized future.

This article is part of our Global Business Dynamics series.
It explores how companies, industries, and ecosystems are responding to global forces such as supply chain shifts, geopolitical changes, cross-border strategies, and market realignments.

Q1 2025 semiconductor market decline: Seasonal adjustment or sign of technical recession?

Jane Hsu — Tue, 25 Feb 2025 03:59:56 +0000

Major semiconductor companies generally expect revenue to decline in Q1 2025, with an average drop of approximately 9%, significantly exceeding the historical average of 5%. To determine whether this is a temporary adjustment or a sign of a deeper downturn, we reviewed past semiconductor market troughs and believe that demand in Q1 2025 remains stable. Key factors influencing the semiconductor industry this year include demand for AI servers, corporate IT spending, and developments in the Chinese market. Overall, while the global semiconductor market may face growth deceleration, it is expected to maintain an upward trajectory. Even in a worst-case scenario, a full-scale recession is unlikely.

According to SemiWiki, the global semiconductor market reached $170.9 billion in Q4 2024, showing a 17% year-on-year growth and a 3% quarter-on-quarter increase. The total market size for the year is projected at $628 billion, up 19.1% compared to the previous year. Looking ahead to Q1 2025, major semiconductor companies are expecting a general decline in revenues due to seasonal factors, inventory excess, and economic uncertainties, with an average decrease of around 9%. Nine companies, including SK Hynix, Qualcomm, and AMD, expect revenue growth, while seven companies, including Infineon and Renesas, anticipate declines. Furthermore, the market growth forecast for 2025 ranges from 7% to 15%, with AI servers being the primary growth driver. However, weak demand from smartphones, PCs, automobiles, and industrial sectors, along with global economic risks (such as potential U.S. tariff hikes), could further impact market performance.

Our Analysis

Historical data shows that in the past decade, the global semiconductor market has experienced a decline in Q1 nine times, with an average decrease of around 5%. However, the expected 9% decline in Q1 2025 far exceeds the historical average, suggesting that this downturn may not just be a seasonal adjustment but could signal the onset of a technical recession in the semiconductor industry.

To better assess whether the Q1 2025 downturn is a short-term adjustment or a sign of deeper recession, we will review several recent semiconductor market downturns, analyzing the primary factors and demand shrinkage patterns. As shown in Table 1, we have summarized semiconductor market recession scenarios from key periods such as 2011-2012, 2015-2016, 2018-2019, and 2022-2023, along with their underlying causes and demand trends, providing a comprehensive basis for judgment.

Table 1 Key factors and demand patterns behind recent semiconductor market downturns

Industry recession period and subsequent changes	Main causes of industry decline	Industry demand decline pattern
2011-2012: Q1 2012 declined by 7% YoY, with a rebound in H2 driven by the iPhone 5 release	European debt crisis (debt issues in Greece, Spain, Italy, etc.) led to global demand contraction	PC market shrinks rapidly; smartphone market growth slows; enterprise IT capital expenditure decreases; memory and consumer electronics demand declines
2015-2016: Q1 2016 declined by 6% YoY, with a rebound in H2 driven by China’s infrastructure push and 4G phone adoption	Slowdown in China’s economic growth; decline in smartphone market growth	PC market declines; smartphone demand drops; DRAM and NAND memory prices plummet
2018-2019: Q1 2019 declined by 10% YoY, with a recovery in H2 driven by 5G phone growth	U.S.-China trade war caused global supply chain reshuffling	Smartphone demand drops; server market temporarily declines while Intel faces CPU supply shortages; enterprise IT capital expenditure drops, and server and data center markets decline; DRAM memory prices plummet
2022-2023: Q1 2023 declined by 15% YoY, while AI server demand exploded in H2	High inventory and inflationary pressures; over-ordering during the pandemic, with inventory beginning to deplete in 2022; the Ukraine-Russia war led to a surge in global energy prices	U.S. Federal Reserve’s rapid interest rate hikes led to a contraction in the consumer market; smartphone and PC demand shrinks sharply; consumer electronics market crashes; DRAM and NAND memory prices plummet

Source: Researcher and Research

Based on this analysis, we believe that the Q1 2025 downturn is more likely to be a short-term demand contraction. The key drivers for the future development of the semiconductor industry will be AI server demand, corporate IT spending, and the Chinese market, which are crucial factors influencing industry trends in the coming year.

Key factors for 2025 Q1 semiconductor industry demand and future development:

1. AI server demand

AI server demand will be the key driver of semiconductor industry growth in 2025. In 2023 and 2024, AI server demand supported growth in the wafer foundry and overall semiconductor industries. However, if the growth rate of demand drops below 10%, it could negatively affect the supply chain for wafer foundries and servers, possibly leading to a price downturn in the DRAM and NAND markets.

This means that the continued large-scale investment in AI servers by cloud service providers like AWS, Google Cloud, Meta, and Microsoft will be a critical indicator for the future of the semiconductor industry. If these companies’ IT capital expenditures slow down, orders for high-end HPC GPUs (such as NVIDIA’s H100 and B100) could decrease, reducing the utilization rate of TSMC’s CoWoS advanced packaging capacity.

Currently, Amazon, Microsoft, Google’s parent company Alphabet, and Meta are expected to spend a combined total of over $320 billion on capital expenditures in 2025, a 30% increase over the $246 billion estimated for 2024. Although Microsoft has recently canceled some data center leases, it still plans to invest more than $80 billion in FY 2025 (down from an originally planned $85 billion). Additionally, NVIDIA’s data center business is still showing strong growth, and TSMC’s CoWoS capacity remains tight, suggesting that the outlook for AI server demand remains positive. Therefore, AI server demand will be the pillar of the semiconductor industry in the short term and will determine whether the Q1 2025 decline proves temporary.
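
The growth figures in the paragraph above can be checked with simple arithmetic. The short Python sketch below is only an illustration: the dollar amounts are the ones quoted in the text, while the function and variable names are our own and purely hypothetical.

```python
# Figures quoted in the text, in billions of US dollars.
CAPEX_2024_ESTIMATE = 246.0   # combined capex of the four companies, 2024 estimate
CAPEX_2025_EXPECTED = 320.0   # combined capex expected for 2025
MSFT_FY2025_ORIGINAL = 85.0   # Microsoft's originally planned FY 2025 investment
MSFT_FY2025_CURRENT = 80.0    # Microsoft's current FY 2025 plan


def growth_pct(previous: float, current: float) -> float:
    """Return the percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


capex_growth = growth_pct(CAPEX_2024_ESTIMATE, CAPEX_2025_EXPECTED)
msft_revision = growth_pct(MSFT_FY2025_ORIGINAL, MSFT_FY2025_CURRENT)

print(f"Combined capex growth: {capex_growth:.1f}%")     # 30.1%
print(f"Microsoft plan revision: {msft_revision:.1f}%")  # -5.9%
```

The result confirms the roughly 30% increase cited above, and shows that Microsoft’s revision trims its own plan by only about 6%.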

2. Global corporate IT spending

Global corporate IT spending is another key factor influencing semiconductor industry development in 2025. Corporate IT spending directly impacts demand for servers, PCs, enterprise SSDs, and enterprise GPUs, thus driving the overall semiconductor market. Server chips (e.g., Intel, AMD, Ampere), enterprise SSDs (e.g., Micron, Samsung, SK Hynix), and networking equipment (e.g., Broadcom, Marvell, Cisco) are all affected by fluctuations in corporate IT budgets.

When corporate IT budget growth is below 3%, semiconductor market demand typically slows down. Market research firms publish forecasts, but these figures may not immediately reflect changes in corporate IT spending. The capital expenditure growth rate of major cloud providers is therefore a key indicator to watch; if this growth rate drops below 5%, it may signal a conservative approach in corporate IT spending, negatively impacting the semiconductor market.

According to Gartner’s forecast, global corporate IT spending will grow 9.8% year-on-year in 2025, reaching $5.6178 trillion. At the same time, the capital expenditure growth rate of major cloud providers is expected to continue at 30%. These figures suggest that while there may be some uncertainty in the short term, global corporate IT spending remains at a high level, helping to support semiconductor market demand.

3. China Market

The China market plays a critical role in global semiconductor demand, especially in sectors like PCs, smartphones, and servers. As one of the main markets for these products, the recovery of China’s market will be a key factor influencing the global semiconductor industry in 2025. If the Chinese economy recovers, demand in consumer electronics and cloud markets could significantly rebound, driving semiconductor demand. However, continued economic weakness in China may lead to a market downturn similar to the 2015-2016 period.

Key indicators to observe include China’s GDP growth rate, the manufacturing PMI (Purchasing Managers’ Index), and whether the government introduces large-scale infrastructure or subsidy policies. Additionally, changes in the local smartphone market and capital expenditure growth among Chinese cloud companies (such as Alibaba, Tencent, and Baidu) are crucial reference points.

China’s PMI for January 2025 was 49.1, below the expected 50.1 and the previous 50.1, indicating weakness in the manufacturing sector. Although the National Bureau of Statistics has not yet released GDP forecasts for 2025, the IMF predicts China’s economic growth will reach 4.6%. The World Bank has also raised its forecast for China’s growth in 2025 to 4.5%, an increase of 0.4 percentage points from earlier projections.

Although a PMI below 50 and a GDP growth forecast under 4.5% typically indicate soft consumption in the Chinese market, the Chinese government’s proactive measures to promote semiconductor self-sufficiency and investments in related fields are expected to stimulate market demand, especially in telecommunications, industrial, and automotive sectors. Therefore, despite uncertainties in China’s economy, the semiconductor market still holds growth potential under the right policy-driven conditions.
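
The warning thresholds mentioned across the three factors above can be collected into a simple checklist. The Python sketch below is a minimal illustration, not a forecasting model: the readings and thresholds are those quoted in this article, the `Indicator` class and its field names are hypothetical, and the 10% AI server demand threshold is left out because the article gives no current reading for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    name: str
    value: float      # latest reading or forecast quoted in the article
    threshold: float  # warning level quoted in the article

    def is_warning(self) -> bool:
        # The article treats a reading below the threshold as a warning sign.
        return self.value < self.threshold


INDICATORS = [
    Indicator("Global corporate IT spending growth, % (Gartner)", 9.8, 3.0),
    Indicator("Major cloud providers' capex growth, %", 30.0, 5.0),
    Indicator("China manufacturing PMI, January 2025", 49.1, 50.0),
    Indicator("China 2025 GDP growth forecast, % (IMF)", 4.6, 4.5),
]

warnings = [ind.name for ind in INDICATORS if ind.is_warning()]

for ind in INDICATORS:
    status = "WARNING" if ind.is_warning() else "ok"
    print(f"{status:8}{ind.name}: {ind.value} (threshold {ind.threshold})")
```

With these inputs, only the China manufacturing PMI falls below its threshold. The World Bank forecast of 4.5% sits exactly at the GDP threshold rather than under it, so it would not be flagged either.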

Summary

Given current market trends, AI server demand remains strong, global enterprise IT spending stays high, and Chinese government support policies will drive semiconductor market growth. Considering these factors, the semiconductor market demand in Q1 2025 is expected to remain stable.

Despite uncertainties such as global economic fluctuations and policy changes, demand-side growth, particularly in AI servers and enterprise IT spending, suggests the semiconductor market will continue to develop steadily. China’s market, under policy leadership, may maintain some growth momentum, further supporting global semiconductor demand. Although growth rates may slow, the overall market still holds potential for growth.

Looking at the full-year outlook, even in the worst-case scenario, the global semiconductor market may see slower growth, but growth should remain positive.

Outlook

Market demand remains highly uncertain. On the supply side, the U.S. “America First” policy will push governments worldwide to increase support for their domestic semiconductor industries, fostering regional development of the global semiconductor sector.

Historically, the semiconductor industry has relied on globalized supply chains to increase efficiency and reduce costs, with clear divisions of labor in design, manufacturing, and testing across countries. However, as nations enhance policy support for their domestic semiconductor industries (such as subsidies, tax incentives, and infrastructure support) to ensure the autonomy of key technologies, semiconductor production may increasingly cluster within regions, altering the traditional global supply chain structure.

This shift may disrupt traditional industry cycles, leading to longer supply chain adjustments, increased costs, and changes in market competition. For example, companies that once relied on overseas foundries may be forced to find new production partners or even establish domestic production capabilities, increasing the pressure on technology and capital investments. Furthermore, regional development could raise trade barriers, further impacting market liquidity and increasing business risks.

This article is part of our Taiwan Tech and Market Shifts series.
It explores how Taiwan’s tech industries are adapting to global shifts in supply chains, manufacturing, policy, and innovation.

The graphics card market enters a golden era: AI and gaming demand driving growth

Jane Hsu — Fri, 21 Feb 2025 08:19:06 +0000

This year, the graphics card market is experiencing a rare opportunity, especially after the mining boom, with market dynamics becoming increasingly attractive. Key drivers include the upgrade to NVIDIA’s RTX 50 series, a surge in AI inference demand, the launch of major gaming titles, and supply chain shifts. The RTX 50 series significantly enhances performance, particularly in gaming and AI-generated imagery, driving the growth of the graphics card market. Additionally, with advancements in AI technology, especially the rise of the DeepSeek open-source series, graphics cards are playing an increasingly critical role in AI inference, further accelerating demand. The gaming market is also thriving with the release of several major titles, boosting demand for graphics cards in the PC platform.

From the perspective of Taiwan’s supply chain, the booming graphics card market is expected to bring three major changes to the global tech industry: a shift of the gaming market to the PC platform, accelerated growth in graphics card demand, and the blurring of boundaries between AI and gaming. The widespread adoption of AI technology is becoming a key driver of graphics card demand, marking a new stage in the market’s development.

In summary, the graphics card market is undergoing a pivotal transformation. The surge in AI inference demand has made graphics cards not only essential for gamers but also critical hardware for AI developers and businesses. As market demand diversifies, supply chain issues continue to have a significant impact, but the strong growth in graphics card demand signals the ongoing development of the market.

According to the Traditional Chinese version of the Industrial and Commercial Times, following the launch of NVIDIA’s next-generation RTX 50 series graphics cards, a shortage occurred, primarily due to performance issues with the RTX 5070 and 5060 chips, which required debugging. This delay pushed the mass production schedule to mid-March or late April. Additionally, the Tainan earthquake disrupted TSMC’s wafer production, further exacerbating the supply crunch.

The RTX 50 series features a new Blackwell architecture with enhanced AI and neural rendering capabilities, leading to a surge in demand for the RTX 5090 and 5080 models. Despite strong demand, supply remains unstable due to technical challenges and natural disasters.

Furthermore, the development of DeepSeek’s language models has broken through the bottleneck in large model computing power, enabling more companies to run deep learning models with the NVIDIA 4090D GPU at a low cost of around NT$100,000, while maintaining low power consumption. This has led to structural growth in AI computing power demand and boosted the need for high-end graphics cards.

In an exclusive interview, NVIDIA CEO Jensen Huang emphasized that AI will become an essential tool for learning in the future, expanding the AI ecosystem and further driving demand.

Our Perspective

This year marks a rare opportunity in the graphics card market, especially after the mining boom. Several key factors explain why the graphics card market is so intriguing this year:

1. RTX 50 series upgrade and AI performance boost

The significant performance upgrade of NVIDIA’s RTX 50 series graphics cards is the biggest market driver. Specifically, DLSS 4.0 and the shift from CNN models to Transformer models not only enhance gaming performance but also boost the demand for AI-generated imagery and computing. For gamers, the cost-performance ratio of graphics cards has significantly improved. For businesses and developers, the rise in AI model demand brings new opportunities, further propelling global graphics card market growth.

2. Explosion in AI inference demand

With the rapid growth of AI, particularly the rise of the DeepSeek open-source series, demand for graphics cards has surged. Powerful graphics card performance supports efficient AI inference operations, and demand is rapidly increasing in markets like China. AI technology has penetrated various fields, from entertainment to enterprise applications, continuing to drive the demand for high-performance graphics cards.

3. Launch of major gaming titles

With the release of several highly anticipated games, the demand for graphics cards will rise significantly. The improved visual quality of games increases gamers’ need for powerful graphics cards. These releases not only directly boost graphics card sales but also indirectly heighten awareness of GPU technology, further driving market demand.

4. Supply chain shifts and graphics card shortages

The shortage of graphics cards reflects strong market demand, especially for new models like the RTX 50 series. This supply gap has intensified consumer purchasing desire and driven up prices, making the market more dynamic. Supply chain changes are affecting pricing and supply conditions, potentially altering the competitive landscape.

5. Changes in PC gaming and console markets

The growing share of the PC gaming market reflects increased consumer demand for high-performance hardware. Gamers now expect to run games on PCs while multitasking, which further drives the demand for graphics cards. While the shift in PC gaming market share contributes steadily to graphics card demand, its impact is secondary compared to the AI inference demand.

6. Recovery of the Chinese market

With the recovery of graphics card demand in China, especially after the easing of policy restrictions, the global market is expected to experience further growth. The simultaneous launch of the RTX 5090 China edition, along with AI performance discounts, will further stimulate demand in the Chinese market and contribute to global market expansion.

In conclusion, this year’s graphics card market presents a unique opportunity. With the release of major games and the steady growth of the PC gaming market, graphics card upgrades will be a top choice for many players, especially driven by the demand for higher resolution gaming.

From our observation of Taiwan’s supply chain, we believe the hot trend in the graphics card market will bring three significant changes to the global tech industry:

1. Gaming market shifting to PC platforms

With major game releases like Monster Hunter Wilds, the gaming market is gradually shifting towards the PC platform. This will encourage global game developers to invest more in PC gaming, potentially sparking innovation targeted at this market.

2. Accelerated graphics card demand

The launch of NVIDIA’s RTX 50 series and AMD’s RX 7000 series has not only boosted graphics card performance but also sparked strong demand among gamers and AI developers. This wave of interest has attracted not only gaming enthusiasts but also accelerated hardware upgrades in sectors like AI research and data centers, which rely heavily on GPU resources. As a key hub for global graphics card manufacturing, Taiwan will play a crucial role in meeting this surging market demand.

3. Blurring boundaries between AI and gaming

From the rise of DeepSeek’s open-source models to the AI productivity gains brought by the RTX 50 series, we see the boundaries between AI and gaming gradually blurring. This not only increases demand for graphics cards but also underscores how central GPUs have become to hardware technology. Taiwan’s graphics card suppliers will become key partners in the global AI inference and game development sectors.

Conclusion

The graphics card market is undergoing a critical transformation. While gaming remains a key demand driver, the surge in AI inference demand has expanded the role of graphics cards beyond gaming, positioning them as essential hardware for AI developers and enterprises. The increasing overlap between gaming and AI applications is diversifying the demand for graphics cards.

Supply chain issues also have a significant impact, particularly in the high-demand graphics card sector. The shortage and price hikes reflect the sharp rise in market demand, especially for high-performance models like the RTX 50 series, driven by both gaming and rapidly growing AI inference needs.

As AI technology continues to rise, particularly with the DeepSeek open-source series, powerful graphics card performance becomes critical for efficient AI inference. This shift aligns with the changing dynamics of the graphics card market, where AI technology is now a primary driver of demand. The soaring need for GPUs to support AI computing is opening new opportunities for the graphics card market’s growth.

This article is part of our Taiwan Tech and Market Shifts series.
It explores how Taiwan’s tech industries are adapting to global shifts in supply chains, manufacturing, policy, and innovation.


The post The graphics card market enters a golden era: AI and gaming demand driving growth appeared first on Researcher and Research.

AI chip market evolution: Cloud vs. edge in training and inference Part 2: Edge training, inference, and market trends

Jane Hsu — Wed, 19 Feb 2025 13:38:49 +0000

AI chip market evolution: Cloud vs. edge in training and inference

Part 2: Edge training, inference, and market trends

The AI chip market is undergoing significant transformations, which can be understood through two key dimensions: deployment environments (cloud vs. edge) and market segments (training vs. inference).

Cloud-based training currently dominates the market and is expected to maintain strong growth in the future. Training is critical for AI model development, requiring immense computational power to process vast amounts of data, which is why it is primarily concentrated in cloud data centers. NVIDIA is the dominant player in this space, but competition from AWS, Google TPU, and Meta could reshape the market landscape in the coming years.

Alongside training, the cloud inference market is also experiencing rapid growth. Inference refers to applying trained AI models to real-world scenarios and making decisions based on real-time data. The increasing demand for inference is driven by advancements in AI models and emerging application scenarios. While inference requires lower computational power than training, it demands high accuracy and low latency, making cloud-based inference expansion essential for scaling AI applications.

On the edge, both edge training and edge inference cater to demands for low latency and data privacy. Compared to cloud-based training, edge training remains a smaller market but is expected to grow with the rise of smart devices and autonomous vehicles. Meanwhile, edge inference focuses on real-time, on-device processing, which is crucial for applications such as autonomous driving and other latency-sensitive use cases. While the edge market remains a niche segment, its importance will increase as these specialized applications continue to evolve.

In conclusion, training and inference in both cloud and edge environments present unique challenges and opportunities. Companies capable of delivering high-performance, cost-effective solutions in these areas will have the potential to challenge existing market leaders.

Following our previous discussion on the cloud AI training and inference market, this article focuses on the edge (on-premises) AI chip market for training and inference.

Our Analysis

Compared to the cloud market, edge AI solutions offer distinct advantages in low latency and data privacy. As emerging applications such as autonomous vehicles and smart devices grow, edge AI training and inference are expected to be key drivers of future market expansion. This article will analyze the trends shaping this segment and explore efficient, cost-effective solutions to address competitive challenges. Finally, we will summarize the key findings from both articles in a comparative table and discuss the future trajectory of the AI chip market, highlighting strategic areas that potential competitors and market players should closely monitor.

3. Edge Training

In the AI chip market, edge training represents the smallest segment, accounting for approximately 3–5% of the market. Edge training requires high computational efficiency and is highly sensitive to latency, because its applications must perform AI training at the edge rather than relying on cloud-based computing resources. Examples include autonomous vehicles that require real-time learning and adaptation of their perception systems, industrial machines and robots that train and adapt based on local data, wearable devices such as smartwatches and health monitoring systems that continuously learn and adjust based on user data in real time, and smart city applications that require immediate data processing.

With growing concerns over data privacy and security, businesses are increasingly opting for edge training to ensure sensitive data remains protected. In this market, NVIDIA remains the dominant supplier, while AMD holds a smaller share. If emerging players like DeepSeek can provide cost-effective AI training solutions optimized for edge devices—especially amid growing demands for data privacy and real-time processing—they could become serious challengers to NVIDIA.

4. Edge Inference

Edge inference accounts for approximately 10–15% of the AI chip market, primarily serving enterprise inference needs by running trained models on endpoint devices or edge computing systems. These applications demand low latency and real-time responsiveness, with significantly lower computational requirements than AI training. Key use cases include:

Smart security: real-time image analysis for detecting suspicious behavior or individuals
Smart home: rapid response to user commands and real-time environmental adjustments
Smart transportation: traffic monitoring, autonomous driving, and intersection surveillance
Drones: real-time image analysis for navigation and filming
Healthcare monitoring: real-time data processing to assess user health conditions
Industrial IoT: data collection and analysis to ensure smooth production operations

As concerns over data privacy and security grow, more businesses and institutions are shifting inference processing to local devices to prevent sensitive data leakage. This trend is especially prominent in industries requiring strict privacy protection, such as finance, healthcare, and government.

NVIDIA remains the leading supplier, while AMD continues to gain traction with advancements in its products. Competitors offering high-efficiency, low-power edge inference solutions—tailored to smart home, security, and industrial IoT applications—could challenge NVIDIA and AMD in this evolving market.

Conclusion

The AI chip market is undergoing rapid transformation, with different market segments showing varying growth trends and challenges, as shown in Table 1.

Table 1 AI chip market structure analysis and comparison

Domain	Market share	Major companies	Potential competitors	Future development trends	Potential challenges
Cloud training	50-70%	NVIDIA (H100, CUDA, TensorRT), Google (TPU)	AWS (Trainium), Meta, Anthropic, Tenstorrent, CSP ASIC	Market growth continues, but growth rate may slow down; technological optimization (e.g., sparsification, mixed precision training) reduces costs; expansion of generative AI applications	High AI training costs; uncertainty in technological optimization and market demand; need for high-performance chips to support training demands
Cloud inference	15-25%	NVIDIA, AMD, Intel (Habana Gaudi)	Qualcomm, Mythic, Cerebras, Groq, CSP ASIC, DeepSeek	Significant market growth; new applications (e.g., autonomous driving, financial risk assessment, medical diagnostics); lower inference costs and hardware innovations enhance efficiency	NVIDIA’s market leadership challenged; increasing data privacy and security requirements; significant investment required for hardware infrastructure innovation
Edge training	3-5%	NVIDIA, AMD	DeepSeek, Project Digits, Internal enterprise demand	Increasing edge training demand due to privacy protection focus; companies will emphasize real-time learning and enhanced perception systems	Higher hardware requirements and costs; need for low-latency and high-performance solutions
Edge inference	10-15%	NVIDIA, AMD	DeepSeek, Internal inference demand, self-built equipment	Increased edge inference demand due to growing data privacy and security concerns; growing applications in finance, healthcare, etc.	High performance and low-latency requirements; privacy protection issues; data security requirements must meet regulations

Source: Researcher and Research
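
As a quick consistency check on Table 1, the four share ranges can be summed to see whether they bracket 100%. The following is a minimal sketch in Python: the figures are copied from the table, while the names `SEGMENT_SHARES` and `share_bounds` are ours and purely illustrative.

```python
# Market-share ranges (percent) copied from Table 1.
SEGMENT_SHARES = {
    "Cloud training": (50, 70),
    "Cloud inference": (15, 25),
    "Edge training": (3, 5),
    "Edge inference": (10, 15),
}


def share_bounds(shares):
    """Return (sum of lower bounds, sum of midpoints, sum of upper bounds)."""
    low = sum(lo for lo, _ in shares.values())
    mid = sum((lo + hi) / 2 for lo, hi in shares.values())
    high = sum(hi for _, hi in shares.values())
    return low, mid, high


low, mid, high = share_bounds(SEGMENT_SHARES)
print(f"lower bounds: {low}%, midpoints: {mid}%, upper bounds: {high}%")
```

The lower bounds sum to 78% and the upper bounds to 115%, so the ranges bracket 100% and the four estimates are mutually consistent. The midpoints sum to 96.5%.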

The growth of the edge market and its divergence from the cloud market compel us to consider the potential impact of emerging competitors. Among these, DeepSeek’s technological innovations are particularly noteworthy, as they have the potential to disrupt the current market landscape. Although large companies like NVIDIA and Google currently dominate the market, DeepSeek’s rise—whether in hardware acceleration or breakthroughs in AI training—could significantly alter this dynamic.

Although Cloud Service Providers (CSPs) are developing their own ASICs and other competitors continue to emerge, we believe that NVIDIA maintains a strong market position and technological advantage. The primary reasons for this are:

1. GPU Advantage

NVIDIA has long dominated the GPU market, with accelerators like the A100 and H100 becoming industry standards for AI training and inference. NVIDIA’s GPUs not only support training but also handle large-scale inference, playing a critical role in many AI applications. As a result, even with competition from CSPs developing their own ASICs, NVIDIA maintains a strong advantage in applications requiring general-purpose and high-performance computing.

2. Robust Software Ecosystem

NVIDIA boasts a comprehensive developer ecosystem, including tools like CUDA, cuDNN, and TensorRT, making it easy for developers to build AI applications on NVIDIA hardware. In contrast, competitors like CSPs with in-house ASICs and DeepSeek need to invest significant time and resources in developing their own software ecosystems, giving NVIDIA a clear edge.

3. Efficient AI Computing Platform

NVIDIA’s high-performance computing (HPC) and AI platforms offer highly optimized hardware and software integration, providing powerful acceleration for a variety of AI workloads, such as natural language processing and image recognition. The optimization of these platforms gives NVIDIA a performance advantage in processing large datasets and models, surpassing other competitors.

However, NVIDIA also faces significant challenges, primarily from two sources of specialized competition:

1. Development of CSP-Developed Chips

In an effort to reduce reliance on third-party chip suppliers and achieve cost control and hardware customization, many CSPs have opted to develop their own chips. For example, Google’s TPU is built for neural network training and inference, while Amazon’s Inferentia is optimized for inference scenarios. These in-house chips offer more efficient, cost-competitive solutions for specific applications.

2. Breakthroughs in Specialized ASICs

Some competitors may pose a threat to NVIDIA by achieving breakthroughs in specific areas, such as low-latency inference or other specialized acceleration needs, and developing highly specialized ASICs. Especially in cost-sensitive markets or those with niche requirements, NVIDIA’s high-end GPUs (such as the A100 and H100) may not be as attractive as these specialized ASIC solutions due to their higher price points.

Therefore, NVIDIA currently maintains a significant competitive advantage in the general AI training and inference market. However, if CSPs aggressively develop in-house ASICs or competitors make breakthroughs in specialized areas, NVIDIA will face increased competitive pressure. Future competition will depend on whether these challengers can surpass NVIDIA products in terms of performance, efficiency, price, and ecosystem support.

In conclusion, when discussing the AI chip market, the training and inference needs in both cloud and edge markets each present different challenges and opportunities. The development of these four areas is interwoven, and the characteristics of each can influence the future direction of the market. As AI technology evolves, businesses that can provide higher-performance, cost-effective solutions in these areas will not only effectively address current market challenges but also capture growth opportunities in the future. Such solutions have the potential to challenge the current market leaders and carve out a strong position in these rapidly developing sectors.


The post AI chip market evolution: Cloud vs. edge in training and inference Part 2: Edge training, inference, and market trends appeared first on Researcher and Research.

AI chip market evolution: Cloud vs. edge in training and inference Part 1: Cloud training and inference

Jane Hsu — Tue, 18 Feb 2025 09:09:59 +0000

AI chip market evolution: Cloud vs. edge in training and inference

Part 1: Cloud training and inference

The AI chip market is undergoing significant transformations, which can be understood through two key dimensions: deployment environments (cloud vs. edge) and market segments (training vs. inference).

Cloud-based training currently dominates the market and is expected to maintain strong growth in the future. Training is critical for AI model development, requiring immense computational power to process vast amounts of data, which is why it is primarily concentrated in cloud data centers. NVIDIA is the dominant player in this space, but competition from AWS, Google TPU, and Meta could reshape the market landscape in the coming years.

Alongside training, the cloud inference market is also experiencing rapid growth. Inference refers to applying trained AI models to real-world scenarios and making decisions based on real-time data. The increasing demand for inference is driven by advancements in AI models and emerging application scenarios. While inference requires lower computational power than training, it demands high accuracy and low latency, making cloud-based inference expansion essential for scaling AI applications.

On the edge, both edge training and edge inference cater to demands for low latency and data privacy. Compared to cloud-based training, edge training remains a smaller market but is expected to grow with the rise of smart devices and autonomous vehicles. Meanwhile, edge inference focuses on real-time, on-device processing, which is crucial for applications such as autonomous driving and other latency-sensitive use cases. While the edge market remains a niche segment, its importance will increase as these specialized applications continue to evolve.

In conclusion, training and inference in both cloud and edge environments present unique challenges and opportunities. Companies capable of delivering high-performance, cost-effective solutions in these areas will have the potential to challenge existing market leaders.

This article, “AI Chip Market Evolution: Cloud vs. Edge in Training and Inference,” is structured into two parts to provide a comprehensive analysis of the market’s diverse landscape and future trends.

Part 1: Cloud Market
This section explores Cloud Training and Cloud Inference, examining their growth potential, key industry players, and evolving competitive dynamics.

Part 2: Edge Market
This section delves into Edge Training and Edge Inference, focusing on low latency, data privacy, and emerging applications, while assessing their technological advancements and market opportunities.

By adopting this structured approach, the article offers a clearer understanding of the AI chip market’s distinct segments, delivering valuable insights to industry stakeholders for strategic planning and competition.

Our Analysis

When examining the future of the AI industry, understanding the market structure is essential. The AI landscape can be broadly categorized into four key markets: Cloud Training, Cloud Inference, Edge Training, and Edge Inference. While each of these markets operates with distinct mechanisms and applications, they are deeply interconnected, forming the foundation of AI technology.

Cloud Training: The Core of AI Development
Cloud training serves as the backbone of AI advancement, responsible for large-scale data processing and model training. As data volumes continue to grow and computational demands rise, cloud training remains the cornerstone of AI’s rapid technological progress.
Cloud Inference: Enabling Real-Time AI Applications
Once AI models are trained, cloud inference plays a crucial role in applying them to real-world scenarios. Unlike training, inference focuses on enabling AI to deliver rapid, real-time responses. The evolution of cloud inference is tightly linked to cloud training, and as demand for AI-powered solutions expands, the cloud inference market continues to grow.
Edge Training & Edge Inference: Addressing Privacy and Latency Needs
The edge AI market—comprising edge training and edge inference—is gaining momentum, particularly in applications requiring high data privacy and low latency. By shifting training and inference processes from the cloud to local devices, edge AI enhances security and reduces response times. In certain use cases, edge solutions offer distinct advantages over cloud-based alternatives.

These four market segments not only highlight how AI technologies are applied across industries today but also signal the future direction of AI development. Despite their unique operational models, their interdependencies will continue to drive AI innovation and industry-wide progress.

This article will provide an in-depth exploration of the cloud sector within the AI chip market, focusing specifically on cloud training and cloud inference markets. As artificial intelligence technologies advance, these two markets are rapidly becoming key growth areas. Given the critical role of training and inference in the AI model lifecycle, both will significantly influence the structure of the market and future competitive dynamics. In this article, we will analyze the current state and future growth potential of the cloud training market, along with the competitive landscape of key players. Additionally, we will explore the development trends of cloud inference and its impact on emerging application scenarios. This section will lay the foundation for the subsequent discussion on the edge market and provide a comprehensive understanding of the current landscape and challenges within the AI chip market.

Among these four markets, the cloud training market forms the foundation of AI development, as large-scale data processing and model training require substantial computational power, which cloud platforms provide. With the advancement of AI technologies and the explosion of data, the cloud training market has become the core driver of AI technology growth. In the cloud training market, the deep learning process of AI models demands vast computational resources, typically relying on the infrastructure provided by large cloud service providers. The successful completion of this process sets the stage for subsequent inference and applications, fueling the rapid growth of the cloud inference market.

1. Cloud Training

The cloud training market holds the largest share of the AI chip market, estimated at 50%-70%. The demand for training large AI models such as GPT-4, Gemini, and Claude allows NVIDIA to maintain its market leadership. The H100’s competitive edge lies in its CUDA ecosystem and TensorRT, while Google TPU leverages its massive internal training clusters and networking technologies. Additionally, companies like AWS with Trainium, Meta, and Anthropic are developing their own training ASICs. While these companies are unlikely to challenge NVIDIA’s market dominance in the short term, they could change the market landscape in the long term.

The market will continue to grow, but the growth rate may slow as AI training remains expensive, and companies are looking for ways to reduce costs. These include techniques such as sparsity, mixed-precision training, more efficient AI acceleration chips, and the reusability of pre-trained models (with improvements in fine-tuning technology).
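
To give a rough sense of why lower-precision formats cut cost, the sketch below compares the memory needed to hold one copy of a model’s weights in 32-bit and 16-bit floating point. It is a back-of-the-envelope illustration in Python, not a description of any vendor’s training stack: the 7-billion-parameter model size is a hypothetical figure chosen for the example, and the helper `weight_memory_gb` is ours.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

# Hypothetical model size, chosen for illustration only.
NUM_PARAMS = 7_000_000_000


def weight_memory_gb(num_params, dtype):
    """Memory for a single copy of the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


fp32_gb = weight_memory_gb(NUM_PARAMS, "fp32")
fp16_gb = weight_memory_gb(NUM_PARAMS, "fp16")
print(f"fp32 weights: {fp32_gb:.0f} GB, fp16 weights: {fp16_gb:.0f} GB")
```

This counts only one copy of the weights. Mixed-precision training typically also keeps a 32-bit master copy and optimizer state, so the savings in practice depend on the setup and come largely from activations and faster low-precision arithmetic.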

However, even if technological optimizations reduce the cost of AI training, key considerations remain:

companies still need high-performance AI training chips,
generative AI use cases are rapidly expanding, and
advancements in training technologies and changes in market demand remain uncertain.

Therefore, the training market led by NVIDIA and Google TPU will continue to experience strong demand in the short term.

If DeepSeek can provide innovative AI acceleration technologies and efficient training solutions in the cloud training market, it could challenge the established leaders, such as NVIDIA and Google TPU. Cloud training relies heavily on high-performance chips and energy-efficient technologies to handle large-scale datasets and models. If DeepSeek can offer low-cost, high-performance AI training solutions, particularly by making breakthroughs in sparsity techniques and mixed-precision training, it has the potential to disrupt the existing market structure.

After discussing the development of the cloud training market, we now turn to explore the cloud inference market. Although these two markets may seem distinct, they are actually closely related. Cloud training involves large-scale computations based on vast datasets, while cloud inference applies the trained models to real-world business scenarios. Therefore, the development of both markets is interdependent.

2. Cloud Inference

The rapidly growing cloud inference market has become the second-largest segment of the AI chip market, accounting for approximately 15-25% of the market share. This includes applications such as chatbots, speech recognition, recommendation systems, computer vision (e.g., autonomous driving perception systems), financial risk assessment, medical diagnostics, and industrial inspection.

Currently, the market is dominated by NVIDIA, although AMD has started gaining some market share through its partnerships with Meta and Microsoft. Meanwhile, Intel is maintaining its competitive edge in the inference market with its Habana Gaudi AI chips and Xeon CPUs. Tenstorrent is actively developing its own AI accelerator, attempting to challenge NVIDIA’s leadership position. Qualcomm is focusing on low-power AI inference applications, launching the Cloud AI 100 accelerator, while Mythic explores analog computing technologies, which may further impact the AI inference market in the future. Cerebras and Groq are also competing by renting out inference compute through their own small-scale cloud services, adding more options to the market.

Looking ahead, the cloud inference market is expected to grow significantly, driven by the following key factors:

Advances in AI models: This is the most critical factor. As more advanced AI models (such as OpenAI’s GPT-5 and DeepSeek’s R1) are released, there will be a significant increase in the demand for inference computing, which will, in turn, drive the overall market development.
Emerging application scenarios: Applications like autonomous driving, financial risk assessment, and medical diagnostics will greatly drive the demand for inference computing. These applications not only expand the scope of AI use but also increase the demand for efficient inference, further stimulating market growth.
Reduction in inference costs: As inference costs decrease, more companies will enter the market and apply AI technologies in more fields, which will be a long-term driver of market growth.
Innovation and improvement in hardware infrastructure: As specialized hardware for AI inference (such as AI ASICs and FPGAs) is developed, inference efficiency will greatly improve, further reducing costs and enhancing performance.
Stricter data privacy and security requirements: As AI is applied in sensitive fields like finance and healthcare, increasing requirements for data privacy and security will drive demand for more efficient and regulation-compliant inference technologies. This could also promote the development of related technologies, such as encrypted inference and edge computing.
Enhancement of traditional software by AI: AI’s enhancement of traditional software can significantly improve the performance of existing systems, providing more efficient tools for traditional businesses and possibly giving rise to new business models. Although the impact of this factor is indirect, it plays a stabilizing role in driving long-term market growth.

As AI technology continues to evolve, the demand for more efficient and advanced inference computing will intensify, leading to a surge in demand for inference models. We expect that the computational demand for inference models could exceed the current demand for large language models (LLMs) by more than ten times. This change in demand will drive CSPs to accelerate the development of self-designed ASIC chips to improve inference performance and reduce reliance on third-party hardware (like NVIDIA). For example, AWS aims for 50% of its chips to be self-designed ASICs, Meta plans 70%, and Microsoft could reach up to 80%.
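
The self-design targets above imply how much of each CSP’s chip mix would still come from third parties. The sketch below works through that arithmetic in Python. The percentages are the ones quoted in the text. Everything else is an assumption made only for illustration: that the tenfold growth in compute demand translates directly into chip volume, and that today’s baseline is entirely third-party hardware. The function names are ours.

```python
# Self-designed ASIC targets (percent of chips) as quoted in the text.
ASIC_TARGETS = {"AWS": 50, "Meta": 70, "Microsoft": 80}


def third_party_share(asic_percent):
    """Percent of chips still sourced from third parties such as NVIDIA."""
    return 100 - asic_percent


def relative_third_party_volume(asic_percent, demand_multiple=10):
    """Third-party volume relative to today's total volume.

    Assumes (hypothetically) that total chip volume scales by
    `demand_multiple` and that today's volume is 100% third-party.
    """
    return demand_multiple * third_party_share(asic_percent) / 100


for csp, target in ASIC_TARGETS.items():
    share = third_party_share(target)
    volume = relative_third_party_volume(target)
    print(f"{csp}: {share}% third-party, {volume:.1f}x today's volume")
```

Under these assumptions, even an 80% in-house target would leave third-party volume at twice today’s level, which shows how strongly the outcome depends on the assumed demand multiple rather than on the share targets alone.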

Moreover, as the demand for efficient inference computing increases in AI applications like autonomous driving and medical diagnostics, DeepSeek has the potential to become a strong competitor by introducing accelerators or software solutions that surpass existing technologies.

This would have a significant impact on NVIDIA, which currently dominates the market. As Google’s TPUs, AWS’s Inferentia, and other specialized ASIC technologies gradually increase their market share, more competitors will join the market, making the competition increasingly fierce.

After understanding the market dynamics of cloud inference, the next piece will focus on the development of the edge market, another field that complements cloud inference. While edge training and inference are independent of the cloud market, they offer more efficient and low-latency solutions in many situations, especially in industries with high data privacy requirements, where edge solutions hold unmatched advantages.


The post AI chip market evolution: Cloud vs. edge in training and inference Part 1: Cloud training and inference appeared first on Researcher and Research.