DeepSeek LLM: Revolutionizing AI Inference Costs
DeepSeek, a Chinese AI startup backed by High-Flyer, a quantitative hedge fund, recently launched its latest large language model (LLM), DeepSeek-R1. This release has generated significant attention for its impressive performance and remarkably low AI inference costs, potentially disrupting the LLM market.
For investors tracking AI infrastructure and cloud computing, DeepSeek's pricing breakthrough raises critical questions about the sustainability of current AI lab business models, the pricing power of hyperscalers like Microsoft, and the long-term demand trajectory for GPU infrastructure from NVIDIA and others. If DeepSeek's cost structure is replicable, we may be witnessing the early stages of margin compression across the AI value chain.
DeepSeek: Lower Inference Costs Disrupt the LLM Market
DeepSeek is making a notable impact by offering high-quality AI inference at highly competitive rates. The blended cost¹ of the R1 model is $0.65 per 1 million tokens, substantially lower than the $26.25 for OpenAI's o1 model, while maintaining comparable quality.

While the price comparison with the OpenAI o1 model makes for compelling headlines, it is important to recognize that o1's pricing is an outlier, significantly higher than that of other available models. A more suitable comparison is OpenAI's o3-mini, priced at $1.93 per 1 million tokens: far cheaper than o1 yet of similar quality. DeepSeek's pricing is about 66% lower than that of the o3-mini.
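As a rough sanity check, the blended figures above can be reproduced from per-token prices using the formula in footnote 1. The sketch below assumes input/output list prices of $15/$60 for o1, $1.10/$4.40 for o3-mini, and $0.14/$2.19 for R1 (cache-hit input pricing); these per-token prices are assumptions consistent with the blended figures quoted here, not numbers stated in this article:

```python
def blended_cost(input_price: float, output_price: float) -> float:
    """Expected cost per 1M tokens, assuming the footnote's 3:1 input-to-output ratio."""
    return (3 * input_price + output_price) / 4

# Assumed (input, output) list prices in $ per 1M tokens -- illustrative only.
models = {
    "OpenAI o1": (15.00, 60.00),
    "OpenAI o3-mini": (1.10, 4.40),
    "DeepSeek R1 (cache hit)": (0.14, 2.19),
}

costs = {name: blended_cost(inp, out) for name, (inp, out) in models.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f} per 1M tokens")
# -> $26.25, $1.93, and $0.65, matching the blended figures in the text.

discount = 1 - costs["DeepSeek R1 (cache hit)"] / costs["OpenAI o3-mini"]
print(f"DeepSeek discount vs o3-mini: {discount:.0%}")  # ~66%
```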
Moreover, DeepSeek matches the quality of leading OpenAI models. As illustrated in the graph above, DeepSeek represents a considerable improvement over other models available at its price point, particularly compared with Google's suite of Gemini models and Alibaba's Qwen2.5. This cost gap signals a new competitive dynamic in LLM inference.
The Truth About DeepSeek's LLM Training Costs
There has been much discussion regarding DeepSeek's claim of spending only $5.6 million on training. However, focusing solely on this figure is misleading.
This amount pertains to the training run of a precursor model called DeepSeek V3; AI naming conventions can often cause confusion, even among experts. Moreover, the figure covers only the hardware utilization of that single training run, a narrow subset of total costs. It does not account for expenses related to data acquisition, earlier experimental runs, innovation training (the advancements in R1 compared to V3), post-training, and research and development. The total expenditure is likely much higher when these additional factors are considered.
No other AI lab, certainly not OpenAI, shares its hardware training costs, making it difficult to gauge what share of total model development costs these training runs represent. Rumors suggest that models like GPT-4o cost around $10–$50 million to train rather than the hundreds of millions or billions often cited. While DeepSeek's training efficiency is impressive, it may not be as groundbreaking as some narratives suggest.
DeepSeek's Impact on AI Labs and the Inference Business Model
There is no evidence to suggest that DeepSeek is subsidizing its inference prices below its marginal operating expenses. Although it is backed by a successful Chinese hedge fund, DeepSeek has no outside investors in its AI lab and is unlikely to be able to sustain operating losses at the scale of some venture-backed AI labs.
In contrast, OpenAI reportedly offers its models at or near break-even relative to its marginal operating expenses. The company has incurred losses of $5 billion in FY24 on $3.7 billion of revenue, and CEO Sam Altman has publicly acknowledged that the company is losing money on its “pro” level subscription. This is further underscored by the apparent need to invest $500 billion over the coming years into project “Stargate” to support model training and inference.
Therefore, it is reasonable to conclude that both companies operate around the break-even point for their respective marginal operating expenses. Given the substantial pricing gap between OpenAI's and DeepSeek's models, DeepSeek likely operates at a significantly lower cost than OpenAI. Such a disparity is feasible only if DeepSeek has achieved breakthroughs that let it run inference substantially more cheaply. This directly challenges the current AI lab business model.
As a result, we can anticipate a pricing race to the bottom among AI labs, in which most will only be able to recover their marginal operating expenses. Amortizing model development costs will remain a considerable challenge for many companies, as it has been to date. This amounts to a significant disruption of the LLM market.
Investment Implications: From Pricing Power to Margin Compression
What does DeepSeek's cost breakthrough mean for investors in Microsoft, NVIDIA, Oracle, and other AI infrastructure plays?
If DeepSeek's efficiency gains can be replicated across the industry, several investment themes require reassessment:
AI Lab Margins and Revenue Sustainability
OpenAI's reported $5B loss on $3.7B revenue suggests that current AI lab business models depend on significant pricing power. DeepSeek's 66% cost advantage relative to comparable OpenAI models threatens this assumption. For Microsoft, which has invested over $13B in OpenAI and resells AI services through Azure, deteriorating pricing power would directly impact Azure's AI revenue growth and margin profile.
Hyperscaler AI Revenue Expectations
Cloud providers have built substantial revenue projections around AI workload monetization. If inference costs decline by 50-70% due to efficiency breakthroughs, hyperscalers face a choice: pass savings to customers (compressing revenue growth) or maintain prices (risking customer migration to more efficient alternatives). Either outcome challenges current valuation assumptions.
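To make that trade-off concrete, here is a stylized sketch with purely hypothetical numbers: a $10B AI revenue run-rate and a 60% price decline, the midpoint of the 50-70% range above. Neither figure comes from this article or any company disclosure.

```python
# Stylized illustration of the hyperscaler trade-off; all inputs are hypothetical.
current_ai_revenue = 10e9   # assumed AI revenue run-rate, $
price_decline = 0.60        # assumed inference price drop (midpoint of 50-70%)

# If savings are passed through, the same workload bills at 40% of today's price.
revenue_at_same_volume = current_ai_revenue * (1 - price_decline)

# Volume growth required just to keep revenue flat at the lower price point.
required_volume_growth = 1 / (1 - price_decline)

print(f"Revenue on unchanged volume: ${revenue_at_same_volume / 1e9:.1f}B")
print(f"Volume growth needed to stay flat: {required_volume_growth:.1f}x")
# -> $4.0B and 2.5x: usage must grow 150% just to offset the price cut.
```

The point of the arithmetic is that passing savings through only preserves revenue if usage grows multiplicatively, while holding prices invites migration to cheaper providers.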
GPU Demand Trajectory
Lower inference costs mean enterprises and AI labs can achieve the same performance with fewer GPUs. While this doesn't eliminate GPU demand, it could significantly flatten the growth curve. For NVIDIA, which has seen extraordinary demand for H100 and upcoming Blackwell chips, any moderation in hyperscaler GPU procurement would be material.
Competitive Response Timeline
The critical unknown is how quickly OpenAI, Anthropic, Google, and others can match DeepSeek's efficiency. If the answer is "quarters, not years," we may see rapid price compression across AI services. Professional investors should monitor earnings call commentary, capex guidance revisions, and any changes to AI product pricing.
How to Track These Developments
DeepSeek's cost breakthrough is not a conclusion—it's a catalyst for industry-wide repricing. Professional investors need tools to monitor:
- Pricing adjustments from OpenAI, Anthropic, and other AI labs
- Guidance revisions from Microsoft, NVIDIA, and hyperscalers on AI revenue expectations
- Management commentary on competitive positioning and margin sustainability
- Capex trajectory changes as inference efficiency reduces GPU requirements
- Customer migration patterns toward more cost-efficient AI providers
Marvin Labs AI Investor Co-Pilot helps you track these signals in real time across your entire coverage universe, turning industry shifts into actionable investment insights.
Footnotes
1. Blended cost is the expected cost for 1 million tokens, calculated as (3 * cost of 1M input tokens + cost of 1M output tokens) / 4. Empirical evidence suggests that the 3:1 input-to-output ratio is a sensible estimate across varying LLM usage scenarios.

Alex is the co-founder and CEO of Marvin Labs. Prior to that, he spent five years in credit structuring and investments at Credit Suisse. He also spent six years as co-founder and CTO at TNX Logistics, which exited via a trade sale. In addition, Alex spent three years in special-situation investments at SIG-i Capital.