The AI pricing war just got very lopsided. A UBS report finds that roughly 60% of companies actively tracking their AI budgets are now migrating workloads toward cheaper models, with Chinese open-source alternatives leading the charge.
The reason is straightforward: certain Chinese AI models cost as little as $2 to $3 per million output tokens, compared to around $15 for comparable US models.
The numbers behind the migration
A JPMorgan analysis puts the price differential in sharper relief. Select Chinese AI models are up to 50 times cheaper per token than their American counterparts, while maintaining competitive performance across standard benchmarks.
The cost of inference for certain Chinese models runs 10 to 20 times lower than leading US models. For routine tasks like answering FAQ-style questions or generating boilerplate text, enterprises are increasingly concluding that paying premium prices for premium models is simply unnecessary.
This has given rise to a practice called “model routing,” where companies direct simple tasks to cheaper models and reserve expensive, high-capability models from firms like OpenAI and Anthropic for complex reasoning or mission-critical applications.
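As a rough illustration of what model routing could look like in practice, here is a minimal Python sketch. The tier names, the keyword list, and the `route` function are all hypothetical, and the prices simply reuse the per-million-token figures quoted above. A production router would more likely rely on a trained classifier or on confidence scores than on keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_million_output_tokens: float


# Hypothetical tiers; prices reuse the article's illustrative figures.
CHEAP = ModelTier("open-weight-budget", 2.0)
PREMIUM = ModelTier("premium-frontier", 15.0)

# Assumed heuristic: these keywords stand in for "complex reasoning".
COMPLEX_HINTS = ("analyze", "prove", "multi-step", "legal", "audit")


def route(prompt: str, mission_critical: bool = False) -> ModelTier:
    """Send routine prompts to the cheap tier and everything else to premium."""
    text = prompt.lower()
    if mission_critical or any(hint in text for hint in COMPLEX_HINTS):
        return PREMIUM
    return CHEAP


if __name__ == "__main__":
    for prompt in ("What are your opening hours?",
                   "Analyze this contract for legal risk"):
        print(prompt, "->", route(prompt).name)
```

The `mission_critical` flag lets the caller override the heuristic, reflecting the idea that some applications warrant the premium model regardless of how simple an individual query looks.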
Who’s gaining ground
The Chinese models showing up on enterprise shortlists aren’t obscure research projects. DeepSeek, Alibaba’s Qwen, Moonshot AI’s Kimi, Zhipu AI’s GLM, and MiniMax are all gaining traction in corporate evaluations as of mid-to-late June 2026.
What these models share is an open-weight release model: the trained weights are published, so companies can deploy them locally or through cloud catalogs. Because the weights are available, companies can also fine-tune a model for specific use cases without paying ongoing licensing fees to the model provider.
Chinese developers have pursued this cost-efficient approach partly out of necessity. Hardware constraints, specifically limited access to top-tier Nvidia chips due to US export controls, have pushed Chinese AI firms to optimize relentlessly for performance-per-dollar.
What this means for investors
The hybrid strategy that enterprises are adopting (cheap models for routine work, premium models for complex tasks) has a catch for the premium providers: the vast majority of enterprise AI workloads are routine. If 80% of a company’s queries get routed to a $2-per-million-token model, only the remaining 20% is left for the $15 model, and its addressable market shrinks dramatically.
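To put rough numbers on that scenario, the short Python sketch below works through the arithmetic for one million output tokens. It assumes, purely for illustration, that the 80/20 split of queries translates into the same split of output tokens, and it uses the $2 and $15 prices quoted earlier. The function name `blended_cost` is hypothetical.

```python
def blended_cost(cheap_share: float, cheap_price: float, premium_price: float) -> float:
    """Average cost per million output tokens under a two-tier split."""
    return cheap_share * cheap_price + (1.0 - cheap_share) * premium_price


CHEAP_SHARE = 0.8       # share of traffic routed to the cheap model (article's example)
CHEAP_PRICE = 2.0       # USD per million output tokens
PREMIUM_PRICE = 15.0    # USD per million output tokens

blended = blended_cost(CHEAP_SHARE, CHEAP_PRICE, PREMIUM_PRICE)

# What the premium provider still earns per million tokens of total traffic.
premium_revenue = (1.0 - CHEAP_SHARE) * PREMIUM_PRICE

# Buyer's saving relative to sending everything to the premium model.
saving = 1.0 - blended / PREMIUM_PRICE

print(f"Blended cost per million tokens: ${blended:.2f}")
print(f"Premium provider revenue per million tokens: ${premium_revenue:.2f}")
print(f"Buyer saving vs. all-premium: {saving:.0%}")
```

Under these assumptions the buyer pays about $4.60 per million tokens instead of $15, and the premium provider collects about $3 of that, down from $15 if it had handled all of the traffic.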
Disclosure: This article was edited by Editorial Team. For more information on how we create and review content, see our Editorial Policy.
