gentic.news — AI News Intelligence Platform

Microsoft executive at a podium announcing Copilot pricing changes, with a slide showing DeepSeek V4 logo and token…

Microsoft Ditches Unlimited Copilot Tokens, Taps DeepSeek V4 for Cost Cuts

Microsoft switched Copilot Cowork to usage-based pricing, adopting DeepSeek V4 to cut inference costs by ~40%. The move breaks Microsoft's exclusive reliance on OpenAI for first-party AI.

1d ago · 3 min read · AI-Generated
Source: pandaily.com (via Pandaily) · Widely Reported
Why did Microsoft move Copilot Cowork to usage-based pricing and adopt DeepSeek V4?

Microsoft shifted Copilot Cowork to usage-based pricing in June 2026, adopting DeepSeek V4 as a cheaper open-source alternative after internal costs for unlimited tokens proved unsustainable.

TL;DR

Microsoft Copilot Cowork moves to usage-based pricing · DeepSeek V4 replaces proprietary models to cut costs · Unlimited tokens prove unsustainable for enterprise AI

Microsoft switched Copilot Cowork to usage-based pricing in June 2026, replacing the flat per-seat plan and its unlimited tokens with per-token metering. The company adopted DeepSeek V4 as a cost-effective open-source alternative after internal costs for unlimited inference proved unsustainable, per Pandaily.

Key facts

  • Copilot Cowork shifted from $30/seat flat rate to per-token pricing in June 2026
  • DeepSeek V4 achieves 500K context with 90% less KV cache using FlashMemory
  • Microsoft committed over $50B in AI infrastructure by 2026
  • DeepSeek raised $7.4B at $50B valuation in first external round
  • DeepSeek V4 costs $0.002 per 1K input tokens, $0.008 per 1K output

Microsoft's Copilot Cowork, the enterprise AI assistant embedded across M365, now charges per token consumed rather than a flat seat fee. The change, effective June 2026, follows months of escalating inference costs that made unlimited pricing untenable for Microsoft's largest customers.

Why DeepSeek V4 Won the Bid

DeepSeek V4 entered production at Microsoft after beating proprietary models on cost-per-token by an order of magnitude. The model achieves 500K context with 90% less KV cache using FlashMemory, per DeepSeek's June 9 technical disclosure. Microsoft's infrastructure commitments — over $50B by 2026, disclosed June 15 — made efficiency non-negotiable.
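
To give a sense of why a 90% KV cache reduction matters at this context length, here is a back-of-the-envelope sizing sketch. It uses the standard formula for a conventional attention KV cache and says nothing about how FlashMemory works internally. The layer count, KV head count, head dimension, and 16-bit precision are hypothetical placeholders, not DeepSeek V4's published configuration, and "500K" is read as 500,000 tokens.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Size in bytes of a conventional attention KV cache for one sequence."""
    # Factor of 2: each layer stores one tensor of keys and one of values.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


def after_reduction(baseline_bytes: int, percent_saved: int) -> int:
    """Bytes remaining once `percent_saved` percent of the cache is eliminated."""
    return baseline_bytes * (100 - percent_saved) // 100


# Hypothetical model dimensions, chosen only for illustration.
HYPOTHETICAL_MODEL = dict(layers=60, kv_heads=8, head_dim=128)

baseline = kv_cache_bytes(seq_len=500_000, **HYPOTHETICAL_MODEL)
reduced = after_reduction(baseline, percent_saved=90)

if __name__ == "__main__":
    print(f"baseline: {baseline / 1e9:.2f} GB")
    print(f"after 90% reduction: {reduced / 1e9:.2f} GB")
```

With these placeholder dimensions, a single 500K-token sequence would need roughly 123 GB of cache at baseline and about 12 GB after a 90% cut. That is roughly the difference between exceeding the memory of a single 80 GB accelerator and fitting comfortably within it.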

DeepSeek's open-weight license lets Microsoft self-host and fine-tune without per-token royalties, a structure impossible with OpenAI's GPT-4 or Anthropic's Claude. The move also reduces Microsoft's dependence on OpenAI, its largest investment at $13B+, according to our knowledge graph.

The Pricing Shift

Copilot Cowork now meters by token volume, with enterprise tiers starting at $0.002 per 1K input tokens and $0.008 per 1K output tokens for DeepSeek V4. Prior unlimited plans cost $30 per seat per month. For a 10,000-seat deployment generating 100M tokens daily, the new model cuts costs by roughly 40%, per Microsoft's internal estimates shared with Pandaily.
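
The arithmetic behind this comparison can be sketched directly from the quoted prices. The snippet below is a rough illustration, not Microsoft's billing logic: the function names are hypothetical, and the 75/25 split between input and output tokens and the 30-day month are assumptions, since the article specifies neither.

```python
from decimal import Decimal

# Prices quoted in the article (USD).
INPUT_PRICE_PER_1K = Decimal("0.002")   # per 1K input tokens
OUTPUT_PRICE_PER_1K = Decimal("0.008")  # per 1K output tokens
FLAT_SEAT_PRICE = Decimal("30")         # per seat per month, prior plan


def flat_monthly_cost(seats: int) -> Decimal:
    """Monthly bill under the old flat per-seat plan."""
    return FLAT_SEAT_PRICE * seats


def metered_monthly_cost(tokens_per_day: int, input_share: Decimal,
                         days: int = 30) -> Decimal:
    """Monthly bill under per-token pricing.

    `input_share` is the assumed fraction of tokens billed as input;
    the remainder is billed as output.
    """
    input_tokens = Decimal(tokens_per_day) * input_share
    output_tokens = Decimal(tokens_per_day) - input_tokens
    daily = (input_tokens / 1000) * INPUT_PRICE_PER_1K \
        + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K
    return daily * days


if __name__ == "__main__":
    flat = flat_monthly_cost(10_000)
    # Assumed 75% input / 25% output split.
    metered = metered_monthly_cost(100_000_000, Decimal("0.75"))
    print(f"flat plan:    ${flat:,.2f} per month")
    print(f"metered plan: ${metered:,.2f} per month")
```

Under these assumptions the metered bill comes to $10,500 per month against $300,000 on the flat plan, and even an all-output workload would reach only $24,000. That gap is far wider than 40%, which suggests the ~40% figure may describe Microsoft's own inference costs, as the article's summary puts it, or a different workload profile, rather than this customer-side comparison.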

Microsoft's move mirrors a broader industry pivot from flat-rate to consumption-based AI pricing. OpenAI introduced usage tiers in March 2026; Anthropic followed in May.

Competitive Implications

DeepSeek's selection marks the first time Microsoft has deployed a non-OpenAI model as the default inference engine for a first-party product. The lab raised $7.4B at a $50B valuation on June 18 in its first external round, led by a $2.9B contribution from founder Liang Wenfeng.

Microsoft continues to invest in its own reasoning model, MAI-Thinking-1, unveiled June 3 with 35B active parameters and 97% on AIME 2025. But for cost-sensitive enterprise workloads, DeepSeek V4 now carries the load.

What to watch

Watch for Microsoft's Q4 FY2026 earnings call (expected late July), where the company may disclose Copilot Cowork's revenue mix and any customer churn from the pricing change. Also monitor whether DeepSeek V4 expands to Azure OpenAI Service as a first-party offering.





AI-assisted reporting. Generated by gentic.news from 3 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

Microsoft's pivot to DeepSeek V4 is a structural admission that proprietary frontier models are too expensive for mass-market enterprise deployment. The $50B infrastructure commitment makes token efficiency existential, and open-weight models like DeepSeek V4 offer a path that closed APIs cannot match.

This is also the first crack in Microsoft's OpenAI exclusivity. Despite $13B+ invested, Microsoft now runs a competitor's model in its flagship product. The logic is cold: DeepSeek V4's 90% KV cache reduction translates to real dollars when you're serving millions of enterprise seats. OpenAI's GPT-4o simply cannot compete on cost at scale.

The usage-based pricing shift mirrors what happened in cloud infrastructure, where AWS, Azure, and GCP made consumption-based billing the norm. AI is following the same curve. Expect Anthropic and Google to announce similar metering within 60 days.