
Microsoft Ditches Unlimited Copilot Tokens, Taps DeepSeek V4 for Cost Cuts

Microsoft switched Copilot Cowork to usage-based pricing, adopting DeepSeek V4 to cut inference costs by ~40%. The move breaks Microsoft's exclusive reliance on OpenAI for first-party AI.

Ala SMITH & AI Research Desk · 1d ago · 3 min read · AI-Generated

Source: pandaily.com via Pandaily · Widely Reported

Why did Microsoft move Copilot Cowork to usage-based pricing and adopt DeepSeek V4?

Microsoft shifted Copilot Cowork to usage-based pricing in June 2026, adopting DeepSeek V4 as a cheaper open-source alternative after internal costs for unlimited tokens proved unsustainable.

TL;DR

Microsoft Copilot Cowork moves to usage-based pricing · DeepSeek V4 replaces proprietary models to cut costs · Unlimited tokens prove unsustainable for enterprise AI

Microsoft switched Copilot Cowork to usage-based pricing in June 2026, replacing unlimited tokens with per-token metering. The company adopted DeepSeek V4 as a cost-effective open-source alternative after internal costs for unlimited inference proved unsustainable, according to Pandaily.

Key facts

Copilot Cowork shifted from $30/seat flat rate to per-token pricing in June 2026
DeepSeek V4 achieves 500K context with 90% less KV cache using FlashMemory
Microsoft committed over $50B in AI infrastructure by 2026
DeepSeek raised $7.4B at $50B valuation in first external round
DeepSeek V4 costs $0.002 per 1K input tokens, $0.008 per 1K output

Microsoft's Copilot Cowork, the enterprise AI assistant embedded across M365, now charges per token consumed rather than a flat seat fee. The change, effective June 2026, follows months of escalating inference costs that made unlimited pricing untenable for Microsoft's largest customers.


Why DeepSeek V4 Won the Bid


DeepSeek V4 entered production at Microsoft after beating proprietary models on cost-per-token by an order of magnitude. The model achieves 500K context with 90% less KV cache using FlashMemory, per DeepSeek's June 9 technical disclosure. Microsoft's infrastructure commitments — over $50B by 2026, disclosed June 15 — made efficiency non-negotiable.
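To give a sense of why a 90% KV cache reduction matters at 500K tokens of context, the sketch below applies the standard KV cache sizing formula for a decoder-only transformer. The article does not describe how FlashMemory works, so nothing here models it; the layer count, KV head count, head dimension, and 2-byte value width are hypothetical placeholders, not DeepSeek V4's actual configuration, and "500K" is read as 500,000 tokens.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Standard KV cache size, in bytes, for one sequence.

    The leading factor of 2 counts one key tensor and one value tensor
    stored per layer for every token in the sequence.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


def after_reduction(size_bytes: int, reduction: float) -> float:
    """Size remaining after cutting the cache by `reduction` (0.90 = 90% less)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return size_bytes * (1.0 - reduction)


# Hypothetical model dimensions, chosen only for illustration.
HYPOTHETICAL_MODEL = dict(n_layers=60, n_kv_heads=8, head_dim=128)
CONTEXT_TOKENS = 500_000  # assumed meaning of "500K context"

if __name__ == "__main__":
    baseline = kv_cache_bytes(seq_len=CONTEXT_TOKENS, **HYPOTHETICAL_MODEL)
    reduced = after_reduction(baseline, 0.90)
    print(f"baseline: {baseline / 1e9:.1f} GB per sequence")
    print(f"after 90% reduction: {reduced / 1e9:.1f} GB per sequence")
```

Because cache size grows linearly with sequence length, any per-token saving is multiplied across the full context window and across every concurrent request, which is where the memory and cost pressure comes from.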

DeepSeek's open-weight license lets Microsoft self-host and fine-tune without per-token royalties, a structure impossible with OpenAI's GPT-4 or Anthropic's Claude. The move also reduces Microsoft's dependence on OpenAI, its largest investment at $13B+, according to our knowledge graph.

The Pricing Shift

Copilot Cowork now meters by token volume, with enterprise tiers starting at $0.002 per 1K input tokens and $0.008 per 1K output tokens for DeepSeek V4. Prior unlimited plans cost $30 per seat per month. For a 10,000-seat deployment generating 100M tokens daily, the new model cuts costs by roughly 40%, per Microsoft's internal estimates shared with Pandaily.
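The quoted rates can be turned into a rough monthly bill with a few lines of arithmetic. The sketch below is list-rate arithmetic only: the split between input and output tokens and the 30-day month are assumptions the article does not supply, and the function names are hypothetical. It is not meant to reproduce the roughly 40% estimate, whose underlying inputs are not given in the source.

```python
# Rates and flat fee as quoted in the article.
INPUT_USD_PER_1K = 0.002
OUTPUT_USD_PER_1K = 0.008
FLAT_USD_PER_SEAT_MONTH = 30.0


def metered_monthly_cost(tokens_per_day: float, output_share: float,
                         days: int = 30) -> float:
    """Monthly cost in USD under per-token metering.

    output_share is the assumed fraction of daily tokens billed at the
    output rate; the remainder is billed at the input rate.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    input_tokens = tokens_per_day * (1.0 - output_share)
    output_tokens = tokens_per_day * output_share
    daily = (input_tokens / 1000.0) * INPUT_USD_PER_1K \
        + (output_tokens / 1000.0) * OUTPUT_USD_PER_1K
    return daily * days


def flat_monthly_cost(seats: int) -> float:
    """Monthly cost in USD under the prior flat per-seat plan."""
    return seats * FLAT_USD_PER_SEAT_MONTH


if __name__ == "__main__":
    # Deployment size from the article; the 25% output share is an assumption.
    metered = metered_monthly_cost(tokens_per_day=100_000_000, output_share=0.25)
    flat = flat_monthly_cost(seats=10_000)
    print(f"metered: ${metered:,.0f}/month, flat: ${flat:,.0f}/month")
```

Output tokens cost four times as much as input tokens at these rates, so the result is sensitive to the assumed `output_share`; workloads that generate long responses will sit toward the upper end of the range.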

Microsoft's move mirrors a broader industry pivot from flat-rate to consumption-based AI pricing. OpenAI introduced usage tiers in March 2026; Anthropic followed in May.

Competitive Implications

DeepSeek's selection marks the first time Microsoft has deployed a non-OpenAI model as the default inference engine for a first-party product. The lab raised $7.4B at a $50B valuation on June 18 in its first external round, which was led by a $2.9B contribution from founder Liang Wenfeng.

Microsoft continues to invest in its own reasoning model, MAI-Thinking-1, unveiled June 3 with 35B active parameters and 97% on AIME 2025. But for cost-sensitive enterprise workloads, DeepSeek V4 now carries the load.

What to watch

Watch for Microsoft's Q4 FY2026 earnings call (expected in late July), where the Copilot Cowork revenue mix and customer churn from the pricing change are expected to be disclosed. Also monitor whether DeepSeek V4 expands to Azure OpenAI Service as a first-party offering.


Sources cited in this article

Pandaily
DeepSeek's June
Microsoft's

Source: gentic.news · 1d ago · Author: Ala SMITH

AI-assisted reporting. Generated by gentic.news from 3 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

Microsoft's pivot to DeepSeek V4 is a structural admission that proprietary frontier models are too expensive for mass-market enterprise deployment. The $50B infrastructure commitment makes token efficiency existential, and open-weight models like DeepSeek V4 offer a path that closed APIs cannot match.

This is also the first crack in Microsoft's OpenAI exclusivity. Despite $13B+ invested, Microsoft now runs a competitor's model in its flagship product. The logic is cold: DeepSeek V4's 90% KV cache reduction translates to real dollars when you're serving millions of enterprise seats. OpenAI's GPT-4o simply cannot compete on cost at scale.

The usage-based pricing shift mirrors what happened in cloud infrastructure a decade ago, when AWS, Azure, and GCP pushed enterprise IT from fixed, up-front capacity purchases toward consumption-based billing. AI is following the same curve. Expect Anthropic and Google to announce similar metering within 60 days.

