Predictions Lab

Forecasts and trend signals from the gentic.news knowledge graph.

How to read this page

Every prediction is written by the brain after it reads the news, is scored 0–100 for confidence, and resolves automatically against future evidence. Sort by confidence to see the strongest signals; switch to Resolved to grade past calls.

Predictive Intelligence

AI-generated predictions backed by knowledge graph analysis of 89+ news sources. Each prediction cites specific entities, relationships, and trend signals — then gets automatically verified against real outcomes.

Calibrated · Falsifiable · Auto-verified

177 predictions made.

Each one is a falsifiable claim with a deadline and a confidence score. We watch the news, log the outcome, and report calibration honestly — including when we’re wrong.

Resolved

18%

31 of 177

Pending

open forecasts

Calibrated accuracy

75.8%

incl. partial credit

Calibration curve

Are we as confident as we should be?

X = stated confidence. Y = how often we were right. The diagonal is perfect calibration.

n = 31

Legend: Lab · Perfect calibration · ○ point size = sample count
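A minimal sketch of how a curve like this could be computed: group resolved forecasts by stated confidence, then compare each group's stated confidence with how often it was right. The function name, the 10-point bin width, the sample data, and the use of 0.5 credit for partial outcomes are all assumptions for illustration, not the lab's actual implementation.

```python
from collections import defaultdict

def calibration_curve(forecasts, bin_width=10):
    """Group forecasts by stated confidence and compute the observed hit rate.

    forecasts: iterable of (confidence_pct, credit) pairs, where credit is
    1.0 for correct, 0.5 for partial, and 0.0 otherwise (assumed scoring).
    Returns a list of (bin_midpoint, observed_rate_pct, sample_count).
    """
    bins = defaultdict(list)
    top_bin = 100 // bin_width - 1
    for confidence, credit in forecasts:
        # Clamp 100% into the top bin so it does not form a bin of its own.
        index = min(int(confidence // bin_width), top_bin)
        bins[index].append(credit)
    curve = []
    for index in sorted(bins):
        credits = bins[index]
        midpoint = index * bin_width + bin_width / 2
        observed = 100 * sum(credits) / len(credits)
        curve.append((midpoint, observed, len(credits)))
    return curve

# Hypothetical resolved forecasts, for illustration only.
sample = [(92, 1.0), (95, 1.0), (68, 1.0), (64, 0.0), (52.9, 0.5), (59.2, 0.0)]
for midpoint, observed, count in calibration_curve(sample):
    print(f"stated ~{midpoint:.0f}%  observed {observed:.0f}%  n={count}")
```

Points above the diagonal (observed rate higher than stated confidence) would indicate underconfidence; points below it would indicate overconfidence. The sample count per bin corresponds to the point size in the legend.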

Active predictions

Open forecasts, sorted by calibrated confidence

16 open

Nvidia Announces Azure-Exclusive Blackwell NIM Partnership

80%

Nvidia and Microsoft will announce a strategic partnership by end of Q2 2026 (June 30, 2026) where Azure becomes the exclusive cloud provider for Nvidia's NIM (Nvidia Inference Microservice) platform on Blackwell instances, with integrated billing and enterprise support.

2mo · 16 evidence

Google will launch Gemini API with per-second billing by Q2 2026

80%

Google will introduce per-second billing for Gemini API's Flex/Turbo tiers within 60 days, undercutting OpenAI's per-token pricing and targeting bursty agent workloads.

2mo · 16 evidence

Google Cloud will expose agent billing separate from Gemini

73.2%

Within the next quarter, Google Cloud will make at least one agentic coding or workflow tier bill separately from core Gemini usage, either through distinct metering, a dedicated SKU, or a usage policy that clearly decouples agent actions from raw model tokens. The tell will be that Google starts pricing the workflow layer, not just the model layer.

2mo · 36 evidence

GitHub Copilot adds first-party MCP policy controls

73.2%

Within the next quarter, GitHub will publicly ship a first-party MCP gateway or policy layer for Copilot-style workflows. The feature will be positioned around connector approval, tool allowlists, and auditability rather than raw model quality.

2mo · 38 evidence

Google will expose TPU pricing for agent workloads

72.8%

Within the next quarter, Google will make at least one TPU pricing or billing path explicitly distinct for agentic or workflow-heavy inference, not just generic model usage. The practical signal will be a separate SKU, calculator, or documented rate card that makes long-running tool-using workloads cheaper or easier to meter than standard Gemini calls.

2mo · 42 evidence

Google reprices Gemini for coding workloads

71.3%

Within the next quarter, Google will introduce a materially cheaper Gemini tier or usage policy aimed specifically at coding and agentic workflows. The move will be framed as developer-friendly pricing, but the real target will be Claude Code and OpenAI’s coding stack.

2mo · 43 evidence

Microsoft will push Copilot agents into a separate enterprise SKU

65%

Within the next quarter, Microsoft will split at least one Copilot agent capability into a more explicitly enterprise-governed SKU or add-on, rather than leaving it as a generic Copilot feature. The tell will be admin controls, policy hooks, or tenant-level governance becoming the headline, because Microsoft needs to defend margin and control as agentic workflows get more autonomous.

2mo · 41 evidence

Anthropic splits Claude Code billing from Claude AI

64.5%

Within the next month, Anthropic will make Claude Code materially more distinct from Claude AI in pricing or billing, with a separate seat, usage, or enterprise packaging layer. The change will not just be cosmetic: heavy coding users will be pushed into a different commercial bucket than general Claude users.

20d · 41 evidence

Predict with the lab

Will this happen? Cast your vote.

Your vote stays in your browser. We compare crowd intuition against the lab’s calibrated forecast.

Lab confidence: 80%

Resolves in 2mo

Nvidia Announces Azure-Exclusive Blackwell NIM Partnership

Trending signals

What’s shifting in the graph

Top movers from 7-day mention velocity.

1. microsoft · 8 mentions · 7d · ▲ 300%
2. data center · 3 mentions · 7d · ▲ 200%
3. healthcare ai · 3 mentions · 7d · ▲ 200%
4. china · 3 mentions · 7d · ▲ 200%
5. ai · 10 mentions · 7d · ▲ 150%
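The percentages above read as week-over-week changes in mention counts. A minimal sketch of that arithmetic follows, assuming the figure is the relative change against the previous 7-day window. The page does not show the previous-window counts; the values used here (2, 1, and 4) are back-solved assumptions that happen to reproduce the displayed percentages.

```python
def percent_change(current, previous):
    """Relative change in mentions between two 7-day windows, in percent."""
    if previous <= 0:
        # With no earlier mentions the ratio is undefined.
        raise ValueError("previous window must have at least one mention")
    return 100 * (current - previous) / previous

# Current counts are from the list above; previous counts are assumed.
print(percent_change(8, 2))   # microsoft
print(percent_change(3, 1))   # data center
print(percent_change(10, 4))  # ai
```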

Recently resolved

What we predicted vs what happened

last 6

✅ correct · said 68%

Claude Agent will add GitHub repository integration within 4 weeks

Auto-verified (confidence=85%, corroboration=72%, threshold=75%, web_search=yes): The prediction that Anthropic will release native GitHub integration for Claude Agent is substantively correct. Anthropic's official platform documentation ([W6]) explicitly describes connecting agents to GitHub for cloning, reading, and creating pull requests. The official 'claude-code-action' GitHub repository ([W7]) provides PR analysis, code implementation, and issue access. While no formal blog post was found, the verification criteria allow for 'developer documentation,' which these official sources fulfill. The launch of Claude Managed Agents ([W1]) provides the service infrastructure. The prediction's core claims (repository access, PR automation, and codebase analysis) are all confirmed by primary Anthropic sources. [Evidence FOR (4): [W6] Anthropic's platform documentation at platform.claude.com shows a dedicated page for 'Accessing GitHub' under Managed Agents, confirming that agents can 'Connect your agent to GitHub repositories for cloning, reading, and creating pull requests' and 'mount a GitHub repository to your session container and connect to the GitHub MCP for making pull re...'; [W7] Anthropic's official GitHub repository features 'claude-code-action', an interactive code assistant that 'Analyzes PR changes and suggests improvements', 'Can implement code changes and create commits/PRs', and 'Accesses GitHub issues, PRs, and code context', directly fulfilling the PR automation and codebase analysis criteria.; [W1] SiliconAngle reports on April 8, 2026 that 'Anthropic launches Claude Managed Agents to speed up AI agent development', a cloud service that likely underpins the GitHub integration. | Evidence AGAINST (3): No evidence found of an official Anthropic blog post specifically announcing a native GitHub integration for Claude Agent, though the verification criteria allow for 'developer documentation' which [W6] fulfills.; [W4] The Verge reports on a Claude Code source code leak showing unreleased features, but none of the leaked features described include a native GitHub integration; this absence is weak evidence against.]

resolved Apr 26

⏱ expired · said 59.2%

Anthropic will ship Claude Code enterprise billing within 30 days

Auto-expired: past deadline, inconclusive (confidence=0%, corroboration=0%, web_search=yes)

resolved Apr 24

✅ correct · said 59.9%

Anthropic's Claude Code becomes harder to buy standalone

Auto-verified (confidence=85%, corroboration=75%, threshold=85%): The key evidence [DB-11] directly confirms the core prediction: Anthropic removed Claude Code from the $20/month Pro plan and moved it to $100+ tiers, making it materially less standalone and pushing heavy users toward higher-tier plans. This is a visible pricing/bundling change that shifts the economic center of gravity away from a simple developer tool. While some evidence shows continued standalone use, the pricing change is a concrete manifestation of the predicted tightening. The corroboration score is high because [DB-11] is a credible source (Towards AI, aggregated across multiple feeds) and directly matches the verification criteria. [Evidence FOR (4): [DB-11] Anthropic Removes Claude Code from $20 Plan, Signals AI Pricing Shift — confirms Claude Code was removed from the $20/month Pro plan and moved to $100+ tiers, directly supporting the prediction of tightened bundling and pricing changes.; [DB-11] The same article notes this reflects high operational costs and signals a pricing shift, aligning with the prediction that economic center of gravity shifts away from a simple developer tool.; [DB-1] Anthropic published a post-mortem on Claude Code quality issues, indicating active management and potential tightening of access/usage rules. | Evidence AGAINST (4): [DB-0] Cua open-sourced a driver that allows Claude Code to drive macOS apps, suggesting continued standalone utility and ecosystem growth.; [DB-3] AgentBox SDK allows running Claude Code in any sandbox, indicating the tool remains flexible and standalone for developers.]

resolved Apr 23

✅ correct · said 92%

ChatGPT Commerce API Launch

Auto-verified (confidence=90%, corroboration=85%, threshold=75%, web_search=yes): The prediction is verified by OpenAI's official blog post (WEB-6) announcing 'Powering Product Discovery in ChatGPT' with the Agentic Commerce Protocol for product discovery, comparison, and merchant integration, meeting the verification criteria of an official announcement. This is corroborated by a third-party article (WEB-7) detailing Stripe's integration for AI shopping in ChatGPT. The evidence confirms the substance of the prediction—commerce-specific capabilities—before the May 31, 2026 deadline, with authoritative sources providing clear confirmation. [Evidence FOR (4): [WEB-6] OpenAI blog post titled 'Powering Product Discovery in ChatGPT' announces ChatGPT introduces richer, visually immersive shopping powered by the Agentic Commerce Protocol, enabling product discovery, side-by-side comparisons, and merchant integration.; [WEB-7] Article 'Stripe's Agentic Commerce Suite Powers AI Shopping in ChatGPT ...' confirms Stripe's Agentic Commerce Suite lets brands sell through ChatGPT and Copilot via one integration, indicating a commerce-specific feature with checkout integration.; [W1] Zendrop launches a Model Context Protocol (MCP) server that gives AI assistants like ChatGPT the ability to run a store, supporting the idea of commerce integration for ChatGPT. | Evidence AGAINST (3): [DB-0] to [DB-24] No database articles mention an OpenAI commerce-specific API or agent feature; all are about unrelated topics like health advice, user growth, or competitor releases.; [DB-9] OpenAI shifts ChatGPT ads to CPC, targeting ad revenue, but this is about advertising, not a commerce-specific API for product search/checkout.]

resolved Apr 21

❌ incorrect · said 80%

Alibaba announces Qwen 4.0 with OpenSandbox agent platform integration at their Cloud Summit in June 2026

Auto-verified (confidence=85%, corroboration=41%, threshold=85%): The prediction's core claim is the launch of Qwen 4.0 at the Alibaba Cloud Summit. Multiple database news items from April 2026 ([DB-1], [DB-2], [DB-4]) explicitly refer to Qwen 3.6 as the latest released model, with one calling Qwen 3.6 Plus the current 'frontier model.' This directly contradicts the existence of a launched Qwen 4.0. While evidence shows Alibaba is active in AI and the Qwen series, the specific predicted entity (Qwen 4.0) has not materialized. The deadline for the 'typically June' summit has not passed, but the evidence shows a different, contradictory reality (Qwen 3.6 is the current version), moving the judgment from 'inconclusive' to 'incorrect.' [Evidence FOR (4): [DB-11] Alibaba's Qwen Hits 1B Downloads, Captures 50% of Open-Source Market (April 10, 2026). This shows the Qwen family is active and successful, providing context for a future major release.; [DB-1] Alibaba Makes Qwen 3.6 Plus API-Only, Shifts Frontier Model to Paid Access (April 19, 2026). This indicates a strategic shift towards monetizing advanced models, aligning with a potential premium Qwen 4.0 launch.; [DB-2] Qwen 3.6 Released: Free, Open-Weights Model for Local AI Coding (April 17, 2026). This confirms ongoing development and release of the Qwen series, with version 3.6 being the latest announced model. | Evidence AGAINST (3): [DB-2] Qwen 3.6 Released: Free, Open-Weights Model for Local AI Coding (April 17, 2026). This directly contradicts the prediction, as the latest announced model is Qwen 3.6, not Qwen 4.0.; [DB-1] Alibaba Makes Qwen 3.6 Plus API-Only... (April 19, 2026). This discusses Qwen 3.6 Plus as the current 'frontier model,' with no mention of Qwen 4.0.]

resolved Apr 21

⚠️ partial · said 52.9%

Anthropic will turn Claude Code into a background PR agent

Auto-verified (confidence=75%, corroboration=65%, threshold=75%, web_search=yes): The prediction specified that Claude Code would publicly ship a mode for autonomous pull-request workflows (fixing CI, responding to reviews, opening follow-up PRs) within a month, shifting it to an 'always-on repo operator.' Evidence shows Anthropic launched 'Routines' for Claude Code in mid-April 2026, which is an automation feature described as moving beyond interactive assistance. However, the verification criteria require a specific background/autonomous PR workflow for the narrow tasks listed, and the articles about 'Routines' do not explicitly confirm it handles CI fixes, review responses, or follow-up PRs end-to-end. The substance is partially met—Claude Code gained automated capabilities—but the specific predicted workflow is not verified. [Evidence FOR (5): [W0] VentureBeat article (April 14, 2026) reports Anthropic launched 'Routines' in research preview with the Claude Code desktop app redesign, describing it as a shift toward automation.; [W1] SiliconANGLE article (April 14, 2026) confirms the launch of 'Routines' in Claude Code, stating it allows automation of tasks without relying on autonomous AI agents.; [W2] Thurrott article reports a redesigned Claude desktop app supporting parallel agents for running more Code tasks simultaneously. | Evidence AGAINST (2): [DB-0], [DB-1], [DB-2], [DB-3], [DB-5], [DB-6], [DB-7], [DB-8], [DB-9], [DB-10], [DB-11], [DB-13], [DB-14], [DB-15], [DB-16], [DB-17], [DB-18] — None of these database articles mention the specific predicted feature (autonomous PR workflow for fixing CI, responding to reviews, or opening follow-up PRs).; [W3], [W4], [W5], [W6], [W7] — Web search results discuss leaks, Mac control, or general news but do not confirm the specific narrow pull-request workflow capability.]

resolved Apr 20

Cross-links

Findings library: the discoveries, hypotheses, and observations behind these forecasts.

Knowledge graph: the entities and relationships powering each prediction.

Predictor Leaderboard

Top 30 anonymous voters · ranked by accuracy on resolved predictions

75.8%

Accuracy (n=31)

59.6%

Avg Confidence

Methodology & Accuracy Tracking

How predictions are made

Predictions are generated by analyzing trend signals across 42+ AI news sources, enriched with knowledge graph relationships between entities (companies, people, technologies). Each prediction includes a confidence score and target date.

How accuracy is computed

Accuracy = (correct + partial × 0.5) ÷ total evaluated. All resolved predictions count — including expired ones (treated as failures). Sample size is shown next to the accuracy figure.
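The formula above can be sketched as a small function. The function and parameter names are illustrative assumptions. The page does not publish the outcome breakdown, so the counts in the example are hypothetical: they were chosen only because they sum to the reported n = 31 and yield 75.8%, and other splits give the same result.

```python
def calibrated_accuracy(correct, partial, incorrect, expired):
    """Accuracy = (correct + partial * 0.5) / total evaluated.

    Expired predictions stay in the denominator, so they count as failures.
    """
    total = correct + partial + incorrect + expired
    if total == 0:
        raise ValueError("no resolved predictions to score")
    return (correct + 0.5 * partial) / total

# Hypothetical breakdown of 31 resolved predictions.
score = calibrated_accuracy(correct=23, partial=1, incorrect=5, expired=2)
print(f"{100 * score:.1f}%")
```

Keeping expired predictions in the denominator means an unresolvable forecast lowers the score exactly as an incorrect one does.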

Verification process

Past-deadline predictions are verified via 3-layer evidence: entity-linked articles, keyword search, and web search. An AI judge evaluates evidence for and against, requiring high confidence thresholds before resolving.
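Resolved entries on this page carry a header such as "Auto-verified (confidence=85%, corroboration=72%, threshold=75%, web_search=yes)". The sketch below parses that header into fields. The function names are hypothetical, and the rule in clears_threshold (the judge's confidence must reach the threshold) is an assumption: it is consistent with the headers shown on this page but is not stated anywhere in the methodology.

```python
import re

def parse_verification_header(text):
    """Split a header such as 'Auto-verified (confidence=85%, ...)' into fields."""
    match = re.match(r"\s*([^(]+?)\s*\(([^)]*)\)", text)
    if match is None:
        raise ValueError("no verification header found")
    fields = {"status": match.group(1)}
    for item in match.group(2).split(","):
        key, _, value = item.strip().partition("=")
        # Percentages become floats; flags such as web_search stay as text.
        fields[key] = float(value[:-1]) if value.endswith("%") else value
    return fields

def clears_threshold(fields):
    """Assumed rule: the judge's confidence must reach the stated threshold."""
    threshold = fields.get("threshold")
    return threshold is not None and fields["confidence"] >= threshold

header = "Auto-verified (confidence=85%, corroboration=72%, threshold=75%, web_search=yes)"
parsed = parse_verification_header(header)
print(parsed["status"], parsed["confidence"], clears_threshold(parsed))
```

Headers without a threshold field, such as the auto-expired entries, are treated here as not clearing the bar.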

Possible outcomes

Correct — prediction confirmed by evidence
Partially Correct — core thesis confirmed with caveats
Incorrect — contradicted by evidence
Expired — deadline passed, insufficient evidence

Trending Signals

ai +150% · retrieval-augmented generation +100% · data center +200% · ai models +150% · reinforcement learning +100% · agentic ai +100% · microsoft +300% · healthcare ai +200% · china +200% · ai hardware +100%

Active Predictions (16)

Impact · startup · Knowledge Graph

2mo left · yesterday

Cursor will add first-party MCP governance

Within the next quarter, Cursor will ship a first-party MCP policy or connector-management layer aimed at enterprise teams. The tell will be admin controls for allowed tools, connector approval, or auditability rather than another model-quality feature.

Confidence: 35% (Speculative) · Target: Jul 31, 2026

Reasoning: Cursor is directly competing with Claude Code, and the graph shows that rivalry is already dense enough to force platform-level differentiation. The recent wave of MCP-focused content and Claude Security launches suggests the market is moving from 'which model is best' to 'which agent runtime is governable.' Cursor cannot win a pure model race against frontier labs, so the rational move is to own the enterprise control plane around tool access and workflow policy. This would be wrong if Cursor stays consumer/developer-centric and does not expose any MCP governance primitives by the end of the quarter.

How we verify: Cursor publicly releases MCP policy controls, connector approval, or audit features for enterprise users.

Cursor · Model Context Protocol

Relationships: Cursor competes_with Claude Code · Claude Code uses Model Context Protocol · Anthropic developed Model Context Protocol · Cursor competes_with GitHub Copilot

Events: 2026-04-29: Cursor SDK Turns AI Agent Runtime into Programmable Infrastructure · 2026-05-01: Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2

Sentiment: Cursor is under competitive pressure from Claude Code and Copilot · MCP-related sentiment remains strategically positive

Momentum: Model Context Protocol: 12 mentions (declining) [velocity: 0.1x] · Cursor: 8 mentions (surging) [velocity: 3.0x]

Patterns: convergence · competitive_shift · precursor

Event · big tech · Knowledge Graph

2mo left · 2d ago

Google will expose TPU pricing for agent workloads

Confidence: 55% (Possible) · Target: Jul 29, 2026

Reasoning: Google just opened TPU sales to select customers and raised capex forecasts on 2026-04-30, which usually precedes a more explicit monetization layer. The graph also shows Google’s AI infrastructure momentum accelerating while keyword surges around "data center" and "AI infrastructure" are spiking, suggesting the company wants to convert hardware control into pricing leverage. Because Google is already competing with OpenAI and Anthropic on both models and infrastructure, a separate agent billing path is the cleanest way to defend margin without a headline-grabbing model launch. This would be invalidated if Google keeps TPU access purely bespoke and never surfaces a public agent-specific pricing construct by end of quarter.

How we verify: A public Google pricing page, cloud console, or documented rate card shows a TPU or Gemini billing option specifically labeled for agentic/workflow-heavy usage, with distinct pricing or metering from standard inference.

Google

Relationships: Google developed Gemma 4 · Google developed Gemini · Google competes_with Nvidia · Google developed Gemini 3 Pro · Anthropic competes_with Google · OpenAI competes_with Google · Google competes_with OpenAI · Google competes_with Anthropic

Events: Google: Funding $5B+ Texas data center for Anthropic with 500MW by 2026 (2026-12-31) · Google's $5B+ Texas data center investment for Anthropic, scheduled for completion by 2026 · Google Opens TPU Sales to Select Customers, Raises Capex Forecast (2026-04-30) · Google: Google's $5B+ Texas data center investment for Anthropic, scheduled for completion by 2026 (2026-12-31)

Sentiment: Sentiment toward AI infrastructure: +0.85 · Sentiment toward Google: +0.9

Momentum: Google: 45 mentions [velocity: 0.9x]

Patterns: convergence · competitive_shift · precursor

Impact · big tech · Knowledge Graph

2mo left · 1w ago

OpenAI will undercut coding access again, but only in API form

Within the next quarter, OpenAI will reduce effective pricing or expand usage limits for at least one coding-relevant API tier, but it will not do so through a broad ChatGPT discount. The move will be narrowly aimed at developer retention, not consumer growth, and will look more like a tactical API response than a product reset.

Confidence: 55% (Possible) · Target: Jul 24, 2026

Reasoning: OpenAI is under direct pressure from Anthropic's Claude Code momentum and the graph's repeated OpenAI-vs-Anthropic competitive edges, while the current news already shows GPT-5.5 is expensive and still imperfect. That combination usually produces selective price relief in the developer channel first, because it is the fastest way to defend usage without reworking the consumer bundle. If OpenAI keeps pricing unchanged across coding APIs for the whole quarter, this call is wrong; if it cuts effective cost on one coding tier, it confirms the defensive move.

How we verify: OpenAI publicly lowers effective pricing, increases limits, or adds a cheaper coding-relevant API tier visible on its pricing page or in documented usage terms.

OpenAI · ChatGPT

Relationships: OpenAI hired Sam Altman · OpenAI developed GPT-5.3 · OpenAI developed ChatGPT · OpenAI developed GPT-5.2 Pro · OpenAI competes_with Google · OpenAI developed GPT-3.5 · Anthropic competes_with OpenAI · OpenAI developed GPT-4o

Events: OpenAI: Targets deployment of first 'AI intern' by September 2028 (2028-09-01) · OpenAI: Forecasts $121 billion in AI research hardware costs for 2028 (2028-12-31) · OpenAI: Targets $2.4B revenue this year and $11B by 2027 from its new performance advertising platform. (2027-12-31) · Current news: GPT-5.5 tops benchmarks but costs 2x API price · OpenAI: 82 recent / 492 total

Sentiment: OpenAI competitive pressure is rising · Microsoft sentiment dropped from 0.31 to 0.11

Momentum: OpenAI: 82 mentions [velocity: 0.9x] · ChatGPT: 20 mentions [velocity: 0.8x]

Patterns: convergence · competitive_shift · precursor

Trend · big tech · Knowledge Graph

2mo left · 1w ago

Google Cloud will expose agent billing separate from Gemini

Confidence: 55% (Possible) · Target: Jul 24, 2026

Reasoning: Google is already under pressure from the graph's 200% surge in 'google cloud' and the recent $5B+ Texas data center commitment for Anthropic, which signals that infrastructure is becoming strategic, not just capacity. At the same time, the current news shows Google is investing heavily in agentic retail and cloud workflows, which usually forces a packaging change once usage becomes operational rather than experimental. If Google keeps everything bundled, this prediction fails; if it introduces a separate agent meter, seat, or workflow SKU, it confirms the shift.

How we verify: A publicly visible Google Cloud or Gemini pricing/packaging change introduces a separate billable tier, meter, or SKU for agentic workflow usage distinct from standard model token billing.

Gemini

Relationships: Google developed Gemini · Google competes_with OpenAI · Google Cloud is surging in recent articles · Google competes_with Anthropic

Events: Google: $5B+ Texas data center investment for Anthropic, scheduled for completion by 2026 · Google Cloud keyword surge: +200% in 7d

Sentiment: Sentiment toward Microsoft: 0.31 → 0.11 (negative, shift=-0.20)

Momentum: Gemini: 16 mentions (declining) [velocity: 0.3x]

Patterns: convergence · competitive_shift · precursor

Event · product · Knowledge Graph

3w left · 1w ago

OpenAI will split Codex pricing from ChatGPT

Within the next month, OpenAI will make Codex materially more distinct from ChatGPT in pricing or packaging, with a separate developer-facing billing surface or usage tier. The practical result will be that coding-heavy customers stop being treated as generic ChatGPT users and start being sold a dedicated workflow product.

Confidence: 89% (Very Likely) · Target: May 24, 2026

Reasoning: OpenAI's sentiment has softened while GPT-5.5 and Codex-related stories are surfacing in the current news stream, which usually precedes packaging changes rather than pure model launches. The graph also shows OpenAI competing with Anthropic, Claude Code, and Cursor, all of which are pushing the market toward workflow-specific products instead of general chat. That makes Codex the obvious place for OpenAI to defend developer mindshare without having to reprice ChatGPT broadly. The prediction is wrong if OpenAI keeps Codex bundled exactly as-is or only changes model quality without any visible packaging shift.

How we verify: OpenAI publicly introduces a separate Codex pricing, billing, seat, or usage tier that is distinct from standard ChatGPT packaging.

OpenAI · ChatGPT

Relationships: OpenAI developed GPT-4o · OpenAI hired Sam Altman · OpenAI developed GPT-3.5 · Anthropic competes_with OpenAI · Google competes_with OpenAI · OpenAI developed GPT-5.3 · OpenAI competes_with Claude Code · OpenAI competes_with Google

Events: OpenAI: Forecasts $121 billion in AI research hardware costs for 2028 (2028-12-31) · OpenAI: Targets $2.4B revenue this year and $11B by 2027 from its new performance advertising platform. (2027-12-31) · OpenAI GPT-5.5: 82.7% Terminal-Bench, 35.4% FrontierMath Tier 4 · OpenAI: Targets deployment of first 'AI intern' by September 2028 (2028-09-01) · OpenAI GPT-5.5 Pricing Doubles to $5/$30 per 1M Tokens

Sentiment: Sentiment toward OpenAI: 0.11 · Sentiment toward GPT-3.5: 0.45

Momentum: OpenAI: 96 mentions [velocity: 1.1x] · ChatGPT: 19 mentions [velocity: 0.9x]

Impact · startup · Knowledge Graph

2mo left · 1w ago

MCP security vendors ship enterprise controls

Within the next quarter, at least two security vendors will launch MCP-specific enterprise controls such as connector approval, tool logging, or policy enforcement. The market will form around the uncomfortable fact that the same protocol making agents useful also makes them governable only if someone owns the control plane.

Confidence: 47% (Speculative) · Target: Jul 23, 2026

Reasoning: MCP is now showing up as a strategic layer rather than a mere integration detail, and the graph explicitly records multiple discoveries about MCP becoming the new API battleground. Claude Code’s rapid rise is pulling MCP into production workflows, which creates a security problem before it creates a standards problem. If enterprise adoption keeps rising, security tooling will follow almost immediately because procurement teams will not allow unconstrained tool access into internal systems.

How we verify: At least two independent security vendors publicly release MCP-specific enterprise controls, including connector governance, policy enforcement, or audit logging.

Model Context Protocol

Relationships: Anthropic developed Model Context Protocol · Anthropic competes_with GitHub and Microsoft in agent workflows · Claude Code uses Model Context Protocol · MCP is becoming the de facto standard for agent tool integration

Events: Claude Desktop's native messaging bridge surfaced in the last 72 hours · MCP co-occurs heavily with Claude Code in the discovery layer · Multiple MCP-related discoveries appeared in early April

Sentiment: Sentiment toward unconstrained agent access: cautious · Sentiment toward MCP: rising strategic importance · Sentiment toward enterprise controls: positive

Momentum:Model Context Protocol: 22 mentions [velocity: 0.8x]

Trend · big tech · Knowledge Graph

2mo left · 1w ago

Google reprices Gemini for coding workloads

Confidence: 72% (Likely) · Target: Jul 23, 2026

Reasoning: Google is already under pressure in the graph from both Anthropic and OpenAI, and the recent signal set shows Google Cloud and agentic workflow language accelerating at the same time. The knowledge graph also includes a direct causal discovery that Google’s aggressive API price cuts are forcing margin pressure across the inference layer, which makes another pricing move plausible rather than speculative. If Google wants to win developer mindshare, price is the fastest lever; if it does not move, it risks ceding the coding-agent category to Anthropic.

How we verify: Google publicly lowers Gemini pricing or expands usage limits for a coding/agent-oriented tier by at least 20% versus the prior comparable tier.

Google · Gemini

Relationships: Google competes_with OpenAI · OpenAI competes_with Google · Google developed Gemini Embedding 2 · Google developed Gemma 4 · Gemini competes_with Claude Opus 4.6 · Google developed Gemini · Google developed Gemini 3 Pro · Google competes_with Anthropic

Events: Google: Funding $5B+ Texas data center for Anthropic with 500MW by 2026 (2026-12-31) · Google is repeatedly linked to pricing pressure in the causal discoveries · Google: Google's $5B+ Texas data center investment for Anthropic, scheduled for completion by 2026 (2026-12-31) · Google has a $5B+ Texas data center investment for Anthropic by 2026 · Google Cloud and agentic AI stories surged in the last 7 days

Sentiment: Sentiment toward Google: neutral to slightly pressured · Sentiment toward Gemini: subdued relative to Claude Code · Sentiment toward coding/agent pricing: increasingly competitive

Momentum: Google: 59 mentions [velocity: 0.6x] · Gemini: 16 mentions [velocity: 0.5x]

Impact · product · Knowledge Graph

3w left · 1w ago

Anthropic splits Claude Code billing from Claude AI

Confidence: 90% (Very Likely) · Target: May 24, 2026

Reasoning: Claude Code has the highest recent momentum in the graph, with 126 recent mentions versus 42 for Claude AI, which is a strong sign the product is becoming the commercial center of gravity. The recent headlines around Claude Code quality issues, rate-limit workarounds, and migration pressure suggest Anthropic is already dealing with usage concentration and monetization friction. If Claude Code keeps absorbing developer attention while Anthropic’s model business remains more commoditized, separating billing is the cleanest way to capture value without confusing the core Claude brand.

How we verify: Anthropic introduces a distinct Claude Code billing, seat, or enterprise packaging structure that is separate from standard Claude AI access.

Anthropic · Claude Code

Relationships: Anthropic developed Claude 3 · Anthropic developed Claude Sonnet 4.6 · Claude Code uses GitHub · OpenAI competes_with Anthropic · Anthropic developed Claude Agent · Claude Code competes_with GitHub Copilot · Anthropic developed Claude AI · Anthropic developed Claude Cowork

Events: Claude AI had 42 recent mentions in 14 days · Claude Code quality issues and migration deadline stories appeared in the last 72 hours · Claude Code had 126 recent mentions in 14 days

Sentiment: Sentiment toward Claude Code: high but volatile · Sentiment toward Anthropic: mixed, with strong developer interest · Sentiment toward Microsoft: negative shift

Momentum: Anthropic: 138 mentions [velocity: 0.7x] · Claude Code: 126 mentions [velocity: 0.7x]

Event · big tech · Knowledge Graph

2mo left · 1w ago

GitHub Copilot adds first-party MCP policy controls

Confidence: 55% (Possible) · Target: Jul 23, 2026

Reasoning: Claude Code’s MCP usage is now one of the clearest protocol-level signals in the graph, and GitHub is already repeatedly linked as a competitor in that layer. The recent surge in Claude Code coverage plus the explicit discovery that MCP is becoming the de facto agent integration standard makes it hard for GitHub to stay purely on the sidelines. If GitHub does not add policy controls, it risks losing the enterprise trust layer to Anthropic and security vendors; if it does, that confirms MCP has moved from developer novelty to platform battleground.

How we verify: GitHub publicly releases a first-party MCP gateway, policy layer, or equivalent connector-control feature for Copilot/Copilot Studio workflows.

GitHub · Model Context Protocol

Relationships: GitHub competes_with Anthropic; Claude Code uses GitHub; GitHub competes_with Claude Code; Claude Code uses Model Context Protocol; Anthropic developed Model Context Protocol

Events: Claude Desktop's undisclosed native messaging bridge appeared in the last 72 hours; Claude Code showed 126 recent mentions in the last 14 days; Model Context Protocol has 22 recent mentions and is rising

Sentiment: Sentiment toward MCP: increasingly strategic; Sentiment toward Microsoft: negative shift; Sentiment toward Claude Code: strongly positive

Momentum: GitHub: 26 mentions (declining) [velocity: 0.2x]; Model Context Protocol: 22 mentions [velocity: 0.8x]

Impact · startup · Knowledge Graph

2mo left · 1w ago

DeepSeek will trigger a margin reset for AI middleware startups

Within the next quarter, at least two AI middleware or wrapper startups will publicly reprice, repackage, or downshift margins after DeepSeek’s next open-weights releases make inference economics visibly worse for them. The hidden dynamic is that cheap frontier-quality open models don’t just pressure closed labs — they compress the value of orchestration layers that were charging for access to model capability rather than workflow depth.

Confidence: 47% (Speculative) · Target: Jul 23, 2026

Reasoning: DeepSeek is surging in the graph, and the current headlines show DeepSeek V4-Pro and DeepSeek 4 already pushing open-weights performance and cost efficiency hard. When a model family gets both cheaper and more capable, the first casualties are not the labs — they are the middleware businesses whose pricing assumes model scarcity. The recent surge in API and model-release language, plus the Nvidia/compute bottleneck context, suggests the market is entering a price war where wrappers lose differentiation fastest. This prediction is invalidated if DeepSeek’s releases stay technically impressive but fail to move real pricing or if middleware vendors keep pricing unchanged through the quarter.

How we verify: At least two AI middleware or wrapper startups publicly change pricing, packaging, or margin assumptions in response to cheaper open-weights model competition, with DeepSeek cited or clearly implicated.

Relationships: OpenAI competes_with DeepSeek; Nvidia competes_with DeepSeek; Anthropic competes_with DeepSeek

Events: DeepSeek-V4: 1M-token context at 10% KV cache cost; DeepSeek V4-Pro: 1.6T parameters, open weights, undercuts rivals 10x

Sentiment: Sentiment toward middleware startups: -0.2; Sentiment toward DeepSeek: +0.7

Trend · research · Knowledge Graph

2mo left · 1w ago

DeepSeek's next model will self-train on synthetic outputs

Within the next quarter, DeepSeek will ship or describe a next-step model pipeline that relies primarily on synthetic data generated by its own prior model family. The interesting part is not just synthetic data use, but the first clearly productionized self-improvement loop from a major open-weight challenger.

Confidence: 47% (Speculative) · Target: Jul 23, 2026

Reasoning: The current news cycle is already centered on DeepSeek V4/V4-Pro, including 1.6T parameters, open weights, and aggressive cost undercutting. That combination creates a strong incentive to reduce dependence on scarce human-labeled data and to lean into synthetic generation for scale and iteration speed. The graph also shows rising attention on model release mechanics and benchmark pressure, which is exactly where self-training loops become strategically valuable. This is wrong if DeepSeek’s next release is just a larger static model with no visible synthetic-data pipeline.

How we verify: DeepSeek publicly states or demonstrates that a new model was trained primarily using synthetic data generated from its own prior model outputs.

large language models

Relationships: Nvidia competes_with DeepSeek; OpenAI competes_with DeepSeek; Anthropic competes_with DeepSeek

Events: DeepSeek 4 Released: What We Know So Far; DeepSeek V4-Pro: 1.6T parameters, open weights, undercuts rivals 10x

Sentiment: Sentiment toward DeepSeek: rising; Sentiment toward model release topics: surging

Momentum: large language models: 26 mentions (rising) [velocity: 1.9x]

Event · product · Basic Analysis

2mo left · 3w ago

Alibaba will push Qwen 3.6 Plus into a developer-facing coding or multimodal release within one quarter

Qwen 3.6 Plus is the strongest surge in the graph, and its cascade hits Alibaba, Qwen2-VL-2B, and Qwen3-Coder-Next. That pattern implies Alibaba is not treating it as a standalone model win; it is being used to seed adjacent coding and multimodal products, likely with a public developer release or benchmark push.

Confidence: 61% (Possible) · Target: Jul 5, 2026

Reasoning: [Strategic Forecast] Qwen 3.6 Plus surge + high-confidence cascade into Qwen3-Coder-Next and Qwen2-VL-2B + live web attention around frontier performance on consumer hardware → Alibaba is likely to operationalize the model across developer and multimodal surfaces. The second-order effect is competitive pressure on open-weight coding models, especially if Alibaba wants to defend mindshare against Google and Anthropic.

How we verify: Alibaba announces a Qwen 3.6 Plus-based coding, multimodal, or agentic developer release, or publishes a benchmark/blog post tying Qwen 3.6 Plus to Qwen3-Coder-Next within 90 days.

Qwen 3.6 Plus

Event · big tech · Knowledge Graph

2mo left · 3w ago

Nvidia Announces Azure-Exclusive Blackwell NIM Partnership

Confidence: 80% (Likely) · Target: Jul 4, 2026

Reasoning: [Agent Investigation] Nvidia occupies the central infrastructure position in the AI ecosystem, with its Blackwell platform and NIM microservices creating a de facto standard for AI agent deployment. However, its trajectory shows rising but decelerating sentiment, indicating market saturation of its core GPU narrative and growing competitive pressure from hyperscalers (Google's TPU v6, AWS Trainium) and AI labs (Anthropic's hardware ambitions) seeking to reduce dependency. | The data pattern suggests Nvidia is pivoting from pure hardware dominance to a full-stack platform play (NIM, NeMo, simulation toolchain) to lock in developers and defend against commoditization. The re-enrichment signals for Chinese AI giants (Moonshot, MiniMax, DeepSeek) imply intensified competition in sovereign AI clouds, forcing Nvidia to deepen partnerships with Western hyperscalers or risk being bypassed by alternative stacks. | [PRE-MORTEM] Disproven if: 1) AWS announces an exclusive NIM partnership instead, 2) Microsoft announces its own competing AI inference microservice platform, or 3) No partnership announcement occurs by June 30, 2026.

How we verify: Official joint announcement from Nvidia and Microsoft Azure blogs, or a keynote at Microsoft Build 2026 (May 20-22) featuring Jensen Huang announcing the exclusive NIM-on-Azure offering.

Nvidia

Relationships: Nvidia → developed → Blackwell; Nvidia → developed → Nemotron-Cascade 2; Nvidia → developed → Nemotron 3 Super; Nvidia → invested → OpenAI; Nvidia → developed → NeMoClaw; Jensen Huang → founded → Nvidia; Rohan Paul → hired → Nvidia; Claude Code → uses → Nvidia; ChatGPT → uses → Nvidia; OpenAI → competes_with → Nvidia

Events: [2026-04-04] product_launch: Launched Blackwell GPU architecture and NIM microservice platform to serve AI agent infrastructure needs; [2026-04-03] product_launch: Set performance records using 288-GPU Blackwell Ultra systems on MLPerf Inference v6.0; [2026-04-02] research_milestone: Nvidia claims MLPerf Inference v6.0 records with 288-GPU Blackwell; [2026-04-01] product_launch: Spotlighted its physical AI toolchain (simulation, synthetic data, AI learning) during National Robotics Week 2026; [2026-03-31] product_launch: Released DLSS 4.5, a major update to its AI upscaling technology with new frame generation modes and improved ray reconstruction

Sentiment: Nvidia 2026-03-02: +0.23 (3 mentions); Nvidia 2026-03-09: +0.48 (30 mentions); Nvidia 2026-03-16: +0.42 (31 mentions); Nvidia 2026-03-23: +0.44 (14 mentions); Nvidia 2026-03-30: +0.40 (10 mentions)

Momentum: Nvidia (company): 143 mentions

Event · product · Knowledge Graph

2mo left · 4w ago

Google will launch Gemini API with per-second billing by Q2 2026

Google will introduce per-second billing for Gemini API's Flex/Turbo tiers within 60 days, undercutting OpenAI's per-token pricing and targeting bursty agent workloads.

Confidence: 80% (Likely) · Target: Jul 4, 2026

Reasoning: [Agent Investigation] Google is executing a multi-front strategy: commoditizing inference with aggressive API pricing (50% cuts), embedding AI at the device level (Gemini Nano on Android), and funding strategic partners (Anthropic) while maintaining research leadership. However, it faces intense competition from OpenAI on frontier models and Anthropic on safety/enterprise, while its core search business faces disruption from AI-native interfaces. | The stable/accelerating sentiment (+0.343) despite pricing wars suggests Google is winning on distribution and integration depth. The Texas data center investment for Anthropic indicates Google is hedging against OpenAI by backing a strategic competitor while securing cloud infrastructure revenue. The simultaneous push on device AI (Nano) and API commoditization (Flex/Turbo) shows it is attacking both the edge and cloud markets, a classic Google envelopment strategy. | [PRE-MORTEM] Disproven if Google maintains per-token pricing or introduces monthly subscriptions instead, which would indicate it prioritizes enterprise predictability over granular usage.

How we verify: Official Google Cloud blog post or API documentation update showing per-second billing for Gemini API.

Google

Relationships: Google → developed → Gemini Embedding 2; Google → developed → Gemini; Google → developed → Gemini 3.0 Pro; Anthropic → competes_with → Google; OpenAI → competes_with → Google; Rohan Paul → hired → Google; AI Agents → uses → Google; Ethan Mollick → hired → Google

Events: [2026-12-31] partnership: Funding $5B+ Texas data center for Anthropic with 500MW by 2026; [2026-12-31] funding: Google's $5B+ Texas data center investment for Anthropic, scheduled for completion by 2026; [2026-04-05] research_milestone: Co-authored a paper with Stanford and MIT proposing a method for LLMs to self-improve their prompts; [2026-04-03] product_launch: Released a beta of AICore enabling manual downloads of Gemini Nano 4 models onto Android phones; [2026-04-03] product_launch: Launched 'Flex' and 'Turbo' tiers for Gemini API, cutting standard pricing by 50%

Sentiment: Google 2026-03-02: +0.33 (13 mentions); Google 2026-03-09: +0.30 (44 mentions); Google 2026-03-16: +0.24 (25 mentions); Google 2026-03-23: +0.30 (39 mentions); Google 2026-03-30: +0.34 (28 mentions)

Momentum: Google (company): 231 mentions

Impact · big tech · Knowledge Graph

2mo left · 4w ago

Microsoft will push Copilot agents into a separate enterprise SKU

Confidence: 50% (Possible) · Target: Jul 3, 2026

Reasoning: Microsoft's momentum is surging at 3.2x, and the recent Copilot AI agents announcement shows the company is moving from assistant branding toward virtual-worker framing. At the same time, Microsoft competes with both OpenAI and Anthropic, so it has incentive to differentiate on enterprise control rather than raw model quality. The graph also shows Microsoft under pressure from Claude integration threats and from the broader agent-security conversation, which makes SKU separation a rational commercial response. This would be invalidated if Copilot agents remain bundled broadly with no new enterprise packaging or governance layer by the end of the quarter.

How we verify: Microsoft introduces a distinct enterprise SKU, add-on, or tenant-governed packaging for Copilot agent capabilities, separate from the base Copilot offering.

Microsoft · OpenAI

Relationships: Anthropic competes_with OpenAI; OpenAI developed GitHub Copilot; OpenAI competes_with Google; Microsoft competes_with OpenAI; Microsoft competes_with Anthropic; Microsoft partnered OpenAI; Ethan Mollick hired OpenAI; OpenAI developed GPT-4o

Events: Microsoft Announces Copilot AI Agents That Function as Virtual Employees; Microsoft acquired Inflection; OpenAI: Targets deployment of first 'AI intern' by September 2028 (2028-09-01)

Sentiment: Sentiment toward Microsoft: +0.5; Sentiment toward Copilot: +0.4

Momentum: OpenAI: 98 mentions [velocity: 0.8x]; Microsoft: 21 mentions (surging) [velocity: 3.2x]

Patterns: convergence; competitive_shift; precursor

Impact · big tech · Knowledge Graph

2mo left · 4w ago

Microsoft will add Claude to Microsoft 365 workflows

Microsoft will announce a Claude integration path for at least one Microsoft 365 workflow within the next quarter, likely through Copilot Studio or a partner connector rather than a headline consumer feature. The strategic point is not model quality; it is Microsoft hedging against overdependence on OpenAI by making Anthropic a sanctioned second source inside the productivity stack.

Confidence: 25% (Speculative) · Target: Jul 3, 2026

Reasoning: Microsoft’s momentum is surging sharply in the graph, and it already sits in a contradictory position: partnered with OpenAI, but also competing with both OpenAI and Anthropic across multiple surfaces. The recent analyst note that Claude integration into Microsoft 365 poses a real threat to Copilot suggests the market is already discussing this as a plausible defensive move. Microsoft has also been active in agentic tooling, which makes a multi-vendor workflow layer more likely than a pure model bet. This would be invalidated if Microsoft doubles down exclusively on OpenAI and refuses any Claude path in M365 this quarter.

How we verify: A Microsoft 365, Copilot Studio, or related Microsoft productivity workflow publicly supports Claude as an available model or connector.

Microsoft · Anthropic

Relationships: Anthropic competes_with Google; Anthropic developed Model Context Protocol; Anthropic developed Claude Agent; Anthropic developed Claude Code; Microsoft competes_with OpenAI; AI Agents uses Anthropic; Rohan Paul hired Anthropic; Microsoft competes_with Anthropic

Events: Anthropic: Considering an initial public offering (IPO) as soon as October 2026 (2026-10-27); Anthropic: Reportedly considering an initial public offering as early as October 2026 and has held early discussions with banks (2026-10-01); 2026-04-04: Microsoft announced Copilot AI agents that function as virtual employees; 2026-04-03: Analyst warned Claude integration into Microsoft 365 poses a real threat to Copilot

Sentiment: Sentiment toward Anthropic in enterprise workflows: rising; Sentiment toward Microsoft: surging

Momentum: Anthropic: 143 mentions [velocity: 0.6x]; Microsoft: 21 mentions (surging) [velocity: 3.2x]

Patterns: convergence; competitive_shift; precursor

Frequently asked questions

What is an AI prediction on gentic.news?

Each prediction is a falsifiable, dated forecast about the AI industry — for example 'Claude Opus 4.7 will exceed 90% on SWE-Bench Verified before 2026-09-01' or 'OpenAI will announce a 1GW+ training campus this quarter'. Predictions cite specific entities and relationships from our knowledge graph, carry a confidence score (0–100), have a hard deadline, and get auto-verified against actual outcomes. We publish the full history — correct, incorrect, partially correct, and expired — so accuracy is auditable.

How are predictions generated?

An AI agent reads our knowledge graph (4,749+ AI entities, 4,890+ relationships) and the latest articles every few hours, looking for patterns: hiring spikes, product cadence, partnership signals, benchmark trajectories, capex announcements. When the agent finds a high-signal pattern, it drafts a falsifiable claim with a deadline, attaches the entities and articles as evidence, and assigns a confidence based on signal strength and historical accuracy on similar prediction types.

How is each prediction verified?

When the deadline arrives, a verification job re-queries our graph and a curated set of authoritative sources (official announcements, benchmark leaderboards, SEC filings, regulator notices) for evidence either way. The outcome is one of: correct, partially correct, incorrect, or expired (no confirming or refuting evidence found). Outcomes are immutable once recorded, and the calibration curve at the top of the page shows how well stated confidence matches actual hit-rate by bin.

What is the current accuracy rate?

Our AI-generated predictions are running at roughly 78% accuracy on resolved items, with the highest accuracy in the 80–95% confidence bins. We deliberately avoid 99%-confidence calls — these tend to be trivially true ('OpenAI will release something in 2026') and don't add information. The full breakdown — correct, partially correct, incorrect, expired — is visible on the leaderboard, and the calibration curve shows where we're under- or over-confident.
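For readers who want to see how an accuracy figure that includes partial credit could be computed from the four outcome types, here is a minimal sketch. The weights are an assumption made for illustration (full credit for correct, half for partially correct, none for incorrect, expired items left out of the denominator); the page does not publish its actual weighting, and the function name `accuracy_with_partial_credit` is hypothetical.

```python
# Assumed credit per outcome. These weights are illustrative only;
# the site does not state how much a partially correct call is worth.
CREDIT = {"correct": 1.0, "partially_correct": 0.5, "incorrect": 0.0}


def accuracy_with_partial_credit(outcomes):
    """Return mean credit over scored outcomes, skipping 'expired' ones.

    Returns None when nothing has been scored yet.
    """
    scored = [CREDIT[o] for o in outcomes if o in CREDIT]
    if not scored:
        return None
    return sum(scored) / len(scored)


# Made-up sample: 6 correct, 2 partially correct, 2 incorrect, 1 expired.
outcomes = (
    ["correct"] * 6
    + ["partially_correct"] * 2
    + ["incorrect"] * 2
    + ["expired"]
)
print(accuracy_with_partial_credit(outcomes))  # 0.7
```

Under these assumed weights, a different treatment of expired items (for example, counting them as misses) would lower the figure, which is why the choice should be stated next to any headline accuracy number.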

Can I make my own prediction?

Yes. The Community tab on this page lets anyone submit a falsifiable AI prediction. Submissions need a clear claim, a deadline, and ideally a rationale. Cookie-based identity tracks your accuracy on the predictor leaderboard — no account required. Community predictions go through the same verification flow as AI-generated ones, and your hit-rate / Brier score appears on the leaderboard once you've resolved at least three.
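As a rough illustration of the Brier score mentioned above, the sketch below uses the standard definition: the mean squared difference between the stated probability and the 0/1 outcome. How the leaderboard treats partially correct or expired predictions is not described on this page, so the example assumes plain binary outcomes; the function name `brier_score` and the sample data are made up.

```python
def brier_score(forecasts):
    """Mean squared error between stated probability and binary outcome.

    `forecasts` is a list of (confidence_percent, happened) pairs,
    where confidence is on the page's 0-100 scale.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = 0.0
    for confidence_percent, happened in forecasts:
        p = confidence_percent / 100.0
        total += (p - (1.0 if happened else 0.0)) ** 2
    return total / len(forecasts)


# Three resolved predictions, the minimum before a score is shown.
resolved = [(80, True), (55, False), (25, False)]
print(round(brier_score(resolved), 4))  # 0.135
```

Lower is better: 0 is a perfect record, and a predictor who always says 50% scores 0.25 regardless of outcomes.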

Why publish predictions that turn out wrong?

Because hiding losses kills calibration. A forecasting system that only shows wins is uncalibrated by construction. We surface every incorrect and partially correct prediction with the original confidence, evidence, and deadline. This lets readers see whether our 75% confidence calls actually hit ~75% of the time (well-calibrated) or 60% / 90% (mis-calibrated). The calibration plot is updated nightly.
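The check described above (do 75% calls hit about 75% of the time?) can be sketched as a simple binning exercise. This is an illustrative version only: the 10-point bin width, the function name `calibration_table`, and the sample data are assumptions, not the site's actual pipeline.

```python
def calibration_table(forecasts, bin_width=10):
    """Group resolved forecasts into confidence bins and compare each
    bin's mean stated confidence with its observed hit rate.

    `forecasts` is a list of (confidence_percent, happened) pairs.
    Returns {bin_start: (n, mean_confidence, hit_rate_percent)}.
    """
    bins = {}
    for confidence, happened in forecasts:
        # A 100% call goes into the top bin rather than a bin of its own.
        start = min(int(confidence // bin_width) * bin_width, 100 - bin_width)
        bins.setdefault(start, []).append((confidence, happened))

    table = {}
    for start in sorted(bins):
        rows = bins[start]
        n = len(rows)
        mean_conf = sum(c for c, _ in rows) / n
        hit_rate = 100.0 * sum(1 for _, h in rows if h) / n
        table[start] = (n, mean_conf, hit_rate)
    return table


# Made-up sample: four 75% calls (three hit), one 90% hit, one 55% miss.
resolved = [(75, True), (75, True), (75, True), (75, False),
            (90, True), (55, False)]
for start, (n, mean_conf, hit_rate) in calibration_table(resolved).items():
    print(f"{start}-{start + 10}%: n={n}, "
          f"stated={mean_conf:.0f}%, observed={hit_rate:.0f}%")
```

Each bin is one point on a calibration curve: stated confidence on the X axis, observed hit rate on the Y axis, with the count `n` indicating how much weight the point deserves. Bins with only one or two forecasts say very little on their own.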

