unverified claims
30 articles about unverified claims in AI news
GPT-5.5 Stealth Test Reports Emerge, Claiming Performance Over Opus 4.7
Social media reports suggest OpenAI may be conducting limited, unannounced testing of GPT-5.5. Initial, unverified claims from testers indicate it outperforms Anthropic's Opus 4.7 model.
Alleged OpenAI Codex Codebase Leak Circulates on X, Unverified
An unverified claim that the full OpenAI Codex codebase has leaked is circulating on social media. There is no official confirmation and no leaked source code has been substantiated, so the report remains in doubt.
Jensen Huang Claims NVIDIA Has 'Achieved AGI' in Lex Fridman Interview, Sparking Industry Debate
NVIDIA CEO Jensen Huang stated in a Lex Fridman podcast interview that he believes his company has 'achieved AGI.' The brief, unverified claim has ignited immediate discussion about the definition and benchmarks for artificial general intelligence.
Kunluncore Files STAR Market IPO, Claims 32K GPU Cluster First
Kunluncore filed for a STAR Market IPO and claims to be first with a 32K-GPU cluster, testing investor appetite for domestic AI chips.
AI Agent 'Business OS' Emerges, Claims Full GUI-Based Business Automation
A developer announced an AI agent that operates a business through a GUI, not just chat. The claim suggests a shift from task-specific AI to full-process automation.
Alibaba's XuanTie C950 CPU Hits 70+ SPECint2006, Claims RISC-V Record with Native LLM Support
Alibaba's DAMO Academy launched the XuanTie C950, a RISC-V CPU scoring over 70 on SPECint2006, which it claims is the highest single-core performance for the architecture, with native support for billion-parameter LLMs like Qwen3 and DeepSeek V3.
GLM-5.1 Claims Autonomous Self-Improvement Without Human Metrics
Zhipu AI's GLM-5.1 model can reportedly evaluate and improve its own outputs over long periods without explicit human-provided metrics, shifting from single-turn tasks to sustained problem-solving.
Massive Video Reasoning Dataset Released, Reportedly 1000x Larger Than Predecessors
An unverified report claims the release of a video reasoning dataset roughly 1000x larger than existing benchmarks. If true, it would be a significant resource for training next-generation video understanding models.
Sam Altman Aims for '5T Tokens Per Day' as OpenAI Reportedly Scales GPT-5.4
Sam Altman stated his goal is to flood the market with AI tokens, comparing intelligence to a utility. A separate, unverified report claims GPT-5.4 is processing '5T tokens per day' in its first week.
DeemosTech Rodin Gen-2.5: 10M-Polygon 3D GenAI in 4 Seconds
DeemosTech claims Rodin Gen-2.5 generates 10M-polygon 3D models with skin microstructures in 4 seconds, but provides no benchmarks or technical details.
Gemini Flash Rumored at 92% of GPT-5.5 Coding, 15-20x Cheaper
An unconfirmed rumor claims Gemini Flash achieves 92% of GPT-5.5's coding performance at 15-20x lower cost. The source is a single X post, and there is no official confirmation.
GPT-5.5 + Codex Combines App Building, Browser Use, Image Gen
@intheworldofai claims GPT-5.5 + Codex is a super app better than Claude Code, with 7 capabilities including app building, debugging, browser use, and image generation.
Rumor: Anthropic's Next Claude Update May Include AI App Builder
A rumor on X claims the next Claude update will include an app builder, allowing users to create applications through conversational AI. This could significantly lower the barrier to app development.
Mythos AI Model Reportedly 'Destroys' Benchmarks in Early Leak
A viral tweet claims the unreleased Mythos AI model 'destroys every other model' based on leaked benchmarks. No official confirmation or technical details are available.
AI Weekly: GPT-6 Rumors, DeepSeek V4 on Huawei, Anthropic Models, Qwen 3.6-Plus
A weekly roundup video aggregates major AI rumors and announcements, including unverified GPT-6 details, DeepSeek V4 reportedly running on Huawei hardware, and launches of Anthropic's Conway and Ultraplan and Alibaba's Qwen 3.6-Plus.
Gamma 31B Model Reportedly Outperforms Qwen 3.5 397B, Highlighting Efficiency Leap
A developer's social media post claims the Gamma 31B model outperforms the much larger Qwen 3.5 397B. If verified, this would represent a dramatic efficiency gain in large language model scaling.
Anthropic's Opus 5 and OpenAI's 'Spud' Rumored as Major AI Leaps, Prompting Security Concerns
A Fortune report, cited on social media, claims Anthropic's upcoming Opus 5 model is a 'massive leap' from Claude 3.5 Sonnet, posing significant security risks. OpenAI is also rumored to have a similarly advanced model, 'Spud,' in development.
Claude 'Mythos' Leak Suggests New Tier Beyond Opus 4.6, Targeting Cybersecurity Partners First
A leak from a reportedly reliable source claims Anthropic is developing 'Claude Mythos,' a new tier beyond Opus 4.6 with major gains in coding, reasoning, and cybersecurity. The model is described as so compute-intensive that initial access will be limited to select cybersecurity partners.
Frontier AI Models Reportedly Score Below 1% on ARC-AGI v3 Benchmark
A social media post claims frontier AI models score below 1% on the ARC-AGI v3 benchmark, suggesting a potential limit to current scaling approaches. No specific models or scores were disclosed.
Professors at NYU, Stanford, and Case Western Reportedly Using NotebookLM to Automate Course Creation
Professors at three major universities have reportedly stopped building courses manually and are using Google's NotebookLM AI to automate the process. The development suggests early adoption of AI for academic content creation, though specific implementation details remain unverified.
Apple WWDC 2026: Gemini Deeply Integrated into iOS
A tweet from @kimmonismus claims Apple's 2026 WWDC will be the most exciting yet, with the first deep integration of a useful AI model (Gemini) into iOS and a new Apple CEO.
Unnamed Python Rewrite Gains 47K+ GitHub Stars in 5 Hours, Breaks Platform Velocity Record
An unidentified Python rewrite project amassed over 47,000 GitHub stars in just five hours, a velocity faster than any previous project in the platform's history. The viral surge suggests a high-demand tool or library, though its exact nature and technical merits remain unverified.
Anthropic's Claude Allegedly Has Secret 'Benjamin Franklin Persuasion & Leverage Machine' Mode
A viral tweet claims Anthropic's Claude AI has a hidden mode designed for persuasion and leverage analysis. No official confirmation or technical details have been provided by the company.
Anthropic's Claude Reportedly Has 'Ikigai Career Mapper' Feature for Personalized Career Guidance
A viral social media post claims Anthropic's Claude AI has a hidden 'Ikigai Career Mapper' mode that analyzes skills, passions, and income needs to suggest career paths. No official confirmation or technical details have been provided by Anthropic.
No Rigorous Productivity Tests Exist for Post-2025 Autonomous Coding Tools
No productivity studies exist for autonomous coding tools launched December 2025. All research predates the Claude Code/Codex revolution, creating a major knowledge gap.
Alibaba + Nanjing Univ Claim 9.36X Faster Million-Token Prefill vs FlashAttention-2
Researchers at Alibaba and Nanjing University claim a 9.36x faster million-token prefill than FlashAttention-2, targeting the key bottleneck in long-context LLM inference.
Claude Reaches 30M Daily Users; Anthropic Scales
Claude has reportedly reached 30 million daily users, according to a third-party claim that Anthropic has not confirmed. The figure, if accurate, shows growing consumer adoption but would still trail ChatGPT.
SenseTime Open-Sources Omni-Modal Model That Thinks in Pixels and Words
SenseTime open-sourced an omni-modal AI model that reasons in pixel-word space without a visual encoder or VAE, challenging dominant multimodal architectures.
Google CodeWiki Turns GitHub Repos Into Interactive Docs
Google launched CodeWiki, which turns any GitHub repo into interactive docs with diagrams, tutorials, and a chatbot. It differentiates itself by prioritizing structure over file summarization.
MIT Hackathon Team Builds Wearable AI for Physical Movement Guidance
An MIT hackathon team built a wearable AI that provides real-time physical movement guidance using sensors and on-device inference, as demoed by @kimmonismus.