kimi
30 articles about kimi in AI news
Kimi 2.6 Thinking Shows Promise as Open Weights Model, Lags Behind Closed SoTA
An initial evaluation of Moonshot AI's Kimi 2.6 Thinking model finds it generates extensive reasoning traces but delivers only 'okay-ish' results on creative and coding tasks, highlighting the persistent open vs. closed model gap.
Moonshot AI's Kimi K2.6 Hits 58.6% on SWE-Bench Pro, Leads Open-Source Coding
Moonshot AI released Kimi K2.6, an open-source coding model achieving 58.6% on SWE-Bench Pro and 54.0% on HLE with tools. This positions it as a top-tier open alternative to proprietary models like Claude 3.5 Sonnet.
Stealth 100B Model Appears on OpenRouter, Possibly DeepSeek or Kimi
A new, unannounced 100-billion-parameter AI model has appeared on the OpenRouter API platform. Its origin is unknown, but observers speculate it could be a variant from DeepSeek or an update to Kimi's code model.
Kimi 2.6 Code Model Teased in Leaked Image, Suggesting Moonshot AI Update
A screenshot circulating online appears to show a 'Kimi 2.6' code model interface, suggesting Moonshot AI is preparing an update to its Kimi Chat platform focused on coding tasks.
Alibaba's Qwen3.6-Plus Reportedly Under Half the Size of Kimi K2.5, Nears Claude Opus 4.5 Performance
Alibaba's Tongyi Lab announced Qwen3.6-Plus, a model reportedly under half the size of Moonshot's Kimi K2.5 while approaching Claude Opus 4.5 performance, signaling major efficiency gains in China's LLM race.
Fireworks AI Launches 'Fire Pass' with Kimi K2.5 Turbo at 250 Tokens/Second
Fireworks AI has launched a new 'Fire Pass' subscription offering access to Kimi K2.5 Turbo at speeds up to 250 tokens/second. The service includes a free trial followed by a $7 weekly subscription.
Moonshot AI Launches Kimi Slides: AI Tool Converts Notes into Investor-Ready Presentations
Moonshot AI has launched Kimi Slides, an AI-powered presentation generator that converts unstructured notes into investor-ready slide decks. The tool is positioned as a direct competitor to high-cost freelance presentation designers.
Kimi Launches 'Kimi Slides' AI Presentation Tool, Claims 5-Minute Investor Deck Creation
Moonshot AI's Kimi chatbot has launched a new feature called Kimi Slides that generates investor-ready presentations from messy notes in 5 minutes, positioning itself against professional design services.
Kimi 2.5's 1T Parameter MoE Model Runs on 96GB Mac Hardware via SSD Streaming
Developers have demonstrated that Kimi 2.5's 1 trillion parameter Mixture-of-Experts model can run on Mac hardware with just 96GB RAM by streaming expert weights from SSD, with only 32B parameters active per token.
Step-3.5-Flash: 196B Open-Source MoE Model Activates Only 11B Parameters, Outperforms Kimi K2.5 and Claude Opus 4.5 on Key Benchmarks
Shanghai-based StepFun's Step-3.5-Flash, a 196B parameter sparse mixture-of-experts model that activates only 11B parameters per token, achieves top scores on AIME 2025 (97.3) and LiveCodeBench-V6 (86.4) while costing 18.9x less to run than Kimi K2.5.
Moonshot AI's Kimi Introduces Attention Residuals to Mitigate Deep-Layer Information Loss in LLMs
Moonshot AI's Kimi team proposes Attention Residuals, a novel mechanism replacing standard residual connections. It allows each layer to attend to and selectively retrieve information from any previous layer, improving performance on long-context reasoning tasks.
Kimi's Selective Layer Communication Improves Training Efficiency by ~25% with Minimal Inference Overhead
Kimi has developed a method that replaces uniform residual connections with selective information routing between layers in deep AI models. This improves training stability and achieves ~25% better compute efficiency with negligible inference slowdown.
NVIDIA's Kimi-K2.5 Eagle Head: Supercharging Moonshot's Reasoning with Speculative Decoding
NVIDIA has released the Kimi-K2.5 Eagle head on Hugging Face, implementing Eagle-3 speculative decoding to dramatically accelerate inference for Moonshot's reasoning models. This breakthrough promises blazing-fast performance while maintaining accuracy.
Cursor AI Meets Kimi K2.5: The Rapid Prototyping Revolution in Software Development
The integration of Cursor AI's code editor with Kimi's K2.5 model enables developers to transform simple prompts into functional applications in under a minute, dramatically accelerating the prototyping phase and lowering barriers to software creation.
Kimi's Meteoric Rise: How Moonshot AI's Chatbot Became China's Fastest $10B Unicorn
Moonshot AI's Kimi chatbot generated more revenue in just 20 days than in all of 2025, achieving a $10 billion valuation in just over two years. This explosive growth signals a major shift in China's AI landscape and global AI competition.
Kimi Launches OpenClaw-Powered Workspace: China's Browser-Based AI Revolution
Kimi has unveiled Kimi Claw, a browser-based AI workspace featuring 24/7 operation, 5,000+ community skills, 40GB cloud storage, and native OpenClaw integration. This development represents China's growing influence in accessible, cloud-native AI tools.
Kimi Team's 'Attention Residuals' Replace Fixed Summation with Softmax Attention, Boosts GPQA-Diamond by +7.5%
Researchers propose Attention Residuals, a content-dependent alternative to standard residual connections in Transformers. The method improves scaling laws, matches a baseline trained with 1.25x more compute, and adds under 2% inference overhead.
Free-Claude-Code Proxy Routes Anthropic API to Free NVIDIA NIM Models
A developer released free-claude-code, a proxy that intercepts Claude Code's API calls and routes them to free NVIDIA NIM endpoints, unlocking free access to models like Kimi K2 and GLM 4.7. This bypasses Anthropic's subscription fees and adds remote execution via a Telegram bot.
DeepSeek V4 Begins Limited Rollout with Fast, Expert, Vision Modes
DeepSeek V4 is reportedly in limited gray-scale testing with a new interface offering Fast, Expert, and Vision modes. This mirrors competitor Kimi's tiered system and suggests a move towards performance-based rate limiting.
Moonshot AI CEO Yang Zhilin Advocates for Attention Residuals in LLM Architecture
Yang Zhilin, founder of Moonshot AI, argues for the architectural value of attention residuals in large language models. This technical perspective comes from the creator of the popular Kimi Chat model.
Alibaba Cloud's $3 Coding Plan Disrupts AI Development Market
Alibaba Cloud has launched a unified coding subscription offering four frontier AI models for just $3, potentially reshaping how developers access and use coding assistants. The plan includes Qwen 3.5-Plus, Kimi K2.5, MiniMax M2.5, and GLM-5 in a single package.
DeepSeek V4-Pro: 1.6T parameters, open weights, undercuts rivals 10x
DeepSeek unveiled V4-Pro and V4-Flash, its largest open-weight models with up to 1.6 trillion parameters and a 1M-token context window. The new hybrid attention architecture cuts compute for long contexts by 73–90%, enabling prices far below OpenAI, Google, and Anthropic.
Moonshot AI Ships Trillion-Parameter Open Model, Matches Claude Opus on Coding
Moonshot AI released a trillion-parameter open-source model that reportedly matches Anthropic's Claude Opus on most coding benchmarks. This follows the same day Anthropic committed $25B to AWS for compute, highlighting divergent AI scaling strategies.
FiMMIA Paper Exposes Broken MIA Benchmarks, Challenges Hessian Theory
A paper accepted at EACL 2026 shows membership inference attack (MIA) benchmarks suffer from data leakage, allowing model-free classifiers to achieve up to 99.9% AUC. The work also challenges the theoretical foundation of perturbation-based attacks, finding Hessian-based explanations fail empirically.
US Closed-Source AI Models Maintain Frontier Lead, Meta Re-Enters Race
An analysis of frontier AI model makers shows US closed-source leaders (Google, OpenAI, Anthropic) maintaining a significant lead, with Meta re-entering the race. The best Chinese models remain 7-9+ months behind released US models.
MetaClaw Enables Deployed LLM Agents to Learn Continuously with Fast & Slow Loops
MetaClaw introduces a two-loop system allowing production LLM agents to learn from failures in real-time via a fast skill-writing loop and update their core model later in a slow training loop, boosting accuracy by up to 32% relative.
Moonshot AI Explores Hong Kong IPO Amid $1B Funding Round at $18B Valuation
Moonshot AI is considering a Hong Kong IPO while pursuing a new funding round of up to $1 billion at an $18 billion pre-money valuation. This signals a strategic shift for the Chinese 'AI Tiger' from private capital to public markets.
The Claude OAuth Workaround Is Dead. Here's How to Cut Your Claude Code API Bill Today
Anthropic killed the OAuth token exploit. Use TeamoRouter's 50% discount and multi-provider routing to slash Claude Code costs without crypto.
Multi-Agent Coding Systems Compared: Claude Code, Codex, and Cursor
A hands-on comparison reveals three fundamentally different approaches to multi-agent coding. Claude Code distinguishes between subagents and agent teams, Codex treats it as an engineering problem, and Cursor implements parallel file-system operations.
Minimax Confirms Abab 6.5 Pro Model as 'Minimax 2.7' in Teaser Announcement
Minimax has officially branded its upcoming Abab 6.5 Pro model as 'Minimax 2.7' in a teaser announcement. This confirms the company's next major model release is imminent.