kimi

30 articles about kimi in AI news

Moonshot AI Releases 1.56T-Parameter Kimi K3, Requires 2x B200 Nodes

Moonshot AI released Kimi K3, a 1.56T-parameter MoE model of 1561 GB that requires 2x B200 nodes. No benchmarks were disclosed.

Jul 27, 2026 | 100% relevant
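
As a rough sanity check on the figures in the summary above, the sketch below works out the average bytes per parameter and the node count needed for the weights alone. The 8 GPUs per node and 180 GB of usable memory per GPU are assumptions, not figures from the article, and KV cache and activation memory are ignored.

```python
import math

def bytes_per_param(checkpoint_gb: float, params_trillions: float) -> float:
    """Average storage per parameter, taking 1 GB as 1e9 bytes."""
    return (checkpoint_gb * 1e9) / (params_trillions * 1e12)

def nodes_for_weights(checkpoint_gb: float,
                      gpus_per_node: int = 8,       # assumed node layout
                      gb_per_gpu: float = 180.0) -> int:  # assumed usable memory
    """Smallest number of nodes whose combined GPU memory holds the weights."""
    node_capacity_gb = gpus_per_node * gb_per_gpu
    return math.ceil(checkpoint_gb / node_capacity_gb)

bpp = bytes_per_param(1561, 1.56)
nodes = nodes_for_weights(1561)
print(f"{bpp:.2f} bytes/param, {nodes} nodes")  # prints "1.00 bytes/param, 2 nodes"
```

About one byte per parameter would be consistent with an 8-bit weight format, though the article does not state the precision.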

Jensen Huang: DeepSeek, Kimi open models boost Nvidia sales

Jensen Huang says the Chinese open models DeepSeek and Kimi boost demand for Nvidia GPUs rather than threatening it, and that the market has twice misread their impact.

Jul 25, 2026 | 93% relevant

Moonshot AI's Kimi K3: 2.8T params, 1M token window, $3/M input

Moonshot AI released Kimi K3, a 2.8T-parameter mixture-of-experts model with 1M token context window and $3/M input pricing, claiming autonomous chip design and research capabilities.

Jul 16, 2026 | 100% relevant

NVIDIA Releases FP4 Quantized Kimi-K2.7-Code with 1T Parameters

NVIDIA released FP4 quantized Kimi-K2.7-Code on Hugging Face, a 1T-parameter model for Blackwell GPUs with claimed accuracy retention.

Jul 10, 2026 | 90% relevant

NVIDIA Nemotron 3 Ultra: 550B Open-Weight Model Challenges GLM, Kimi

NVIDIA released Nemotron 3 Ultra, a 550B open-weight model claiming near-SOTA performance, competing with GLM-5.1 and Kimi K2.6. No benchmarks yet.

Jun 1, 2026 | 87% relevant

Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup

Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next GPU cloud, validated by an independent third party.

May 23, 2026 | 93% relevant

Moonshot AI's Kimi WebBridge Lets Agent Use Your Logged-In Sessions

Moonshot AI released Kimi WebBridge, a browser extension that lets its Kimi agent use your logged-in sessions. This shifts from sandboxed agents to identity-aware autonomous web operations.

May 20, 2026 | 92% relevant

CoreWeave Tops Kimi K2.6 Inference Speed

CoreWeave tops 10 other providers on speed and price-performance for Moonshot AI's Kimi K2.6 in an Artificial Analysis benchmark.

May 11, 2026 | 81% relevant

Kimi 2.6 Thinking Shows Promise as Open Weights Model, Lags Behind Closed SoTA

An initial evaluation of Moonshot AI's Kimi 2.6 Thinking model finds it generates extensive reasoning traces but delivers only 'okay-ish' results on creative and coding tasks, highlighting the persistent open vs. closed model gap.

Apr 21, 2026 | 100% relevant

Moonshot AI's Kimi K2.6 Hits 58.6% on SWE-Bench Pro, Leads Open-Source Coding

Moonshot AI released Kimi K2.6, an open-source coding model achieving 58.6% on SWE-Bench Pro and 54.0% on HLE with tools. This positions it as a top-tier open alternative to proprietary models like Claude 3.5 Sonnet.

Apr 20, 2026 | 100% relevant

Stealth 100B Model Appears on OpenRouter, Possibly DeepSeek or Kimi

A new, unannounced 100-billion-parameter AI model has appeared on the OpenRouter API platform. Its origin is unknown, but observers speculate it could be a variant from DeepSeek or an update to Kimi's code model.

Apr 13, 2026 | 85% relevant

Kimi 2.6 Code Model Teased in Leaked Image, Suggesting Moonshot AI Update

A screenshot circulating online appears to show a 'Kimi 2.6' code model interface, suggesting Moonshot AI is preparing an update to its Kimi Chat platform focused on coding tasks.

Apr 13, 2026 | 85% relevant

Alibaba's Qwen3.6-Plus Reportedly Under Half the Size of Kimi K2.5, Nears Claude Opus 4.5 Performance

Alibaba's Tongyi Lab announced Qwen3.6-Plus, a model reportedly under half the size of Moonshot's Kimi K2.5 while approaching Claude Opus 4.5 performance, signaling major efficiency gains in China's LLM race.

Apr 4, 2026 | 95% relevant

Fireworks AI Launches 'Fire Pass' with Kimi K2.5 Turbo at 250 Tokens/Second

Fireworks AI has launched a new 'Fire Pass' subscription offering access to Kimi K2.5 Turbo at speeds up to 250 tokens/second. The service includes a free trial followed by a $7 weekly subscription.

Mar 27, 2026 | 85% relevant

Moonshot AI Launches Kimi Slides: AI Tool Converts Notes into Investor-Ready Presentations

Moonshot AI has launched Kimi Slides, an AI-powered presentation generator that converts unstructured notes into investor-ready slide decks. The tool is positioned as a direct competitor to high-cost freelance presentation designers.

Mar 24, 2026 | 85% relevant

Kimi Launches 'Kimi Slides' AI Presentation Tool, Claims 5-Minute Investor Deck Creation

Moonshot AI's Kimi chatbot has launched a new feature called Kimi Slides that generates investor-ready presentations from messy notes in 5 minutes, positioning itself against professional design services.

Mar 24, 2026 | 85% relevant

Kimi 2.5's 1T Parameter MoE Model Runs on 96GB Mac Hardware via SSD Streaming

Developers have demonstrated that Kimi 2.5's 1 trillion parameter Mixture-of-Experts model can run on Mac hardware with just 96GB RAM by streaming expert weights from SSD, with only 32B parameters active per token.

Mar 24, 2026 | 85% relevant
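
The general idea of streaming expert weights from SSD can be sketched as a small least-recently-used cache: only a few experts stay in RAM, and the rest are read from disk when the router selects them. This is an illustrative sketch and not the developers' implementation; `ExpertCache`, `fake_loader`, and the routing sequence are hypothetical names and data introduced here.

```python
from collections import OrderedDict
from typing import Callable

class ExpertCache:
    """Keeps at most `capacity` experts in RAM and loads the rest on demand."""

    def __init__(self, capacity: int, load_from_disk: Callable[[int], bytes]):
        self.capacity = capacity
        self.load_from_disk = load_from_disk  # hypothetical loader, e.g. an SSD read
        self._cache: "OrderedDict[int, bytes]" = OrderedDict()
        self.disk_reads = 0

    def get(self, expert_id: int) -> bytes:
        if expert_id in self._cache:
            self._cache.move_to_end(expert_id)  # mark as most recently used
            return self._cache[expert_id]
        weights = self.load_from_disk(expert_id)
        self.disk_reads += 1
        self._cache[expert_id] = weights
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict the least recently used expert
        return weights

def fake_loader(expert_id: int) -> bytes:
    """Stand-in for reading one expert's weights from disk."""
    return bytes([expert_id])

cache = ExpertCache(capacity=2, load_from_disk=fake_loader)
routed = [0, 1, 0, 2, 1]  # hypothetical expert ids chosen by the router
for expert_id in routed:
    cache.get(expert_id)
print(cache.disk_reads)  # 4 of the 5 lookups had to read from disk
```

With only 32B of 1T parameters active per token, most experts are idle at any moment, which is what makes a cache of this kind plausible. Real throughput would depend on SSD bandwidth and on how often the router revisits the same experts.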

Step-3.5-Flash: 196B Open-Source MoE Model Activates Only 11B Parameters, Outperforms Kimi K2.5 and Claude Opus 4.5 on Key Benchmarks

Shanghai-based StepFun's Step-3.5-Flash, a 196B parameter sparse mixture-of-experts model that activates only 11B parameters per token, achieves top scores on AIME 2025 (97.3) and LiveCodeBench-V6 (86.4) while costing 18.9x less to run than Kimi K2.5.

Mar 24, 2026 | 95% relevant

Moonshot AI's Kimi Introduces Attention Residuals to Mitigate Deep-Layer Information Loss in LLMs

Moonshot AI's Kimi team proposes Attention Residuals, a novel mechanism replacing standard residual connections. It allows each layer to attend to and selectively retrieve information from any previous layer, improving performance on long-context reasoning tasks.

Mar 16, 2026 | 89% relevant

Kimi's Selective Layer Communication Improves Training Efficiency by ~25% with Minimal Inference Overhead

Kimi has developed a method that replaces uniform residual connections with selective information routing between layers in deep AI models. This improves training stability and achieves ~25% better compute efficiency with negligible inference slowdown.

Mar 16, 2026 | 87% relevant

NVIDIA's Kimi-K2.5 Eagle Head: Supercharging Moonshot's Reasoning with Speculative Decoding

NVIDIA has released the Kimi-K2.5 Eagle head on Hugging Face, implementing Eagle-3 speculative decoding to accelerate inference for Moonshot's reasoning models. The release aims to speed up generation while maintaining accuracy.

Mar 12, 2026 | 89% relevant

Cursor AI Meets Kimi K2.5: The Rapid Prototyping Revolution in Software Development

The integration of Cursor AI's code editor with Kimi's K2.5 model enables developers to transform simple prompts into functional applications in under a minute, dramatically accelerating the prototyping phase and lowering barriers to software creation.

Mar 6, 2026 | 85% relevant

Kimi's Meteoric Rise: How Moonshot AI's Chatbot Became China's Fastest $10B Unicorn

Moonshot AI's Kimi chatbot generated more revenue in just 20 days than in all of 2025, achieving a $10 billion valuation in just over two years. This explosive growth signals a major shift in China's AI landscape and global AI competition.

Feb 24, 2026 | 75% relevant

Kimi Launches OpenClaw-Powered Workspace: China's Browser-Based AI Revolution

Kimi has unveiled Kimi Claw, a browser-based AI workspace featuring 24/7 operation, 5,000+ community skills, 40GB cloud storage, and native OpenClaw integration. This development represents China's growing influence in accessible, cloud-native AI tools.

Feb 15, 2026 | 85% relevant

Kimi K3 Tops US Models in Front-End Coding at Smaller Scale

Moonshot AI's K3 tops US models in front-end coding at 89.2% on SWE-bench while being smaller and cheaper to train.

Jul 17, 2026 | 100% relevant

Kimi Team's 'Attention Residuals' Replace Fixed Summation with Softmax Attention, Boosts GPQA-Diamond by +7.5%

Researchers propose Attention Residuals, a content-dependent alternative to standard residual connections in Transformers. The method improves scaling laws, matches a baseline trained with 1.25x more compute, and adds under 2% inference overhead.

Mar 16, 2026 | 97% relevant
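
To illustrate the contrast the headline draws between fixed summation and softmax attention, here is a minimal sketch in plain Python. It is not the paper's formulation: using a single per-layer `query` vector and scoring it directly against earlier layer outputs are assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]

def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def standard_residual(prev_outputs: List[Vector]) -> Vector:
    """Fixed summation: every earlier output contributes with weight 1."""
    dim = len(prev_outputs[0])
    return [sum(out[d] for out in prev_outputs) for d in range(dim)]

def attention_residual(prev_outputs: List[Vector], query: Vector) -> Vector:
    """Content-dependent mix: softmax weights over earlier outputs (assumed scoring)."""
    weights = softmax([dot(query, out) for out in prev_outputs])
    dim = len(prev_outputs[0])
    return [sum(w * out[d] for w, out in zip(weights, prev_outputs))
            for d in range(dim)]

outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy outputs of three earlier layers
summed = standard_residual(outputs)                    # all layers weighted equally
uniform = attention_residual(outputs, [0.0, 0.0])      # zero query gives equal weights
focused = attention_residual(outputs, [10.0, -10.0])   # favors the first layer's output
```

In this toy setup, a query aligned with one earlier output pulls the mix almost entirely toward that layer, whereas fixed summation cannot down-weight any layer.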

Moonshot AI Pauses K3 Subscriptions as Demand Exceeds GPU Capacity

Moonshot AI paused Kimi K3 subscriptions due to GPU capacity limits. The open-weight release by July 27 aims to offload compute demand.

Jul 20, 2026 | 100% relevant

Moonshot AI, State Bank Launch First AI-Native Credit Card in China

Moonshot AI's Kimi launches world's first AI-native credit card with state-owned bank, converting spending into compute credits.

Jun 13, 2026 | 90% relevant

mlx-vlm v0.5.0 Adds Continuous Batching, Distributed Inference for Apple Silicon

mlx-vlm v0.5.0 adds continuous batching, speculative decoding, and distributed inference for Apple Silicon. The release, which had 21 contributors, supports Qwen3.5, Kimi K2.5, Gemma 4 video, and other new models.

May 6, 2026 | 87% relevant

Free-Claude-Code Proxy Routes Anthropic API to Free NVIDIA NIM Models

A developer released free-claude-code, a proxy that intercepts Claude Code's API calls and routes them to free NVIDIA NIM endpoints, unlocking free access to models like Kimi K2 and GLM 4.7. This bypasses Anthropic's subscription fees and adds remote execution via a Telegram bot.

Apr 22, 2026 | 91% relevant
