Kubernetes

30 articles about Kubernetes in AI news

Kelos: The Kubernetes Framework That's Turning AI Coding Agents Into Self-Developing Systems

Kelos introduces a Kubernetes-native framework for orchestrating autonomous AI coding agents through declarative YAML workflows. The approach moves AI-assisted development from manual, one-off interactions to continuous, automated pipelines in which agents keep improving the projects they work on.

Mar 2, 2026 · 75% relevant

IREN Acquires Mirantis for $625M to Own AI Data Center Stack

IREN acquired Mirantis for $625M to add Kubernetes and OpenStack expertise, aiming to control the full AI infrastructure stack and compete with cloud providers.

May 5, 2026 · 90% relevant

xAI Launches Grok Plugin Marketplace to Counter Claude Code's Ecosystem

xAI launched Grok Build Plugin Marketplace with 6 plugins, directly competing with Claude Code's 224,691-star open-source ecosystem. The move mirrors xAI's strategy of absorbing community momentum.

Jun 13, 2026 · 88% relevant

Anthropic Acquires Stainless for ~$300M, Owns MCP Toolchain

Anthropic acquired Stainless for ~$300M, gaining the dominant MCP server generator and key SDK tooling, signaling a bet on integration-layer moats over model differentiation.

May 18, 2026 · 100% relevant

Meshwatch GNN Stack Ships Fraud Detection with 17.2% Lift over XGBoost

Meshwatch GNN fraud stack achieves 17.2% recall lift over XGBoost at sub-50ms latency, shipping a custom GraphSAGE variant with online neighbor sampling.

May 13, 2026 · 92% relevant

OpenAI Open-Sources Datacenter Networking Tech

OpenAI open-sourced its datacenter networking tech (Tectonic filesystem, custom stack) to challenge Google Cloud's proprietary AI infrastructure and set an open standard.

May 7, 2026 · 92% relevant

How a Custom Multimodal Transformer Beat a Fine-Tuned LLM for Attribute

LeBonCoin's ML team built a custom late-fusion transformer that uses pre-computed visual embeddings and character n-gram text vectors to predict ad attributes. It outperformed a fine-tuned VLM while running on CPU with sub-200ms latency, offering calibrated probabilities and 15-minute retraining cycles.

Apr 29, 2026 · 100% relevant

Cua Driver Open-Sourced: macOS Agent Control for Any App

Cua released Cua Driver as open-source, allowing agents like Claude Code and Codex to drive any macOS app through visual understanding and direct UI interaction.

Apr 23, 2026 · 85% relevant

Stateless Memory for Enterprise AI Agents: Scaling Without State

The paper replaces stateful agent memory with immutable decision logs using event-sourcing, allowing thousands of concurrent agent instances to scale horizontally without state bottlenecks.

Apr 23, 2026 · 85% relevant
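The event-sourcing idea in the entry above can be illustrated with a minimal sketch. This is not the paper's implementation: `DecisionEvent`, `DecisionLog`, `replay`, and `handle_request` are hypothetical names, and the in-memory list stands in for whatever shared durable store a real deployment would use. The point it shows is that a worker holds no state between requests; it appends an immutable event and rebuilds what it needs by replaying the log.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DecisionEvent:
    """One immutable entry in an agent's decision log (hypothetical schema)."""
    agent_id: str
    seq: int
    kind: str
    payload: str


class DecisionLog:
    """Append-only log; an in-memory stand-in for a shared durable store."""

    def __init__(self) -> None:
        self._events: list[DecisionEvent] = []

    def append(self, agent_id: str, kind: str, payload: str) -> DecisionEvent:
        # Sequence numbers are per agent and start at 0.
        seq = sum(1 for e in self._events if e.agent_id == agent_id)
        event = DecisionEvent(agent_id, seq, kind, payload)
        self._events.append(event)
        return event

    def events_for(self, agent_id: str) -> tuple[DecisionEvent, ...]:
        return tuple(e for e in self._events if e.agent_id == agent_id)


def replay(events: Iterable[DecisionEvent]) -> dict[str, str]:
    """Rebuild an agent's working state by folding over its events in order."""
    state: dict[str, str] = {}
    for event in events:
        state[event.kind] = event.payload  # the latest event of each kind wins
    return state


def handle_request(log: DecisionLog, agent_id: str, kind: str, payload: str) -> dict[str, str]:
    """A stateless worker: record the decision, then derive state from the log."""
    log.append(agent_id, kind, payload)
    return replay(log.events_for(agent_id))
```

Because `handle_request` reads everything it needs from the log, any worker instance could serve any agent's next request, which is the property that makes horizontal scaling straightforward in this style of design.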

Turn Claude Code Into an AI SRE

Five proven outer-loop workflows for using Claude Code as an AI SRE: incident triage, runbook execution, postmortem drafting, SLO investigation, and on-call handoffs. The bottleneck isn't the model — it's the MCP runtime.

Apr 22, 2026 · 100% relevant

A Practical Framework for Moving Enterprise RAG from POC to Production

The article presents a detailed, production-ready framework for building an enterprise RAG system, covering architecture, security, and deployment. It provides a concrete path for companies to move beyond experimental prototypes.

Apr 22, 2026 · 72% relevant

Google Cloud Next '26: 8th-gen TPUs, agent platform, $750M fund

At Cloud Next 2026, Google unveiled two 8th-gen TPU chips, a Gemini-based enterprise AI agent platform, and a $750 million partner fund to drive secure, large-scale automation and heavy AI workloads.

Apr 22, 2026 · 88% relevant

NVIDIA, Google Cloud Expand AI Partnership for Agentic & Physical AI

NVIDIA and Google Cloud announced an expanded partnership to advance agentic and physical AI, focusing on new infrastructure and software integrations. This builds on their existing collaboration to provide optimized AI training and inference platforms.

Apr 22, 2026 · 100% relevant

Onyx: Open-Source AI Enterprise Search Challenges Glean's $7.2B Valuation

Open-source platform Onyx provides self-hosted AI enterprise search that connects to 40+ tools, offering a free alternative to Glean's $50/user/month SaaS. Backed by YC and a $10M seed round, it is used by Netflix and Ramp.

Apr 22, 2026 · 85% relevant

MCP's 'By Design' Security Flaw

The Model Context Protocol's power comes with risk: servers you install can run code on your system. Learn how to audit and manage MCP server permissions.

Apr 21, 2026 · 100% relevant

Google, Marvell in Talks to Co-Develop New AI Chips, Including TPU-Optimized MPU

Google is reportedly in talks with Marvell Technology to co-develop two new AI chips: a memory processing unit (MPU) to pair with TPUs and a new, optimized TPU. This move is a direct effort to bolster Google's custom silicon stack and compete with Nvidia's dominance.

Apr 20, 2026 · 95% relevant

Stop Bloating Your CLAUDE.md: A 6-Layer Memory Architecture That Actually Works

Implement path-scoped rules and a wiki layer before reaching for complex RAG—this architecture saves tokens and prevents ignored instructions.

Apr 17, 2026 · 89% relevant

How to Manage Multiple Claude Code Sessions with Harness and Preview

Two actionable tools to solve the core productivity bottlenecks when running multiple Claude Code agents: session management and review speed.

Apr 14, 2026 · 100% relevant

Pioneer Agent: A Closed-Loop System for Automating Small Language Model

Researchers present Pioneer Agent, a system that automates the adaptation of small language models to specific tasks. It handles data curation, failure diagnosis, and iterative training, showing significant performance gains in benchmarks and production-style deployments. This addresses a major engineering bottleneck for deploying efficient, specialized AI.

Apr 14, 2026 · 74% relevant

VMLOps Publishes 2026 AI Engineer Roadmap for Software Engineers

VMLOps published a comprehensive 2026 roadmap detailing the skills and knowledge software engineers need to transition into AI engineering. The guide reflects the current industry demand for engineers who can build and deploy production AI systems.

Apr 12, 2026 · 85% relevant

HeyGen Launches Avatar Engine, Open-Source Renderer & 175-Language Dubbing

HeyGen's major 2026 update includes a new avatar engine, an open-source video renderer, and 175-language dubbing capabilities, expanding its AI video generation platform for enterprise and creator use.

Apr 8, 2026 · 97% relevant

Jack Dorsey's Block Launches Free, Open-Source AI Coding Agent Goose

Jack Dorsey's Block has released Goose, a free and open-source AI agent for code execution and testing. It works with any LLM and supports MCP servers, offering a CLI and desktop app.

Apr 8, 2026 · 87% relevant

Production RAG: From Anti-Patterns to Platform Engineering

The article details common RAG anti-patterns like vector-only retrieval and hardcoded prompts, then presents a five-pillar framework for production-grade systems, emphasizing governance, hardened microservices, intelligent retrieval, and continuous evaluation.

Apr 6, 2026 · 90% relevant

Claude Mobile's Embedded Tools Are a Blueprint for Claude Code's Future

The new embedded Figma/Canva tools in Claude Mobile, powered by MCP, show where Claude Code is headed: from passive retrieval to active, in-context operation.

Mar 31, 2026 · 83% relevant

Requestly Launches Git-Synced API Client to Replace Scattered Postman Setups

Requestly has launched an AI-powered API client that automatically syncs team collections through Git, eliminating stale docs and configuration drift. The tool directly targets the collaboration pain points of Postman and Insomnia users.

Mar 28, 2026 · 85% relevant

Secure Your MCP Servers: ClawGuard Scans for Tool Poisoning and Rug Pulls

New security tool ClawGuard scans MCP servers for hidden instructions in tool descriptions, parameter exploits, and malicious updates—critical for Claude Code users connecting to external tools.

Mar 28, 2026 · 91% relevant

OpenReward Launches: A Minimalist Service for Scaling RL Environment Serving

OpenReward, a new product from Ross Taylor, launches as a focused service for serving reinforcement learning environments at scale. It aims to solve infrastructure bottlenecks for RL training pipelines.

Mar 24, 2026 · 85% relevant

Alibaba's XuanTie C950 CPU Hits 70+ SPECint2006, Claims RISC-V Record with Native LLM Support

Alibaba's DAMO Academy launched the XuanTie C950, a RISC-V CPU scoring over 70 on SPECint2006—the highest single-core performance for the architecture—with native support for billion-parameter LLMs like Qwen3 and DeepSeek V3.

Mar 24, 2026 · 95% relevant

Alibaba Open-Sources OpenSandbox: A gVisor/Firecracker-Based Execution Environment for AI Agent Security

Alibaba has open-sourced OpenSandbox, a general-purpose execution environment that isolates AI agents in secure runtimes like gVisor or Firecracker. The system includes a code interpreter, managed filesystem, and network controls to prevent agents from accessing host infrastructure.

Mar 17, 2026 · 97% relevant

How a GPU Memory Leak Nearly Cost an AI Team a Major Client During a Live Demo

A detailed post-mortem of a critical AI inference failure during a client demo reveals how silent GPU memory leaks, inadequate health checks, and missing circuit breakers can bring down a production pipeline. The author shares the architectural fixes implemented to prevent recurrence.

Mar 17, 2026 · 95% relevant
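The post-mortem above names missing circuit breakers as one of the failure causes. As a rough illustration of that pattern only, and not the author's actual fix, here is a minimal sketch. The class names, the default thresholds, and the injectable `clock` parameter are all assumptions made for the example. After a configurable number of consecutive failures the breaker rejects calls immediately, then allows a single trial call once a cool-down has passed.

```python
from __future__ import annotations

import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    """Fail fast after repeated errors instead of hammering a sick backend."""

    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock            # injectable so tests need not sleep
        self._failures = 0
        self._opened_at: float | None = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure will re-open the circuit.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

In an inference service, the wrapped `fn` would typically be the call into the model backend, so that a leaking or unresponsive GPU worker produces quick, explicit errors rather than a stalled pipeline.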
