pipeline parallelism

9 articles about pipeline parallelism in AI news

RoundPipe: Full Fine-Tune 32B Models on a Single 24GB GPU

RoundPipe fine-tunes 32B models on a single 24GB GPU with 1.5-2.2× speedups via round-robin pipeline dispatch.

May 3, 2026 | 85% relevant

Buffett Invests in Google After SemiAnalysis TPU Deep Dive

Berkshire Hathaway invested in Google in Q3 2025, after Buffett studied TPU v5p architecture. He compared it to railroads, citing 8,960 chips and 4.8 Tbps links.

May 19, 2026 | 85% relevant

New arXiv Paper Proposes LLM-Generated 'Reference Documents' to Speed Up

A new arXiv preprint introduces a method for efficient LLM-based reranking. It uses LLMs to generate 'reference documents' that help dynamically truncate long ranked lists and optimize batch processing, achieving up to 66% speedup on TREC benchmarks.

Apr 13, 2026 | 78% relevant

Inside Claude Code’s Leaked Source: A 512,000-Line Blueprint for AI Agent Engineering

A misconfigured npm publish exposed ~512,000 lines of Claude Code's TypeScript source, detailing a production-ready AI agent system with background operation, long-horizon planning, and multi-agent orchestration. This leak provides an unprecedented look at how a leading AI company engineers complex agentic systems at scale.

Apr 3, 2026 | 86% relevant

ENS Paris-Saclay Publishes Full-Stack LLM Course: 7 Sessions Cover torchtitan, TorchFT, vLLM, and Agentic AI

Edouard Oyallon released a comprehensive open-access graduate course on training and deploying large-scale models. It bridges theory and production engineering using Meta's torchtitan and torchft, GitHub-hosted labs, and covers the full stack from distributed training to agentic AI.

Mar 27, 2026 | 65% relevant

Sam Altman Predicts Next 'Transformer-Level' Architecture Breakthrough, Says AI Models Are Now Smart Enough to Help Find It

OpenAI CEO Sam Altman said he believes a new AI architecture, offering gains as significant as those of transformers over LSTMs, has yet to be discovered. He argues that current advanced models are now capable enough to assist in that foundational research.

Mar 26, 2026 | 87% relevant

Claude's Subagents vs. Agent Teams: A Practical Framework for Multi-Agent System Design

Anthropic's Claude offers two distinct multi-agent models: isolated subagents for parallel tasks and communicating agent teams for complex workflows. The key design principle is to split work by context, not role, and to default to a single agent until complexity is proven necessary.

Mar 16, 2026 | 87% relevant

Wikigen: Automate GitHub Wiki Generation with a Single CLI Command

Wikigen is a Go CLI that uses Claude Code to analyze your repo and generate comprehensive GitHub Wiki documentation automatically.

Mar 15, 2026 | 95% relevant

Ring All-Reduce: The Hidden Dance Powering Modern AI Training

A new visualization reveals the intricate communication patterns behind distributed AI training. The ring all-reduce algorithm enables efficient gradient synchronization across multiple GPUs, accelerating model development while minimizing bottlenecks.

Feb 25, 2026 | 85% relevant
