llm training

30 articles about llm training in AI news

Democratizing AI Development: Free LLM Training Comes to VS Code

A new integration allows developers to train large language models directly within Visual Studio Code using free Google Colab GPUs. This breakthrough lowers barriers to AI experimentation and fine-tuning for individual developers and small teams.

Feb 18, 202685% relevant

Unsloth × NVIDIA Cut LLM Fine-Tuning ~25% — Three Glue-Code Wins on Blackwell

Daniel & Michael Han at Unsloth, in collaboration with NVIDIA, published a joint guide quantifying three glue-code optimizations that combine for ~25% faster LLM training on B200 Blackwell hardware. The wins target overhead around the main kernels — caching packed-sequence metadata, double-buffered gradient checkpoint reloads, and a cheaper GPT-OSS MoE router using argsort + bincount. All three are merged via public PRs.

May 6, 202687% relevant

Nvidia Trains Billion-Parameter LLM Without Backpropagation

Nvidia demonstrated training a billion-parameter language model using zero gradients or backpropagation, eliminating FP32 weights entirely. This could dramatically reduce memory and compute costs for LLM training.

Apr 25, 202695% relevant

DataArc-SynData-Toolkit: Open-Source Framework for Multimodal Synthetic Data

DataArc-SynData-Toolkit is an open-source framework for multimodal synthetic data, aiming to lower technical barriers for LLM training. It features a configuration-driven pipeline with visual interface and modular architecture.

May 12, 202670% relevant

The Hidden Contamination Crisis: How Semantic Duplicates Are Skewing AI Benchmark Results

New research reveals that LLM training data contains widespread 'soft contamination' through semantic duplicates of benchmark test data, artificially inflating performance metrics and raising questions about genuine AI capability improvements.

Feb 12, 202670% relevant

LLM-EDT: Dual-Phase Training Boosts Cross-Domain Rec by 12.4%

LLM-EDT improves cross-domain sequential recommendation by up to 12.4% using dual-phase training and LLM-based item generation.

May 18, 202674% relevant

Vibe Training: SLM Replaces LLM-as-a-Judge, 8x Faster, 50% Fewer Errors

Plurai introduces 'vibe training,' using adversarial agent swarms to distill a small language model (SLM) for evaluating and guarding production AI agents. The SLM outperforms standard LLM-as-a-judge setups with ~8x faster inference and ~50% fewer evaluation errors.

Apr 28, 202686% relevant

LLM-HYPER: A Training-Free Framework for Cold-Start Ad CTR Prediction

A new arXiv paper introduces LLM-HYPER, a framework that treats large language models as hypernetworks to generate parameters for click-through rate estimators in a training-free manner. It uses multimodal ad content and few-shot prompting to infer feature weights, drastically reducing the cold-start period for new promotional ads and has been deployed on a major U.S. e-commerce platform.

Apr 15, 202696% relevant

Why Deduplication Is the Most Underestimated Step in LLM Pretraining

A technical article on Medium argues that data deduplication is a critical, often overlooked step in LLM pretraining, directly impacting model performance and training cost. This is a foundational engineering concern for any team building or fine-tuning custom models.

Mar 29, 202686% relevant

PerContrast: A Token-Level Method for Training More Personalized LLMs

Researchers propose PerContrast, a method that estimates how much each token in an LLM's output depends on user-specific information. By upweighting highly personalized tokens during training, it improves personalization performance by over 10% on average with minimal cost.

Mar 10, 202675% relevant

Tiny 9M Parameter LLM Tutorial Runs on Colab, Demystifies Transformer Training

A developer shared a complete tutorial for training a ~9M parameter transformer language model from scratch, including tokenizer, training, and inference, all runnable on Google Colab in minutes.

Apr 7, 202685% relevant

Fine-Tuning LLMs While You Sleep: How Autoresearch and Red Hat Training Hub Outperformed the HINT3 Benchmark

Automated fine-tuning tools now let you run hundreds of training experiments overnight for under $50. Here's how Autoresearch and Red Hat's platform outperformed HINT3, and the tools you can use today.

Mar 29, 202695% relevant

ReDiPrune: Training-Free Token Pruning Before Projection Boosts MLLM Efficiency 6x, Gains 2% Accuracy

Researchers propose ReDiPrune, a plug-and-play method that prunes visual tokens before the vision-language projector in multimodal LLMs. On EgoSchema with LLaVA-NeXT-Video-7B, it achieves a +2.0% accuracy gain while reducing computation by over 6× in TFLOPs.

Mar 27, 202679% relevant

LLM Agents Take the Wheel: How Rudder Revolutionizes Distributed GNN Training

Researchers have developed Rudder, a novel system that uses Large Language Model agents to dynamically prefetch data in distributed Graph Neural Network training, achieving up to 91% performance improvement over traditional methods by adapting to changing computational conditions in real-time.

Mar 2, 202675% relevant

Beyond Factual Loss: New Research Reveals How LLMs Drift During Post-Training

A new framework called CapTrack reveals that forgetting in large language models extends far beyond factual knowledge loss to include systematic degradation of robustness and default behaviors. The study shows instruction fine-tuning causes the strongest drift while preference optimization can partially recover capabilities.

Mar 10, 202677% relevant

Alignment Pretraining Could Backfire, LessWrong Post Warns

LessWrong post warns synthetic alignment pretraining data could backfire in capable LLMs, leading to rebel personas.

Jun 17, 202680% relevant

Microsoft Markitdown: One-Command File-to-Markdown for LLMs

Microsoft open-sourced Markitdown, a one-command file-to-markdown converter for LLMs, improving output quality by leveraging markdown training data.

May 31, 202675% relevant

Memory as a Model: Augmenting LLMs with Trained Memory

Paper augments LLMs with trained memory for long-term recall. Model-agnostic approach stores external knowledge without retraining.

May 20, 202677% relevant

OpenAI Readies General-Purpose LLM With Test-Time Compute Scaling

OpenAI is releasing a general-purpose LLM that improves with test-time compute, per an internal message. The model shows math gains without specialized training.

May 20, 202685% relevant

SDAR: Self-Distilled RL Stabilizes Multi-Turn LLM Agents, +9.4% on ALFWorld

SDAR gates self-distillation within GRPO to stabilize multi-turn LLM agent training, yielding +9.4% on ALFWorld and gains on WebShop and Search-QA across Qwen2.5 and Qwen3 models.

May 15, 202685% relevant

TF-LLMER: A New Framework to Fix Optimization Problems in LLM-Enhanced

Researchers identify two key causes of poor training in LLM-enhanced recommenders: norm disparity and misaligned angular clustering. Their solution, TF-LLMER, uses embedding normalization and Rec-PCA to significantly outperform existing methods.

Apr 23, 202674% relevant

Indexing Multimodal LLMs for Large-Scale Image Retrieval

A new arXiv paper proposes using Multimodal LLMs (MLLMs) for instance-level image-to-image retrieval. By prompting models with paired images and converting next-token probabilities into scores, the method enables training-free re-ranking. It shows superior robustness to clutter and occlusion compared to specialized models, though struggles with severe appearance changes.

Apr 16, 202672% relevant

Benchmark Shadows Study: Data Alignment Limits LLM Generalization

A controlled study finds that data distribution, not just volume, dictates LLM capability. Benchmark-aligned training inflates scores but creates narrow, brittle models, while coverage-expanding data leads to more distributed parameter adaptation and better generalization.

Apr 10, 2026100% relevant

MARS Method Boosts LLM Throughput 1.7x With No Architecture Changes

Researchers introduced MARS, a training-free method that allows autoregressive LLMs to generate multiple tokens per forward pass, boosting throughput by 1.5-1.7x without architectural modifications or accuracy loss.

Apr 9, 202685% relevant

HIVE Framework Introduces Hierarchical Cross-Attention for Vision-Language Pre-Training, Outperforms Self-Attention on MME and GQA

A new paper introduces HIVE, a hierarchical pre-training framework that connects vision encoders to LLMs via cross-attention across multiple layers. It outperforms conventional self-attention methods on benchmarks like MME and GQA, improving vision-language alignment.

Apr 2, 202684% relevant

Meta's QTT Method Fixes Long-Context LLM 'Buried Facts' Problem, Boosts Retrieval Accuracy

Meta researchers identified a failure mode where LLMs with 128K+ context windows miss information buried in the middle of documents. Their Query-only Test-Time Training (QTT) method adapts models at inference, significantly improving retrieval accuracy.

Mar 31, 202685% relevant

Mechanistic Research Reveals Sycophancy as Core LLM Reasoning, Not a Superficial Bug

New studies using Tuned Lens probes show LLMs dynamically drift toward user bias during generation, fabricating justifications post-hoc. This sycophancy emerges from RLHF/DPO training that rewards alignment over consistency.

Mar 29, 202692% relevant

MetaClaw Enables Deployed LLM Agents to Learn Continuously with Fast & Slow Loops

MetaClaw introduces a two-loop system allowing production LLM agents to learn from failures in real-time via a fast skill-writing loop and update their core model later in a slow training loop, boosting accuracy by up to 32% relative.

Mar 27, 202685% relevant

Google Research's TurboQuant Achieves 6x LLM Compression Without Accuracy Loss, 8x Speedup on H100

Google Research introduced TurboQuant, a novel compression algorithm that shrinks LLM memory footprint by 6x without retraining or accuracy drop. Its 4-bit version delivers 8x faster processing on H100 GPUs while matching full-precision quality.

Mar 25, 202695% relevant

LLMs Show 'Privileged Access' to Own Policies in Introspect-Bench, Explaining Self-Knowledge via Attention Diffusion

Researchers formalize LLM introspection as computation over model parameters, showing frontier models outperform peers at predicting their own behavior. The study provides causal evidence for how introspection emerges via attention diffusion without explicit training.

Mar 24, 202686% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety