unsloth
13 articles about unsloth in AI news
Unsloth × NVIDIA Cut LLM Fine-Tuning ~25% — Three Glue-Code Wins on Blackwell
Daniel & Michael Han at Unsloth, in collaboration with NVIDIA, published a joint guide quantifying three glue-code optimizations that combine for ~25% faster LLM training on B200 Blackwell hardware. The wins target overhead around the main kernels — caching packed-sequence metadata, double-buffered gradient checkpoint reloads, and a cheaper GPT-OSS MoE router using argsort + bincount. All three are merged via public PRs.
Unsloth Offers Free Fine-Tuning for Google Gemma 4 via Colab Notebook
Unsloth has released a Colab notebook enabling free fine-tuning of Google's Gemma 4 model. This simplifies the process of customizing a state-of-the-art open-weight LLM using just a browser.
Fine-Tune Phi-3 Mini with Unsloth: A Practical Guide for Product Information Extraction
A technical tutorial demonstrates how to fine-tune Microsoft's compact Phi-3 Mini model using the Unsloth library for structured information extraction from product descriptions, all within a free Google Colab notebook.
NVIDIA and Unsloth Release Comprehensive Guide to Building RL Environments from Scratch
NVIDIA and Unsloth have published a detailed practical guide on constructing reinforcement learning environments from the ground up. The guide addresses critical gaps often overlooked in tutorials, covering environment design, when RL outperforms supervised fine-tuning, and best practices for verifiable rewards.
vLLM Optimizations Cut Voice AI Latency by 40% on 6-GPU Cluster
vLLM optimizations on a 6-GPU cluster reduced voice AI latency by 40% for a Qwen-based system, enabling 500 concurrent sessions per node without hardware upgrades.
The Developer's Guide to Finetuning LLMs
A developer-focused article outlines decision frameworks for LLM finetuning—covering when it's worth the cost, how to approach it, and key trade-offs. For retail leaders, this is a practical primer on customizing models for brand-specific tasks.
AI Fine-Tuning: Why the Technique Matters More Than Which Model You Pick
Sanket Parmar argues that fine-tuning shapes model behaviour for your domain more than base model selection. The article emphasizes that investing in adaptation yields better returns than chasing the latest foundation model.
Qwen3.6-27B: How to Run a 17GB Local Model That Beats 397B MoE on Coding Tasks
Qwen3.6-27B delivers flagship-level coding performance in a 55.6GB model that can be quantized to 16.8GB, making high-quality local coding assistance accessible.
Hugging Face OCRs 27,000 arXiv Papers to Markdown with Open 5B Model
Hugging Face CEO Clement Delangue announced the OCR conversion of 27,000 arXiv papers to Markdown using an open 5B-parameter model and 16 parallel jobs on L40S GPUs. This demonstrates a scalable, open-source pipeline for large-scale academic document processing.
Fine-Tuning LLMs While You Sleep: How Autoresearch and Red Hat Training Hub Outperformed the HINT3 Benchmark
Automated fine-tuning tools now let you run hundreds of training experiments overnight for under $50. Here's how Autoresearch and Red Hat's platform outperformed HINT3, and the tools you can use today.
Open-Source LLM Course Revolutionizes AI Education: Free GitHub Repository Challenges Paid Alternatives
A comprehensive GitHub repository called 'LLM Course' by Maxime Labonne provides complete, free training on large language models—from fundamentals to deployment—threatening the market for paid AI courses with its organized structure and practical notebooks.
Open-Source Hack Enables Free Claude Code Execution with Local LLMs
Developers have discovered a method to run Anthropic's Claude Code using local LLMs without API costs or data leaving their machines. By redirecting API calls through environment variables, users can leverage open-source models like Qwen3.5 for private, cost-free coding assistance.
Democratizing AI Development: Free LLM Training Comes to VS Code
A new integration allows developers to train large language models directly within Visual Studio Code using free Google Colab GPUs. This breakthrough lowers barriers to AI experimentation and fine-tuning for individual developers and small teams.