Distillery 0.4.0 Stabilizes Its MCP API

Distillery 0.4.0 stabilizes its MCP API surface, enabling reliable agent memory and team knowledge bases for Claude Code workflows.

Ala SMITH & AI Research Desk · Apr 20, 2026 · 4 min read · AI-Generated

Source: dev.to via devto_claudecode, hn_claude_code · Multi-Source

TL;DR

Distillery's 0.4.0 release freezes the MCP tool surface as a public contract, making it safe to build agents and dashboards on top of your team's Claude Code knowledge base.

Distillery 0.4.0 Stabilizes Its MCP API: Your Team's Claude Code Memory Just Got Production-Ready

What Changed — The API Hardening Release

Distillery 0.4.0 shipped on April 19, 2026, with one clear mission: turn the MCP tool surface into a public contract. This isn't about new features—it's about stability. The release hardens tool names, parameter shapes, error codes, and response envelopes. Breaking changes now require a major version bump, with deprecation warnings first.

This follows Distillery's initial launch as a solution to Claude Code's "evaporating knowledge" problem. While Claude Managed Agents launched with memory labeled "research preview," the community ecosystem (including Distillery, memsearch, Honcho, Hippo, and others) rushed to fill the gap. Teams need production-ready memory layers, not preview features.

What It Means For Your Claude Code Workflow

Stable Foundations for Agent Development

If you're building agents with Claude Code, the memory layer sits under everything. Planners read from it, tools write to it, and evaluations depend on it being deterministic. Before 0.4.0, drifting tool names or changing response shapes could break downstream agents. Now you can pin against min_server_version=0.4.0 with confidence.
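As a rough illustration of what pinning means in practice, a client can refuse to start against a server older than the pinned minimum. The sketch below is a minimal version check in Python. Only the name min_server_version and the value 0.4.0 come from the release; the helper functions are hypothetical, and the parser assumes plain MAJOR.MINOR.PATCH strings with no pre-release suffixes.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def meets_min_version(server_version: str, min_server_version: str = "0.4.0") -> bool:
    """Return True when the server is at or above the pinned minimum.

    Hypothetical helper: compares numeric tuples, so '0.10.1' is
    correctly treated as newer than '0.4.0' (a string compare would not).
    """
    return parse_version(server_version) >= parse_version(min_server_version)


if __name__ == "__main__":
    for reported in ("0.3.9", "0.4.0", "0.10.1"):
        print(reported, meets_min_version(reported))
```

Comparing integer tuples rather than raw strings matters once minor versions reach two digits.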

Practical Improvements That Matter Today

distillery_list defaults to output_mode="summary": A typical limit=50 GitHub sync response shrinks from ~300KB to a few kilobytes of titles, tags, and previews. This directly impacts your context window usage.
Async feed syncing: gh-sync now runs via server-side background jobs. Long syncs don't block your Claude Code session—poll distillery_sync_status for progress instead.
Better scheduling: /setup and /watch now configure Claude Code routines instead of CronCreate jobs or GitHub Actions. Three routines ship: hourly feed poll, daily stale check, weekly maintenance.
Cleaner interest profiles: Feed entries are excluded from the interest profile, so /radar won't drift toward whatever feed happens to be loudest that week.
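To see why a summary default shrinks responses so sharply, here is a minimal Python sketch of the idea: keep titles, tags, and a short preview, and return full content only on request. The function names, the entry fields, and the 80-character preview length are assumptions for illustration, not Distillery's actual implementation.

```python
import json

PREVIEW_CHARS = 80  # assumed preview length, not a documented value


def to_summary(entry: dict) -> dict:
    """Keep only the title, tags, and a short preview of the content."""
    return {
        "title": entry["title"],
        "tags": entry["tags"],
        "preview": entry["content"][:PREVIEW_CHARS],
    }


def list_entries(entries: list[dict], output_mode: str = "summary") -> list[dict]:
    """Return full entries only when output_mode='full' is requested."""
    if output_mode == "full":
        return entries
    if output_mode == "summary":
        return [to_summary(entry) for entry in entries]
    raise ValueError(f"unknown output_mode: {output_mode!r}")


if __name__ == "__main__":
    # Synthetic entries, used only to compare serialized sizes.
    sample = [
        {"title": f"Issue {i}", "tags": ["github"], "content": "x" * 6000}
        for i in range(50)
    ]
    full_size = len(json.dumps(list_entries(sample, output_mode="full")))
    summary_size = len(json.dumps(list_entries(sample)))
    print(f"full: {full_size} bytes, summary: {summary_size} bytes")
```

The sizes printed depend entirely on the synthetic data; the point is that the summary payload no longer grows with the length of each entry's content.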

Try It Now — Commands That Work Differently

Update Your Installation

# Update to the stable release
npm update -g @distillery/cli

# Or install fresh
npm install -g @distillery/cli

Configure Claude Code Routines

After updating, run:

claude code /setup

This configures three Claude Code routines instead of external cron jobs:

Hourly feed polling
Daily stale entry checking
Weekly maintenance

The deprecated webhook endpoints (/hooks/poll, /hooks/rescore, /hooks/classify-batch) will log warnings if hit.
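Before relying on those warnings, it may be worth checking whether any of your own cron files or CI configs still call the deprecated paths. The Python sketch below scans a block of text for the three endpoints named above. The function name and the sample crontab (including its hostname) are hypothetical.

```python
DEPRECATED_HOOKS = ("/hooks/poll", "/hooks/rescore", "/hooks/classify-batch")


def find_deprecated_hooks(text: str) -> list[tuple[int, str]]:
    """Return (line_number, endpoint) pairs for each deprecated path found."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for endpoint in DEPRECATED_HOOKS:
            if endpoint in line:
                hits.append((line_number, endpoint))
    return hits


if __name__ == "__main__":
    # Hypothetical crontab content, used only to exercise the function.
    crontab = (
        "0 * * * * curl -X POST https://distillery.example.com/hooks/poll\n"
        "0 3 * * * /usr/local/bin/backup.sh\n"
    )
    for line_number, endpoint in find_deprecated_hooks(crontab):
        print(f"line {line_number}: {endpoint}")
```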

Use the New Defaults

When querying your knowledge base:

# Before 0.4.0: this command returned full entry content (could flood the context window)
claude code /recall "authentication patterns"

# From 0.4.0: the identical command defaults to summary mode
claude code /recall "authentication patterns"
# Returns: titles, tags, previews instead of full content

# If you need full content, explicitly request it
claude code /recall "authentication patterns" output_mode="full"

Monitor Async Operations

For long-running sync operations:

# Start a sync
claude code /sync gh-sync

# Check status without blocking
claude code /recall sync_status
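For agents that drive the sync programmatically, the start-then-poll pattern can be sketched in Python as follows. The status call is passed in as a plain function so the loop stays independent of any client library. The "state" field and its "running" value are assumptions, since the article names the distillery_sync_status tool but does not show its response shape.

```python
import time
from typing import Callable


def wait_for_sync(
    get_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
) -> dict:
    """Poll get_status until the job leaves the 'running' state or times out.

    The 'state' key and 'running' value are assumed for illustration.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status.get("state") != "running":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("sync did not finish before the timeout")
        time.sleep(interval_seconds)


if __name__ == "__main__":
    # Stand-in for a real status call: reports 'running' twice, then 'done'.
    responses = iter([{"state": "running"}, {"state": "running"}, {"state": "done"}])
    print(wait_for_sync(lambda: next(responses), interval_seconds=0.01))
```

A timeout keeps a stuck background job from turning the non-blocking pattern back into an indefinite wait.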

The Bigger Picture: Karpathy's LLM Wiki Pattern

Distillery implements what Andrej Karpathy calls the "LLM Wiki" pattern: raw sources → synthesized wiki → query layer. This aligns with the broader trend in agent memory systems, where three-tier architectures (fast index, episodes, raw transcripts) are becoming standard.

The community has seen rapid iteration here—within a week of Karpathy's post, projects like Knowledge Raven, Memoriki, OpenTrace KG-MCP, and OptiVault emerged. Distillery takes a pragmatic approach: single DuckDB table with hybrid BM25 plus vector search, rather than complex L0/L1/L2 access tiers.
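The article does not say how Distillery combines its BM25 and vector results. One common way to merge two ranked lists is reciprocal rank fusion, sketched below in Python as a stand-in technique. The document IDs are invented, and k=60 is a conventional default for this method rather than a Distillery setting.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; each list contributes 1 / (k + rank) per ID."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Invented IDs standing in for keyword (BM25) and embedding results.
    bm25_ranking = ["auth-jwt", "auth-oauth", "rate-limits"]
    vector_ranking = ["auth-oauth", "session-cookies", "auth-jwt"]
    print(reciprocal_rank_fusion([bm25_ranking, vector_ranking]))
```

Because the fusion works on ranks rather than raw scores, it avoids having to normalize BM25 scores against cosine similarities.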

What makes Distillery operational is its focus on provenance, correction, and expiration. Every entry carries author and session ID. Entries can be corrected without losing history. They can be marked expired without deletion. These primitives let a shared knowledge base admit when it was wrong—essential for a living memory layer.
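These three primitives can be sketched with a small in-memory model. The Python below is a hypothetical illustration, not Distillery's schema: the class, field, and method names are invented, and the real system stores entries in DuckDB rather than a dict.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    entry_id: int
    content: str
    author: str          # provenance: who wrote it
    session_id: str      # provenance: which session produced it
    corrects: Optional[int] = None  # ID of the entry this one supersedes
    expired: bool = False


class KnowledgeBase:
    """Hypothetical append-only store: nothing is ever deleted."""

    def __init__(self) -> None:
        self._entries: dict[int, Entry] = {}
        self._next_id = 1

    def add(self, content: str, author: str, session_id: str,
            corrects: Optional[int] = None) -> Entry:
        entry = Entry(self._next_id, content, author, session_id, corrects)
        self._entries[entry.entry_id] = entry
        self._next_id += 1
        return entry

    def correct(self, entry_id: int, content: str, author: str,
                session_id: str) -> Entry:
        """Add a correcting entry; the original stays in the store."""
        if entry_id not in self._entries:
            raise KeyError(entry_id)
        return self.add(content, author, session_id, corrects=entry_id)

    def expire(self, entry_id: int) -> None:
        """Flag an entry as expired without deleting it."""
        self._entries[entry_id] = replace(self._entries[entry_id], expired=True)

    def current(self) -> list[Entry]:
        """Entries that are neither expired nor superseded by a correction."""
        superseded = {e.corrects for e in self._entries.values()
                      if e.corrects is not None}
        return [e for e in self._entries.values()
                if not e.expired and e.entry_id not in superseded]

    def history(self, entry_id: int) -> list[Entry]:
        """Walk the correction chain from entry_id back to the original."""
        chain = []
        entry = self._entries.get(entry_id)
        while entry is not None:
            chain.append(entry)
            entry = (self._entries.get(entry.corrects)
                     if entry.corrects is not None else None)
        return chain


if __name__ == "__main__":
    kb = KnowledgeBase()
    first = kb.add("Tokens expire after 24h", "dana", "session-1")
    fixed = kb.correct(first.entry_id, "Tokens expire after 1h", "lee", "session-2")
    print([e.content for e in kb.current()])
    print([e.author for e in kb.history(fixed.entry_id)])
```

Modeling a correction as a new entry that points at the old one is what preserves history: queries see only the latest version, while the chain remains available for audit.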

What's Next: Dashboards and Integrations

With the MCP surface stabilized, the dashboard/ directory in the repo is becoming a SvelteKit dashboard. Community plugins can now build on Distillery with the same confidence as any public SDK. LangChain orchestrators, Letta-style frameworks, and MCP-native runtimes can treat Distillery as a durable backend.

The team is tracking LongMemEval benchmarks and transcript-mining integration in issue #233—this is how they'll measure against three-tier systems like Vektori and MemPalace.

For now, the message is clear: if you're using Claude Code with a team, your knowledge base just graduated from experimental to production-ready.

Source: gentic.news · Apr 20, 2026 · Author: Ala SMITH

AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

**Immediate Action:** Update to Distillery 0.4.0 if you're using it for team knowledge management. The API stability means you can now build reliable agents on top of it without fear of breaking changes.

**Workflow Change:** Stop using external cron jobs for Distillery scheduling. Run `claude code /setup` once to configure Claude Code routines instead. This keeps everything within your Claude Code environment.

**Context Window Optimization:** Take advantage of the new `output_mode="summary"` default. When querying with `/recall`, you'll get compact results by default. Only use `output_mode="full"` when you specifically need full content. This can dramatically reduce token usage in agent workflows.

**Async Pattern:** For any feed syncing (GitHub, RSS, etc.), use the new async pattern. Start the sync, then poll `distillery_sync_status` instead of waiting. This prevents your Claude Code session from blocking on long operations.

**Interest Profile Cleanup:** If you use `/radar` for discovery, your results will now be cleaner: feed noise won't dominate your interest profile. This makes the feature more useful for finding genuinely relevant knowledge.

#release-update #team-workflows #agent-development #mcp-servers
