PodcastBrain: A Technical Breakdown of a Multi-Agent AI System That Learns User Preferences

A developer built PodcastBrain, an open-source, locally run AI podcast generator in which two conversational agents, a host and an expert, debate any topic while a third agent summarizes the result. The system learns user preferences from episode ratings and adjusts future content, demonstrating a working feedback loop with multi-agent orchestration.

Ggentic.news Editorial·2h ago·4 min read·6 views·via towards_ai

A new open-source project, PodcastBrain, demonstrates a sophisticated approach to building multi-agent AI systems with persistent memory and user preference learning. Unlike many demos that perform a single impressive trick, this system orchestrates three distinct AI agents to generate dynamic podcast conversations that adapt based on user feedback. It's a concrete example of moving beyond stateless AI applications toward systems that improve with use.

The Architecture: Three Specialized Agents

The core innovation lies in its clear separation of concerns across three AI agents, orchestrated using the Agno framework and powered by Google Gemini 2.5 Flash.

  1. The Host (Alex): This agent is designed to mimic a real podcast host. Its system prompt includes a critical constraint: "BE BRIEF: 1–3 short sentences ONLY." This forces the LLM to make editorial choices, maintaining conversational rhythm instead of defaulting to comprehensive, paragraph-length responses. Before each episode, Alex reads a dynamically built preference context from a SQLite database, which includes the user's past liked topics, self-reported knowledge level, preferred depth, and tone.

  2. The Expert (Dr. Sam): This agent acts as the domain authority. It uses Agno's ReasoningTools (with think() and analyze() enabled) to perform internal chain-of-thought reasoning before generating a response, leading to more coherent and factual answers. Its brevity constraint is slightly looser (2–4 sentences) to allow for developing insights.

  3. The Summarizer: This agent has a single, precise job: convert the raw conversation transcript into a clean, structured JSON object using a Pydantic schema. This ensures reliable data storage for the learning system and episode history, decoupling data extraction from conversation generation.

A central podcast.py orchestrator manages the turn-by-turn conversation, handles API retries, and triggers the summarization and audio synthesis pipelines.
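
The turn-by-turn loop with retries can be sketched roughly as follows. This is a minimal illustration, not the project's actual podcast.py: the agents are modeled as plain callables, and the names `call_with_retry` and `run_episode` are hypothetical. The real orchestrator would invoke Agno agents and go on to trigger summarization and audio synthesis.

```python
import time
from typing import Callable, List, Tuple

# Assumption: an agent is any callable that maps a prompt to a reply.
Agent = Callable[[str], str]


def call_with_retry(agent: Agent, prompt: str, retries: int = 3,
                    base_delay: float = 0.0) -> str:
    """Call an agent, retrying with exponential backoff on any exception."""
    for attempt in range(retries):
        try:
            return agent(prompt)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("retries must be at least 1")


def run_episode(topic: str, host: Agent, expert: Agent,
                turns: int = 3) -> List[Tuple[str, str]]:
    """Alternate host and expert for a fixed number of turns."""
    transcript: List[Tuple[str, str]] = []
    last = f"Topic: {topic}"
    for _ in range(turns):
        question = call_with_retry(host, last)
        transcript.append(("Alex", question))
        answer = call_with_retry(expert, question)
        transcript.append(("Dr. Sam", answer))
        last = answer  # the host reacts to the expert's latest answer
    return transcript
```

Keeping the retry logic in one helper means a transient API failure on either agent does not abort the whole episode.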

The Learning System: From Static to Adaptive

Most AI applications are stateless, treating each interaction as independent. PodcastBrain's key differentiator is its SQLite-based preference learning system. When a user rates an episode (thumbs up/down), that rating, along with the topic and user-provided preferences (knowledge level, depth, tone), is stored.
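
A minimal sketch of how such ratings could be persisted with Python's built-in sqlite3 module is shown here. The table name, column names, and helper functions are assumptions made for illustration; the article does not describe the project's real schema.

```python
import sqlite3
from typing import List

# Hypothetical schema: one row per rated episode.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ratings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    topic TEXT NOT NULL,
    liked INTEGER NOT NULL CHECK (liked IN (0, 1)),
    knowledge_level TEXT NOT NULL,
    depth TEXT NOT NULL,
    tone TEXT NOT NULL
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database (in memory by default) and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_rating(conn: sqlite3.Connection, topic: str, liked: bool,
                knowledge_level: str, depth: str, tone: str) -> None:
    """Store a thumbs up/down together with the user's stated preferences."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO ratings (topic, liked, knowledge_level, depth, tone) "
            "VALUES (?, ?, ?, ?, ?)",
            (topic, int(liked), knowledge_level, depth, tone),
        )


def liked_topics(conn: sqlite3.Connection) -> List[str]:
    """Return the topics the user rated thumbs-up, oldest first."""
    rows = conn.execute(
        "SELECT topic FROM ratings WHERE liked = 1 ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]
```

Passing a file path instead of the default `":memory:"` would make the ratings survive between sessions, which is what lets the system learn over time.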

For subsequent episodes, the Host agent receives this consolidated preference context as part of its system prompt. This allows each new conversation to adapt to the listener: skipping basic definitions for an "advanced" user, diving deeper for a "deep-dive" preference, or adopting a more debate-oriented tone if the user has consistently enjoyed contentious topics.
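
One way the preference context might be assembled and appended to the Host's system prompt is sketched below. The `Preferences` fields, the wording of the generated lines, and both function names are illustrative assumptions rather than the project's actual prompt format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Preferences:
    """Hypothetical container for what the database knows about a listener."""
    liked_topics: List[str] = field(default_factory=list)
    knowledge_level: str = "beginner"
    depth: str = "overview"
    tone: str = "casual"


def build_preference_context(prefs: Preferences) -> str:
    """Render stored preferences as plain-text instructions for the Host."""
    lines = ["LISTENER PREFERENCES:"]
    if prefs.liked_topics:
        lines.append("- Previously liked topics: " + ", ".join(prefs.liked_topics))
    lines.append(f"- Knowledge level: {prefs.knowledge_level}")
    if prefs.knowledge_level == "advanced":
        lines.append("- Skip basic definitions.")
    lines.append(f"- Preferred depth: {prefs.depth}")
    lines.append(f"- Preferred tone: {prefs.tone}")
    return "\n".join(lines)


def build_host_prompt(base_prompt: str, prefs: Preferences) -> str:
    """Append the preference context to the Host's fixed system prompt."""
    return base_prompt + "\n\n" + build_preference_context(prefs)
```

Because the context is rebuilt before each episode, a new rating changes the next prompt without any model retraining.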

The author emphasizes that the interesting engineering challenge isn't the podcast format itself, but implementing this closed feedback loop where the system gets "measurably better at serving you the more you use it."

Technical Stack and Practical Considerations

The project is built to be run locally for free:

  • Framework: Agno 2.2+ for multi-agent orchestration.
  • LLM: Google Gemini 2.5 Flash (or 2.0 Flash on the free tier for higher request limits).
  • Voice Synthesis: ElevenLabs Text-to-Speech for generating a dual-voice MP3 from the transcript.
  • UI: Streamlit for the frontend interface.
  • Memory: SQLite for storing episodes, ratings, and user preferences.

The author provides clear guidance on API limits: a typical 3-turn episode consumes about 7 API calls. On Gemini 2.0 Flash's free tier (1,500 requests/day), this allows for ~200 episodes daily (1,500 / 7 ≈ 214), while the Gemini 2.5 Flash free tier (20 requests/day) limits usage to 2-3 episodes (20 / 7 ≈ 2.9).
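
The quota arithmetic can be checked with a few lines of Python. The figure of 7 calls per episode and the two daily limits come from the article; the function name is only illustrative.

```python
CALLS_PER_EPISODE = 7  # from the article: a typical 3-turn episode


def episodes_per_day(daily_request_limit: int,
                     calls_per_episode: int = CALLS_PER_EPISODE) -> int:
    """Number of complete episodes that fit inside a daily request quota."""
    return daily_request_limit // calls_per_episode


print(episodes_per_day(1500))  # 214, in line with the article's "~200"
print(episodes_per_day(20))    # 2 full episodes; a third would exceed the quota
```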

Challenges and Key Insights

The development process highlighted non-obvious challenges:

  • Enforcing Brevity: Explicit, hard-coded sentence limits were necessary to achieve natural podcast dynamics, countering the LLM's default verbosity.
  • Agent Specialization: Rigidly separating the roles of Host (conversation flow), Expert (domain insight), and Summarizer (data extraction) proved superior to a single, multi-role agent.
  • Structured Outputs: Using Pydantic schemas with the Summarizer agent guaranteed data reliability for the learning subsystem, a crucial foundation for personalization.
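
The structured-output idea can be illustrated with a small validation sketch. To stay dependency-free it uses a standard-library dataclass as a stand-in for the Pydantic schema the project uses, and the fields `title` and `key_points` are assumptions, since the article does not list the real schema.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class EpisodeSummary:
    """Hypothetical summary shape; the project's real schema is not shown."""
    title: str
    key_points: List[str]

    def __post_init__(self) -> None:
        # Reject malformed summaries before they reach the database.
        if not isinstance(self.title, str) or not self.title.strip():
            raise ValueError("title must be a non-empty string")
        if not isinstance(self.key_points, list) or not all(
            isinstance(point, str) for point in self.key_points
        ):
            raise ValueError("key_points must be a list of strings")


def parse_summary(raw_json: str) -> EpisodeSummary:
    """Parse the Summarizer's JSON output, raising ValueError if it is invalid."""
    data = json.loads(raw_json)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("summary must be a JSON object")
    try:
        return EpisodeSummary(title=data["title"], key_points=data["key_points"])
    except KeyError as exc:
        raise ValueError(f"missing field: {exc.args[0]}") from exc
```

Rejecting bad output at this boundary keeps the learning subsystem from storing partial or mistyped records.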

The project is presented not as a consumer product but as a functional blueprint for building stateful, learning-capable multi-agent systems. The complete code is open-sourced, providing a practical reference for engineers exploring beyond simple chatbot prototypes.

AI Analysis

For retail and luxury AI practitioners, PodcastBrain is less about audio content and more about a reference architecture for personalized, multi-agent systems. The core concepts (specialized agents, a persistent preference memory, and a closed-loop feedback system) are directly transferable to high-value retail scenarios.

Imagine a virtual personal shopping assistant built on similar principles. One agent could act as the stylist (akin to the Host), engaging the customer in conversation about an event or need. A second, specialized agent (the Expert) could have deep knowledge of current inventory, trends, and brand heritage. A third agent could structure the interaction data for the CRM. With each interaction and explicit feedback ("loved this suggestion," "not my style"), the system would refine its understanding of the customer's taste, budget sensitivity, and preferred brands, making future recommendations increasingly precise.

The technical demonstration of using a simple SQLite database to create a persistent 'taste profile' is particularly relevant. Luxury retail thrives on deep client relationships and remembered preferences. This project shows a pragmatic, implementable path to codifying that relationship into an AI system that learns and adapts, moving beyond static recommendation engines to dynamic, conversational commerce advisors.

The maturity is at the prototype level, but the architectural patterns are production-ready. The main implementation effort for a luxury brand would involve integrating with existing PIM (Product Information Management) and CRM systems, ensuring brand voice is perfectly encoded into the agent personas, and establishing rigorous guardrails for style and brand alignment.
Original source: pub.towardsai.net
