From Prompting to Control Planes: A Self-Hosted Architecture for AI System Observability
Opinion & Analysis

A technical architect details a custom-built, self-hosted observability stack for multi-agent AI systems using n8n, PostgreSQL, and OpenRouter. This addresses the critical need for visibility into execution, failures, and costs in complex AI workflows.

Agentic.news Editorial · 5 min read
Source: medium.com via medium_mlops (single source)

What Happened

Amod Karambelkar, a technical architect, has published a detailed guide on Medium outlining a custom-built system designed to bring observability to complex, multi-agent AI applications. The core problem he addresses is the transition from simple, single-prompt interactions to sophisticated "control plane" architectures where multiple AI agents, tools, and data sources interact. In these systems, understanding what happened, why it failed, and how much it cost becomes a significant operational challenge.

The article describes a self-hosted solution that avoids reliance on proprietary, vendor-locked platforms. The architecture leverages three main components:

  1. n8n: An open-source workflow automation tool used as the orchestration engine. It defines and executes the multi-agent workflows, connecting various AI models (via APIs) and tools.
  2. PostgreSQL: A relational database that serves as the centralized audit log. Every step of an AI workflow execution—prompts sent, responses received, tool calls, errors, and metadata—is stored here for later analysis.
  3. OpenRouter: A unified API gateway to numerous large language models (LLMs). This provides flexibility in model choice and, crucially, a single point for tracking usage and costs across different AI providers.

The system is designed to make three key dimensions visible:

  • Execution Visibility: A complete, queryable audit trail of every step in a workflow.
  • Failure Analysis: Detailed error logging and tracing to pinpoint where and why a complex AI process broke down.
  • Cost Attribution: The ability to trace token usage and associated costs back to specific workflow runs, agents, or even end-users.
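Taken together, these three dimensions suggest what a single per-step log record needs to carry. A minimal sketch in Python; all field names here are illustrative assumptions, not taken from the article:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class StepRecord:
    """One logged step of a workflow run, covering all three dimensions."""
    run_id: str                # correlation ID tying steps of one run together
    node: str                  # which agent, tool, or LLM node executed
    status: str                # execution visibility: "ok" or "error"
    error: Optional[str]       # failure analysis: why this step broke, if it did
    prompt_tokens: int         # cost attribution
    completion_tokens: int
    cost_usd: float

record = StepRecord(
    run_id="run-42", node="rag_lookup", status="ok", error=None,
    prompt_tokens=512, completion_tokens=128, cost_usd=0.0011,
)
print(asdict(record))
```

Because the record is flat, it maps directly onto one row of the relational audit log described below.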

Technical Details

The architecture represents a pragmatic, integration-focused approach to AI Operations (AIOps). Instead of building monitoring from scratch, it stitches together mature open-source tools to create a cohesive observability plane.

Workflow Orchestration with n8n: n8n provides a low-code visual interface for designing agentic workflows. Each node in a workflow can represent an LLM call (via OpenRouter), a data processing step, a conditional branch, or a call to an external API. This makes the control logic explicit and manageable.

Centralized Logging with PostgreSQL: The critical innovation is the disciplined logging of all n8n node executions to a structured database schema. This includes timestamps, input/output payloads (which can be sanitized for PII), node execution status, and correlation IDs to trace a single user request through the entire graph of AI interactions. This transforms opaque AI execution into queryable data.
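A concrete sketch of such a schema, using SQLite in place of PostgreSQL so the example is self-contained; the table and column names are illustrative, not the article's actual schema:

```python
import sqlite3  # stand-in for PostgreSQL; same DDL shape applies there

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE node_executions (
        id             INTEGER PRIMARY KEY,
        correlation_id TEXT NOT NULL,    -- traces one user request across nodes
        node_name      TEXT NOT NULL,
        started_at     TEXT NOT NULL,
        finished_at    TEXT,
        status         TEXT NOT NULL,    -- 'success' | 'error'
        error_message  TEXT,
        input_payload  TEXT,             -- sanitized JSON
        output_payload TEXT              -- sanitized JSON
    )
""")
conn.execute(
    "INSERT INTO node_executions (correlation_id, node_name, started_at, status) "
    "VALUES (?, ?, ?, ?)",
    ("req-123", "llm_call", "2026-03-25T10:00:00Z", "success"),
)
# Replay every step of one user request, in order:
rows = conn.execute(
    "SELECT node_name, status FROM node_executions "
    "WHERE correlation_id = ? ORDER BY started_at",
    ("req-123",),
).fetchall()
print(rows)  # → [('llm_call', 'success')]
```

The correlation ID is what turns isolated node logs into a traceable graph of one request's journey.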

Unified Model Gateway with OpenRouter: By routing all LLM calls through OpenRouter, the system abstracts away individual provider APIs (OpenAI, Anthropic, Google, etc.). More importantly, OpenRouter returns standardized token-usage data with each response, enabling fine-grained cost tracking per request, which is then logged to PostgreSQL.
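As a sketch of that cost-tracking step: OpenRouter's chat completions follow the OpenAI-compatible format, where a `usage` object with token counts accompanies each response. The helper function and sample payload below are illustrative assumptions, not code from the article:

```python
def extract_usage(response_json: dict) -> dict:
    """Pull token counts from an OpenAI-compatible chat completion response."""
    usage = response_json.get("usage", {})
    return {
        "model": response_json.get("model"),
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }

# Hypothetical response payload, for illustration only.
sample = {
    "model": "anthropic/claude-3.5-sonnet",
    "usage": {"prompt_tokens": 812, "completion_tokens": 240,
              "total_tokens": 1052},
}
print(extract_usage(sample))
```

In the architecture described, the dictionary returned here would be written to the PostgreSQL audit log alongside the step's correlation ID.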

The result is a dashboard-ready data layer. Teams can use SQL or connect business intelligence (BI) tools to PostgreSQL to answer questions such as:

  • "What is the average cost per successful customer service resolution?"
  • "Which agent in the workflow has the highest failure rate?"
  • "What was the exact chain of events for the failed order status query from user X?"
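Questions like these reduce to plain SQL over the audit table. A runnable sketch, again using SQLite as a stand-in for PostgreSQL and an illustrative schema of our own:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node_executions "
    "(run_id TEXT, node_name TEXT, status TEXT, cost_usd REAL)"
)
conn.executemany(
    "INSERT INTO node_executions VALUES (?, ?, ?, ?)",
    [
        ("r1", "rag_lookup", "success", 0.002),
        ("r1", "summarizer", "error",   0.001),
        ("r2", "rag_lookup", "success", 0.002),
        ("r2", "summarizer", "success", 0.003),
    ],
)
# "Which agent in the workflow has the highest failure rate?"
worst = conn.execute("""
    SELECT node_name,
           AVG(CASE WHEN status = 'error' THEN 1.0 ELSE 0.0 END) AS failure_rate
    FROM node_executions
    GROUP BY node_name
    ORDER BY failure_rate DESC
    LIMIT 1
""").fetchone()
print(worst)  # → ('summarizer', 0.5)
```

Cost-per-run and per-user attribution queries follow the same pattern: a `GROUP BY` over whichever column the correlation metadata provides.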

Retail & Luxury Implications

For retail and luxury brands deploying AI beyond simple chatbots, this type of observability architecture is not a luxury—it's a prerequisite for responsible and scalable operation. The implications are significant for several high-value use cases:

1. Complex Customer Service & Concierge Agents: A luxury brand's AI concierge might chain together multiple steps: a sentiment-aware LLM interprets a customer's nuanced request, a retrieval-augmented generation (RAG) agent searches internal knowledge bases for product care instructions, a tool checks inventory in real-time, and another generates a personalized email summary. Without the observability plane described, diagnosing a failure in this chain is guesswork. With it, support teams can see the exact step where the RAG failed to retrieve the correct manual or where the inventory check timed out.

2. Agentic Personal Shopping & Styling: An AI stylist that interacts over multiple sessions, maintains a customer's style profile, and calls external APIs for product discovery and outfit visualization creates a complex stateful workflow. Observability enables tracking the stylist's "reasoning" (the chain of prompts and tool calls) for each recommendation, which is crucial for quality assurance, bias auditing, and iteratively improving the system's taste and brand alignment.

3. Supply Chain & Demand Forecasting Agents: AI systems that autonomously analyze social trends, sales data, and weather patterns to adjust production or logistics involve high-stakes decisions. The ability to audit the data sources and logical steps the AI took to arrive at a forecast is essential for risk management and regulatory compliance.

4. Cost Control at Scale: As covered in our previous analysis, enterprises are rapidly adopting RAG and agentic architectures. These systems can incur unpredictable and spiraling costs if not monitored. The integrated cost-tracking aspect of this architecture allows finance and tech teams to attribute AI spend to specific business units, campaigns, or product lines, moving AI from a nebulous R&D cost to a measurable operational expense.

The gap between this self-built solution and production readiness for a global brand like LVMH or Richemont is primarily in enterprise-grade features: robust security, access controls, scalability, and integration with existing IT monitoring stacks (e.g., Datadog, Splunk). However, the architectural blueprint is directly applicable. It demonstrates that the core need—turning AI agent workflows from a "black box" into a "glass box"—is solvable with today's technology.

AI Analysis

This article underscores a maturation in the AI adoption curve within enterprises, a trend we've been tracking closely. It moves beyond the "what model should we use?" phase to the critical "how do we operate this reliably?" phase. This aligns with our recent coverage on enterprises favoring RAG for production (2026-03-23) and the self-healing MLOps blueprint (2026-03-16). Those articles focused on the patterns for building robust systems; this piece provides a concrete, integrable pattern for observing them.

For retail and luxury AI leaders, the message is clear: the competitive advantage will soon come not just from having AI capabilities, but from having the **operational intelligence** to run them efficiently, safely, and continuously. A brand that cannot explain why its AI stylist made a specific recommendation, or cannot contain the cost of its customer service automation, will face significant brand, financial, and operational risks. The self-hosted approach outlined offers control and data ownership, which is paramount for luxury houses guarding proprietary data and customer relationships.

However, we expect the vendor ecosystem to rapidly mature toward packaged solutions that incorporate these observability principles, reducing the need for in-house integration work. The next step for practitioners is to evaluate their most complex AI pipelines and demand this level of transparency from their engineering teams or platform vendors.