The Innovation — What the Source Reports
This is a detailed, first-person engineering case study from an affiliate marketing company. The core problem was operational: a non-technical team needed to query a complex, 28-table MySQL database containing live financial and performance data (revenue, fraud flags, campaign ROI) without writing SQL.
The author's journey reveals two common architectural pitfalls and the hybrid solution that succeeded.
Mistake #1: Starting with Function Calling. For a complex analytical query like “Compare campaign ROI this month vs last month, by traffic source, excluding fraud flags, grouped by affiliate tier,” defining a JSON schema for function calling required 15 nested parameters. The LLM hallucinated values, leading to garbage results. The author concluded that while excellent for simple, single-table retrievals, function calling breaks down for multi-table analytical queries with temporal logic—precisely what SQL was designed for.
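To make the schema blow-up concrete, here is a hedged sketch of what such a function-calling definition starts to look like. The tool name, parameter names, and enum values are illustrative, not the author's actual schema; the point is how each analytical dimension (date ranges, groupings, filters) becomes another nested parameter the LLM must populate correctly.

```javascript
// Illustrative (hypothetical) function-calling schema for the ROI
// comparison query. Every analytical dimension becomes another nested
// parameter -- and another opportunity for the LLM to hallucinate a value.
const compareRoiTool = {
  name: "compare_campaign_roi",
  parameters: {
    type: "object",
    properties: {
      period_a: {
        type: "object",
        properties: { start: { type: "string" }, end: { type: "string" } },
      },
      period_b: {
        type: "object",
        properties: { start: { type: "string" }, end: { type: "string" } },
      },
      group_by: {
        type: "array",
        items: { enum: ["traffic_source", "affiliate_tier"] },
      },
      exclude_fraud_flags: { type: "boolean" },
      // ...more filters, metrics, and sort options keep piling up here
    },
    required: ["period_a", "period_b"],
  },
};
```

A single `SELECT` with a `GROUP BY` expresses the same request in one step, which is the author's argument for letting the model write SQL for this class of query.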
Mistake #2: Relying Solely on Text-to-SQL. Letting an LLM write SQL directly solved the complexity and accuracy problem. However, the prospect of granting an LLM direct access to a production database was terrifying. Relying solely on a system prompt like “only write SELECT statements” was deemed an insufficient guarantee against prompt injection or model regression that could lead to a DELETE or DROP command.
The breakthrough was abandoning the either/or choice and implementing a Router Pattern.
The Hybrid Architecture: Brain and Hands
- 🧠 The Brain (Text-to-SQL): Used for analytical queries crossing multiple tables. The LLM writes real SQL, offering full flexibility for data exploration.
- ✋ The Hands (Function Calling via MCP): Used for system actions—like suspending an affiliate or flagging a transaction. This path uses predefined tools only, with a human-in-the-loop approval step for anything irreversible.
A lightweight intent classifier (the router) determines the path before any LLM call or execution.
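A minimal sketch of that router, assuming a keyword heuristic stands in for whatever lightweight classifier (rules, embeddings, or a small model) the production system actually uses:

```javascript
// Minimal router sketch: classify intent BEFORE any SQL is generated or
// tool is invoked. The verb list is an illustrative assumption.
const ACTION_VERBS = ["suspend", "flag", "block", "approve", "pause"];

function routeIntent(userMessage) {
  const text = userMessage.toLowerCase();
  // Requests to change state go to the constrained tool path ("hands")
  if (ACTION_VERBS.some((verb) => text.includes(verb))) {
    return "function_calling";
  }
  // Everything else is treated as an analytical question ("brain")
  return "text_to_sql";
}
```

For example, "Suspend affiliate #1042" routes to the tool path, while "Compare campaign ROI this month vs last month" routes to text-to-SQL.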
The Critical Security Layer: The 3-Layer AST Validator
This is the core technical contribution that made the Text-to-SQL path safe for production. Instead of trusting the LLM's instructions, the system intercepts and mathematically validates generated SQL before execution.
- Layer 1 — Regex: A fast (&lt;1ms) filter that blocks obvious dangerous keywords (`DELETE`, `DROP`, `UPDATE`) and catches multi-statement injections via semicolon patterns.
- Layer 2 — AST (Abstract Syntax Tree): The `node-sql-parser` library converts the SQL string into a typed AST. The validator checks `stmt.type === 'select'` on the parsed semantic structure. This provides a mathematical guarantee that the query is a `SELECT`, even if the raw text is obfuscated (e.g., `/*DELETE*/ SELECT ...`).
- Layer 3 — Allowlist: Recursively walks the AST (checking `FROM` clauses, `JOIN`s, subqueries) to validate every table name against an allowlist (`ALLOWED_TABLES`). This prevents lateral movement, ensuring the LLM cannot access unauthorized tables.
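The three layers can be sketched as follows. The article uses `node-sql-parser` for Layer 2; to keep this sketch dependency-free, `validate()` accepts an already parsed AST-like object (roughly the shape that parser produces), and the table names are illustrative.

```javascript
// Sketch of the 3-layer validator. Table names are illustrative.
const ALLOWED_TABLES = ["campaigns", "affiliates", "transactions"];

// Layer 1: fast regex filter for dangerous keywords and stacked statements
function layer1Regex(sql) {
  if (/\b(DELETE|DROP|UPDATE|INSERT|ALTER|TRUNCATE)\b/i.test(sql)) return false;
  if (/;\s*\S/.test(sql)) return false; // a second statement after a semicolon
  return true;
}

// Layer 2: the parsed statement type must be SELECT
function layer2Type(ast) {
  return Boolean(ast) && ast.type === "select";
}

// Layer 3: recursively collect every table reference (FROM, JOINs,
// subqueries) and check each one against the allowlist
function layer3Allowlist(ast) {
  const tables = [];
  (function walk(node) {
    if (!node || typeof node !== "object") return;
    if (typeof node.table === "string") tables.push(node.table);
    Object.values(node).forEach(walk);
  })(ast);
  return tables.every((t) => ALLOWED_TABLES.includes(t));
}

function validate(sql, ast) {
  return layer1Regex(sql) && layer2Type(ast) && layer3Allowlist(ast);
}
```

In the real system the AST would come from parsing the LLM's output, so an obfuscated `/*DELETE*/ SELECT ...` still resolves to a `select` node, while a query touching an unlisted table fails Layer 3.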
The Action Layer: MCP for Tools and Workflows
For the “Hands” path, the author used the Model Context Protocol (MCP) to expose predefined tools (e.g., suspend_affiliate()). These tools are hand-crafted Node.js functions that assemble safe queries internally. For irreversible actions, the system mandates a human approval step (e.g., via a Slack or Telegram alert) before execution.
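A hedged sketch of the approval gate around such tools. The tool names and the `requestApproval` transport (a Slack or Telegram prompt in the article) are assumptions; the invariant is that irreversible actions never execute without a recorded human decision.

```javascript
// Sketch of an approval-gated action runner. Tool names are illustrative.
const IRREVERSIBLE = new Set(["suspend_affiliate", "flag_transaction"]);

async function runTool(name, args, { requestApproval, tools }) {
  if (IRREVERSIBLE.has(name)) {
    // e.g. post a Slack/Telegram message and wait for a human to approve
    const approved = await requestApproval(name, args);
    if (!approved) return { status: "rejected", name };
  }
  return { status: "done", result: await tools[name](args) };
}
```

The hand-crafted tool functions themselves assemble parameterized queries internally, so the LLM only ever supplies arguments, never SQL, on this path.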
Why This Matters for Retail & Luxury
While the case study is from affiliate marketing, the architectural pattern and challenges are directly transferable to retail and luxury enterprises. These companies sit on vast, siloed data landscapes: ERP systems (SAP, Oracle), CRM platforms (Salesforce), PIMs, e-commerce databases, inventory management systems, and clienteling platforms.
Concrete Scenarios:
- Merchandising & Planning: A planner asks, “What was the sell-through rate for handbags in EMEA last quarter, broken down by subcategory and price tier, excluding markdowns?”
- Retail Operations: A store manager queries, “Which top-tier clients haven't visited in 90 days but had an average transaction value over €5,000 in the past year?”
- E-commerce & Marketing: A digital lead asks, “Compare the conversion rate of our new loyalty campaign landing page vs. the standard product page for users from paid social, segmented by device type.”
Currently, answering these questions requires a ticket to the BI team, SQL expertise, or navigating pre-built dashboards that lack flexibility. This architecture promises to unlock that data for strategic decision-makers in near real-time.
Business Impact
The impact is operational efficiency and accelerated decision-making. The author reports 18 automation workflows running in production using this pattern. The validator blocked zero legitimate queries while catching multiple injection attempts during testing.
Quantifying the impact for retail:
- Reduced Time-to-Insight: From hours/days (submitting a BI request) to seconds.
- Increased Data Utilization: Empowers non-technical teams in merchandising, retail, and marketing to explore data independently, leading to more data-driven decisions.
- Operational Automation: Safe, approval-based workflows for actions like placing a high-value client on a restricted list or flagging suspicious inventory movement.
The author notes that most “production AI” demos skip the security layer. This architecture bridges the gap between a flashy demo and a system you can trust with sensitive financial and customer data.
Implementation Approach
Technical Requirements & Complexity:
- Database & Schema: A structured relational database (MySQL, PostgreSQL, Snowflake). A well-defined, documented schema is non-negotiable.
- LLM Foundation: A capable LLM (GPT-4, Claude 3, etc.) with strong reasoning and code-generation abilities.
- Orchestration Layer: Custom application logic to handle the routing, LLM calls, validation, and tool execution. The author used Node.js.
- Security Infrastructure: The 3-layer validator (`node-sql-parser` for the AST) and a robust tool/action framework (MCP is a strong candidate).
- Observability: Logging for validation errors (which layer failed) and query results is critical for monitoring and improvement.
Effort & Maturity: This is not a plug-and-play solution. It requires significant in-house engineering expertise or a trusted systems integrator. The core complexity lies in the semantic layer—curating context (table meanings, relationships, business logic) for the LLM and meticulously defining the safe action tools. The author notes that adding semantic descriptions to tables (e.g., “affiliate_tier (Platinum/Gold/Silver)”) dramatically improved SQL quality.
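The semantic layer the author describes could be structured along these lines. The table and column descriptions below are illustrative placeholders (only `affiliate_tier` and its Platinum/Gold/Silver values come from the article); the idea is that this curated context is rendered into the prompt before SQL generation.

```javascript
// Sketch of a semantic layer: curated per-table context fed to the LLM.
// Table/column names here are illustrative, not the author's schema.
const SCHEMA_CONTEXT = {
  affiliates: {
    description: "One row per affiliate partner.",
    columns: {
      affiliate_tier: "Partner level: Platinum, Gold, or Silver",
      fraud_flag: "1 if the affiliate has an open fraud investigation",
    },
  },
  campaigns: {
    description: "Marketing campaigns with spend and revenue.",
    columns: { roi: "Net revenue divided by spend, per campaign" },
  },
};

// Render the context as plain text for the SQL-generation prompt
function renderSchemaPrompt(ctx) {
  return Object.entries(ctx)
    .map(([table, meta]) => {
      const cols = Object.entries(meta.columns)
        .map(([name, desc]) => `  - ${name}: ${desc}`)
        .join("\n");
      return `Table ${table}: ${meta.description}\n${cols}`;
    })
    .join("\n\n");
}
```

Keeping this context in one versioned structure also makes it auditable: when SQL quality drops, the descriptions are the first place to look.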
Governance & Risk Assessment
- Data Privacy & Security: The AST validator with allowlisting is the primary technical control, providing a mathematical guarantee against unauthorized data access or destructive commands. This is essential for compliance with regulations like GDPR, where customer data access must be strictly controlled.
- Bias & Accuracy: The system is only as good as the LLM's SQL generation and the provided context. Hallucinated `JOIN` conditions or incorrect aggregations could lead to flawed business insights. The author identifies the need for an “Eval CI gate”—automated accuracy testing on a suite of benchmark queries with every deployment—as a critical next step.
- Audit Trail: All queries and actions, especially those requiring human approval, must be logged immutably. This is crucial for troubleshooting and compliance audits.
- Maturity Level: This pattern represents an advanced, production-ready approach for a specific use case (natural language to database query/action). It moves beyond prototype stage by addressing the paramount concern of security in a mission-critical environment. It is a blueprint for enterprises serious about operationalizing generative AI for data access.
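The “Eval CI gate” mentioned under Bias & Accuracy could be sketched as below. The benchmark entries and the fragment-matching check are assumptions (a real gate would compare query results against known-good answers, not just SQL text); `generateSql` stands in for the actual LLM call.

```javascript
// Sketch of an "Eval CI gate": run text-to-SQL against a benchmark suite
// on every deployment and fail the build below a pass-rate threshold.
// Benchmark entries are illustrative; generateSql stands in for the LLM.
const BENCHMARK = [
  { question: "Total revenue last month", mustContain: ["SUM", "revenue"] },
  { question: "Campaigns with fraud flags", mustContain: ["fraud_flag"] },
];

function evalGate(generateSql, threshold = 1.0) {
  const passed = BENCHMARK.filter(({ question, mustContain }) => {
    const sql = generateSql(question).toUpperCase();
    return mustContain.every((frag) => sql.includes(frag.toUpperCase()));
  }).length;
  return passed / BENCHMARK.length >= threshold;
}
```

Wired into CI, a failing gate blocks the deployment, which turns SQL-generation accuracy from an anecdote into a regression-tested property.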






