systems architecture
30 articles about systems architecture in AI news
UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems
A new arXiv paper introduces UniMixer, a unified scaling architecture for recommender systems. It bridges attention-based, TokenMixer-based, and factorization-machine-based methods into a single theoretical framework, aiming to improve parameter efficiency and scaling return on investment (ROI).
AI Agent Types and Communication Architectures: From Simple Systems to Multi-Agent Ecosystems
A guide to designing scalable AI agent systems, detailing agent types, multi-agent patterns, and communication architectures for real-world enterprise production. This represents the shift from reactive chatbots to autonomous, task-executing AI.
Multi-Agent AI Systems: Architecture Patterns and Governance for Enterprise Deployment
A technical guide outlines four primary architecture patterns for multi-agent AI systems and proposes a three-layer governance framework. This provides a structured approach for enterprises scaling AI agents across complex operations.
Beyond Self-Play: The Triadic Architecture for Truly Self-Evolving AI Systems
New research reveals why AI self-play systems plateau and proposes a triadic architecture with three key design principles that enable sustainable self-evolution through measurable information gain across iterations.
Memory Systems for AI Agents: Architectures, Frameworks, and Challenges
A technical analysis details the multi-layered memory architectures—short-term, episodic, semantic, procedural—required to transform stateless LLMs into persistent, reliable AI agents. It compares frameworks like MemGPT and LangMem that manage context limits and prevent memory drift.
Beyond Architecture: How Training Tricks Make or Break AI Fraud Detection Systems
New research reveals that weight initialization and normalization techniques—often overlooked in AI development—are critical for graph neural networks detecting financial fraud on blockchain networks. The study shows these training practices affect different GNN architectures in dramatically different ways.
8 RAG Architectures Explained for AI Engineers: From Naive to Agentic Retrieval
A technical thread explains eight distinct RAG architectures with specific use cases, from basic vector similarity to complex agentic systems. This provides a practical framework for engineers choosing the right approach for different retrieval tasks.
Alibaba DAMO Academy Releases AgentScope: A Python Framework for Multi-Agent Systems with Visual Design
Alibaba's DAMO Academy has open-sourced AgentScope, a Python framework for building coordinated AI agent systems with visual design, MCP tools, memory, RAG, and reasoning. It provides a complete architecture rather than just building blocks.
AI Agents Get a Memory Upgrade: New Framework Treats Multi-Agent Memory as Computer Architecture
A new paper proposes treating multi-agent memory systems as a computer architecture problem, introducing a three-layer hierarchy and identifying critical protocol gaps. This approach could significantly improve reasoning, skills, and tool usage in collaborative AI systems.
Google DeepMind Unveils 'Intelligent AI Delegates': A Paradigm Shift in Autonomous Agent Architecture
Google DeepMind has introduced a groundbreaking framework called 'Intelligent AI Delegates' that fundamentally reimagines how AI agents operate. This new architecture enables more autonomous, efficient, and collaborative problem-solving by allowing AI systems to delegate tasks dynamically.
Beyond Simple Retrieval: The Rise of Agentic RAG Systems That Think for Themselves
Traditional RAG systems are evolving into 'agentic' architectures where AI agents actively control the retrieval process. A new 5-layer evaluation framework helps developers measure when these intelligent pipelines make better decisions than static systems.
OpenDev Paper Formalizes the Architecture for Next-Generation Terminal AI Coding Agents
A comprehensive 81-page research paper introduces OpenDev, a systematic framework for building terminal-based AI coding agents. The work details specialized model routing, dual-agent architectures, and safety controls that address reliability challenges in autonomous coding systems.
Subagent AI Architecture: The Key to Reliable, Scalable Retail Technology Development
Subagent AI architectures break complex development tasks into specialized roles, enabling more reliable implementation of retail systems like personalization engines, inventory APIs, and clienteling tools. This approach prevents context collapse in large codebases.
Beyond RAG: How AI Memory Systems Are Creating Truly Adaptive Agents
AI development is shifting from static retrieval systems to dynamic memory architectures that enable continual learning. This evolution from RAG to agent memory represents a fundamental change in how AI systems accumulate and utilize knowledge over time.
Foxconn and Intel Partner on AI Data Center Rack Systems
Foxconn and Intel partner on AI rack systems, integrating Intel components into Foxconn manufacturing for hyperscale customers. No financial terms disclosed.
Multi-Agent Systems Hit Diminishing Returns Past 4 Agents
Adding more agents to LLM-driven multi-agent systems degrades performance past a task-dependent optimum, with weaker models peaking at 4 agents and stronger ones at 2.
Anthropic Publishes Zero-Trust Architecture for AI Agents
Anthropic released a zero-trust architecture framework for AI agents addressing four threat vectors across three implementation tiers.
Multi-Agent LLM Systems Fail to Outperform Single Models, Study Finds
New paper finds multi-agent LLM systems underperform single models by 2.3% on reasoning benchmarks, challenging a core assumption in AI engineering.
Recursive Multi-Agent Systems Top Hugging Papers; Eywa Bridges LLMs and Scientific Models
Recursive Multi-Agent Systems leads Hugging Papers with 242 upvotes. Eywa and OneManCompany signal a move from chat-based to structural agent collaboration.
Large Memory Models: New Architecture Beyond RAG and Vector Search
Researchers with 160+ Nature and ICLR publications have built Large Memory Models (LMMs), a new architecture designed to emulate human memory processes, offering an alternative to RAG and vector search paradigms.
AI Memory Survey: Three Systems Needed for Human-Like Recall
A new survey paper proposes that modern AI requires three distinct memory systems—parametric, retrieval, and agent memory—to achieve human-like cognition, highlighting control as the key bottleneck.
Layers on Layers — How You Can Improve Your Recommendation Systems
An IBM article critiques monolithic recommendation engines for trying to do too much with one score. It proposes a layered architecture—candidate generation, ranking, and business logic—to improve performance and adaptability. This is a direct, practical framework for engineering teams.
A Reference Architecture for Agentic Hybrid Retrieval in Dataset Search
A new research paper presents a reference architecture for 'agentic hybrid retrieval' that orchestrates BM25, dense embeddings, and LLM agents to handle underspecified queries against sparse metadata. It introduces offline metadata augmentation and analyzes two architectural styles for quality attributes like governance and performance.
Poisoned RAG: 5 Documents Can Corrupt 'Hallucination-Free' AI Systems
Researchers proved that planting a handful of poisoned documents in a RAG system's database can cause it to generate confident, incorrect answers. This exposes a critical vulnerability in systems marketed as 'hallucination-free'.
Akshay Pachaar Inverts LLM Agent Architecture with 'Harness' Design
AI engineer Akshay Pachaar outlined a novel 'harness' architecture for LLM agents that externalizes intelligence into memory, skills, and protocols. He is building a minimal, didactic open-source implementation of this design.
Claude Code's Architecture Revealed
An analysis of Claude Code's source code shows its core is a simple loop, but its power comes from systems like a 5-layer compaction pipeline and a 7-mode permission system, which developers can leverage for better performance.
A Practical Guide to Building Real-Time Recommendation Systems
This article provides a practical overview of building real-time recommendation systems, covering core components like data ingestion, feature stores, and model serving. It matters because real-time personalization is becoming a baseline expectation in digital commerce.
Aehr Test Systems Lands $41M AI Chip Order; H2 Bookings Top $92M
Aehr Test Systems received a record $41 million production order from a key hyperscale AI customer. Total bookings for the second half of its fiscal year exceeded $92 million, highlighting surging demand for semiconductor test and burn-in equipment.
Seven Voice AI Architectures That Actually Work in Production
An engineer shares seven voice agent architectures that have survived production, detailing their components, latency improvements, and failure modes. This is a practical guide for building real-time, interruptible, and scalable voice AI.
Why Most RAG Systems Fail in Production: A Critical Look at Common Pitfalls
An expert article diagnoses the primary reasons RAG systems fail in production, focusing on poor retrieval, lack of proper evaluation, and architectural oversights. This is a crucial reality check for teams deploying AI assistants.