rag
30 articles about rag in AI news
HyperAgent Raises $10M Grant Pool, Targets Zapier Replacement
HyperAgent, from ex-Airtable team, launches with $10M grant pool for 500 founders to build agentic automation that aims to replace Zapier.
ColPali Beats OCR Pipelines for Document RAG: 8× Storage Cost, 0% Chunking
ColPali eliminates OCR and chunking for document-heavy RAG by encoding each 16×16 image patch into a 128-dim vector. It outperforms prior SOTA on the ViDoRe benchmark but costs 8× more storage per page.
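The storage cost follows from keeping one vector per patch rather than one per chunk. A rough sketch of that arithmetic is below; the 448×448 page resolution and float16 storage are illustrative assumptions, not figures from the summary, and `multivector_bytes_per_page` is a hypothetical helper name.

```python
def multivector_bytes_per_page(page_px: int, patch_px: int = 16,
                               dim: int = 128, bytes_per_value: int = 2) -> int:
    """Bytes needed to store one `dim`-dimensional vector per image patch.

    Assumes a square page image of page_px x page_px pixels (hypothetical)
    and square patches of patch_px x patch_px pixels.
    """
    patches_per_side = page_px // patch_px
    n_patches = patches_per_side * patches_per_side
    return n_patches * dim * bytes_per_value


# Hypothetical page size and precision, chosen only to make the arithmetic concrete.
page_bytes = multivector_bytes_per_page(page_px=448)
print(page_bytes)  # 28 * 28 patches * 128 dims * 2 bytes = 200704
```

Under these assumptions a single page needs roughly 200 KB of vectors, which is why per-patch encoding trades storage for skipping OCR and chunking.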
Snapdragon X2 Elite Beats Intel Arrow Lake for AI Coding Agents
Snapdragon X2 Elite beat Intel Arrow Lake for Windows AI coding agents. The CPU, not inference speed, was the bottleneck, per @mweinbach.
Blockify Cuts RAG Corpus by 40x, Boosts Retrieval 2.3x
Blockify claims a 40x corpus reduction and a 2.3x relevance gain over naive RAG. It is open-source on GitHub, but the claims lack benchmark details.
New RAG method ditches vector DB, threatens industry
A new RAG method ditches the vector DB, threatening incumbents. The claim comes from a single tweet and has not been verified.
New CASIA Benchmark Exposes Fragmented Face Swapping Evaluation
CASIA researchers released a face swapping survey and benchmark on April 27, 2026, aiming to standardize evaluation across fragmented GAN and diffusion model methods.
RAG's New Frontier: When to Retrieve During Reasoning
A new RAG paradigm retrieves at multiple reasoning steps via a learned gate, boosting multi-hop QA by 15-20% on HotpotQA.
Vector DBs Can't Reason: GraphRAG-Bench Shows 83.6% Gap on Complex Queries
FalkorDB's GraphRAG-Bench benchmarks show vector databases struggle on multi-hop reasoning (83.6% gap) and contextual summarization (85.1% gap), highlighting graph-based retrieval's advantage for complex queries.
Large Memory Models: New Architecture Beyond RAG and Vector Search
Researchers with 160+ Nature and ICLR publications have built Large Memory Models (LMMs), a new architecture designed to emulate human memory processes, offering an alternative to RAG and vector search paradigms.
The Semantic Void: A RAG Detective Story
A first-person technical blog chronicles rebuilding a vector store index on GCP, exposing a 'semantic void' where embeddings fail to capture meaning. This serves as a cautionary tale for any RAG implementation, including retail chatbots and product search.
ERA Framework Improves RAG Honesty by Modeling Knowledge Conflicts as
ERA replaces scalar confidence scores with explicit evidence distributions to distinguish between uncertainty and ambiguity in RAG systems, improving abstention behavior and calibration.
Mirage's Cappy Edits Video via Text Message with No App
Mirage launched Cappy, a text-based video editing service that delivers fully edited videos via SMS. This first-of-its-kind approach eliminates traditional editing interfaces entirely.
ItemRAG: A New RAG Approach for LLM-Based Recommendation That Retrieves
ItemRAG shifts RAG for LLM-based recommenders from user-history retrieval to fine-grained item-level retrieval, using co-purchase and semantic data to prioritize informative items. Experiments show consistent outperformance over existing methods, especially for cold-start items.
ESGLens: A New RAG Framework for Automated ESG Report Analysis and Score
ESGLens combines RAG with prompt engineering to extract structured ESG data, answer questions, and predict scores. Evaluated on ~300 reports, it achieved a Pearson correlation of 0.48 against LSEG scores. The paper highlights promise but also significant limitations.
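For readers unfamiliar with the metric, a Pearson correlation measures how linearly predicted scores track reference scores, from -1 to 1. The sketch below is a plain implementation of the standard formula; the toy score lists are placeholders and are not data from the ESGLens paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance numerator and the two standard-deviation numerators.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder values only: a perfectly linear relationship gives 1.0 (up to rounding).
predicted = [1.0, 2.0, 3.0]
reference = [2.0, 4.0, 6.0]
print(round(pearson(predicted, reference), 6))
```

A value of 0.48 therefore indicates a moderate positive relationship, well short of the near-1.0 agreement shown by the toy data.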
RAG vs Fine-Tuning: A Practical Guide for Choosing the Right LLM
The article provides a clear, decision-oriented comparison between Retrieval-Augmented Generation (RAG) and fine-tuning for customizing LLMs in production, helping practitioners choose the right approach based on data freshness, cost, and output control needs.
A Practical Framework for Moving Enterprise RAG from POC to Production
The article presents a detailed, production-ready framework for building an enterprise RAG system, covering architecture, security, and deployment. It provides a concrete path for companies to move beyond experimental prototypes.
GraphRAG-IRL: A Hybrid Framework for More Robust Personalized Recommendation
Researchers propose GraphRAG-IRL, a hybrid recommendation framework that addresses LLMs' weaknesses as standalone rankers. It uses a knowledge graph and inverse reinforcement learning for robust pre-ranking, then applies persona-guided LLM re-ranking to a shortlist, achieving significant NDCG improvements.
Fine-Tuning vs RAG: A Foundational Comparison for AI Strategy
The source provides a foundational comparison of fine-tuning and Retrieval-Augmented Generation (RAG) for enhancing AI models. It uses the analogy of teaching during training versus providing a book during an exam, clarifying their distinct roles in AI application development.
RAG vs Fine-Tuning vs Prompt Engineering
A technical blog clarifies that Retrieval-Augmented Generation (RAG), fine-tuning, and prompt engineering should be viewed as a layered stack, not mutually exclusive options. It provides a decision framework for when to use each technique based on specific needs like data freshness, task specificity, and cost.
Mind Games Fragrance Achieves 56% Growth Without a Hero SKU
Mind Games, a chess-inspired luxury fragrance brand, achieved $28.9M in 2025 US sales with 56% YoY growth despite having no dominant hero SKU. 65% of sales come from 14 different scents, targeting young male collectors. The brand is projecting $120M in global retail sales for 2026.
Poisoned RAG: 5 Documents Can Corrupt 'Hallucination-Free' AI Systems
Researchers showed that planting a handful of poisoned documents in a RAG system's database can cause it to generate confident, incorrect answers. This exposes a critical vulnerability in systems marketed as 'hallucination-free'.
PoisonedRAG Attack Hijacks LLM Answers 97% of Time with 5 Documents
Researchers demonstrated that inserting only 5 poisoned documents into a 2.6 million document database can hijack a RAG system's answers 97% of the time, exposing critical vulnerabilities in 'hallucination-free' retrieval systems.
Skill-RAG Uses Hidden-State Probes to Trigger Retrieval Only When Needed
Researchers introduced Skill-RAG, a system that uses hidden-state probing to detect when an LLM is about to fail, triggering targeted retrieval. This improves over uniform RAG baselines on HotpotQA, Natural Questions, and TriviaQA.
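The idea of gating retrieval on a hidden-state probe can be sketched as follows. The summary does not describe the probe's architecture, so the linear logistic probe, the 0.5 threshold, and the names `probe_failure_probability` and `should_retrieve` are all assumptions made for illustration.

```python
import math


def probe_failure_probability(hidden_state, weights, bias):
    """Assumed linear probe: logistic regression over a hidden-state vector."""
    logit = sum(h * w for h, w in zip(hidden_state, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def should_retrieve(hidden_state, weights, bias, threshold=0.5):
    """Trigger retrieval only when the probe predicts a likely failure."""
    return probe_failure_probability(hidden_state, weights, bias) >= threshold


# Toy two-dimensional hidden states and probe weights (hypothetical values).
weights, bias = [2.0, 0.0], 0.0
print(should_retrieve([1.0, 0.0], weights, bias))   # True: probe predicts failure
print(should_retrieve([-1.0, 0.0], weights, bias))  # False: model answers without retrieval
```

In contrast to a uniform RAG baseline that retrieves for every query, a gate like this only pays the retrieval cost on inputs the probe flags.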
RAG-Anything: Multimodal RAG for Text, Images, Tables & Formulas
An open-source project, RAG-Anything, tackles a major flaw in most RAG systems by enabling them to process and connect information from text, images, tables, and formulas within documents.
FRAGATA: A Hybrid RAG System for Semantic Search Over 20 Years of HPC
A new paper details FRAGATA, a system enabling semantic search over two decades of technical support tickets at a supercomputing center. It uses hybrid retrieval-augmented generation (RAG) to find relevant past incidents despite typos or differences in language or wording, showing a qualitative improvement over the legacy search.
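Hybrid retrieval generally means merging a keyword ranking with an embedding-based ranking. The summary does not say how FRAGATA combines them; the sketch below uses reciprocal rank fusion, a common general-purpose fusion method, purely as an illustration. The ticket IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in;
    k=60 is a conventional default, not a value from the paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a semantic search.
keyword_hits = ["t1", "t2", "t3"]
semantic_hits = ["t3", "t1", "t4"]
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
# ['t1', 't3', 't2', 't4']
```

Tickets found by both retrievers rise to the top, while a ticket matched only semantically (for example, one with a typo that defeats keyword search) still appears in the fused list.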
Fine-Tuning vs RAG: Clarifying the Core Distinction in LLM Application Design
The source article aims to dispel confusion by explaining that fine-tuning modifies a model's knowledge and behavior, while RAG provides it with external, up-to-date information. Choosing the right approach is foundational for any production LLM application.
IBM Demonstrates Extreme Scale for Content-Aware Storage with 100-Billion
IBM Research announced a breakthrough in vector database technology, achieving storage capacity of 100 billion vectors. This enables content-aware storage systems that can understand and retrieve data based on semantic meaning rather than just metadata.
PRAGMA: Revolut's Foundation Model for Banking Event Sequences
A new research paper introduces PRAGMA, a family of foundation models designed specifically for multi-source banking event sequences. The model uses masked modeling on a large corpus of financial records to create general-purpose embeddings that achieve strong performance on downstream tasks like fraud detection with minimal fine-tuning.
Why Most RAG Systems Fail in Production: A Critical Look at Common Pitfalls
An expert article diagnoses the primary reasons RAG systems fail in production, focusing on poor retrieval, lack of proper evaluation, and architectural oversights. This is a crucial reality check for teams deploying AI assistants.
Snap & Qualcomm Partner on Snapdragon XR for Future Spectacles
Snap has entered a strategic agreement with Qualcomm to power future generations of its Spectacles AR glasses with Snapdragon XR platforms. This hardware partnership is critical for Snap's long-term bet on AI-driven augmented reality.