semantic search
30 articles about semantic search in AI news
FRAGATA: A Hybrid RAG System for Semantic Search Over 20 Years of HPC
A new paper details FRAGATA, a system enabling semantic search over two decades of technical support tickets at a supercomputing center. It uses hybrid retrieval-augmented generation (RAG) to find relevant past incidents despite typos, differences in language, or varied wording, showing a qualitative improvement over the legacy search.
Replace Claude Code's Context-Stuffing with git-semantic for Team-Wide Semantic Search
A new tool, git-semantic, lets teams build and share a semantic search index of their codebase via Git, eliminating redundant API calls and enabling faster, more accurate Claude Code queries.
From BM25 to Corrective RAG: A Benchmark Study Challenges the Dominance of Semantic Search for Tabular Data
A systematic benchmark of 10 RAG retrieval strategies on a financial QA dataset reveals that a two-stage hybrid + reranking pipeline performs best. Crucially, the classic BM25 algorithm outperformed modern dense retrieval models, challenging a core assumption in semantic search. The findings provide actionable, cost-aware guidance for building retrieval systems over heterogeneous documents.
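For readers unfamiliar with the lexical baseline in that benchmark, here is a minimal sketch of Okapi BM25 scoring. It is not the study's code: the toy documents, the whitespace tokenization, the parameter defaults (k1=1.5, b=0.75), and the "+1" smoothed IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF ("+1" inside the log keeps it non-negative).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "net revenue rose in the third quarter".split(),
    "the board approved a new dividend policy".split(),
    "quarter over quarter revenue growth slowed".split(),
]
scores = bm25_scores("revenue quarter".split(), docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In a two-stage pipeline of the kind the summary describes, a list like `ranking` would typically be merged with dense-retrieval candidates and then passed to a reranker; those stages are not shown here.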
Add Semantic Search to Claude Code with pmem: A Local RAG That Cuts Token Costs 75%
Install pmem, a local RAG MCP server, to give Claude Code instant semantic search over your entire project's history, slashing token usage for file retrieval.
Mediagenix Enhances Content Personalization with AI Semantic Search for Better Discovery
Media technology company Mediagenix has integrated AI-powered semantic search into its content management platform to improve content discovery and personalization for broadcasters and media companies. This represents a practical application of embedding technology in the media sector.
DoorDash Builds DashCLIP for Semantic Search Using 32 Million Labels
DoorDash has developed DashCLIP, a custom multimodal embedding model trained on 32 million proprietary labels to align images, text, and user queries for semantic search. This represents a significant move away from generic models for a critical e-commerce function.
Uber Eats Details Production System for Multilingual Semantic Search Across Stores, Dishes, and Items
Uber Eats engineers published a paper detailing their production semantic retrieval system that unifies search across stores, dishes, and grocery items using a fine-tuned Qwen2 model. The system leverages Matryoshka Representation Learning to serve multiple embedding sizes and shows substantial recall gains across six markets.
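The summary does not spell out how multiple embedding sizes are served. A common way to use embeddings trained with Matryoshka Representation Learning is to keep only a prefix of the vector and re-normalize it, as in the sketch below. The function name and the toy vector are hypothetical, and this is not Uber Eats' implementation.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]

# Hypothetical 4-dimensional unit vector standing in for a model output.
full = [0.5, 0.5, 0.5, 0.5]
small = truncate_embedding(full, 2)
small_norm = math.sqrt(sum(x * x for x in small))
```

The shorter vector trades some accuracy for lower memory and faster comparisons, which is the motivation for serving several sizes from one model.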
GameMatch AI Proposes LLM-Powered Identity Layer for Semantic Search in Recommendations
A new Medium article introduces GameMatch AI, a system that uses an LLM to create a user identity layer from descriptive paragraphs, aiming to move beyond click-based recommendations. The concept suggests a shift towards understanding user intent and identity for more personalized discovery.
Product Quantization: The Hidden Engine Behind Scalable Vector Search
The article explains Product Quantization (PQ), a method for compressing high-dimensional vectors to enable fast and memory-efficient similarity search. This is a foundational technology for scalable AI applications like semantic search and recommendation engines.
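As a rough illustration of the idea, the sketch below encodes a vector by splitting it into sub-vectors and storing, for each one, only the index of the nearest centroid in that subspace's codebook. The codebooks here are hand-written toy values (an assumption); in practice they are usually learned, for example with k-means, which is omitted.

```python
def pq_encode(vec, codebooks):
    """Return one centroid index per subspace for `vec`."""
    sub_len = len(vec) // len(codebooks)
    codes = []
    for i, book in enumerate(codebooks):
        sub = vec[i * sub_len:(i + 1) * sub_len]
        # Squared Euclidean distance from this sub-vector to each centroid.
        dists = [sum((a - c) ** 2 for a, c in zip(sub, centroid))
                 for centroid in book]
        codes.append(dists.index(min(dists)))
    return codes

def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for code, book in zip(codes, codebooks):
        approx.extend(book[code])
    return approx

# Hypothetical codebooks: 2 subspaces, 2 centroids each, 2 dimensions each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
vec = [0.9, 1.1, 0.1, 0.8]
codes = pq_encode(vec, codebooks)
approx = pq_decode(codes, codebooks)
```

Here a four-float vector is stored as two small integers; the reconstruction is lossy, which is the trade PQ makes for memory savings.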
Andrej Karpathy's Personal Knowledge Management System Uses LLM Embeddings Without RAG for 400K-Word Research Base
AI researcher Andrej Karpathy has developed a personal knowledge management system that processes 400,000 words of research notes using LLM embeddings rather than traditional RAG architecture. The system enables semantic search, summarization, and content generation directly from his Obsidian vault.
Beyond Cosine Similarity: How Embedding Magnitude Optimization Can Transform Luxury Search & Recommendation
New research reveals that controlling embedding magnitude—not just direction—significantly boosts retrieval and RAG performance. For luxury retail, this means more accurate product discovery, personalized recommendations, and enhanced clienteling through superior semantic search.
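The summary gives no formula, so the snippet below only illustrates the background point: cosine similarity depends on direction alone, while a plain dot product also reflects vector length. The vectors are made-up values, and this is not the method from the research.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

query = [1.0, 2.0]
item_small = [1.0, 2.0]   # same direction as the query
item_large = [2.0, 4.0]   # same direction, twice the magnitude

cos_small = cosine(query, item_small)
cos_large = cosine(query, item_large)
dot_small = dot(query, item_small)
dot_large = dot(query, item_large)
```

Both cosine values come out (approximately) equal, whereas the dot product of the longer vector is twice as large, so any signal carried by magnitude is invisible to cosine scoring.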
How Weaviate Agent Skills Let Claude Code Build Vector Apps in Minutes
Weaviate's official Agent Skills give Claude Code structured access to vector databases, eliminating guesswork when building semantic search and RAG applications.
ECLASS-Augmented Semantic Product Search
Researchers systematically evaluated LLM-assisted dense retrieval for semantic product search on industrial electronic components. Augmenting embeddings with ECLASS hierarchical metadata created a crucial semantic bridge, achieving 94.3% Hit_Rate@5 versus 31.4% for BM25.
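For context on the metric, hit rate at k is commonly computed as the share of queries with at least one relevant item among the top k results. The sketch below assumes that definition; the paper may define it differently, and the query data is invented.

```python
def hit_rate_at_k(ranked_results, relevant, k=5):
    """Fraction of queries whose top-k results contain a relevant item."""
    hits = 0
    for ranked, rel in zip(ranked_results, relevant):
        if any(item in rel for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_results)

# Hypothetical results for three queries, best match first.
ranked_results = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
relevant = [{"b"}, {"z"}, {"g", "x"}]

rate_at_2 = hit_rate_at_k(ranked_results, relevant, k=2)
rate_at_1 = hit_rate_at_k(ranked_results, relevant, k=1)
```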
New Research Proposes Lightweight Method to Fix Stale Semantic IDs in
Researchers propose a method to update 'stale' Semantic IDs in generative retrieval systems without full retraining. Their alignment technique improves key metrics and reduces compute costs by ~8-9x, addressing a core challenge in dynamic recommendation environments.
New Research: ADC-SID Framework Improves Semantic ID Generation by Denoising Collaborative Signals
A new arXiv paper proposes ADC-SID, a framework that adaptively denoises collaborative information to create more robust Semantic IDs for recommender systems. It specifically addresses the corruption of long-tail item representations, a critical problem for large retail catalogs.
The Semantic Void: A RAG Detective Story
A first-person technical blog chronicles rebuilding a vector store index on GCP, exposing a 'semantic void' where embeddings fail to capture meaning. This serves as a cautionary tale for any RAG implementation, including retail chatbots and product search.
Continuous Semantic Caching
Researchers propose a theory-grounded semantic caching system that treats user queries as points in a continuous embedding space, using dynamic ε-net discretization and kernel ridge regression to cut inference costs and latency without switching overhead.
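To make the basic notion of semantic caching concrete, here is a deliberately simpler stand-in: a cache that reuses a stored response when a new query embedding is close enough to a cached one. It uses a fixed cosine threshold and a linear scan, not the paper's ε-net discretization or kernel ridge regression; the class name, the threshold, and the embeddings are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Toy cache keyed by embedding similarity instead of exact text."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((list(embedding), response))

    def get(self, embedding):
        """Return the closest cached response, or None below the threshold."""
        best_sim, best_response = -1.0, None
        for cached, response in self.entries:
            sim = cosine(embedding, cached)
            if sim > best_sim:
                best_sim, best_response = sim, response
        return best_response if best_sim >= self.threshold else None

cache = SemanticCache(threshold=0.95)
cache.put([1.0, 0.0], "cached answer")
near = cache.get([0.99, 0.05])   # close to the cached query
far = cache.get([0.0, 1.0])      # unrelated query
```

A hit skips the model call entirely, which is where the cost and latency savings come from; choosing the threshold is the hard part that the research addresses more rigorously.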
Semantic Needles in Document Haystacks
Researchers developed a framework to test how LLMs score similarity between documents with subtle semantic changes. They found models exhibit positional bias, are sensitive to topical context, and produce unique scoring 'fingerprints'. This matters for any application relying on LLM-as-a-Judge for document comparison.
CAST: A New Framework for Semantic-Level Complementary Recommendations
Researchers propose CAST, a sequential recommendation framework that models transitions between discrete item semantic codes (e.g., specifications) and injects LLM-verified complementary knowledge. It achieves significant performance gains by moving beyond simplistic co-purchase statistics to capture genuine complementarity.
Building a Semantic Recommendation System from Scratch
An engineer documents the process of building a semantic recommender using embeddings and vector search, focusing on the practical challenges and failures encountered. This is a crucial reality check for teams moving beyond collaborative filtering.
New Research Proposes Profiler and DAVINCI for Scalable
Researchers propose Profiler, a non-learnable module to efficiently capture human citation patterns, and DAVINCI, a reranking model that integrates these patterns with semantic data. They also introduce a strict inductive evaluation setting to better simulate real-world recommendation scenarios, achieving state-of-the-art results.
KARMA: Alibaba's Framework for Bridging the Knowledge-Action Gap in LLM-Powered Personalized Search
Alibaba researchers propose KARMA, a framework that regularizes LLM fine-tuning for personalized search by preventing 'semantic collapse.' Deployed on Taobao, it improved key metrics and increased item clicks by 0.5%.
GateSID: A New Framework for Adaptive Cold-Start Recommendation Using Semantic IDs
Researchers propose GateSID, an adaptive gating framework that dynamically balances semantic and collaborative signals for cold-start items. It uses hierarchical Semantic IDs and adaptive attention to improve recommendations, showing a 2.6% GMV increase in online tests.
Brittlebench Framework Quantifies LLM Robustness, Finds Semantics-Preserving Perturbations Degrade Performance Up to 12%
Researchers introduce Brittlebench, a framework to measure LLM sensitivity to prompt variations. Applying semantics-preserving perturbations to standard benchmarks degrades model performance by up to 12% and alters model rankings in 63% of cases.
98× Faster LLM Routing Without a Dedicated GPU: Technical Breakthrough for vLLM Semantic Router
New research presents a three-stage optimization pipeline for the vLLM Semantic Router, achieving 98× speedup and enabling long-context classification on shared GPUs. This solves critical memory and latency bottlenecks for system-level LLM routing.
VLM4Rec: A New Approach to Multimodal Recommendation Using Vision-Language Models for Semantic Alignment
A new research paper proposes VLM4Rec, a framework that uses large vision-language models to convert product images into rich, semantic descriptions, then encodes them for recommendation. It argues semantic alignment matters more than complex feature fusion, showing consistent performance gains.
Building Semantic Product Recommendation Systems with Two-Tower Embeddings
A technical guide explains how to implement a two-tower neural network architecture for product recommendations, creating separate embeddings for users and items to power similarity search and personalized ads. This approach moves beyond simple collaborative filtering to semantic understanding.
StyleGallery: A Training-Free, Semantic-Aware Framework for Personalized Image Style Transfer
Researchers propose StyleGallery, a novel diffusion-based framework for image style transfer that addresses key limitations: semantic gaps, reliance on extra constraints, and rigid feature alignment. It enables personalized customization from arbitrary reference images without requiring model training.
Multi-TAP: A New Framework for Cross-Domain Recommendation Using Semantic Persona Modeling
Researchers propose Multi-TAP, a cross-domain recommendation framework that models intra-domain user preference heterogeneity through semantic personas. It selectively transfers knowledge between domains, outperforming existing methods on real-world datasets.
How Semantic AI Bridges Threat Intelligence to Automated Firewall Defense
Researchers propose a neuro-symbolic AI system that automatically converts cyber threat intelligence into firewall rules using semantic relationships. The approach leverages hypernym-hyponym relations to extract actionable security information, outperforming traditional methods.