knowledge management
30 articles about knowledge management in AI news
Andrej Karpathy's Personal Knowledge Management System Uses LLM Embeddings Without RAG for 400K-Word Research Base
AI researcher Andrej Karpathy has developed a personal knowledge management system that processes 400,000 words of research notes using LLM embeddings rather than traditional RAG architecture. The system enables semantic search, summarization, and content generation directly from his Obsidian vault.
Claude-Obsidian Open-Source Plugin Aims to Automate Knowledge Management
A developer announced Claude-Obsidian, an open-source plugin that uses AI to autonomously file, cross-reference, and research within Obsidian, citing it as a reason to delete Notion AI.
New Research Proposes FilterRAG and ML-FilterRAG to Defend Against Knowledge Poisoning Attacks in RAG Systems
Researchers propose two novel defense methods, FilterRAG and ML-FilterRAG, to mitigate 'PoisonedRAG' attacks where adversaries inject malicious texts into a knowledge source to manipulate an LLM's output. The defenses identify and filter adversarial content, maintaining performance close to clean RAG systems.
Microsoft's Satya Nadella Details Internal 'Lean for Knowledge Work' AI Initiative
Microsoft CEO Satya Nadella described the company's internal application of AI to streamline knowledge work, framing it as a 'Lean' manufacturing-style efficiency push for cognitive tasks. The initiative focuses on using AI to reduce process friction and improve productivity across internal operations.
Knowledge-RAG v3.0: The Local RAG MCP Server That Finally Just Works
Knowledge-RAG v3.0 eliminates Docker/Ollama setup, adds hybrid search with cross-encoder reranking, and auto-indexes your docs—making private RAG in Claude Code a one-command install.
Stanford/CMU Study: AI Agent Benchmarks Focus on 7.6% of Jobs, Ignoring Management, Legal, and Interpersonal Work
Researchers analyzed 43 AI benchmarks against 72,000+ real job tasks and found they overwhelmingly test programming/math skills, which represent only 7.6% of actual economic work. Management, legal, and interpersonal tasks—which dominate the labor market—are almost entirely absent from evaluation.
New Research Diagnoses LLMs' Struggle with Multiple Knowledge Updates in Context
A new arXiv paper reveals a persistent bias in LLMs when facts are updated multiple times within a long context. Models increasingly favor the earliest version, failing to track the latest state—a critical flaw for dynamic knowledge tasks.
Understanding the Interplay between LLMs' Utilisation of Parametric and Contextual Knowledge: A keynote at ECIR 2025
A keynote at ECIR 2025 will present research on how Large Language Models (LLMs) balance their internal, parametric knowledge with external, contextual information. This is critical for deploying reliable AI in knowledge-intensive tasks where models must correctly use provided context, not just their training data.
Reinforcement Learning Ushers in New Era of Autonomous Knowledge Agents
Researchers are developing knowledge agents powered by reinforcement learning that can autonomously gather, process, and apply information. These systems represent a significant evolution beyond traditional language models toward more independent problem-solving capabilities.
Future-Proof Your AI Search: Why Static Knowledge Bases Fail Luxury Retail
New research reveals AI retrieval benchmarks degrade over time as information changes. For luxury brands using AI for product recommendations and clienteling, this means static knowledge bases become stale, hurting customer experience and sales.
Ethan Mollick: AI's Jagged Intelligence Poses Unique Management Challenges
Ethan Mollick highlights that AI's weaknesses are non-intuitive, uniform across models, and shifting, making it uniquely challenging to manage compared to human teams. This complicates reliable deployment in professional workflows.
VMLOPS's 'Basics' Repository Hits 98k Stars as AI Engineers Seek Foundational Systems Knowledge
A viral GitHub repository aggregating foundational resources for distributed systems, latency, and security has reached 98,000 stars. It addresses a widespread gap in formal AI and ML engineering education, where critical production skills are often learned reactively during outages.
UiPath Launches AI Agents for Retail Pricing, Promotions, and Stock Management
UiPath has announced new AI agents designed to autonomously handle core retail operations: dynamic pricing, promotional planning, and inventory gap resolution. This represents a significant move by a major automation player into agentic AI for retail.
Vendasta Launches 'CRM AI' for Automated Client Management
Vendasta has launched a new AI-powered CRM designed to autonomously update client records and manage tasks, aiming to close the 'execution gap' for businesses. This represents a shift towards proactive, agentic systems in business software.
The File Paradigm: How Simple File Systems Could Revolutionize AI Context Management
New research proposes treating all AI context as files within a unified system, potentially solving memory and organization challenges in complex AI workflows. This approach could dramatically simplify how AI systems access and manage information.
AI Learns from Its Own Failures: New Framework Revolutionizes Autonomous Cloud Management
Researchers have developed AOI, a multi-agent AI system that transforms failed operational trajectories into training data for autonomous cloud diagnosis. The framework addresses key enterprise deployment challenges while achieving state-of-the-art performance on industry benchmarks.
The Unix Philosophy Returns: How File Systems Could Solve AI's Memory Crisis
A new research paper proposes treating AI context management like a Unix file system, with OpenClaw demonstrating that storing memory, tools, and knowledge as files creates traceable, auditable AI systems. This approach could solve fragmentation and transparency issues plaguing current agent frameworks.
Agent Harnessing: The Infrastructure That Makes AI Agents Work
A detailed technical guide argues that the model is not the hard part of building AI agents. The six-component harness — context management, memory, tools, control flow, verification, and coordination — is what separates production-grade agents from those that fail silently.
CS3: A New Framework to Boost Two-Tower Recommenders Without Slowing Them Down
Researchers propose CS3, a plug-and-play framework that strengthens the ubiquitous two-tower recommendation architecture. It uses three novel mechanisms to improve model alignment and knowledge transfer, delivering significant revenue gains in a live ad system while maintaining millisecond latency.
Logile to Showcase AI-Powered Connected Store Operations at Retail Technology Show 2026
Logile, a provider of AI-powered workforce solutions, announced its participation in Retail Technology Show 2026. The company will showcase its Connected Store Operations platform, emphasizing the industry trend toward integrating labor planning, task management, and store execution.
ML-Master 2.0 Hits 56.44% on MLE-Bench in 24-Hour Agentic Science Run
Researchers from Shanghai Jiao Tong University demonstrated ML-Master 2.0, an autonomous research agent that ran continuously for 24 hours on MLE-Bench and achieved a 56.44% medal rate. The advance lies in state management through Hierarchical Cognitive Caching rather than in reasoning, which is what enables long-horizon scientific workflows.
Oracle Blog Critiques the 'Guesswork' in Current CRM AI for Marketing
An Oracle blog post critiques the state of AI in CRM systems, asserting that most solutions still deliver vague insights that force marketing teams to guess rather than providing clear, actionable intelligence. This highlights a critical gap between AI promise and practical utility in customer relationship management.
Fine-Tuning vs RAG: Clarifying the Core Distinction in LLM Application Design
The article explains that fine-tuning modifies a model's knowledge and behavior, while RAG provides it with external, up-to-date information. Choosing between the two is a foundational decision for any production LLM application.
Microsoft Tests OpenClaw-Style AI Agents for Autonomous 365 Copilot
Microsoft is reportedly testing OpenClaw-style AI agents to evolve Microsoft 365 Copilot into an always-on, autonomous assistant. This move aims to directly handle complex, multi-step tasks like email triage and calendar management without constant user prompting.
Agentic Marketing AI Sustains Performance Gains in 11-Month Case Study
An 11-month longitudinal case study compared human-led vs. autonomous agentic personalization for marketing. While human management generated the highest lift, autonomous agents successfully sustained positive performance gains, pointing to a symbiotic operational model.
Omar Saadoun's PaperWiki AI Agents Now Generate Personalized Research Surveys
Omar Saadoun announced that his PaperWiki platform now uses AI agents to generate personalized survey papers from a user's LLM-generated knowledge base. These surveys are self-improving and update automatically as new papers are published.
Pika Labs Launches 'AI Self' Chatbot for Newsletter Creator Kimmonismus
Kimmonismus, who runs an AI newsletter with 225K+ readers, has launched a custom chatbot trained on his industry knowledge and opinions using Pika Labs' technology. The 'AI Self' is designed to handle reader inquiries at scale.
Anthropic's Claude Skills Implements 3-Layer Context Architecture to Manage Hundreds of Skills
Anthropic's Claude Skills framework employs a three-layer context management system that loads only skill metadata by default, enabling support for hundreds of specialized skills without exceeding context window limits.
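The metadata-first loading described above can be illustrated with a small sketch. It is not Anthropic's implementation: the `Skill` and `SkillRegistry` names are hypothetical, and the summary does not say what the second and third layers hold, so the sketch assumes they are a skill's full instructions and its bundled resources. Only the first layer is placed in the default context.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str   # layer 1: always present in the context
    instructions: str  # layer 2 (assumed): loaded only when the skill is selected
    resources: dict[str, str] = field(default_factory=dict)  # layer 3 (assumed): loaded on demand


class SkillRegistry:
    """Hypothetical registry that exposes each layer separately."""

    def __init__(self, skills: list[Skill]) -> None:
        self._skills = {s.name: s for s in skills}

    def metadata_context(self) -> str:
        # One short line per skill, cheap enough to include for hundreds of skills.
        return "\n".join(f"{s.name}: {s.description}" for s in self._skills.values())

    def load_instructions(self, name: str) -> str:
        return self._skills[name].instructions

    def load_resource(self, name: str, key: str) -> str:
        return self._skills[name].resources[key]


skills = [
    Skill("pdf-forms", "Fill in PDF forms", "Step 1 ... " * 200,
          {"schema.md": "field list " * 500}),
    Skill("sql-review", "Review SQL migrations", "Check indexes ... " * 200),
]
registry = SkillRegistry(skills)

# Only the metadata goes into the prompt by default.
default_context = registry.metadata_context()
print(default_context)
```

The default context grows by one line per skill, while the much larger instruction and resource text stays out of the window until a skill is actually used.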
How to Run Claude Code 24/7 Without Burning Your Context Window
Implement a hard 50K token session cap and a three-tier memory system (daily notes, MEMORY.md, PARA knowledge graph) to prevent context bloat and memory decay in long-running Claude Code agents.
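A rough sketch of how a hard session cap and tiered memory might fit together follows. The 50K figure and the three tier names come from the summary above; everything else is an assumption made for illustration, including the `Memory` and `Session` classes, the `promote` helper, the four-characters-per-token estimate, and the rollover behavior.

```python
from dataclasses import dataclass, field

SESSION_CAP_TOKENS = 50_000  # hard cap quoted in the summary


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; use a real tokenizer if one is available.
    return max(1, len(text) // 4)


@dataclass
class Memory:
    daily_notes: list[str] = field(default_factory=list)  # tier 1: per-session log
    memory_md: list[str] = field(default_factory=list)    # tier 2: curated facts (MEMORY.md)
    para_graph: dict[str, list[str]] = field(default_factory=dict)  # tier 3: PARA categories

    def promote(self, fact: str, category: str) -> None:
        # Hypothetical helper: record a durable fact and file it under a PARA category.
        self.memory_md.append(fact)
        self.para_graph.setdefault(category, []).append(fact)


@dataclass
class Session:
    memory: Memory
    cap: int = SESSION_CAP_TOKENS
    used: int = 0
    messages: list[str] = field(default_factory=list)
    restarts: int = 0

    def add(self, message: str) -> None:
        cost = estimate_tokens(message)
        if self.used + cost > self.cap:
            self._rollover()
        self.messages.append(message)
        self.used += cost

    def _rollover(self) -> None:
        # Write a note about the ending session, then start over with an empty context.
        self.memory.daily_notes.append(
            f"session {self.restarts}: {len(self.messages)} messages, {self.used} tokens"
        )
        self.messages.clear()
        self.used = 0
        self.restarts += 1


# Demo with a tiny cap so the rollover is visible: each message costs 40 tokens.
memory = Memory()
session = Session(memory, cap=100)
for _ in range(5):
    session.add("x" * 160)
memory.promote("Deploys run from the main branch", "Projects")
```

In this sketch a single message larger than the cap is still accepted after a rollover; a stricter version would truncate or reject it.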
How This Obsidian Vault Template Gives Claude Code a Long-Term Memory
A GitHub template creates a persistent knowledge graph for Claude Code, eliminating session amnesia and compounding engineering decisions across conversations.