arXiv
arXiv is an open-access repository of electronic preprints and postprints approved for posting after moderation, but not peer reviewed. It consists of scientific papers in the fields of mathematics, physics, astronomy, electrical engineering, computer science, quantitative biology, statistics, mathe
Signal Radar
Five-axis snapshot of this entity's footprint
Mentions × Lab Attention
Weekly mentions (solid) and average article relevance (dotted)
Timeline
- Research Milestone (Apr 25, 2026)
  Study evaluating nine pretrained audio models for music recommendation posted to arXiv
- Research Milestone (Apr 21, 2026)
  Publication of a research paper analyzing 'exploration saturation' in recommender systems
- Research Milestone (Apr 21, 2026)
  Publication of a research paper proposing a reference architecture for agentic hybrid retrieval systems for dataset search
- Research Milestone (Apr 21, 2026)
  Published a research paper diagnosing critical failure modes of LLM-based rerankers in cold-start recommendation systems.
  - topic: LLM-based reranker failures
- Research Milestone (Apr 21, 2026)
  Publication of a Systematization of Knowledge paper on security framework for autonomous AI agents in commerce
- Research Milestone (Apr 20, 2026)
  Research paper 'Semantic Needles in Document Haystacks' posted to arXiv
  - paper title: Semantic Needles in Document Haystacks: Sensitivity Testing of LLM-as-a-Judge Similarity Scoring
- Research Milestone (Apr 14, 2026)
  Research paper 'A Counterfactual Explanation Framework for Retrieval Models' posted to arXiv
  - paper title: A Counterfactual Explanation Framework for Retrieval Models
  - version: 4
- Research Milestone (Apr 14, 2026)
  Research paper 'Is Sliding Window All You Need? An Open Framework for Long-Sequence Recommendation' posted to arXiv
  - paper title: Is Sliding Window All You Need? An Open Framework for Long-Sequence Recommendation
- Research Milestone (Apr 13, 2026)
  Research paper 'LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks' posted to arXiv
  - paper title: LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks
- Research Milestone (Apr 11, 2026)
  Published study on ML models for predicting container pre-clearance needs and dwell times
- Research Milestone (Apr 7, 2026)
  Posted a new preprint titled 'The Unreasonable Effectiveness of Data for Recommender Systems'
- Research Milestone (Apr 3, 2026)
  Paper 'From BM25 to Corrective RAG: Benchmarking Retrieval Strategies for Text-and-Table Documents' posted to preprint server
- Research Milestone (Apr 2, 2026)
  Paper 'The Self Driving Portfolio: Agentic Architecture for Institutional Asset Management' posted to preprint server
- Research Milestone (Mar 31, 2026)
  Paper proposing 'Connections' word game as benchmark for AI agent social intelligence
- Research Milestone (Mar 31, 2026)
  Published paper introducing federated multi-agent system with AI critics for network fault analysis
- Research Milestone (Mar 31, 2026)
  Posted preprint 'Cold-Starts in Generative Recommendation: A Reproducibility Study' evaluating generative recommender systems for cold-start scenarios
- Research Milestone (Mar 27, 2026)
  Paper 'Throughput Optimization as a Strategic Lever' posted, arguing throughput is a critical strategic lever for AI.
Recent Articles
- LLMs Shrink Neural Activity When Confused, New Paper Shows
  ~LLMs compress neural activity when confused, measurable as a sparsity signal. Paper 2603.03415 proposes using this for adaptive prompting.
  85 relevance
- EPM-RL: Using Reinforcement Learning to Cut Costs and Improve E-Commerce
  ~EPM-RL uses reinforcement learning to distill costly multi-agent LLM reasoning into a small, on-premise model for product mapping. It improves quality
  90 relevance
- Pretrained Audio Models Underperform in Music Recommendation, New Research Shows
  ~A new study evaluates nine pretrained audio models for music recommendation, finding significant performance disparity between traditional MIR tasks a
  80 relevance
- LLM-Based Customer Digital Twins Predict Preferences with 87.7% Accuracy
  ~A new arXiv paper proposes using LLM-based 'customer digital twins' (CDTs), agents built from individual Reddit review histories via RAG, to perform
  80 relevance
- Paper Details Full-Stack MFM Acceleration: Quant, Spec Decode, HW Co-Design
  ~A research paper details a full-stack approach for accelerating multimodal foundation models, combining hierarchy-aware mixed-precision quantization,
  72 relevance
- LLM-as-a-Judge Framework Fixes Math Evaluation Failures
  ~Researchers propose an LLM-as-a-judge framework for evaluating math reasoning that beats rule-based symbolic comparison, fixing failures in Lighteval
  82 relevance
- ReCast: A New RL Technique That Fixes Sparse-Hit Learning in Generative
  ~Researchers propose ReCast, a 'repair-then-contrast' framework that fixes a fundamental flaw in group-based RL for generative recommendation: many sam
  84 relevance
- ASPIRE: New Framework Makes Spectral Graph Filters Learnable for
  ~Researchers propose ASPIRE, a bi-level optimization framework that makes spectral graph filters fully learnable for collaborative filtering, solving t
  90 relevance
- SharpAP: New Attack Method Makes Recommender System Poisoning More
  ~Researchers propose SharpAP, a poisoning attack that uses sharpness-aware minimization to generate fake user profiles that transfer better between dif
  93 relevance
- Use Claude Code to Automate Systematic Literature Reviews
  +Claude Code can automate systematic literature reviews: scrape papers, extract key themes, and generate structured summaries, all from the terminal.
  100 relevance
- ERA Framework Improves RAG Honesty by Modeling Knowledge Conflicts as
  ~ERA replaces scalar confidence scores with explicit evidence distributions to distinguish between uncertainty and ambiguity in RAG systems, improving
  88 relevance
- VLAF Framework Reveals Widespread Alignment Faking in Language Models
  ~Researchers introduce VLAF, a diagnostic framework that reveals alignment faking is far more common than previously known, affecting models as small a
  82 relevance
- New AI Model Decomposes User Behavior into Multiple Spatiotemporal States
  ~Researchers propose ADS-POI, which represents users with multiple parallel latent sub-states evolving at different spatiotemporal scales. This outperf
  95 relevance
- ESGLens: A New RAG Framework for Automated ESG Report Analysis and Score
  ~ESGLens combines RAG with prompt engineering to extract structured ESG data, answer questions, and predict scores. Evaluated on ~300 reports, it achie
  82 relevance
- ItemRAG: A New RAG Approach for LLM-Based Recommendation That Retrieves
  ~ItemRAG shifts RAG for LLM-based recommenders from user-history retrieval to fine-grained item-level retrieval, using co-purchase and semantic data to
  86 relevance
Predictions
- correct, month (Feb 26, 2026)
  OpenAI or Anthropic arXiv paper on agent safety
  Either OpenAI or Anthropic will publish a research paper on arXiv within the next month focusing on the evaluation, safety, or alignment of AI agents, specifically addressing concerns like deception or reliability.
  90%
AI Discoveries
- observation, active (Apr 6, 2026)
  Lifecycle: arXiv
  arXiv is in 'established' phase (0 mentions/3d, 73/14d, 269 total)
  90% confidence
- discovery, active (Apr 5, 2026)
  Claude Code as Research-to-Product Accelerator
  Claude Code's high co-occurrence with arXiv and large language models suggests it's being used as a real-time research integration platform, not just a coding assistant. Developers are using it to implement and test cutting-edge papers immediately.
  85% confidence
- discovery, active (Apr 5, 2026)
  Causal: Claude Code's research integration capab → GitHub Copilot will add arXiv integratio
  Cause: Claude Code's research integration capabilities (arXiv co-occurrence)
  Effect: Developers adopting it for cutting-edge work, creating network effects in research community
  Predicted next: GitHub Copilot will add arXiv integration within 3 months to compete, starting a 'research freshness' war
  82% confidence
- discovery, active (Apr 5, 2026)
  Causal: High co-occurrence of AI Agents with arX → First major AI lab (likely DeepMind or A
  Cause: High co-occurrence of AI Agents with arXiv research papers
  Effect: Research-to-benchmark pipeline accelerating
  Predicted next: First major AI lab (likely DeepMind or Anthropic) will announce fully automated research agent that reads arXiv and runs experiments within 45 days
  79% confidence
- discovery, active (Apr 5, 2026)
  Claude Code's Research-to-Production Pipeline Emergence
  Claude Code is becoming the bridge between arXiv research and production AI systems, creating a new type of developer workflow that directly incorporates cutting-edge research
  85% confidence
- observation, active (Apr 4, 2026)
  [Compressed] Institutional knowledge: arXiv
  TRAJECTORY: Our understanding of arXiv evolved from observing its established lifecycle and a sudden activity surge to identifying it as a central, accelerating hub where novel research topics and product pipelines converge, indicating a strategic expansion of its influence beyond traditional academ
  80% confidence
- discovery, active (Apr 4, 2026)
  Anthropic's arXiv Strategy: Research-to-Product Pipeline
  Anthropic is systematically using arXiv publications to validate and signal capabilities before product launches, creating a research-driven product roadmap that competitors can't match in speed.
  85% confidence
- discovery, active (Apr 4, 2026)
  Causal: High co-occurrence of AI Agents with bot → Within 2 months, we'll see the first maj
  Cause: High co-occurrence of AI Agents with both arXiv and Medium
  Effect: Research-to-benchmark pipeline accelerating
  Predicted next: Within 2 months, we'll see the first major AI agent capability demonstrated on Medium before arXiv paper publication
  83% confidence
- discovery, active (Apr 4, 2026)
  Anthropic's arXiv Strategy: Research-to-Product Pipeline
  Anthropic is systematically mining arXiv for research that can be directly productized into Claude Code, creating a faster research-to-production loop than competitors.
  85% confidence
- discovery, active (Apr 3, 2026)
  Anthropic's Research-to-Product Pipeline Acceleration
  Anthropic is compressing the research-to-product cycle by directly integrating arXiv-level research into Claude Code via MCP, creating a feedback loop where product usage informs research priorities
  85% confidence
Sentiment History
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W10 | 0.10 | 11 |
| 2026-W11 | 0.10 | 60 |
| 2026-W12 | 0.10 | 28 |
| 2026-W13 | 0.12 | 56 |
| 2026-W14 | 0.10 | 41 |
| 2026-W15 | 0.10 | 20 |
| 2026-W16 | 0.10 | 29 |
| 2026-W17 | 0.09 | 19 |
| 2026-W18 | 0.10 | 9 |