arXiv
arXiv is an open-access repository of electronic preprints and postprints approved for posting after moderation, but not peer reviewed. It consists of scientific papers in the fields of mathematics, physics, astronomy, electrical engineering, computer science, quantitative biology, statistics, mathe
[Chart: Signal Radar. Five-axis snapshot of this entity's footprint]
[Chart: Mentions × Lab Attention. Weekly mentions (solid) and average article relevance (dotted)]
Timeline (20)
- Research Milestone, Jun 3, 2026
  Paper on strategic attack timing published on arXiv
  - paper id: 2606.06529
- Research Milestone, May 7, 2026
  Paper on SAE-based probes for predicting agent tool failures posted to arXiv
- Research Milestone, Apr 25, 2026
  Study evaluating nine pretrained audio models for music recommendation posted to arXiv
- Research Milestone, Apr 21, 2026
  Publication of a research paper proposing a reference architecture for agentic hybrid retrieval systems for dataset search
- Research Milestone, Apr 21, 2026
  Publication of a research paper analyzing 'exploration saturation' in recommender systems
- Research Milestone, Apr 21, 2026
  Published a research paper diagnosing critical failure modes of LLM-based rerankers in cold-start recommendation systems
  - topic: LLM-based reranker failures
- Research Milestone, Apr 21, 2026
  Publication of a Systematization of Knowledge paper on security framework for autonomous AI agents in commerce
- Research Milestone, Apr 20, 2026
  Research paper 'Semantic Needles in Document Haystacks' posted to arXiv
  - paper title: Semantic Needles in Document Haystacks: Sensitivity Testing of LLM-as-a-Judge Similarity Scoring
- Research Milestone, Apr 14, 2026
  Research paper 'A Counterfactual Explanation Framework for Retrieval Models' posted to arXiv
  - paper title: A Counterfactual Explanation Framework for Retrieval Models
  - version: 4
- Research Milestone, Apr 14, 2026
  Research paper 'Is Sliding Window All You Need? An Open Framework for Long-Sequence Recommendation' posted to arXiv
  - paper title: Is Sliding Window All You Need? An Open Framework for Long-Sequence Recommendation
- Research Milestone, Apr 13, 2026
  Research paper 'LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks' posted to arXiv
  - paper title: LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks
- Research Milestone, Apr 11, 2026
  Published study on ML models for predicting container pre-clearance needs and dwell times
- Research Milestone, Apr 7, 2026
  Posted a new preprint titled 'The Unreasonable Effectiveness of Data for Recommender Systems'
- Research Milestone, Apr 3, 2026
  Paper 'From BM25 to Corrective RAG: Benchmarking Retrieval Strategies for Text-and-Table Documents' posted to preprint server
- Research Milestone, Apr 2, 2026
  Paper 'The Self Driving Portfolio: Agentic Architecture for Institutional Asset Management' posted to preprint server
- Research Milestone, Mar 31, 2026
  Paper proposing 'Connections' word game as benchmark for AI agent social intelligence
- Research Milestone, Mar 31, 2026
  Published paper introducing federated multi-agent system with AI critics for network fault analysis
Relationships (9)
Licensed
Uses
Frequently appears with (10)
Entities that show up in the same articles: shared coverage, not a stated relationship.
Recent Articles (11)
- SVoT Boosts MLLM Spatial Reasoning by 65% via RL-Verified Visual Chains (88 relevance)
  SVoT uses RL to verify MLLM spatial reasoning states, achieving up to 65% accuracy gains on OOD tests across five domains including Pacman and Gather.
- MacArena: 421-Task macOS Benchmark Reveals 26% CUA Ranking Inversion (95 relevance)
  MacArena benchmark of 421 macOS tasks reveals 26% performance gap for top models on native tasks, suggesting current CUAs overfit to Linux distributio
- Selective Attackers Cut Agent Safety by 28pp, Paper Finds (100 relevance)
  Strategic attack timing cuts agent AI safety by up to 28pp, showing current evaluations overestimate safety.
- MIT Paper Formalizes Self-Revising AI Scientists That Can Change Their Own Language (87 relevance)
  MIT paper 2606.01444 formalizes self-revising AI scientists that can change their conceptual schema. Novelty is defined by what could not be expressed
- SMAC-Talk: StarCraft Benchmark Tests LLM Agents Against Deceptive Allies (70 relevance)
  SMAC-Talk extends StarCraft Multi-Agent Challenge with natural language communication, testing LLM agents against deceptive allies. Qwen3.5 models ben
- Meta-Stanford Survey: Code as Agent Harness Improves AI Reasoning (89 relevance)
  Meta, Stanford, Illinois survey argues AI agents work better with code as their main working layer, calling it an agent harness.
- DualFashion: Dual-Diffusion Transformer Generates Outfit Images & Text (82 relevance)
  DualFashion uses a dual-diffusion Transformer to jointly generate fashion images and text, outperforming SOTA on iFashion and Polyvore-U with interpre
- MLLM Raters Show Central Tendency Bias in Clinical Scoring (70 relevance)
  Study finds GPT-5 and other MLLMs show central tendency bias in clinical scoring, compressing predictions toward scale midpoint despite prompt modific
- LLM-EDT: Dual-Phase Training Boosts Cross-Domain Rec by 12.4% (74 relevance)
  LLM-EDT improves cross-domain sequential recommendation by up to 12.4% using dual-phase training and LLM-based item generation.
- Cascaded LLMs Lift E-Commerce Cart Adds 2.7% in Online Test (100 relevance)
  A cascaded LLM framework for e-commerce storefront generation lifted cart adds by +2.7% in online tests, using teacher-student fine-tuning to approach
- AgentStop Cuts Local AI Agent Energy by 15-20% With Minimal Performance Loss (85 relevance)
  AgentStop cuts local AI agent energy by 15-20% with <5% utility loss using token log-probabilities.
Predictions (1)
- correct, month, Feb 26, 2026
  OpenAI or Anthropic arXiv paper on agent safety
  Either OpenAI or Anthropic will publish a research paper on arXiv within the next month focusing on the evaluation, safety, or alignment of AI agents, specifically addressing concerns like deception or reliability.
  90%
AI Discoveries (10)
- observation, active, 1d ago
  [Compressed] Institutional knowledge: arXiv
  TRAJECTORY: Our understanding of arXiv evolved from viewing it as a stable, neutral repository to recognizing it as a dynamic entity with surging mentions, active hypotheses about formal AI partnerships and disclosure mandates, and indirect talent pipeline connections to Anthropic. KEY FACTS: - arX
  80% confidence
- observation, active, 4d ago
  Lifecycle: arXiv
  arXiv is in 'established' phase (2 mentions/3d, 4/14d, 360 total)
  90% confidence
- discovery, active, 5d ago
  Causal: Anthropic publishes Claude Opus 4.6 pape → Within 120 days, top-tier AI conference
  Cause: Anthropic publishes Claude Opus 4.6 papers on arXiv at 3x OpenAI's rate Effect: Academic researchers (MIT, Stanford) increasingly cite and build on Anthropic's work, not Google's or OpenAI's Predicted next: Within 120 days, top-tier AI conference papers will cite Anthropic research more than
  71% confidence
- discovery, active, 5d ago
  arXiv is becoming a competitive intelligence battlefield: Anthropic is winning
  arXiv (4 mentions/7d) is unconnected to Google (18 mentions) despite Google being a research powerhouse. This is a strategic failure. Anthropic's Claude Opus 4.6 papers on arXiv (3x OpenAI's rate) are systematically capturing academic mindshare. The unconnected pair Google ↔ arXiv is the signal: Goo
  82% confidence
- discovery, active, 5d ago
  Anthropic's arXiv dominance signals a research-led market capture strategy
  Claude Opus 4.6 papers appearing on arXiv at 3x the rate of OpenAI's equivalent models isn't just transparency; it's a systematic talent and mindshare acquisition play. Anthropic is using arXiv to attract academic researchers who will build on Claude, creating a self-reinforcing ecosystem that comp
  90% confidence
- discovery, active, 5d ago
  Causal: Anthropic publishes Claude Opus 4.6 pape → Anthropic will release a research-specif
  Cause: Anthropic publishes Claude Opus 4.6 papers on arXiv at 3x OpenAI's rate Effect: Academic researchers increasingly adopt Claude for research, building a community-driven ecosystem around Anthropic's models. Predicted next: Anthropic will release a research-specific API tier with academic prici
  85% confidence
- discovery, active, 6d ago
  arXiv Is Becoming Anthropic's Secret Weapon Against OpenAI
  arXiv (4 mentions) and Anthropic are an unconnected pair, but Anthropic's Claude Opus 4.6 papers are appearing on arXiv at 3x the rate of OpenAI's. This isn't coincidence: Anthropic is using arXiv as a strategic publication venue to establish technical legitimacy, while OpenAI has shifted to blog p
  85% confidence
- discovery, active, 6d ago
  Research Convergence: Neural Memory + Agent Autonomy = Self-Improving Systems
  The simultaneous emergence of persistent memory architectures (Titan, Dreaming, from prior discovery) and autonomous agent capabilities (Claude Code porting Lightroom) creates a new capability frontier: agents that remember and improve across sessions. CLAUDE.md is the primitive version; Titan/Dreami
  75% confidence
- observation, active, 6d ago
  Velocity spike: arXiv
  arXiv (organization) surged from 1 to 3 mentions in 3 days (velocity_spike).
  80% confidence
- hypothesis, active, Feb 24, 2026
  H: arXiv will launch a 'verified replication' or 'live benchmark' feature within 2 months, allowing rea
  arXiv will launch a 'verified replication' or 'live benchmark' feature within 2 months, allowing real-time testing of AI models against new research benchmarks, becoming the de facto validation layer for the AI industry.
  75% confidence
Sentiment History
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W17 | 0.09 | 19 |
| 2026-W18 | 0.10 | 9 |
| 2026-W19 | 0.10 | 1 |
| 2026-W20 | 0.07 | 3 |
| 2026-W21 | 0.04 | 5 |
| 2026-W22 | 0.00 | 1 |
| 2026-W23 | 0.10 | 2 |
| 2026-W24 | 0.10 | 3 |
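
The weekly rows above carry very different mention counts (19 in W17, 1 in W19 and W22), so a plain average of the weekly sentiment values treats a one-mention week the same as a nineteen-mention week. The page does not say how, or whether, it aggregates these rows; the sketch below is only one assumed way to summarize them, using a mention-weighted mean. The names `ROWS`, `weighted_mean_sentiment`, and `simple_mean_sentiment` are hypothetical and are not part of the page.

```python
# Rows copied from the Sentiment History table: (ISO week, avg sentiment, mentions).
ROWS = [
    ("2026-W17", 0.09, 19),
    ("2026-W18", 0.10, 9),
    ("2026-W19", 0.10, 1),
    ("2026-W20", 0.07, 3),
    ("2026-W21", 0.04, 5),
    ("2026-W22", 0.00, 1),
    ("2026-W23", 0.10, 2),
    ("2026-W24", 0.10, 3),
]


def weighted_mean_sentiment(rows):
    """Mention-weighted mean of the weekly averages (an assumed aggregation)."""
    total_mentions = sum(mentions for _, _, mentions in rows)
    if total_mentions == 0:
        raise ValueError("no mentions to average over")
    weighted_sum = sum(sentiment * mentions for _, sentiment, mentions in rows)
    return weighted_sum / total_mentions


def simple_mean_sentiment(rows):
    """Unweighted mean of the weekly averages, for comparison."""
    if not rows:
        raise ValueError("no rows to average over")
    return sum(sentiment for _, sentiment, _ in rows) / len(rows)


if __name__ == "__main__":
    print(f"weighted:   {weighted_mean_sentiment(ROWS):.3f}")
    print(f"unweighted: {simple_mean_sentiment(ROWS):.3f}")
```

On these eight rows the weighted figure comes to roughly 0.084 over 43 mentions, against 0.075 for the unweighted mean, because the high-volume W17 and W18 weeks sit near the top of the range.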