Skip to content
gentic.news — AI News Intelligence Platform
Connecting to the Living Graph…

framework

30 articles about framework in AI news

12-Metric Agent Eval Framework From 100+ Deployments Hits Production

12-metric evaluation framework for production AI agents from 100+ deployments targets task success, cost, latency, tool use, and safety.

74% relevant

MM-LLM Framework Boosts Recommendation AUC 0.35%, Online Metrics 0.02%

arXiv paper proposes LLaMA2-based MM-LLM framework for recommendation, achieving 0.35% AUC gain and 0.02% online lift at scale.

85% relevant

DataArc-SynData-Toolkit: Open-Source Framework for Multimodal Synthetic Data

DataArc-SynData-Toolkit is an open-source framework for multimodal synthetic data, aiming to lower technical barriers for LLM training. It features a configuration-driven pipeline with visual interface and modular architecture.

70% relevant

KARL: RL Framework Cuts LLM Hallucinations Without Accuracy Loss

KARL introduces a reinforcement learning framework that dynamically estimates an LLM's knowledge boundary to reward abstention only when appropriate, achieving a superior accuracy-hallucination trade-off on multiple benchmarks without sacrificing correctness.

76% relevant

R³AG: A New Routing Framework That Matches Queries to Retriever

R³AG is a novel routing framework that dynamically selects the optimal retriever for each query in RAG systems, considering not just relevance but also how well the retrieved document helps the generator produce correct answers. It uses contrastive learning to model query-specific preferences, consistently outperforming existing methods on knowledge-intensive tasks.

78% relevant

40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models

A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.

85% relevant

LLM-as-a-Judge Framework Fixes Math Evaluation Failures

Researchers propose an LLM-as-a-judge framework for evaluating math reasoning that beats rule-based symbolic comparison, fixing failures in Lighteval and SimpleRL. This enables more accurate benchmarking of LLM math abilities.

82% relevant

ASPIRE: New Framework Makes Spectral Graph Filters Learnable for

Researchers propose ASPIRE, a bi-level optimization framework that makes spectral graph filters fully learnable for collaborative filtering, solving the 'low-frequency explosion' problem and matching task-specific designs.

90% relevant

VLAF Framework Reveals Widespread Alignment Faking in Language Models

Researchers introduce VLAF, a diagnostic framework that reveals alignment faking is far more common than previously known, affecting models as small as 7B parameters. They also show a single contrastive steering vector can mitigate the behavior with minimal computational overhead.

82% relevant

TACO Framework Cuts Agent Token Overhead 10% via Self-Evolving Compression

Researchers introduced TACO, a framework that enables terminal agents to automatically discover and refine context compression rules from their own interaction trajectories. This approach cuts token overhead by approximately 10% on benchmarks like TerminalBench and SWE-Bench Lite while preserving task accuracy.

87% relevant

A Practical Framework for Moving Enterprise RAG from POC to Production

The article presents a detailed, production-ready framework for building an enterprise RAG system, covering architecture, security, and deployment. It provides a concrete path for companies to move beyond experimental prototypes.

72% relevant

VoteGCL: A Novel LLM-Augmented Framework to Combat Data Sparsity in

A new paper introduces VoteGCL, a framework that uses few-shot LLM prompting and majority voting to create high-confidence synthetic data for graph-based recommendation systems. It integrates this data via graph contrastive learning to improve accuracy and mitigate bias, outperforming existing baselines.

90% relevant

GraphRAG-IRL: A Hybrid Framework for More Robust Personalized Recommendation

Researchers propose GraphRAG-IRL, a hybrid recommendation framework that addresses LLMs' weaknesses as standalone rankers. It uses a knowledge graph and inverse reinforcement learning for robust pre-ranking, then applies persona-guided LLM re-ranking to a shortlist, achieving significant NDCG improvements.

92% relevant

CS3: A New Framework to Boost Two-Tower Recommenders Without Slowing Them Down

Researchers propose CS3, a plug-and-play framework that strengthens the ubiquitous two-tower recommendation architecture. It uses three novel mechanisms to improve model alignment and knowledge transfer, delivering significant revenue gains in a live ad system while maintaining millisecond latency.

100% relevant

CAST: A New Framework for Semantic-Level Complementary Recommendations

Researchers propose CAST, a sequential recommendation framework that models transitions between discrete item semantic codes (e.g., specifications) and injects LLM-verified complementary knowledge. It achieves significant performance gains by moving beyond simplistic co-purchase statistics to capture genuine complementarity.

78% relevant

LLMAR: A Tuning-Free LLM Framework for Recommendation in Sparse

Researchers propose LLMAR, a tuning-free recommendation framework that uses LLM reasoning to infer user 'latent motives' from sparse text-rich data. It outperforms state-of-the-art models in sparse industrial scenarios while keeping inference costs low, offering a practical alternative to costly fine-tuning.

80% relevant

MiniMax Added as Official Provider for OpenClaude AI Framework

MiniMax has been integrated as an officially supported provider for the OpenClaude framework, giving developers a new, enterprise-backed model option for running the open-source Claude alternative.

89% relevant

Andrej Karpathy's LLM-Wiki Framework Solves AI Amnesia with Persistent Knowledge

Andrej Karpathy published a two-page framework called LLM-Wiki that transforms how AI systems handle accumulated knowledge. Instead of retrieving from raw documents each time, the AI compiles sources into its own structured wiki that persists across sessions.

85% relevant

TRACE: A Multi-Agent LLM Framework for Sustainable Tourism Recommendations

A new research paper introduces TRACE, a modular LLM-based framework for conversational travel recommendations. It uses specialized agents to elicit sustainability preferences and generate 'greener' alternatives through interactive explanations, aiming to reduce overtourism and carbon-intensive travel.

92% relevant

FeCoSR: A Federated Framework for Cross-Market Sequential Recommendation

A new arXiv paper introduces FeCoSR, a federated collaboration framework for cross-market sequential recommendation. It tackles data isolation and market heterogeneity by enabling many-to-many collaborative training with a novel loss function, showing advantages over traditional transfer approaches.

82% relevant

MVCrec: A New Multi-View Contrastive Learning Framework for Sequential

Researchers propose MVCrec, a framework that applies multi-view contrastive learning between sequential (ID-based) and graph-based views of user interaction data to improve recommendation accuracy. It outperforms 11 leading models, showing significant gains in key metrics.

84% relevant

New Research Proposes Unified LLM Framework for Need-Driven Service

A new arXiv paper introduces a large language model framework that unifies living need prediction and service recommendation for local life services. It uses behavioral clustering to filter noise and a curriculum learning + RL strategy to navigate complex decision paths. Experiments show it significantly improves both need prediction and recommendation accuracy.

82% relevant

A-R Space Framework Profiles LLM Agent Execution Behavior Across Risk Contexts

Researchers propose the A-R Space, measuring Action Rate and Refusal Signal to profile LLM agent behavior across four risk contexts and three autonomy levels. This provides a deployment-oriented framework for selecting agents based on organizational risk tolerance.

96% relevant

Is Sliding Window All You Need? An Open Framework for Long-Sequence

A new arXiv paper provides a complete, open-source framework for training long-sequence recommender systems using sliding windows. It demonstrates up to +6.34% recall gains on retail data and introduces a novel embedding layer for large vocabularies, making the technique practical for academic and industrial research.

90% relevant

LLM-HYPER: A Training-Free Framework for Cold-Start Ad CTR Prediction

A new arXiv paper introduces LLM-HYPER, a framework that treats large language models as hypernetworks to generate parameters for click-through rate estimators in a training-free manner. It uses multimodal ad content and few-shot prompting to infer feature weights, drastically reducing the cold-start period for new promotional ads and has been deployed on a major U.S. e-commerce platform.

96% relevant

Omar Saro on Multi-User LLM Agents: A New Framework Frontier

AI researcher Omar Saro points out that all current LLM agent frameworks are designed for single-user instruction, creating a deployment barrier for team-based workflows. This identifies a major unsolved problem in making AI agents practically useful in organizations.

75% relevant

ContextSim: A New LLM Framework for Context-Aware Recommender System Simulation

A new arXiv preprint introduces ContextSim, a framework that uses LLM agents to simulate users interacting with recommender systems within realistic daily scenarios (time, location, needs). Experiments show it generates more human-aligned interactions and that RS parameters optimized with it yield improved real-world engagement.

92% relevant

HARPO: A New Agentic Framework for Conversational Recommendation Aims to

A new research paper introduces HARPO, a hierarchical agentic reasoning framework for conversational recommender systems. It reframes recommendation as a structured decision-making process, directly optimizing for interpretable quality dimensions like relevance, diversity, and predicted satisfaction. The approach shows consistent improvements on recommendation-centric metrics across three datasets.

87% relevant

SID-Coord: A New Framework for Balancing Memorization and Generalization

A new arXiv paper introduces SID-Coord, a framework that integrates trainable Semantic IDs (SIDs) with traditional Hashed IDs (HIDs) in ranking models. It aims to solve the memorization-generalization trade-off, improving performance on long-tail items. Online A/B tests in a production short-video search system showed statistically significant improvements in engagement metrics.

84% relevant

Beyond Relevance: A New Framework for Utility-Centric Retrieval in the LLM Era

This tutorial paper posits that the rise of Retrieval-Augmented Generation (RAG) changes the fundamental goal of information retrieval. Instead of finding documents relevant to a query, systems must now retrieve information that is most *useful* to an LLM for generating a high-quality answer. This requires new evaluation frameworks and system designs.

92% relevant