sequence modeling

30 articles about sequence modeling in AI news

IAT: Instance-As-Token Compression for Historical User Sequence Modeling

Researchers propose Instance-As-Token (IAT), which compresses all features of each historical interaction into a unified embedding token, then applies standard sequence modeling. This approach outperforms state-of-the-art methods and has been deployed in e-commerce advertising, shopping mall marketing, and live-streaming e-commerce with substantial business metric improvements.

Apr 13, 2026 · 93% relevant

WeightCaster: How Sequence Modeling in Weight Space Could Solve AI's Extrapolation Problem

Researchers propose WeightCaster, a novel framework that treats out-of-support generalization as a sequence modeling problem in neural network weight space. This approach enables AI models to make plausible, interpretable predictions beyond their training distribution without catastrophic failure.

Feb 17, 2026 · 75% relevant

PRAGMA: Revolut's Foundation Model for Banking Event Sequences

A new research paper introduces PRAGMA, a family of foundation models designed specifically for multi-source banking event sequences. The model uses masked modeling on a large corpus of financial records to create general-purpose embeddings that achieve strong performance on downstream tasks like fraud detection with minimal fine-tuning.

Apr 13, 2026 · 74% relevant

SELLER: A New Sequence-Aware LLM Framework for Explainable Recommendations

Researchers propose SELLER, a framework that uses Large Language Models to generate explanations for recommendations by modeling user behavior sequences. It outperforms prior methods by integrating explanation quality with real-world utility metrics.

Mar 26, 2026 · 92% relevant

New MoE Framework Tames User Interest Shifts in Long-Sequence Recommendations

Researchers propose MoS, a model-agnostic MoE approach that handles long user sequences by detecting session hopping – where user interests shift across sessions. The theme-aware routing mechanism filters irrelevant sessions, while multi-scale fusion captures global and local patterns. Results show SOTA on benchmarks with fewer FLOPs than alternatives.

Apr 24, 2026 · 94% relevant

SPPO: Sequence-Level PPO Cuts RL Training Time 5.9x for Math Reasoning

Researchers introduced SPPO, a sequence-level PPO algorithm that reformulates reasoning as a contextual bandit. It achieves a 5.9x speedup over GRPO while matching performance on AIME, AMC, and MATH benchmarks at 1.5B and 7B scales.

Apr 15, 2026 · 91% relevant

Is Sliding Window All You Need? An Open Framework for Long-Sequence

A new arXiv paper provides a complete, open-source framework for training long-sequence recommender systems using sliding windows. It demonstrates up to +6.34% recall gains on retail data and introduces a novel embedding layer for large vocabularies, making the technique practical for academic and industrial research.

Apr 15, 2026 · 90% relevant
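
The sliding-window idea in the item above can be illustrated with a small sketch. This is not the paper's code: the function name `sliding_windows` and the `window` and `stride` parameters are assumptions chosen for illustration, and the paper's actual windowing rules may differ.

```python
def sliding_windows(sequence, window, stride):
    """Split one long interaction history into fixed-length, possibly overlapping windows.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # A history that already fits in one window is returned unchanged.
    if len(sequence) <= window:
        return [list(sequence)]
    windows = []
    for start in range(0, len(sequence) - window + 1, stride):
        windows.append(list(sequence[start:start + window]))
    # If the stride did not land exactly on the end, add one final window
    # so the most recent interactions are always covered.
    last_start = len(sequence) - window
    if last_start % stride != 0:
        windows.append(list(sequence[last_start:]))
    return windows


if __name__ == "__main__":
    history = list(range(10))  # stand-in for ten item IDs
    for w in sliding_windows(history, window=4, stride=4):
        print(w)
```

Each window can then be fed to a sequence model as an independent training example, which keeps memory use bounded regardless of the full history length.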

Beyond Sequence Generation: The Emergence of Agentic Reinforcement Learning for LLMs

A new survey paper argues that LLM reinforcement learning must evolve beyond narrow sequence generation to embrace true agentic capabilities. The research introduces a comprehensive taxonomy for agentic RL, mapping environments, benchmarks, and frameworks shaping this emerging field.

Mar 7, 2026 · 85% relevant

MIT Open-Sources AI That Turns Photos Into Editable CAD Models

MIT open-sourced an AI that turns photos into editable CAD files, threatening $150/hour modeling work. No benchmarks or training details were disclosed.

May 17, 2026 · 87% relevant

Shopify Engineering details 'Flow generation through natural language'

Shopify Engineering describes a 2026 approach to generating complex workflows (flows) from natural language prompts using an agentic modeling framework, enabling non-technical users to create automation.

Apr 22, 2026 · 98% relevant

New Research Proposes Collaborative Contrastive Network for Generalizable

Researchers propose the Collaborative Contrastive Network (CCN) to solve Trigger-Induced Recommendation challenges in ephemeral e-commerce scenarios like Black Friday. Instead of modeling ambiguous intent, CCN learns context-specific preferences from user-trigger pairs via novel contrastive signals. In online A/B tests on Taobao, CCN increased CTR by 12.3% and order volume by 12.7% in unseen scenarios.

Apr 16, 2026 · 80% relevant

TME-PSR: A New Sequential Recommendation Model Unifies Time

Researchers propose TME-PSR, a model integrating personalized time patterns, multi-interest modeling, and explanation alignment for sequential recommendations. It shows improved accuracy and explanation quality with lower computational cost in experiments.

Apr 13, 2026 · 90% relevant

Tencent Launches 2025 Ad Algorithm Challenge with Massive All-Modality Recommendation Datasets

Tencent has launched an open competition and released two industrial-scale datasets (TencentGR-1M and TencentGR-10M) to advance generative recommender systems. This has spurred related research into debiasing techniques and novel reranking frameworks, moving the field toward more holistic, multi-modal user modeling.

Apr 8, 2026 · 87% relevant

Research Exposes Hidden Data Splitting in Sequential Recommendation Models, Questioning SOTA Claims

Researchers found that sub-sequence splitting (SSS), a data augmentation technique, is widely but covertly used in recent sequential recommendation models. When removed, model performance often plummets, suggesting many published SOTA results are misleading. The study calls for more rigorous and transparent evaluation standards.

Apr 8, 2026 · 82% relevant
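
To make the augmentation in the item above concrete, here is a minimal sketch of one common form of sub-sequence splitting, in which every prefix of a user's history becomes its own training example. The function name `sub_sequence_split` and the `min_len` parameter are hypothetical, and the exact variant examined in the study may differ.

```python
def sub_sequence_split(sequence, min_len=2):
    """Expand one user sequence into (input prefix, next-item target) pairs.

    Illustrative prefix-splitting sketch; not the study's own code.
    """
    samples = []
    for end in range(min_len, len(sequence) + 1):
        prefix = sequence[:end]
        # The last item of each prefix is the prediction target.
        samples.append((list(prefix[:-1]), prefix[-1]))
    return samples


if __name__ == "__main__":
    for inputs, target in sub_sequence_split(["a", "b", "c", "d"]):
        print(inputs, "->", target)
```

A single four-item history yields three training examples here instead of one, which is why quietly applying the technique can inflate results relative to baselines trained without it.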

SLSREC: A New Self-Supervised Model for Disentangling Long- and Short-Term User Interests in Recommendations

A new arXiv preprint introduces SLSREC, a self-supervised model that disentangles long-term user preferences from short-term intentions using contrastive learning and adaptive fusion. It outperforms state-of-the-art models on three benchmark datasets, addressing a core challenge in dynamic user modeling.

Apr 7, 2026 · 88% relevant

New Relative Contrastive Learning Framework Boosts Sequential Recommendation Accuracy by 4.88%

A new arXiv paper introduces Relative Contrastive Learning (RCL) for sequential recommendation. It solves a data scarcity problem in prior methods by using similar user interaction sequences as additional training signals, leading to significant accuracy improvements.

Apr 3, 2026 · 88% relevant

MMM4Rec: A New Multi-Modal Mamba Model for Faster, More Transferable Sequential Recommendations

Researchers propose MMM4Rec, a novel sequential recommendation framework using State Space Duality for efficient multi-modal learning. It claims 10x faster fine-tuning convergence and improved accuracy by dynamically prioritizing key visual/textual information over user interaction sequences.

Mar 30, 2026 · 90% relevant

VISTA: A Novel Two-Stage Framework for Scaling Sequential Recommenders to Lifelong User Histories

Researchers propose VISTA, a two-stage modeling framework that decomposes target attention to scale sequential recommendation to a million-item user history while keeping inference costs fixed. It has been deployed on a platform serving billions.

Mar 27, 2026 · 90% relevant

TimeSqueeze: A New Method for Dynamic Patching in Time Series Forecasting

Researchers introduce TimeSqueeze, a dynamic patching mechanism for Transformer-based time series models. It adaptively segments sequences based on signal complexity, achieving up to 20x faster convergence and 8x higher data efficiency. This addresses a core trade-off between accuracy and computational cost in long-horizon forecasting.

Mar 13, 2026 · 70% relevant

New Research Improves Text-to-3D Motion Retrieval with Interpretable Fine-Grained Alignment

Researchers propose a novel method for retrieving 3D human motion sequences from text descriptions using joint-angle motion images and token-patch interaction. It outperforms state-of-the-art methods on standard benchmarks while offering interpretable correspondences.

Mar 11, 2026 · 75% relevant

Amazon's T-REX: A Transformer Architecture for Next-Basket Grocery Recommendations

Amazon researchers propose T-REX, a transformer-based model for grocery basket recommendations. It addresses unique challenges like repetitive purchases and sparse patterns through category-level modeling and causal masking, showing significant improvements in offline/online tests.

Mar 10, 2026 · 90% relevant

Survey Paper 'The Latent Space' Maps Evolution from Token Generation to Latent Computation in Language Models

Researchers have published a comprehensive survey charting the evolution of language model architectures from token-level autoregression to methods that perform computation in continuous latent spaces. This work provides a unified framework for understanding recent advances in reasoning, planning, and long-context modeling.

Apr 3, 2026 · 85% relevant

MCLMR: A Model-Agnostic Causal Framework for Multi-Behavior Recommendation

Researchers propose MCLMR, a causal learning framework that addresses confounding effects in multi-behavior recommendation systems. It uses adaptive aggregation and bias-aware contrastive learning to improve preference modeling from diverse user interactions like views, clicks, and purchases.

Mar 27, 2026 · 86% relevant

Exploration Space Theory: A Formal Framework for Prerequisite-Aware Recommendation Systems

Researchers propose Exploration Space Theory (EST), a lattice-theoretic framework for modeling prerequisite dependencies in location-based recommendations. It provides structural guarantees and validity certificates for next-step suggestions, with potential applications beyond tourism.

Mar 10, 2026 · 95% relevant

Simple Graph Heuristic Beats Generative Recommenders on 10 of 14 Benchmarks

A no-training graph heuristic beats generative recommenders on 10 of 14 benchmarks, exposing shortcut-solvable datasets. Relative NDCG@10 gains hit 44% on Amazon CDs.

May 11, 2026 · 100% relevant

40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models

A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.

Apr 27, 2026 · 85% relevant

Paper Details Full-Stack MFM Acceleration: Quant, Spec Decode, HW Co-Design

A research paper details a full-stack approach for accelerating multimodal foundation models, combining hierarchy-aware mixed-precision quantization, structural pruning, speculative decoding, model cascading, and a specialized hardware accelerator. Demonstrated on medical and code generation tasks.

Apr 27, 2026 · 72% relevant

New AI Model Decomposes User Behavior into Multiple Spatiotemporal States

Researchers propose ADS-POI, which represents users with multiple parallel latent sub-states evolving at different spatiotemporal scales. This outperforms state-of-the-art on Foursquare and Gowalla benchmarks, offering more robust next-POI recommendations.

Apr 24, 2026 · 95% relevant

CAST: A New Framework for Semantic-Level Complementary Recommendations

Researchers propose CAST, a sequential recommendation framework that models transitions between discrete item semantic codes (e.g., specifications) and injects LLM-verified complementary knowledge. It achieves significant performance gains by moving beyond simplistic co-purchase statistics to capture genuine complementarity.

Apr 22, 2026 · 78% relevant

Layers on Layers — How You Can Improve Your Recommendation Systems

An IBM article critiques monolithic recommendation engines for trying to do too much with one score. It proposes a layered architecture—candidate generation, ranking, and business logic—to improve performance and adaptability. This is a direct, practical framework for engineering teams.

Apr 21, 2026 · 82% relevant
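
The three layers named in the item above can be sketched as separate functions composed into one pipeline. This is an illustrative toy under stated assumptions, not the article's implementation: the function names, the `in_stock` and `score` fields, and the filtering rule are all hypothetical.

```python
def candidate_generation(user_history, catalog, limit=100):
    """Layer 1: cheap recall. Here, every catalog item the user has not seen yet."""
    seen = set(user_history)
    return [item for item in catalog if item["id"] not in seen][:limit]


def ranking(candidates, score_fn):
    """Layer 2: order candidates by a relevance score, highest first."""
    return sorted(candidates, key=score_fn, reverse=True)


def business_logic(ranked, top_k=3):
    """Layer 3: apply rules unrelated to relevance (assumed here: drop out-of-stock items)."""
    return [item for item in ranked if item["in_stock"]][:top_k]


def recommend(user_history, catalog, score_fn, top_k=3):
    """Compose the three layers and return the recommended item IDs."""
    candidates = candidate_generation(user_history, catalog)
    ranked = ranking(candidates, score_fn)
    return [item["id"] for item in business_logic(ranked, top_k=top_k)]


if __name__ == "__main__":
    catalog = [
        {"id": "a", "score": 0.9, "in_stock": True},
        {"id": "b", "score": 0.5, "in_stock": True},
        {"id": "c", "score": 0.8, "in_stock": False},
        {"id": "d", "score": 0.7, "in_stock": True},
    ]
    print(recommend(["a"], catalog, score_fn=lambda item: item["score"], top_k=2))
```

Keeping the layers separate means a stock rule or a new recall source can change without retraining or rewriting the ranker.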
