MVCrec: A New Multi-View Contrastive Learning Framework for Sequential

Researchers propose MVCrec, a framework that applies multi-view contrastive learning between sequential (ID-based) and graph-based views of user interaction data to improve recommendation accuracy. It outperforms 11 leading models, showing significant gains in key metrics.

AAAla SMITH & AI Research Desk·Apr 16, 2026·5 min read··152 views·AI-Generated·Report error

Source: arxiv.orgvia arxiv_irCorroborated

TL;DR

A new AI model for sequential recommendation fuses ID and graph views with contrastive learning, achieving up to a 14.4% performance gain over state-of-the-art baselines.

Key Takeaways

Researchers propose MVCrec, a framework that applies multi-view contrastive learning between sequential (ID-based) and graph-based views of user interaction data to improve recommendation accuracy.
It outperforms 11 leading models, showing significant gains in key metrics.

What Happened

A new research paper, "ID and Graph View Contrastive Learning with Multi-View Attention Fusion for Sequential Recommendation," was posted to the arXiv preprint server. The work introduces a novel framework called Multi-View Contrastive learning for sequential recommendation (MVCrec). Its core innovation is the joint optimization of two complementary perspectives on user interaction data: the sequential (ID-based) view and the graph-based view.

Sequential recommendation is a cornerstone of modern e-commerce, aiming to predict a user's next likely interaction (e.g., purchase, click) based on their historical sequence of actions. While recent models have used either contrastive learning (to create robust representations) or graph neural networks (to model relational structure), MVCrec is one of the first to deeply integrate both through a multi-view contrastive approach, specifically designed for settings with only basic interaction data.

Technical Details

The MVCrec framework is built on a key insight: the ID-based sequential view and the graph-based relational view offer different, complementary signals about user preferences. The ID view treats each item as a unique token, learning patterns in the sequence order. The graph view constructs a graph where items are nodes, and connections are based on co-occurrence in sequences, capturing broader relational patterns (e.g., "users who viewed this also viewed that").

The model employs three contrastive learning objectives:

Intra-sequential contrastive loss: Augments the raw interaction sequence (e.g., via masking or cropping) to learn robust representations within the sequential view.
Intra-graph contrastive loss: Applies graph augmentation techniques (e.g., edge dropping) to learn invariant representations within the graph structure.
Cross-view contrastive loss: This is the crucial innovation. It aligns the representations of the same user or item from the sequential and graph views, ensuring they are consistent and mutually informative.

Finally, a multi-view attention fusion module dynamically combines the learned representations from both views. It uses a combination of global attention (to capture overall preference) and local attention (to focus on recent interactions) to generate the final prediction score for a user-item pair.

The authors validated MVCrec on five public benchmark datasets (Amazon-Beauty, Amazon-Sports, Yelp, MovieLens-1M, and Gowalla). It was tested against 11 state-of-the-art baselines, including models like SASRec, BERT4Rec, S3-Rec, and LightGCN. The results are compelling: MVCrec achieved improvements of up to 14.44% in NDCG@10 and 9.22% in HitRatio@10 over the strongest baseline. The code and datasets have been made publicly available on GitHub.

Retail & Luxury Implications

This research is directly applicable to the core operational challenge of personalization in retail and luxury. The performance gains demonstrated by MVCrec translate to a more accurate, next-best-offer engine that can operate using a brand's existing first-party interaction data—clicks, views, purchases—without requiring rich, hard-to-obtain auxiliary data like detailed product attributes or user demographics.

Figure 3: Performance with different batch sizes under NDCG@20. The results from Reddit have been divided by 10 to ensur

For a luxury brand, the implications are significant:

Post-Session Engagement: Predicting the next item a high-intent browser might desire after viewing a handbag, based on the sequential patterns of similar clients and the broader "style graph" of the collection.
Email & Push Notification Curation: Dynamically generating personalized product sequences for re-engagement campaigns that feel coherent and stylistically relevant, not just based on simple co-view statistics.
Enhancing In-Session Discovery: On a product detail page, the "Complete the Look" or "You May Also Like" sections could be powered by this fused ID-graph logic, suggesting items that are both sequentially probable and graph-theoretically complementary.
Cold-Start Mitigation: The graph component can help position a new season's item within the existing relational web of products, providing a better starting point for recommendations even with zero purchase history.

The framework's strength in "settings where only interaction data is available" is particularly relevant for luxury houses prioritizing data privacy and working with constrained, but high-quality, purchase histories.

Implementation Approach

Adopting MVCrec is a non-trivial engineering undertaking, suitable for brands with mature data science and MLOps teams. The prerequisite is a robust data pipeline that can transform raw event logs (user ID, item ID, timestamp, event type) into two parallel data structures: chronologically ordered user sequences and a global item-item interaction graph.

Figure 1: Our proposed framework, MVCrec, consists of multi-view contrastive learning and multi-view attention fusion mo

The model itself, as described, requires expertise in PyTorch or TensorFlow, contrastive learning, and graph neural networks (GNNs). The training process involves the simultaneous optimization of the three contrastive losses and the final recommendation loss, which demands careful hyperparameter tuning and significant computational resources for training on large-scale interaction graphs. However, the inference step—generating recommendations for a user—would be comparable in latency to other deep learning-based recommenders once the model is deployed and the user/item embeddings are pre-computed.

For most luxury brands, the path to value would not be a from-scratch implementation of the academic paper. Instead, the strategic approach is to pressure-test existing recommendation vendors on whether their architectures incorporate similar multi-view or contrastive learning principles. Internally, the research validates a direction: investment in teams and platforms capable of building and maintaining graph-based representations alongside sequential models is a credible path to superior personalization.

Source: gentic.news · Apr 16, 2026 · author=Ala SMITH · citation.json

AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

Following this story?

Get a weekly digest with AI predictions, trends, and analysis — free.

AI Analysis

For AI leaders in retail and luxury, MVCrec is a signal, not a sudden solution. It reinforces a clear trend in recommendation research: the move beyond single-model paradigms towards **multi-modal or multi-view fusion systems** that combine different data perspectives. This follows a week of intense activity on arXiv for recommender systems, including papers on long-sequence modeling and LLM-based hypernetworks for cold-start scenarios, indicating the field's rapid evolution. This research aligns with the broader industry shift we noted in our coverage of **HARPO**, an agentic framework for conversational recommendation. Both approaches seek to create a more holistic, context-aware understanding of user intent, though through different architectural means. The performance gains claimed by MVCrec (9-14%) are substantial in the realm of academic benchmarks. In a production retail environment, even a fraction of that gain in metrics like conversion rate or average order value would represent a major commercial impact. However, the gap between a well-executed academic proof-of-concept on curated datasets and a stable, scalable production system is wide. The key challenge for luxury brands will be adapting such a model to their unique data characteristics—sparse but high-value purchase sequences, the importance of seasonality and collection cycles, and the need for explainability in high-touch clienteling scenarios. The framework provides a compelling blueprint, but its real-world utility will depend on a brand's ability to execute the complex data engineering and model integration required. This paper also sits at an interesting intersection of trends highlighted in our Knowledge Graph. While not directly using **Retrieval-Augmented Generation (RAG)**, MVCrec's philosophy of fusing different data "views" is conceptually similar to RAG's fusion of parametric knowledge with retrieved context. As Ethan Mollick recently declared the end of the 'RAG era' as the dominant paradigm for agents, we see its principles being absorbed and specialized into core domains like recommendation—a sign of a maturing technology landscape.

#recommender systems #machine learning #ai research

Compare side-by-side

MVCrec vs Deep Interest Network

→

Mentioned in this article

MVCrec Deep Interest Network arXiv sequential recommendation

Enjoyed this article?