gentic.news — AI News Intelligence Platform


Pinterest Builds Dedicated Conversion Candidate Generation Model

Products & Launches · Breakthrough · Score: 92

Pinterest details the design and deployment of a dedicated shopping conversion candidate generation model, replacing engagement-based retrieval. Key innovations include a parallel DCN v2 and MLP architecture (+11% recall) and a unified multi-task approach that boosted conversion recall by +42% over their 2023 model.

Source: medium.com via pinterest_engineering (single source)


What Happened

Pinterest's engineering team has published a detailed technical post documenting the development of a dedicated candidate generation model for shopping conversions. The work, led by Machine Learning Engineers Richard Huang, Yu Liu, Ziwei Guo, Andy Mao, and Supeng Ge, addresses a fundamental challenge in ad tech: optimizing for rare but high-value offsite conversion events (checkouts, add-to-cart) rather than the more abundant onsite engagement signals (clicks, saves).

Previously, Pinterest's shopping ads retrieval relied on engagement-based models. While effective for driving interaction, this system was not designed to optimize for lower-funnel conversions. The team launched their first shopping conversion model in 2023, achieving meaningful wins across both conversion and engagement, including a higher clickthrough rate (CTR). Further iterations in 2025 unlocked even stronger conversion value and improved Return on Ad Spend (RoAS) for advertisers.

Technical Details

The core challenge Pinterest faced is one familiar to any platform with offsite conversion tracking: data sparsity, noise, and reporting delay. Unlike clicks that happen in real-time on Pinterest's own servers, conversion events are reported by advertisers, making them significantly sparser and noisier.

Training Data Design

To address sparsity, Pinterest made several key design decisions:

  • Multi-Surface Model: A single model trained across all shopping surfaces (Homefeed, Related Pins, Search) to avoid fragmenting sparse conversion labels, with surface-specific features to learn contextual differences.
  • Dual Positive Signals: Supplementing primary conversion signals with onsite engagement data (clicks, repins). To mitigate click data noise, they apply a log-based re-weighting function based on click duration.
  • Negative Sampling: Using ad impressions with no engagement as "harder negatives" to expose the model to a more representative inventory distribution.
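The three data-design decisions above can be sketched in a few lines. The sketch below is illustrative only: Pinterest has not published its exact re-weighting curve or sampling ratios, so `click_weight`, the 30-second reference dwell, and `neg_ratio` are all assumptions made for the example.

```python
import math
import random

def click_weight(dwell_seconds, ref_seconds=30.0):
    """Down-weight short (likely accidental) clicks with a log curve.
    A click at `ref_seconds` of dwell gets weight ~1.0; shorter dwells
    get proportionally less. The exact curve Pinterest uses is unpublished."""
    return min(1.0, math.log1p(dwell_seconds) / math.log1p(ref_seconds))

def build_training_rows(conversions, clicks, impressions, neg_ratio=4):
    """Combine dual positive signals with sampled 'harder negatives'."""
    rows = []
    # Primary positives: offsite conversions (checkout, add-to-cart) at full weight.
    rows += [(user, pin, 1.0, 1.0) for user, pin in conversions]
    # Secondary positives: onsite clicks, re-weighted by dwell time.
    rows += [(user, pin, 1.0, click_weight(dwell)) for user, pin, dwell in clicks]
    # Harder negatives: served-but-unengaged impressions, not random pins,
    # so the model sees a more representative inventory distribution.
    n_neg = neg_ratio * len(rows)
    sampled = random.sample(impressions, min(n_neg, len(impressions)))
    rows += [(user, pin, 0.0, 1.0) for user, pin in sampled]
    return rows  # (user, pin, label, sample_weight)
```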

Model Architecture: Parallel DCN v2 and MLP

The most significant architectural innovation is a parallel cross-layer design. Early iterations used a stacked (sequential) architecture where a DCN v2 cross network processed input first, feeding its output into an MLP. The team hypothesized this created an information bottleneck.

Their solution: a parallel architecture where both the DCN v2 cross network and a 3-layer MLP learn directly and simultaneously from the same input features. The cross network captures explicit feature interactions by referencing the original input at every layer, while the MLP learns implicit abstract patterns in parallel. This design delivered a +11% gain in offline recall@1000 for the conversion task and was subsequently adopted by all of Pinterest's production engagement retrieval models.
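The parallel layout can be sketched as follows. This is a minimal numpy forward pass, not Pinterest's implementation: layer sizes, depths, and the random stand-in weights are illustrative assumptions. The cross layer follows the published DCN v2 formulation, x_{l+1} = x_0 ⊙ (W_l x_l + b_l) + x_l, and the key point is that both streams read the raw input and their outputs are concatenated rather than chained.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 16, 32  # input dim, MLP hidden dim (illustrative sizes)

def cross_layer(x0, x, W, b):
    """DCN v2 cross layer: explicit interactions against the *original* input."""
    return x0 * (x @ W.T + b) + x

def relu(x):
    return np.maximum(x, 0.0)

def parallel_tower(x0, cross_params, mlp_params):
    """Parallel design: cross network and MLP each read the raw features;
    their outputs are concatenated, avoiding the sequential bottleneck."""
    x = x0
    for W, b in cross_params:   # explicit feature interactions
        x = cross_layer(x0, x, W, b)
    m = x0
    for W, b in mlp_params:     # implicit abstract patterns, in parallel
        m = relu(m @ W.T + b)
    return np.concatenate([x, m], axis=-1)

# Illustrative parameters: 3 cross layers and a 3-layer MLP.
cross_params = [(rng.normal(size=(d, d)) * 0.1, np.zeros(d)) for _ in range(3)]
dims = [d, h, h, h]
mlp_params = [(rng.normal(size=(dims[i + 1], dims[i])) * 0.1, np.zeros(dims[i + 1]))
              for i in range(3)]

emb = parallel_tower(rng.normal(size=d), cross_params, mlp_params)  # shape (d + h,)
```

A stacked variant would instead feed the cross network's output into the MLP; the parallel version keeps both views of the input intact until the final concatenation.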

From Multi-Head to Unified Multi-Task

The 2023 model used a multi-head architecture with shared encoders followed by separate engagement and conversion heads. Through analysis, the team identified sparsity and noise in conversion labels as a key bottleneck. Their 2025 iteration moved to a unified single-head multi-task architecture, merging the two heads so the final embeddings directly benefit from multi-task optimization during serving.

Additionally, they introduced an advertiser-level loss function as an additional training objective, enabling the model to capture conversion signals at a more stable granularity than noisy Pin-level supervision. Combined with other improvements, this achieved an average +42% increase in recall@100 for conversion tasks compared to the 2023 model.
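One way to picture the unified objective is a Pin-level loss plus an advertiser-level term computed on aggregated predictions. The sketch below is an assumption-laden illustration: Pinterest does not disclose its loss formulation, so the squared-error advertiser term and the `adv_weight` mixing factor are hypothetical stand-ins for the idea of supervising at a more stable granularity.

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Elementwise binary cross-entropy."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def unified_loss(preds, labels, advertiser_ids, adv_weight=0.3):
    """Single-head multi-task loss: Pin-level BCE plus an advertiser-level
    term comparing mean predicted vs. observed conversion rate per advertiser,
    a coarser and more stable signal than noisy per-Pin labels.
    `adv_weight` is an illustrative mixing weight, not Pinterest's value."""
    pin_loss = bce(preds, labels).mean()
    adv_terms = []
    for a in np.unique(advertiser_ids):
        mask = advertiser_ids == a
        adv_terms.append((preds[mask].mean() - labels[mask].mean()) ** 2)
    return pin_loss + adv_weight * float(np.mean(adv_terms))
```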

Retail & Luxury Implications

While Pinterest's post is technically focused on ad tech, the architectural innovations have direct relevance for retail and luxury e-commerce platforms:

1. Solving the Sparse Signal Problem

Luxury retailers face an extreme version of Pinterest's challenge. High-value purchases are rare events — a customer might browse for weeks before a single conversion. Models trained primarily on engagement signals (page views, time spent) optimize for browsing, not buying. Pinterest's approach of supplementing sparse conversion labels with re-weighted engagement signals offers a template for luxury e-commerce platforms that struggle to train effective purchase-intent models.

2. Multi-Surface Personalization

Luxury brands increasingly operate across multiple surfaces: website, app, email, social, and in-store. Pinterest's multi-surface model architecture — training one model across contexts with surface-specific features — provides a blueprint for unified personalization without fragmenting data across channels.

3. The Parallel Architecture Pattern

For retail AI teams building recommendation or search systems, Pinterest's parallel DCN v2 + MLP architecture is a concrete, validated design pattern. The finding that sequential architectures create information bottlenecks is a cautionary note for teams using stacked cross networks. The +11% recall improvement from this architectural change alone is substantial and directly applicable to product retrieval at any scale.

4. Advertiser-Level Granularity

The introduction of an advertiser-level loss function is particularly relevant for retail platforms that serve multiple brands. Pinterest found that Pin-level conversion data had too much variance to be reliable; aggregating to a higher level (advertiser) provided more stable supervision. For luxury marketplaces or multi-brand retailers, this suggests that brand-level or category-level training signals may outperform item-level signals for purchase prediction.

Implementation Approach

For a retail or luxury brand considering a similar approach:

Technical Requirements

  • Infrastructure: Two-tower retrieval architecture (standard for large-scale recommendation systems)
  • Data Pipeline: Ability to join onsite engagement data with offsite conversion data (requires pixel/server-side tracking)
  • Feature Engineering: User-side features (context, historical embeddings via Transformer) and item-side features (multi-modal embeddings, performance features)
  • Model Complexity: DCN v2 cross layers + parallel MLP — moderate complexity, well-documented in TensorFlow/Keras
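For orientation, the two-tower retrieval step itself is simple: score every item embedding against the user embedding and keep the top k. The sketch below uses random vectors as stand-ins for trained tower outputs, and exact `argpartition` in place of the approximate nearest-neighbor index a production system would use.

```python
import numpy as np

rng = np.random.default_rng(7)
dim, n_items = 32, 10_000

# Random projections stand in for the learned user/item encoders.
user_emb = rng.normal(size=dim)
user_emb /= np.linalg.norm(user_emb)
item_embs = rng.normal(size=(n_items, dim))
item_embs /= np.linalg.norm(item_embs, axis=1, keepdims=True)

def retrieve_top_k(user, items, k=1000):
    """Two-tower retrieval: score = dot product, then take top-k.
    At production scale this becomes an ANN lookup; exact top-k is fine here."""
    scores = items @ user
    top = np.argpartition(-scores, k)[:k]
    return top[np.argsort(-scores[top])]  # candidate indices, best-first

candidates = retrieve_top_k(user_emb, item_embs, k=1000)
```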

Effort Estimate

  • Data Engineering: 2-4 weeks to establish reliable conversion tracking pipeline
  • Model Development: 4-8 weeks for initial implementation and training
  • Online Testing: 4-8 weeks for A/B experimentation (conversion events are sparse, requiring longer test windows)

Key Risks

  • Data Sparsity: Luxury brands with low transaction volumes may not have enough conversion data
  • Label Delay: Offsite conversions can take days to report, complicating real-time optimization
  • Cold Start: New products or brands have no conversion history

Governance & Risk Assessment

  • Data Privacy: Pinterest's approach relies on advertiser-reported conversion data. In Europe, this requires clear consent mechanisms and data processing agreements under GDPR. Luxury brands must ensure customer purchase data is handled with appropriate anonymization and consent.
  • Bias Risk: Models optimized for conversions may disproportionately surface lower-priced, higher-velocity items, potentially undermining luxury brand positioning. Pinterest's advertiser-level loss function partially addresses this by smoothing across a brand's entire catalog.
  • Maturity Level: This is production-proven technology at Pinterest scale (600M+ MAUs). The architectural patterns are mature and well-documented, though the specific application to luxury retail would require adaptation.

gentic.news Analysis

Pinterest's post is a masterclass in pragmatic ML engineering for the ad tech space, but its relevance extends well beyond advertising. The core problem — optimizing for rare, high-value events in the presence of abundant but noisy proxy signals — is the central challenge for AI in luxury retail.

This follows a pattern we've observed across multiple platforms. As we covered in our analysis of [prior article on recommendation systems], the industry is moving from engagement optimization to conversion optimization. Pinterest's work validates that this transition requires fundamentally different architectures, not just different loss functions.

The parallel DCN v2 + MLP architecture is particularly noteworthy. While sequential cross-and-deep networks have been standard since Google's DCN paper, Pinterest's finding that sequential processing creates an information bottleneck is a genuine contribution. The +11% recall improvement from this change alone, validated across all engagement retrieval models, suggests many teams may be leaving performance on the table with their current architectures.

The advertiser-level loss function also addresses a tension in luxury retail: individual item signals are noisy, but brand-level signals may be too coarse. Pinterest's solution — using both levels with different loss weights — offers a practical middle ground.

For luxury brands, the key takeaway is that conversion optimization is technically achievable, but requires deliberate architectural choices. The days of using engagement models as proxies for purchase intent are numbered. Pinterest has provided a clear roadmap for the transition.


AI Analysis

**For AI Practitioners in Retail/Luxury:** Pinterest's post is unusually valuable because it shares both the successes and the failures. The admission that their 2023 multi-head architecture underperformed due to label sparsity, and the specific architectural change (unified single-head) that fixed it, is the kind of practical insight rarely found in research papers.

**Architecture Takeaway:** The parallel DCN v2 + MLP design is immediately actionable. Any team using two-tower retrieval with cross networks should test this change. The information bottleneck problem is real, and Pinterest's solution is simple to implement (concatenate the outputs of two parallel streams rather than feeding one into the other).

**Data Strategy Takeaway:** Pinterest's approach to handling sparse conversion data — multi-surface training, re-weighted engagement supplements, advertiser-level loss — is a complete toolkit for any platform with sparse purchase data. Luxury brands with low transaction volumes should pay particular attention to the advertiser-level loss function, which stabilizes training when item-level signals are too noisy.

**Maturity Assessment:** This is production-grade work at massive scale (600M+ users). The techniques are battle-tested. However, the specific implementation details (feature engineering choices, loss weight tuning) are likely Pinterest-specific. Teams should treat this as a validated architectural pattern, not a drop-in solution.
