gentic.news — AI News Intelligence Platform
model scaling

30 articles about model scaling in AI news

UniScale: A Co-Design Framework for Data and Model Scaling in E-commerce Search Ranking

Researchers propose UniScale, a framework that jointly optimizes data collection and model architecture for search ranking, moving beyond just scaling model parameters. It addresses diminishing returns from parameter scaling alone by creating a synergistic system for high-quality data and specialized modeling. This approach, validated on a large-scale e-commerce platform, shows significant gains in key business metrics.

95% relevant

Gamma 31B Model Reportedly Outperforms Qwen 3.5 397B, Highlighting Efficiency Leap

A developer's social media post claims the Gamma 31B model outperforms the much larger Qwen 3.5 397B. If verified, this would represent a dramatic efficiency gain in large language model scaling.

85% relevant

Qwen's Tiny Titan: How a 2B Parameter Multimodal Model Challenges AI Scaling Assumptions

Alibaba's Qwen team has released Qwen2-VL-2B, a surprisingly capable 2-billion parameter multimodal model with native 262K context length, extensible to 1M tokens. This compact model challenges assumptions about AI scaling while offering practical long-context capabilities for resource-constrained environments.

95% relevant

Beyond Better Models: The Compute Scaling Revolution Driving AI's Next Leap

New analysis reveals that scaling compute infrastructure may deliver 10× annual efficiency gains in AI development, surpassing algorithmic improvements alone. The real leverage comes from combining innovative ideas with massive computational resources.

85% relevant

LoopCTR: A New 'Loop Scaling' Paradigm for Efficient

A new research paper introduces LoopCTR, a method for scaling Transformer-based CTR models by recursively reusing shared layers during training. This 'train-multi-loop, infer-zero-loop' approach achieves state-of-the-art performance with lower deployment costs, directly addressing a core industrial constraint in recommendation systems.

92% relevant

Scaling Law Plateau Not Universal: More Tokens Boost Reasoning AI Performance

Empirical evidence indicates the 'second scaling law'—performance gains from increased computation—does not fully plateau for many reasoning tasks. Benchmark results may be artificially limited by token budgets, not model capability.

85% relevant

Lloyds Banking Group Details 'Atlas' ML Platform for Scaling AI in a

A technical blog post details how Lloyds Banking Group rebuilt its internal Machine Learning platform, Atlas, on a cloud-native architecture to overcome scaling limits and meet stringent regulatory requirements. This is a blueprint for operationalizing AI in high-stakes, governed industries.

88% relevant

UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems

A new arXiv paper introduces UniMixer, a unified scaling architecture for recommender systems. It bridges attention-based, TokenMixer-based, and factorization-machine-based methods into a single theoretical framework, aiming to improve parameter efficiency and scaling return on investment (ROI).

96% relevant

Roman Yampolskiy: 'AGI is a Question of Cost, Not Time' as Scaling Laws Hold

AI safety researcher Roman Yampolskiy argues that achieving AGI is now a matter of computational and financial resources, not theoretical possibility, citing the continued validity of scaling laws and early signs of recursive self-improvement.

87% relevant

Robotics' Scaling Breakthrough: How SONIC's 42M-Parameter Model Achieves Perfect Real-World Transfer

Researchers have demonstrated that robotics can scale like language models, with SONIC training a 42M-parameter model on 100M human motion frames. The system achieved 100% success transferring to real robots without fine-tuning, marking a paradigm shift in robotic learning.

95% relevant

OpenAI Readies General-Purpose LLM With Test-Time Compute Scaling

OpenAI is releasing a general-purpose LLM that improves with test-time compute, per an internal message. The model shows math gains without specialized training.

85% relevant

Cerebras IPO Challenges GPU Scaling Orthodoxy

Cerebras filed for IPO on April 21, betting wafer-scale chips can disrupt Nvidia's GPU cluster model for AI workloads.

98% relevant

VISTA: A Novel Two-Stage Framework for Scaling Sequential Recommenders to Lifelong User Histories

Researchers propose VISTA, a two-stage modeling framework that decomposes target attention to scale sequential recommendation to a million-item user history while keeping inference costs fixed. It has been deployed on a platform serving billions.

90% relevant

NVIDIA's Nemotron-Terminal: A Systematic Pipeline for Scaling Terminal-Based AI Agents

NVIDIA researchers introduce Nemotron-Terminal, a comprehensive data engineering pipeline designed to scale terminal-based large language model agents. The system bridges the gap between raw terminal data and high-quality training datasets, addressing key challenges in agent reliability and generalization.

85% relevant

Agent Harness Scaling: EFC Predicts Success at R² 0.99 vs 0.42

New research introduces Effective Feedback Compute (EFC), which predicts agent success with R² = 0.99, compared with 0.42 for raw token counts. Reallocating compute according to EFC lifts success 3x at the same budget.

88% relevant

Stateless Memory for Enterprise AI Agents: Scaling Without State

The paper replaces stateful agent memory with immutable decision logs using event-sourcing, allowing thousands of concurrent agent instances to scale horizontally without state bottlenecks.

85% relevant
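
To make the event-sourcing idea in the item above concrete, here is a minimal sketch, not the paper's implementation. The `Decision` schema and the `append` and `replay` helpers are hypothetical names chosen for illustration. The point it shows is that an agent instance holds no mutable memory of its own: it rebuilds its view by replaying an append-only log.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Decision:
    """One immutable log entry (hypothetical schema, not from the paper)."""
    seq: int
    key: str
    value: str


def append(log: tuple[Decision, ...], key: str, value: str) -> tuple[Decision, ...]:
    """Return a new log with one more entry; the input log is never mutated."""
    return log + (Decision(seq=len(log), key=key, value=value),)


def replay(log: Iterable[Decision]) -> dict[str, str]:
    """Rebuild the current view by folding over the log in sequence order."""
    state: dict[str, str] = {}
    for event in sorted(log, key=lambda e: e.seq):
        state[event.key] = event.value  # later decisions override earlier ones
    return state


log: tuple[Decision, ...] = ()
log = append(log, "ticket-42", "escalate")
log = append(log, "ticket-42", "resolved")
log = append(log, "ticket-7", "refund")

# Any instance can derive the same view from the shared log.
view = replay(log)
```

Because `replay` is a pure function of the log, any number of instances can compute the same view independently, which is the property that lets them scale horizontally.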

OpenReward Launches: A Minimalist Service for Scaling RL Environment Serving

OpenReward, a new product from Ross Taylor, launches as a focused service for serving reinforcement learning environments at scale. It aims to solve infrastructure bottlenecks for RL training pipelines.

85% relevant

Enterprise AI Goes Mainstream: How Major Corporations Are Scaling Operations with Intelligent Voice Systems

Major corporations including FedEx, Marriott, and Volkswagen are deploying advanced AI voice systems to handle millions of customer interactions, enabling instant scalability during peak demand periods without traditional hiring constraints.

85% relevant

Moonshot AI Ships Trillion-Parameter Open Model, Matches Claude Opus on Coding

Moonshot AI released a trillion-parameter open-source model that reportedly matches Anthropic's Claude Opus on most coding benchmarks. The release came the same day Anthropic committed $25B to AWS for compute, highlighting divergent AI scaling strategies.

100% relevant

Grok 4.20 at 0.5T Params, 1.5T Model in 5 Weeks

xAI's Grok 4.20 is reportedly a 0.5 trillion parameter model. The company plans to release a 1.5 trillion parameter version within 4-5 weeks, signaling rapid scaling.

85% relevant

daVinci-LLM 3B Model Matches 7B Performance, Fully Open-Sourced

The daVinci-LLM team has open-sourced a 3 billion parameter model trained on 8 trillion tokens. Its performance matches typical 7B models, challenging the scaling law focus on parameter count.

95% relevant

Frontier AI Models Reportedly Score Below 1% on ARC-AGI v3 Benchmark

A social media post claims frontier AI models score below 1% on the ARC-AGI v3 benchmark, suggesting that current scaling approaches may be hitting a limit. No specific models or scores were disclosed.

87% relevant

The Two-Year AI Leap: How Model Efficiency Is Accelerating Beyond Moore's Law

A viral comparison reveals AI models achieving dramatically better results with identical parameter counts in just two years, suggesting efficiency improvements are outpacing hardware scaling. This development challenges assumptions about AI progress and has significant implications for deployment costs and capabilities.

85% relevant

RxnNano: How a Tiny AI Model Outperforms Giants in Chemical Discovery

Researchers have developed RxnNano, a compact 0.5B-parameter AI model that outperforms models ten times larger in predicting chemical reactions. Using innovative training techniques that prioritize chemical understanding over brute-force scaling, it achieves 23.5% better accuracy on key benchmarks for drug discovery applications.

75% relevant

Beyond the Benchmark: New Model Separates AI Hype from True Capability

A new 'structured capabilities model' addresses a critical flaw in AI evaluation: benchmarks often confuse model size with genuine skill. By combining scaling laws with latent factor analysis, it offers the first method to extract interpretable, generalizable capabilities from LLM test results.

72% relevant

GPT-5.4 nano + critic loop hits 76.4% on SWE-Bench Verified

GPT-5.4 nano with critic-comparator loop scored 76.4% on SWE-Bench Verified, matching larger models without parameter scaling. The efficiency gain underscores the shift toward inference-time optimization.

85% relevant

OpenAI's 'Freebird' Data Center in Texas to Span 549K Sq Ft, Cost $470M

OpenAI is building a massive 548,950-square-foot data center in Milam, Texas, named 'Freebird,' with a first-phase cost of around $470 million. This infrastructure investment is critical for scaling next-generation AI model training and inference.

92% relevant

Pinterest Details 'Request-Level Deduplication' to Scale Massive

Pinterest's engineering team published a detailed technical breakdown of 'request-level deduplication'—a family of techniques that eliminate redundant processing of user data across thousands of candidate items in their recommendation system. This approach was critical to scaling their Foundation Model by 100x while controlling infrastructure costs.

98% relevant
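
As a rough illustration of the request-level deduplication described in the item above, the sketch below computes the user-side representation once per request and reuses it for every candidate. It is a toy under stated assumptions, not Pinterest's implementation: `encode_user`, `score`, `rank_naive`, and `rank_deduplicated` are hypothetical stand-ins.

```python
from typing import Sequence

# Counts how often the expensive user-side step runs.
calls = {"encode_user": 0}


def encode_user(user_history: Sequence[int]) -> float:
    """Stand-in for an expensive user-side computation (hypothetical)."""
    calls["encode_user"] += 1
    return sum(user_history) / len(user_history)


def score(user_repr: float, candidate: int) -> float:
    """Stand-in for the cheap per-candidate computation (hypothetical)."""
    return -abs(user_repr - candidate)


def rank_naive(user_history: Sequence[int], candidates: Sequence[int]) -> list[int]:
    """Re-encodes the same user data for every candidate."""
    return sorted(candidates, key=lambda c: score(encode_user(user_history), c), reverse=True)


def rank_deduplicated(user_history: Sequence[int], candidates: Sequence[int]) -> list[int]:
    """Encodes the user once per request and shares the result across candidates."""
    user_repr = encode_user(user_history)
    return sorted(candidates, key=lambda c: score(user_repr, c), reverse=True)


history = [2, 4, 6]
candidates = [1, 4, 9, 5]

naive_ranking = rank_naive(history, candidates)
naive_calls = calls["encode_user"]

calls["encode_user"] = 0
dedup_ranking = rank_deduplicated(history, candidates)
dedup_calls = calls["encode_user"]
```

Both functions return the same ranking. The difference is that the naive version runs the user-side step once per candidate, while the deduplicated version runs it once per request.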

Anthropic's Run Rate Hits $3.4B, Doubling in Six Months

Anthropic's annualized revenue run rate has reportedly reached $3.4 billion, doubling from ~$1.7B six months ago. The company is scaling enterprise deployments of its Claude models at a staggering pace.

81% relevant

Anthropic Warns Upcoming LLMs Could Cause 'Serious Damage'

Anthropic has issued a stark warning that its upcoming large language models could cause 'serious damage.' The company states there is 'no end in sight' to capability scaling and proliferation risks.

85% relevant