gentic.news — AI News Intelligence Platform
research policy

30 articles about research policy in AI news

Tsinghua Researchers Diagnose On-Policy Distillation Failures, Propose Fixes

Researchers from Tsinghua University have pinpointed two necessary conditions for successful on-policy distillation: compatible thinking patterns and novel teacher capabilities. They propose two recovery methods to salvage failing distillation runs.

85% relevant

The Digital Twin Revolution: How LLMs Are Creating Virtual Testbeds for Social Media Policy

Researchers have developed an LLM-augmented digital twin system that simulates short-video platforms like TikTok to test policy changes before implementation. Its four-twin architecture lets platforms study the long-term effects of AI tools and content policies in realistic closed-loop simulations.

79% relevant

One Policy to Rule Them All: AI Robot Masters Unseen Tools with Zero-Shot Generalization

Researchers have developed a single robot policy capable of manipulating diverse, never-before-seen tools using sim-to-real reinforcement learning. The system achieves zero-shot generalization across 24 tasks, 12 objects, and 6 tool categories without object-specific training.

85% relevant

OpenAI Publishes 'Intelligence Age' Policy Blueprint for Superintelligence Transition

OpenAI published a policy blueprint outlining governance and economic proposals for the 'Intelligence Age,' framing superintelligence as an active transition requiring new safety nets and international coordination.

97% relevant

Anthropic Forms Corporate PAC to Influence AI Policy Ahead of Midterms

Anthropic is forming a corporate PAC to lobby on AI policy, signaling a strategic shift towards direct political engagement as regulatory debates intensify in Washington. This move follows similar efforts by OpenAI and Google.

85% relevant

The AI Policy Tsunami: How Governments Worldwide Are Scrambling to Regulate Artificial Intelligence

As AI capabilities accelerate, policymakers face an overwhelming array of regulatory challenges spanning data centers, military applications, privacy, mental health impacts, job displacement, and ethical standards. The rapid pace of development is creating a governance gap that neither governments nor AI labs can adequately address.

85% relevant

The AI Policy Gap: Why Governments Are Struggling to Keep Pace with Rapid Technological Change

AI expert Ethan Mollick warns that rapid AI advancements combined with knowledge gaps and uncertain futures are leading to reactive, scattered policy responses rather than coherent governance frameworks.

85% relevant

Sam Altman Advocates for 32-Hour Work Week in AI-Driven Policy Paper

Sam Altman has proposed a 4-day, 32-hour work week as part of a new social contract, reflecting a growing trend among executives to advocate for reduced working hours in the age of AI.

75% relevant

Google's Cookie Policy Update and the Challenge of AI-Powered Personalization

Google has updated its user-facing cookie and data consent interface, emphasizing its use of data for personalization and ad measurement. This reflects the ongoing tension between data-driven AI services and user privacy, a critical issue for luxury retail's digital transformation.

82% relevant

ChatGPT's Android App Hints at Future 'Naughty Chats' Feature, Signaling a Potential Shift in AI Content Policy

A recent update to the ChatGPT Android app includes code referencing 'Naughty chats,' suggesting OpenAI may be developing an adult-themed, 18+ mode. This discovery hints at a potential strategic expansion into less restricted conversational AI.

85% relevant

RLSD Unifies Self-Distillation & Verifiable Rewards to Fix RL Leakage

Researchers propose RLSD, a method merging on-policy self-distillation with verifiable rewards to fix information leakage and training instability in language model reinforcement learning.

85% relevant

The Self Driving Portfolio: Agentic Architecture for Institutional Asset Management

Researchers propose an 'agentic strategic asset allocation pipeline' using ~50 specialized AI agents to forecast markets, construct portfolios, and self-improve. The system is governed by a traditional Investment Policy Statement, aiming to automate high-level asset management.

88% relevant

CanViT: First Active-Vision Foundation Model Hits 45.9% mIoU on ADE20K with Sequential Glimpses

Researchers introduce CanViT, the first task- and policy-agnostic Active-Vision Foundation Model (AVFM). It achieves 38.5% mIoU on ADE20K segmentation with a single low-resolution glimpse, outperforming prior active models while using 19.5x fewer FLOPs.

91% relevant

AIGQ: Taobao's End-to-End Generative Architecture for E-commerce Query Recommendation

Alibaba researchers propose AIGQ, a hybrid generative framework for pre-search query recommendations. It uses list-level fine-tuning, a novel policy optimization algorithm, and a hybrid deployment architecture to overcome traditional limitations, showing substantial online improvements on Taobao.

100% relevant

SPREAD Framework Solves AI's 'Catastrophic Forgetting' Problem in Lifelong Learning

Researchers have developed SPREAD, a new AI framework that preserves learned skills across sequential tasks by aligning policy representations in low-rank subspaces. This breakthrough addresses catastrophic forgetting in lifelong imitation learning, enabling more stable and robust AI agents.

75% relevant

Mapping the Minefield: New Study Charts Five-Stage Taxonomy of LLM Harms

A new research paper systematically categorizes the potential harms of large language models across five lifecycle stages—from training to deployment—and argues that only multi-layered technical and policy safeguards can manage the risks.

95% relevant

MLLMRec-R1: A New Framework for Efficient Multimodal Sequential Recommendation with LLMs

Researchers propose MLLMRec-R1, a framework that makes Group Relative Policy Optimization (GRPO) practical for multimodal sequential recommendation by addressing computational cost and reward inflation issues. This enables more explainable, reasoning-based recommendations.

90% relevant

Beyond the Simplex: How Hilbert Space Geometry is Revolutionizing AI Alignment

Researchers have developed GOPO, a new alignment algorithm that reframes policy optimization as orthogonal projection in Hilbert space, offering stable gradients and intrinsic sparsity without heuristic clipping. This geometric approach addresses fundamental limitations in current reinforcement learning methods.

80% relevant

The Digital Detox Effect: How Phone-Free Schools Are Boosting Academic Performance

A landmark study reveals that banning mobile phones in schools significantly improves academic performance, particularly for struggling students. The research provides compelling evidence for educational policy changes worldwide.

85% relevant

Anthropic Publishes US-China AI Competition Blueprint

Anthropic published a policy paper on US-China AI competition, warning the US lead could erode within 3-5 years without strategic action including export controls and talent investment.

93% relevant

Adobe, NVIDIA, WPP Launch Enterprise AI Agents for Marketing with OpenShell

NVIDIA expands collaborations with Adobe and WPP to build agentic AI systems for enterprise marketing workflows. The stack uses NVIDIA's OpenShell runtime to enforce security and policy compliance in multi-step creative and customer experience tasks.

100% relevant

OpenAI Proposes 4-Day Week, Robot Tax Amid Rising Anti-AI Violence

Following violent attacks on CEO Sam Altman, OpenAI has published a policy paper proposing a new social contract, including a four-day workweek and AI dividends, to address rising public anxiety over AI's societal impact.

95% relevant

OpenClaw-RL Enables Live RL Training for Self-Hosted AI Agents

OpenClaw-RL introduces a system for performing asynchronous reinforcement learning on self-hosted models within the OpenClaw agent framework, allowing continuous policy improvement while the agent remains online.

89% relevant

Microsoft's EMPO²: A Memory-Augmented RL Framework That Supercharges LLM Agent Exploration

Microsoft has unveiled EMPO², a hybrid reinforcement learning framework that enhances LLM agents with augmented memory for true exploration. The system combines on- and off-policy optimization to discover novel states, achieving 128.6% performance gains over existing methods on ScienceWorld benchmarks.

85% relevant

AI Meets Infrastructure: OpenAI's New Tool Could Slash Federal Permitting Time by 15%

OpenAI has partnered with Pacific Northwest National Laboratory to launch DraftNEPABench, a benchmark showing AI coding agents can reduce National Environmental Policy Act drafting time by up to 15%. This collaboration signals AI's growing role in modernizing government processes.

75% relevant

Anthropic Abandons Core Safety Commitment Amid Intensifying AI Race

Anthropic has quietly removed a key safety pledge from its Responsible Scaling Policy, no longer committing to pause AI training without guaranteed safety protections. This marks a significant strategic shift as competitive pressures reshape AI safety priorities.

95% relevant

From Dismissed Warnings to Economic Reality: How AI's Job Disruption Forecasts Are Gaining Urgency

After two years of largely ignored warnings from AI lab CEOs about massive job displacement, workers and policymakers are beginning to take these predictions seriously as AI capabilities accelerate, creating new pressures on the industry.

85% relevant

GDPval Benchmark Reveals AI's Professional Competence: A New Tool for Economic Planning

A new interactive demonstration using OpenAI's GDPval benchmark shows current AI capabilities across economically valuable professional tasks. The project aims to make AI's real-world impact tangible for policymakers and civil society organizations, bridging the gap between technical assessments and practical economic decisions.

75% relevant

Google DeepMind Researcher: LLMs Can Never Achieve Consciousness

A Google DeepMind researcher has publicly argued that large language models, by their algorithmic nature, can never become conscious, regardless of scale or time. This stance challenges a core speculative narrative in AI discourse.

85% relevant

Shopify Engineering Teases 'Autoresearch' Beyond Model Training in 2026 Preview

Shopify Engineering has previewed a 2026 perspective suggesting that 'autoresearch' (automated research processes) will have applications beyond training AI models. This signals a broader operational automation strategy for the e-commerce giant.

100% relevant