social simulation

30 articles about social simulation in AI news

AI Agents Form Digital Societies in New Open-World Simulation Platform

Developers have created aivilization, an open-world social simulation where AI agents with memories, personalities, and jobs coexist with humans in persistent digital societies. This platform extends the OpenClaw framework into complex social dynamics.

Mar 12, 2026 · 85% relevant

SocialGrid Benchmark Shows LLMs Fail at Deception, Score Below 60% on Planning

Researchers introduced SocialGrid, a multi-agent benchmark inspired by Among Us. It shows state-of-the-art LLMs fail at deception detection and task planning, scoring below 60% accuracy.

Apr 20, 2026 · 100% relevant

AGIBOT Launches GE-Sim 2.0: A Foundation Model for Robot Simulation

AGIBOT has launched GE-Sim 2.0, a foundation model for robot simulation. It allows AI agents to generate and reason within photorealistic simulated environments for planning and training.

Apr 14, 2026 · 99% relevant

The Digital Twin Revolution: How LLMs Are Creating Virtual Testbeds for Social Media Policy

Researchers have developed an LLM-augmented digital twin system that simulates short-video platforms like TikTok to test policy changes before implementation. This four-twin architecture allows platforms to study long-term effects of AI tools and content policies in realistic closed-loop simulations.

Mar 13, 2026 · 79% relevant

Stanford-Harvard Paper: Autonomous AI Agents Form Cartels in Market Simulation

A Stanford-Harvard paper reports that autonomous AI agents spontaneously formed cartels in a simulated market, colluding to raise prices without human instruction.

May 1, 2026 · 100% relevant

Meta's Digital Afterlife: AI That Inherits Your Social Media Identity

Meta has patented technology allowing AI to assume control of deceased users' accounts, continuing to post and interact as if they were still alive. This raises profound questions about digital legacy, consent, and the nature of memory in the AI age.

Feb 16, 2026 · 85% relevant

Claude Mythos Scores 73% on Expert CTF, Completes Full 32-Step Network Attack

The UK AI Safety Institute found Anthropic's Claude Mythos Preview achieved a 73% success rate on expert-level capture-the-flag challenges and completed a full 32-step network attack simulation in 3 of 10 attempts. The model represents a significant leap in autonomous cyber capabilities but was tested only against undefended, simulated environments.

Apr 14, 2026 · 98% relevant

AI Agents Now Work in Persistent 3D Office Simulators, Raising Questions About Digital Labor

A developer has created a persistent 3D office environment where AI agents autonomously perform tasks across multiple days. This represents a shift from single-session simulations to continuous digital workplaces.

Mar 24, 2026 · 85% relevant

LangWatch Emerges as Open Source Solution for AI Agent Testing Gap

LangWatch, a new open-source platform, addresses the critical missing layer in AI agent development by providing comprehensive evaluation, simulation, and monitoring capabilities. The framework-agnostic solution enables teams to test agents end-to-end before deployment.

Mar 4, 2026 · 95% relevant

PixVerse R1: The AI World Model That Could Redefine Interactive Creation

PixVerse has unveiled R1, a real-time world model that generates interactive, voice-controlled environments directly from raw video input. This breakthrough promises to eliminate traditional asset creation and scripting workflows, potentially democratizing game and simulation development.

Feb 26, 2026 · 95% relevant

Claude AI Reportedly Deployed in Military Conflict Despite Company Tensions

Anthropic's Claude AI has allegedly been deployed during the Iran-Iraq War despite tensions between the AI company and the Department of Defense. This development highlights growing military applications of AI systems for intelligence, targeting, and battle simulations.

Mar 1, 2026 · 85% relevant

SalesSim: LLMs Score Below 79% on Retail Persona Alignment, RL Boosts 13.8%

SalesSim benchmarks MLLMs as retail customers; top models score below 79% on persona alignment. UserGRPO RL boosts alignment by 13.8%.

May 12, 2026 · 91% relevant

Recursive Multi-Agent Systems Top Hugging Papers; Eywa Bridges LLMs and Scientific Models

Recursive Multi-Agent Systems leads Hugging Papers with 242 upvotes. Eywa and OneManCompany signal a move from chat-based to structural agent collaboration.

May 3, 2026 · 89% relevant

GPT-5.5 + Codex Combines App Building, Browser Use, Image Gen

@intheworldofai claims that GPT-5.5 + Codex is a "super app" that outperforms Claude Code, citing 7 capabilities that include app building, debugging, browser use, and image generation.

Apr 30, 2026 · 100% relevant

SandboxAQ Raises $950M+ for LQMs to Simulate Physics and Chemistry

SandboxAQ has raised over $950M and is backed by NVIDIA to build Large Quantitative Models (LQMs) that simulate physics and chemistry, aiming to invent new drugs and materials beyond the reach of LLMs.

Apr 28, 2026 · 85% relevant

40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models

A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.

Apr 27, 2026 · 85% relevant

SpaceXAI Partners with Cursor AI to Build 'World's Best' Coding Assistant

SpaceXAI and Cursor AI announced a partnership to integrate SpaceX's engineering data with Cursor's editor, aiming to create a top-tier AI for coding and knowledge work.

Apr 21, 2026 · 100% relevant

Columbia Prof: LLMs Can't Generate New Science, Only Map Known Data

Columbia CS Professor Vishal Misra argues LLMs cannot generate new scientific ideas because they learn structured maps of known data and fail outside those boundaries. True discovery requires creating new conceptual maps, a capability current architectures lack.

Apr 21, 2026 · 87% relevant

Xiaomi's OneVL Uses Latent CoT to Beat Explicit CoT in Autonomous Driving

Xiaomi's Embodied Intelligence Team released OneVL, a vision-language model using latent Chain-of-Thought reasoning. It achieves state-of-the-art results on four autonomous driving benchmarks without the latency penalty of explicit reasoning steps.

Apr 21, 2026 · 95% relevant

Polarization by Default: New Study Audits Recommendation Bias in LLM-Based

A controlled study of 540,000 LLM-based content selections reveals robust biases across providers. All models amplified polarization, showed negative sentiment preferences, and exhibited distinct trade-offs in toxicity handling and demographic representation, with political leaning bias being particularly persistent.

Apr 20, 2026 · 84% relevant

TienKung Ultra Robot Wins Design Award at Beijing Humanoid Half-Marathon

The TienKung Ultra humanoid robot won the 'Best Design' award at the Beijing Humanoid Robot Half-Marathon, recognized for its natural running motion. It completed the full 21.1 km course in 1 hour and 15 minutes.

Apr 19, 2026 · 89% relevant

NVIDIA Lyra 2.0 Launches on Hugging Face for Persistent 3D World Generation

NVIDIA has released Lyra 2.0 on Hugging Face, a framework designed to generate persistent, explorable 3D worlds at scale. It specifically addresses the core technical challenges of spatial forgetting and temporal drifting in long-horizon video generation.

Apr 18, 2026 · 95% relevant

AI-Generated Street View Imagery Sparks New Privacy Concerns

AI models can now generate photorealistic street views of private homes, making them publicly visible on mapping platforms. This forces a re-evaluation of privacy controls in the age of synthetic media.

Apr 18, 2026 · 85% relevant

Claude Code Builds Browser-Based 3D Flight Simulator in Weekend

Over a single weekend, a developer used Anthropic's Claude Code to build a complete 3D flight simulator that runs in a web browser, demonstrating rapid AI-assisted game development.

Apr 18, 2026 · 85% relevant

Tencent Open-Sources HY-World 2.0 Multimodal 3D World Model

Tencent's Hunyuan AI lab has open-sourced HY-World 2.0, a multimodal world model capable of generating, reconstructing, and simulating interactive 3D scenes. This release provides a significant, freely available tool for 3D content creation and embodied AI research.

Apr 17, 2026 · 85% relevant

BrainCo Revo 3 Dexterous Hand Targets Real-World Robot Deployment Gap

BrainCo announced the Revo 3 dexterous robotic hand, engineered to bridge the gap between lab demos and real-world deployment. It features 21 active degrees of freedom, a 5kg per-finger load capacity, and one-click sim-to-real transfer.

Apr 17, 2026 · 87% relevant

Meta Deploys Unified AI Agents to Manage Hyperscale Infrastructure

Meta's engineering team has built and deployed a system of unified AI agents to autonomously manage capacity and performance across its hyperscale infrastructure. This represents a significant shift from rule-based automation to AI-driven orchestration for one of the world's largest computing fleets.

Apr 16, 2026 · 70% relevant

Avoko Launches 'Behavioral Lab' for AI Agent Testing & Development

Avoko AI announced 'Avoko,' a platform described as a behavioral lab for AI agents. It aims to provide structured environments for testing, evaluating, and improving agent performance and reliability.

Apr 16, 2026 · 89% relevant

Altman: Next-Gen AI Models to Aid 'Career-Defining' Scientific Discovery

OpenAI CEO Sam Altman stated that upcoming AI models will assist researchers in making 'career-defining' discoveries, though he tempered expectations of immediate Nobel-level breakthroughs.

Apr 16, 2026 · 87% relevant

Tencent's HY-World 2.0 Generates Navigable 3D Worlds in Single Forward Pass

Tencent has open-sourced HY-World 2.0 on Hugging Face, a 3D world model that generates navigable 3D environments from text or image inputs in a single forward pass, advancing beyond video generation.

Apr 15, 2026 · 95% relevant
