3D generation

30 articles about 3D generation in AI news

PartRAG Revolutionizes 3D Generation with Retrieval-Augmented Part-Level Control

Researchers introduce PartRAG, a breakthrough framework that combines retrieval-augmented generation with diffusion transformers for precise part-level 3D creation and editing from single images. The system achieves superior geometric accuracy while enabling localized modifications without regenerating entire objects.

Feb 20, 2026 · 70% relevant

Meshcraft Democratizes 3D Creation: Multi-Engine AI Platform Bridges Text-to-3D Gap

Meshcraft emerges as a web-based platform offering text-to-3D and image-to-3D generation with selectable AI engines. The tool provides both free and premium options, addressing quality bottlenecks in 3D generation through engine optimization rather than image model refinement.

Mar 7, 2026 · 80% relevant

NVIDIA Lyra 2.0 Launches on Hugging Face for Persistent 3D World Generation

NVIDIA has released Lyra 2.0 on Hugging Face, a framework designed to generate persistent, explorable 3D worlds at scale. It specifically addresses the core technical challenges of spatial forgetting and temporal drifting in long-horizon video generation.

Apr 18, 2026 · 95% relevant

Browser-Based Text-to-CAD Tool Emerges, Enabling Local 3D Model Generation from Prompts

A developer has built a text-to-CAD application that operates entirely within a web browser, enabling local generation and manipulation of 3D models from natural language descriptions. This approach eliminates cloud dependency and could lower barriers for rapid prototyping.

Apr 4, 2026 · 87% relevant

Developer Open-Sources 'Prompt-to-3D' Tool for Instant, Navigable World Generation

A developer has released an open-source tool that creates interactive 3D worlds from text or image inputs. This moves 3D asset generation from static models to instant, explorable environments.

Apr 3, 2026 · 91% relevant

Tencent's HY-World 2.0 Generates Navigable 3D Worlds in Single Forward Pass

Tencent has open-sourced HY-World 2.0 on Hugging Face, a 3D world model that generates navigable 3D environments from text or image inputs in a single forward pass, advancing beyond video generation.

Apr 15, 2026 · 95% relevant

Text-to-Game AI Emerges: How a Single Prompt Can Now Generate Complete 3D Worlds

A breakthrough AI system can transform simple text descriptions into fully playable 3D games complete with NPCs, physics, multiplayer capabilities, and persistent worlds. This development represents a quantum leap in procedural content generation and democratizes game development.

Feb 26, 2026 · 85% relevant

BetterScene Bridges the Gap: How Aligning AI Representations Unlocks Photorealistic 3D Synthesis

Researchers introduce BetterScene, a novel AI method that dramatically improves 3D scene generation from just a handful of photos. By aligning the internal representations of a powerful video diffusion model, it produces consistent, artifact-free novel views, pushing the boundary of what's possible in computational photography and virtual world creation.

Feb 27, 2026 · 78% relevant

DeemosTech Rodin Gen-2.5: 10M-Polygon 3D GenAI in 4 Seconds

DeemosTech claims Rodin Gen-2.5 generates 10M-polygon 3D models with skin microstructures in 4 seconds, but provides no benchmarks or technical details.

May 27, 2026 · 85% relevant

Microsoft World-R1: RL Aligns Text-to-Video with 3D Physics

Microsoft's World-R1 framework applies reinforcement learning with feedback from pre-trained 3D foundation models to align text-to-video outputs with physical 3D constraints, improving structural coherence without modifying the underlying video diffusion architecture.

Apr 28, 2026 · 85% relevant

Microsoft's TRELLIS.2: 4B Model Turns Images to 3D in 3 Seconds

Microsoft released TRELLIS.2, a 4B parameter open-source model that generates fully textured, physically accurate 3D models with PBR materials from a single image in about 3 seconds, handling complex geometry like open surfaces and hollow interiors.

Apr 26, 2026 · 96% relevant

Modly Desktop App Generates 3D Models from Images, Runs Locally

A developer has launched Modly, a desktop application that creates 3D models from images and processes them entirely on a user's local machine, eliminating cloud dependency.

Apr 20, 2026 · 89% relevant

Claude Code Builds Browser-Based 3D Flight Simulator in Weekend

Over a weekend, a developer used Anthropic's Claude Code to build a complete 3D flight simulator that runs in a web browser, demonstrating rapid AI-assisted game development.

Apr 18, 2026 · 85% relevant

Tencent Open-Sources HY-World 2.0 Multimodal 3D World Model

Tencent's Hunyuan AI lab has open-sourced HY-World 2.0, a multimodal world model capable of generating, reconstructing, and simulating interactive 3D scenes. This release provides a significant, freely available tool for 3D content creation and embodied AI research.

Apr 17, 2026 · 85% relevant

Open-Source 3D Building Editor Runs in Browser, Powered by AI

A developer has open-sourced a full 3D building editor that runs entirely in a web browser. This tool uses AI to lower the barrier to architectural design, potentially disrupting professional software workflows.

Apr 15, 2026 · 85% relevant

AllenAI's WildDet3D Enables Promptable 3D Object Detection from Single Images

Allen Institute for AI (AllenAI) has open-sourced WildDet3D, a model for promptable 3D object detection from single RGB images. It predicts 3D bounding boxes using flexible prompts and can integrate optional depth data.

Apr 13, 2026 · 85% relevant

How to Build a 3D Engine with Claude Code: The Demoscene Case Study

A developer used Claude Code to build a complete 3D engine from scratch. Here are the actionable prompting techniques and CLAUDE.md strategies that made it work.

Mar 27, 2026 · 90% relevant

QuatRoPE: New Positional Embedding Enables Linear-Scale 3D Spatial Reasoning in LLMs, Outperforming Quadratic Methods

Researchers propose QuatRoPE, a novel positional embedding method that encodes 3D object relations with linear input scaling. Paired with IGRE, it improves spatial reasoning in LLMs while preserving their original language capabilities.

Mar 27, 2026 · 79% relevant

Momentum-Consistency Fine-Tuning (MCFT) Achieves 3.30% Gain in 5-Shot 3D Vision Tasks Without Adapters

Researchers propose MCFT, an adapter-free fine-tuning method for 3D point cloud models that selectively updates encoder parameters with momentum constraints. It outperforms prior methods by 3.30% in 5-shot settings and maintains original inference latency.

Mar 26, 2026 · 75% relevant

NVIDIA Releases NVPanoptix-3D on Hugging Face: Single-Image 3D Indoor Scene Reconstruction

NVIDIA has open-sourced NVPanoptix-3D, a model that reconstructs complete 3D indoor scenes—including panoptic segmentation, depth, and geometry—from a single RGB image in one forward pass.

Mar 24, 2026 · 90% relevant

Open-Source 'AI Office' Platform Lets Users Walk Through 3D Space to Monitor Autonomous Agents

An open-source project called AI Office creates a 3D virtual workspace where AI agents are visualized as avatars performing tasks. Users can navigate the space instead of reading logs, offering a novel interface for multi-agent systems.

Mar 23, 2026 · 85% relevant

New Research Improves Text-to-3D Motion Retrieval with Interpretable Fine-Grained Alignment

Researchers propose a novel method for retrieving 3D human motion sequences from text descriptions using joint-angle motion images and token-patch interaction. It outperforms state-of-the-art methods on standard benchmarks while offering interpretable correspondences.

Mar 11, 2026 · 75% relevant

From Flat Images to 3D Worlds: How Persistent 3D State Models Will Revolutionize Virtual Try-On and Digital Showrooms

PERSIST introduces world models with persistent 3D scene memory, enabling coherent, evolving 3D environments from single images. For luxury retail, this means photorealistic virtual try-on with perfect garment physics and immersive digital showrooms that customers can explore and customize.

Mar 5, 2026 · 60% relevant

Freepik's Imagen Nano 2: Democratizing AI Image Generation with Google's Compact Model

Freepik has launched Imagen Nano 2, a significantly upgraded version of Google's lightweight image generation model. The new iteration promises faster performance, reduced computational requirements, and greater affordability, potentially making AI image creation accessible to more users.

Mar 3, 2026 · 85% relevant

VGGT-Det: How AI Is Learning to See in 3D Without Camera Calibration

Researchers have developed VGGT-Det, a breakthrough framework for multi-view 3D object detection that works without calibrated camera poses. The system mines internal geometric priors through attention mechanisms, outperforming traditional methods in indoor environments.

Mar 3, 2026 · 85% relevant

AI Game Engine Breakthrough: Complete 3D Worlds Generated in Seconds

A revolutionary AI system can now generate fully functional 3D games in seconds, complete with interactive worlds, moving characters, and working gameplay systems. This browser-based technology represents a quantum leap in procedural content creation.

Mar 2, 2026 · 95% relevant

From Prompt to Playable: New AI Platform Generates Complete 3D Games Instantly

A groundbreaking AI system can now transform simple text prompts into fully functional 3D games complete with NPCs, physics, multiplayer capabilities, and persistent worlds. Backed by NVIDIA and YouTube's co-founder with $28M in funding, the platform represents a seismic shift in game development.

Feb 25, 2026 · 95% relevant

The Next Platform Shift: How Persistent 3D World Models Are Becoming the New Programmable Interface

A new collaboration between Baseten and World Labs signals a paradigm shift where persistent 3D world models become programmable platforms, potentially rivaling the transformative impact of large language models through accessible developer APIs.

Feb 25, 2026 · 85% relevant

Sparse Sensors, Rich Views: How Minimal Radar Data Supercharges AI Scene Generation

Researchers have developed a novel approach that combines single images with extremely sparse radar or LiDAR data to dramatically improve AI's ability to generate realistic 3D views from 2D photos. This multimodal technique overcomes fundamental limitations of vision-only systems in challenging conditions like bad weather and low texture.

Feb 23, 2026 · 70% relevant

Zatom-1: The First Unified AI Model for 3D Molecular and Materials Science

Researchers have developed Zatom-1, the first foundation model that simultaneously handles generative and predictive tasks for both molecules and materials. This multimodal flow matching approach enables faster sampling and improved accuracy across chemical domains.

Feb 27, 2026 · 75% relevant
