3D generation
30 articles about 3D generation in AI news
PartRAG Revolutionizes 3D Generation with Retrieval-Augmented Part-Level Control
Researchers introduce PartRAG, a breakthrough framework that combines retrieval-augmented generation with diffusion transformers for precise part-level 3D creation and editing from single images. The system achieves superior geometric accuracy while enabling localized modifications without regenerating entire objects.
Meshcraft Democratizes 3D Creation: Multi-Engine AI Platform Bridges Text-to-3D Gap
Meshcraft emerges as a web-based platform offering text-to-3D and image-to-3D generation with selectable AI engines. The tool provides both free and premium options, addressing quality bottlenecks in 3D generation through engine optimization rather than image model refinement.
Browser-Based Text-to-CAD Tool Emerges, Enabling Local 3D Model Generation from Prompts
A developer has built a text-to-CAD application that operates entirely within a web browser, enabling local generation and manipulation of 3D models from natural language descriptions. This approach eliminates cloud dependency and could lower barriers for rapid prototyping.
Developer Open-Sources 'Prompt-to-3D' Tool for Instant, Navigable World Generation
A developer has released an open-source tool that creates interactive 3D worlds from text or image inputs. This moves 3D asset generation from static models to instant, explorable environments.
NVIDIA Releases Brain MRI Generation Model on Hugging Face: 3D Latent Diffusion for T1, FLAIR, T2, and SWI Scans
NVIDIA has open-sourced a 3D latent diffusion model for generating high-resolution brain MRI scans across four modalities. The release claims state-of-the-art FID scores and 33× faster inference than prior methods.
Text-to-Game AI Emerges: How a Single Prompt Can Now Generate Complete 3D Worlds
A breakthrough AI system can transform simple text descriptions into fully playable 3D games complete with NPCs, physics, multiplayer capabilities, and persistent worlds. The development marks a major advance in procedural content generation and could lower the barrier to entry for game development.
BetterScene Bridges the Gap: How Aligning AI Representations Unlocks Photorealistic 3D Synthesis
Researchers introduce BetterScene, a novel AI method that dramatically improves 3D scene generation from just a handful of photos. By aligning the internal representations of a powerful video diffusion model, it produces consistent, artifact-free novel views, pushing the boundary of what's possible in computational photography and virtual world creation.
How to Build a 3D Engine with Claude Code: The Demoscene Case Study
A developer used Claude Code to build a complete 3D engine from scratch. Here are the actionable prompting techniques and CLAUDE.md strategies that made it work.
QuatRoPE: New Positional Embedding Enables Linear-Scale 3D Spatial Reasoning in LLMs, Outperforming Quadratic Methods
Researchers propose QuatRoPE, a novel positional embedding method that encodes 3D object relations with linear input scaling. Paired with IGRE, it improves spatial reasoning in LLMs while preserving their original language capabilities.
Momentum-Consistency Fine-Tuning (MCFT) Achieves 3.30% Gain in 5-Shot 3D Vision Tasks Without Adapters
Researchers propose MCFT, an adapter-free fine-tuning method for 3D point cloud models that selectively updates encoder parameters with momentum constraints. It outperforms prior methods by 3.30% in 5-shot settings and maintains original inference latency.
NVIDIA Releases NVPanoptix-3D on Hugging Face: Single-Image 3D Indoor Scene Reconstruction
NVIDIA has open-sourced NVPanoptix-3D, a model that reconstructs complete 3D indoor scenes—including panoptic segmentation, depth, and geometry—from a single RGB image in one forward pass.
Open-Source 'AI Office' Platform Lets Users Walk Through 3D Space to Monitor Autonomous Agents
An open-source project called AI Office creates a 3D virtual workspace where AI agents are visualized as avatars performing tasks. Users can navigate the space instead of reading logs, offering a novel interface for multi-agent systems.
NVIDIA DLSS 5 Demo Shows 3D Guided Neural Rendering for Next-Gen Upscaling
A leaked demo of NVIDIA's upcoming DLSS 5 technology showcases 3D guided neural rendering, promising a significant leap in image reconstruction quality for real-time graphics.
New Research Improves Text-to-3D Motion Retrieval with Interpretable Fine-Grained Alignment
Researchers propose a novel method for retrieving 3D human motion sequences from text descriptions using joint-angle motion images and token-patch interaction. It outperforms state-of-the-art methods on standard benchmarks while offering interpretable correspondences.
From Flat Images to 3D Worlds: How Persistent 3D State Models Will Revolutionize Virtual Try-On and Digital Showrooms
PERSIST introduces world models with persistent 3D scene memory, enabling coherent, evolving 3D environments from single images. For luxury retail, this means photorealistic virtual try-on with perfect garment physics and immersive digital showrooms that customers can explore and customize.
Freepik's Imagen Nano 2: Democratizing AI Image Generation with Google's Compact Model
Freepik has launched Imagen Nano 2, a significantly upgraded version of Google's lightweight image generation model. The new iteration promises faster performance, reduced computational requirements, and greater affordability, potentially making AI image creation accessible to more users.
VGGT-Det: How AI Is Learning to See in 3D Without Camera Calibration
Researchers have developed VGGT-Det, a breakthrough framework for multi-view 3D object detection that works without calibrated camera poses. The system mines internal geometric priors through attention mechanisms, outperforming traditional methods in indoor environments.
AI Game Engine Breakthrough: Complete 3D Worlds Generated in Seconds
An AI system can now generate fully functional 3D games in seconds, complete with interactive worlds, moving characters, and working gameplay systems. The browser-based technology marks a major advance in procedural content creation.
From Prompt to Playable: New AI Platform Generates Complete 3D Games Instantly
A groundbreaking AI system can now transform simple text prompts into fully functional 3D games complete with NPCs, physics, multiplayer capabilities, and persistent worlds. Backed by NVIDIA and YouTube's co-founder with $28M in funding, this represents a seismic shift in game development.
The Next Platform Shift: How Persistent 3D World Models Are Becoming the New Programmable Interface
A new collaboration between Baseten and World Labs signals a paradigm shift where persistent 3D world models become programmable platforms, potentially rivaling the transformative impact of large language models through accessible developer APIs.
Sparse Sensors, Rich Views: How Minimal Radar Data Supercharges AI Scene Generation
Researchers have developed a novel approach that combines single images with extremely sparse radar or LiDAR data to dramatically improve AI's ability to generate realistic 3D views from 2D photos. This multimodal technique overcomes fundamental limitations of vision-only systems in challenging conditions like bad weather and low texture.
Zatom-1: The First Unified AI Model for 3D Molecular and Materials Science
Researchers have developed Zatom-1, the first foundation model that simultaneously handles generative and predictive tasks for both molecules and materials. This multimodal flow matching approach enables faster sampling and improved accuracy across chemical domains.
DeepMind's Diffusion Breakthrough: Training Better Latents for Superior AI Generation
Google DeepMind researchers have developed new techniques for training latent representations in diffusion models, potentially leading to more efficient, higher-quality AI-generated content across images, audio, and video domains.
OpenCAD Browser Tool Enables Local, Private Text-to-CAD Conversion Without Cloud API
A developer has released an open-source text-to-CAD tool that runs entirely in a user's browser, enabling private, local 3D model generation from natural language descriptions. This approach bypasses cloud API costs and data privacy issues inherent in most current AI CAD solutions.
OpenSCAD Web: Open-Source Text-to-CAD Tool Runs Fully In-Browser via WebAssembly
A developer has released an open-source text-to-CAD tool that runs entirely in a web browser using WebAssembly. Users describe a 3D object in plain English, optionally upload a reference image, and receive a parametric model with adjustable dimensions that exports directly to 3D printer formats.
Higgsfield AI Pays Bartender $1M+ for Face Scan to Train AI Video Model Diffuse
AI startup Higgsfield paid a New Jersey bartender over $1 million for a full-face 3D scan to train its text-to-video model Diffuse. The deal highlights the emerging market for high-fidelity biometric data to create photorealistic digital humans.
Vision AI Trends 2026: Manufacturing, Warehouse Automation, and Luxury Authentication Enter Visual Data Era
A 2026 trends report highlights Vision AI's expansion into manufacturing quality inspection, warehouse automation, and luxury brand authentication, marking a shift toward 3D visual data systems. This reflects the maturation of computer vision beyond basic recognition into operational and trust applications.
BrepCoder: The AI That Speaks CAD's Native Language
Researchers have developed BrepCoder, a multimodal AI that understands CAD designs in their native B-rep format. By treating 3D models as structured code, it performs multiple engineering tasks without task-specific retraining, potentially revolutionizing design automation.
The One-Stop AI Platform Revolution: GlobalGPT Consolidates 100+ Models Without Barriers
GlobalGPT has launched a unified platform offering access to over 100 AI models for image and video generation without waitlists, restrictions, or invite codes. This consolidation represents a significant shift toward democratizing advanced AI tools for creators and businesses alike.
From Prompt to Play: How AI Is Building Entire Games in Minutes
A developer has created 'Riftwater,' a sci-fi fishing game where every element—from 3D assets to NPC behavior—is generated through prompt-based AI. This breakthrough demonstrates how AI is evolving from content assistant to full game development engine.