image generation

30 articles about image generation in AI news

OpenAI Image Generation V2 Release Imminent, Per Leak

A post from a known leaker indicates OpenAI's next image generation model, potentially DALL-E 4, is about to be released. This would mark a major competitive move in the rapidly evolving text-to-image space.

Apr 4, 202687% relevant

Black Forest Labs Unleashes FLUX.2 klein: Sub-Second AI Image Generation Hits Hugging Face

Black Forest Labs has released FLUX.2 klein on Hugging Face, delivering state-of-the-art image generation and editing in under a second. The model runs on consumer GPUs with just 13GB VRAM, making high-speed AI art creation dramatically more accessible.

Mar 14, 202685% relevant

Seedream 5.0 Lite Emerges as a Precision Tool for AI Image Generation

Seedream 5.0 Lite has launched on HailuoAI, emphasizing unprecedented user control and consistency in AI image generation. The model introduces features like multi-reference image locking and precise editing, moving beyond random outputs toward reliable creative workflows.

Mar 11, 202685% relevant

Freepik's Imagen Nano 2: Democratizing AI Image Generation with Google's Compact Model

Freepik has launched Imagen Nano 2, a significantly upgraded version of Google's lightweight image generation model. The new iteration promises faster performance, reduced computational requirements, and greater affordability, potentially making AI image creation accessible to more users.

Mar 3, 202685% relevant

MeiGen Emerges as the 'Ultimate Prompt Collection' for AI Image Generation

A new tool called MeiGen has surfaced, described as the 'ultimate prompt collection' for AI image creators. It scrapes high-quality prompts from top AI artists and organizes them for easy access, potentially democratizing advanced image generation techniques.

Feb 28, 202685% relevant

The AI Image Generation Revolution Hits a Tipping Point: All Major Models Now Accessible in One Platform

A new platform has emerged that consolidates access to leading AI image models including Sora, Flux, and Seedream 4.5, enabling text-to-image generation, editing, and style swapping without multiple subscriptions or specialized software.

Feb 26, 202685% relevant

Freepik's Seedream 5.0 Lite: The Democratization of Professional AI Image Generation

Freepik's new Seedream 5.0 Lite eliminates traditional AI image generation barriers like credit limits, inconsistent characters, and subscription costs, offering free access to high-quality visual creation tools.

Feb 25, 202685% relevant

Luma Labs Launches Uni-1: An Autoregressive Transformer for Image Generation with a Pre-Generation Reasoning Phase

Luma Labs has released Uni-1, a foundational image model that uses an autoregressive transformer to reason about user intent before generating pixels. It aims to address the 'intent gap' common in diffusion models by adding a structured reasoning step.

Mar 24, 202688% relevant

Google's Nano-Banana 2: The Edge AI Revolution That Puts 4K Image Generation in Your Pocket

Google has officially unveiled Nano-Banana 2, a specialized AI model delivering sub-second 4K image synthesis with advanced subject consistency entirely on-device. This breakthrough represents a strategic pivot toward edge computing, challenging the cloud-centric paradigm of current generative AI.

Feb 26, 202675% relevant

Google's Gemini 3.1 Flash Image: A New Contender in the AI Visual Generation Race

Google is reportedly developing Gemini 3.1 Flash Image, a specialized image generation model that could challenge Midjourney and DALL-E 3. This lightweight variant promises faster, more efficient visual creation while expanding Google's multimodal AI ecosystem.

Feb 25, 202685% relevant

GPT-5.5 + Codex Combines App Building, Browser Use, Image Gen

@intheworldofai claims GPT-5.5 + Codex is a super app better than Claude Code, with 7 capabilities including app building, debugging, browser use, and image generation.

Apr 30, 2026100% relevant

PerfectSquashBench Tests Image Model Anchoring Bias vs. Text Models

Wharton professor Ethan Mollick released PerfectSquashBench, a test showing image generation models exhibit stronger anchoring bias than text models, getting 'stuck' on initial directions and requiring context window clearing.

Apr 22, 202685% relevant

GPT-Image-2 Adds Self-Review Loop for Iterative Image Correction

A new capability in GPT-Image-2 allows the model to review and iteratively correct its own image generations, aiming for higher accuracy before final output.

Apr 21, 202685% relevant

Inflection's MAI-Image-2-Efficient: 22% Faster, 4x More Efficient

Inflection AI has released MAI-Image-2-Efficient, a production-ready image generation model claimed to be 22% faster and 4x more efficient than its predecessor while maintaining quality.

Apr 14, 202685% relevant

OpenAI Testing New Image Model in ChatGPT, User Reports 'Very Good'

A user reports OpenAI is testing a new image generation model in ChatGPT, describing its output as 'very good.' This signals ongoing internal development of visual AI capabilities.

Apr 4, 202685% relevant

GPT-Image-2 Appears in ChatGPT App Images Tab, Signaling OpenAI Visual AI Push

A user spotted 'GPT-Image-2' listed in the images tab of the ChatGPT mobile app. This indicates OpenAI is testing a potential successor to its DALL-E image generation models directly within its flagship product.

Apr 4, 202685% relevant

NVIDIA's DiffiT: A New Vision Transformer Architecture Sets Diffusion Model Benchmark

NVIDIA has released DiffiT, a Diffusion Vision Transformer achieving state-of-the-art image generation with an FID score of 1.73 on ImageNet-256 while using fewer parameters than previous models.

Mar 9, 202695% relevant

OpenAI's GPT-Image-2 Model Reportedly Achieves Photorealistic Video Generation, Surpassing Prior Map-Generation Flaws

A social media user claims OpenAI's GPT-Image-2 model now produces video indistinguishable from reality, a significant leap from its predecessor's documented failure to generate coherent world maps.

Apr 4, 202685% relevant

Luma Labs Opens Uni-1.1 API for Production — Image, Not Video, and #1 ELO Comes With a Caveat

Luma Labs has shipped the Uni-1.1 API for production — an image-generation model (not video) with two REST endpoints, Python and JavaScript SDKs, and support for up to nine reference images per call. The widely-cited '#1 Human Preference ELO' is from Luma's own internal pairwise evaluation; on pure text-to-image Luma reports #2 behind Google Nano Banana. Pricing: ~$0.09 per 2K image, 10–30% below Nano Banana 2 / Pro.

May 6, 202691% relevant

Kyutai Labs Releases OVIE: Single-Image Novel View Synthesis Model

French AI lab Kyutai Labs released OVIE, a novel view generation model trained only on single images, bypassing the need for costly multi-view datasets. This could democratize 3D content creation from 2D photos.

Apr 15, 202685% relevant

ByteDance's OmniShow Unifies Text, Image, Audio, Pose for Video Gen

ByteDance introduced OmniShow, a unified multimodal framework for video generation that accepts text, reference images, audio, and pose inputs simultaneously. It claims state-of-the-art performance across diverse conditioning settings.

Apr 14, 202685% relevant

Developer Open-Sources 'Prompt-to-3D' Tool for Instant, Navigable World Generation

A developer has released an open-source tool that creates interactive 3D worlds from text or image inputs. This moves 3D asset generation from static models to instant, explorable environments.

Apr 3, 202691% relevant

Luma AI Launches Uni-1, a Unified Image Model Priced at $0.09 per 2K Image, Challenging Google Nano Banana

Luma AI released Uni-1, a single transformer model for image understanding and generation. It ranks first in human preference tests for style/editing and reference tasks, and is priced lower than Google's Nano Banana models.

Mar 23, 202695% relevant

New AI Framework Prevents Image Generators from Copying Training Data Without Sacrificing Quality

Researchers have developed RADS, a novel inference-time framework that prevents text-to-image diffusion models from memorizing and regurgitating training data. Using reachability analysis and constrained reinforcement learning, RADS steers generation away from memorized content while maintaining image quality and prompt alignment.

Mar 3, 202675% relevant

PartRAG Revolutionizes 3D Generation with Retrieval-Augmented Part-Level Control

Researchers introduce PartRAG, a breakthrough framework that combines retrieval-augmented generation with diffusion transformers for precise part-level 3D creation and editing from single images. The system achieves superior geometric accuracy while enabling localized modifications without regenerating entire objects.

Feb 20, 202670% relevant

ByteDance Open-Sources BAGEL: 7B Multimodal Model for Image Gen, Editing, Understanding

ByteDance open-sourced BAGEL, a 7B multimodal model for image gen, editing, style transfer, and understanding under Apache 2.0.

May 28, 202695% relevant

DualFashion: Dual-Diffusion Transformer Generates Outfit Images & Text

DualFashion uses a dual-diffusion Transformer to jointly generate fashion images and text, outperforming SOTA on iFashion and Polyvore-U with interpretable outputs.

May 19, 202682% relevant

Detecting AI Images: Metadata Exposes Generators, No GPU Needed

AI image detection via metadata analysis exposes generators like Google's Gemini and Meta's Llama without GPU clusters, highlighting a simple but effective method.

May 10, 202675% relevant

Pinterest Builds Dedicated Conversion Candidate Generation Model

Pinterest details the design and deployment of a dedicated shopping conversion candidate generation model, replacing engagement-based retrieval. Key innovations include a parallel DCN v2 and MLP architecture (+11% recall) and a unified multi-task approach that boosted conversion recall by +42% over their 2023 model.

Apr 27, 2026100% relevant

Microsoft's TRELLIS.2: 4B Model Turns Images to 3D in 3 Seconds

Microsoft released TRELLIS.2, a 4B parameter open-source model that generates fully textured, physically accurate 3D models with PBR materials from a single image in about 3 seconds, handling complex geometry like open surfaces and hollow interiors.

Apr 26, 202696% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety