1.3B-Parameter Rectified Flow Transformer Generates Chest X-Rays

1.3B-parameter rectified flow transformer generates chest X-rays indistinguishable from real ones to clinical experts, trained on 1.2M radiographs.

Ala SMITH & AI Research Desk · 17h ago · 3 min read · AI-Generated

Source: arxiv.org via arxiv_cv · Multi-Source

Can a 1.3B-parameter generative model produce chest X-rays that fool clinical experts?

A 1.3B-parameter rectified flow transformer (RadiT XL) trained on 1.2M chest radiographs generates synthetic images that clinical experts cannot reliably distinguish from real ones: in real-vs-synthetic tests, the experts' accuracy was near chance.

TL;DR

1.3B-parameter model for chest radiograph synthesis. · Trained on 1.2M images and 1.6T tokens. · Clinical experts perform near chance at distinguishing synthetic from real.

RadiT XL, a 1.3B-parameter rectified flow transformer, generates chest radiographs that clinical experts cannot reliably distinguish from real ones. The model was trained on 1.2M radiographs and 1.6T tokens, and experts scored near chance in real-vs-synthetic tests.

Key facts

1.3 billion parameters in RadiT XL rectified flow transformer.
1.2 million chest radiographs in CXR7-1M training dataset.
1.6 trillion tokens processed during training.
Clinical experts at near-chance accuracy in real-vs-synthetic tests.
Supports controllable generation across 12 pathologies.

Researchers from multiple institutions, including Fabio De Sousa Ribeiro, Emma A. M. Stanley, and Charles Jones, released a paper on arXiv on June 17, 2026 introducing the largest specialist generative foundation model for chest radiographs. The model, named RadiT XL, uses a rectified flow transformer architecture with over 1.3 billion parameters. It was trained from scratch on a curated, heterogeneous dataset called CXR7-1M, comprising 1.2 million radiographs harmonized from seven existing datasets and augmented with radiologist-guided metadata.

The paper claims the model is the first generative foundation model for chest radiograph synthesis at the billion-parameter scale. The key architectural components include RadiT (a rectified flow transformer) and Rad-VAE, a VAE trained with a Rad-DINO perceptual loss. The model supports controllable generation and editing across multiple demographic subgroups, acquisition views, and a dozen pathologies.
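For readers unfamiliar with rectified flow, the general idea can be sketched in a few lines. This is a minimal, generic illustration of the technique, not the RadiT implementation: the function names, the toy 4-dimensional vectors, the convention that t=0 is noise and t=1 is data, and the "oracle" velocity function standing in for a trained transformer are all assumptions made for the example. A rectified flow model is trained to predict the constant velocity along the straight line between a noise sample and a data sample, and generation integrates that velocity field from noise to data.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight line from x0 (noise) to x1 (data) at time t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def velocity_target(x0, x1):
    """Regression target for the velocity model: the constant direction x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at one (x_t, t)."""
    x_t = interpolate(x0, x1, t)
    pred = predict_velocity(x_t, t)
    target = velocity_target(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def euler_sample(predict_velocity, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = predict_velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy check with one noise/data pair. The "oracle" is a hypothetical stand-in
# for a trained network: it already knows the true velocity for this pair.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
data = [0.5, -1.0, 2.0, 0.0]


def oracle(x_t, t):
    return velocity_target(noise, data)


loss = flow_matching_loss(oracle, noise, data, t=0.3)
sample = euler_sample(oracle, noise, num_steps=10)
```

With the oracle velocity, the loss is zero and Euler integration lands on the data point up to floating-point error. In a latent-space setup such as the one the article describes, the vectors would presumably be VAE latents rather than raw pixels, and the velocity function would be the transformer.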

Clinical Indistinguishability

In a real-vs-synthetic evaluation, clinical experts performed at near-chance accuracy across two presentations. Intra- and inter-rater Cohen's κ values were low, meaning raters agreed little with their own earlier answers or with each other beyond what chance would produce, which is consistent with guessing and points to high synthetic image realism. The paper describes the model as "producing images that are indistinguishable from real radiographs to clinical experts." This represents a significant advance in radiographic synthesis fidelity, though the paper does not disclose exact accuracy percentages or FID scores.
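To make "near-chance accuracy" and "low Cohen's κ" concrete, here is a small sketch of how both quantities are computed. The labels below are invented purely for illustration; they are not data from the paper, which (per this article) does not disclose its numbers. Cohen's κ is defined as (p_o − p_e) / (1 − p_e), where p_o is the observed agreement between two raters and p_e is the agreement expected by chance from their label frequencies.

```python
def accuracy(predictions, truth):
    """Fraction of items where the rater's label matches the ground truth."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return float("nan")  # undefined when both raters use one identical label
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for eight images (illustration only, not study data).
truth   = ["real", "synthetic", "real", "synthetic", "real", "synthetic", "real", "synthetic"]
rater_a = ["real", "real", "synthetic", "synthetic", "real", "synthetic", "synthetic", "real"]
rater_b = ["real", "synthetic", "synthetic", "real", "synthetic", "synthetic", "real", "real"]

acc_a = accuracy(rater_a, truth)
acc_b = accuracy(rater_b, truth)
kappa_ab = cohens_kappa(rater_a, rater_b)
```

In this toy case both raters are right on half of the images and agree with each other on half, so accuracy is 0.5 and κ is 0.0, the pattern expected if raters are effectively guessing. The same function applied to one rater's two passes over the same images gives the intra-rater κ.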

Unique Take: Specialization Over Generality

While general-domain image generation models like those from Meta or OpenAI have rapidly improved, this work suggests that domain-specific scaling (training a billion-parameter model exclusively on chest radiographs) can yield specialist performance that general models may not match, although the paper reports no direct comparison. The model's ability to condition on demographic subgroups and specific pathologies suggests a path toward diversifying clinical datasets for training downstream diagnostic models, a longstanding bottleneck in medical AI. The paper explicitly notes that existing radiographic AI models "often suffer from poor generalisation across patient subpopulations, institutions, and acquisition settings." By generating controlled, high-fidelity synthetic data, RadiT XL could enable more robust diagnostic model evaluation and training.

The paper does not release the model weights or the CXR7-1M dataset, which limits reproducibility. The authors also do not report compute costs, training time, or hyperparameter details beyond the parameter and token counts.

What to watch

Watch for any open-source release of RadiT model weights or the CXR7-1M dataset. If released, expect a wave of downstream diagnostic model robustness studies. Also track FID score disclosures and comparisons to general-domain generative models like Stable Diffusion fine-tuned on medical data.

Figure 8: Rectified flow transformer architectures. (a) Latent-space rectified flow models operate on Rad-VAE latent tok

Source: arxiv.org

AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

The paper represents a notable scaling effort in medical image generation, but several critical details are absent. The lack of quantitative fidelity metrics (FID, Inception Score, or any other distributional distance) is a significant omission for a computer vision paper. The claim of 'indistinguishable' rests solely on a human evaluation with undisclosed sample sizes and expert demographics. The paper also does not compare against existing generative models for chest radiographs (e.g., diffusion models like Med-DDPM) on standard benchmarks, making any state-of-the-art claim difficult to verify.

The architectural choice of rectified flow transformers over standard diffusion or GANs is interesting but underexplored: no ablation studies justify it. The Rad-DINO perceptual loss is novel but is not compared against other perceptual losses such as LPIPS. The paper's strength lies in its dataset curation and scale, but the evaluation methodology is thin for a 'foundation model' claim. The lack of model or data release reduces the paper's immediate impact on the field, though it sets a benchmark for future work in specialist medical generative models.
