RadiT XL, a 1.3B-parameter rectified flow transformer, generates chest radiographs that clinical experts cannot reliably distinguish from real ones. Trained on 1.2M radiographs and 1.6T tokens, the model leaves experts at near-chance accuracy in real-vs-synthetic tests.
Key facts
- 1.3 billion parameters in RadiT XL rectified flow transformer.
- 1.2 million chest radiographs in CXR7-1M training dataset.
- 1.6 trillion tokens processed during training.
- Clinical experts at near-chance accuracy in real-vs-synthetic tests.
- Supports controllable generation across 12 pathologies.
Researchers from multiple institutions, including Fabio De Sousa Ribeiro, Emma A. M. Stanley, and Charles Jones, released a paper on arXiv on June 17, 2026 introducing the largest specialist generative foundation model for chest radiographs. The model, named RadiT XL, uses a rectified flow transformer architecture with over 1.3 billion parameters. It was trained from scratch on a curated, heterogeneous dataset called CXR7-1M, comprising 1.2 million radiographs harmonized from seven existing datasets and augmented with radiologist-guided metadata.
The paper claims the model is the first generative foundation model for chest radiograph synthesis at the billion-parameter scale. The key architectural components include RadiT (a rectified flow transformer) and Rad-VAE, a VAE trained with a Rad-DINO perceptual loss. The model supports controllable generation and editing across multiple demographic subgroups, acquisition views, and a dozen pathologies.
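For readers unfamiliar with the term, the following is a minimal, generic sketch of the rectified flow idea in plain Python: train a model to predict the constant velocity of a straight line from noise to data, then sample by integrating that velocity. It is not RadiT's implementation. The function names, the toy 4-dimensional "latent", and the hand-written oracle velocity field standing in for a trained transformer are all assumptions made for illustration.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def velocity_target(x0, x1):
    """Regression target: the constant velocity of the straight path."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocity."""
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(v_target)

def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy check with a hypothetical oracle velocity field (no neural network).
noise = [random.gauss(0.0, 1.0) for _ in range(4)]
data = [0.2, 0.4, 0.6, 0.8]  # made-up stand-in for an image latent
oracle = lambda x, t: velocity_target(noise, data)
sample = euler_sample(oracle, noise, num_steps=8)
```

In a real system the oracle would be replaced by a learned network conditioned on the time step and, for controllable generation, on attributes such as view or pathology; how RadiT XL does this is not described in the material summarized here.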
Clinical Indistinguishability
In a real-vs-synthetic evaluation, clinical experts performed at near-chance accuracy across two presentations. Intra- and inter-rater Cohen's κ values were also low, which is what one would expect if the experts were effectively guessing, and which therefore points to high synthetic image realism. The paper describes the model as "producing images that are indistinguishable from real radiographs to clinical experts." This represents a significant advance in radiographic synthesis fidelity, though the paper does not disclose exact accuracy percentages or FID scores.
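To make the two reported quantities concrete, here is a small sketch of how accuracy against ground truth and Cohen's κ between two sets of labels are computed. The labels below are made up purely for illustration and are not the paper's data; the function names are hypothetical. For intra-rater agreement, the same function would be applied to one rater's labels from two separate presentations.

```python
from collections import Counter

def accuracy(labels, truth):
    """Fraction of labels that match the ground truth."""
    return sum(l == t for l, t in zip(labels, truth)) / len(truth)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater labelled independently
    # with their own marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined in this degenerate case
    return (p_o - p_e) / (1.0 - p_e)

# Made-up labels for eight images ("R" = real, "S" = synthetic).
truth   = ["R", "S", "R", "S", "R", "S", "R", "S"]
rater_1 = ["R", "R", "S", "S", "R", "S", "S", "R"]
rater_2 = ["S", "R", "S", "R", "R", "S", "R", "S"]

print("rater 1 accuracy:", accuracy(rater_1, truth))
print("rater 2 accuracy:", accuracy(rater_2, truth))
print("inter-rater kappa:", cohens_kappa(rater_1, rater_2))
```

Accuracy near 0.5 on a balanced real-vs-synthetic set, together with κ near 0, is the pattern the article describes as near-chance performance.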
Unique Take: Specialization Over Generality
While general-domain image generation models like those from Meta or OpenAI have rapidly improved, this work demonstrates that domain-specific scaling, here training a billion-parameter model exclusively on chest radiographs, can yield specialist performance that general models cannot match. The model's ability to condition on demographic subgroups and specific pathologies suggests a path toward diversifying clinical datasets for training downstream diagnostic models, a longstanding bottleneck in medical AI. The paper explicitly notes that existing radiographic AI models "often suffer from poor generalisation across patient subpopulations, institutions, and acquisition settings." By generating controlled, high-fidelity synthetic data, RadiT XL could enable more robust diagnostic model evaluation and training.
The authors have not released the model weights or the CXR7-1M dataset, which limits reproducibility. They also do not report compute costs, training time, or hyperparameter details beyond the parameter and token counts.
What to watch

Watch for any open-source release of the RadiT XL model weights or the CXR7-1M dataset. If either is released, expect a wave of downstream diagnostic model robustness studies. Also track FID score disclosures and comparisons to general-domain generative models like Stable Diffusion fine-tuned on medical data.

Source: arxiv.org