clinical ai

30 articles about clinical ai in AI news

General LLMs Beat Clinical AI Tools in Doctor Study

Frontier LLMs beat clinical AI tools like OpenEvidence in all evaluations, matching Google Search AI Overview.

Jun 12, 202683% relevant

Benchmarking Crisis: Audit Reveals MedCalc-Bench Flaws, Calls for 'Open-Book' AI Evaluation

A new audit of the MedCalc-Bench clinical AI benchmark reveals over 20 implementation errors and shows that providing calculator specifications at inference time boosts accuracy dramatically, suggesting the benchmark measures formula memorization rather than clinical reasoning.

Mar 4, 202675% relevant

GPT-5 Shows Promise as Clinical Assistant but Can't Replace Specialized Medical AI

New research evaluates GPT-5's clinical reasoning capabilities, finding significant improvements over GPT-4o in medical text analysis but limitations in specialized imaging tasks. The study reveals generalist AI models are advancing toward integrated clinical reasoning but still trail domain-specific systems in critical diagnostic areas.

Mar 6, 202675% relevant

Beyond the Black Box: New Framework Tests AI's True Clinical Reasoning on Heart Signals

Researchers have developed a novel framework to evaluate how well multimodal AI models truly reason about ECG signals, separating perception from deduction. This addresses critical gaps in validating AI's clinical logic beyond superficial metrics.

Mar 3, 202675% relevant

Medical AI Breakthrough: New Method Teaches Vision-Language Models to Understand Clinical Negation

Researchers have developed a novel fine-tuning technique that significantly improves how medical vision-language models understand negation in clinical reports. The method uses causal tracing to identify which neural network layers are most responsible for processing negative statements, then selectively trains those layers.

Feb 13, 202670% relevant

Inner Ear Gene Therapy Injection Reverses Deafness in All 10 Patients in Clinical Trial

A clinical trial has reported that a single injection of gene therapy into the inner ear successfully reversed deafness in all ten participating patients. This marks a significant threshold in treating genetic hearing loss, with some patients regaining hearing within weeks.

Apr 4, 202697% relevant

Clinical LLM Rejection Predictor Hits AUROC 0.719 in 4.5-Month Study

Clinical LLM rejection predictor achieves AUROC 0.719 in 4.5-month study using deployment-specific context to forecast user rejection before response generation.

Jun 12, 202672% relevant

MLLM Raters Show Central Tendency Bias in Clinical Scoring

Study finds GPT-5 and other MLLMs show central tendency bias in clinical scoring, compressing predictions toward scale midpoint despite prompt modifications.

May 19, 202670% relevant

DISCO-TAB: Hierarchical RL Framework Boosts Clinical Data Synthesis by 38.2%, Achieves JSD < 0.01

Researchers propose DISCO-TAB, a reinforcement learning framework that guides a fine-tuned LLM with multi-granular feedback to generate synthetic clinical data. It improves downstream classifier utility by up to 38.2% versus GAN/diffusion baselines and achieves near-perfect statistical fidelity (JSD < 0.01).

Apr 3, 202698% relevant

AI Generates Chest X-Rays Clinicians Cannot Tell Apart From Real Ones

RadiT XL, a 1.3B-parameter rectified flow transformer trained on 1.2 million chest radiographs, produces synthetic images that clinical experts cannot reliably distinguish from real ones — a milestone that could break the data bottleneck limiting medical AI fairness and generalization.

Jun 19, 202685% relevant

FDA to Use AI for Real-Time Drug Trial Monitoring

Bloomberg reports the FDA will deploy AI to monitor clinical trial data in real time, potentially reducing drug testing duration by months by catching issues early.

Apr 29, 202685% relevant

FDA-Designated AI 'Vox' Detects Heart Failure from 5-Second Voice Clip

An AI tool named Vox can detect signs of worsening heart failure from a 5-second patient voice clip. It's trained on >3M voice samples and backed by five clinical trials, targeting a condition affecting 64M people globally.

Apr 6, 202695% relevant

Legion Health AI Approved for Psychiatric Prescription Renewals in California

San Francisco startup Legion Health received regulatory approval for its AI system to autonomously renew a narrow set of psychiatric prescriptions for stable patients. This represents a carefully guardrailed but significant step toward AI-assisted clinical workflow.

Apr 6, 202687% relevant

NYC Hospital CEO: AI Could Replace Significant Share of Admin Staff

Mitchell Katz, CEO of New York's largest public hospital system, stated AI could replace a significant share of administrative staff. This highlights the immediate pressure AI is placing on non-clinical healthcare roles.

Apr 5, 202685% relevant

Anthropic Acquires AI Biotech Coefficient Bio for ~$400M to Build 'Virtual Biologist'

Anthropic acquired AI biotech startup Coefficient Bio for approximately $400M. The small team was building AI to plan drug R&D, manage clinical strategy, and identify new drug opportunities, aligning with CEO Dario Amodei's vision of AI as a 'virtual biologist.'

Apr 5, 202695% relevant

Microsoft & CUHK Debut 'Medical AI Scientist' Agent That Generates Ideas, Runs Experiments, and Writes Papers

Microsoft Research and CUHK have developed an autonomous AI agent that can formulate research ideas, execute experiments, and author papers, achieving near-MICCAI quality on 171 clinical cases across 19 tasks.

Mar 31, 202695% relevant

Aletta Robot Uses AI & Ultrasound to Fully Automate Blood Draws

Aletta is a robotic system that automates the entire blood draw process, using ultrasound to locate veins, position the arm, collect the sample, and apply a bandage. This addresses a critical bottleneck in healthcare by reducing failed sticks and freeing up clinical staff.

Mar 29, 202685% relevant

Microsoft's Copilot Health Enters the AI Medical Arena, Paving the Way for 'Medical Superintelligence'

Microsoft launches Copilot Health, an AI assistant that aggregates data from wearables, medical records, and labs to provide personalized health insights. It joins OpenAI and Anthropic in a competitive race to transform healthcare with AI, backed by clinical oversight and stringent privacy measures.

Mar 12, 202695% relevant

Meissa: The 4B-Parameter Medical AI That Outperforms Giants While Running Offline

Researchers have developed Meissa, a lightweight 4B-parameter medical AI that matches or exceeds proprietary frontier models in clinical tasks while operating fully offline with 22x lower latency. This breakthrough addresses critical cost, privacy, and deployment barriers in healthcare AI.

Mar 11, 202677% relevant

MAPLE: How Process-Aligned Rewards Are Solving AI's Medical Reasoning Crisis

Researchers introduce MAPLE, a new AI training paradigm that replaces statistical consensus with expert-aligned process rewards for medical reasoning. This approach ensures clinical correctness over mere popularity in medical LLMs, significantly outperforming current methods.

Mar 11, 202677% relevant

CoRe-BT: The Missing Piece for AI Brain Tumor Diagnosis

Researchers introduce CoRe-BT, a multimodal benchmark combining MRI, pathology images, and text reports for brain tumor typing. The dataset addresses real-world clinical challenges where diagnostic data is often incomplete, enabling more robust AI models for glioma classification.

Mar 5, 202680% relevant

MedFeat: How AI is Revolutionizing Medical Feature Engineering with Model-Aware Intelligence

Researchers have developed MedFeat, an innovative framework that combines large language models with clinical expertise to create smarter features for medical predictions. Unlike traditional approaches, MedFeat incorporates model awareness and explainability to generate features that improve accuracy and generalization across healthcare settings.

Mar 4, 202675% relevant

Beyond the Hype: New Benchmark Reveals When AI Truly Benefits from Combining Medical Data

A comprehensive new study systematically benchmarks multimodal AI fusion of Electronic Health Records and chest X-rays, revealing precisely when combining data types improves clinical predictions and when it fails. The research provides crucial guidance for developing effective and reliable AI systems for healthcare deployment.

Mar 2, 202675% relevant

MediX-R1: How MBZUAI's New Framework is Revolutionizing Medical AI with Limited Data

MBZUAI researchers have developed MediX-R1, an open-ended reinforcement learning framework that teaches medical AI models to generate clinically grounded free-form answers. Using innovative Group-Based RL with composite rewards, it achieves 73.6% accuracy on medical benchmarks with only ~51K training examples.

Mar 1, 202685% relevant

AI-Powered Digital Twins Herald New Era of Personalized Cancer Radiotherapy

Researchers have developed COMPASS, an AI system that creates patient-specific digital twins to predict radiation toxicity in lung cancer patients. By analyzing real-time treatment data, it identifies early warning signs days before clinical symptoms appear, enabling truly adaptive radiotherapy.

Feb 24, 202670% relevant

Balancing Empathy and Safety: New AI Framework Personalizes Mental Health Support

Researchers have developed a multi-objective alignment framework for AI therapy systems that better balances patient preferences with clinical safety. The approach uses direct preference optimization across six therapeutic dimensions, achieving superior results compared to single-objective methods.

Feb 19, 202672% relevant

Multimodal RAG System for Chest X-Ray Reports Achieves 0.95 Recall@5, Reduces Hallucinations with Citation Constraints

Researchers developed a multimodal retrieval-augmented generation system for drafting radiology impressions that fuses image and text embeddings. The system achieves Recall@5 above 0.95 on clinically relevant findings and enforces citation coverage to prevent hallucinations.

Mar 23, 202699% relevant

Health AI Benchmarks Show 'Validity Gap': 0.6% of Queries Use Raw Medical Records, 5.5% Cover Chronic Care

Analysis of 18,707 health queries across six public benchmarks reveals a structural misalignment with clinical reality. Benchmarks over-index on wellness data (17.7%) while under-representing lab values (5.2%), imaging (3.8%), and safety-critical scenarios.

Mar 20, 202677% relevant

Qwen 2.5 7B Expresses Near-Constant Confidence Whether It Is Right or Wrong, Study Finds

A June 2026 arXiv preprint from University of Minnesota researchers tested Qwen 2.5 7B on structured clinical prediction data and found its verbalized confidence scores are essentially uninformative -- clustering between 0.856 and 0.937 no matter how well or badly the model performs. Combining SHAP-

Jun 19, 202692% relevant

LLM Schema-Adaptive Method Enables Zero-Shot EHR Transfer

Researchers propose Schema-Adaptive Tabular Representation Learning, an LLM-driven method that transforms structured variables into semantic statements. It enables zero-shot alignment across unseen EHR schemas and outperforms clinical baselines, including neurologists, on dementia diagnosis tasks.

Apr 15, 202699% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety