Clinical AI

30 articles about clinical AI in AI news

Benchmarking Crisis: Audit Reveals MedCalc-Bench Flaws, Calls for 'Open-Book' AI Evaluation

A new audit of the MedCalc-Bench clinical AI benchmark reveals over 20 implementation errors and shows that providing calculator specifications at inference time boosts accuracy dramatically, suggesting the benchmark measures formula memorization rather than clinical reasoning.

75% relevant

GPT-5 Shows Promise as Clinical Assistant but Can't Replace Specialized Medical AI

New research evaluates GPT-5's clinical reasoning capabilities, finding significant improvements over GPT-4o in medical text analysis but limitations in specialized imaging tasks. The study reveals generalist AI models are advancing toward integrated clinical reasoning but still trail domain-specific systems in critical diagnostic areas.

75% relevant

Beyond the Black Box: New Framework Tests AI's True Clinical Reasoning on Heart Signals

Researchers have developed a novel framework to evaluate how well multimodal AI models truly reason about ECG signals, separating perception from deduction. This addresses critical gaps in validating AI's clinical logic beyond superficial metrics.

75% relevant

Medical AI Breakthrough: New Method Teaches Vision-Language Models to Understand Clinical Negation

Researchers have developed a novel fine-tuning technique that significantly improves how medical vision-language models understand negation in clinical reports. The method uses causal tracing to identify which neural network layers are most responsible for processing negative statements, then selectively trains those layers.

70% relevant

Inner Ear Gene Therapy Injection Reverses Deafness in All 10 Patients in Clinical Trial

A clinical trial has reported that a single injection of gene therapy into the inner ear reversed deafness in all ten participating patients. This marks a significant milestone in treating genetic hearing loss, with some patients regaining hearing within weeks.

97% relevant

DISCO-TAB: Hierarchical RL Framework Boosts Clinical Data Synthesis by 38.2%, Achieves JSD < 0.01

Researchers propose DISCO-TAB, a reinforcement learning framework that guides a fine-tuned LLM with multi-granular feedback to generate synthetic clinical data. It improves downstream classifier utility by up to 38.2% versus GAN/diffusion baselines and achieves near-perfect statistical fidelity (JSD < 0.01).

98% relevant

NYC Hospital CEO: AI Could Replace Significant Share of Admin Staff

Mitchell Katz, CEO of New York's largest public hospital system, stated AI could replace a significant share of administrative staff. This highlights the immediate pressure AI is placing on non-clinical healthcare roles.

85% relevant

Anthropic Acquires AI Biotech Coefficient Bio for ~$400M to Build 'Virtual Biologist'

Anthropic acquired AI biotech startup Coefficient Bio for approximately $400M. The small team was building AI to plan drug R&D, manage clinical strategy, and identify new drug opportunities, aligning with CEO Dario Amodei's vision of AI as a 'virtual biologist.'

95% relevant

Microsoft & CUHK Debut 'Medical AI Scientist' Agent That Generates Ideas, Runs Experiments, and Writes Papers

Microsoft Research and CUHK have developed an autonomous AI agent that can formulate research ideas, execute experiments, and author papers, achieving near-MICCAI quality on 171 clinical cases across 19 tasks.

95% relevant

Aletta Robot Uses AI & Ultrasound to Fully Automate Blood Draws

Aletta is a robotic system that automates the entire blood draw process: it uses ultrasound to locate veins, then positions the arm, collects the sample, and applies a bandage. This addresses a critical bottleneck in healthcare by reducing failed sticks and freeing up clinical staff.

85% relevant

Microsoft's Copilot Health Enters the AI Medical Arena, Paving the Way for 'Medical Superintelligence'

Microsoft launches Copilot Health, an AI assistant that aggregates data from wearables, medical records, and labs to provide personalized health insights. It joins OpenAI and Anthropic in a competitive race to transform healthcare with AI, backed by clinical oversight and stringent privacy measures.

95% relevant

MAPLE: How Process-Aligned Rewards Are Solving AI's Medical Reasoning Crisis

Researchers introduce MAPLE, a new AI training paradigm that replaces statistical consensus with expert-aligned process rewards for medical reasoning. This approach prioritizes clinical correctness over mere popularity in medical LLMs, significantly outperforming current methods.

77% relevant

Meissa: The 4B-Parameter Medical AI That Outperforms Giants While Running Offline

Researchers have developed Meissa, a lightweight 4B-parameter medical AI that matches or exceeds proprietary frontier models in clinical tasks while operating fully offline with 22x lower latency. This breakthrough addresses critical cost, privacy, and deployment barriers in healthcare AI.

77% relevant

CoRe-BT: The Missing Piece for AI Brain Tumor Diagnosis

Researchers introduce CoRe-BT, a multimodal benchmark combining MRI, pathology images, and text reports for brain tumor typing. The dataset addresses real-world clinical challenges where diagnostic data is often incomplete, enabling more robust AI models for glioma classification.

80% relevant

MedFeat: How AI is Revolutionizing Medical Feature Engineering with Model-Aware Intelligence

Researchers have developed MedFeat, an innovative framework that combines large language models with clinical expertise to create smarter features for medical predictions. Unlike traditional approaches, MedFeat incorporates model awareness and explainability to generate features that improve accuracy and generalization across healthcare settings.

75% relevant

Beyond the Hype: New Benchmark Reveals When AI Truly Benefits from Combining Medical Data

A comprehensive new study systematically benchmarks multimodal AI fusion of Electronic Health Records and chest X-rays, revealing precisely when combining data types improves clinical predictions and when it fails. The research provides crucial guidance for developing effective and reliable AI systems for healthcare deployment.

75% relevant

MediX-R1: How MBZUAI's New Framework is Revolutionizing Medical AI with Limited Data

MBZUAI researchers have developed MediX-R1, an open-ended reinforcement learning framework that teaches medical AI models to generate clinically grounded free-form answers. Using innovative Group-Based RL with composite rewards, it achieves 73.6% accuracy on medical benchmarks with only ~51K training examples.

85% relevant

AI-Powered Digital Twins Herald New Era of Personalized Cancer Radiotherapy

Researchers have developed COMPASS, an AI system that creates patient-specific digital twins to predict radiation toxicity in lung cancer patients. By analyzing real-time treatment data, it identifies early warning signs days before clinical symptoms appear, enabling truly adaptive radiotherapy.

70% relevant

Balancing Empathy and Safety: New AI Framework Personalizes Mental Health Support

Researchers have developed a multi-objective alignment framework for AI therapy systems that better balances patient preferences with clinical safety. The approach uses direct preference optimization across six therapeutic dimensions, achieving superior results compared to single-objective methods.

72% relevant

Multimodal RAG System for Chest X-Ray Reports Achieves 0.95 Recall@5, Reduces Hallucinations with Citation Constraints

Researchers developed a multimodal retrieval-augmented generation system for drafting radiology impressions that fuses image and text embeddings. The system achieves Recall@5 above 0.95 on clinically relevant findings and enforces citation coverage to prevent hallucinations.

99% relevant

Health AI Benchmarks Show 'Validity Gap': 0.6% of Queries Use Raw Medical Records, 5.5% Cover Chronic Care

Analysis of 18,707 health queries across six public benchmarks reveals a structural misalignment with clinical reality. Benchmarks over-index on wellness data (17.7%) while under-representing lab values (5.2%), imaging (3.8%), and safety-critical scenarios.

77% relevant

Gastric-X: New 1.7K-Case Multimodal Benchmark Challenges VLMs on Realistic Gastric Cancer Diagnosis Workflow

Researchers introduce Gastric-X, a comprehensive multimodal benchmark with 1.7K gastric cancer cases including CT scans, endoscopy, lab data, and expert notes. It evaluates VLMs on five clinical tasks to test whether they can correlate biochemical signals with tumor features as physicians do.

77% relevant

Palantir CTO: AI Is the 'Antidote' to 20th-Century Management

Palantir CTO Shyam Sankar stated that AI will act as an 'antidote' to the 20th-century managerial revolution, shifting power from middle management to frontline decision-makers. This reflects Palantir's core product philosophy for its AIP platform.

75% relevant

AI Model Analyzes Blood Proteins to Diagnose Alzheimer's, Parkinson's, ALS, and Stroke with 17,187-Patient Study

An AI model can diagnose Alzheimer's, Parkinson's, ALS, frontotemporal dementia, and stroke from a single blood sample by analyzing protein profiles. It outperformed symptom-based diagnosis at predicting future cognitive decline in a Nature-published study of 17,187 people.

97% relevant

Eli Lilly Signs $2.75B AI Drug Discovery Deal with Insilico Medicine

Eli Lilly has entered a $2.75 billion licensing pact with Insilico Medicine for multiple AI-discovered drug programs. The deal includes an upfront payment, milestones, and royalties, marking a major validation for AI-driven pharmaceutical R&D.

95% relevant

Neko Health Launches $400 AI-Powered Full-Body Health Scans in New York This Spring

Neko Health, the $1.8B startup founded by Spotify's Daniel Ek, is launching its AI-driven full-body health screening service in the US. The $400 scan uses imaging and blood tests to screen for cancer, heart disease, and diabetes risk, though medical experts are divided on its efficacy.

85% relevant

Claude AI Diagnoses Positional Headache in Complex Medical Case After Specialists Failed

A 62-year-old patient with multiple chronic conditions and positional headaches received a correct diagnosis and treatment plan from Claude AI after years of unsuccessful specialist visits. The $317 CPAP machine it recommended resolved the previously unexplained condition.

85% relevant

Meta's TRIBE v2 Predicts Brain Activity from fMRI Data, Surpassing Real Scan Accuracy

Meta released TRIBE v2, a foundation model trained on 500+ hours of fMRI data from 700+ people. It predicts a new person's brain responses to sensory input without retraining, reportedly exceeding the accuracy of a real brain scan.

95% relevant

Revieve Launches AI Skin Advisor for ChatGPT, Expanding Generative AI Beauty Discovery

Beauty tech platform Revieve launches an AI Skin Advisor as a ChatGPT plugin, enabling conversational skin analysis and product discovery. This represents a strategic expansion into generative AI platforms for beauty brands and retailers.

100% relevant

FedAgain: Dual-Trust Federated Learning Boosts Kidney Stone ID Accuracy to 94.7% on MyStone Dataset

Researchers propose FedAgain, a trust-based federated learning framework that dynamically weights client contributions using benchmark reliability and model divergence. It achieves 94.7% accuracy on kidney stone identification while maintaining robustness against corrupted data from multiple hospitals.

79% relevant