cli
30 articles about cli in AI news
CLI-Universe: Qwen3-32B fine-tuned on 6K trajectories beats models 10x larger on Terminal-Bench 2.0
CLI-Universe synthesizes terminal-agent tasks; Qwen3-32B fine-tuned on 6K trajectories hits 33.4% on Terminal-Bench 2.0, beating models 10x larger.
Cline v4.0.0 Ships Plugin Marketplace
Cline v4.0.0 introduces a plugin marketplace, queued prompts, and an SDK rewrite. Claude Code users get new extensibility and reliability features.
MCP Server Versioning: How to Avoid Breaking All Your AI Clients (Like I
Stop breaking AI clients with MCP schema changes. Use query-parameter versioning (?v=2): it works with every MCP client, requires no client code changes, and lets old and new versions coexist.
Namecom-CLI Ships Agent Skill for Claude Code DNS Management
Namecom-CLI is an open-source, agent-friendly CLI for Name.com DNS with a bundled Claude Code skill, enabling AI agents to manage DNS records idempotently via the v4 API.
AI Generates Chest X-Rays Clinicians Cannot Tell Apart From Real Ones
RadiT XL, a 1.3B-parameter rectified flow transformer trained on 1.2 million chest radiographs, produces synthetic images that clinical experts cannot reliably distinguish from real ones. The result is a milestone that could break the data bottleneck limiting medical AI fairness and generalization.
General LLMs Beat Clinical AI Tools in Doctor Study
Frontier LLMs beat clinical AI tools like OpenEvidence in all evaluations, matching Google Search AI Overview.
Clinical LLM Rejection Predictor Hits AUROC 0.719 in 4.5-Month Study
Clinical LLM rejection predictor achieves AUROC 0.719 in 4.5-month study using deployment-specific context to forecast user rejection before response generation.
How to Cut Agent Token Waste: CLI Over GraphQL + Server-Pushed Hints
Replace raw GraphQL with typed CLI commands to eliminate JSON assembly errors, then add server-pushed hints via MCP to prevent judgment failures. Your agent burns 1,500+ tokens per operation otherwise.
LTX Studio Turns AI Video Clips Into Editable Scenes
LTX Studio paired with LTX-2.3 lets users edit AI video scenes, not just generate clips. This shifts AI video from a demo to a production tool.
MLLM Raters Show Central Tendency Bias in Clinical Scoring
A study finds that GPT-5 and other MLLMs show central tendency bias in clinical scoring, compressing predictions toward the scale midpoint despite prompt modifications.
Google DeepMind Launches Real-Time Video AI Co-Clinician
Google DeepMind launched AI Co-Clinician, a real-time video analysis system for triadic care, claiming 30% fewer diagnostic errors in early tests.
GPT-5.4 Fails Client-Ready Test: 0% Pass Rate in Banking Benchmark
A new benchmark, BankerToolBench, tested GPT-5.4, Claude Opus 4.6, and others on junior investment banker tasks. None of the outputs were deemed client-ready, with GPT-5.4 leading but still failing nearly half the criteria.
Apple Releases DFNDR-12M Dataset, Claims 5x CLIP Training Efficiency
Apple has open-sourced DFNDR-12M, a multimodal dataset of 12.8 million image-text pairs with synthetic captions and pre-computed embeddings. The company claims it enables up to 5x training efficiency over standard CLIP datasets.
MIT, Harvard Studies Link AI Use to Declining Critical Thinking in Youth
Research from MIT and Harvard indicates that AI usage is correlated with a significant decline in critical thinking and creativity scores among 17- to 25-year-olds, with 67% of students acknowledging the negative impact.
MCP vs CLI: The Hidden War for AI Agent Tool Integration
A fundamental architectural debate pits Anthropic's standardized Model Context Protocol (MCP) against traditional CLI execution for AI agent tool use. The choice between safety/standardization (MCP) and flexibility/speed (CLI) will shape enterprise AI deployment.
Gemini CLI Launches Subagents with Isolated Context & Custom Instructions
The Gemini CLI tool has launched a 'Subagents' feature, allowing users to run multiple specialized AI agents concurrently, each with its own isolated context and system prompt. This enables more complex, modular workflows by preventing instruction bleed between tasks.
Kering Reports Q1 2026 Revenue Decline as Gucci Sales Fall 14%
Luxury group Kering reported a 6% year-on-year revenue decline to €3.5bn in Q1 2026. The drop was driven by a 14% fall in Gucci sales, with declines in Asia-Pacific and Western Europe offsetting North American growth. CEO Luca de Meo called it a 'first step in our recovery' as a comprehensive brand reset continues.
HeyGen Launches CLI Tool for AI Video Generation from Terminal
AI video platform HeyGen has launched a CLI tool, allowing users to generate videos with avatars, voice, and script via terminal commands. This moves video synthesis from a web dashboard into developer workflows.
MiniMax Open-Sources Three Agent Music Skills for MMX-CLI
MiniMax has open-sourced three 'Music Skills' for its MMX-CLI agent platform. The skills allow AI agents to generate music, sing in a persona, and curate playlists from a user's local library.
AMD AI Director Reports Claude Code Quality Decline, Cites 234k Tool Calls
An AMD AI executive presented data from over 6,800 sessions showing Claude Code's performance has declined since early March, with rising instances of shallow reasoning and incomplete tasks. This raises significant trust issues for engineers using the model in complex development workflows.
MiniMax Launches MMX-CLI, First Infrastructure Built for AI Agents
MiniMax released MMX-CLI, a CLI built for AI agents, not humans. It provides agents with seven multimodal 'senses' and native integration with popular AI coding environments.
PetClaw Launches One-Click Desktop AI Agent, Aims to Fix OpenClaw Setup Woes
A new tool called PetClaw promises a fully functional AI desktop agent in under 60 seconds with one click, no API keys, and no terminal configuration. This directly targets the primary user complaint about its powerful but notoriously difficult-to-setup predecessor, OpenClaw.
Kerf-CLI: The SQLite-Powered Cost Dashboard Every Claude Code User Needs
Install Kerf-CLI to track Claude Code spending, enforce budgets, and identify wasted Opus spend with a local SQLite database and polished dashboard.
FDA-Designated AI 'Vox' Detects Heart Failure from 5-Second Voice Clip
An AI tool named Vox can detect signs of worsening heart failure from a 5-second patient voice clip. It is trained on more than 3 million voice samples and backed by five clinical trials, targeting a condition affecting 64 million people globally.
Simon Willison's 'scan-for-secrets' CLI Tool Detects API Keys in Logs
Simon Willison built 'scan-for-secrets', a Python CLI tool for scanning log files for accidentally exposed API keys. It's a lightweight utility for developers to sanitize data before sharing.
Inner Ear Gene Therapy Injection Reverses Deafness in All 10 Patients in Clinical Trial
A clinical trial has reported that a single injection of gene therapy into the inner ear reversed deafness in all ten participating patients. This marks a significant milestone in treating genetic hearing loss, with some patients regaining hearing within weeks.
DISCO-TAB: Hierarchical RL Framework Boosts Clinical Data Synthesis by 38.2%, Achieves JSD < 0.01
Researchers propose DISCO-TAB, a reinforcement learning framework that guides a fine-tuned LLM with multi-granular feedback to generate synthetic clinical data. It improves downstream classifier utility by up to 38.2% versus GAN/diffusion baselines and achieves near-perfect statistical fidelity (JSD < 0.01).
Open-Source 'Codex CLI' Emerges as Free Alternative to OpenAI's Tools, Claims 30-Agent Architecture
An open-source project called 'Codex CLI' has been released, offering a free command-line interface that its creators claim outperforms OpenAI's offerings by coordinating 30 specialized AI agents for coding tasks.
TPC-CMA Framework Reduces CLIP Modality Gap by 82.3%, Boosts Captioning CIDEr by 57.1%
Researchers propose TPC-CMA, a three-phase fine-tuning curriculum that reduces the modality gap in CLIP-like models by 82.3%, improving clustering ARI from 0.318 to 0.516 and captioning CIDEr by 57.1%.
pixcli: The First MCP Server for Brazil's Pix Payments (Install It Now)
A new Rust CLI with a built-in MCP server lets Claude Code agents create Pix charges, check payments, and manage webhooks, automating Brazilian payment workflows.