specifications
30 articles about specifications in AI news
GitHub Launches Spec-Kit: AI Tool Converts Natural Language Descriptions into Technical Specifications
GitHub released Spec-Kit, an open-source toolkit that uses AI to generate technical specifications, project plans, and code from natural language descriptions. It's designed to integrate with major AI coding agents.
CAST: A New Framework for Semantic-Level Complementary Recommendations
Researchers propose CAST, a sequential recommendation framework that models transitions between discrete item semantic codes (e.g., specifications) and injects LLM-verified complementary knowledge. It achieves significant performance gains by moving beyond simplistic co-purchase statistics to capture genuine complementarity.
How Spec-Driven Development Cuts Claude Code Review Time by 80%
A developer's experiment shows that writing formal, testable specifications in plain English before coding reduces Claude Code hallucinations and eliminates manual verification of every generated line.
Stop Reviewing AI Code. Start Reviewing CLAUDE.md.
Anthropic's research shows the bottleneck is verification, not generation. Shift your Claude Code workflow from writing prompts to writing precise, testable specifications.
Meta's 'Avocado' AI Project Teased on Social Media, Details Remain Unclear
A cryptic social media post suggests Meta is preparing to announce an AI project codenamed 'Avocado.' No technical specifications, release timeline, or purpose have been revealed.
Beijing Military Intelligent Technology Demonstrates Underwater 'Fish Drone' Prototype
A brief video shows a biomimetic underwater drone resembling a fish, attributed to Beijing Military Intelligent Technology. The prototype's technical specifications and operational status are unconfirmed.
The AI Night Shift: How Programmers Are Deploying Autonomous Agents to Invent Code While They Sleep
Former Google CEO Eric Schmidt reveals how programmers are using AI agents to work overnight shifts, writing specifications before bed and waking to discover fully functional UIs and code generated autonomously.
Benchmarking Crisis: Audit Reveals MedCalc-Bench Flaws, Calls for 'Open-Book' AI Evaluation
A new audit of the MedCalc-Bench clinical AI benchmark reveals over 20 implementation errors and shows that providing calculator specifications at inference time boosts accuracy dramatically, suggesting the benchmark measures formula memorization rather than clinical reasoning.
VeRA Framework Transforms AI Benchmarking from Static Tests to Dynamic Intelligence Probes
Researchers introduce VeRA, a novel framework that converts static AI benchmarks into executable specifications capable of generating unlimited verified test variants. This approach addresses contamination and memorization issues in current evaluation methods while enabling cost-effective creation of challenging new tasks.
Hasan Toor Announces 'First AI Sales Tool That Does the Whole Job' in Cryptic Tweet
AI influencer Hasan Toor posted a tweet claiming a new AI sales tool is the first to handle the entire sales job, not just data or enrichment. No product name, company, or technical specifications were provided.
Minimax Confirms Development of Multimodal Model 'm3' via Social Media Tease
AI company Minimax has confirmed it is developing a multimodal model, internally codenamed 'm3', through a social media post. No technical specifications, release date, or benchmarks were provided.
SalesSim: LLMs Score Below 79% on Retail Persona Alignment, RL Boosts 13.8%
SalesSim benchmarks MLLMs as retail customers; top models score below 79% on persona alignment. UserGRPO RL boosts alignment by 13.8%.
Spec Kit + Claude Code: Spec-First Dev Hits 90% First-Pass Acceptance
Spec Kit generates tests from plain-English specs, then Claude Code iterates until they pass, claiming 90% first-pass acceptance. (148 chars)
Cerebras Understates On-Chip SRAM by 8x, SemiAnalysis Notes
Cerebras understates on-chip SRAM by 8x per SemiAnalysis, a rare under-specification in chip marketing.
Qualcomm Ships Hyperscaler Custom Silicon by December 2026
Qualcomm is developing custom silicon for an unnamed hyperscaler, with shipments expected December 2026, marking its most concrete data-center comeback move.
NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text
NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.
Vertiv Acquires Strategic Thermal Labs for Liquid Cooling
Vertiv acquired Strategic Thermal Labs to add cold plate design expertise to its liquid cooling portfolio, addressing the rising thermal demands of AI workloads in data centers.
Talkie: Vintage LLM Trained on 260B Pre-1931 English Tokens
Talkie is a new 'vintage language model' trained on 260 billion tokens of historical English text from before 1931, developed by a team including Alec Radford, co-author of the original GPT paper. It offers a unique linguistic artifact for NLP research.
MiniMax Music-2.6 Goes Free on Cloudflare This Week
MiniMax's Music-2.6 AI model is available for free on Cloudflare's platform this week, allowing users to generate full-length songs or instrumentals from text prompts.
Kinetix AI Teases KAI Humanoid Robot with 36 DOF, 18,000 Sensors
Kinetix AI has teased KAI, a humanoid robot with 36 degrees of freedom, hybrid dexterous hands, and 18,000 sensors, positioning it as the most human-like robotic system to date.
Utah Hyperscale Data Center to Exceed State Power Use
A hyperscale data center in Box Elder County, Utah, developed by Kevin O'Leary's O'Leary Digital, is set to generate and consume more power than the state itself, moving toward final approval.
Oracle Nabs $16B for Michigan AI Data Center, Rivaling Google Cloud
Oracle has secured $16 billion in funding for a massive AI data center in rural Michigan, a move that pits it directly against Google Cloud and other hyperscalers in the race to build AI infrastructure.
Delegate Launches: An AI Agent You Hand Work To and Walk Away
A new AI agent called Delegate lets users assign work and walk away, with the agent handling execution autonomously. The launch signals a shift toward hands-off AI assistants that manage complex tasks independently.
Shopify Engineering details 'Flow generation through natural language'
Shopify Engineering describes a 2026 approach to generating complex workflows (flows) from natural language prompts using an agentic modeling framework, enabling non-technical users to create automation.
DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program
DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.
ROBOTIS Unveils AI Sapiens: 34 kg Humanoid with Dynamic Balance
ROBOTIS has introduced the AI Sapiens humanoid robot. The 34 kg platform is engineered to maintain balance during dynamic shifts and quick leg movements.
Arista Doubles 2026 AI Revenue Target to $3B+ on Open Ethernet
Arista Networks doubled its 2026 AI networking revenue target to over $3 billion, citing expanded roles for open Ethernet in AI data centers. This signals a major shift toward disaggregated, standards-based networking for AI clusters.
ECLASS-Augmented Semantic Product Search
Researchers systematically evaluated LLM-assisted dense retrieval for semantic product search on industrial electronic components. Augmenting embeddings with ECLASS hierarchical metadata created a crucial semantic bridge, achieving 94.3% Hit_Rate@5 versus 31.4% for BM25.
Microsoft's Fairwater AI Data Center Launches Early, Boosts Azure Capacity
Microsoft has launched its Fairwater AI data center ahead of schedule. The facility adds significant high-performance computing capacity to Azure's AI infrastructure, crucial for training and running large models.
AI Agents Show Consistent Economic Analysis, Reducing Human Disagreement
A new study finds AI agents like Claude Code and Codex produce economic analyses with far less disagreement than human teams, landing near the human median but with no extreme outliers. This indicates AI's potential for scalable, consistent research support.