specifications

30 articles about specifications in AI news

CAST: A New Framework for Semantic-Level Complementary Recommendations

Researchers propose CAST, a sequential recommendation framework that models transitions between discrete item semantic codes (e.g., specifications) and injects LLM-verified complementary knowledge. It achieves significant performance gains by moving beyond simplistic co-purchase statistics to capture genuine complementarity.

Apr 22, 2026 · 78% relevant

How Spec-Driven Development Cuts Claude Code Review Time by 80%

A developer's experiment shows that writing formal, testable specifications in plain English before coding reduces Claude Code hallucinations and eliminates manual verification of every generated line.

Apr 3, 2026 · 95% relevant

Stop Reviewing AI Code. Start Reviewing CLAUDE.md.

Anthropic's research shows the bottleneck is verification, not generation. Shift your Claude Code workflow from writing prompts to writing precise, testable specifications.

Mar 30, 2026 · 70% relevant

Meta's 'Avocado' AI Project Teased on Social Media, Details Remain Unclear

A cryptic social media post suggests Meta is preparing to announce an AI project codenamed 'Avocado.' No technical specifications, release timeline, or purpose have been revealed.

Mar 29, 2026 · 85% relevant

Beijing Military Intelligent Technology Demonstrates Underwater 'Fish Drone' Prototype

A brief video shows a biomimetic underwater drone resembling a fish, attributed to Beijing Military Intelligent Technology. The prototype's technical specifications and operational status are unconfirmed.

Mar 17, 2026 · 85% relevant

The AI Night Shift: How Programmers Are Deploying Autonomous Agents to Invent Code While They Sleep

Former Google CEO Eric Schmidt reveals how programmers are using AI agents to work overnight shifts, writing specifications before bed and waking to discover fully functional UIs and code generated autonomously.

Mar 11, 2026 · 85% relevant

Benchmarking Crisis: Audit Reveals MedCalc-Bench Flaws, Calls for 'Open-Book' AI Evaluation

A new audit of the MedCalc-Bench clinical AI benchmark reveals over 20 implementation errors and shows that providing calculator specifications at inference time boosts accuracy dramatically, suggesting the benchmark measures formula memorization rather than clinical reasoning.

Mar 4, 2026 · 75% relevant

VeRA Framework Transforms AI Benchmarking from Static Tests to Dynamic Intelligence Probes

Researchers introduce VeRA, a novel framework that converts static AI benchmarks into executable specifications capable of generating unlimited verified test variants. This approach addresses contamination and memorization issues in current evaluation methods while enabling cost-effective creation of challenging new tasks.

Feb 17, 2026 · 75% relevant

Hasan Toor Announces 'First AI Sales Tool That Does the Whole Job' in Cryptic Tweet

AI influencer Hasan Toor posted a tweet claiming a new AI sales tool is the first to handle the entire sales job, not just data or enrichment. No product name, company, or technical specifications were provided.

Apr 3, 2026 · 89% relevant

NVIDIA Drops Fast-FoundationStereo: 10× Faster Depth Estimation

NVIDIA released Fast-FoundationStereo, a real-time foundation model for zero-shot stereo depth estimation that is 10× faster than FoundationStereo with matching accuracy.

Jun 26, 2026 · 85% relevant

Midjourney Plans 60-Second Ultrasound Spa in SF by 2027

Midjourney plans to open a spa in San Francisco by 2027 offering 60-second ultrasound scans, aiming to be 100x faster than MRI.

Jun 18, 2026 · 83% relevant

Tensordyne Claims 10x Efficiency Gain with Napier Architecture

Tensordyne claims its Napier generation delivers 10x the inference efficiency of Nvidia, but the claim comes without supporting data or verification.

Jun 18, 2026 · 85% relevant

Intel Omni-Path Resurfaces as InfiniBand Rival for DoE Supercomputers

Intel's Omni-Path interconnect, revived by Cornelis Networks, will connect DoE supercomputers at 400Gbps as an InfiniBand alternative.

Jun 16, 2026 · 90% relevant

Stanford, Meta 'Code as Agent Harness' Paper Rethinks AI Agent Design

Stanford and Meta's "Code as Agent Harness" paper proposes code-driven AI agent orchestration, potentially improving reliability over natural language prompts.

Jun 10, 2026 · 100% relevant

Fable 5: Claude's Biggest Leap Since Opus 4.5, Says Beta Tester

Beta tester says Fable 5 is Claude's biggest leap since Opus 4.5, with emergent debugging and design capabilities.

Jun 9, 2026 · 100% relevant

GitHub Spec Kit: Open-Source Tool to Fix Vibe Coding’s Core Flaw

GitHub released Spec Kit, an open-source toolkit that enforces specification-first workflows for AI coding, addressing vibe coding's tendency to generate code before requirements are clear.

Jun 7, 2026 · 85% relevant

Apple Readies 1.2T-Parameter Gemini Model for WWDC 2026

Apple will reveal a custom 1.2T-parameter Gemini model at WWDC 2026, with local and server-based inference. The integration marks Apple's entry into OS-level AI.

Jun 7, 2026 · 87% relevant

Microsoft's Project Solara Aims to Be Agent Infrastructure Backbone

Microsoft announced Project Solara, an agent infrastructure platform with two connectors. No pricing or timeline was disclosed.

Jun 2, 2026 · 89% relevant

Nvidia Unveils New Windows SoC, Targeting AI PCs

Nvidia announced a Windows SoC for AI PCs, per @mweinbach. The chip targets on-device inference, competing with Qualcomm and Intel.

Jun 1, 2026 · 100% relevant

Grounded Code: 10 principles to cut AI agent re-derivation cost

The final Grounded Code article proposes 10 principles across 3 clusters to reduce the re-derivation cost of AI coding agents, with one audit correction: a 3,110-line orchestrator file.

May 17, 2026 · 82% relevant

SalesSim: LLMs Score Below 79% on Retail Persona Alignment, RL Boosts 13.8%

SalesSim benchmarks MLLMs as retail customers; top models score below 79% on persona alignment. UserGRPO RL boosts alignment by 13.8%.

May 12, 2026 · 91% relevant

Spec Kit + Claude Code: Spec-First Dev Hits 90% First-Pass Acceptance

Spec Kit generates tests from plain-English specs, then Claude Code iterates until they pass, claiming 90% first-pass acceptance.

May 11, 2026 · 100% relevant

Cerebras Understates On-Chip SRAM by 8x, SemiAnalysis Notes

According to SemiAnalysis, Cerebras understates its on-chip SRAM by 8x, a rare case of under-specification in chip marketing.

May 7, 2026 · 75% relevant

Qualcomm Ships Hyperscaler Custom Silicon by December 2026

Qualcomm is developing custom silicon for an unnamed hyperscaler, with shipments expected December 2026, marking its most concrete data-center comeback move.

May 1, 2026 · 76% relevant

NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text

NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.

Apr 28, 2026 · 93% relevant

Vertiv Acquires Strategic Thermal Labs for Liquid Cooling

Vertiv acquired Strategic Thermal Labs to add cold plate design expertise to its liquid cooling portfolio, addressing the rising thermal demands of AI workloads in data centers.

Apr 28, 2026 · 70% relevant

Talkie: Vintage LLM Trained on 260B Pre-1931 English Tokens

Talkie is a new 'vintage language model' trained on 260 billion tokens of historical English text from before 1931, developed by a team including Alec Radford, co-author of the original GPT paper. It offers a unique linguistic artifact for NLP research.

Apr 28, 2026 · 85% relevant

MiniMax Music-2.6 Goes Free on Cloudflare This Week

MiniMax's Music-2.6 AI model is available for free on Cloudflare's platform this week, allowing users to generate full-length songs or instrumentals from text prompts.

Apr 27, 2026 · 75% relevant

Kinetix AI Teases KAI Humanoid Robot with 36 DOF, 18,000 Sensors

Kinetix AI has teased KAI, a humanoid robot with 36 degrees of freedom, hybrid dexterous hands, and 18,000 sensors, positioning it as the most human-like robotic system to date.

Apr 27, 2026 · 85% relevant

Utah Hyperscale Data Center to Exceed State Power Use

A hyperscale data center in Box Elder County, Utah, developed by Kevin O'Leary's O'Leary Digital, is set to generate and consume more power than the state itself, moving toward final approval.

Apr 26, 2026 · 100% relevant
