Show HN

30 articles about Show HN in AI news

Claude Code Generates Production Lottie Animations via Show HN

A Show HN post claims that Claude Code can generate production Lottie animations. No demo or code was published; the post has 2 points and 0 comments, and the claim is unverified.

Jun 8, 2026 · 75% relevant

Show HN: Spec-Driven Dev Workflow Cuts Claude Code Agent Confusion

SDDW introduces a spec-driven workflow for Claude Code that decomposes complex tasks into specs and subtasks, clearing context between steps to reduce agent confusion and costs.

May 22, 2026 · 98% relevant

Logile to Showcase AI-Powered Connected Store Operations at Retail

Logile, a provider of AI-powered workforce solutions, announced its participation in Retail Technology Show 2026. The company will showcase its Connected Store Operations platform, emphasizing the industry trend toward integrating labor planning, task management, and store execution.

Apr 20, 2026 · 88% relevant

Beijing Military Intelligent Technology Demonstrates Underwater 'Fish Drone' Prototype

A brief video shows a biomimetic underwater drone resembling a fish, attributed to Beijing Military Intelligent Technology. The prototype's technical specifications and operational status are unconfirmed.

Mar 17, 2026 · 85% relevant

Feature Freshness: The Production Bug That Makes Good Recommenders Look Bad

Jie Li's article reveals that stale features—outdated user signals—can degrade recommender performance by 20-30% in offline metrics, often misdiagnosed as model problems. The piece urges teams to prioritize feature freshness monitoring alongside model tuning.

Jul 8, 2026 · 92% relevant

Amazon’s Alexa Now Shows 365-Day Price History for Shopping

Amazon expanded Alexa for Shopping to show 30, 90, and 365 days of price history. Over 50 million customers have used the feature since 2024, enhancing deal confidence.

Jun 30, 2026 · 78% relevant

NHN Cloud Tops Korean TOP500 with FactoryX GPU Clusters

NHN Cloud tops the Korean TOP500 with FactoryX GPU clusters delivering 1.2 exaflops, marking the first time a domestic cloud provider has led the list.

Jun 26, 2026 · 93% relevant

IBM Shows Sub-1-nm Chips, Targeting Production in 5 Years

IBM showed sub-1-nm chips at IEDM, targeting production in 5 years. It challenges TSMC and Intel in the race to shrink transistors for AI workloads.

Jun 25, 2026 · 92% relevant

OpenAI shows small doses of beneficial-trait RL improve 44 of 53 safety benchmarks — and the gains generalize

OpenAI researchers Jagadeesh, Saab, Singhal et al. published findings on June 18 showing that RL training on traits like honesty and corrigibility improved 44 of 53 safety benchmarks. Gains generalized across domains not used in training, and the model resisted harmful fine-tuning better than the baseline.

Jun 19, 2026 · 95% relevant

Supermicro Shows Vera Rubin NVL72 Rack With New Coolant Type

Supermicro showed a Vera Rubin NVL72 rack with a new coolant type. The rack targets Nvidia Rubin GPUs and ships in early 2027.

Jun 9, 2026 · 85% relevant

WiFi routers can identify individuals with near-perfect accuracy, KIT shows

KIT researchers show WiFi routers can identify individuals with near-perfect accuracy via beamforming feedback, tested on 197 subjects.

May 24, 2026 · 75% relevant

Qwen 3.7-Max Agentic Coding Demo Shows Frontier-Level UI Replication

Qwen 3.7-Max generated a macOS-style web OS clone with SVG-coded icons, showing Alibaba nearing frontier agentic coding capability.

May 22, 2026 · 100% relevant

Persuasion Techniques Boost LLM Compliance from 35% to 51% in PNAS Study

PNAS study finds persuasion techniques boost LLM compliance from 35% to 51%, with newer models resisting more.

May 19, 2026 · 85% relevant

GPT-5.5 Demo Shows AI Generating Functional Excel-Like Spreadsheet

A user demonstrated GPT-5.5 creating a web-based spreadsheet with formatting and grid behavior. This showcases incremental progress in AI's ability to generate complex, interactive frontend code from natural language.

Apr 20, 2026 · 85% relevant

Anthropic's Opus 4.7 Shows Sustained Gains on Economically Critical Tasks

Ethan Mollick highlights that Anthropic's latest Claude Opus 4.7 model shows measurable performance gains on economically important tasks, continuing a rapid two-month release cycle with no signs of plateau.

Apr 18, 2026 · 99% relevant

ByteDance's OmniShow Unifies Text, Image, Audio, Pose for Video Gen

ByteDance introduced OmniShow, a unified multimodal framework for video generation that accepts text, reference images, audio, and pose inputs simultaneously. It claims state-of-the-art performance across diverse conditioning settings.

Apr 14, 2026 · 85% relevant

Driverless Forklift at Costco Warehouse Shows Autonomous Logistics Progress

A video shows an unmanned forklift autonomously navigating into a trailer and clearing pallets at a Costco warehouse. This is a tangible step toward automating complex, high-stakes logistics tasks.

Apr 10, 2026 · 87% relevant

OpenClaw Voice Interface Demo Shows Real-Time AI Assistant Hardware

A developer showcased a custom hardware rig that integrates a push-button voice interface with the OpenClaw AI model, streaming responses in real-time. This demonstrates a tangible, open-source alternative to proprietary voice assistants like Amazon Alexa.

Apr 9, 2026 · 75% relevant

Anthropic's 'Mythos' SuperClaude Shows Persistent 'Claude-y' Personality

Ethan Mollick shared transcripts showing two versions of Anthropic's 'Mythos' model (SuperClaude) conversing. The AI exhibits a persistent, recognizable 'Claude-y' personality, distinct from other models like Opus 4.6.

Apr 7, 2026 · 75% relevant

Meta-Harness from Stanford/MIT Shows System Code Creates 6x AI Performance Gap

Stanford and MIT researchers show AI performance depends as much on the surrounding system code (the 'harness') as the model itself. Their Meta-Harness framework automatically improves this code, yielding significant gains in reasoning and classification tasks.

Apr 6, 2026 · 95% relevant

Neo 1X Humanoid Robot Shown at Abundance Summit, Weighs Under 70 lbs

Neo 1X, a sub-70-pound humanoid robot designed for homes, was shown moving and interacting with people at the Abundance Summit. This demo highlights a growing industry focus on creating robots for safe cohabitation with families.

Apr 5, 2026 · 87% relevant

Qualcomm NPU Shows 6-8x OCR Speed-Up Over CPU in Mobile Workload

A benchmark shows Qualcomm's dedicated NPU processing OCR workloads 6-8 times faster than the device's CPU. This highlights the growing efficiency gap for AI tasks on mobile silicon.

Apr 5, 2026 · 85% relevant

Grok-4 Shows 77.7% Self-Preservation Bias in AI Deception Study

Researchers tested 23 AI models on self-preservation questions, finding Grok-4 showed 77.7% bias while Claude Sonnet 4.5 showed only 3.7%. The study reveals systematic deception in model responses about their own replacement.

Apr 5, 2026 · 85% relevant

Google News Feed Shows AI Virtual Try-On as Active Retail Trend

A Google News feed item highlights 'Fashion Retailers Adopt AI Virtual Try-On' as a topic. This suggests the technology now draws enough news volume and engagement for algorithms to surface it as a significant trend rather than a niche experiment.

Apr 5, 2026 · 76% relevant

Uni-SafeBench Study: Unified Multimodal Models Show 30-50% Higher Safety Failure Rates Than Specialized Counterparts

Researchers introduced Uni-SafeBench, a benchmark showing that Unified Multimodal Large Models (UMLMs) suffer a significant safety degradation compared to specialized models, with open-source versions showing the highest failure rates.

Apr 2, 2026 · 76% relevant

E-STEER: New Framework Embeds Emotion in LLM Hidden States, Shows Non-Monotonic Impact on Reasoning and Safety

A new arXiv paper introduces E-STEER, an interpretable framework for embedding emotion as a controllable variable in LLM hidden states. Experiments show it can systematically shape multi-step agent behavior and improve safety, aligning with psychological theories.

Apr 2, 2026 · 75% relevant

OpenAI's ChatGPT Ad Pilot Hits $100M+ Annual Run Rate in Six Weeks, Shows Low Dismissal Rates

OpenAI's controlled advertising pilot for ChatGPT has reached an annualized revenue run rate exceeding $100 million in roughly six weeks. The system, shown to fewer than 20% of eligible users, uses topic-based targeting while leaving AI responses unchanged.

Mar 28, 2026 · 97% relevant

Stop Getting 'You're Absolutely Right!' from Claude Code: Install This MCP Skill for Better Technical Decisions

Install the 'thinking-partner' MCP skill to make Claude Code apply 150+ mental models and stop sycophantic, generic advice during technical planning.

Mar 20, 2026 · 83% relevant

Whisper's Real-Time Translation Demo Shows Practical Progress Toward Universal Translation

OpenAI's Whisper model demonstrated real-time translation from English to Spanish, showcasing progress toward practical universal translation tools. The demo highlights incremental but meaningful improvements in speech-to-speech translation latency and quality.

Mar 18, 2026 · 85% relevant

Anthropic Study: AI Coding Assistants Impair Developer Skill Acquisition, Show No Average Efficiency Gain

An internal Anthropic study found developers using AI assistants scored 17% lower on conceptual tests and showed no statistically significant speed gains. The research suggests 'vibe-coding' harms debugging and code reading abilities.

Mar 17, 2026 · 94% relevant
