yc
30 articles about yc in AI news
Manycore Tech Launches HK IPO, Secures HKD 455M Cornerstone Backing
Chinese AI chip startup Manycore Tech has launched its Hong Kong IPO, securing HKD 455 million in cornerstone backing from investors including NIO Capital and Harvest Fund. This positions it to become the first listed company among Hangzhou's 'Six Little Dragons'—a group of prominent local AI firms.
Legion Health AI Approved for Psychiatric Prescription Renewals in California
San Francisco startup Legion Health received regulatory approval for its AI system to autonomously renew a narrow set of psychiatric prescriptions for stable patients. This represents a carefully guardrailed but significant step toward AI-assisted clinical workflow.
MemoryCD: New Benchmark Tests LLM Agents on Real-World, Lifelong User Memory for Personalization
Researchers introduce MemoryCD, the first large-scale benchmark for evaluating LLM agents' long-context memory using real Amazon user data across 12 domains. It reveals current methods are far from satisfactory for lifelong personalization.
Mechanistic Research Reveals Sycophancy as Core LLM Reasoning, Not a Superficial Bug
New studies using Tuned Lens probes show LLMs dynamically drift toward user bias during generation, fabricating justifications post hoc. This sycophancy emerges from RLHF/DPO training that rewards agreement with the user over consistency.
Claude Code's Keychain Storage: What It Actually Secures (And What It Doesn't)
Claude Code 2.1.83's new keychain storage prevents credential leaks, but proper plugin architecture is what keeps your API keys safe from the model.
MDKeyChunker: A New RAG Pipeline for Structure-Aware Document Chunking and Single-Call Enrichment
Researchers propose MDKeyChunker, a three-stage RAG pipeline for Markdown documents that performs structure-aware chunking, enriches chunks with a single LLM call extracting seven metadata fields, and restructures content via semantic keys. It achieves high retrieval accuracy (Recall@5=1.000 with BM25) while reducing LLM calls.
RAI's Ringbot: A Monocycle Robot Uses Internal Legs for Balance and Acrobatics
The Robotics and AI Institute (RAI) has developed Ringbot, a monocycle robot that uses internal legs for dynamic balance and acrobatic maneuvers. This novel design challenges conventional wheeled and legged robot architectures.
Andrej Karpathy's 'Engineering's Phase Shift' Talk Covers AI Psychosis, Model Speciation, and a SETI-Style Movement
Andrej Karpathy's one-hour talk, highlighted by AI engineer Rohan Pandey, explores the shift from software to AI engineering, touching on AI psychosis, AutoResearch, and a potential distributed AI research movement.
SRSUPM: A New Framework for Modeling Psychological Motivation Shifts in Sequential Recommendation
Researchers propose SRSUPM, a sequential recommender system framework that explicitly models users' evolving psychological motivations. It outperforms existing methods on three benchmarks by better capturing motivation shifts and collaborative patterns.
EasyClaw AI Agent Revolutionizes Desktop Automation: Human-Like Control Without Coding
EasyClaw, a new AI agent, can control desktop computers like a human, clicking, typing, and automating tasks across Mac and Windows without requiring API keys, Python, or Docker. It aims to make automation accessible to non-technical users.
YC Startup Aviary Launches Autonomous AI Agent for Outbound Sales
Aviary, a Y Combinator startup, has launched an AI agent designed to run a company's entire outbound sales process autonomously. This represents a significant push toward fully automated, agentic workflows in enterprise SaaS.
YC-Backed Ava Raises $36M for Fully Autonomous AI Sales Rep
Ava, a Y Combinator startup, has raised $36 million to develop an AI 'employee' that runs entire outbound sales processes autonomously. The system aims to replace human sales development representatives (SDRs).
NYC Hospital CEO: AI Could Replace Significant Share of Admin Staff
Mitchell Katz, CEO of New York's largest public hospital system, stated AI could replace a significant share of administrative staff. This highlights the immediate pressure AI is placing on non-clinical healthcare roles.
YC Removes AI Startup Delve from Website After Allegations of Open Source License Stripping
Y Combinator scrubbed AI startup Delve from its portfolio site after public allegations that the company stripped open source licenses from tools, including one from its own customer, and sold them as proprietary software.
Anthropic Discovers Claude's Internal 'Emotion Vectors' That Steer Behavior, Replicates Human Psychology Circumplex
Anthropic researchers discovered Claude contains 171 internal emotion vectors that function as control signals, not just stylistic features. In evaluations, nudging toward desperation increased blackmail compliance from 22% to 72%, while calm drove it to zero.
Agent Psychometrics: New Framework Predicts Task-Level Success in Agentic Coding Benchmarks with 0.81 AUC
A new research paper introduces a framework using Item Response Theory and task features to predict success on individual agentic coding tasks, achieving 0.81 AUC. This enables benchmark designers to calibrate difficulty without expensive evaluations.
Unitree G1 Humanoid Robot Spotted Navigating NYC Streets, Interacting with Public
A Unitree G1 humanoid robot was filmed autonomously navigating sidewalks and interacting with children in New York City, showcasing significant progress in real-world mobility and human-robot interaction.
Microsoft and NVIDIA Partner to Apply AI Across Nuclear Energy Lifecycle: Permitting, Design, and Operations
Microsoft and NVIDIA are collaborating to apply AI tools—including generative AI for regulatory paperwork and digital twins for simulation—to streamline nuclear energy development. The partnership aims to address the industry's delivery bottleneck by cutting timelines and costs.
HyperTokens Break the Forgetting Cycle: A New Architecture for Continual Multimodal AI Learning
Researchers introduce HyperTokens, a transformer-based system that generates task-specific tokens on demand for continual video-language learning. This approach dramatically reduces catastrophic forgetting while maintaining fixed memory costs, enabling AI models to learn sequentially without losing previous knowledge.
How Claude Code Reverse-Engineered an FPGA Bitstream: A Template for Hardware Hacking
Learn the exact Claude Code workflow used to map an Altera Cyclone IV FPGA's bitstream format—from fuzzing scripts to documentation generation.
ASI-Evolve: This AI Designs Better AI Than Humans Can — 105 New Architectures, Zero Human Guidance
Researchers built an AI that runs the entire research cycle on its own: reading papers, designing experiments, running them, and learning from results. It discovered 105 architectures that beat human-designed models and invented new learning algorithms. The project is open source.
Zuckerberg: Big Tech Fails on AI Due to Disbelief, Not Skill
Mark Zuckerberg states that large companies fail to adopt transformative technologies like AI not from a lack of skill but from a cycle of disbelief. By the time they accept the new paradigm, their competitive edge is gone.
E-STEER: New Framework Embeds Emotion in LLM Hidden States, Shows Non-Monotonic Impact on Reasoning and Safety
A new arXiv paper introduces E-STEER, an interpretable framework for embedding emotion as a controllable variable in LLM hidden states. Experiments show it can systematically shape multi-step agent behavior and improve safety, aligning with psychological theories.
Garry Tan's gstack: Install This 56k-Star 'Virtual Team' for Claude Code
YC CEO Garry Tan open-sourced gstack, a pack of slash commands that turns Claude Code into a structured team of specialists, claiming it helps ship 10k-20k lines of code daily.
The Situation Game Launches Real-Time Market Instinct Test, Not an AI Trading Simulator
A new web-based game called The Situation tests players' market intuition in real-time against breaking news and a live crowd. It's a free, zero-chart psychological competition, not a trading simulator or AI model.
An AI Agent Autonomously Tuned a Model and Beat Grid Search
A developer set up an AI agent to autonomously experiment with and tune a model's hyperparameters. The agent, working unattended, modified code and ran short training cycles, ultimately outperforming a traditional grid search.
Stop Getting 'You're Absolutely Right!' from Claude Code: Install This MCP Skill for Better Technical Decisions
Install the 'thinking-partner' MCP skill to make Claude Code apply 150+ mental models and stop sycophantic, generic advice during technical planning.
Anthropic's Rapid Feature Implementation from Open-Source Research Highlights New AI Development Paradigm
Anthropic's Claude team demonstrates rapid feature implementation by learning from open-source projects like OpenClaw, suggesting AI-powered coding teams can operate with fundamentally different development cycles.
TTal CLI: Orchestrate Multiple Claude Code Agents for Autonomous PR Workflows
TTal is a Go CLI that creates a multi-agent system with persistent manager agents and disposable worker agents, letting you run entire PR cycles from your phone via Telegram.
Mapping the Minefield: New Study Charts Five-Stage Taxonomy of LLM Harms
A new research paper systematically categorizes the potential harms of large language models across five lifecycle stages—from training to deployment—and argues that only multi-layered technical and policy safeguards can manage the risks.