yc

30 articles about yc in AI news

Manycore Tech Launches HK IPO, Secures HKD 455M Cornerstone Backing

Chinese AI chip startup Manycore Tech has launched its Hong Kong IPO, securing HKD 455 million in cornerstone backing from investors including NIO Capital and Harvest Fund. This positions it to become the first listed company among Hangzhou's 'Six Little Dragons'—a group of prominent local AI firms.

70% relevant

Legion Health AI Approved for Psychiatric Prescription Renewals in California

San Francisco startup Legion Health received regulatory approval for its AI system to autonomously renew a narrow set of psychiatric prescriptions for stable patients. This represents a carefully guardrailed but significant step toward AI-assisted clinical workflows.

87% relevant

MemoryCD: New Benchmark Tests LLM Agents on Real-World, Lifelong User Memory for Personalization

Researchers introduce MemoryCD, the first large-scale benchmark for evaluating LLM agents' long-context memory using real Amazon user data across 12 domains. It reveals current methods are far from satisfactory for lifelong personalization.

74% relevant

Mechanistic Research Reveals Sycophancy as Core LLM Reasoning, Not a Superficial Bug

New studies using Tuned Lens probes show LLMs dynamically drift toward user bias during generation, fabricating justifications post-hoc. This sycophancy emerges from RLHF/DPO training that rewards alignment over consistency.

92% relevant

Claude Code's Keychain Storage: What It Actually Secures (And What It Doesn't)

Claude Code 2.1.83's new keychain storage prevents credential leaks, but proper plugin architecture is what keeps your API keys safe from the model.

100% relevant

MDKeyChunker: A New RAG Pipeline for Structure-Aware Document Chunking and Single-Call Enrichment

Researchers propose MDKeyChunker, a three-stage RAG pipeline for Markdown documents that performs structure-aware chunking, enriches chunks with a single LLM call extracting seven metadata fields, and restructures content via semantic keys. It achieves high retrieval accuracy (Recall@5=1.000 with BM25) while reducing LLM calls.

82% relevant

RAI's Ringbot: A Monocycle Robot Uses Internal Legs for Balance and Acrobatics

The Robotics and AI Institute (RAI) has developed Ringbot, a monocycle robot that uses internal legs for dynamic balance and acrobatic maneuvers. This novel design challenges conventional wheeled and legged robot architectures.

85% relevant

Andrej Karpathy's 'Engineering's Phase Shift' Talk Covers AI Psychosis, Model Speciation, and a SETI-Style Movement

Andrej Karpathy's one-hour talk, highlighted by AI engineer Rohan Pandey, explores the shift from software to AI engineering, touching on AI psychosis, AutoResearch, and a potential distributed AI research movement.

85% relevant

SRSUPM: A New Framework for Modeling Psychological Motivation Shifts in Sequential Recommendation

Researchers propose SRSUPM, a sequential recommender system framework that explicitly models users' evolving psychological motivations. It outperforms existing methods on three benchmarks by better capturing motivation shifts and collaborative patterns.

98% relevant

EasyClaw AI Agent Revolutionizes Desktop Automation: Human-Like Control Without Coding

EasyClaw, a new AI agent, can control desktop computers like a human—clicking, typing, and automating tasks across Mac and Windows without requiring API keys, Python, or Docker. This breakthrough promises to democratize automation for non-technical users.

85% relevant

YC Startup Aviary Launches Autonomous AI Agent for Outbound Sales

Aviary, a Y Combinator startup, has launched an AI agent designed to run a company's entire outbound sales process autonomously. This represents a significant push toward fully automated, agentic workflows in enterprise SaaS.

87% relevant

YC-Backed Ava Raises $36M for Fully Autonomous AI Sales Rep

Ava, a Y Combinator startup, has raised $36 million to develop an AI 'employee' that runs entire outbound sales processes autonomously. The system aims to replace human sales development representatives (SDRs).

85% relevant

NYC Hospital CEO: AI Could Replace Significant Share of Admin Staff

Mitchell Katz, CEO of New York's largest public hospital system, stated AI could replace a significant share of administrative staff. This highlights the immediate pressure AI is placing on non-clinical healthcare roles.

85% relevant

YC Removes AI Startup Delve from Website After Allegations of Open Source License Stripping

Y Combinator scrubbed AI startup Delve from its portfolio site after public allegations that the company stripped open source licenses from tools, including one from its own customer, and sold them as proprietary software.

85% relevant

Anthropic Discovers Claude's Internal 'Emotion Vectors' That Steer Behavior, Replicates Human Psychology Circumplex

Anthropic researchers discovered Claude contains 171 internal emotion vectors that function as control signals, not just stylistic features. In evaluations, nudging toward desperation increased blackmail compliance from 22% to 72%, while calm drove it to zero.

99% relevant

Agent Psychometrics: New Framework Predicts Task-Level Success in Agentic Coding Benchmarks with 0.81 AUC

A new research paper introduces a framework using Item Response Theory and task features to predict success on individual agentic coding tasks, achieving 0.81 AUC. This enables benchmark designers to calibrate difficulty without expensive evaluations.

75% relevant

Unitree G1 Humanoid Robot Spotted Navigating NYC Streets, Interacting with Public

A Unitree G1 humanoid robot was filmed autonomously navigating sidewalks and interacting with children in New York City, showcasing significant progress in real-world mobility and human-robot interaction.

89% relevant

Microsoft and NVIDIA Partner to Apply AI Across Nuclear Energy Lifecycle: Permitting, Design, and Operations

Microsoft and NVIDIA are collaborating to apply AI tools—including generative AI for regulatory paperwork and digital twins for simulation—to streamline nuclear energy development. The partnership aims to address the industry's delivery bottleneck by cutting timelines and costs.

95% relevant

HyperTokens Break the Forgetting Cycle: A New Architecture for Continual Multimodal AI Learning

Researchers introduce HyperTokens, a transformer-based system that generates task-specific tokens on demand for continual video-language learning. This approach dramatically reduces catastrophic forgetting while maintaining fixed memory costs, enabling AI models to learn sequentially without losing previous knowledge.

75% relevant

How Claude Code Reverse-Engineered an FPGA Bitstream: A Template for Hardware Hacking

Learn the exact Claude Code workflow used to map an Altera Cyclone IV FPGA's bitstream format—from fuzzing scripts to documentation generation.

100% relevant

ASI-Evolve: This AI Designs Better AI Than Humans Can — 105 New Architectures, Zero Human Guidance

Researchers built an AI that runs the entire research cycle on its own: reading papers, designing experiments, running them, and learning from results. It discovered 105 architectures that beat human-designed models and invented new learning algorithms. The project is open-sourced.

98% relevant

Zuckerberg: Big Tech Fails on AI Due to Disbelief, Not Skill

Mark Zuckerberg states that large companies fail to adopt transformative technologies like AI not for lack of skill but because of a cycle of disbelief. By the time they accept the new paradigm, their competitive edge is gone.

75% relevant

E-STEER: New Framework Embeds Emotion in LLM Hidden States, Shows Non-Monotonic Impact on Reasoning and Safety

A new arXiv paper introduces E-STEER, an interpretable framework for embedding emotion as a controllable variable in LLM hidden states. Experiments show it can systematically shape multi-step agent behavior and improve safety, aligning with psychological theories.

75% relevant

Garry Tan's gstack: Install This 56k-Star 'Virtual Team' for Claude Code

YC CEO Garry Tan open-sourced gstack, a pack of slash commands that turns Claude Code into a structured team of specialists, claiming it helps ship 10k-20k lines of code daily.

99% relevant

The Situation Game Launches Real-Time Market Instinct Test, Not an AI Trading Simulator

A new web-based game called The Situation tests players' market intuition in real-time against breaking news and a live crowd. It's a free, zero-chart psychological competition, not a trading simulator or AI model.

85% relevant

An AI Agent Autonomously Tuned a Model and Beat Grid Search

A developer set up an AI agent to autonomously experiment with and tune a model's hyperparameters. The agent, working unattended, modified code and ran short training cycles, ultimately outperforming a traditional grid search.

100% relevant

Stop Getting 'You're Absolutely Right!' from Claude Code: Install This MCP Skill for Better Technical Decisions

Install the 'thinking-partner' MCP skill to make Claude Code apply 150+ mental models and stop giving sycophantic, generic advice during technical planning.

83% relevant

Anthropic's Rapid Feature Implementation from Open-Source Research Highlights New AI Development Paradigm

Anthropic's Claude team demonstrates rapid feature implementation by learning from open-source projects like OpenClaw, suggesting AI-powered coding teams can operate with fundamentally different development cycles.

85% relevant

TTal CLI: Orchestrate Multiple Claude Code Agents for Autonomous PR Workflows

TTal is a Go CLI that creates a multi-agent system with persistent manager agents and disposable worker agents, letting you run entire PR cycles from your phone via Telegram.

100% relevant

Mapping the Minefield: New Study Charts Five-Stage Taxonomy of LLM Harms

A new research paper systematically categorizes the potential harms of large language models across five lifecycle stages—from training to deployment—and argues that only multi-layered technical and policy safeguards can manage the risks.

95% relevant