skill discovery

30 articles about skill discovery in AI news

SSL: Structured Skill Language Boosts Skill Discovery MRR to 0.707

Researchers propose SSL, a three-layer typed JSON representation for AI agent skills, replacing unstructured SKILL.md prose. Using an LLM normalizer, SSL improves Skill Discovery MRR from 0.573 to 0.707 and Risk Assessment macro F1 from 0.744 to 0.787 on a newly released 6,184-skill corpus.

Apr 28, 2026 · 82% relevant
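For readers unfamiliar with the metric in the SSL summary above, here is a minimal sketch of how Mean Reciprocal Rank (MRR) is usually computed. The function name, the toy rankings, and the single-correct-answer setup are illustrative assumptions, not the paper's evaluation code.

```python
def mean_reciprocal_rank(ranked_results, targets):
    """Average of 1/rank of the correct item across queries.

    ranked_results: one ranked candidate list per query (best first).
    targets: the single correct item for each query (assumed setup).
    A query whose target is missing from its list contributes 0.
    """
    total = 0.0
    for candidates, target in zip(ranked_results, targets):
        if target in candidates:
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(targets)


# Toy data: the correct skill "a" is ranked 1st, 2nd, and 3rd.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
targets = ["a", "a", "a"]
mrr = mean_reciprocal_rank(rankings, targets)  # (1 + 1/2 + 1/3) / 3
```

Under this definition, a move from 0.573 to 0.707 means the correct skill sits noticeably closer to the top of the ranked list on average.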

Stop Thinking 'Progressive Disclosure' for Claude Skills — Think

A mental model shift from 'progressive disclosure' to 'progressive discovery' makes building Claude Skills more intuitive by clarifying Claude's active role in finding what it needs.

Apr 19, 2026 · 82% relevant

SciSpace Evolves: From AI Research Assistant to Full Workflow Platform with 'Skills'

SciSpace is expanding beyond its core AI tools for paper discovery and writing by introducing external app integrations and customizable 'Skills,' aiming to become a true all-in-one research workflow platform rather than just a collection of features.

Feb 25, 2026 · 85% relevant

Ctx2Skill: Self-Play Framework Lets LMs Discover Skills Without Labels

Ctx2Skill discovers skills from context via multi-agent self-play without labels. Outputs plug into any LM, targeting manual prompt engineering bottlenecks.

May 5, 2026 · 85% relevant

Anthropic's Claude Skills Implements 3-Layer Context Architecture to Manage Hundreds of Skills

Anthropic's Claude Skills framework employs a three-layer context management system that loads only skill metadata by default, enabling support for hundreds of specialized skills without exceeding context window limits.

Apr 3, 2026 · 85% relevant
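The layered loading described in the summary above can be sketched roughly as follows. Only "metadata loaded by default" comes from the summary; the other two layers (an instruction body and bundled resources), and every class, field, and method name here, are assumptions made for illustration and are not Anthropic's API.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str  # layer 1 (from the summary): always in context
    body: str         # layer 2 (assumed): loaded once the skill is selected
    resources: dict = field(default_factory=dict)  # layer 3 (assumed): on demand


class SkillRegistry:
    """Hypothetical registry that exposes each layer separately."""

    def __init__(self, skills):
        self._skills = {s.name: s for s in skills}

    def metadata_context(self):
        # Only names and descriptions are placed in the default context.
        return "\n".join(f"{s.name}: {s.description}" for s in self._skills.values())

    def load_body(self, name):
        return self._skills[name].body

    def load_resource(self, name, key):
        return self._skills[name].resources[key]


registry = SkillRegistry([
    Skill("pdf-report", "Build PDF reports", "LONG INSTRUCTIONS A", {"template": "T1"}),
    Skill("sql-audit", "Audit SQL queries", "LONG INSTRUCTIONS B"),
])
default_context = registry.metadata_context()
```

The default context grows with the number of skills only by one short line each, which is how such a design could accommodate hundreds of skills.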

How Weaviate Agent Skills Let Claude Code Build Vector Apps in Minutes

Weaviate's official Agent Skills give Claude Code structured access to vector databases, eliminating guesswork when building semantic search and RAG applications.

Mar 27, 2026 · 95% relevant

Boris Cherny's Claude Code Tips Are Now a Skill. Here Is What the Complete Collection Reveals.

A curated collection of expert Claude Code tips is now available as a shareable 'Skill,' revealing proven workflows for faster, more reliable agentic coding.

Mar 22, 2026 · 95% relevant

How Adding 'Skills' to MCP Tools Cuts Agent Token Usage by 87%

Adding structured 'skills' descriptions to MCP tools dramatically reduces token consumption in custom agents—here's how to implement it in your Claude Code workflows.

Mar 16, 2026 · 95% relevant

Build Self-Evolving Skills for Claude Code: The GitHub Pattern That Grows Smarter With Use

A new GitHub pattern shows how to create Claude Code Skills that learn from each session, preventing knowledge loss and reducing repetitive context.

Mar 15, 2026 · 85% relevant

Stanford-Princeton Team Open-Sources LabClaw: The 'Skill OS' for Scientific AI

Researchers from Stanford and Princeton have open-sourced LabClaw, a 'Skill Operating Layer' for LabOS that transforms natural language commands into executable lab workflows. This breakthrough promises to dramatically accelerate scientific experimentation by bridging human intent with robotic execution.

Mar 12, 2026 · 85% relevant

EvoSkill: How AI Agents Are Learning to Teach Themselves New Skills

Researchers have developed EvoSkill, a self-evolving framework where AI agents automatically discover and refine their own capabilities through failure analysis. The system improves performance by up to 12% on complex tasks and demonstrates skill transfer between different domains.

Mar 11, 2026 · 85% relevant

Anthropic Releases Comprehensive Guide to Building Custom AI Skills for Claude

Anthropic has published a detailed 33-page guide for developers to create custom skills for Claude AI. The guide shows how to package instructions into folders that enable Claude to handle specific tasks and workflows, representing a major step in AI customization.

Mar 9, 2026 · 85% relevant

Beyond Simple Search: How Advanced Image Retrieval Transforms Luxury Discovery

New research reveals major flaws in current visual search tech. For luxury retail, this means missed sales from poor multi-item inspiration and inconsistent results. A new benchmark and method promise more accurate, nuanced product discovery.

Mar 6, 2026 · 80% relevant

Optimizing Luxury Discovery: A Smarter Pre-Ranking Engine for Personalization

New research tackles inefficiency in recommendation pipelines by intelligently separating 'easy' from 'hard' customer matches. This heterogeneity-aware pre-ranking can boost personalization accuracy while controlling computational costs, directly applicable to luxury product discovery and clienteling.

Mar 5, 2026 · 85% relevant

SkillsMP Launches AI 'App Store' with 270,000+ Claude Skills for Seamless Code Automation

SkillsMP introduces an open-source marketplace with over 270,000 specialized AI skills for Claude Code, enabling automatic skill invocation without manual prompting. The platform eliminates setup friction while supporting cross-model compatibility through an open standard.

Mar 2, 2026 · 85% relevant

GitHub Repository Unleashes 1,715+ Production-Ready AI Agent Skills

A new GitHub repository contains more than 1,715 production-ready AI agent skills that developers can install and deploy in seconds. This collection represents a significant leap in accessible AI tooling, potentially accelerating agent-based application development across industries.

Feb 27, 2026 · 85% relevant

Add 197 Bioinformatics Skills to Claude Code with SciAgent-Skills

A ready-to-use plugin that transforms Claude Code into a bioinformatics expert without fine-tuning or RAG setup.

Apr 9, 2026 · 100% relevant

AI's 'Hollowing Out' Effect: How Automation Targets High-Value, High-Skill Tasks First

A viral commentary by George Pu posits that AI's primary impact isn't mass job elimination but the systematic automation of a role's most valuable, specialized, and well-compensated tasks, leaving workers with diminished, less critical duties.

Mar 31, 2026 · 85% relevant

Beyond Basic Browsing: Adaptive Multimodal AI for Next-Gen Luxury Discovery

A new AI model, CAMMSR, dynamically fuses image, text, and sequence data to understand nuanced client preferences. For luxury retail, this enables hyper-personalized recommendations that adapt to a client's evolving taste across categories, boosting engagement and conversion.

Mar 5, 2026 · 85% relevant

Omar Sarayra Builds LLM Artifact Generator for AI Knowledge Discovery

Omar Sarayra created a system that transforms dense LLM knowledge bases into consumable visual artifacts, like a pulse on HN AI discussions. He argues this format could become a new medium for staying current.

Apr 19, 2026 · 87% relevant

PRL-Bench: LLMs Score Below 50% on End-to-End Physics Research Tasks

Researchers introduced PRL-Bench, a benchmark built from 100 recent Physical Review Letters papers, testing LLMs on end-to-end physics research. Top models scored below 50%, exposing a significant capability gap for autonomous scientific discovery.

Apr 20, 2026 · 100% relevant

Akshay Pachaar Inverts LLM Agent Architecture with 'Harness' Design

AI engineer Akshay Pachaar outlined a novel 'harness' architecture for LLM agents that externalizes intelligence into memory, skills, and protocols. He is building a minimal, didactic open-source implementation of this design.

Apr 18, 2026 · 89% relevant

AI-Powered 'Vibe Coding' Drives 84% Surge in App Store Submissions

App Store submissions surged 84% last year to over 600,000 new apps, driven by AI-assisted 'vibe coding.' This rapid proliferation is devaluing traditional development skills and flooding the market with low-quality applications.

Apr 9, 2026 · 75% relevant

Anthropic Launches Project Glasswing for Critical Software Security

Anthropic announced Project Glasswing, an urgent initiative to secure critical software, powered by its new frontier model Claude Mythos Preview, which it claims can find vulnerabilities better than all but the most skilled humans.

Apr 7, 2026 · 95% relevant

AWP (Agent Work Protocol) Launches Testnet on Base, Enabling Autonomous AI Agent Work Coordination

Developer hasantoxr has launched AWP, an open protocol on Base testnet that allows AI agents to autonomously register, find work, and execute tasks without human prompting. The system uses skill files to define work types, enabling gasless agent coordination.

Mar 23, 2026 · 85% relevant

Learning to Disprove: LLMs Fine-Tuned for Formal Counterexample Generation in Lean 4

Researchers propose a method to train LLMs for formal counterexample generation, a neglected skill in mathematical AI. Their symbolic mutation strategy and multi-reward framework improve performance on three new benchmarks.

Mar 23, 2026 · 77% relevant

Building a Smart Learning Path Recommendation System Using Graph Neural Networks

A technical article outlines how to build a learning path recommendation system using Graph Neural Networks (GNNs). It details constructing a knowledge graph and applying GNNs for personalized course sequencing, a method with clear parallels to retail product discovery.

Mar 17, 2026 · 70% relevant

Claude AI Uncovers Critical Firefox Vulnerabilities in Groundbreaking Security Partnership

Anthropic's Claude Opus 4.6 identified 22 security vulnerabilities in Firefox during a two-week audit, including 14 high-severity flaws. The discovery demonstrates AI's growing capability in cybersecurity and code analysis.

Mar 6, 2026 · 75% relevant

Beyond Cosine Similarity: How Embedding Magnitude Optimization Can Transform Luxury Search & Recommendation

New research reveals that controlling embedding magnitude—not just direction—significantly boosts retrieval and RAG performance. For luxury retail, this means more accurate product discovery, personalized recommendations, and enhanced clienteling through superior semantic search.

Mar 6, 2026 · 60% relevant
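As background for the summary above, the following sketch shows why cosine similarity discards embedding magnitude while a raw dot product does not. It is not the method from the research being summarized; the vectors and helper names are made up for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def cosine(u, v):
    # Dividing by both norms removes any dependence on vector length.
    return dot(u, v) / (norm(u) * norm(v))


query = [1.0, 0.0]
doc = [3.0, 4.0]
doc_scaled = [6.0, 8.0]  # same direction as doc, twice the magnitude

cos_doc = cosine(query, doc)
cos_scaled = cosine(query, doc_scaled)  # same value as cos_doc
dot_doc = dot(query, doc)
dot_scaled = dot(query, doc_scaled)     # twice dot_doc
```

Any ranking based purely on cosine similarity treats `doc` and `doc_scaled` as identical, so whatever signal the magnitude carries is only available to scoring functions that keep it.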

Google's gws CLI: The AI-Agent-Ready Tool That Dynamically Masters Workspace APIs

Google has open-sourced gws, a CLI tool that dynamically interfaces with all Google Workspace APIs and ships with built-in AI agent skills. It eliminates custom tooling and automatically adapts to new API endpoints.

Mar 5, 2026 · 95% relevant
