skills

30 articles about skills in AI news

Stop Testing Skills Once: Use Caliper's pass@k to Measure What Actually

Caliper is a lightweight harness that runs Claude Code skills k times, scores them with pass@k, and compares against a no-skill baseline so you know if your skill actually helps.

Jun 29, 2026 | 85% relevant
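The summary does not say how Caliper computes its score. As a rough sketch, assuming the widely used unbiased pass@k estimator (n total runs, c of them passing), a score and a baseline delta could be computed as follows. The function names and run counts here are illustrative and are not Caliper's API.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k runs passes,
    given n recorded runs of which c passed (sampling without replacement)."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k runs include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def baseline_delta(n: int, c_skill: int, c_base: int, k: int) -> float:
    """Difference in pass@k between runs with the skill and runs without it."""
    return pass_at_k(n, c_skill, k) - pass_at_k(n, c_base, k)


if __name__ == "__main__":
    # Hypothetical numbers: 10 runs each, 7 pass with the skill, 4 without.
    print(f"with skill: {pass_at_k(10, 7, 3):.3f}")
    print(f"baseline:   {pass_at_k(10, 4, 3):.3f}")
    print(f"delta:      {baseline_delta(10, 7, 4, 3):+.3f}")
```

A positive delta suggests the skill helps on that task; a delta near zero suggests the base agent already handles it.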

Caliper: Run Your Claude Code Skills k Times and Get a pass@k Score That

Caliper gives Claude Code users a pass@k reliability score for skills, with a baseline delta showing if the skill beats the base agent. Install via pipx or npx.

Jun 28, 2026 | 100% relevant

Claude Fable 5 Migration: Cut Prescriptive Skills 60% to Stop Degrading Output

Audit your ~/.claude/skills for temperature, budget_tokens, and 'show your reasoning'. Replace 6+ step procedures with goal+constraints. Cut MUST/NEVER blocks to only guard money, deletions, or identity.

Jun 8, 2026 | 100% relevant

Larger models learn rare skills by forgetting them less, new paper shows

New paper from Stanford, MIT, Harvard, and Anthropic shows larger models learn rare skills because they forget them less during training, tested on OLMo models from 4M to 4B parameters.

Jun 8, 2026 | 88% relevant

Nvidia Unveils Physical AI Agent Skills, 32B VLA Model at CVPR

Nvidia launched physical AI agent skills and a 32B VLA model at CVPR to automate AV and robotics workflows, addressing the fragmented tooling bottleneck.

Jun 3, 2026 | 100% relevant

Microsoft SkillOpt Trains Agent Skills in Text Space, Beats 52/52 Benchmarks

Microsoft's SkillOpt trains agent skills in text space, achieving best or tied-best results in all 52 settings across 6 benchmarks and 7 models.

May 25, 2026 | 89% relevant

CLAUDE.md Wastes 7K+ Tokens Per Turn; Skills Cut to 50

A 1,000-line CLAUDE.md burns 7,000-10,000 tokens per turn on instructions the model already knows. Skills using progressive disclosure cut that to ~50 tokens.

May 15, 2026 | 100% relevant

Skills as Untrusted Code: A Security Precedent for Agent Runtimes

Paper argues agent skills are untrusted code until verified; runtimes must enforce verification gates to prevent supply-chain attacks, echoing decades of software security lessons.

May 5, 2026 | 100% relevant

Ctx2Skill: Self-Play Framework Lets LMs Discover Skills Without Labels

Ctx2Skill discovers skills from context via multi-agent self-play without labels. Outputs plug into any LM, targeting manual prompt engineering bottlenecks.

May 5, 2026 | 85% relevant

Build Reusable Data Science Workflows with Claude Skills and Subagents

Claude Skills and Subagents let you package prompts into reusable modules, so data scientists stop re-tuning the same prompts for EDA, modeling, and deployment.

Apr 26, 2026 | 99% relevant

10 Claude Code Skills That Actually Work: A Solo Developer's Vetted List

A curated list of the most effective Claude Code skills for developers, based on hands-on testing, focusing on practical MCP servers and workflow enhancements.

Apr 21, 2026 | 100% relevant

Ethan Mollick: AI Judgment & Problem-Solving Are Skills, Not Human Exclusives

Ethan Mollick contends that skills like judgment and problem-solving, often cited as uniquely human, are domains where AI can and does demonstrate competence, reframing them as learnable capabilities.

Apr 19, 2026 | 75% relevant

Stop Thinking 'Progressive Disclosure' for Claude Skills — Think

A mental model shift from 'progressive disclosure' to 'progressive discovery' makes building Claude Skills more intuitive by clarifying Claude's active role in finding what it needs.

Apr 19, 2026 | 82% relevant

Free 'finance-skills' Tool Adds Bloomberg Terminal-Like Features to Claude

An open-source tool called 'finance-skills' allows Claude to access real-time financial data and analysis, replicating key features of the expensive Bloomberg Terminal platform for free.

Apr 14, 2026 | 93% relevant

MiniMax Open-Sources Three Agent Music Skills for MMX-CLI

MiniMax has open-sourced three 'Music Skills' for its MMX-CLI agent platform. The skills allow AI agents to generate music, sing in a persona, and curate playlists from a user's local library.

Apr 13, 2026 | 87% relevant

Newline's 'Skills' Update Shows Where MCP Servers Are Headed

The Newline MCP server now supports modular 'Skills,' allowing developers to customize their Claude Code environment with specific, installable capabilities for more targeted workflows.

Apr 13, 2026 | 100% relevant

Palantir CEO Karp: AI Will 'Destroy Humanities Jobs', Shift to Vocational Skills

Palantir CEO Alex Karp warns AI will 'destroy humanities jobs,' arguing broad degrees lose value while vocational skills and neurodivergent traits become key advantages. He insists there will still be 'more than enough jobs,' just redistributed toward practical roles.

Apr 12, 2026 | 85% relevant

Addy Osmani Unveils 'Agent Skills' for AI-Powered Development

Google VP Addy Osmani teased a new framework called 'Agent Skills' for constructing AI agents, likely a significant move to standardize and simplify agent-based development workflows.

Apr 9, 2026 | 87% relevant

MCP Security Crisis: 43% of Servers Vulnerable, 341 Malicious Skills Found

Security audits of the Model Context Protocol (MCP) ecosystem reveal 43% of servers are vulnerable to command execution, while 341 malicious skills were found on marketplaces, exposing systemic security flaws in agentic AI. The findings highlight a growing attack surface as AI agents become more autonomous.

Apr 9, 2026 | 77% relevant

How Anthropic's Team Uses Skills as Knowledge Containers (And What It Means For Your CLAUDE.md)

Learn how to use Claude Code skills not just for automation but as living knowledge bases, following patterns from Anthropic's own engineering team.

Apr 4, 2026 | 70% relevant

Anthropic's Claude Skills Implements 3-Layer Context Architecture to Manage Hundreds of Skills

Anthropic's Claude Skills framework employs a three-layer context management system that loads only skill metadata by default, enabling support for hundreds of specialized skills without exceeding context window limits.

Apr 3, 2026 | 85% relevant
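The summary only says that skill metadata is loaded by default. As a minimal sketch, assuming the other two layers are the skill's instruction body and its bundled resource files, metadata-first loading could look like this. Every class, method, and field name below is hypothetical and is not Anthropic's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str                      # layer 1: always in context
    body: str                             # layer 2: loaded when the skill triggers
    resources: dict[str, str] = field(default_factory=dict)  # layer 3: on demand


class SkillContext:
    """Tracks which layers of each skill are currently in the prompt."""

    def __init__(self, skills: list[Skill]) -> None:
        self._skills = {s.name: s for s in skills}
        self._active: set[str] = set()
        self._opened: set[tuple[str, str]] = set()

    def activate(self, name: str) -> None:
        """Pull a skill's instruction body into context."""
        if name not in self._skills:
            raise KeyError(name)
        self._active.add(name)

    def open_resource(self, name: str, path: str) -> None:
        """Pull one bundled file of an already active skill into context."""
        if name not in self._active or path not in self._skills[name].resources:
            raise KeyError((name, path))
        self._opened.add((name, path))

    def render(self) -> str:
        """Build the context text: metadata for all skills, deeper layers only if loaded."""
        parts = []
        for s in self._skills.values():
            parts.append(f"[skill] {s.name}: {s.description}")
            if s.name in self._active:
                parts.append(s.body)
            for (owner, path) in sorted(self._opened):
                if owner == s.name:
                    parts.append(s.resources[path])
        return "\n".join(parts)
```

With hundreds of skills registered, only the one-line metadata entries are paid for on every turn; bodies and resources cost tokens only after `activate` or `open_resource` is called.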

How to Build a Custom AI Agent with Claude Code's Skills, SubAgents, and Hooks

A developer's deep dive into customizing Claude Code with 7 skills, 5 subagents, and quality-check hooks—showing how to move beyond basic prompting to create a truly autonomous coding assistant.

Mar 31, 2026 | 95% relevant

Base44 Launches Superagent Skills: No-Code Library for Adding Domain-Specific Functions to AI Agents

Base44 has launched Superagent Skills, a library of pre-built, domain-specific functions that can be added to AI agents with a single click. The no-code system allows for combining skills and creating custom ones via natural language description.

Mar 30, 2026 | 85% relevant

Trace2Skill Framework Distills Execution Traces into Declarative Skills via Parallel Sub-Agents

Researchers introduced Trace2Skill, a framework that uses parallel sub-agents to analyze execution trajectories and distill them into transferable declarative skills. This enables performance improvements in larger models without parameter updates.

Mar 30, 2026 | 85% relevant

Claude Skills: How Anthropic's Context-Aware Workflow System Solves the Bloated CLAUDE.md Problem

Claude Skills are modular, self-contained workflow packages that load only when triggered by user intent, solving the context bloat caused by monolithic CLAUDE.md files. They support automatic invocation, slash commands, and can bundle supporting documents.

Mar 29, 2026 | 95% relevant

Palantir CEO Alex Karp: AI Era Will Favor Trade Skills and Neurodivergent Thinking

Palantir CEO Alex Karp predicts AI will most reward individuals with hands-on vocational skills and those who think in unusually original, often neurodivergent, ways. This perspective challenges the narrative that AI success is reserved for traditional tech roles.

Mar 27, 2026 | 85% relevant

How Weaviate Agent Skills Let Claude Code Build Vector Apps in Minutes

Weaviate's official Agent Skills give Claude Code structured access to vector databases, eliminating guesswork when building semantic search and RAG applications.

Mar 27, 2026 | 95% relevant

Awesome Finance Skills: Open-Source Plugin Adds Real-Time Market Analysis to AI Agents

Developer open-sources Awesome Finance Skills, a plug-and-play toolkit that gives AI agents real-time financial data access, sentiment analysis, and automated research report generation. The MIT-licensed package works with Claude Code, OpenClaw, and other popular agent frameworks.

Mar 26, 2026 | 95% relevant

How to Deploy Claude Code at Scale: The Admin's Guide to MCPs, Skills, and User Management

Practical solutions for managing Claude Code across teams: central MCP servers, standardized CLAUDE.md templates, and pre-configured skills to prevent chaos.

Mar 26, 2026 | 90% relevant

How to Install claude-flow MCP and 3 Skills That Transform Claude Code

A production team's setup reveals claude-flow MCP with hierarchical-mesh topology and three essential skills that add structure, parallelism, and quality control.

Mar 25, 2026 | 95% relevant
