## What is an Agent Skill?
An **Agent Skill** is a modular, reusable package of instructions, scripts, and resources that gives an AI agent specialized expertise for specific tasks. Think of it as a **"job manual" or "SOP"** for your AI assistant.
### Key Analogy
| Concept | Analogy |
| :--- | :--- |
| **Traditional Prompt** | Giving a new employee a 50-page manual to memorize before starting work |
| **Agent Skill** | Giving the employee a shelf of reference guides they can pull down only when needed |
Instead of cramming every possible instruction into the AI's system prompt (causing context bloat and confusion), Agent Skills let the AI **dynamically load expertise on demand**. The agent scans skill names and descriptions at session start, then loads the full instructions only when it identifies a relevant task.
---
## The Core Innovation: Progressive Disclosure
Agent Skills use a **three-stage "progressive disclosure"** architecture that dramatically reduces token consumption:
| Stage | What Loads | Token Cost | When |
| :--- | :--- | :--- | :--- |
| **L1: Metadata** | Skill name + description (from YAML frontmatter) | Very low (<1%) | Always - at every session start |
| **L2: Instructions** | Full `SKILL.md` body | Medium (5-10%) | Only when the skill is triggered |
| **L3: Resources** | Reference docs, scripts, assets | Variable | Only when explicitly referenced |
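**Result:** Loading only what the current task needs keeps most skill content out of the context window. Reported savings are in the range of **60-80%** of context tokens, along with better instruction-following on complex tasks, though the exact figure depends on how many skills are installed and how large they are.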
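To make the three stages concrete, here is a rough sketch that compares loading everything up front with loading stage by stage. The `Skill` dataclass, the sample sizes, and the four-characters-per-token estimate are all illustrative assumptions, not measurements from any real agent.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    """Very rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


@dataclass
class Skill:
    name: str
    description: str   # L1: always in context
    body: str          # L2: loaded when the skill triggers
    resources: dict[str, str] = field(default_factory=dict)  # L3: loaded on reference

    def metadata_cost(self) -> int:
        return estimate_tokens(f"{self.name}: {self.description}")

    def full_cost(self) -> int:
        resource_cost = sum(estimate_tokens(r) for r in self.resources.values())
        return self.metadata_cost() + estimate_tokens(self.body) + resource_cost


def eager_cost(skills: list[Skill]) -> int:
    """Everything from every skill is placed in the prompt up front."""
    return sum(s.full_cost() for s in skills)


def progressive_cost(skills: list[Skill], triggered: str, used_resources: list[str]) -> int:
    """L1 for all skills, L2 for the triggered skill, L3 only for referenced files."""
    cost = sum(s.metadata_cost() for s in skills)
    active = next(s for s in skills if s.name == triggered)
    cost += estimate_tokens(active.body)
    cost += sum(estimate_tokens(active.resources[r]) for r in used_resources)
    return cost


# Made-up sizes, purely for illustration.
skills = [
    Skill("expense-report", "File and validate expense reports.", "x" * 4000,
          {"references/policy.md": "y" * 20000}),
    Skill("pdf-processor", "Extract text and tables from PDFs.", "x" * 6000,
          {"references/table_formats.md": "y" * 12000}),
]
print("eager:", eager_cost(skills))
print("progressive:", progressive_cost(skills, "pdf-processor", []))
```

The gap between the two numbers grows with the number of installed skills, because the eager approach pays for every skill's body and resources whether or not the task needs them.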
---
## Required Files for an Agent Skill
A skill is simply a **directory** containing a mandatory `SKILL.md` file plus optional supporting files.
### Standard Directory Structure
```
skill-name/                    # Any name (lowercase letters, numbers, hyphens)
├── SKILL.md                   # REQUIRED - The skill definition file
├── scripts/                   # OPTIONAL - Executable code
│   └── helper.py
├── references/                # OPTIONAL - Reference docs (loaded on demand)
│   └── api_documentation.md
└── assets/                    # OPTIONAL - Templates, images, fonts
    └── report-template.docx
```
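As a starting point, the layout above can be generated with a few lines of Python. This is a minimal sketch: `scaffold_skill` and `SKILL_TEMPLATE` are hypothetical names invented for this example, and the demo writes into a temporary directory so nothing real is touched.

```python
from pathlib import Path
import tempfile

SKILL_TEMPLATE = """---
name: {name}
description: {description}
---
# {title}

## Instructions
1. Describe the workflow here.
"""


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/<name>/ with SKILL.md and the three optional folders."""
    skill_dir = root / name
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    title = name.replace("-", " ").title()
    (skill_dir / "SKILL.md").write_text(
        SKILL_TEMPLATE.format(name=name, description=description, title=title),
        encoding="utf-8",
    )
    return skill_dir


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(Path(tmp), "expense-report",
                             "File and validate employee expense reports.")
    print(sorted(p.name for p in created.iterdir()))
```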
### The SKILL.md File Format
Every `SKILL.md` must contain **YAML frontmatter** (metadata) followed by **Markdown content** (instructions):
```markdown
---
name: expense-report
description: File and validate employee expense reports according to company policy. Use when asked about expense submissions, reimbursement rules, or spending limits.
license: Apache-2.0
compatibility: Requires python3
metadata:
  author: finance-team
  version: "2.1"
---
# Expense Report Skill
You are now an expense report specialist.
## Instructions
1. Ask the user for: date, amount, category, receipt
2. Validate against policy in [references/policy.md](references/policy.md)
3. If amount > $500, require manager approval
4. Generate report using [assets/template.docx](assets/template.docx)
## Scripts
Run validation: `python scripts/validate.py --file {receipt_path}`
## Edge Cases
- Missing receipts: Flag as "needs follow-up"
- International currency: Convert using daily exchange rate
```
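To show how a loader might separate the two parts of this file, here is a small standard-library sketch. Both function names are made up for illustration, and the hand-rolled parser is an assumption that handles only flat `key: value` pairs plus one level of indented nesting; a real loader would more likely use a proper YAML library.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) from the text of a SKILL.md file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' line")
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---' line") from None
    return "\n".join(lines[1:end]), "\n".join(lines[end + 1:])


def parse_simple_yaml(frontmatter: str) -> dict:
    """Parse flat 'key: value' pairs plus one level of indented nesting."""
    result: dict = {}
    current_map = None
    for raw in frontmatter.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = raw.strip().partition(":")
        value = value.strip().strip('"')
        if raw[0].isspace() and current_map is not None:
            current_map[key] = value        # nested entry under the last bare key
        elif value == "":
            current_map = result[key] = {}  # a bare 'key:' opens a nested mapping
        else:
            result[key] = value
            current_map = None
    return result


sample = """---
name: expense-report
description: File and validate employee expense reports.
metadata:
  author: finance-team
  version: "2.1"
---
# Expense Report Skill
"""
front, body = split_frontmatter(sample)
meta = parse_simple_yaml(front)
print(meta["name"], meta["metadata"]["version"])
```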
### Required Frontmatter Fields
| Field | Required | Description |
| :--- | :--- | :--- |
| `name` | **Yes** | Max 64 chars. Lowercase letters, numbers, and hyphens only. Must match parent directory name. |
| `description` | **Yes** | Max 1024 chars. What the skill does AND when to use it. Critical for routing! |
| `license` | No | License name or reference |
| `compatibility` | No | Environment requirements (Python version, network access, etc.) |
| `metadata` | No | Any custom key-value pairs (author, version, etc.) |
> ⚠️ **Critical:** The `description` field is how the agent decides whether to load your skill. Use specific keywords that match real user queries.
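The constraints in the table are easy to check mechanically. The sketch below assumes the frontmatter has already been parsed into a dictionary; `validate_frontmatter` is a hypothetical helper, not part of any tool's API.

```python
import re

# Lowercase letters, numbers, and hyphens only.
NAME_PATTERN = re.compile(r"[a-z0-9-]+")


def validate_frontmatter(meta: dict, directory_name: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    name = meta.get("name", "")
    description = meta.get("description", "")

    if not name:
        problems.append("name is required")
    else:
        if len(name) > 64:
            problems.append("name exceeds 64 characters")
        if not NAME_PATTERN.fullmatch(name):
            problems.append("name may contain only lowercase letters, numbers, and hyphens")
        if name != directory_name:
            problems.append(f"name {name!r} does not match directory {directory_name!r}")

    if not description:
        problems.append("description is required")
    elif len(description) > 1024:
        problems.append("description exceeds 1024 characters")

    return problems


print(validate_frontmatter(
    {"name": "expense-report", "description": "File and validate expense reports."},
    directory_name="expense-report",
))
```

Running a check like this in CI or a pre-commit hook catches the most common mistake, a `name` that drifts out of sync with its folder.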
---
## How the Agent Processes Skills
### Step 1: Discovery
The agent scans predefined directories for skill folders containing `SKILL.md`. Common locations:
| Level | Path | Scope |
| :--- | :--- | :--- |
| **Project-level** | `./.claude/skills/` or `./.codeartsdoer/skills/` | Specific to current project |
| **User-level** | `~/.claude/skills/` or `~/.codeartsdoer/skills/` | Across all projects |
| **System-level** | Built-in skills | Provided by the tool vendor |
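A discovery pass over these locations might look like the following sketch. `discover_skills` is a hypothetical function, and the depth limit of two folders is an assumption about how the scan is bounded; actual tools may resolve locations and precedence differently.

```python
import os
from pathlib import Path


def discover_skills(roots: list[Path], max_depth: int = 2) -> list[Path]:
    """Return SKILL.md paths found at most `max_depth` folders below each root."""
    found = []
    for root in roots:
        if not root.is_dir():
            continue  # a missing location is simply skipped
        for path in sorted(root.rglob("SKILL.md")):
            # Number of folders between the root and the file itself.
            depth = len(path.relative_to(root).parts) - 1
            if depth <= max_depth:
                found.append(path)
    return found


# Project-level first, then user-level.
roots = [
    Path(".claude/skills"),
    Path(os.path.expanduser("~/.claude/skills")),
]
for skill_file in discover_skills(roots):
    print(skill_file.parent.name, "->", skill_file)
```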
### Step 2: Registration & Metadata Injection
At the start of every session, the agent:
1. Recursively scans skill directories (up to 2 levels deep)
2. Reads only the `name` and `description` from each `SKILL.md` frontmatter
3. Injects a compact **skills manifest** into the system prompt
**What the agent sees at start:**
```
Available skills:
- expense-report: File and validate employee expense reports according to company policy...
- pdf-processor: Extract text, tables, and form data from PDF documents...
- code-review: Review Python code for style, security, and performance issues...
```
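Building a manifest like the one above takes only a few lines. In this sketch, `build_manifest` and the 80-character truncation limit are assumptions chosen for illustration; each skill is represented as a plain dictionary with `name` and `description` keys.

```python
def build_manifest(skills: list[dict], max_description_chars: int = 80) -> str:
    """Render one line per skill, truncating long descriptions with '...'."""
    lines = ["Available skills:"]
    for skill in sorted(skills, key=lambda s: s["name"]):
        description = skill["description"]
        if len(description) > max_description_chars:
            description = description[:max_description_chars].rstrip() + "..."
        lines.append(f"- {skill['name']}: {description}")
    return "\n".join(lines)


print(build_manifest([
    {"name": "pdf-processor",
     "description": "Extract text, tables, and form data from PDF documents. "
                    "Use when the user mentions PDF files or table parsing."},
    {"name": "code-review",
     "description": "Review Python code for style, security, and performance issues."},
]))
```

Only this short text is paid for on every session; the skill bodies stay on disk until one is triggered.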
### Step 3: Intent Matching & Loading
When you ask a question, the agent:
1. Compares your query against skill descriptions
2. If a match is found, calls a skill-loading tool (shown here as `load_skill`; the actual tool name varies by platform) to retrieve the **full SKILL.md body**
3. The full instructions are injected into the current context
**Example flow:**
```
User: "Process this PDF and extract all tables"
↓
Agent scans: "pdf-processor" description matches
↓
Agent calls: load_skill("pdf-processor")
↓
Full SKILL.md loads with specific extraction instructions
↓
Agent executes using referenced scripts/ and references/
```
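In a real agent the language model itself decides which description fits the request. Purely to illustrate why description keywords matter, here is a toy stand-in that scores skills by word overlap. `match_skill`, the stopword list, and the overlap threshold are all assumptions made for this sketch.

```python
import re

STOPWORDS = {"and", "the", "for", "from", "this", "all", "with"}


def tokenize(text: str) -> set[str]:
    """Lowercase words of three or more letters, minus a few stopwords."""
    return set(re.findall(r"[a-z]{3,}", text.lower())) - STOPWORDS


def match_skill(query: str, manifest: dict[str, str], min_overlap: int = 2):
    """Return the skill whose description shares the most words with the query.

    Returns None when no description reaches `min_overlap` shared words.
    """
    query_words = tokenize(query)
    best_name, best_score = None, 0
    for name, description in manifest.items():
        score = len(query_words & tokenize(description))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_overlap else None


manifest = {
    "expense-report": "File and validate employee expense reports according to company policy.",
    "pdf-processor": "Extract text, tables, and form data from PDF documents.",
    "code-review": "Review Python code for style, security, and performance issues.",
}
print(match_skill("Process this PDF and extract all tables", manifest))
```

Even in this crude form, a vague description such as "Helps with documents" would share almost no words with a realistic request, which is the routing failure the `description` field is meant to prevent.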
### Step 4: Resource Loading (On-Demand)
If the skill instructions reference external files (e.g., `See [references/policy.md](references/policy.md)`), the agent:
1. Reads those files **only when needed**
2. Injects their content into context at that moment
3. Does NOT keep them loaded afterward
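One way to picture on-demand loading is to find the relative links in the skill body and read a file only at the moment it is needed. The sketch below uses hypothetical helper names, and the check that rejects paths outside the skill folder is an added safety assumption rather than documented behavior.

```python
import re
from pathlib import Path

# Captures the target of a Markdown link: [label](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def referenced_files(skill_body: str) -> list[str]:
    """Return relative link targets in the order they first appear."""
    seen = []
    for target in LINK_PATTERN.findall(skill_body):
        if "://" not in target and target not in seen:  # skip web URLs and repeats
            seen.append(target)
    return seen


def load_resource(skill_dir: Path, relative_path: str) -> str:
    """Read one referenced file, refusing paths that escape the skill folder."""
    base = skill_dir.resolve()
    target = (base / relative_path).resolve()
    if base not in target.parents:
        raise ValueError(f"{relative_path} is outside the skill directory")
    return target.read_text(encoding="utf-8")


body = "Validate against policy in [references/policy.md](references/policy.md)."
print(referenced_files(body))
```

Nothing is read until `load_resource` is called, so a reference file that the current task never touches costs no tokens.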
### Step 5: Script Execution (Optional)
Skills can include executable scripts (Python, Bash, etc.). Most hosts run them in a **sandboxed environment**, though the degree of isolation depends on the tool. The agent:
- Executes the script when instructed
- Passes parameters as needed
- Receives output (stdout/stderr)
- Uses output to inform the final response
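The execute-and-capture loop can be sketched with Python's standard `subprocess` module. `run_skill_script` is a hypothetical wrapper, and note the assumption spelled out in the code: `subprocess` on its own provides no sandboxing, so a real host would add isolation on top.

```python
import subprocess
import sys


def run_skill_script(command: list[str], timeout_seconds: float = 30.0) -> dict:
    """Run a command and capture stdout, stderr, and the exit code.

    Note: subprocess alone is NOT a sandbox; real hosts add isolation on top.
    """
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=False,  # report failures through the result instead of raising
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "returncode": None,
                "stderr": f"timed out after {timeout_seconds}s"}
    return {
        "ok": completed.returncode == 0,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "returncode": completed.returncode,
    }


# Stand-in for a skill script: an inline one-liner run by the current interpreter.
result = run_skill_script([sys.executable, "-c", "print('receipt ok')"])
print(result["ok"], result["stdout"].strip())
```

Returning both streams and the exit code lets the agent distinguish a script that failed from one that succeeded with empty output.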
---
## Skills vs. Rules vs. Commands
Understanding the distinction is crucial for effective implementation:
| Concept | Who Triggers | Best For | Context Cost | Example |
| :--- | :--- | :--- | :--- | :--- |
| **Rules** | The tool (always applied) | Non-negotiable requirements | Always paid | "Never commit .env files" |
| **Commands** | You (explicit intent) | Repeatable workflows | Paid when used | `/deploy` to trigger deployment |
| **Skills** | The agent (automatic) | Task-specific expertise | Paid when needed | PDF processing, code review |
### Litmus Test
> **"Would you want this instruction to apply even when you're not thinking about it?"**
> - Yes → Make it a **Rule**
> - No → Make it a **Skill**
---
## Agent Skills vs. MCP (Model Context Protocol)
These are complementary, not competing:
| Aspect | MCP (Model Context Protocol) | Agent Skill |
| :--- | :--- | :--- |
| **Role** | Data pipeline | Cognitive schema |
| **Question** | "How does data get here?" | "How is data used?" |
| **Example** | Fetch live stock prices from Yahoo Finance | Format analysis as professional research report |
| **Output** | Raw JSON data | Structured, formatted response following guidelines |
---
## Tools That Support Agent Skills
| Tool/Platform | Support Level | Notes |
| :--- | :--- | :--- |
| **Claude Code** | Native | Built by Anthropic, which introduced the Skills format |
| **Microsoft Agent Framework** | Full support | `FileAgentSkillsProvider` class, C# and Python SDKs |
| **Huawei CodeArts** | Full support | Project-level and user-level skills |
| **Builder.io** | Full support | Uses `.builder/` or `.claude/` directories |
| **Minion (open source)** | Full compatibility | Open-source implementation, LLM-agnostic |
| **OpenAI** | Similar concept | Uses different implementation (package-manager style) |
---
## Best Practices for Creating Skills
### ✅ Do's
1. **Write descriptions for routing, not reading**
   - Bad: "Helps with documents"
   - Good: "Extract tables from PDF files. Use when user mentions PDF, tables, or form extraction."
2. **Keep SKILL.md focused (under 500 lines)**
   - Move detailed references to `references/` folder
   - Keep only core instructions in the main file
3. **Use progressive disclosure naturally**
   - L1: Metadata (name + description)
   - L2: Core workflow in SKILL.md
   - L3: Detailed policies in `references/`
4. **Include concrete examples** in the instructions
   - Show input/output formats
   - Demonstrate edge case handling
### ❌ Don'ts
1. **Don't stuff everything into one file** - Reference external docs instead
2. **Don't write vague descriptions** - The agent will never find your skill
3. **Don't include sensitive data** - Skills are plain text files in your repo
4. **Don't make skills that are really rules** - Use the litmus test above
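Several of these guidelines can be checked automatically before a skill is committed. The following is a heuristic sketch only: `lint_skill`, the "use when" trigger check, and the secret-matching patterns are assumptions for illustration and will produce both false positives and misses.

```python
import re

# Crude patterns for things that look like credentials (assumption: not exhaustive).
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*\S+"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def lint_skill(description: str, body: str, max_lines: int = 500) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    if "use when" not in description.lower():
        warnings.append("description does not say when to use the skill")
    body_lines = body.splitlines()
    if len(body_lines) >= max_lines:
        warnings.append(
            f"SKILL.md body has {len(body_lines)} lines; move detail into references/"
        )
    for number, line in enumerate(body_lines, start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            warnings.append(f"line {number} may contain a secret")
    return warnings


for warning in lint_skill("Helps with documents", "api_key = abc123\n"):
    print("warning:", warning)
```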
---
## Example: Complete Skill for PDF Processing
```
project-root/
└── .claude/
    └── skills/
        └── pdf-analyzer/
            ├── SKILL.md
            ├── scripts/
            │   └── extract_tables.py
            └── references/
                └── table_formats.md
```
**SKILL.md:**
```markdown
---
name: pdf-analyzer
description: Extract text, tables, and form data from PDF documents. Use when user asks about PDF files, form extraction, or table parsing.
license: MIT
compatibility: Requires python3, tabula-py, pypdf2, pdfplumber
---
# PDF Analyzer Skill
You are a PDF processing specialist.
## Instructions
1. Locate the PDF file path from user input
2. Determine extraction type:
   - Text: Use pypdf2
   - Tables: Use tabula-py
   - Forms: Use pdfplumber
3. Run the appropriate script from `scripts/`
## Table Extraction
Run: `python scripts/extract_tables.py --input {pdf_path} --output {csv_path}`
Refer to [references/table_formats.md](references/table_formats.md) for handling complex multi-page tables.
## Edge Cases
- Scanned PDFs: Flag as "needs OCR" and suggest alternative tool
- Password-protected: Ask user for password before proceeding
```
---
## Summary
| Question | Answer |
| :--- | :--- |
| **What is an Agent Skill?** | A modular package of instructions + resources giving AI specialized expertise |
| **What files are required?** | `SKILL.md` with YAML frontmatter (name + description) and Markdown instructions |
| **What optional files exist?** | `scripts/` (executable code), `references/` (docs), `assets/` (templates) |
| **How does the agent process skills?** | L1 metadata (always) → L2 instructions (on match) → L3 resources (on reference) |
| **What's the key benefit?** | Reduces context token usage by 60-80%, improves instruction following |