|
| 1 | +--- |
| 2 | +name: skill-scanner |
| 3 | +description: Scan agent skills for security issues. Use when asked to "scan a skill", |
| 4 | + "audit a skill", "review skill security", "check skill for injection", "validate SKILL.md", |
| 5 | + or assess whether an agent skill is safe to install. Checks for prompt injection, |
| 6 | + malicious scripts, excessive permissions, secret exposure, and supply chain risks. |
| 7 | +allowed-tools: Read, Grep, Glob, Bash |
| 8 | +--- |
| 9 | + |
| 10 | +# Skill Security Scanner |
| 11 | + |
| 12 | +Scan agent skills for security issues before adoption. Detects prompt injection, malicious code, excessive permissions, secret exposure, and supply chain risks. |
| 13 | + |
| 14 | +**Important**: Run all scripts from the repository root using the full path via `${CLAUDE_SKILL_ROOT}`. |
| 15 | + |
| 16 | +## Bundled Script |
| 17 | + |
| 18 | +### `scripts/scan_skill.py` |
| 19 | + |
| 20 | +Static analysis scanner that detects deterministic patterns. Outputs structured JSON. |
| 21 | + |
| 22 | +```bash |
| 23 | +uv run ${CLAUDE_SKILL_ROOT}/scripts/scan_skill.py <skill-directory> |
| 24 | +``` |
| 25 | + |
| 26 | +Returns JSON with findings, URLs, structure info, and severity counts. The script catches patterns mechanically — your job is to evaluate intent and filter false positives. |
| 27 | + |
| 28 | +## Workflow |
| 29 | + |
| 30 | +### Phase 1: Input & Discovery |
| 31 | + |
| 32 | +Determine the scan target: |
| 33 | + |
| 34 | +- If the user provides a skill directory path, use it directly |
| 35 | +- If the user names a skill, look for it under `plugins/*/skills/<name>/` or `.claude/skills/<name>/` |
| 36 | +- If the user says "scan all skills", discover all `*/SKILL.md` files and scan each |
| 37 | + |
| 38 | +Validate the target contains a `SKILL.md` file. List the skill structure: |
| 39 | + |
| 40 | +```bash |
| 41 | +ls -la <skill-directory>/ |
| 42 | +ls <skill-directory>/references/ 2>/dev/null |
| 43 | +ls <skill-directory>/scripts/ 2>/dev/null |
| 44 | +``` |
| 45 | + |
| 46 | +### Phase 2: Automated Static Scan |
| 47 | + |
| 48 | +Run the bundled scanner: |
| 49 | + |
| 50 | +```bash |
| 51 | +uv run ${CLAUDE_SKILL_ROOT}/scripts/scan_skill.py <skill-directory> |
| 52 | +``` |
| 53 | + |
| 54 | +Parse the JSON output. The script produces findings with severity levels, URL analysis, and structure information. Use these as leads for deeper analysis. |
| 55 | + |
| 56 | +**Fallback**: If the script fails, proceed with manual analysis using Grep patterns from the reference files. |
| 57 | + |
| 58 | +### Phase 3: Frontmatter Validation |
| 59 | + |
| 60 | +Read the SKILL.md and check: |
| 61 | + |
| 62 | +- **Required fields**: `name` and `description` must be present |
| 63 | +- **Name consistency**: `name` field should match the directory name |
| 64 | +- **Tool assessment**: Review `allowed-tools` — is Bash justified? Are tools unrestricted (`*`)? |
| 65 | +- **Model override**: Is a specific model forced? Why? |
| 66 | +- **Description quality**: Does the description accurately represent what the skill does? |
| 67 | + |
| 68 | +### Phase 4: Prompt Injection Analysis |
| 69 | + |
| 70 | +Load `${CLAUDE_SKILL_ROOT}/references/prompt-injection-patterns.md` for context. |
| 71 | + |
| 72 | +Review scanner findings in the "Prompt Injection" category. For each finding: |
| 73 | + |
| 74 | +1. Read the surrounding context in the file |
| 75 | +2. Determine if the pattern is **performing** injection (malicious) or **discussing/detecting** injection (legitimate) |
| 76 | +3. Skills about security, testing, or education commonly reference injection patterns — this is expected |
| 77 | + |
| 78 | +**Critical distinction**: A security review skill that lists injection patterns in its references is documenting threats, not attacking. Only flag patterns that would execute against the agent running the skill. |
| 79 | + |
| 80 | +### Phase 5: Behavioral Analysis |
| 81 | + |
| 82 | +This phase is agent-only — no pattern matching. Read the full SKILL.md instructions and evaluate: |
| 83 | + |
| 84 | +**Description vs. instructions alignment**: |
| 85 | +- Does the description match what the instructions actually tell the agent to do? |
| 86 | +- A skill described as "code formatter" that instructs the agent to read ~/.ssh is misaligned |
| 87 | + |
| 88 | +**Config/memory poisoning**: |
| 89 | +- Instructions to modify `CLAUDE.md`, `MEMORY.md`, `settings.json`, `.mcp.json`, or hook configurations |
| 90 | +- Instructions to add itself to allowlists or auto-approve permissions |
| 91 | +- Writing to `~/.claude/` or any agent configuration directory |
| 92 | + |
| 93 | +**Scope creep**: |
| 94 | +- Instructions that exceed the skill's stated purpose |
| 95 | +- Unnecessary data gathering (reading files unrelated to the skill's function) |
| 96 | +- Instructions to install other skills, plugins, or dependencies not mentioned in the description |
| 97 | + |
| 98 | +**Information gathering**: |
| 99 | +- Reading environment variables beyond what's needed |
| 100 | +- Listing directory contents outside the skill's scope |
| 101 | +- Accessing git history, credentials, or user data unnecessarily |
| 102 | + |
| 103 | +### Phase 6: Script Analysis |
| 104 | + |
| 105 | +If the skill has a `scripts/` directory: |
| 106 | + |
| 107 | +1. Load `${CLAUDE_SKILL_ROOT}/references/dangerous-code-patterns.md` for context |
| 108 | +2. Read each script file fully (do not skip any) |
| 109 | +3. Check scanner findings in the "Malicious Code" category |
| 110 | +4. For each finding, evaluate: |
| 111 | + - **Data exfiltration**: Does the script send data to external URLs? What data? |
| 112 | + - **Reverse shells**: Socket connections with redirected I/O |
| 113 | + - **Credential theft**: Reading SSH keys, .env files, tokens from environment |
| 114 | + - **Dangerous execution**: eval/exec with dynamic input, shell=True with interpolation |
| 115 | + - **Config modification**: Writing to agent settings, shell configs, git hooks |
| 116 | +5. Check PEP 723 `dependencies` — are they legitimate, well-known packages? |
| 117 | +6. Verify the script's behavior matches the SKILL.md description of what it does |
| 118 | + |
| 119 | +**Legitimate patterns**: `gh` CLI calls, `git` commands, reading project files, JSON output to stdout are normal for skill scripts. |
| 120 | + |
| 121 | +### Phase 7: Supply Chain Assessment |
| 122 | + |
| 123 | +Review URLs from the scanner output and any additional URLs found in scripts: |
| 124 | + |
| 125 | +- **Trusted domains**: GitHub, PyPI, official docs — normal |
| 126 | +- **Untrusted domains**: Unknown domains, personal sites, URL shorteners — flag for review |
| 127 | +- **Remote instruction loading**: Any URL that fetches content to be executed or interpreted as instructions is high risk |
| 128 | +- **Dependency downloads**: Scripts that download and execute binaries or code at runtime |
| 129 | +- **Unverifiable sources**: References to packages or tools not on standard registries |
| 130 | + |
| 131 | +### Phase 8: Permission Analysis |
| 132 | + |
| 133 | +Load `${CLAUDE_SKILL_ROOT}/references/permission-analysis.md` for the tool risk matrix. |
| 134 | + |
| 135 | +Evaluate: |
| 136 | + |
| 137 | +- **Least privilege**: Are all granted tools actually used in the skill instructions? |
| 138 | +- **Tool justification**: Does the skill body reference operations that require each tool? |
| 139 | +- **Risk level**: Rate the overall permission profile using the tier system from the reference |
| 140 | + |
| 141 | +Example assessments: |
| 142 | +- `Read Grep Glob` — Low risk, read-only analysis skill |
| 143 | +- `Read Grep Glob Bash` — Medium risk, needs Bash justification (e.g., running bundled scripts) |
| 144 | +- `Read Grep Glob Bash Write Edit WebFetch Task` — High risk, near-full access |
| 145 | + |
| 146 | +## Confidence Levels |
| 147 | + |
| 148 | +| Level | Criteria | Action | |
| 149 | +|-------|----------|--------| |
| 150 | +| **HIGH** | Pattern confirmed + malicious intent evident | Report with severity | |
| 151 | +| **MEDIUM** | Suspicious pattern, intent unclear | Note as "Needs verification" | |
| 152 | +| **LOW** | Theoretical, best practice only | Do not report | |
| 153 | + |
| 154 | +**False positive awareness is critical.** The biggest risk is flagging legitimate security skills as malicious because they reference attack patterns. Always evaluate intent before reporting. |
| 155 | + |
| 156 | +## Output Format |
| 157 | + |
| 158 | +```markdown |
| 159 | +## Skill Security Scan: [Skill Name] |
| 160 | + |
| 161 | +### Summary |
| 162 | +- **Findings**: X (Y Critical, Z High, ...) |
| 163 | +- **Risk Level**: Critical / High / Medium / Low / Clean |
| 164 | +- **Skill Structure**: SKILL.md only / +references / +scripts / full |
| 165 | + |
| 166 | +### Findings |
| 167 | + |
| 168 | +#### [SKILL-SEC-001] [Finding Type] (Severity) |
| 169 | +- **Location**: `SKILL.md:42` or `scripts/tool.py:15` |
| 170 | +- **Confidence**: High |
| 171 | +- **Category**: Prompt Injection / Malicious Code / Excessive Permissions / Secret Exposure / Supply Chain / Validation |
| 172 | +- **Issue**: [What was found] |
| 173 | +- **Evidence**: [code snippet] |
| 174 | +- **Risk**: [What could happen] |
| 175 | +- **Remediation**: [How to fix] |
| 176 | + |
| 177 | +### Needs Verification |
| 178 | +[Medium-confidence items needing human review] |
| 179 | + |
| 180 | +### Assessment |
| 181 | +[Safe to install / Install with caution / Do not install] |
| 182 | +[Brief justification for the assessment] |
| 183 | +``` |
| 184 | + |
| 185 | +**Risk level determination**: |
| 186 | +- **Critical**: Any high-confidence critical finding (prompt injection, credential theft, data exfiltration) |
| 187 | +- **High**: High-confidence high-severity findings or multiple medium findings |
| 188 | +- **Medium**: Medium-confidence findings or minor permission concerns |
| 189 | +- **Low**: Only best-practice suggestions |
| 190 | +- **Clean**: No findings after thorough analysis |
| 191 | + |
| 192 | +## Reference Files |
| 193 | + |
| 194 | +| File | Purpose | |
| 195 | +|------|---------| |
| 196 | +| `references/prompt-injection-patterns.md` | Injection patterns, jailbreaks, obfuscation techniques, false positive guide | |
| 197 | +| `references/dangerous-code-patterns.md` | Script security patterns: exfiltration, shells, credential theft, eval/exec | |
| 198 | +| `references/permission-analysis.md` | Tool risk tiers, least privilege methodology, common skill permission profiles | |
0 commit comments