This RFC proposes adding client hook management to ToolHive, enabling OpenTelemetry-based observability for agent skill execution.

Key features:
- Client shim architecture to normalize different hook formats
- Primary support for Claude Code and Cursor
- Standalone mode (transitional) and server-managed mode (target)
- Enterprise deployment via config file + server auto-start
- Integration with THV-0034 long-running server architecture

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
07214b7 to
a5db3e8
Compare
| ## Summary | ||
| This RFC proposes adding client hook management to ToolHive CLI, enabling OpenTelemetry-based observability for agent skill execution. ToolHive will install and manage hooks in supported AI clients that capture skill invocation telemetry and forward it to the existing `pkg/telemetry/` infrastructure for OTLP export. A client-specific shim architecture normalizes the different hook formats into a unified telemetry pipeline. Primary support targets Claude Code and Cursor, with Windsurf and Cline as stretch goals. |
There was a problem hiding this comment.
The way some plugins are written means we can capture more than just skill execution: MCP server use, tool use, which bash commands ran, etc. My initial idea was to capture all of it and let the dashboards / UI filter to skill usage.
Is there any reason to be more restrictive and install a hook only for skills? Or should we let the user configure it?
There was a problem hiding this comment.
It was just to scope down the work. I agree this can expand to more things
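A minimal sketch of the "capture everything, filter later" idea discussed in this thread: each client gets a small shim that maps its hook payload onto one shared event shape, and skill filtering becomes a predicate applied downstream. The Claude Code field names follow the table in the RFC (`session_id`, `tool_name`); the Cursor field names (`conversation_id`) and every function and class name here are assumptions for illustration, not the RFC's design.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class HookEvent:
    """Client-independent view of one hook invocation (hypothetical shape)."""
    client: str
    session_id: str
    event: str
    tool_name: str
    tool_input: dict[str, Any] = field(default_factory=dict)


def from_claude(payload: dict[str, Any]) -> HookEvent:
    # Field names taken from the RFC's client table for Claude Code.
    return HookEvent(
        client="claude",
        session_id=payload.get("session_id", ""),
        event=payload.get("hook_event_name", ""),
        tool_name=payload.get("tool_name", ""),
        tool_input=payload.get("tool_input") or {},
    )


def from_cursor(payload: dict[str, Any]) -> HookEvent:
    # ASSUMPTION: Cursor identifies the session as `conversation_id`.
    return HookEvent(
        client="cursor",
        session_id=payload.get("conversation_id", ""),
        event=payload.get("hook_event_name", ""),
        tool_name=payload.get("tool_name", ""),
        tool_input=payload.get("tool_input") or {},
    )


SHIMS: dict[str, Callable[[dict[str, Any]], HookEvent]] = {
    "claude": from_claude,
    "cursor": from_cursor,
}


def normalize(client: str, payload: dict[str, Any]) -> HookEvent:
    """Dispatch to the client's shim; unknown clients are rejected."""
    try:
        shim = SHIMS[client]
    except KeyError:
        raise ValueError(f"unsupported client: {client}") from None
    return shim(payload)


def is_skill_event(event: HookEvent) -> bool:
    # The RFC notes the tool name is typically "Skill" for skill invocations.
    return event.tool_name == "Skill"
```

With this split, restricting capture to skills (or making it configurable) only changes which predicate runs after `normalize`, not the installed hooks.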
| | Client | Priority | Hooks API | Session ID | Tool Name Location | Response Format | Platform | | ||
| |--------|----------|-----------|------------|-------------------|-----------------|----------| | ||
| | **Claude Code** | Primary | Full (12 events) | `session_id` | `tool_name` | Exit code only | All | |
There was a problem hiding this comment.
Claude Code can do exit codes. It also supports JSON decision control like Cursor:
https://code.claude.com/docs/en/hooks#pretooluse-decision-control
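For reference, a rough sketch of what a JSON decision response from a `PreToolUse` hook could look like, as I read the linked documentation. The field names (`hookSpecificOutput`, `permissionDecision`, `permissionDecisionReason`) should be verified against the current docs before the table is updated; the helper name is hypothetical.

```python
import json

VALID_DECISIONS = {"allow", "deny", "ask"}


def pretooluse_decision(decision: str, reason: str) -> str:
    """Build the JSON a PreToolUse hook would print to stdout (assumed schema)."""
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    return json.dumps(
        {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",
                "permissionDecision": decision,
                "permissionDecisionReason": reason,
            }
        }
    )
```

A telemetry-only hook would not need any of this: it can emit its event and exit 0 without printing a decision, so it never influences whether the tool runs.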
| ``` | ||
| **Capabilities:** | ||
| - **Centralized configuration**: IT deploys `config.yml` via MDM; server handles the rest |
There was a problem hiding this comment.
To make sure I understand, all admins have to do is get toolhive installed via MDM and push some config file with the hook info via MDM. Then, toolhive acts as the process that installs it and has some enforcement policy in case the user removes it.
This is far simpler than a more naive MDM solution where the admins have to build the hooks config, the install script, and the verification process to enforce it is installed.
Is that a correct understanding?
| - **Centralized configuration**: IT deploys `config.yml` via MDM; server handles the rest | ||
| - **Auto-installation**: Server installs hooks on startup based on configuration | ||
| - **Drift detection**: Server watches client config files for unauthorized changes | ||
| - **Auto-remediation**: If `enforce: true`, server re-installs hooks when users remove them |
There was a problem hiding this comment.
Will we need any alerting here in case admins want to know about violations? I assume this would be secondary to the main thrust.
| # Install hooks automatically when server starts | ||
| auto_install: true | ||
| # Re-install hooks if user removes them (requires file watcher) |
There was a problem hiding this comment.
This may be a naive question, but can the file watcher handle granular changes? `~/.claude/settings.json` contains both the ToolHive hooks that we install and any hooks the user wants to add. For each hook event, the user can have a list of hooks.
So, if the user edits this file and removes one of their personal hooks, that's fine. We do nothing. If they remove our hook, we get grumpy.
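One way to get that granularity is to treat the file watcher as a trigger only and, on every change, compare the parsed settings against the set of hooks ToolHive manages. The sketch below assumes the Claude Code `settings.json` layout of `hooks -> event -> [ {matcher, hooks: [ {type, command} ]} ]`; the command string `thv-hook claude` and the function names are hypothetical.

```python
from typing import Any


def installed_commands(settings: dict[str, Any]) -> set[tuple[str, str]]:
    """Collect (event, command) pairs for every command hook in the settings."""
    found: set[tuple[str, str]] = set()
    for event, groups in (settings.get("hooks") or {}).items():
        for group in groups:
            for hook in group.get("hooks", []):
                if hook.get("type") == "command":
                    found.add((event, hook.get("command", "")))
    return found


def missing_managed_hooks(
    settings: dict[str, Any], managed: set[tuple[str, str]]
) -> list[tuple[str, str]]:
    """Return managed hooks that are no longer present; user hooks are ignored."""
    return sorted(managed - installed_commands(settings))
```

Because only the managed pairs are checked, a user adding or removing a personal hook yields an empty result and no remediation, while removing the ToolHive entry is reported.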
| | Method | Endpoint | Description | | ||
| |--------|----------|-------------| | ||
| | `POST` | `/api/v1beta/hooks/install` | Install hooks for specified client(s) | |
There was a problem hiding this comment.
Is this an all-or-nothing install? If we have both observability hooks for skills and, for example, access control hooks that prevent non-ToolHive skills from being used, will these endpoints support installing only one? Or is it "install all Stacklok hooks"?
| skill.name: commit-message | ||
| skill.version: v1.2.0 | ||
| skill.client: claude|cursor|windsurf|cline | ||
| skill.status: success|failure|denied |
There was a problem hiding this comment.
Denials are a bit of a different animal in Claude Code...I think. We can loosely capture that based on if there's a PreToolUse that does not have a matching PostToolUse.
I'll post a follow on message with more details after I investigate more.
There was a problem hiding this comment.
Okay so the above shows you two flows (look at the session ID for which flow) where the model asks for permission to use a Skill. The first one is PreToolUse -> PermissionRequest -> PostToolUse with the same ID within the same time. This means I approved the request.
The second is PreToolUse -> PermissionRequest because I didn't approve. I don't have PostToolUse.
So in Claude, you don't get good denial information.
Via Claude's otel logging you can do a bit better by seeing "Hey this Skill was rejected by the user at this time and session" (see below). Doesn't really help us here as we're doing hooks but for what it's worth.
As an aside, I think Anthropic changed their policies around skills, and that they are basically in an auto-approve mode to read a skill unless you have explicit permissions for skills to be in ask or deny mode. I never get prompted for skill use anymore in the last week or so.
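The inference described above (a `PreToolUse` and `PermissionRequest` with no matching `PostToolUse`) could be sketched roughly as follows. The correlation key `tool_use_id`, the `ts` timestamp field, the timeout, and the extra `pending` status are all assumptions for illustration; a missing `PostToolUse` can also mean the tool is still running or the session ended, so `denied` here is a best guess rather than a confirmed denial.

```python
from typing import Any


def classify(
    events: list[dict[str, Any]], now: float, timeout_s: float = 30.0
) -> dict[str, str]:
    """Map each tool_use_id to an inferred status from its hook events."""
    started: dict[str, float] = {}
    asked: set[str] = set()
    finished: set[str] = set()
    for e in events:
        key = e["tool_use_id"]
        if e["event"] == "PreToolUse":
            started[key] = e["ts"]
        elif e["event"] == "PermissionRequest":
            asked.add(key)
        elif e["event"] == "PostToolUse":
            finished.add(key)

    statuses: dict[str, str] = {}
    for key, ts in started.items():
        if key in finished:
            statuses[key] = "success"
        elif key in asked and now - ts >= timeout_s:
            # Permission was requested and nothing followed within the window.
            statuses[key] = "denied"
        else:
            statuses[key] = "pending"
    return statuses
```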
| ### Data Security | ||
| - **Skill inputs not logged**: Only skill name, version, status, and timing are captured | ||
| - **No sensitive data in metrics**: Metrics contain only skill metadata, not content |
There was a problem hiding this comment.
Some companies consider the input prompt that comes with some skills sensitive. It'd be in the args. But my overall opinion is to just be very, very clear about what we are logging and not change anything yet.
There was a problem hiding this comment.
Oh lol you kind of address this with the next bullet.
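One way to make "what we log" explicit is an allowlist applied just before export, so anything not named (including skill args) is dropped by construction. This is a sketch only: the attribute names come from the RFC's `skill.*` list, while the function name and the record shape are hypothetical. Timing is assumed to travel as span timestamps rather than as an attribute.

```python
from typing import Any

# Attributes the RFC lists for skill telemetry; everything else is dropped.
ALLOWED_ATTRIBUTES = ("skill.name", "skill.version", "skill.client", "skill.status")


def to_span_attributes(record: dict[str, Any]) -> dict[str, Any]:
    """Keep only allowlisted keys so inputs and prompts never reach the exporter."""
    return {key: record[key] for key in ALLOWED_ATTRIBUTES if key in record}
```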
| { | ||
| "version": 1, | ||
| "hooks": { | ||
| "beforeMCPExecution": [ |
There was a problem hiding this comment.
nit: most of the configurations after claude code pertain to MCP servers, not skills.
| ## Open Questions | ||
| 1. **Skill name extraction**: How do we reliably identify skill invocations vs regular MCP tool calls across different clients? The tool name is typically "Skill" but we need to extract the actual skill name from `tool_input`. Need to validate extraction logic with real skill invocations across all clients. |
There was a problem hiding this comment.
We also need to verify we can get skill usage from other clients. I bet you can for Cursor. We would need to investigate the others.
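A starting point for the extraction logic in open question 1 might look like the sketch below. The candidate keys inside `tool_input` (`skill`, `name`, `command`) are guesses that need validating against real skill invocations on each client, which is exactly the open question; the function name is hypothetical.

```python
from typing import Any, Optional


def extract_skill_name(
    tool_name: str,
    tool_input: dict[str, Any],
    candidate_keys: tuple[str, ...] = ("skill", "name", "command"),
) -> Optional[str]:
    """Return the skill name for a skill invocation, or None for other tools."""
    if tool_name != "Skill":
        return None
    for key in candidate_keys:
        value = tool_input.get(key)
        if isinstance(value, str) and value.strip():
            return value.strip()
    # A Skill call whose input we cannot parse; callers may log it as unknown.
    return None
```

Keeping the key list as a parameter lets each client shim pass its own candidates once they are confirmed.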
| 1. **Skill name extraction**: How do we reliably identify skill invocations vs regular MCP tool calls across different clients? The tool name is typically "Skill" but we need to extract the actual skill name from `tool_input`. Need to validate extraction logic with real skill invocations across all clients. | ||
| 2. **THV-0034 dependency**: Should this RFC block on THV-0034 (long-running server), or should we implement standalone mode first and migrate later? Standalone mode has significant limitations for enterprise use. |
There was a problem hiding this comment.
My read is wiggle now. The standalone server may not be until March. We could start to use this internally in standalone mode before then.
Summary
This RFC proposes adding client hook management to ToolHive CLI, enabling OpenTelemetry-based observability for agent skill execution. ToolHive will install and manage hooks in supported AI clients that capture skill invocation telemetry and forward it to the existing `pkg/telemetry/` infrastructure for OTLP export.

Key Design Decisions
`config.yml` deployed by IT; developers don't need to run commands

Dependencies

Test plan
`pkg/api/` patterns

🤖 Generated with Claude Code