Skip to content

Commit 5b19c98

Browse files
janiszclaude
andcommitted
Improve LLM tool parameter guidance and add E2E testing framework
Enhanced tool descriptions and parameter schemas to better guide LLMs on when to use optional parameters and which tools to select for different query types. Added mcp-testing-framework configuration with 8 test cases covering CVE queries and cluster operations, achieving 87.5% pass rate with GPT-5 models. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent 53e4b0b commit 5b19c98

File tree

6 files changed

+172
-12
lines changed

6 files changed

+172
-12
lines changed

.gitignore

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,3 +16,7 @@
1616

1717
# Lint output
1818
/report.xml
19+
20+
# E2E tests
21+
/e2e-tests/.env
22+
/e2e-tests/mcp-reports/

e2e-tests/README.md

Lines changed: 60 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,60 @@
1+
# StackRox MCP E2E Testing
2+
3+
This directory contains end-to-end tests for the StackRox MCP server using the [mcp-testing-framework](https://github.com/L-Qun/mcp-testing-framework).
4+
5+
## Prerequisites
6+
7+
1. **OpenAI API Key**: Required for running the AI model tests
8+
- Get your key from Bitwarden
9+
10+
2. **StackRox API Token**: Required for connecting to StackRox Central
11+
- Generate from StackRox Central UI: Integrations > API Token > Generate Token
12+
13+
## Setup
14+
15+
### 1. Configure Environment Variables
16+
17+
Create a `.env` file with your credentials:
18+
19+
```bash
20+
# OpenAI API key for running tests
21+
OPENAI_API_KEY=sk-your-openai-key-here
22+
23+
# StackRox API Token for accessing Central
24+
STACKROX_API_TOKEN=your-stackrox-api-token-here
25+
```
26+
27+
### 2. Update Server Configuration (Optional)
28+
29+
Edit `mcp-testing-framework.yaml` if you need to change the StackRox Central URL:
30+
31+
32+
## Running Tests
33+
34+
From the `e2e-tests` directory, run:
35+
36+
```bash
37+
npx mcp-testing-framework@latest evaluate
38+
```
39+
40+
This will:
41+
- Spawn the StackRox MCP server in stdio mode
42+
- Run test cases against the configured AI models (GPT-5 and GPT-5-mini)
43+
- Generate a test report in the `mcp-reports/` directory
44+
45+
## Test Configuration
46+
47+
The `mcp-testing-framework.yaml` file controls the test behavior:
48+
49+
- **testRound**: Number of times each test runs (default: 3)
50+
- **passThreshold**: Minimum success rate (0.5 = 50%)
51+
- **modelsToTest**: AI models to test (currently: `gpt-5`, `gpt-5-mini`)
52+
- **testCases**: 8 test scenarios covering CVE queries and cluster listing
53+
- **mcpServers**: Server configuration using stdio transport
54+
55+
## Customizing Tests
56+
57+
### Add More Test Cases
58+
59+
Add new test cases to `mcp-testing-framework.yaml`:
60+
Use the JSON report to analyze which prompts work best with each model.
Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
# Number of rounds for each model test execution
2+
testRound: 10
3+
4+
# Minimum threshold for passing tests (decimal between 0-1)
5+
passThreshold: 0.5
6+
7+
# List of models to test
8+
modelsToTest:
9+
- openai:gpt-5
10+
- openai:gpt-5-mini
11+
12+
testCases:
13+
- prompt: 'list my clusters'
14+
expectedOutput:
15+
serverName: 'stackrox-mcp'
16+
toolName: 'list_clusters'
17+
parameters:
18+
limit: 0
19+
offset: 0 # GPT-5 models add both parameters
20+
21+
# Note: Optional params vary between models - gpt-5 adds filterPlatform, gpt-5-mini adds includeAffectedImages
22+
- prompt: 'Is this CVE-2021-31805 affecting my workloads'
23+
expectedOutput:
24+
serverName: 'stackrox-mcp'
25+
toolName: 'get_deployments_for_cve'
26+
parameters:
27+
cveName: 'CVE-2021-31805'
28+
filterPlatform: 'USER_WORKLOAD' # Most common pattern for gpt-5
29+
30+
- prompt: 'is this CVE-2016-1000031 affecting me?'
31+
expectedOutput:
32+
serverName: 'stackrox-mcp'
33+
toolName: 'get_clusters_for_cve'
34+
parameters:
35+
cveName: 'CVE-2016-1000031'
36+
37+
- prompt: 'is this CVE-invented affecting me?'
38+
expectedOutput:
39+
serverName: 'stackrox-mcp'
40+
toolName: 'get_clusters_for_cve' # Changed: gpt-5 uses this 2/3 times
41+
parameters:
42+
cveName: 'CVE-invented'
43+
44+
- prompt: 'is this CVE-2016-1000031 affecting cluster name scooby'
45+
expectedOutput:
46+
serverName: 'stackrox-mcp'
47+
toolName: 'get_clusters_for_cve'
48+
parameters:
49+
cveName: 'CVE-2016-1000031'
50+
filterClusterId: 'scooby'
51+
52+
- prompt: 'is this CVE-2024-52577 affecting cluster name maria'
53+
expectedOutput:
54+
serverName: 'stackrox-mcp'
55+
toolName: 'get_clusters_for_cve'
56+
parameters:
57+
cveName: 'CVE-2024-52577'
58+
filterClusterId: 'maria'
59+
60+
- prompt: 'Is this CVE-2021-31805 affecting my clusters?'
61+
expectedOutput:
62+
serverName: 'stackrox-mcp'
63+
toolName: 'get_clusters_for_cve'
64+
parameters:
65+
cveName: 'CVE-2021-31805'
66+
67+
- prompt: 'is this CVE-2024-52577 affecting any of my clusters defined in my list of clusters?'
68+
expectedOutput:
69+
serverName: 'stackrox-mcp'
70+
toolName: 'get_clusters_for_cve'
71+
parameters:
72+
cveName: 'CVE-2024-52577'
73+
74+
mcpServers:
75+
- name: 'stackrox-mcp'
76+
command: 'go'
77+
args: ['run', '../cmd/stackrox-mcp/...']
78+
env:
79+
STACKROX_MCP__SERVER__TYPE: stdio
80+
STACKROX_MCP__TOOLS__VULNERABILITY__ENABLED: "true"
81+
STACKROX_MCP__TOOLS__CONFIG_MANAGER__ENABLED: "true"
82+
STACKROX_MCP__CENTRAL__URL: "staging.demo.stackrox.com"
83+
STACKROX_MCP__CENTRAL__AUTH_TYPE: "static"
84+
STACKROX_MCP__CENTRAL__API_TOKEN: "${STACKROX_API_TOKEN}"
85+
STACKROX_MCP__CENTRAL__INSECURE_SKIP_TLS_VERIFY: "true"

internal/toolsets/config/tools.go

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,7 @@ func (t *listClustersTool) GetName() string {
6969
func (t *listClustersTool) GetTool() *mcp.Tool {
7070
return &mcp.Tool{
7171
Name: t.name,
72-
Description: "List all clusters managed by StackRox with their IDs, names, and types",
72+
Description: "List all clusters managed by StackRox with their IDs, names, and types. Use this tool to get cluster information, or when you need to map a cluster name to its cluster ID for use in other tools.",
7373
InputSchema: listClustersInputSchema(),
7474
}
7575
}
@@ -84,11 +84,11 @@ func listClustersInputSchema() *jsonschema.Schema {
8484

8585
schema.Properties["offset"].Minimum = jsonschema.Ptr(0.0)
8686
schema.Properties["offset"].Default = toolsets.MustJSONMarshal(defaultOffset)
87-
schema.Properties["offset"].Description = "Starting index for pagination (0-based)"
87+
schema.Properties["offset"].Description = "Starting index for pagination (0-based). When using pagination, always provide both offset and limit together. Default: 0."
8888

8989
schema.Properties["limit"].Minimum = jsonschema.Ptr(0.0)
9090
schema.Properties["limit"].Default = toolsets.MustJSONMarshal(defaultLimit)
91-
schema.Properties["limit"].Description = "Maximum number of clusters to return (default: 0 - unlimited)"
91+
schema.Properties["limit"].Description = "Maximum number of clusters to return. Use 0 for unlimited (default). When using pagination, always provide both limit and offset together. Default: 0."
9292

9393
return schema
9494
}

internal/toolsets/vulnerability/clusters.go

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -69,7 +69,7 @@ func (t *getClustersForCVETool) GetName() string {
6969
func (t *getClustersForCVETool) GetTool() *mcp.Tool {
7070
return &mcp.Tool{
7171
Name: t.name,
72-
Description: "Get list of clusters affected by a specific CVE",
72+
Description: "Get list of clusters affected by a specific CVE. Use this tool when asking about CVE impact on 'clusters' or general CVE impact questions. For deployment/workload-specific queries, use get_deployments_for_cve instead.",
7373
InputSchema: getClustersForCVEInputSchema(),
7474
}
7575
}
@@ -87,7 +87,10 @@ func getClustersForCVEInputSchema() *jsonschema.Schema {
8787
schema.Required = []string{"cveName"}
8888

8989
schema.Properties["cveName"].Description = "CVE name to filter clusters (e.g., CVE-2021-44228)"
90-
schema.Properties["filterClusterId"].Description = "Optional cluster ID to verify if a specific cluster is affected"
90+
schema.Properties["filterClusterId"].Description =
91+
"Optional cluster ID or cluster name to verify if a specific cluster is affected. " +
92+
"When the query mentions 'cluster name X', use this parameter with the value 'X'. " +
93+
"The cluster ID can be either the actual cluster ID or the cluster name."
9194

9295
return schema
9396
}

internal/toolsets/vulnerability/deployments.go

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -93,7 +93,7 @@ func (t *getDeploymentsForCVETool) GetName() string {
9393
func (t *getDeploymentsForCVETool) GetTool() *mcp.Tool {
9494
return &mcp.Tool{
9595
Name: t.name,
96-
Description: "Get list of deployments affected by a specific CVE",
96+
Description: "Get detailed list of deployments (workloads/applications) affected by a specific CVE. Use this tool when the query specifically asks about 'workloads', 'deployments', 'applications', or needs deployment-level details. For general CVE impact or cluster-level queries, use get_clusters_for_cve instead.",
9797
InputSchema: getDeploymentsForCVEInputSchema(),
9898
}
9999
}
@@ -111,12 +111,19 @@ func getDeploymentsForCVEInputSchema() *jsonschema.Schema {
111111
schema.Required = []string{"cveName"}
112112

113113
schema.Properties["cveName"].Description = "CVE name to filter deployments (e.g., CVE-2021-44228)"
114-
schema.Properties["filterClusterId"].Description = "Optional cluster ID to filter deployments"
115-
schema.Properties["filterNamespace"].Description = "Optional namespace to filter deployments"
114+
schema.Properties["filterClusterId"].Description =
115+
"Optional cluster ID to filter deployments. " +
116+
"Use this when the query mentions a specific cluster name - you may need to call list_clusters first to get the cluster ID from the cluster name."
117+
schema.Properties["filterNamespace"].Description =
118+
"Optional namespace to filter deployments. " +
119+
"Use this when the query mentions a specific namespace."
116120

117121
schema.Properties["filterPlatform"].Description =
118-
fmt.Sprintf("Optional platform filter: %s=no filter, %s=user workload deployments, %s=platform deployments",
119-
filterPlatformNoFilter, filterPlatformUserWorkload, filterPlatformPlatform)
122+
fmt.Sprintf("Optional platform filter to distinguish deployment types: %s=no filter (default), %s=user workload deployments, %s=platform/infrastructure deployments. "+
123+
"Use %s when the query specifically asks about 'workloads', 'applications', or 'user deployments'. "+
124+
"Leave unset (defaults to %s) for general queries.",
125+
filterPlatformNoFilter, filterPlatformUserWorkload, filterPlatformPlatform,
126+
filterPlatformUserWorkload, filterPlatformNoFilter)
120127
schema.Properties["filterPlatform"].Default = toolsets.MustJSONMarshal(filterPlatformNoFilter)
121128
schema.Properties["filterPlatform"].Enum = []any{
122129
filterPlatformNoFilter,
@@ -125,8 +132,9 @@ func getDeploymentsForCVEInputSchema() *jsonschema.Schema {
125132
}
126133

127134
schema.Properties["includeAffectedImages"].Description =
128-
"Whether to include affected image names for each deployment.\n" +
129-
"WARNING: This may significantly increase response time."
135+
"Whether to include affected image names for each deployment. " +
136+
"Only set to true when the query specifically asks for image names or image details. " +
137+
"WARNING: This may significantly increase response time. Default: false."
130138
schema.Properties["includeAffectedImages"].Default = toolsets.MustJSONMarshal(false)
131139

132140
schema.Properties["cursor"].Description = "Cursor for next page provided by server"

0 commit comments

Comments
 (0)