Commit bcaaa07

use gevals
Signed-off-by: Tomasz Janiszewski <tomek@redhat.com>
1 parent 2868f53 commit bcaaa07

17 files changed: 1273 additions, 33 deletions

e2e-tests/README.md

Lines changed: 65 additions & 33 deletions
@@ -1,60 +1,92 @@
 # StackRox MCP E2E Testing
 
-This directory contains end-to-end tests for the StackRox MCP server using the [mcp-testing-framework](https://github.com/L-Qun/mcp-testing-framework).
+End-to-end tests for the StackRox MCP server using [gevals](https://github.com/genmcp/gevals).
 
 ## Prerequisites
 
-1. **OpenAI API Key**: Required for running the AI model tests
-- Get your key from Bitwarden
-
-2. **StackRox API Token**: Required for connecting to StackRox Central
-- Generate from StackRox Central UI: Integrations > API Token > Generate Token
+- Go 1.25+
+- OpenAI API Key (for AI agent and LLM judge)
+- StackRox API Token
 
 ## Setup
 
-### 1. Configure Environment Variables
-
-Create a `.env` file with your credentials:
+### 1. Build gevals
 
 ```bash
-# OpenAI API key for running tests
-OPENAI_API_KEY=sk-your-openai-key-here
-
-# StackRox API Token for accessing Central
-STACKROX_API_TOKEN=your-stackrox-api-token-here
+cd e2e-tests
+./scripts/build-gevals.sh
 ```
 
-### 2. Update Server Configuration (Optional)
+### 2. Configure Environment
 
-Edit `mcp-testing-framework.yaml` if you need to change the StackRox Central URL:
+Create `.env` file:
 
+```bash
+OPENAI_API_KEY=sk-your-key-here
+STACKROX_API_TOKEN=your-token-here
+```
 
 ## Running Tests
 
-From the `e2e-tests` directory, run:
+```bash
+./scripts/run-tests.sh
+```
+
+Results are saved to `gevals-stackrox-mcp-e2e-out.json`.
+
+### View Results
 
 ```bash
-npx mcp-testing-framework@latest evaluate
+# Summary
+jq '.tasks[] | {name, passed}' gevals-stackrox-mcp-e2e-out.json
+
+# Tool calls
+jq '.tasks[].callHistory[] | {toolName, arguments}' gevals-stackrox-mcp-e2e-out.json
 ```
 
-This will:
-- Spawn the StackRox MCP server in stdio mode
-- Run test cases against the configured AI models (GPT-5 and GPT-5-mini)
-- Generate a test report in the `mcp-reports/` directory
+## Test Cases
+
+| Test | Description | Tool |
+|------|-------------|------|
+| `list-clusters` | List all clusters | `list_clusters` |
+| `cve-affecting-workloads` | CVE impact on deployments | `get_deployments_for_cve` |
+| `cve-affecting-clusters` | CVE impact on clusters | `get_clusters_for_cve` |
+| `cve-nonexistent` | Handle non-existent CVE | `get_clusters_for_cve` |
+| `cve-cluster-scooby` | CVE with cluster filter | `get_clusters_for_cve` |
+| `cve-cluster-maria` | CVE with cluster filter | `get_clusters_for_cve` |
+| `cve-clusters-general` | General CVE query | `get_clusters_for_cve` |
+| `cve-cluster-list` | CVE across clusters | `get_clusters_for_cve` |
+
+## Configuration
+
+- **`gevals/eval.yaml`**: Main test configuration, agent settings, assertions
+- **`gevals/mcp-config.yaml`**: MCP server configuration
+- **`gevals/tasks/*.yaml`**: Individual test task definitions
 
-## Test Configuration
+## How It Works
 
-The `mcp-testing-framework.yaml` file controls the test behavior:
+Gevals uses a proxy architecture to intercept MCP tool calls:
 
-- **testRound**: Number of times each test runs (default: 3)
-- **passThreshold**: Minimum success rate (0.5 = 50%)
-- **modelsToTest**: AI models to test (currently: `gpt-5`, `gpt-5-mini`)
-- **testCases**: 8 test scenarios covering CVE queries and cluster listing
-- **mcpServers**: Server configuration using stdio transport
+1. AI agent receives task prompt
+2. Agent calls MCP tool
+3. Gevals proxy intercepts and records the call
+4. Call forwarded to StackRox MCP server
+5. Server executes and returns result
+6. Gevals validates assertions and response quality
 
-## Customizing Tests
+## Troubleshooting
+
+**Tests fail - no tools called**
+- Verify StackRox Central is accessible
+- Check API token permissions
+
+**Build errors**
+```bash
+go mod tidy
+./scripts/build-gevals.sh
+```
 
-### Add More Test Cases
+## Further Reading
 
-Add new test cases to `mcp-testing-framework.yaml`:
-Use the JSON report to analyze which prompts work best with each model.
+- [Gevals Documentation](https://github.com/genmcp/gevals)
+- [StackRox MCP Server](../README.md)
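
The two `jq` queries under "View Results" can also be reproduced in a short script when a pass/fail count or per-tool tally is wanted. The Python sketch below assumes the report has the shape those queries imply: a top-level `tasks` array whose entries carry `name`, `passed`, and a `callHistory` list of objects with `toolName` and `arguments`. The function name `summarize` and the sample data are hypothetical and not part of gevals; the real output file may contain other fields.

```python
import json
from collections import Counter


def summarize(report: dict) -> dict:
    """Summarize a report whose schema is assumed from the README's jq queries."""
    tasks = report.get("tasks", [])
    # Tally every recorded tool call across all tasks.
    tool_counts = Counter(
        call.get("toolName")
        for task in tasks
        for call in task.get("callHistory", [])
    )
    return {
        "total": len(tasks),
        "passed": sum(1 for task in tasks if task.get("passed")),
        "failed": [task.get("name") for task in tasks if not task.get("passed")],
        "tool_calls": dict(tool_counts),
    }


# Hypothetical sample data; a real run would instead load
# gevals-stackrox-mcp-e2e-out.json with json.load().
sample = {
    "tasks": [
        {
            "name": "list-clusters",
            "passed": True,
            "callHistory": [{"toolName": "list_clusters", "arguments": {}}],
        },
        {
            "name": "cve-nonexistent",
            "passed": False,
            "callHistory": [
                {
                    "toolName": "get_clusters_for_cve",
                    "arguments": {"cveName": "CVE-2099-00001"},
                }
            ],
        },
    ]
}

print(json.dumps(summarize(sample), indent=2))
```

Using `.get()` throughout keeps the script tolerant of tasks that recorded no calls, which is the situation the "no tools called" troubleshooting entry describes.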

e2e-tests/gevals/eval.yaml

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+kind: Eval
+metadata:
+  name: "stackrox-mcp-e2e"
+config:
+  agent:
+    type: "builtin.openai-agent"
+    model: "gpt-4o"
+  llmJudge:
+    env:
+      baseUrlKey: JUDGE_BASE_URL
+      apiKeyKey: JUDGE_API_KEY
+      modelNameKey: JUDGE_MODEL_NAME
+  mcpConfigFile: mcp-config.yaml
+  taskSets:
+    # Test 1: List clusters
+    - path: tasks/list-clusters.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "list_clusters"
+        minToolCalls: 1
+        maxToolCalls: 1
+
+    # Test 2: CVE affecting workloads
+    - path: tasks/cve-affecting-workloads.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_deployments_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2021-31805"
+        minToolCalls: 1
+        maxToolCalls: 1
+
+    # Test 3: CVE affecting clusters - basic
+    - path: tasks/cve-affecting-clusters.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2016-1000031"
+        minToolCalls: 1
+        maxToolCalls: 3
+
+    # Test 4: Non-existent CVE
+    - path: tasks/cve-nonexistent.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2099-00001"
+        minToolCalls: 1
+        maxToolCalls: 2
+
+    # Test 5: CVE with specific cluster filter (scooby)
+    - path: tasks/cve-cluster-scooby.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "list_clusters"
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2016-1000031"
+        minToolCalls: 1
+        maxToolCalls: 2
+
+    # Test 6: CVE with specific cluster filter (maria)
+    - path: tasks/cve-cluster-maria.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "list_clusters"
+        minToolCalls: 1
+        maxToolCalls: 2
+
+    # Test 7: CVE affecting clusters - general
+    - path: tasks/cve-clusters-general.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2021-31805"
+        minToolCalls: 1
+        maxToolCalls: 5
+
+    # Test 8: CVE check with cluster list reference
+    - path: tasks/cve-cluster-list.yaml
+      assertions:
+        toolsUsed:
+          - server: stackrox-mcp
+            toolPattern: "get_clusters_for_cve"
+            argumentsMatch:
+              cveName: "CVE-2024-52577"
+        minToolCalls: 1
+        maxToolCalls: 5
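
To make the assertion fields above concrete, here is a minimal Python sketch of how a recorded call history might be checked against one task's `assertions` block. It is not gevals' implementation, and the semantics are assumptions: `toolPattern` is treated as a regular expression that must match the whole tool name, `argumentsMatch` as exact equality on the listed keys, and `minToolCalls`/`maxToolCalls` as bounds on the total number of calls in the task. `ToolCall`, `tool_used`, and `check_assertions` are hypothetical names.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """One recorded MCP tool call (hypothetical record type)."""
    server: str
    tool_name: str
    arguments: dict = field(default_factory=dict)


def tool_used(calls, server, tool_pattern, arguments_match=None):
    """Return True if some call matches the server, pattern, and expected arguments."""
    expected = arguments_match or {}
    for call in calls:
        if call.server != server:
            continue
        # Assumption: the pattern must match the entire tool name.
        if not re.fullmatch(tool_pattern, call.tool_name):
            continue
        # Assumption: listed arguments must be equal; extra arguments are ignored.
        if all(call.arguments.get(key) == value for key, value in expected.items()):
            return True
    return False


def check_assertions(calls, assertions):
    """Return a list of failure messages; an empty list means every check held."""
    failures = []
    for spec in assertions.get("toolsUsed", []):
        if not tool_used(calls, spec["server"], spec["toolPattern"],
                         spec.get("argumentsMatch")):
            failures.append(
                f"no matching call to {spec['toolPattern']} on {spec['server']}"
            )
    count = len(calls)
    minimum = assertions.get("minToolCalls")
    maximum = assertions.get("maxToolCalls")
    if minimum is not None and count < minimum:
        failures.append(f"{count} tool calls, expected at least {minimum}")
    if maximum is not None and count > maximum:
        failures.append(f"{count} tool calls, expected at most {maximum}")
    return failures


# Assertions copied from Test 5 (cve-cluster-scooby) in eval.yaml.
test5 = {
    "toolsUsed": [
        {"server": "stackrox-mcp", "toolPattern": "list_clusters"},
        {
            "server": "stackrox-mcp",
            "toolPattern": "get_clusters_for_cve",
            "argumentsMatch": {"cveName": "CVE-2016-1000031"},
        },
    ],
    "minToolCalls": 1,
    "maxToolCalls": 2,
}

# Hypothetical call history for illustration only.
history = [
    ToolCall("stackrox-mcp", "list_clusters"),
    ToolCall("stackrox-mcp", "get_clusters_for_cve", {"cveName": "CVE-2016-1000031"}),
]

print(check_assertions(history, test5))
```

Under these assumed semantics, Test 5 requires both tools to appear within at most two calls, so an agent that retries either call would fail the `maxToolCalls` bound even if every call were correct.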
