Nightly Terminal-Bench #49
nightly-terminal-bench.yml
on: schedule
Determine models to test (2s)
Matrix: benchmark
Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| terminal-bench-results-anthropic-claude-sonnet-4-5-20183374543 | 6.24 MB | sha256:f21ff465975927df853963a1e4e4e64344789a8df489d7a8ce032cf1d1564db4 |
| terminal-bench-results-openai-gpt-5.1-codex-20183374543 | 5.88 MB | sha256:05907b09eef0afe23d5f0c35b35aa50a287cb9552a581fa9583b678b4ce7ba99 |