Nightly Terminal-Bench #46
nightly-terminal-bench.yml
on: schedule
Determine models to test (2s)
Matrix: benchmark
Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| terminal-bench-results-anthropic-claude-sonnet-4-5-20082545805 | 6.86 MB | sha256:ada9fd6697fc1e63960e832f3c8b1ca10c1678f53bd4a38a59fad8b204682de3 |
| terminal-bench-results-openai-gpt-5.1-codex-20082545805 | 4.97 MB | sha256:7c98a9fa62ea7e27736ee05f90d9b99f216d19accbf08f87c4fd97808415820c |