Nightly Terminal-Bench #53
nightly-terminal-bench.yml
on: schedule
Determine models to test (4s)
Matrix: benchmark
Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| terminal-bench-results-anthropic-claude-opus-4-5-20286826812 | 5.97 MB | sha256:ee91169adf60d246513961501e3fe9dfaf61ec4a778bfe65b614f2fbb9e28126 |
| terminal-bench-results-openai-gpt-5.2-20286826812 | 8.56 MB | sha256:7326b7d14ab88747905f6e5860dc4e04f20cef1552c3fce43810d2e0c9bada94 |
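Each artifact is listed with a `sha256:` digest, so a downloaded copy can be checked against it. The following is a minimal sketch of that check, not part of the workflow itself. It assumes the digest covers the downloaded archive file as a whole; the helper names `sha256_of_file` and `matches_digest` are hypothetical and introduced here only for illustration.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks to bound memory use."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_digest(path, digest):
    """Compare a file against a digest string of the form 'sha256:<hex>'."""
    algorithm, _, expected = digest.partition(":")
    if algorithm != "sha256" or not expected:
        raise ValueError(f"unsupported digest format: {digest!r}")
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_of_file(path), expected.lower())
```

For example, `matches_digest(local_path, "sha256:ee91169adf60d246513961501e3fe9dfaf61ec4a778bfe65b614f2fbb9e28126")` would return `True` only if the file at `local_path` (a placeholder for wherever the first artifact was saved) hashes to the digest shown in the table.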