Commit c4f5cff
🤖 bench: switch Terminal Bench GPT 5.2 to GPT 5.4 (#2824)
## Summary
- switch the Terminal Bench workflow defaults/examples from
`openai/gpt-5.2` to `openai/gpt-5.4`
- add GPT-5.4 leaderboard metadata while preserving the GPT-5.2 mapping
for mixed or historical artifacts
## Validation
- `make static-check`
- `python3 -m py_compile benchmarks/terminal_bench/prepare_leaderboard_submission.py`
- targeted `python3` verification that workflow defaults now reference
GPT 5.4 and metadata preserves both GPT 5.2 and GPT 5.4 entries
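
The targeted verification in the last bullet could look roughly like the sketch below. It is illustrative only: the commit does not show the actual check, so `SAMPLE_WORKFLOW`, `SAMPLE_METADATA`, `stale_model_lines`, and `missing_models` are hypothetical stand-ins, and the real workflow YAML and metadata layout may differ.

```python
# Hypothetical stand-in for a workflow file; the real YAML is not shown here.
SAMPLE_WORKFLOW = """\
inputs:
  models:
    description: "Models to benchmark, e.g. openai/gpt-5.4"
    default: "openai/gpt-5.4"
"""

# Hypothetical stand-in for the leaderboard metadata mapping.
SAMPLE_METADATA = {
    "openai/gpt-5.2": {"display_name": "GPT 5.2"},
    "openai/gpt-5.4": {"display_name": "GPT 5.4"},
}


def stale_model_lines(workflow_text, old_model="openai/gpt-5.2"):
    """Return 1-based line numbers that still mention the old model."""
    return [
        number
        for number, line in enumerate(workflow_text.splitlines(), start=1)
        if old_model in line
    ]


def missing_models(metadata, required=("openai/gpt-5.2", "openai/gpt-5.4")):
    """Return the required model keys that the metadata mapping lacks."""
    return [model for model in required if model not in metadata]


if __name__ == "__main__":
    # Workflow defaults/examples should reference only the new model.
    assert stale_model_lines(SAMPLE_WORKFLOW) == []
    assert "openai/gpt-5.4" in SAMPLE_WORKFLOW
    # Metadata should keep the old entry and include the new one.
    assert missing_models(SAMPLE_METADATA) == []
    print("ok")
```

In practice the sample values would be replaced by the contents of the changed workflow files and the mapping defined in the leaderboard script.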
---
_Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` •
Cost: `$0.36`_
<!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=0.36 -->
1 parent 07767f5 commit c4f5cff
3 files changed
Lines changed: 12 additions & 3 deletions
File tree
- .github/workflows
- benchmarks/terminal_bench
Changed lines by file, in the order shown in the diff:

| File | Removed (original line number) | Added (diff line number) |
|---|---|---|
| 1 | 13, 102 | 13, 102 |
| 2 | 90 | 90 |
| 3 (9 additions & 0 deletions) | none | 105-106, 114-120 |