Would love to see more comparisons with some of the 'other' options, especially those in OpenCode Go.
Even Claude itself is giving props to MiniMax M2.5:
| Model | SWE-bench | Strength vs Claude | Weakness |
| --- | --- | --- | --- |
| MiniMax M2.5 | 80.2% | Cheaper, faster | Less reasoning depth |
| GLM-5 | 77.8% | Less hallucination | Slower, heavier |
| Kimi K2.5 | ~77% | Big context, multimodal | Hallucinates more |
| Nemotron 3 Super | Competitive | Efficient, self-hostable | Smaller, less capable |
| Claude Sonnet 4.6 | ~80%+ | Best consistency + tool use | Costs more via API |