Moatless Tools used in Stanford/Oxford/Google DeepMind paper #35

@JensRoland

Loved to see Moatless Tools used to set a new SoTA on SWE-Bench Lite by using multi-shot (active search).

Read the paper

From a related article on Medium:

“Impressively, when running DeepSeek-V2-coder, a small language model with multiple sampling, the model outperformed state-of-the-art models like GPT-4o or Claude 3.5 Sonnet, achieving a new state-of-the-art 56% in SWE-Bench Lite (a benchmark that evaluates a model’s capacity to solve GitHub issues), while these two models, combined, achieved 43% (Mixed models).”
