Skip to content

CI: Wire vally eval invocation to a curated medium-size suite via .vally.yaml #1920

@wbreza

Description

@wbreza

Context

Follow-up to #1912, which migrated Waza eval references to Vally. That PR deliberately wired .github/workflows/eval.yml to a single hardcoded --eval-spec path to keep scope small and get CI green.

Goal

Replace the single --eval-spec flag with --suite <name> driven by .vally.yaml suite config, so CI runs a curated set of eval specs (not just one).

Target

  • Inner loop (local dev): vally-cli eval --eval-spec <one> — fast iteration on a single spec.
  • Medium loop (CI): vally-cli eval --suite pr — curated set targeting 5–10 minute wall-clock runtime.
  • Outer loop (nightly / manual): vally-cli eval --suite full or individual dispatches — full coverage.

Prereqs

Acceptance

  • .vally.yaml defines pr and full suites with explicit tag filters
  • .github/workflows/eval.yml Run evaluations step uses --suite pr
  • CI wall-clock time for the pr suite is 5–10 minutes
  • Results artifact still uploads at the standard path

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions