act-based workflow smoke tests + sandbox infrastructure #293

@mmcdermott

Follow-up to the testing discussion on PR #291: the workflow paths are currently verified only by firing them for real.

actionlint is now landing via #292 — static analysis is Layer 1. This issue tracks the more substantive testing infrastructure.

Goals

Confirm upload_benchmark_result.yaml + aggregate_benchmark_results.yaml + regenerate_entities.yaml work end-to-end without writing to the real _results / _web branches, so we can iterate on workflow logic without polluting production state.

Proposal

Layer 2 — act smoke tests in CI

act runs GitHub Actions workflows locally via Docker and supports workflow_call. Add a CI job that:

  1. Synthesizes an issues.labeled event payload (committed to the repo as e.g. .github/test_events/issue_labeled.json).
  2. Runs the workflow under act with MEDS_DEV_DRY_RUN=1.
  3. Asserts the workflow completes without error.
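
The three steps above can be sketched as a small helper that writes the fixture and assembles the act invocation. This is only a sketch, not the project's implementation: the function names, label name, issue fields, and repository name are hypothetical placeholders, and the payload carries only a small subset of the fields GitHub sends for an issues.labeled event. The act options used (-W, -e, --env) are standard act flags.

```python
from __future__ import annotations

import json
from pathlib import Path


def build_issue_labeled_payload(label: str, issue_number: int, body: str) -> dict:
    """Return a minimal stand-in for a GitHub `issues` event with action `labeled`."""
    label_obj = {"name": label}
    return {
        "action": "labeled",
        "label": label_obj,
        "issue": {
            "number": issue_number,
            "title": "Synthetic benchmark result",  # hypothetical title
            "body": body,
            "labels": [label_obj],
            "user": {"login": "test-user"},  # hypothetical user
        },
        "repository": {"full_name": "example-org/example-repo"},  # hypothetical repo
        "sender": {"login": "test-user"},
    }


def write_fixture(payload: dict, path: Path) -> Path:
    """Write the payload as pretty-printed JSON, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2) + "\n")
    return path


def act_command(workflow: str, event_path: Path) -> list[str]:
    """Assemble an act invocation for the `issues` event in dry-run mode."""
    return [
        "act", "issues",
        "-W", workflow,            # workflow file to run
        "-e", str(event_path),     # synthetic event payload
        "--env", "MEDS_DEV_DRY_RUN=1",
    ]
```

Passing the returned list to `subprocess.run(cmd, check=True)` would cover step 3, since a non-zero exit from act raises an exception and fails the job.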

Workflow changes needed:

  • Gate the actual git push origin _results / git push origin _web steps behind if: env.MEDS_DEV_DRY_RUN != '1' (or equivalent).
  • Optionally: emit a structured "would have pushed: " line on dry runs so the test can assert the intended side-effects without performing them.
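
One way a test could assert the intended side effects is to scan the captured act output for the marker line. A minimal sketch, assuming the marker text is exactly "would have pushed: " and treating whatever follows it as an opaque string, since this issue does not fix that format. The function name is hypothetical.

```python
from __future__ import annotations

DRY_RUN_MARKER = "would have pushed: "


def extract_would_push(log_text: str) -> list[str]:
    """Return the text following the dry-run marker on each line that contains it."""
    found = []
    for line in log_text.splitlines():
        # act typically prefixes step output with job information,
        # so search within the line rather than only at its start.
        _, sep, rest = line.partition(DRY_RUN_MARKER)
        if sep:
            found.append(rest.strip())
    return found
```

A smoke test could then compare the returned list against the pushes it expects for the synthetic payload.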

Catches:

  • Wrong action versions / typos / missing inputs (already covered by actionlint, but redundant safety doesn't hurt).
  • workflow_call plumbing — job dependencies, permission inheritance, output passing.
  • Issue-extraction → validate → would-push flow against a synthetic event payload.
  • Most expression-language bugs.

Doesn't catch:

  • Token-scoping issues (act uses a different auth model).
  • Real GitHub branch-protection rules.
  • Network failure modes (rate limits, transient GitHub API errors).

Layer 3 — sandbox repo / branches (manual, periodic)

For what act can't verify, maintain a separate sandbox where the real workflows can be fired without polluting production. Two options:

  • Separate repo: clone of MEDS-DEV used only for workflow shakeouts. Pro: complete isolation. Con: drift from main repo state.
  • test_* branch prefix in the same repo: workflows accept an input/env var that swaps _results → test_results and _web → test_web. Pro: same repo state. Con: more workflow complexity, easier to accidentally cross streams.
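
For the branch-prefix option, the swap itself is small. A minimal sketch of the mapping, assuming a boolean sandbox switch; the function and constant names are hypothetical, and a real version would live in the workflows' shell steps or expressions rather than in Python.

```python
PRODUCTION_BRANCHES = ("_results", "_web")


def target_branch(branch: str, sandbox: bool) -> str:
    """Map a production branch to its test_* counterpart when sandboxed."""
    if branch not in PRODUCTION_BRANCHES:
        # Refuse unknown names so a typo cannot silently hit an unintended branch.
        raise ValueError(f"unexpected branch: {branch!r}")
    return f"test{branch}" if sandbox else branch
```

Rejecting unknown branch names is one way to reduce the "cross streams" risk noted above.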

Probably start with the separate-repo approach.

Layer 4 (stretch) — full integration with a docker-compose stack

If we ever wanted to be really thorough: spin up a local GitHub Enterprise emulator or use a self-hosted runner in a controlled environment. Probably overkill for the scale of MEDS-DEV.

Acceptance criteria

  • MEDS_DEV_DRY_RUN (or equivalent) gating is in the three workflows, with documentation in src/MEDS_DEV/web/README.md.
  • .github/test_events/*.json payload fixtures committed.
  • CI job runs act against the synthetic payloads on every PR that touches .github/workflows/**.
  • Sandbox-repo runbook documented (where it lives, how to refresh it, expected manual cadence).

Labels: Testing, Website / Branding, priority:medium