
feat: add backfill script (CM-1218)#4193

Open: ulemons wants to merge 1 commit into feat/stewardship-tables from feat/backfill-stewardship-script

ulemons (Contributor) commented Jun 10, 2026

Summary

Adds stewardship tables and a backfill script to seed the initial state required by the OSSPREY Self Serve program (v1). In v1 the stewardship program is read-only: every critical package gets one unassigned row in the new stewardships table. Write flows (claim, assign, status transitions) land in v2. The backfill script is the one-time (and safely re-runnable) job that populates those rows for the ~358K currently-critical packages.

Changes

  • Migration V1781094067__stewardship-tables.sql — creates stewardships and five satellite tables (stewardship_stewards, stewardship_activity, stewardship_assessments, stewardship_findings, stewardship_remediation_actions). Only stewardships is populated in v1; the rest are schema-only.

  • services/libs/data-access-layer/src/osspckgs/stewardships.ts — two DAL query functions: listCriticalPackagesWithoutStewardship (cursor-paginated LEFT JOIN anti-join) and insertUnassignedStewardships (batch INSERT with ON CONFLICT DO NOTHING and is_critical re-check at insert time to guard against concurrent criticality flips).

  • packages_worker/src/stewardship/runStewardshipBackfill.ts — idempotent loop over DAL functions; cursor-based pagination by package.id; supports graceful shutdown via isStopping callback designed for future Temporal activity wiring.

  • packages_worker/src/bin/stewardship-backfill.ts — entry point; validates STEWARDSHIP_BACKFILL_BATCH_SIZE env var (fails fast on NaN/non-positive); SIGINT/SIGTERM handled gracefully.

  • package.json — adds backfill:stewardship and backfill:stewardship:local npm scripts (mirrors backfill:maven:local pattern).

  • backend/src/api/public/v1/packages/types.ts — extracts StewardshipStatus, Lifecycle, SeverityLevel, OpenVulns, Steward, StewardshipSummary into a shared types file; removes inline duplicates from batchGetStewardship.ts.

  • mockData.ts / openapi.yaml — adds stewardship block to MockPackageDetail; fixes steward → stewards field rename; adds openVulns to OpenAPI required fields; adds PackageDetail.stewardship schema.
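
The semantics described for the two DAL functions can be modelled in memory. The sketch below is not the code in stewardships.ts: the class, the numeric id type, the method signatures, and the one-row-per-package uniqueness rule are all assumptions made for illustration. It only shows how an anti-join listing, an idempotent insert, and an insert-time criticality re-check fit together.

```typescript
// Hypothetical in-memory stand-in for the packages and stewardships tables.
interface PackageRow {
  id: number // assumption: numeric ids; the real column type is not stated in the PR
  isCritical: boolean
}

class InMemoryStewardshipStore {
  readonly packages: PackageRow[]
  // Keyed by package id, which models a unique constraint on the package (assumed).
  readonly stewardships = new Map<number, 'unassigned'>()

  constructor(packages: PackageRow[]) {
    this.packages = packages
  }

  // Models the cursor-paginated anti-join: critical packages with
  // id > afterId that have no stewardship row yet, ordered by id.
  listCriticalPackagesWithoutStewardship(afterId: number, limit: number): number[] {
    return this.packages
      .filter((p) => p.isCritical && p.id > afterId && !this.stewardships.has(p.id))
      .map((p) => p.id)
      .sort((a, b) => a - b)
      .slice(0, limit)
  }

  // Models INSERT ... ON CONFLICT DO NOTHING with an is_critical re-check.
  // Returns the number of rows actually inserted.
  insertUnassignedStewardships(ids: number[]): number {
    let inserted = 0
    for (const id of ids) {
      const pkg = this.packages.find((p) => p.id === id)
      if (!pkg || !pkg.isCritical) continue // criticality flipped since listing
      if (this.stewardships.has(id)) continue // conflict: row already exists
      this.stewardships.set(id, 'unassigned')
      inserted++
    }
    return inserted
  }
}
```

In this model, calling the insert twice with the same ids inserts nothing the second time, which is the property that makes the backfill safe to re-run.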

    Type of change

    • Bug fix
    • New feature
    • Refactor / cleanup
    • Performance improvement
    • Chore / dependency update
    • Documentation

JIRA ticket

ticket


Note

Medium Risk
Bulk inserts into production stewardships for ~358K packages; idempotent SQL limits duplicate risk, but operational mistakes or wrong DB/env could still affect package stewardship data.

Overview
Adds a one-time, re-runnable backfill that seeds stewardships rows (unassigned, auto_imported) for every critical package missing stewardship data, supporting OSSPREY Self Serve v1 read-only state.

The data-access layer gains cursor-paginated listCriticalPackagesWithoutStewardship and batch insertUnassignedStewardships with ON CONFLICT DO NOTHING and an is_critical re-check at insert time. packages_worker wires runStewardshipBackfill (batch loop + optional isStopping for graceful shutdown) and a stewardship-backfill bin script with STEWARDSHIP_BACKFILL_BATCH_SIZE validation and SIGINT/SIGTERM handling. backfill:stewardship and backfill:stewardship:local npm scripts mirror the existing maven backfill pattern.
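
A minimal sketch of what such a batch loop might look like, under stated assumptions. The `StewardshipDal` interface, the `runBackfillSketch` and `makeFakeDal` names, the numeric ids, and the way `skipped` is derived are hypothetical; only the counter names and the `lastId` cursor update come from the diff excerpt quoted in the review.

```typescript
// Hypothetical DAL surface; the real functions in @crowd/data-access-layer may differ.
interface StewardshipDal {
  listCriticalPackagesWithoutStewardship(afterId: number, limit: number): Promise<number[]>
  insertUnassignedStewardships(ids: number[]): Promise<number>
}

interface BackfillResult {
  inserted: number
  skipped: number
  batches: number
  stopped: boolean
}

async function runBackfillSketch(
  dal: StewardshipDal,
  batchSize: number,
  isStopping: () => boolean = () => false,
): Promise<BackfillResult> {
  let inserted = 0
  let skipped = 0
  let batches = 0
  let lastId = 0 // assumption: ids are positive numbers

  while (true) {
    // Check between batches so a shutdown never interrupts an insert.
    if (isStopping()) return { inserted, skipped, batches, stopped: true }

    const ids = await dal.listCriticalPackagesWithoutStewardship(lastId, batchSize)
    if (ids.length === 0) return { inserted, skipped, batches, stopped: false }

    const batchInserted = await dal.insertUnassignedStewardships(ids)
    // Assumption: anything listed but not inserted counts as skipped.
    const batchSkipped = ids.length - batchInserted

    inserted += batchInserted
    skipped += batchSkipped
    batches++
    lastId = ids[ids.length - 1]
  }
}

// In-memory fake used only to exercise the loop.
function makeFakeDal(criticalIds: number[]): StewardshipDal {
  const seeded = new Set<number>()
  return {
    async listCriticalPackagesWithoutStewardship(afterId, limit) {
      return criticalIds
        .filter((id) => id > afterId && !seeded.has(id))
        .sort((a, b) => a - b)
        .slice(0, limit)
    },
    async insertUnassignedStewardships(ids) {
      let count = 0
      for (const id of ids) {
        if (!seeded.has(id)) {
          seeded.add(id)
          count++
        }
      }
      return count
    },
  }
}
```

With the fake, `runBackfillSketch(makeFakeDal([1, 2, 3, 4, 5]), 2)` processes three batches, and a second run against the same fake inserts nothing. Passing `isStopping` as a plain callback keeps the loop independent of how the stop signal is produced, whether a process signal handler or, later, a workflow engine.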

Reviewed by Cursor Bugbot for commit 399b9ba. Bugbot is set up for automated code reviews on this repo.

Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
@ulemons ulemons self-assigned this Jun 10, 2026
Copilot AI review requested due to automatic review settings June 10, 2026 16:13
@ulemons ulemons changed the title feat: add backfill script feat: add backfill script (CM-1218) Jun 10, 2026

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


inserted += batchInserted
skipped += batchSkipped
batches++
lastId = ids[ids.length - 1]


Cursor skips newly critical packages

Medium Severity

The backfill advances lastId and only lists packages with p.id > afterId. Critical packages whose is_critical flips to true after that id was passed are never selected in that run, yet the loop still exits when no higher ids remain. Those packages stay without a stewardships row until the job is run again from the start.
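
The scenario can be reproduced with a small synchronous simulation. Everything here is a hypothetical model rather than the PR's code: `runPass` stands in for one run of the backfill, and the `onBatch` hook exists only to mutate the table between batches the way a concurrent criticality update would.

```typescript
interface Pkg {
  id: number
  isCritical: boolean
}

// One forward pass of a cursor-based backfill over an in-memory table.
// Returns the number of rows seeded during this pass.
function runPass(
  pkgs: Pkg[],
  seeded: Set<number>,
  batchSize: number,
  onBatch?: (lastId: number) => void,
): number {
  let lastId = 0
  let inserted = 0
  for (;;) {
    const ids = pkgs
      .filter((p) => p.isCritical && p.id > lastId && !seeded.has(p.id))
      .map((p) => p.id)
      .sort((a, b) => a - b)
      .slice(0, batchSize)
    if (ids.length === 0) return inserted
    for (const id of ids) {
      seeded.add(id)
      inserted++
    }
    lastId = ids[ids.length - 1]
    if (onBatch) onBatch(lastId)
  }
}

// Package 2 is not critical when the run starts.
const latecomer: Pkg = { id: 2, isCritical: false }
const pkgs: Pkg[] = [
  { id: 1, isCritical: true },
  latecomer,
  { id: 3, isCritical: true },
  { id: 4, isCritical: true },
]
const seeded = new Set<number>()

// Package 2 becomes critical only after the cursor has moved past id 3.
const firstPass = runPass(pkgs, seeded, 2, (lastId) => {
  if (lastId >= 3) latecomer.isCritical = true
})
const missedAfterFirstPass = !seeded.has(2) // true: the cursor never looks back

// A fresh pass starts from the beginning and picks the package up.
const secondPass = runPass(pkgs, seeded, 2)
```

Here `firstPass` is 3 and `secondPass` is 1. Because the insert is idempotent, re-running from the start is cheap in terms of correctness; whether a single run should also end with a sweep from the beginning is a design decision the PR discussion leaves open.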


Copilot AI left a comment


Pull request overview

Adds the initial “stewardship” backfill capability for OSSPREY Self Serve v1 by introducing DAL queries to find critical packages missing stewardship rows, plus a packages-worker script/runner to insert those rows in idempotent batches.

Changes:

  • Add @crowd/data-access-layer stewardship DAL functions to (a) page through critical packages lacking stewardship rows and (b) batch-insert unassigned/auto_imported stewardships with ON CONFLICT DO NOTHING.
  • Add a packages-worker backfill runner with cursor pagination + batch-level logging and a CLI entrypoint with SIGINT/SIGTERM graceful shutdown.
  • Add pnpm scripts to run the stewardship backfill (including a :local variant mirroring existing patterns).

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| services/libs/data-access-layer/src/osspckgs/stewardships.ts | Adds cursor-paginated selector for missing stewardships + idempotent batch insert with criticality re-check. |
| services/libs/data-access-layer/src/osspckgs/index.ts | Re-exports stewardship DAL module from the osspckgs index. |
| services/libs/data-access-layer/src/index.ts | Re-exports stewardship DAL module from the package root. |
| services/apps/packages_worker/src/stewardship/runStewardshipBackfill.ts | Implements the batched backfill loop using the new DAL functions, with graceful-stop support. |
| services/apps/packages_worker/src/bin/stewardship-backfill.ts | Adds CLI entrypoint: connects to packages-db, validates batch size env var, handles shutdown signals. |
| services/apps/packages_worker/package.json | Adds backfill:stewardship and backfill:stewardship:local scripts. |
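
The fail-fast batch size validation mentioned for the CLI entrypoint could look roughly like the sketch below. The `parseBatchSize` helper, the default of 1000, the fallback when the variable is unset, and the integer check are assumptions; the PR only states that NaN and non-positive values are rejected.

```typescript
// Hypothetical default; the PR does not state one.
const DEFAULT_BATCH_SIZE = 1000

// Parses the raw value of STEWARDSHIP_BACKFILL_BATCH_SIZE.
// Taking the raw string as an argument keeps the helper easy to test.
function parseBatchSize(raw: string | undefined, fallback: number = DEFAULT_BATCH_SIZE): number {
  // Assumption: an unset or empty variable falls back to the default.
  if (raw === undefined || raw.trim() === '') return fallback

  const value = Number(raw)
  // Rejects NaN, zero, negatives, and (as an extra assumption) fractions.
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`STEWARDSHIP_BACKFILL_BATCH_SIZE must be a positive integer, got "${raw}"`)
  }
  return value
}
```

An entrypoint would call something like `parseBatchSize(process.env.STEWARDSHIP_BACKFILL_BATCH_SIZE)` before opening the database connection, so a bad value aborts the run before any rows are touched.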
