
feat: add project onboarding script (LFXV2-1373)#82

Open
bramwelt wants to merge 1 commit into main from bramwelt/LFXV2-1373-import-projects-script

bramwelt commented Apr 7, 2026

Summary

  • Adds scripts/onboard_project.py to automate steps 2–5 of the new project onboarding workflow (resolve project tree, check/replay KV entries, trigger reindex, query DynamoDB)
  • Adds docs/onboard_new_projects_script.md documenting prerequisites, quick-start usage, and per-phase behaviour
  • Script uses inline uv dependencies (boto3, httpx, nats-py) — no manual install required
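
Inline uv dependencies are typically declared with a PEP 723 script metadata block at the top of the file. The sketch below assumes only the three packages named in the summary; the `main` stub is a hypothetical placeholder, since the real script body is not shown on this page.

```python
# /// script
# dependencies = [
#     "boto3",
#     "httpx",
#     "nats-py",
# ]
# ///
"""Hypothetical skeleton showing where inline metadata sits in a script."""
from typing import List


def main(argv: List[str]) -> int:
    # Placeholder entry point: the real onboarding phases would run here.
    print(f"would onboard: {argv[1:]}")
    return 0
```

With a header like this, `uv run scripts/onboard_project.py <slug>` resolves the listed packages into an ephemeral environment before executing the file.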

Jira: LFXV2-1373

🤖 Generated with Claude Code

Adds scripts/onboard_project.py and accompanying
docs/onboard_new_projects_script.md to automate
steps 2-5 of the new project onboarding workflow.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-1373
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>
@bramwelt bramwelt requested review from a team and emsearcy as code owners April 7, 2026 19:05
Copilot AI review requested due to automatic review settings April 7, 2026 19:05

Copilot AI left a comment

Pull request overview

Adds a standalone onboarding automation script plus documentation to consolidate the “new project onboarding” operational steps into a repeatable workflow.

Changes:

  • Introduces scripts/onboard_project.py to resolve a project tree, replay NATS KV entries, verify mappings, reindex committees, and reindex DynamoDB-backed resources.
  • Adds docs/onboard_new_projects_script.md with prerequisites, usage, options, and a recommended operator workflow.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| scripts/onboard_project.py | New CLI script that orchestrates Phases 1–5 (API lookup, NATS KV replay/verification, committee/member reindex, DynamoDB query + KV replay). |
| docs/onboard_new_projects_script.md | New operator documentation for running the onboarding script via uv, including options and workflow guidance. |


Comment on lines +479 to +492
if cfg["parent_key_index"] is None:
    # Parent key field is the primary key — use batch_get_item
    keys_list = list(parent_keys)
    for i in range(0, len(keys_list), 100):
        batch = [{cfg["primary_key"]: k} for k in keys_list[i:i + 100]]
        resp = self.dynamodb.batch_get_item(
            RequestItems={cfg["name"]: {"Keys": batch}}
        )
        items.extend(resp.get("Responses", {}).get(cfg["name"], []))
        while resp.get("UnprocessedKeys"):
            resp = self.dynamodb.batch_get_item(
                RequestItems=resp["UnprocessedKeys"]
            )
            items.extend(resp.get("Responses", {}).get(cfg["name"], []))

Copilot AI Apr 7, 2026

self.dynamodb is a DynamoDB resource (boto3.resource('dynamodb')), but this code calls self.dynamodb.batch_get_item(...), which is a low-level client operation and will raise at runtime. Use boto3.client('dynamodb') (or self.dynamodb.meta.client) for batch_get_item, and ensure keys are marshalled to the expected wire format when using the client API.
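
Whichever boto3 handle ends up being used, the batching logic under review is the same: send at most 100 keys per call and resubmit anything returned in `UnprocessedKeys`. The sketch below isolates that loop behind a callable so it can be exercised without AWS; `batch_get_all` and `ThrottledFake` are hypothetical names, not code from the PR. boto3's documentation also lists `batch_get_item` as an action on the DynamoDB service resource, so it is worth confirming against the installed version whether the reviewed call actually raises.

```python
from typing import Callable, Dict, List

MAX_KEYS_PER_CALL = 100  # BatchGetItem accepts at most 100 keys per request


def batch_get_all(
    batch_get_item: Callable[..., dict], table: str, keys: List[dict]
) -> List[dict]:
    """Fetch every key, chunking requests and resubmitting UnprocessedKeys."""
    items: List[dict] = []
    for start in range(0, len(keys), MAX_KEYS_PER_CALL):
        chunk = keys[start:start + MAX_KEYS_PER_CALL]
        request: Dict[str, dict] = {table: {"Keys": chunk}}
        while request:
            resp = batch_get_item(RequestItems=request)
            items.extend(resp.get("Responses", {}).get(table, []))
            # An empty or missing UnprocessedKeys map ends the retry loop.
            request = resp.get("UnprocessedKeys") or {}
    return items


class ThrottledFake:
    """Hypothetical stand-in that serves at most 60 keys per call."""

    def __init__(self) -> None:
        self.calls = 0

    def batch_get_item(self, RequestItems: Dict[str, dict]) -> dict:
        self.calls += 1
        table, spec = next(iter(RequestItems.items()))
        served, leftover = spec["Keys"][:60], spec["Keys"][60:]
        resp: dict = {"Responses": {table: [dict(k) for k in served]}}
        if leftover:
            resp["UnprocessedKeys"] = {table: {"Keys": leftover}}
        return resp
```

In real use the callable could be `boto3.client("dynamodb").batch_get_item`, in which case keys must be in the typed wire format (for example `{"id": {"S": "abc"}}`). A production loop would probably also want a backoff delay between retries of unprocessed keys.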

Copilot uses AI. Check for mistakes.
parser.add_argument(
    "--fix-mappings",
    action="store_true",
    help="Attempt to create missing v1-mappings entries (limited support)",

Copilot AI Apr 7, 2026

The --fix-mappings flag help text says it will “Attempt to create missing v1-mappings entries”, but the implementation only prints a note and does not create mappings. Update the CLI help text to reflect actual behavior (e.g., “print guidance when mappings are missing”).

Suggested change
-     help="Attempt to create missing v1-mappings entries (limited support)",
+     help="Print guidance when v1-mappings entries are missing",

5. Reindex DynamoDB resources (meetings, polls, etc.)

Usage:
python scripts/onboard_project.py <slug> [dev|staging|prod] [options]

Copilot AI Apr 7, 2026

The module docstring usage suggests a positional environment argument (<slug> [dev|staging|prod]), but the CLI actually uses --env and only accepts a single positional slug. Update the usage text to avoid misleading users.

Suggested change
- python scripts/onboard_project.py <slug> [dev|staging|prod] [options]
+ python scripts/onboard_project.py <slug> [--env {dev,staging,prod}] [options]
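
For reference, a parser matching the corrected usage line and the reworded `--fix-mappings` help might look like the sketch below. It assumes `argparse`; the `build_parser` name and the `dev` default for `--env` are illustrative assumptions rather than details confirmed by the PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one positional slug and an --env option."""
    parser = argparse.ArgumentParser(
        prog="onboard_project.py",
        usage="%(prog)s <slug> [--env {dev,staging,prod}] [options]",
    )
    parser.add_argument("slug", help="Project slug to onboard")
    parser.add_argument(
        "--env",
        choices=["dev", "staging", "prod"],
        default="dev",  # assumed default; the PR does not state one
        help="Target environment",
    )
    parser.add_argument(
        "--fix-mappings",
        action="store_true",
        help="Print guidance when v1-mappings entries are missing",
    )
    return parser
```

With this shape, `onboard_project.py my-project staging` is rejected as an unrecognized argument, which is why the docstring's positional form would mislead operators.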


Consolidates the manual onboarding steps 2-5 into a single command:
2. Replay project KV entries in v1-objects to trigger reprocessing
3. Verify (and optionally create) v1-mappings entries

Copilot AI Apr 7, 2026

The docstring claims Phase 3 can “optionally create” v1-mappings entries, but the current implementation explicitly does not create mappings (it only reports guidance). Align the docstring wording with the actual behavior.

Suggested change
- 3. Verify (and optionally create) v1-mappings entries
+ 3. Verify v1-mappings entries and report guidance for missing mappings

Comment on lines +115 to +132
    def _fetch_all_pages(self, client: httpx.Client, url: str, params: dict) -> List[dict]:
        """Fetch all pages from a paginated endpoint."""
        items: List[dict] = []
        offset = 0
        while True:
            p = {**params, "pageSize": PAGE_SIZE, "offset": offset}
            resp = client.get(url, params=p, headers=self.headers)
            resp.raise_for_status()
            data = resp.json()
            page = data.get("Data") or data.get("data") or []
            if not page:
                break
            items.extend(page)
            if len(page) < PAGE_SIZE:
                break
            offset += PAGE_SIZE
        return items


Copilot AI Apr 7, 2026

LFXAPIClient._fetch_all_pages is defined but never used in this script. Consider removing it to reduce maintenance surface area, or refactor existing pagination loops to use it so there’s a single pagination implementation.

Suggested change
-     def _fetch_all_pages(self, client: httpx.Client, url: str, params: dict) -> List[dict]:
-         """Fetch all pages from a paginated endpoint."""
-         items: List[dict] = []
-         offset = 0
-         while True:
-             p = {**params, "pageSize": PAGE_SIZE, "offset": offset}
-             resp = client.get(url, params=p, headers=self.headers)
-             resp.raise_for_status()
-             data = resp.json()
-             page = data.get("Data") or data.get("data") or []
-             if not page:
-                 break
-             items.extend(page)
-             if len(page) < PAGE_SIZE:
-                 break
-             offset += PAGE_SIZE
-         return items
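
If the alternative route is taken and the existing loops are folded into one pagination implementation, a transport-agnostic generator is one possible shape. This is a sketch only: `iter_items` and the `fetch_page` callable are hypothetical, and the `PAGE_SIZE` value is assumed because the constant's definition is not shown in the diff.

```python
from typing import Callable, Iterator, List

PAGE_SIZE = 100  # assumed value; the script's actual constant is not shown


def iter_items(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield items from an offset-paginated source until a short page appears."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A page shorter than page_size (including an empty one) is the last.
        if len(page) < page_size:
            return
        offset += page_size
```

Each call site would then supply only a small `fetch_page(offset, size)` function that performs the HTTP request and extracts the `Data` list, leaving the stop conditions in one place.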

@@ -0,0 +1,161 @@
# Onboarding Script — `scripts/onboard_project.py`

Automates steps 2–5 of [onboarding a new project](./onboarding-new-project.md) into a

Copilot AI Apr 7, 2026

This doc links to ./onboarding-new-project.md, but that file doesn’t exist in the repository (so the link is broken). Update the link target to the correct location/file name, or add the referenced manual guide file.

Suggested change
- Automates steps 2–5 of [onboarding a new project](./onboarding-new-project.md) into a
+ Automates steps 2–5 of onboarding a new project into a
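
Broken relative links like this one can be caught before review with a small check. The sketch below is a hypothetical helper, not part of the PR: it scans one Markdown file for inline links and reports relative targets that do not exist on disk, using a deliberately simple regular expression that ignores reference-style links.

```python
import re
from pathlib import Path
from typing import List

# Matches inline links of the form [text](target); reference links are ignored.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(doc: Path) -> List[str]:
    """Return relative link targets in `doc` that do not exist on disk."""
    broken: List[str] = []
    for target in LINK_RE.findall(doc.read_text(encoding="utf-8")):
        # Skip absolute URLs (http:, mailto:, ...) and in-page anchors.
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue
        path = target.split("#", 1)[0]  # drop any trailing fragment
        if not (doc.parent / path).exists():
            broken.append(target)
    return broken
```

Run against `docs/onboard_new_projects_script.md`, a check of this kind would be expected to flag `./onboarding-new-project.md` if that file is indeed absent from the repository.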

Comment on lines +148 to +151
# Only re-run committee and DynamoDB reindex (skip project replay and mapping check)
uv run scripts/onboard_project.py <slug> --reindex --skip-committees

# Only check mappings

Copilot AI Apr 7, 2026

The “Skipping phases” example is internally inconsistent: the comment says it will “skip project replay and mapping check” and “re-run committee and DynamoDB reindex”, but the command shown only uses --skip-committees (which skips committees) and the script has no flags to skip phases 2 or 3. Please correct the explanation and/or the example command to match what the script can actually do.

Suggested change
- # Only re-run committee and DynamoDB reindex (skip project replay and mapping check)
- uv run scripts/onboard_project.py <slug> --reindex --skip-committees
- # Only check mappings
+ # Re-run project replay, mapping check, and DynamoDB reindex (skip committees)
+ uv run scripts/onboard_project.py <slug> --reindex --skip-committees
+ # Re-run project replay and mapping check only (skip committees and DynamoDB)

emsearcy commented Apr 8, 2026

Side note: in the future, I would expect "fan-out reindexing" not to be a v1-sync-helper activity. v1-sync-helper's reindexing capability should eventually be limited to handling iterations in the v2 API contracts of projects and committees (for example, if we need to backfill new fields in v2 that correspond to v1 fields); the "touch each KV entry in the replica to re-sync" approach makes sense for this. I do recognize the value of a centralized "fan-out" reindex as we iteratively remove project filters, but I propose we remove it afterwards (project filters were a hack to begin with, and the architecture was never really intended for them). Wrapper services should implement their own backfill/reindexing routines in code rather than in scripts. Let's discuss more in the coming weeks.
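
The "touch each KV entry" approach mentioned above can be sketched as a read followed by a write of the same value, which produces a new revision for watchers to pick up. The helper below is an illustration, not the script's implementation: `touch_entries` and `InMemoryKV` are hypothetical, and the stand-in's per-key revision counter is a simplification of how a real bucket numbers revisions.

```python
import asyncio
from types import SimpleNamespace
from typing import Dict, Iterable, List, Tuple


async def touch_entries(kv, keys: Iterable[str]) -> List[str]:
    """Rewrite each entry with its current value to trigger reprocessing."""
    touched: List[str] = []
    for key in keys:
        entry = await kv.get(key)
        await kv.put(key, entry.value)  # same bytes, new revision
        touched.append(key)
    return touched


class InMemoryKV:
    """Hypothetical stand-in with the same async get/put shape as a KV bucket."""

    def __init__(self, data: Dict[str, bytes]) -> None:
        self._data: Dict[str, Tuple[bytes, int]] = {
            key: (value, 1) for key, value in data.items()
        }

    async def get(self, key: str) -> SimpleNamespace:
        value, revision = self._data[key]
        return SimpleNamespace(key=key, value=value, revision=revision)

    async def put(self, key: str, value: bytes) -> int:
        _, revision = self._data.get(key, (b"", 0))
        self._data[key] = (value, revision + 1)
        return revision + 1
```

With nats-py, the `kv` argument would presumably come from `await js.key_value("v1-objects")` on a JetStream context, whose `get` and `put` coroutines have this shape; handling of missing keys is left out of the sketch.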

3 participants