feat: add project onboarding script (LFXV2-1373) by bramwelt · Pull Request #82 · linuxfoundation/lfx-v1-sync-helper

bramwelt · 2026-04-07T19:05:33Z

Summary

- Adds scripts/onboard_project.py to automate steps 2–5 of the new project onboarding workflow (resolve project tree, check/replay KV entries, trigger reindex, query DynamoDB)
- Adds docs/onboard_new_projects_script.md documenting prerequisites, quick-start usage, and per-phase behaviour
- Script uses inline uv dependencies (boto3, httpx, nats-py), so no manual install is required

🤖 Generated with Claude Code

Adds scripts/onboard_project.py and accompanying docs/onboard_new_projects_script.md to automate steps 2-5 of the new project onboarding workflow.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Issue: LFXV2-1373
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Trevor Bramwell <tbramwell@linuxfoundation.org>

Copilot

Pull request overview

Adds a standalone onboarding automation script plus documentation to consolidate the “new project onboarding” operational steps into a repeatable workflow.

Changes:

Introduces scripts/onboard_project.py to resolve a project tree, replay NATS KV entries, verify mappings, reindex committees, and reindex DynamoDB-backed resources.
Adds docs/onboard_new_projects_script.md with prerequisites, usage, options, and a recommended operator workflow.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| `scripts/onboard_project.py` | New CLI script that orchestrates Phases 1–5 (API lookup, NATS KV replay/verification, committee/member reindex, DynamoDB query + KV replay). |
| `docs/onboard_new_projects_script.md` | New operator documentation for running the onboarding script via `uv`, including options and workflow guidance. |


Copilot · 2026-04-07T19:09:01Z

+        if cfg["parent_key_index"] is None:
+            # Parent key field is the primary key — use batch_get_item
+            keys_list = list(parent_keys)
+            for i in range(0, len(keys_list), 100):
+                batch = [{cfg["primary_key"]: k} for k in keys_list[i:i + 100]]
+                resp = self.dynamodb.batch_get_item(
+                    RequestItems={cfg["name"]: {"Keys": batch}}
+                )
+                items.extend(resp.get("Responses", {}).get(cfg["name"], []))
+                while resp.get("UnprocessedKeys"):
+                    resp = self.dynamodb.batch_get_item(
+                        RequestItems=resp["UnprocessedKeys"]
+                    )
+                    items.extend(resp.get("Responses", {}).get(cfg["name"], []))


self.dynamodb is a DynamoDB resource (boto3.resource('dynamodb')), but this code calls self.dynamodb.batch_get_item(...), which is a low-level client operation and will raise at runtime. Use boto3.client('dynamodb') (or self.dynamodb.meta.client) for batch_get_item, and ensure keys are marshalled to the expected wire format when using the client API.
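A minimal sketch of the chunk-and-retry pattern discussed in this comment, assuming a hypothetical helper named `batch_get_all` that is not part of the PR. It accepts any object exposing `batch_get_item` with the `RequestItems` / `Responses` / `UnprocessedKeys` shape, so it should work with whichever boto3 object ends up being used, and it leaves key marshalling to the caller. The retry cap and backoff delays are illustrative assumptions, not values from the script.

```python
import time
from typing import Any, Dict, List


def batch_get_all(
    dynamodb: Any,
    table: str,
    keys: List[Dict[str, Any]],
    chunk_size: int = 100,     # BatchGetItem accepts at most 100 keys per call
    max_retries: int = 5,      # assumption: illustrative retry cap
    base_delay: float = 0.05,  # assumption: illustrative backoff base (seconds)
) -> List[Dict[str, Any]]:
    """Fetch every key in chunks, retrying UnprocessedKeys with backoff.

    `dynamodb` is any object with a batch_get_item(RequestItems=...) method.
    `keys` must already be in the format that object expects.
    """
    items: List[Dict[str, Any]] = []
    for start in range(0, len(keys), chunk_size):
        request = {table: {"Keys": keys[start:start + chunk_size]}}
        attempt = 0
        while request:
            resp = dynamodb.batch_get_item(RequestItems=request)
            items.extend(resp.get("Responses", {}).get(table, []))
            # Anything the service could not process comes back here.
            request = resp.get("UnprocessedKeys") or {}
            if request:
                attempt += 1
                if attempt > max_retries:
                    raise RuntimeError(f"unprocessed keys remain for {table}")
                time.sleep(base_delay * (2 ** (attempt - 1)))
    return items
```

Compared with the loop in the diff, the only behavioural differences are the bounded retry count and the pause between retries, which avoids hammering a throttled table.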

Copilot · 2026-04-07T19:09:01Z

+    parser.add_argument(
+        "--fix-mappings",
+        action="store_true",
+        help="Attempt to create missing v1-mappings entries (limited support)",


The --fix-mappings flag help text says it will “Attempt to create missing v1-mappings entries”, but the implementation only prints a note and does not create mappings. Update the CLI help text to reflect actual behavior (e.g., “print guidance when mappings are missing”).

Suggested change

help="Attempt to create missing v1-mappings entries (limited support)",

help="Print guidance when v1-mappings entries are missing",

Copilot · 2026-04-07T19:09:02Z

+  5. Reindex DynamoDB resources (meetings, polls, etc.)
+
+Usage:
+  python scripts/onboard_project.py <slug> [dev|staging|prod] [options]


The module docstring usage suggests a positional environment argument (<slug> [dev|staging|prod]), but the CLI actually uses --env and only accepts a single positional slug. Update the usage text to avoid misleading users.

Suggested change

python scripts/onboard_project.py <slug> [dev|staging|prod] [options]

python scripts/onboard_project.py <slug> [--env {dev,staging,prod}] [options]
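For illustration, a small `argparse` sketch that matches the suggested usage line: one positional `slug` plus an `--env` option restricted to the three environments. The flag names come from this review; the `build_parser` and `parse_args` names, the default environment, and the help strings for `--reindex` and `--skip-committees` are assumptions, since the script's actual parser is not shown here.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one positional slug and an --env option."""
    parser = argparse.ArgumentParser(prog="onboard_project.py")
    parser.add_argument("slug", help="Project slug to onboard")
    parser.add_argument(
        "--env",
        choices=["dev", "staging", "prod"],
        default="dev",  # assumption: the real default is not shown in the review
        help="Target environment",
    )
    parser.add_argument(
        "--fix-mappings",
        action="store_true",
        help="Print guidance when v1-mappings entries are missing",
    )
    parser.add_argument(
        "--reindex",
        action="store_true",
        help="Trigger reindexing (assumed wording)",
    )
    parser.add_argument(
        "--skip-committees",
        action="store_true",
        help="Skip the committee phase (assumed wording)",
    )
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

With this shape, `onboard_project.py my-project prod` is rejected as an unrecognized argument, which is why the docstring usage needs to show `--env` rather than a second positional.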

Copilot · 2026-04-07T19:09:02Z

+
+Consolidates the manual onboarding steps 2-5 into a single command:
+  2. Replay project KV entries in v1-objects to trigger reprocessing
+  3. Verify (and optionally create) v1-mappings entries


The docstring claims Phase 3 can “optionally create” v1-mappings entries, but the current implementation explicitly does not create mappings (it only reports guidance). Align the docstring wording with the actual behavior.

Suggested change

3. Verify (and optionally create) v1-mappings entries

3. Verify v1-mappings entries and report guidance for missing mappings

Copilot · 2026-04-07T19:09:02Z

+    def _fetch_all_pages(self, client: httpx.Client, url: str, params: dict) -> List[dict]:
+        """Fetch all pages from a paginated endpoint."""
+        items: List[dict] = []
+        offset = 0
+        while True:
+            p = {**params, "pageSize": PAGE_SIZE, "offset": offset}
+            resp = client.get(url, params=p, headers=self.headers)
+            resp.raise_for_status()
+            data = resp.json()
+            page = data.get("Data") or data.get("data") or []
+            if not page:
+                break
+            items.extend(page)
+            if len(page) < PAGE_SIZE:
+                break
+            offset += PAGE_SIZE
+        return items
+


LFXAPIClient._fetch_all_pages is defined but never used in this script. Consider removing it to reduce maintenance surface area, or refactor existing pagination loops to use it so there’s a single pagination implementation.

Suggested change

    def _fetch_all_pages(self, client: httpx.Client, url: str, params: dict) -> List[dict]:
        """Fetch all pages from a paginated endpoint."""
        items: List[dict] = []
        offset = 0
        while True:
            p = {**params, "pageSize": PAGE_SIZE, "offset": offset}
            resp = client.get(url, params=p, headers=self.headers)
            resp.raise_for_status()
            data = resp.json()
            page = data.get("Data") or data.get("data") or []
            if not page:
                break
            items.extend(page)
            if len(page) < PAGE_SIZE:
                break
            offset += PAGE_SIZE
        return items
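If the alternative route is taken (keeping one pagination implementation and reusing it), a transport-agnostic helper is one possible shape. This is a sketch under assumptions: `iter_pages` and the `fetch` callback are hypothetical names, and the page size of 100 is a placeholder because the value of `PAGE_SIZE` is not shown in the diff.

```python
from typing import Callable, Iterator, List

PAGE_SIZE = 100  # assumption: the script's real PAGE_SIZE is not shown


def iter_pages(
    fetch: Callable[[int, int], List[dict]],
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield items from an offset-paginated source until a short page arrives.

    `fetch(offset, page_size)` returns one page as a list of dicts. The caller
    wraps the actual HTTP request, so this loop is the only pagination logic.
    """
    offset = 0
    while True:
        page = fetch(offset, page_size)
        yield from page
        # A page shorter than page_size (including an empty one) is the last.
        if len(page) < page_size:
            break
        offset += page_size
```

Each existing pagination loop could then pass a small closure that performs its own request and unwraps the `Data` / `data` field, while sharing the termination rule.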

Copilot · 2026-04-07T19:09:03Z

@@ -0,0 +1,161 @@
+# Onboarding Script — `scripts/onboard_project.py`
+
+Automates steps 2–5 of [onboarding a new project](./onboarding-new-project.md) into a


This doc links to ./onboarding-new-project.md, but that file doesn’t exist in the repository (so the link is broken). Update the link target to the correct location/file name, or add the referenced manual guide file.

Suggested change

Automates steps 2–5 of [onboarding a new project](./onboarding-new-project.md) into a

Automates steps 2–5 of onboarding a new project into a

Copilot · 2026-04-07T19:09:03Z

+# Only re-run committee and DynamoDB reindex (skip project replay and mapping check)
+uv run scripts/onboard_project.py <slug> --reindex --skip-committees
+
+# Only check mappings


The “Skipping phases” example is internally inconsistent: the comment says it will “skip project replay and mapping check” and “re-run committee and DynamoDB reindex”, but the command shown only uses --skip-committees (which skips committees) and the script has no flags to skip phases 2 or 3. Please correct the explanation and/or the example command to match what the script can actually do.

Suggested change

# Only re-run committee and DynamoDB reindex (skip project replay and mapping check)
uv run scripts/onboard_project.py <slug> --reindex --skip-committees
# Only check mappings

# Re-run project replay, mapping check, and DynamoDB reindex (skip committees)
uv run scripts/onboard_project.py <slug> --reindex --skip-committees
# Re-run project replay and mapping check only (skip committees and DynamoDB)

emsearcy · 2026-04-08T17:39:33Z

Side note: I would expect that, in the future, "fan-out reindexing" is not a v1-sync-helper activity. v1-sync-helper's reindexing capability should be limited to handling iterations in the v2 API contracts of projects and committees (if we need to backfill new fields in v2 that correspond to v1 fields); the "touch each KV entry in the replica to re-sync" approach makes sense for this. I do recognize the value of a centralized "fan-out" reindex as we iteratively remove project filters, but I propose we remove it afterwards (project filters were a hack to begin with and the architecture was never really intended for these). Wrapper services should implement their own backfill/reindexing routines in code (rather than scripts). Let's discuss more in coming weeks.
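As a rough illustration of the "touch each KV entry" idea mentioned in this comment, the sketch below rewrites each key with its current value so that watchers of the bucket see a new revision. It assumes a KV handle with async `get(key)` (returning an entry with a `.value` attribute) and `put(key, value)` methods, as nats-py's JetStream key-value API provides; `touch_keys` and `run_touch` are hypothetical names, and handling of missing keys is left out.

```python
import asyncio
from typing import Any, Iterable, List


async def touch_keys(kv: Any, keys: Iterable[str]) -> List[str]:
    """Re-put each entry unchanged so consumers reprocess it.

    `kv` is any object with async get(key) and put(key, value) methods.
    Returns the keys that were rewritten, in order.
    """
    touched: List[str] = []
    for key in keys:
        entry = await kv.get(key)       # read the current value
        await kv.put(key, entry.value)  # write it back as a new revision
        touched.append(key)
    return touched


def run_touch(kv: Any, keys: Iterable[str]) -> List[str]:
    """Synchronous convenience wrapper for scripts without an event loop."""
    return asyncio.run(touch_keys(kv, keys))
```

Because the value is written back unchanged, the operation is idempotent from the data's point of view; only the revision changes.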

bramwelt requested review from a team and emsearcy as code owners April 7, 2026 19:05

Copilot AI review requested due to automatic review settings April 7, 2026 19:05

Copilot started reviewing on behalf of bramwelt April 7, 2026 19:06

bramwelt wants to merge 1 commit into main from bramwelt/LFXV2-1373-import-projects-script


3 participants
