@@ -69,120 +69,92 @@ the SeqSet version. The link is by `(seqset_id, seqset_version)`, so **no minted
6969> realm role. Recommend logging in as the existing dev superuser (`superuser`/`superuser`, created
7070> when `createTestAccounts: true`) for that one call, while the submit/seqset steps use the seed user.
7171
72- ## Component shape
72+ ## Component shape (as implemented)
7373
74- A Kubernetes **Job** (not a long-running Deployment) gated on a new dev-only value. Built from the
75- `integration-tests/` image (Playwright + node_modules already present) with a non-test entrypoint.
74+ The seeder is a **Playwright project** (`seed`) running a single setup file, packaged in a new
75+ `integration-tests` image and run in-cluster by a Helm-hook **Job**. No bespoke Node entrypoint and
76+ no DB access: it drives the existing page objects and calls the new citation endpoint over HTTP.
7677
7778```
7879integration-tests/
79- seed/
80- SPEC.md <- this file
81- seed.ts <- standalone entrypoint (launches chromium, composes page objects, then pg insert)
82- Dockerfile <- (new or extended) builds an image usable as both test-runner and seeder
80+ seed/SPEC.md <- this file
81+ tests/seed.setup.ts <- the seed setup (reuses page objects; the whole flow)
82+ playwright.config.ts <- adds the RUN_SEED-gated `seed` project
83+ package.json <- `npm run seed` => RUN_SEED=true playwright test --project=seed
84+ Dockerfile <- mcr.microsoft.com/playwright image; CMD ["npm","run","seed"]
85+ kubernetes/loculus/templates/seed-test-data-job.yaml <- Helm-hook Job, gated on seedTestData.enabled
8386```
8487
85- `seed.ts` outline (all calls are existing page-object methods unless noted):
86-
87- ```ts
88- const browser = await chromium.launch({ headless: true });
89- const page = await browser.newPage({ baseURL: process.env.PLAYWRIGHT_TEST_BASE_URL });
90-
91- // idempotency: bail if the seed user already exists (login succeeds)
92- if (await new AuthPage(page).login(SEED_USER, SEED_PW)) { log('already seeded'); process.exit(0); }
93-
94- await new AuthPage(page).createAccount(seedAccount);
95- const groupId = await new GroupPage(page).createGroup(buildTestGroup('seed-group'));
96-
97- const accessions: string[] = [];
98- for (const s of SEED_SEQUENCES) { // ~3 sequences
99-   const review = await submissionPage.completeSubmission(
100-     { ...s, groupId: String(groupId) }, s.sequenceData); // dummy-organism form
101-   await review.waitForAllProcessed(); // dummy pipeline runs in-cluster
102-   await review.releaseAndGoToReleasedSequences();
103-   accessions.push(await readAccession(page)); // small helper (parse released table/URL)
104- }
105-
106- const { seqSetId, seqSetVersion } = // createSeqSet returns id+version (parse from URL)
107-   await new SeqSetPage(page).createSeqSet({
108-     name: 'Seed SeqSet', description: 'Auto-seeded for dev',
109-     focalAccessions: [accessions[0]], backgroundAccessions: accessions.slice(1),
110-   });
111-
112- // citation: call the superuser-only endpoint with a super-user token
113- const superUserToken = await getToken('superuser', 'superuser'); // keycloak password grant
114- await page.request.post(`${BACKEND_URL}/create-curated-citation`, {
115-   headers: { authorization: `Bearer ${superUserToken}` },
116-   data: {
117-     seqSetId, seqSetVersion,
118-     source: {
119-       sourceDOI: '10.0000/seed-citation-1', title: 'Seed reference publication',
120-       year: 2024, contributors: [{ givenName: 'Ada', surname: 'Lovelace' }],
121-     },
122-   },
123- });
124- await browser.close();
125- ```
126-
127- Two small additions to the page-object layer are needed (both trivial, reusable by future tests):
128- - `SeqSetPage.createSeqSet` should return `{ seqSetId, seqSetVersion }` (parse from the post-create URL).
129- - a `readAccession(page)` helper to pull the accession of a just-released sequence.
130-
131- ## Kubernetes wiring
132-
133- New template `kubernetes/loculus/templates/seed-test-data-job.yaml`:
134-
135- - `kind: Job`, gated: `{{- if .Values.seedTestData.enabled }}` (whole file).
136- - Image: `ghcr.io/loculus-project/integration-tests:{{ $dockerTag }}` (new image built in CI from
137-   `integration-tests/Dockerfile`), `command: ["node", "seed/seed.js"]`.
138- - Env:
139-   - `PLAYWRIGHT_TEST_BASE_URL: http://loculus-website-service:3000` (verified service name,
140-     `templates/website-service.yaml`).
141-   - `DB_URL` / `DB_USERNAME` / `DB_PASSWORD` from the `database` secret (same refs as backend).
142- - **Ordering / readiness:** website + backend + dummy-preprocessing must be up before it runs.
143-   Two viable mechanisms (pick one):
144-   1. **ArgoCD PostSync hook** (mirror `templates/ingest.yaml:127` `loculus-ingest-trigger`):
145-      `argocd.argoproj.io/hook: PostSync`, `backoffLimit`, `ttlSecondsAfterFinished: 600`.
146-      Cleanest fit with how this repo already bootstraps post-deploy work.
147-   2. Plain Job + an init-container that curls `…/website` and `…/backend` health until ready.
148-   > **Recommendation:** PostSync hook (option 1), consistent with `ingest-trigger`.
149- - `backoffLimit: 1`, `ttlSecondsAfterFinished: 600`, `restartPolicy: Never`.
88+ `tests/seed.setup.ts` runs everything **as the dev super user** (`superuser`/`superuser`), which can
89+ submit, create SeqSets, and add curated citations, so a single login covers the whole flow:
90+
91+ 1. `AuthPage.login('superuser', …)`; `GroupPage.getOrCreateGroup(seedGroup)`.
92+ 2. Idempotency: `SeqSetPage.gotoList()`; if a `Seed SeqSet` cell exists, `setup.skip()`.
93+ 3. `BulkSubmissionPage` → `uploadMetadataFile` (submissionId/date/country/pangoLineage) +
94+    `uploadSequencesFile` → `submitAndWaitForProcessingDone` → `releaseAndGoToReleasedSequences`.
95+ 4. Collect released `LOC_…` accessions from the group's released page (poll-with-reload).
96+ 5. `SeqSetPage.createSeqSet({focal, background})`; read `seqSetId`/`version` from the
97+    `/seqsets/<id>.<version>` URL.
98+ 6. Read the `access_token` cookie from the logged-in context and `POST /create-curated-citation`
99+    (super-user token) via a backend `APIRequestContext`.
100+
101+ No page-object changes were required: `createSeqSet`'s result is recovered from the detail URL, and
102+ accessions are read with the same `LOC_` regex the seqset test uses.
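
The two string-parsing operations in steps 4 and 5 can be sketched as below. This is a minimal illustration, not the code in `seed.setup.ts`: the helper names `extractAccessions` and `parseSeqSetUrl` are hypothetical, and both regexes are assumptions about what released accessions and the `/seqsets/<id>.<version>` URL look like.

```typescript
// Hypothetical helpers; names and patterns are assumptions, not the real seed.setup.ts code.
interface SeqSetRef {
    seqSetId: string;
    seqSetVersion: number;
}

// Collect unique LOC_ accessions (with an optional ".N" version suffix) from page text.
function extractAccessions(pageText: string): string[] {
    const matches = pageText.match(/LOC_[A-Z0-9]+(?:\.\d+)?/g) ?? [];
    return Array.from(new Set(matches));
}

// Parse "<id>.<version>" from a URL whose path ends in /seqsets/<id>.<version>.
// Any query string or fragment after the path is ignored.
function parseSeqSetUrl(url: string): SeqSetRef {
    const match = /\/seqsets\/([^/?#]+)\.(\d+)\/?(?:[?#].*)?$/.exec(url);
    if (match === null) {
        throw new Error(`Not a SeqSet detail URL: ${url}`);
    }
    return { seqSetId: match[1], seqSetVersion: Number(match[2]) };
}
```

In the setup file these would be fed `await page.content()` (or a locator's text) and `page.url()` respectively. Throwing on a non-matching URL makes a changed route fail the seed run loudly instead of posting a citation against a wrong id.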
103+
104+ ### Gating it so normal runs never seed
105+
106+ The `seed` project is only added to `playwright.config.ts` when `RUN_SEED=true`. `seed.setup.ts` ends
107+ in `.setup.ts`, so no other project's `testMatch` picks it up. Default `npm test` therefore never runs it.
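
A minimal sketch of this gating, kept free of Playwright imports so it runs on its own. The `buildProjects` helper, the `ProjectConfig` stand-in type, and the `chromium` project with its `*.spec.ts` pattern are assumptions for illustration; only the `seed` project name and the `RUN_SEED=true` condition come from the spec.

```typescript
// Stand-in for the two Playwright project fields used here (name, testMatch).
interface ProjectConfig {
    name: string;
    testMatch?: RegExp;
}

// Build the `projects` array; the `seed` project is appended only when
// RUN_SEED is exactly the string "true".
function buildProjects(env: Record<string, string | undefined>): ProjectConfig[] {
    const projects: ProjectConfig[] = [
        // Hypothetical regular project: matches *.spec.ts only, so seed.setup.ts is ignored.
        { name: 'chromium', testMatch: /.*\.spec\.ts$/ },
    ];
    if (env.RUN_SEED === 'true') {
        projects.push({ name: 'seed', testMatch: /seed\.setup\.ts$/ });
    }
    return projects;
}
```

In a real config the result would be passed as `projects: buildProjects(process.env)` inside `defineConfig`. Comparing against the literal string `'true'` means an unset or mistyped variable leaves seeding off.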
108+
109+ ## Kubernetes wiring (as implemented)
110+
111+ `kubernetes/loculus/templates/seed-test-data-job.yaml`:
112+
113+ - `kind: Job`, whole file gated on `{{- if .Values.seedTestData.enabled }}`.
114+ - **Helm hooks** `post-install,post-upgrade` with `hook-delete-policy: before-hook-creation`, so it
115+   re-runs on each deploy and is recreated cleanly (avoids the immutable-Job problem on `helm upgrade`).
116+   Plain Helm (deploy.py/k3d) runs it as a post-deploy hook; Argo CD honours Helm hooks too, so this
117+   one mechanism covers both, no separate Argo annotations needed.
118+ - **Readiness:** an init container (`curlimages/curl`) loops until the website and backend respond,
119+   so the seeder doesn't start before services are up.
120+ - Image `{{ .Values.images.integrationTests.repository }}:{{ tag|default dockerTag }}`, built in CI
121+   from `integration-tests/Dockerfile`; `command: ["npm","run","seed"]`.
122+ - Env: `PLAYWRIGHT_TEST_BASE_URL=http://loculus-website-service:3000`,
123+   `PLAYWRIGHT_TEST_BACKEND_URL=http://loculus-backend-service:8079`,
124+   `SEED_SUPER_USER` / `SEED_SUPER_USER_PASSWORD` from `seedTestData.superUser`. No DB secret.
125+ - `activeDeadlineSeconds: 900`, `backoffLimit: 1`, `ttlSecondsAfterFinished: 86400`, `restartPolicy: Never`.
150126
151127### Values
152128
153- `kubernetes/loculus/values.yaml` (default OFF, production-safe):
154- ```yaml
155- seedTestData:
156-   enabled: false
157-   user: { username: seed_user, password: seed_user }
158-   organism: dummy-organism
159-   sequenceCount: 3
160- ```
161- `kubernetes/loculus/values_e2e_and_dev.yaml` (turn ON for dev/E2E):
162- ```yaml
163- seedTestData:
164-   enabled: true
165- ```
166- Add the `seedTestData` object to `values.schema.json`, then:
167- `npx prettier@3.6.2 --write kubernetes/loculus/values.schema.json` and
168- `helm lint kubernetes/loculus -f kubernetes/loculus/values.yaml` (per `kubernetes/AGENTS.md`).
129+ - `values.yaml`: adds `images.integrationTests` and a default-OFF `seedTestData` block
130+   (`enabled: false`, `superUser: {username: superuser, password: superuser}`).
131+ - `values_e2e_and_dev.yaml`: `seedTestData.enabled: true`.
132+ - `values.schema.json`: registers `images.integrationTests` (required: `images` has
133+   `additionalProperties: false`) and the `seedTestData` object.
134+
135+ Validated with `helm lint` (prod + dev values), `helm template` (Job renders only when enabled),
136+ `prettier` on the schema, and `tsc`/`prettier`/`eslint` on the new TS.
169137
170138## Idempotency & safety
171139
172- - Re-running on an already-seeded cluster is a no-op (seed user login check up front).
173- - `enabled: false` by default → never runs in production. The CURATED-citation SQL and the
174-   `database` secret mount only exist on dev because the whole template is gated.
175- - Uses the dummy organism only, so no real pathogen data or real DOIs/CrossRef calls.
140+ - Re-running on an already-seeded cluster is a no-op (the `Seed SeqSet` existence check calls `setup.skip()`).
141+ - `enabled: false` by default → the whole template is gated, so it never renders in production.
142+ - Uses the dummy organism only: no real pathogen data, no real DOIs/CrossRef calls.
176143
177144## Decisions
178145
179- 1. **Citation mechanism, DECIDED: new superuser-only `POST /create-curated-citation` endpoint**
180-    (implemented in this branch). Seed job calls it with a super-user token; no DB secret needed.
181- 2. **Submission driver, DECIDED: Playwright UI**, reusing the integration-test page objects.
182-
183- ## Open questions for reviewer
184-
185- 1. **Trigger:** ArgoCD PostSync hook (recommended) vs. readiness-gated plain Job.
186- 2. **Image:** extend the existing `integration-tests` image with a `seed/` entrypoint
187-    (recommended) vs. a separate slimmer image.
188- ```
146+ 1. **Citation mechanism: new superuser-only `POST /create-curated-citation` endpoint** (implemented
147+    in this branch). Seeder calls it with the super-user token; no DB secret.
148+ 2. **Submission driver: Playwright UI**, reusing the integration-test page objects.
149+ 3. **Run identity: the dev super user** for all steps (one login; can submit + seqset + cite).
150+ 4. **Trigger: Helm hooks** (post-install/upgrade), which work under both plain Helm and Argo CD.
151+ 5. **Image: extend the integration-tests image** with a `seed` Playwright project (RUN_SEED-gated).
152+
153+ ## Validating on a live cluster (not yet done)
154+
155+ The flow is type-checked and the chart renders, but it has not been run end-to-end against a cluster.
156+ Two assumptions to confirm there (both have a clear fallback):
157+ - The dummy-organism **bulk** submission accepts `submissionId/date/country/pangoLineage`. If a field
158+   is rejected, adjust `METADATA_HEADERS`/`SUBMISSIONS` in `seed.setup.ts`.
159+ - The website stores the Keycloak access token in an **`access_token` cookie** usable as a backend
160+   Bearer token. If not, swap the citation step to a Keycloak password-grant (needs a keycloak URL env).
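
The cookie assumption can be isolated in one small function so the fallback is easy to wire in. A sketch under stated assumptions: `bearerHeaderFromCookies` is a hypothetical helper, and `CookieLike` covers only the `name` and `value` fields of the cookie objects that Playwright's `BrowserContext.cookies()` returns.

```typescript
// Only the cookie fields this sketch needs.
interface CookieLike {
    name: string;
    value: string;
}

// Return an Authorization header built from the named cookie, or undefined when the
// cookie is missing or empty (the caller should then fall back to a password grant).
function bearerHeaderFromCookies(
    cookies: CookieLike[],
    cookieName: string = 'access_token',
): { authorization: string } | undefined {
    const cookie = cookies.find((c) => c.name === cookieName && c.value !== '');
    return cookie === undefined ? undefined : { authorization: `Bearer ${cookie.value}` };
}
```

The seed step would call it as `bearerHeaderFromCookies(await page.context().cookies())`. Returning `undefined` rather than throwing leaves the decision to fall back to the Keycloak password grant with the caller.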