Make agent test-running guidance consistent and precise (#2329)
* Make agent test guidance consistent and scoped
The root Testing section said "use IDE testing tools over the cli" (agents
cannot drive an IDE, so in practice this read as "do not run tests") while
backend/FwLite/AGENTS.md demanded the full FwLiteOnly.slnf suite before
every commit. Replace both with one policy: run filtered CLI tests for
what you changed, verify tests you wrote actually pass, save targeted
integration runs for finished critical sync work, never run LexBox
integration tests or local Playwright.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Disambiguate which test suites agents may run locally
"backend/Testing" is not all integration tests -- only the
Integration/FlakyIntegration/RequiresDb categories and Testing.Browser
need infrastructure; its unit tests are runnable via task test:unit.
Likewise "do not run Playwright" only ever applied to suites needing the
local lexbox stack (frontend/tests, Testing.Browser); the viewer
standalone suite auto-starts a vite dev server against the demo project
and is cheap to run filtered.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* Document known CI flakes in the CI/CD agent guide
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
.github/AGENTS.md (13 additions & 0 deletions)
@@ -50,6 +50,19 @@ The CI/CD setup is:

 ---

+## Known Flaky CI Failures (re-run before debugging)
+
+The `GHA integration tests / dotnet` check (`integration-test-gha.yaml`) fails in two known ways that are NOT regressions. Re-run first (`gh run rerun <runId> --failed`) — especially on frontend-only or dependency-only PRs, which can't affect the lexbox-api / hg / fw-headless containers it exercises:
+
+1. **cert-manager readiness timeout** — `setup-k8s` waits `--timeout=90s` for cert-manager pods; on a cold kind cluster they don't always make it → deploy aborts fast (~3 min) and the status step logs "No resources found in languagedepot namespace". Environmental — tends to hit all branches in the same window.
+2. **MediaFileTests large-upload stream error** — `Testing.FwHeadless.MediaFileTests.UploadReplacementFile_TooLarge_ThrowsError` intermittently throws `HttpRequestException: Error while copying content to a stream` (transient connection drop streaming the large file) instead of the expected validation error. Shows as Failed: 1 / Passed: ~146 after the full ~14 min run.
+
+Also expected, not a failure: on frontend-only PRs the backend image-publish workflows (`lexbox-fw-headless`, `lexbox-hgweb`) don't trigger (path filters), so `setup-k8s` gets `manifest unknown` pulling those images at the PR version and falls back to the `develop` tag via `continue-on-error`. Those log lines are noise.
+
+Separately: a PR whose merge state is CONFLICTING silently *skips* the build/test checks rather than failing them — if expected checks are missing, reconcile with develop first.
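The two failure signatures quoted in the `.github/AGENTS.md` hunk above could be checked mechanically before deciding whether to re-run. The following is a minimal shell sketch and is not part of the commit: the function name `classify_ci_failure` and the labels it prints are hypothetical, and it only matches the two log strings the guide itself quotes.

```shell
# Hypothetical helper (not from the repository): label a CI log excerpt
# as one of the two known flakes documented in .github/AGENTS.md.
# Prints a label; returns non-zero when the excerpt matches neither flake.
classify_ci_failure() {
  case "$1" in
    *"No resources found in languagedepot namespace"*)
      echo "cert-manager-timeout" ;;
    *"Error while copying content to a stream"*)
      echo "mediafile-upload-stream" ;;
    *)
      echo "unknown"
      return 1 ;;
  esac
}
```

One plausible use is `classify_ci_failure "$(gh run view <runId> --log-failed)"`, followed by `gh run rerun <runId> --failed` when a known label comes back. An `unknown` result means the failure should be debugged as a possible regression.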
AGENTS.md (5 additions & 4 deletions)
@@ -52,9 +52,11 @@ Key documentation for this project:

 ### Testing

-- ❌ **Do NOT run LexBox dotnet INTEGRATION tests** unless the user explicitly asks. They require full test infrastructure (database, services) which usually isn't available.
-- ✅ **FwLite integration tests CAN be run** — e.g. `FwLiteProjectSync.Tests` They're just a bit slow, but run them freely when making critical changes to relevant code.
-- ✅ **DO run unit tests locally** and filter to the tests that are relevant to the changes you are making. Use IDE testing tools over the cli.
+- ✅ **DO run unit tests via the CLI**, filtered to the tests relevant to your changes (e.g. `dotnet test backend/FwLite/FwLiteShared.Tests --filter "FullyQualifiedName~MyTestClass"`). Verify tests you wrote or changed actually pass before handing work back. Never run whole suites just to "see if anything broke".
+- ✅ **FwLite integration tests** (e.g. `FwLiteProjectSync.Tests`) need no infrastructure but are slow. Run a **targeted selection** (specific tests, not necessarily whole classes) when you touched critical sync code **and believe the work is finished** — not on every iteration. Waiting on tests burns time; be deliberate about which runs buy real signal.
+- ✅ **`backend/Testing` contains unit tests too** — only tests marked `Category=Integration|FlakyIntegration|RequiresDb` (and the `Testing.Browser` namespace) need infrastructure. Its unit tests are fine to run: `task test:unit -- <filter>` excludes those categories for you.
+- ✅ **FwLite viewer Playwright tests MAY be run** — they're cheap: `task playwright-test-standalone -- <test-name-filter>` (from `frontend/viewer/`) auto-starts the vite dev server with the in-browser demo project; no lexbox stack, chromium only. Always filter to relevant tests; details in `frontend/viewer/AGENTS.md`.
+- ❌ **Do NOT run tests that need the lexbox stack** unless the user explicitly asks: LexBox integration tests (`Category=Integration`/`FlakyIntegration`, `Testing.Browser`) and the lexbox frontend Playwright suite (`frontend/tests`). The local stack is usually down or torn down between sessions and results aren't trustworthy — rely on CI for these.

 ### Questions?

@@ -78,7 +80,6 @@ Before implementing any change that will touch many files or is in a 🔴 **Crit
 - ✅ If the user asks about "the" PR, but does not explicitly name a PR or branch, assume they mean the PR associated with the current branch.
 - ✅ Use **Mermaid diagrams** for flowcharts and architecture (not ASCII art)
 - ✅ Prefer IDE diagnostics (compiler/lint errors) over CLI tools for identifying issues. Fixing these diagnostics is part of completing any instruction.
-- ✅ Do NOT run integration tests unless user explicitly requests
 - ✅ When handling a user prompt ALWAYS ask for clarification if there are details to clarify, important decisions that must be made first or the plan sounds unwise
 - ❌ Do NOT git commit or git push without explicit user approval
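The filtered `dotnet test` example in the root `AGENTS.md` Testing bullets above names a single test class. When a change touches several classes, the same filter syntax can join them with `|`, which `dotnet test --filter` treats as a logical OR. A minimal sketch, assuming a hypothetical helper named `build_filter` that is not part of the repository:

```shell
# Hypothetical helper (not from the repository): join test class names
# into one dotnet test --filter expression using "|" (logical OR).
build_filter() {
  bf_expr=""
  for bf_name in "$@"; do
    if [ -n "$bf_expr" ]; then
      bf_expr="$bf_expr|"
    fi
    bf_expr="${bf_expr}FullyQualifiedName~${bf_name}"
  done
  printf '%s\n' "$bf_expr"
}
```

For example, `dotnet test backend/FwLite/FwLiteShared.Tests --filter "$(build_filter MyTestClass OtherTestClass)"` would run only those two classes; `OtherTestClass` is a placeholder name used here for illustration.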
backend/FwLite/AGENTS.md (4 additions & 10 deletions)
@@ -12,7 +12,7 @@ Lightweight FieldWorks application for dictionary editing with CRDT-based sync.
 **Before making changes:**
 1. Read the relevant section below thoroughly
 2. Understand the sync flow end-to-end
-3. Run the full test suite: `dotnet test FwLiteOnly.slnf`
+3. Identify which tests cover the affected area (run a targeted selection when the work is done — see the root `AGENTS.md` Testing section)
 4. Test with real FwData projects, not just unit tests

 ---
@@ -23,7 +23,7 @@ Lightweight FieldWorks application for dictionary editing with CRDT-based sync.
 # Run FwLite Web (typical workflow)
 task fw-lite-web # from repo root

-# Run tests (ALWAYS run before committing)
+# Run all FwLite tests (slow — prefer targeted runs, see root AGENTS.md Testing section)
 dotnet test FwLiteOnly.slnf

 # Build MAUI app (Windows)
@@ -269,15 +269,9 @@ if (entity?.DeletedAt is not null) return;

 ## Testing Strategy

-### Before ANY commit:
+### When the work is finished:

-```bash
-# Run all FwLite tests
-dotnet test FwLiteOnly.slnf
-
-# If touching sync code, also run:
-dotnet test FwLiteProjectSync.Tests
-```
+Run a targeted selection of the tests covering what you changed (root `AGENTS.md` → Testing). For 🔴 critical sync changes that usually includes the relevant `FwLiteProjectSync.Tests` scenarios. `dotnet test FwLiteOnly.slnf` runs everything but is slow — reserve it for when broad signal is genuinely needed.
frontend/AGENTS.md (1 addition & 1 deletion)
@@ -60,7 +60,7 @@ pnpm run -r lint

 - Playwright for E2E tests
 - Test files in `tests/`
-- Run with `pnpm test`
+- Run with `pnpm test` — requires the full local lexbox stack (`task up`). Agents: don't run these locally, rely on CI (root `AGENTS.md` → Testing). The cheap, agent-runnable Playwright suite is the *viewer's* (`viewer/AGENTS.md`), not this one.
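Taken together, the four diffs above sort the test suites into three groups: run filtered, run targeted once the work is finished, and leave to CI. A minimal sketch of that policy as a lookup, assuming hypothetical suite labels and a hypothetical function name (`local_run_policy`), neither of which exists in the repository:

```shell
# Hypothetical lookup (suite labels are illustrative, not repository names):
# prints how the guidance treats each suite for local agent runs.
local_run_policy() {
  case "$1" in
    unit|backend-testing-unit|viewer-playwright)
      echo "run-filtered" ;;
    fwlite-integration)
      echo "run-targeted-when-finished" ;;
    lexbox-integration|testing-browser|frontend-playwright)
      echo "ci-only" ;;
    *)
      echo "unknown"
      return 1 ;;
  esac
}
```

Here `frontend-playwright` stands for the `frontend/tests` suite that needs `task up`, and `viewer-playwright` for the standalone viewer suite. The `ci-only` group may still be run locally when the user explicitly asks.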