docs(site): add UI testing framework topic #2589
- Mark this as a new v2 framework, distinct from the existing YAML player
- Reframe verify/agent: verify is the deterministic gate; agent is exploratory and non-deterministic, advisory only (no pass/fail effect)
- Present Pi as a swappable agent layer aligned with the community direction
- Define the Pi context contract: past steps + their outputs + current UI, nothing more; clarify naming, skill-result lifetime, and output vs context.state channels
- Rewrite the Rstest section to emphasize lifecycle/fixtures/concurrency over raw runtime speed
Phase-0 design draft for the v2 AI-native testing framework: node model (ui/verify/soft/agent + runtime nodes), midscene.config.ts with a single uiAgent union (declarative or factory), defineRuntime, $name skills reusing Pi's built-in Skills, the verify verdict contract (report_verdict customTool, fail-closed), the natural-language output model, and the Pi context contract. Records all settled decisions; only open item is Pi custom base-URL parity.
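To make the fail-closed verdict contract concrete, here is a minimal sketch. All names (`Verdict`, `parseVerdict`, `resolveOutcome`) are hypothetical and not taken from the package. It assumes the verdict arrives as an untyped payload, and that a `soft` node downgrades a failure to a warning.

```typescript
// Hypothetical shapes for illustration; the real contract may differ.
type Verdict = { pass: boolean; reason: string };
type JudgmentKind = "verify" | "soft";

interface NodeOutcome {
  status: "passed" | "failed" | "warning";
  reason: string;
}

// Accept only a well-formed payload; anything else yields undefined.
function parseVerdict(raw: unknown): Verdict | undefined {
  if (typeof raw !== "object" || raw === null) return undefined;
  const candidate = raw as Record<string, unknown>;
  if (typeof candidate.pass !== "boolean") return undefined;
  const reason = typeof candidate.reason === "string" ? candidate.reason : "";
  return { pass: candidate.pass, reason };
}

// Fail-closed: a missing or malformed verdict counts as a failure,
// never as an implicit pass. For "soft" nodes a failure is only a warning.
function resolveOutcome(kind: JudgmentKind, raw: unknown): NodeOutcome {
  const verdict =
    parseVerdict(raw) ?? { pass: false, reason: "no verdict was reported" };
  if (verdict.pass) return { status: "passed", reason: verdict.reason };
  return {
    status: kind === "soft" ? "warning" : "failed",
    reason: verdict.reason,
  };
}
```

The key design choice is that the absence of a verdict goes through the same path as an explicit failure, so a model that never reports cannot produce a green run.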
…0001)
Add the new package @midscene/testing-framework implementing the Phase 0
contracts from rfcs/0001-v2-testing-framework-phase0.md:
- defineMidsceneConfig / defineRuntime authoring helpers
- v2 YAML case schema parser (name + flow; single-key step nodes)
- node engine: ui / verify / soft / agent / custom runtime nodes
- verify verdict contract with fail-closed semantics; soft = warning-only
- context assembly (past intents + outputs + verdicts + current screenshot)
- output store and the output-as-only-forward-channel contract
- single `uiAgent` field (config object | factory) for the run target
- default Pi-backed agent runtime; swappable via agentRuntime
- lightweight runner + `midscene-tf` CLI (Rstest wiring out of scope)
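As an illustration of the "name + flow; single-key step nodes" rule in the list above, here is a hedged sketch of the validation step. It assumes the YAML text has already been parsed into a plain object by some YAML library. `StepNode`, `ParsedCase`, `parseStep` and `parseCase` are hypothetical names, not the package's actual exports.

```typescript
// Hypothetical result types; the real schema may carry more fields.
interface StepNode {
  kind: string; // e.g. "ui", "verify", "soft", "agent", or a custom runtime node
  value: unknown;
}

interface ParsedCase {
  name: string;
  flow: StepNode[];
}

// Each step must be a mapping with exactly one key: the node kind.
function parseStep(raw: unknown, index: number): StepNode {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error(`flow[${index}] must be a mapping with exactly one key`);
  }
  const entries = Object.entries(raw as Record<string, unknown>);
  const entry = entries[0];
  if (entries.length !== 1 || entry === undefined) {
    throw new Error(
      `flow[${index}] must have exactly one key, found ${entries.length}`,
    );
  }
  return { kind: entry[0], value: entry[1] };
}

// Validate the top-level shape of an already-parsed YAML document.
function parseCase(doc: unknown): ParsedCase {
  if (typeof doc !== "object" || doc === null) {
    throw new Error("case must be a mapping");
  }
  const { name, flow } = doc as { name?: unknown; flow?: unknown };
  if (typeof name !== "string" || name.length === 0) {
    throw new Error("case needs a non-empty name");
  }
  if (!Array.isArray(flow)) {
    throw new Error("case needs a flow array");
  }
  return { name, flow: flow.map((step, i) => parseStep(step, i)) };
}
```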
Resolve decision C': point Pi at the same OpenAI-compatible endpoint as the
UI Agent via ModelRegistry.registerProvider({ baseUrl, apiKey, models }), so
verify/agent and ui share MIDSCENE_MODEL_BASE_URL.
Add a copy-out demo under example/, unit tests, and smoke scripts (Pi
wiring + real-browser engine validated; model-backed smoke documented for
networked environments). Sync the en/zh design docs to the decided uiAgent
field and flattened runtime context.
…e cases

Run the real runner end-to-end (discovery, YAML parsing, node engine, output store, summary writing) against the real example/e2e cases, mocking only the browser (fake UI Agent) and the model (mock agent runtime). No network or Chrome required, so it runs in the standard `nx test` / `test:coverage` CI job.

Narrow the package tsconfig include to tests/unit-test so type-check:tests does not try to type-check the standalone .mjs smoke scripts.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 85bd63efbe
    const { config } = await loadConfig(args.config ?? args.root);
    const summary = await runAll(config, {
      projectRoot: args.root,
Resolve --config paths relative to the config directory
When midscene-tf run --config path/to/midscene.config.ts is invoked from outside that config's directory without --root, the CLI loads the chosen config but still calls runAll with projectRoot: undefined, so runAll resolves testDir, explicit files, summaries, and skills against the caller's current working directory. For example, running the shipped example as midscene-tf run --config example/midscene.config.ts from the repo root looks under ./e2e instead of example/e2e; use dirname(configPath) as the default root when --root is omitted.
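One way to express the suggested default is sketched below, using only Node's `path` module. `CliArgs` and `resolveProjectRoot` are hypothetical names, and the sketch assumes `--config` always points at a file rather than a directory.

```typescript
import { dirname, resolve } from "node:path";

// Hypothetical shape of the parsed CLI arguments.
interface CliArgs {
  config?: string;
  root?: string;
}

// Precedence: explicit --root, then the directory containing --config,
// then the caller's current working directory.
function resolveProjectRoot(
  args: CliArgs,
  cwd: string = process.cwd(),
): string {
  if (args.root) return resolve(cwd, args.root);
  if (args.config) return dirname(resolve(cwd, args.config));
  return cwd;
}
```

With this, `midscene-tf run --config example/midscene.config.ts` invoked from the repo root would resolve `testDir` and friends under `example/` rather than under the repo root.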
    /** Directory for Midscene HTML reports. */
    reportDir?: string;
Honor output.reportDir before exposing it
Configs that set output.reportDir (including the new example/docs) get no effect because this field is declared but never read by the runner or UI-agent factory; with uiAgentOptions.generateReport: true, Midscene reports still go to the default midscene_run/report under process.cwd() via the existing report helpers, not to the configured directory. Either wire this option into report generation (for example by setting the run/report output before creating agents) or avoid documenting a non-functional setting.
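The path-resolution half of this fix could look like the sketch below. It does not show how the resolved directory is handed to Midscene's report helpers, since that wiring is not described here. `OutputConfig` and `resolveReportDir` are hypothetical names, and falling back to `midscene_run/report` under the project root (rather than under `process.cwd()`) is an assumption.

```typescript
import { join, resolve } from "node:path";

// Hypothetical slice of the config's `output` section.
interface OutputConfig {
  reportDir?: string;
}

// Resolve the configured report directory against the project root.
// `resolve` returns an absolute `reportDir` unchanged and anchors a
// relative one at `projectRoot`.
function resolveReportDir(
  output: OutputConfig | undefined,
  projectRoot: string,
): string {
  const configured = output?.reportDir ?? join("midscene_run", "report");
  return resolve(projectRoot, configured);
}
```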
Deploying midscene with
Latest commit: 910625a
Status: ✅ Deploy successful!
Preview URL: https://4140114b.midscene.pages.dev
Branch Preview URL: https://claude-upbeat-noether-u2jrq.midscene.pages.dev
The `agent-runtime` directory sat at the intersection of two naming collisions: "agent" (shared with the UI Agent in `ui-agent/`) and "runtime" (shared with custom YAML nodes, `defineRuntime`). Reserve "runtime" for custom nodes and rename the swappable general-purpose agent layer to read as the counterpart of the UI Agent.

- src/agent-runtime/ -> src/general-agent/ (pi-runtime.ts -> pi-general-agent.ts)
- AgentRuntimeAdapter -> GeneralAgentAdapter
- AgentRunInput/AgentRunResult -> GeneralAgentInput/GeneralAgentResult
- PiAgentRuntime -> PiGeneralAgent, PiRuntimeOptions -> PiGeneralAgentOptions
- config field agentRuntime -> generalAgent

Also drop the unused `index` param threaded through runNode / runJudgmentNode / runAgentNode in the engine.

Docs (package README, RFC 0001) updated to match.
A config without `uiAgent` is recoverable for flows that only use custom runtime nodes, so `defineMidsceneConfig` now logs a warning (via getDebug console channel) rather than throwing during config load. The UI Agent factory gains a clear guard so a case that actually needs the UI Agent still fails with an actionable message instead of a cryptic "cannot read 'type' of undefined" crash.
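The split between "warn at load time" and "fail at use time" can be sketched as follows. The types and function names are hypothetical stand-ins, and the injected `warn` callback stands in for the debug console channel mentioned above.

```typescript
// Hypothetical, simplified config shapes for illustration only.
interface UIAgentConfigSketch {
  type: string;
  options?: Record<string, unknown>;
}

interface MidsceneConfigSketch {
  uiAgent?: UIAgentConfigSketch;
}

// Load time: a missing `uiAgent` is recoverable, so only warn.
function defineConfigSketch(
  config: MidsceneConfigSketch,
  warn: (message: string) => void = console.warn,
): MidsceneConfigSketch {
  if (!config.uiAgent) {
    warn("config has no `uiAgent`; only custom runtime nodes can run");
  }
  return config;
}

// Use time: a case that needs the UI Agent fails with an actionable message
// instead of crashing on an undefined property access.
function requireUIAgentConfig(
  config: MidsceneConfigSketch,
): UIAgentConfigSketch {
  if (!config.uiAgent) {
    throw new Error(
      "This case needs the UI Agent, but the config does not define `uiAgent`.",
    );
  }
  return config.uiAgent;
}
```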
A runtime node's own YAML value (`input`) was crammed into `RuntimeNodeContext` alongside the ambient execution context (uiAgent, outputs, state, result, env). Pull it out as a dedicated first positional argument so the handler signature reads `(input, context) => ...`: "what this node was invoked with" vs "what's around it".

- RuntimeNode: `(ctx)` -> `(input, context)`
- RuntimeNodeContext: drop the `input` field
- update engine call site, unit/smoke tests, example config, RFC §3
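A handler written against the new signature might look like the sketch below. The context here is a reduced, hypothetical stand-in (only `outputs`, `state` and `env`), and the return type is an assumption. The real `RuntimeNodeContext` also carries `uiAgent` and `result`.

```typescript
// Reduced stand-in for the ambient execution context.
interface RuntimeNodeContextSketch {
  outputs: Record<string, string>;
  state: Record<string, unknown>;
  env: Record<string, string | undefined>;
}

// The node's own YAML value is the first positional argument;
// everything ambient lives in the second.
type RuntimeNodeSketch<Input> = (
  input: Input,
  context: RuntimeNodeContextSketch,
) => string | Promise<string>;

// Example handler: `input` is what the YAML step passed to this node.
function greetUser(
  input: { name: string },
  context: RuntimeNodeContextSketch,
): string {
  context.state.lastGreeted = input.name;
  return `hello ${input.name} from ${context.env.STAGE ?? "local"}`;
}

// The handler is assignable to the node type without casts.
const greetNode: RuntimeNodeSketch<{ name: string }> = greetUser;
```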
Add `WebConnectionOpt` / `AndroidConnectionOpt` / `IOSConnectionOpt` / `ComputerConnectionOpt` (and `HarmonyConnectionOpt`): the pure "how to reach the target" shapes, derived from the `MidsceneYamlScript*Env` types with the YAML run config and agent-behavior options stripped out. They stay in sync with the env types automatically (derived via Omit) and give consumers a connection-only contract without the YAML/agent-opt baggage the env type names imply. Also export `MidsceneYamlScriptComputerEnv`, which was missing from the public surface while its web/android/ios siblings were already exported.
… types

`UIAgentConfig.options` was a hand-rolled `Record<string, unknown>`, disconnected from the real agent launcher inputs and forcing `as unknown as Parameters<...>` casts in the factory. Make `UIAgentConfig` a discriminated union keyed by `type`, with each variant's `options` typed against the canonical connection types from `@midscene/core` (`WebConnectionOpt` / `AndroidConnectionOpt` / ...). `UIAgentType` is derived from the union; web's `url` is required.

Also drop the now-unnecessary `agent as unknown as Agent` cast in the web factory: `PuppeteerAgent` is `@midscene/core`'s `Agent<PuppeteerWebPage>`, so once options are typed it is directly assignable to `Agent`. The cast was only needed because the prior `Record` casts poisoned the return type.

Updates config test and RFC §2/§2.1.

Validation: nx build core, testing-framework build + tsc + 33 unit tests, pnpm lint.
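The shape of that union can be sketched with local stand-ins for the connection types. The two interfaces below are deliberately tiny placeholders rather than the real `@midscene/core` definitions, and `describeTarget` is a hypothetical helper that shows how narrowing on `type` removes the need for casts.

```typescript
// Placeholder connection types; the real ones live in @midscene/core.
interface WebConnectionOptSketch {
  url: string; // required for the web variant
  viewportWidth?: number;
}

interface AndroidConnectionOptSketch {
  deviceId?: string;
}

// Discriminated union keyed by `type`.
type UIAgentConfigSketch =
  | { type: "web"; options: WebConnectionOptSketch }
  | { type: "android"; options: AndroidConnectionOptSketch };

// The set of supported types is derived from the union, not restated.
type UIAgentTypeSketch = UIAgentConfigSketch["type"];

// Checking `type` narrows `options` to the matching variant.
function describeTarget(config: UIAgentConfigSketch): string {
  if (config.type === "web") {
    return `web:${config.options.url}`;
  }
  return `android:${config.options.deviceId ?? "default device"}`;
}

const supportedTypes: UIAgentTypeSketch[] = ["web", "android"];
```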
1186085 to ec331b3
…m them

Previous commit derived the `*ConnectionOpt` types from the `MidsceneYamlScript*Env` types via `Omit<...>`, which framed the YAML env as the source of truth and the connection options as a byproduct. Invert that: the `*ConnectionOpt` types are now first-class interfaces in a dedicated `connection-options.ts` module (web fields + JSDoc moved there; native ones extend `Omit<*DeviceOpt, 'customActions'>`). The `MidsceneYamlScript*Env` types are composed FROM them via `extends` (env = connection + YAML run config + agent behavior for web).

Layering now reads: AndroidDeviceOpt (driver) ⊂ AndroidConnectionOpt (connection) ⊂ MidsceneYamlScriptAndroidEnv (YAML flavor).

Env shapes are structurally identical: no behavior change, no impact on CLI / ScriptPlayer / web-integration.

Validation: nx build core, nx test core (only pre-existing auto-glm prompt snapshot failures, unrelated), testing-framework tsc + 33 unit tests, pnpm lint.
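The inverted layering can be illustrated with three small interfaces. The field names (`deviceId`, `customActions`, `launch`, `output`) are illustrative placeholders rather than the actual members of the real types. Only the `extends Omit<..., 'customActions'>` relationship is taken from the description above.

```typescript
// Driver-level options (placeholder fields).
interface AndroidDeviceOptSketch {
  deviceId?: string;
  customActions?: unknown[];
}

// Connection-only contract: first-class, minus driver-only extras.
interface AndroidConnectionOptSketch
  extends Omit<AndroidDeviceOptSketch, "customActions"> {
  launch?: string;
}

// YAML flavor: connection + YAML run config (placeholder `output` field).
interface AndroidEnvSketch extends AndroidConnectionOptSketch {
  output?: string;
}

// Anything that accepts a connection also accepts an env value,
// because the env type is composed from the connection type.
function connectionLabel(opt: AndroidConnectionOptSketch): string {
  return opt.deviceId ?? "default";
}

const yamlEnv: AndroidEnvSketch = {
  deviceId: "emulator-5554",
  output: "./result.json",
};
```

Here `connectionLabel(yamlEnv)` type-checks without a cast, which is the practical payoff of composing the env types from the connection types rather than the reverse.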
Release prep for @midscene/testing-framework. The package is internal for now (not exposed to users), so guard against accidental publishing and tidy the layout:

- Move the top-level `example/` demo into `packages/testing-framework/example/` (it is a standalone copy-out demo, not a workspace member; `packages/*` globs are non-recursive so it stays standalone). Fix the relative link in its README and the package README pointer.
- Mark the package `"private": true` and drop `publishConfig`. The release workflow publishes via `pnpm -r publish`, which skips private packages. Verified that the package is no longer picked up for publish.
- Delete the Phase 0 design RFC now that the work has landed.
…-u2JRq

# Conflicts:
#	packages/core/src/yaml.ts