
docs(site): add UI testing framework topic #2589

Open
yuyutaotao wants to merge 37 commits into main from claude/upbeat-noether-u2JRq

@yuyutaotao

Collaborator

No description provided.

ottomao and others added 27 commits May 20, 2026 15:39
- Mark this as a new v2 framework, distinct from the existing YAML player
- Reframe verify/agent: verify is the deterministic gate; agent is
  exploratory and non-deterministic, advisory only (no pass/fail effect)
- Present Pi as a swappable agent layer aligned with the community direction
- Define the Pi context contract: past steps + their outputs + current UI,
  nothing more; clarify naming, skill-result lifetime, and output vs
  context.state channels
- Rewrite the Rstest section to emphasize lifecycle/fixtures/concurrency
  over raw runtime speed
Phase-0 design draft for the v2 AI-native testing framework: node model
(ui/verify/soft/agent + runtime nodes), midscene.config.ts with a single
uiAgent union (declarative or factory), defineRuntime, $name skills reusing
Pi's built-in Skills, the verify verdict contract (report_verdict customTool,
fail-closed), the natural-language output model, and the Pi context contract.
Records all settled decisions; only open item is Pi custom base-URL parity.
…0001)

Add the new package @midscene/testing-framework implementing the Phase 0
contracts from rfcs/0001-v2-testing-framework-phase0.md:

- defineMidsceneConfig / defineRuntime authoring helpers
- v2 YAML case schema parser (name + flow; single-key step nodes)
- node engine: ui / verify / soft / agent / custom runtime nodes
- verify verdict contract with fail-closed semantics; soft = warning-only
- context assembly (past intents + outputs + verdicts + current screenshot)
- output store and the output-as-only-forward-channel contract
- single `uiAgent` field (config object | factory) for the run target
- default Pi-backed agent runtime; swappable via agentRuntime
- lightweight runner + `midscene-tf` CLI (Rstest wiring out of scope)
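The fail-closed verdict contract and the warning-only `soft` node listed above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the `Verdict` and `NodeOutcome` shapes, and the `parseVerdict` and `resolveOutcome` names, are hypothetical, and treating a missing verdict on a `soft` node as a warning is an assumption.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
type JudgmentKind = "verify" | "soft";

interface Verdict {
  pass: boolean;
  reason: string;
}

interface NodeOutcome {
  status: "passed" | "failed" | "warning";
  reason: string;
}

// Accept only a well-formed verdict; anything else is treated as "no verdict".
function parseVerdict(reported: unknown): Verdict | null {
  if (typeof reported !== "object" || reported === null) return null;
  const candidate = reported as Record<string, unknown>;
  if (typeof candidate["pass"] !== "boolean") return null;
  const reason =
    typeof candidate["reason"] === "string" ? candidate["reason"] : "";
  return { pass: candidate["pass"], reason };
}

function resolveOutcome(kind: JudgmentKind, reported: unknown): NodeOutcome {
  // Fail closed: a missing or malformed verdict counts as a failed judgment.
  const verdict = parseVerdict(reported) ?? {
    pass: false,
    reason: "no valid verdict was reported",
  };
  if (verdict.pass) return { status: "passed", reason: verdict.reason };
  // A failed soft node only warns; a failed verify node fails the case.
  return {
    status: kind === "soft" ? "warning" : "failed",
    reason: verdict.reason,
  };
}
```

The point of the design is that only an explicit, well-formed pass can produce a passing status; every other path degrades to a failure or a warning.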

Resolve decision C': point Pi at the same OpenAI-compatible endpoint as the
UI Agent via ModelRegistry.registerProvider({ baseUrl, apiKey, models }), so
verify/agent and ui share MIDSCENE_MODEL_BASE_URL.

Add a copy-out demo under example/, unit tests, and smoke scripts (Pi
wiring + real-browser engine validated; model-backed smoke documented for
networked environments). Sync the en/zh design docs to the decided uiAgent
field and flattened runtime context.
…e cases

Run the real runner end-to-end (discovery, YAML parsing, node engine, output
store, summary writing) against the real example/e2e cases, mocking only the
browser (fake UI Agent) and the model (mock agent runtime). No network or
Chrome required, so it runs in the standard `nx test` / `test:coverage` CI job.

Narrow the package tsconfig include to tests/unit-test so type-check:tests does
not try to type-check the standalone .mjs smoke scripts.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 85bd63efbe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +75 to +77
const { config } = await loadConfig(args.config ?? args.root);
const summary = await runAll(config, {
projectRoot: args.root,


P2: Resolve `--config` paths relative to the config directory

When `midscene-tf run --config path/to/midscene.config.ts` is invoked from outside that config's directory without `--root`, the CLI loads the chosen config but still calls `runAll` with `projectRoot: undefined`, so `runAll` resolves `testDir`, explicit files, summaries, and skills against the caller's current working directory. For example, running the shipped example as `midscene-tf run --config example/midscene.config.ts` from the repo root looks under `./e2e` instead of `example/e2e`. Use `dirname(configPath)` as the default root when `--root` is omitted.
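One way to read this suggestion is the sketch below. It is not the CLI's actual code: `CliArgs` and `resolveProjectRoot` are hypothetical names, the working directory is passed in explicitly to keep the function pure, and it assumes `--config` names a file rather than a directory.

```typescript
import * as path from "node:path";

// Hypothetical subset of the CLI arguments named in the review.
interface CliArgs {
  config?: string;
  root?: string;
}

// Pick the directory that testDir, summaries, and skills resolve against.
function resolveProjectRoot(args: CliArgs, cwd: string): string {
  // An explicit --root always wins.
  if (args.root !== undefined) return path.resolve(cwd, args.root);
  // Otherwise anchor to the directory containing the chosen config file.
  // Assumption: --config points at a file, not a directory.
  if (args.config !== undefined) {
    return path.dirname(path.resolve(cwd, args.config));
  }
  // Neither flag given: fall back to the caller's working directory.
  return cwd;
}
```

With this ordering, `--config example/midscene.config.ts` run from the repo root would resolve cases under `example/`, while an explicit `--root` keeps its current meaning.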

Comment on lines +53 to +54
/** Directory for Midscene HTML reports. */
reportDir?: string;


P2: Honor `output.reportDir` before exposing it

Configs that set `output.reportDir` (including the new example/docs) see no effect, because this field is declared but never read by the runner or the UI-agent factory. With `uiAgentOptions.generateReport: true`, Midscene reports still go to the default `midscene_run/report` under `process.cwd()` via the existing report helpers, not to the configured directory. Either wire this option into report generation (for example by setting the run/report output before creating agents) or avoid documenting a non-functional setting.
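The first option could start from a small resolution helper along these lines. This is only a sketch: `OutputConfig` and `resolveReportDir` are hypothetical, resolving relative paths against a project root (rather than `process.cwd()`) is an assumption, and how the resolved directory would be handed to Midscene's report helpers is left open.

```typescript
import * as path from "node:path";

// Hypothetical subset of the config's `output` section.
interface OutputConfig {
  reportDir?: string;
}

// Default location named in the review comment.
const DEFAULT_REPORT_DIR = path.join("midscene_run", "report");

// Resolve the directory reports should be written to.
// Assumption: relative paths are anchored at the project root.
function resolveReportDir(
  output: OutputConfig | undefined,
  projectRoot: string,
): string {
  return path.resolve(projectRoot, output?.reportDir ?? DEFAULT_REPORT_DIR);
}
```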

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 4, 2026

Deploying midscene with Cloudflare Pages

Latest commit: 910625a
Status: ✅  Deploy successful!
Preview URL: https://4140114b.midscene.pages.dev
Branch Preview URL: https://claude-upbeat-noether-u2jrq.midscene.pages.dev


ottomao added 5 commits June 4, 2026 15:43
The `agent-runtime` directory sat at the intersection of two naming
collisions: "agent" (shared with the UI Agent in `ui-agent/`) and
"runtime" (shared with custom YAML nodes, `defineRuntime`). Reserve
"runtime" for custom nodes and rename the swappable general-purpose
agent layer to read as the counterpart of the UI Agent.

- src/agent-runtime/ -> src/general-agent/ (pi-runtime.ts -> pi-general-agent.ts)
- AgentRuntimeAdapter -> GeneralAgentAdapter
- AgentRunInput/AgentRunResult -> GeneralAgentInput/GeneralAgentResult
- PiAgentRuntime -> PiGeneralAgent, PiRuntimeOptions -> PiGeneralAgentOptions
- config field agentRuntime -> generalAgent

Also drop the unused `index` param threaded through runNode /
runJudgmentNode / runAgentNode in the engine.

Docs (package README, RFC 0001) updated to match.
A config without `uiAgent` is recoverable for flows that only use custom
runtime nodes, so `defineMidsceneConfig` now logs a warning (via
getDebug console channel) rather than throwing during config load.

The UI Agent factory gains a clear guard so a case that actually needs
the UI Agent still fails with an actionable message instead of a cryptic
"cannot read 'type' of undefined" crash.
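The warn-at-load, fail-at-use split described in this commit might look roughly like the sketch below. The names `defineConfigSketch` and `requireUIAgent` are hypothetical stand-ins, the config shape is reduced to the one field that matters here, and the warning goes through an injectable callback instead of the `getDebug` console channel the commit mentions.

```typescript
// Hypothetical, reduced config shape for illustration.
interface UIAgentConfigSketch {
  type: string;
}

interface MidsceneConfigSketch {
  uiAgent?: UIAgentConfigSketch;
}

type Warn = (message: string) => void;

// Config load: a missing uiAgent is recoverable, so warn instead of throwing.
function defineConfigSketch(
  config: MidsceneConfigSketch,
  warn: Warn = (message) => console.warn(message),
): MidsceneConfigSketch {
  if (config.uiAgent === undefined) {
    warn("No `uiAgent` configured; cases that need the UI Agent will fail.");
  }
  return config;
}

// Factory guard: fail with an actionable message at the point of use.
function requireUIAgent(config: MidsceneConfigSketch): UIAgentConfigSketch {
  if (config.uiAgent === undefined) {
    throw new Error(
      "This case needs the UI Agent, but the config has no `uiAgent` field.",
    );
  }
  return config.uiAgent;
}
```

Flows that only use custom runtime nodes never call the guard, so they keep working with a config that omits `uiAgent`.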
A runtime node's own YAML value (`input`) was crammed into
`RuntimeNodeContext` alongside the ambient execution context (uiAgent,
outputs, state, result, env). Pull it out as a dedicated first
positional argument so the handler signature reads
`(input, context) => ...` — "what this node was invoked with" vs "what's
around it".

- RuntimeNode: `(ctx)` -> `(input, context)`
- RuntimeNodeContext: drop the `input` field
- update engine call site, unit/smoke tests, example config, RFC §3
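As a rough illustration of the new handler shape, the sketch below separates the node's own YAML value from the ambient context. The context here is a hypothetical subset (the commit also lists `uiAgent` and `result`), and `setFlag` is an invented example node, not one shipped with the package.

```typescript
// Hypothetical subset of the ambient execution context.
interface RuntimeNodeContext {
  outputs: Record<string, string>;
  state: Record<string, unknown>;
  env: Record<string, string | undefined>;
}

// `input` is what this node was invoked with; `context` is what's around it.
type RuntimeNode<Input = unknown> = (
  input: Input,
  context: RuntimeNodeContext,
) => unknown;

// Invented example node: record a boolean flag in shared state.
const setFlag: RuntimeNode<{ key: string; value: boolean }> = (
  input,
  context,
) => {
  context.state[input.key] = input.value;
  return `set ${input.key}=${input.value}`;
};
```

Keeping `input` positional means a handler's type parameter describes only its YAML value, while the context type stays the same for every node.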
Add `WebConnectionOpt` / `AndroidConnectionOpt` / `IOSConnectionOpt` /
`ComputerConnectionOpt` (and `HarmonyConnectionOpt`): the pure "how to
reach the target" shapes, derived from the `MidsceneYamlScript*Env`
types with the YAML run config and agent-behavior options stripped out.
They stay in sync with the env types automatically (derived via Omit) and
give consumers a connection-only contract without the YAML/agent-opt
baggage the env type names imply.

Also export `MidsceneYamlScriptComputerEnv`, which was missing from the
public surface while its web/android/ios siblings were already exported.
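The `Omit`-based derivation described in this commit can be pictured with the sketch below. The types are hypothetical stand-ins with illustrative field names, not the real `MidsceneYamlScript*Env` definitions in `@midscene/core`.

```typescript
// Hypothetical stand-in for a YAML env type; field names are illustrative.
interface YamlWebEnvSketch {
  // "How to reach the target" fields.
  url: string;
  userAgent?: string;
  // YAML run config (stripped from the connection type).
  output?: string;
  // Agent-behavior option (stripped from the connection type).
  aiActionContext?: string;
}

// Connection-only contract, derived so it tracks the env type automatically.
type WebConnectionOptSketch = Omit<
  YamlWebEnvSketch,
  "output" | "aiActionContext"
>;

// Runtime counterpart: drop the non-connection fields from an env object.
function toConnection(env: YamlWebEnvSketch): WebConnectionOptSketch {
  const { output, aiActionContext, ...connection } = env;
  return connection;
}
```

Because the connection type is computed from the env type, a field added to the env shows up in the connection type unless it is explicitly listed in the `Omit`.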
… types

`UIAgentConfig.options` was a hand-rolled `Record<string, unknown>`,
disconnected from the real agent launcher inputs and forcing
`as unknown as Parameters<...>` casts in the factory.

Make `UIAgentConfig` a discriminated union keyed by `type`, with each
variant's `options` typed against the canonical connection types from
`@midscene/core` (`WebConnectionOpt` / `AndroidConnectionOpt` / ...).
`UIAgentType` is derived from the union; web's `url` is required.

Also drop the now-unnecessary `agent as unknown as Agent` cast in the web
factory: `PuppeteerAgent` is `@midscene/core`'s `Agent<PuppeteerWebPage>`,
so once options are typed it is directly assignable to `Agent` — the cast
was only needed because the prior `Record` casts poisoned the return type.
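A discriminated union of this kind could be sketched as below. The option interfaces are hypothetical placeholders for the canonical connection types, and only two variants are shown.

```typescript
// Hypothetical placeholders for the canonical connection option types.
interface WebOptionsSketch {
  url: string; // required for web, as the commit notes
  viewportWidth?: number;
}

interface AndroidOptionsSketch {
  deviceId?: string;
}

// Each variant's `options` is typed by its `type` discriminant.
type UIAgentConfigSketch =
  | { type: "web"; options: WebOptionsSketch }
  | { type: "android"; options?: AndroidOptionsSketch };

// The set of target types is derived from the union, not maintained by hand.
type UIAgentTypeSketch = UIAgentConfigSketch["type"];

const supportedTypes: UIAgentTypeSketch[] = ["web", "android"];

// Narrowing on `type` gives typed access to `options` without any casts.
function describeTarget(config: UIAgentConfigSketch): string {
  switch (config.type) {
    case "web":
      return `web:${config.options.url}`;
    case "android":
      return `android:${config.options?.deviceId ?? "default device"}`;
  }
}
```

Compared with a `Record<string, unknown>` options bag, the compiler rejects a web config without `url` and no `as unknown as ...` casts are needed inside the switch.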

Updates config test and RFC §2/§2.1.

Validation: nx build core, testing-framework build + tsc + 33 unit tests,
pnpm lint.
@ottomao ottomao force-pushed the claude/upbeat-noether-u2JRq branch from 1186085 to ec331b3 Compare June 5, 2026 07:15
ottomao added 2 commits June 5, 2026 00:30
…m them

Previous commit derived the `*ConnectionOpt` types from the
`MidsceneYamlScript*Env` types via `Omit<...>`, which framed the YAML env
as the source of truth and the connection options as a byproduct.

Invert that: the `*ConnectionOpt` types are now first-class interfaces in
a dedicated `connection-options.ts` module (web fields + JSDoc moved
there; native ones extend `Omit<*DeviceOpt, 'customActions'>`). The
`MidsceneYamlScript*Env` types are composed FROM them via `extends`
(env = connection + YAML run config + agent behavior for web).

Layering now reads: AndroidDeviceOpt (driver) ⊂ AndroidConnectionOpt
(connection) ⊂ MidsceneYamlScriptAndroidEnv (YAML flavor). Env shapes are
structurally identical — no behavior change, no impact on CLI /
ScriptPlayer / web-integration.
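The inverted layering can be illustrated with the sketch below, where the connection type is declared first and the YAML env type extends it. All three interfaces are hypothetical stand-ins with illustrative fields.

```typescript
// Driver-level options (hypothetical stand-in).
interface AndroidDeviceOptSketch {
  deviceId?: string;
  customActions?: unknown[];
}

// Connection: a first-class interface built on the driver options
// minus `customActions`.
interface AndroidConnectionOptSketch
  extends Omit<AndroidDeviceOptSketch, "customActions"> {}

// YAML flavor: composed FROM the connection type, plus YAML-only fields.
interface YamlAndroidEnvSketch extends AndroidConnectionOptSketch {
  launch?: string;
}

// A consumer that only needs the connection accepts an env as well.
function describeConnection(opt: AndroidConnectionOptSketch): string {
  return opt.deviceId ?? "first available device";
}

const env: YamlAndroidEnvSketch = {
  deviceId: "emulator-5554",
  launch: "com.example.app",
};
const label = describeConnection(env);
```

The env shape is structurally the same as before; only the direction of the dependency changes, so the connection type no longer depends on the YAML types.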

Validation: nx build core, nx test core (only pre-existing auto-glm prompt
snapshot failures, unrelated), testing-framework tsc + 33 unit tests,
pnpm lint.
Release prep for @midscene/testing-framework. The package is internal for
now (not exposed to users), so guard against accidental publishing and
tidy the layout:

- Move the top-level `example/` demo into `packages/testing-framework/example/`
  (it is a standalone copy-out demo, not a workspace member; `packages/*`
  globs are non-recursive so it stays standalone). Fix the relative link in
  its README and the package README pointer.
- Mark the package `"private": true` and drop `publishConfig`. The release
  workflow publishes via `pnpm -r publish`, which skips private packages —
  verified the package is no longer picked up for publish.
- Delete the Phase 0 design RFC now that the work has landed.
…-u2JRq

# Conflicts:
#	packages/core/src/yaml.ts