Commit 55883cb

Extend assertion DSL, flow control, and recording editor
Close common automation-verification gaps so scripts can check state, recover from failures, and ship cleaner recordings without GUI glue:

- Assertions: add clipboard / process / file / http checks, plus spec-driven combinators assert_all (soft AND), assert_any (OR), and assert_eventually (poll any spec until it passes). Eight assertion kinds now share one declarative dispatcher reachable from Python, JSON, and MCP.
- Flow control: add AC_try (try/catch/finally recovery, error_var, reraise; loop break/continue still propagate) and AC_while_var (variable-condition loop with max_iter cap); factor the shared comparator out of AC_if_var.
- Recording editor: add dedupe_moves and merge_sleeps to compact raw recordings non-destructively.

Each feature ships the full slice (headless core, facade re-export, executor command where applicable, MCP tool, tests); the top-level package stays Qt-free.
1 parent 5b3aa74 commit 55883cb
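
The `AC_try` and `AC_while_var` behaviour described in the commit message can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the executor's implementation: `run_try`, `run_while_var`, `LoopBreak`, and the comparator names (`eq`, `lt`, and so on) are hypothetical stand-ins, and Python callables take the place of the JSON `body` / `catch` / `finally` action lists.

```python
import operator
from typing import Any, Callable, Dict


class LoopBreak(Exception):
    """Hypothetical stand-in for the executor's loop `break` signal."""


# Assumed comparator names; the real operator vocabulary may differ.
COMPARATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


def run_try(body, catch=None, finally_=None, variables=None,
            error_var=None, reraise=False):
    """Run body; on failure record the error, run catch, always run finally_."""
    variables = {} if variables is None else variables
    try:
        body()
    except LoopBreak:
        raise  # loop control is not a failure, so it propagates untouched
    except Exception as error:
        if error_var is not None:
            variables[error_var] = str(error)  # expose the error to the script
        if catch is not None:
            catch()  # recovery branch instead of aborting
        if reraise:
            raise  # surface the original error; finally_ still runs first
    finally:
        if finally_ is not None:
            finally_()
    return variables


def run_while_var(variables, name, op, value, body, max_iter=100):
    """Repeat body while variables[name] <op> value holds, at most max_iter times."""
    compare = COMPARATORS[op]
    iterations = 0
    # The condition is re-checked before every iteration.
    while iterations < max_iter and compare(variables[name], value):
        body()
        iterations += 1
    return iterations


log = []


def failing_body():
    log.append("body")
    raise RuntimeError("window not found")


state = run_try(failing_body,
                catch=lambda: log.append("catch"),
                finally_=lambda: log.append("finally"),
                error_var="last_error")

counter = {"n": 0}


def increment():
    counter["n"] += 1


completed = run_while_var(counter, "n", "lt", 3, increment)
# A body that never changes the variable is stopped by the safety cap.
capped = run_while_var({"n": 0}, "n", "lt", 3, lambda: None, max_iter=5)
```

In this sketch the failing body leaves `log` as body, catch, finally in that order, and the second loop stops only because of `max_iter`.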

14 files changed

Lines changed: 1559 additions & 18 deletions

README.md

Lines changed: 12 additions & 3 deletions
@@ -67,7 +67,9 @@ an `ac_*` MCP tool, and a Qt GUI tab. Full reference page:
 [`docs/source/Eng/doc/new_features/v3_features_doc.rst`](docs/source/Eng/doc/new_features/v3_features_doc.rst).

 **Assertions**
-- **Assertion DSL** — verify screen state instead of only driving it: `assert_text` (OCR, `regex` + `present=False` for absence), `assert_image`, `assert_pixel`, `assert_window`. Returns an `AssertionResult`; raises `AutoControlAssertionException` on mismatch with optional failure screenshot (`AC_assert_text / _image / _pixel / _window`).
+- **Assertion DSL** — verify screen state instead of only driving it: `assert_text` (OCR, `regex` + `present=False` for absence), `assert_image`, `assert_pixel`, `assert_window`, `assert_clipboard` (`equals` / `contains` / `regex`, `present=False` to confirm a secret was cleared), `assert_process` (a named process is / isn't running, via psutil). Returns an `AssertionResult`; raises `AutoControlAssertionException` on mismatch with optional failure screenshot (`AC_assert_text / _image / _pixel / _window / _clipboard / _process`).
+- **Off-screen assertions** `assert_file` (existence / substring / SHA-256 / minimum size — verify a download or export) and `assert_http` (an http/https endpoint returns a status + optional body text, always with an explicit timeout). Both extend the DSL beyond the screen and plug into the combinators below (`AC_assert_file / AC_assert_http`).
+- **Assertion combinators** `assert_all([...specs])` runs a batch as *soft assertions* (every spec is checked, all failures collected before raising) and returns a `GroupAssertionResult`; `assert_any([...specs])` is the OR-complement (passes when at least one spec passes, short-circuiting — e.g. *either* a success dialog *or* a redirect confirms a login); `assert_eventually(spec, timeout, interval)` retries one declarative assertion spec until it passes or times out (e.g. poll a health endpoint until it returns 200, or wait for a download file to appear). Both are spec-driven (`{"kind": "text", "text": "Saved"}`, `{"kind": "http", "url": "..."}`) so they work identically from Python, JSON, and MCP across every assertion kind — text/image/pixel/window/clipboard/process/file/http (`AC_assert_all / AC_assert_eventually`).
 - **Media assertions** `assert_audio_activity` (record + RMS threshold for sound vs silence) and `assert_video_changes` (mean frame-to-frame diff over a segment for motion vs static); pure numeric cores, lazy `sounddevice` / OpenCV (`AC_assert_audio / AC_assert_video_changes`).

 **Data-driven execution**
@@ -147,7 +149,7 @@ sense) a Qt GUI tab. Full reference page:
 - **AI Element Locator (VLM)** — describe a UI element in plain language and let a vision-language model (Anthropic / OpenAI) find its screen coordinates
 - **OCR** — extract text from screen regions through three pluggable backends (Tesseract for ASCII, EasyOCR for CJK without an external binary, PaddleOCR for highest-quality Chinese / Japanese / Korean). Single unified API + canonical language codes; backend chosen by `backend=` kwarg, `AUTOCONTROL_OCR_BACKEND` env var, or auto-detection. Wait for, click, or locate rendered text; regex search and full-region dump
 - **LLM Action Planner** — translate a plain-language description into a validated `AC_*` action list using Claude
-- **Runtime Variables & Control Flow** `${var}` substitution at execution time, plus `AC_set_var` / `AC_inc_var` / `AC_if_var` / `AC_for_each` / `AC_loop` / `AC_retry` for data-driven scripts
+- **Runtime Variables & Control Flow** `${var}` substitution at execution time, plus `AC_set_var` / `AC_inc_var` / `AC_if_var` / `AC_for_each` / `AC_loop` / `AC_while_var` / `AC_retry` / `AC_try` for data-driven scripts. `AC_while_var` loops while a variable comparison holds (re-checked each iteration, `max_iter` safety cap). `AC_try` adds try/catch/finally: when `body` fails it runs the `catch` recovery branch instead of aborting, always runs `finally`, exposes the error to `error_var`, and can `reraise` after cleanup (loop `break`/`continue` still propagate through it)
 - **Remote Desktop** — stream this machine's screen and accept remote input over a token-authenticated TCP protocol, *or* connect to another machine and view + control it (host + viewer GUIs included). Optional TLS (HTTPS-grade encryption), WebSocket transport (ws:// + wss:// for browser / firewall-friendly clients), persistent 9-digit Host ID, host→viewer audio streaming, bidirectional clipboard sync (text + image), and chunked file transfer (drag-drop + progress bar; arbitrary destination path; no size cap). Plus folder sync (additive mirror — local deletions never propagate) and a self-hosted coturn TURN config bundle generator (turnserver.conf + systemd unit + docker-compose + README). **AnyDesk-style popout**: when the viewer authenticates, the live remote desktop opens in its own resizable top-level window so the control panel stays uncluttered. The Remote Desktop tabs are wrapped in `QScrollArea` so the panel stays usable on small windows and stretches edge-to-edge on 4K displays. Driveable headlessly via `je_auto_control` and over MCP through the new `ac_remote_*` tools
 - **Driver-level input backends (opt-in)** — for games / apps that ignore SendInput (Win) or XTest (Linux): **Interception driver backend** for Windows (HID-layer keyboard / mouse injection via Oblita's WHQL-signed driver, opt-in via `JE_AUTOCONTROL_WIN32_BACKEND=interception`), **uinput backend** for Linux (kernel `/dev/uinput` synthetic HID device, opt-in via `JE_AUTOCONTROL_LINUX_BACKEND=uinput`), and **ViGEm virtual gamepad** for Windows games that read controllers (virtual Xbox 360 pad with friendly button / dpad / stick / trigger API, exposed as `AC_gamepad_*` executor commands and `ac_gamepad_*` MCP tools). All three fall back gracefully when the driver isn't installed, so existing deployments keep working unchanged
 - **Clipboard** — read/write system clipboard text on Windows, macOS, and Linux
@@ -973,10 +975,17 @@ time.sleep(10) # Record for 10 seconds
 # Stop recording and get the action list
 actions = je_auto_control.stop_record()

+# Clean up the recording before replay: collapse runs of consecutive
+# mouse-move samples into their final position (often shrinks a raw
+# recording by an order of magnitude without changing replay behaviour)
+actions = je_auto_control.dedupe_moves(actions)
+
 # Replay the recorded actions
 je_auto_control.execute_action(actions)
 ```

+> Non-destructive recording editors (all return a new list): `dedupe_moves` (collapse mouse-move runs), `merge_sleeps` (sum consecutive `AC_sleep` runs), `trim_actions`, `insert_action`, `remove_action`, `filter_actions`, `adjust_delays` (scale `AC_sleep` delays), `scale_coordinates` (replay at a different resolution). Exposed over MCP as `ac_dedupe_moves` / `ac_merge_sleeps` / `ac_trim_actions` / `ac_adjust_delays` / `ac_scale_coordinates`.
+
 ### JSON Action Scripting

 Create a JSON action file (`actions.json`):
@@ -1020,7 +1029,7 @@ je_auto_control.execute_action([
 | LLM planner | `AC_llm_plan`, `AC_llm_run` |
 | Clipboard | `AC_clipboard_get`, `AC_clipboard_set` |
 | Window | `AC_list_windows`, `AC_focus_window`, `AC_wait_window`, `AC_close_window` |
-| Flow control | `AC_loop`, `AC_break`, `AC_continue`, `AC_if_image_found`, `AC_if_pixel`, `AC_if_var`, `AC_while_image`, `AC_for_each`, `AC_wait_image`, `AC_wait_pixel`, `AC_sleep`, `AC_retry` |
+| Flow control | `AC_loop`, `AC_break`, `AC_continue`, `AC_if_image_found`, `AC_if_pixel`, `AC_if_var`, `AC_while_image`, `AC_while_var`, `AC_for_each`, `AC_wait_image`, `AC_wait_pixel`, `AC_sleep`, `AC_retry`, `AC_try` |
 | Variables | `AC_set_var`, `AC_get_var`, `AC_inc_var` |
 | Remote desktop | `AC_start_remote_host`, `AC_stop_remote_host`, `AC_remote_host_status`, `AC_remote_connect`, `AC_remote_disconnect`, `AC_remote_viewer_status`, `AC_remote_send_input` |
 | Record | `AC_record`, `AC_stop_record`, `AC_set_record_enable` |
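
The README text above describes `dedupe_moves` as collapsing runs of mouse-move samples into their final position and `merge_sleeps` as summing consecutive `AC_sleep` runs, both returning a new list. A rough sketch of that behaviour follows. The action shape (`[command_name, params]`), the move command name, the `seconds` key, and the `*_sketch` function names are assumptions made for illustration; the library's real recording format and implementation may differ.

```python
from typing import Any, List

Action = List[Any]  # assumed shape: [command_name, params_dict]

MOVE_COMMAND = "AC_set_mouse_position"  # assumed name of a recorded mouse move
SLEEP_COMMAND = "AC_sleep"
SLEEP_KEY = "seconds"  # assumed parameter name for the sleep duration


def dedupe_moves_sketch(actions: List[Action]) -> List[Action]:
    """Keep only the last sample of each run of consecutive mouse moves."""
    result: List[Action] = []
    for action in actions:
        if result and action[0] == MOVE_COMMAND and result[-1][0] == MOVE_COMMAND:
            result[-1] = action  # overwrite the previous sample in this run
        else:
            result.append(action)
    return result  # new list; the input list is left untouched


def merge_sleeps_sketch(actions: List[Action]) -> List[Action]:
    """Sum each run of consecutive sleeps into a single sleep action."""
    result: List[Action] = []
    for name, params in actions:
        if result and name == SLEEP_COMMAND and result[-1][0] == SLEEP_COMMAND:
            total = result[-1][1][SLEEP_KEY] + params[SLEEP_KEY]
            result[-1] = [SLEEP_COMMAND, {**result[-1][1], SLEEP_KEY: total}]
        else:
            result.append([name, dict(params)])  # copy so the input is not shared
    return result


raw = [
    [MOVE_COMMAND, {"x": 1, "y": 1}],
    [MOVE_COMMAND, {"x": 2, "y": 2}],
    [MOVE_COMMAND, {"x": 3, "y": 3}],
    [SLEEP_COMMAND, {SLEEP_KEY: 0.5}],
    [SLEEP_COMMAND, {SLEEP_KEY: 0.25}],
    ["AC_click_mouse", {"mouse_keycode": "mouse_left"}],  # illustrative action
    [MOVE_COMMAND, {"x": 4, "y": 4}],
]

compact = merge_sleeps_sketch(dedupe_moves_sketch(raw))
```

Here seven raw actions compact to four, and `raw` itself still holds all seven entries, which is the non-destructive property the README calls out.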

je_auto_control/__init__.py

Lines changed: 12 additions & 5 deletions
@@ -135,7 +135,10 @@
 )
 # Assertion DSL (verify screen state; raise on mismatch)
 from je_auto_control.utils.assertion import (
-    AssertionResult, assert_image, assert_pixel, assert_text, assert_window,
+    AssertionResult, GroupAssertionResult, assert_all, assert_any,
+    assert_clipboard, assert_eventually, assert_file, assert_http,
+    assert_image, assert_pixel, assert_process, assert_text, assert_window,
+    run_assertion_spec,
 )
 # Data-driven execution (load rows from CSV / JSON / SQLite / Excel)
 from je_auto_control.utils.data_source import data_source_kinds, load_rows
@@ -282,8 +285,8 @@
 )
 # Recording editor (headless helpers)
 from je_auto_control.utils.recording_edit.editor import (
-    adjust_delays, filter_actions, insert_action, remove_action,
-    scale_coordinates, trim_actions,
+    adjust_delays, dedupe_moves, filter_actions, insert_action,
+    merge_sleeps, remove_action, scale_coordinates, trim_actions,
 )
 # Scheduler (headless)
 from je_auto_control.utils.scheduler.scheduler import (
@@ -405,7 +408,7 @@ def start_autocontrol_gui(*args, **kwargs):
     "find_text_regex",
     # Recording editor
     "trim_actions", "insert_action", "remove_action", "filter_actions",
-    "adjust_delays", "scale_coordinates",
+    "adjust_delays", "scale_coordinates", "dedupe_moves", "merge_sleeps",
     # Scheduler
     "Scheduler", "ScheduledJob", "default_scheduler",
     # Script variables
@@ -506,7 +509,11 @@ def start_autocontrol_gui(*args, **kwargs):
     "wait_until_region_idle", "wait_until_screen_stable",
     # Assertion DSL
     "AssertionResult", "assert_image", "assert_pixel",
-    "assert_text", "assert_window",
+    "assert_text", "assert_window", "assert_clipboard", "assert_process",
+    "assert_file", "assert_http",
+    # Assertion combinators (soft groups + eventual polling)
+    "GroupAssertionResult", "assert_all", "assert_any", "assert_eventually",
+    "run_assertion_spec",
     # Data-driven execution
     "data_source_kinds", "load_rows",
     # Flaky-test detection
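
Among the names re-exported above, `assert_file` is described in the README as checking existence, a substring, a SHA-256 digest, and a minimum size. The sketch below shows what such a check could look like using only the standard library. `check_file`, its keyword names, and its `(ok, message)` return value are hypothetical; the real `assert_file` returns an `AssertionResult` and its signature is not shown in this diff.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def check_file(path, contains: Optional[str] = None,
               sha256: Optional[str] = None,
               min_size: Optional[int] = None) -> Tuple[bool, str]:
    """Return (ok, message) for existence, size, substring, and digest checks."""
    target = Path(path)
    if not target.is_file():
        return False, f"missing file: {target}"
    data = target.read_bytes()
    if min_size is not None and len(data) < min_size:
        return False, f"file is {len(data)} bytes, expected at least {min_size}"
    if contains is not None and contains.encode("utf-8") not in data:
        return False, f"substring not found: {contains!r}"
    if sha256 is not None and hashlib.sha256(data).hexdigest() != sha256.lower():
        return False, "SHA-256 digest mismatch"
    return True, "ok"


payload = b"id,total\n1,42\n"  # 14 bytes standing in for an exported report

with tempfile.TemporaryDirectory() as folder:
    export = Path(folder) / "report.csv"
    export.write_bytes(payload)
    digest = hashlib.sha256(payload).hexdigest()

    good = check_file(export, contains="total", sha256=digest, min_size=10)
    too_small = check_file(export, min_size=1000)
    wrong_digest = check_file(export, sha256="0" * 64)
    absent = check_file(Path(folder) / "missing.csv")
```

Checking size before reading for a substring or digest keeps the failure message specific: a truncated download is reported as too small, not as a digest mismatch.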

je_auto_control/utils/assertion/__init__.py

Lines changed: 22 additions & 1 deletion
@@ -4,22 +4,43 @@

 from je_auto_control import (
     assert_text, assert_image, assert_pixel, assert_window,
-    AssertionResult,
+    assert_clipboard, assert_all, assert_eventually,
+    AssertionResult, GroupAssertionResult,
 )
 """
 from je_auto_control.utils.assertion.assertions import (
     AssertionResult,
+    assert_clipboard,
+    assert_file,
+    assert_http,
     assert_image,
     assert_pixel,
+    assert_process,
     assert_text,
     assert_window,
 )
+from je_auto_control.utils.assertion.combinators import (
+    GroupAssertionResult,
+    assert_all,
+    assert_any,
+    assert_eventually,
+    run_assertion_spec,
+)


 __all__ = [
     "AssertionResult",
+    "GroupAssertionResult",
+    "assert_all",
+    "assert_any",
+    "assert_clipboard",
+    "assert_eventually",
+    "assert_file",
+    "assert_http",
     "assert_image",
     "assert_pixel",
+    "assert_process",
     "assert_text",
     "assert_window",
+    "run_assertion_spec",
 ]
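
The new `combinators` module exports `assert_all`, `assert_any`, `assert_eventually`, and `run_assertion_spec`, which the commit message describes as sharing one declarative dispatcher keyed on the spec's `kind`. The sketch below illustrates that idea with stand-in checkers. Only the spec shape (`{"kind": ..., ...}`) comes from the source; `run_spec`, `all_of`, `any_of`, `eventually`, and the `checkers` table are hypothetical names, and the real functions raise or return result objects instead of plain booleans and lists.

```python
import time
from typing import Any, Callable, Dict, List

Spec = Dict[str, Any]
Checker = Callable[[Spec], bool]


def run_spec(spec: Spec, checkers: Dict[str, Checker]) -> bool:
    """Dispatch one declarative spec to the checker registered for its kind."""
    kind = spec["kind"]
    if kind not in checkers:
        raise ValueError(f"unknown assertion kind: {kind!r}")
    return bool(checkers[kind](spec))


def all_of(specs: List[Spec], checkers: Dict[str, Checker]) -> List[Spec]:
    """Soft AND: check every spec and return the full list of failures."""
    return [spec for spec in specs if not run_spec(spec, checkers)]


def any_of(specs: List[Spec], checkers: Dict[str, Checker]) -> bool:
    """OR: pass as soon as one spec passes (any() short-circuits)."""
    return any(run_spec(spec, checkers) for spec in specs)


def eventually(spec: Spec, checkers: Dict[str, Checker],
               timeout: float = 1.0, interval: float = 0.01) -> bool:
    """Poll one spec until it passes or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if run_spec(spec, checkers):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in checkers: a fake screen and a download that appears on the third poll.
screen_text = "Saved draft"
attempts = {"count": 0}


def file_ready(spec: Spec) -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


checkers: Dict[str, Checker] = {
    "text": lambda spec: spec["text"] in screen_text,
    "file": file_ready,
}

failures = all_of([
    {"kind": "text", "text": "Saved"},
    {"kind": "text", "text": "Error"},
    {"kind": "text", "text": "Published"},
], checkers)

either = any_of([
    {"kind": "text", "text": "Welcome"},
    {"kind": "text", "text": "Saved"},
], checkers)

ready = eventually({"kind": "file", "path": "export.csv"}, checkers)
```

Because every combinator goes through the same dispatch function, adding a new assertion kind in this sketch means adding one entry to `checkers`, with no change to the combinators themselves.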
