feat: state narrowing and transition verification for grounding cascade by abrichr · Pull Request #257 · OpenAdaptAI/openadapt-evals

abrichr · 2026-03-31T22:27:07Z

Summary

- Implements Phase 4 of the grounding cascade: state narrowing before grounding and transition verification after clicking
- Adds check_state_preconditions(), a pre-click state check that verifies window title, nearby text, and surrounding labels match expectations before grounding a click
- Adds verify_transition(), a post-click check that verifies disappearance_text, appearance_text, and window_title_change after clicking
- Integrates both functions into DemoExecutor.run() as observational warnings (non-blocking in Phase 4; blocking/recovery deferred to later phases)
- Both functions accept an optional ocr_fn callable for OCR integration (Phase 5) and skip gracefully when no OCR is available
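The pre-click check described above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the `TargetSketch` class, its field names, the `(ok, reasons)` return shape, the `ocr_fn` signature (bytes in, text out), and the 0.5 match threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


# Hypothetical stand-in for the PR's grounding target; field names are assumed.
@dataclass
class TargetSketch:
    window_title: Optional[str] = None
    nearby_text: list[str] = field(default_factory=list)
    surrounding_labels: list[str] = field(default_factory=list)


def _fraction_found(expected: list[str], screen_text: str) -> float:
    """Share of expected strings present in screen_text, case-insensitively."""
    if not expected:
        return 1.0
    haystack = screen_text.lower()
    hits = sum(1 for item in expected if item.lower() in haystack)
    return hits / len(expected)


def check_state_preconditions_sketch(
    screenshot: bytes,
    target: TargetSketch,
    ocr_fn: Optional[Callable[[bytes], str]] = None,
    min_fraction: float = 0.5,  # assumed threshold, not taken from the PR
) -> tuple[bool, list[str]]:
    """Return (ok, reasons); skip with ok=True when nothing can be checked."""
    has_expectations = bool(
        target.window_title or target.nearby_text or target.surrounding_labels
    )
    if ocr_fn is None or not has_expectations:
        return True, []

    screen_text = ocr_fn(screenshot)
    reasons: list[str] = []
    if target.window_title and target.window_title.lower() not in screen_text.lower():
        reasons.append(f"window title not found: {target.window_title!r}")
    if _fraction_found(target.nearby_text, screen_text) < min_fraction:
        reasons.append("too few nearby_text entries found")
    if _fraction_found(target.surrounding_labels, screen_text) < min_fraction:
        reasons.append("too few surrounding_labels found")
    return not reasons, reasons


# Usage with a fake OCR function standing in for a real engine.
fake_ocr = lambda _img: "Settings - Notepad  Font  Size  OK  Cancel"
target = TargetSketch(window_title="notepad", nearby_text=["Font", "Colour"])
ok, reasons = check_state_preconditions_sketch(b"", target, ocr_fn=fake_ocr)
```

Returning a list of reasons rather than raising keeps the check observational: the caller can log the reasons as warnings and proceed, which matches the non-blocking behavior described for Phase 4.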

Test plan

26 new tests in tests/test_grounding.py covering:
- check_state_preconditions: no-OCR skip, no-expectations skip, window title match/mismatch, nearby text threshold, surrounding labels threshold, case insensitivity, combined checks
- verify_transition: no-expectations skip, no-OCR skip, appearance text found/missing, disappearance text gone/present, window title change, modal toggled skip, combined scenarios
- GroundingTarget round-trip serialization (to_dict/from_dict, tuple conversion, defaults omission)
All 1542 existing tests pass (54 skipped, 17 deselected by filter)
uv run --no-sources pytest tests/test_grounding.py -v: 26 passed
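As an illustration of the round-trip serialization cases listed in the test plan (to_dict/from_dict, tuple conversion, defaults omission), a test might be shaped like the sketch below. The `GroundingTargetSketch` class and its `description`, `bbox`, and `window_title` fields are hypothetical; the real GroundingTarget fields are not shown in this PR description.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


# Hypothetical class for illustration; not the real GroundingTarget.
@dataclass
class GroundingTargetSketch:
    description: str = ""
    bbox: Optional[tuple[int, int, int, int]] = None
    window_title: Optional[str] = None

    def to_dict(self) -> dict[str, Any]:
        """Serialize, omitting default-valued fields and turning tuples into lists."""
        out: dict[str, Any] = {}
        for f in fields(self):
            value = getattr(self, f.name)
            if value != f.default:
                out[f.name] = list(value) if isinstance(value, tuple) else value
        return out

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "GroundingTargetSketch":
        """Deserialize, converting the JSON-style list back into a tuple."""
        data = dict(data)  # copy so the caller's dict is not mutated
        if data.get("bbox") is not None:
            data["bbox"] = tuple(data["bbox"])
        return cls(**data)


def test_round_trip() -> None:
    original = GroundingTargetSketch(description="Save button", bbox=(10, 20, 30, 40))
    payload = original.to_dict()
    assert "window_title" not in payload        # defaults are omitted
    assert payload["bbox"] == [10, 20, 30, 40]  # tuple serialized as a list
    assert GroundingTargetSketch.from_dict(payload) == original


test_round_trip()
```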

🤖 Generated with Claude Code

Phase 4 of the grounding cascade: detect "wrong screen" before grounding and verify state changes after clicking.

Added to grounding.py:
- check_state_preconditions(): verifies window title, nearby text, and surrounding labels match expectations before grounding a click. Skips gracefully when no OCR function is provided (Phase 5).
- verify_transition(): checks disappearance_text, appearance_text, and window_title_change against the post-click screenshot via OCR. Modal detection deferred (logged, not enforced).
- _text_present(): case-insensitive substring matching helper.

Integrated into DemoExecutor.run():
- Pre-click: calls check_state_preconditions for click/double_click steps with a grounding_target. Observational only (warns, proceeds).
- Post-click: calls verify_transition after action dispatch. Observational only (warns, proceeds).

Tests (26 new):
- 11 tests for check_state_preconditions (no-OCR, no-expectations, window title match/mismatch, nearby text, surrounding labels, case insensitivity, combined checks)
- 11 tests for verify_transition (no-expectations, no-OCR, appearance/disappearance, window title change, modal skip, combined scenarios)
- 4 tests for GroundingTarget round-trip serialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
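The post-click side of the commit message can be sketched in the same spirit. Everything below is an assumption for illustration: the expectations are passed as a plain dict, `window_title_change` is taken to hold the expected new title, `modal_toggled` is a guessed key name, and `click_with_observation` is a hypothetical stand-in for the relevant part of DemoExecutor.run().

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("grounding_sketch")


def _text_present(needle: str, haystack: str) -> bool:
    """Case-insensitive substring match."""
    return needle.lower() in haystack.lower()


def verify_transition_sketch(
    after_screenshot: bytes,
    expectations: dict,
    ocr_fn: Optional[Callable[[bytes], str]] = None,
) -> list[str]:
    """Return failure messages; an empty list means passed or skipped."""
    if ocr_fn is None or not expectations:
        return []  # nothing to check, or no way to read the screen

    text = ocr_fn(after_screenshot)
    failures: list[str] = []

    gone = expectations.get("disappearance_text")
    if gone and _text_present(gone, text):
        failures.append(f"text still present: {gone!r}")

    appeared = expectations.get("appearance_text")
    if appeared and not _text_present(appeared, text):
        failures.append(f"text did not appear: {appeared!r}")

    new_title = expectations.get("window_title_change")
    if new_title and not _text_present(new_title, text):
        failures.append(f"window title not found: {new_title!r}")

    if expectations.get("modal_toggled"):  # assumed key; logged, not enforced
        logger.info("modal_toggled expectation noted but not enforced")

    return failures


def click_with_observation(
    click: Callable[[], None],
    take_screenshot: Callable[[], bytes],
    expectations: dict,
    ocr_fn: Optional[Callable[[bytes], str]] = None,
) -> None:
    """Dispatch the click, then warn on failed expectations without blocking."""
    click()
    for message in verify_transition_sketch(take_screenshot(), expectations, ocr_fn):
        logger.warning("transition check: %s", message)
```

Because failures are only logged, a mismatch never stops the run, which mirrors the "warns, proceeds" behavior described for this phase.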

abrichr merged commit e22b404 into main Mar 31, 2026
1 check passed

abrichr merged 1 commit into main from feat/grounding-state-narrowing
