
Commit 4205cd5

abrichr and claude authored
fix(ci): fix release automation — use ADMIN_TOKEN for protected branches (#8)
* fix: resolve all ruff lint errors

  - Remove unused variable assignments (share.py, browser_bridge.py, windows.py, test_highlevel.py)
  - Add noqa comment for Quartz import needed by ApplicationServices (darwin.py)
  - Remove unused TYPE_CHECKING import (storage/__init__.py)
  - Add proper TYPE_CHECKING import for CaptureStats annotation (generate_real_capture_plot.py)
  - Auto-fix import sorting across multiple files

* docs: update README with share command and ecosystem links

  - Uncomment PyPI badges (package now published as 0.3.0)
  - Add "Sharing Recordings" section with Magic Wormhole usage
  - Update openadapt-privacy from "Coming soon" to GitHub link
  - Add share extra to optional extras table
  - Add openadapt-privacy and openadapt-evals to Related Projects

* docs: remove redundant openadapt-ml training section

  The detailed training workflow belongs in openadapt-ml's README. This keeps openadapt-capture focused on its core functionality. Users can find training info via the Related Projects link.

* fix(ci): fix release automation — use ADMIN_TOKEN to push to protected branches

  Root cause: GITHUB_TOKEN cannot push commits to protected branches. Semantic-release created the v0.3.0 tag (tags bypass protection), but the "chore: release 0.3.0" commit that bumps pyproject.toml was orphaned.

  - Use ADMIN_TOKEN for checkout and semantic-release (can push to main)
  - Add skip-check to prevent infinite loops on release commits
  - Sync pyproject.toml version to 0.3.0 (matches latest tag)

  Prerequisite: Add ADMIN_TOKEN secret (GitHub PAT with repo scope) to repository settings.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9080e12 commit 4205cd5

28 files changed: +3984 −111 lines changed

.github/workflows/release.yml

Lines changed: 16 additions & 5 deletions
@@ -18,31 +18,42 @@ jobs:
         uses: actions/checkout@v4
         with:
           fetch-depth: 0
+          token: ${{ secrets.ADMIN_TOKEN }}
+
+      - name: Check if should skip
+        id: check_skip
+        run: |
+          if [ "$(git log -1 --pretty=format:'%an')" = "semantic-release" ]; then
+            echo "skip=true" >> $GITHUB_OUTPUT
+          fi
 
       - name: Set up Python
+        if: steps.check_skip.outputs.skip != 'true'
         uses: actions/setup-python@v5
         with:
           python-version: '3.12'
 
       - name: Install uv
+        if: steps.check_skip.outputs.skip != 'true'
         uses: astral-sh/setup-uv@v4
 
       - name: Python Semantic Release
+        if: steps.check_skip.outputs.skip != 'true'
         id: release
         uses: python-semantic-release/python-semantic-release@v9.15.2
         with:
-          github_token: ${{ secrets.GITHUB_TOKEN }}
+          github_token: ${{ secrets.ADMIN_TOKEN }}
 
       - name: Build package
-        if: steps.release.outputs.released == 'true'
+        if: steps.check_skip.outputs.skip != 'true' && steps.release.outputs.released == 'true'
         run: uv build
 
       - name: Publish to PyPI
-        if: steps.release.outputs.released == 'true'
+        if: steps.check_skip.outputs.skip != 'true' && steps.release.outputs.released == 'true'
         uses: pypa/gh-action-pypi-publish@release/v1
 
       - name: Publish to GitHub Releases
-        if: steps.release.outputs.released == 'true'
+        if: steps.check_skip.outputs.skip != 'true' && steps.release.outputs.released == 'true'
         uses: python-semantic-release/publish-action@v9.15.2
         with:
-          github_token: ${{ secrets.GITHUB_TOKEN }}
+          github_token: ${{ secrets.ADMIN_TOKEN }}
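The author-based skip-check above can be exercised outside Actions. A minimal sketch, assuming `git` is on PATH (the throwaway repo path and bot identity are illustrative stand-ins for what semantic-release produces in CI):

```python
import subprocess
import tempfile

# Recreate a release-style commit in a throwaway repo, then run the same
# last-author check the workflow uses to decide whether to skip.
repo = tempfile.mkdtemp()
subprocess.run(["git", "init", "-q", repo], check=True)
subprocess.run(
    ["git", "-C", repo,
     "-c", "user.name=semantic-release",
     "-c", "user.email=bot@example.com",
     "commit", "-q", "--allow-empty", "-m", "chore: release 0.3.0"],
    check=True,
)
author = subprocess.run(
    ["git", "-C", repo, "log", "-1", "--pretty=format:%an"],
    capture_output=True, text=True, check=True,
).stdout
skip = author == "semantic-release"
print(f"skip={str(skip).lower()}")  # in CI this goes to $GITHUB_OUTPUT
```

In the workflow, every later step gates on `steps.check_skip.outputs.skip != 'true'`, so the push that semantic-release itself makes to main triggers a run that exits without releasing again.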

README.md

Lines changed: 18 additions & 64 deletions
@@ -4,10 +4,8 @@
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue)](https://www.python.org/downloads/)
 
-<!-- PyPI badges (uncomment once package is published)
 [![PyPI version](https://img.shields.io/pypi/v/openadapt-capture.svg)](https://pypi.org/project/openadapt-capture/)
 [![Downloads](https://img.shields.io/pypi/dm/openadapt-capture.svg)](https://pypi.org/project/openadapt-capture/)
--->
 
 **OpenAdapt Capture** is the data collection component of the [OpenAdapt](https://github.com/OpenAdaptAI) GUI automation ecosystem.
 
@@ -43,7 +41,7 @@ Capture platform-agnostic GUI interaction streams with time-aligned screenshots
 |-----------|---------|------------|
 | **openadapt-capture** | Record human demonstrations | [GitHub](https://github.com/OpenAdaptAI/openadapt-capture) |
 | **openadapt-ml** | Train and evaluate GUI automation models | [GitHub](https://github.com/OpenAdaptAI/openadapt-ml) |
-| **openadapt-privacy** | PII scrubbing for recordings | Coming soon |
+| **openadapt-privacy** | PII scrubbing for recordings | [GitHub](https://github.com/OpenAdaptAI/openadapt-privacy) |
 
 ---
 
@@ -208,75 +206,29 @@ The HTML viewer includes:
 uv run python scripts/generate_readme_demo.py --duration 10
 ```
 
-## Optional Extras
+## Sharing Recordings
 
-| Extra | Features |
-|-------|----------|
-| `audio` | Audio capture + Whisper transcription |
-| `privacy` | PII scrubbing (openadapt-privacy) |
-| `all` | Everything |
-
----
-
-## Training with OpenAdapt-ML
-
-Captured recordings can be used to train vision-language models with [openadapt-ml](https://github.com/OpenAdaptAI/openadapt-ml).
-
-### End-to-End Workflow
+Share recordings between machines using [Magic Wormhole](https://magic-wormhole.readthedocs.io/):
 
 ```bash
-# 1. Capture a workflow demonstration
-uv run python -c "
-from openadapt_capture import Recorder
-
-with Recorder('./my_capture', task_description='Turn off Night Shift') as recorder:
-    input('Perform the task, then press Enter to stop...')
-"
-
-# 2. Train a model on the capture (requires openadapt-ml)
-uv pip install openadapt-ml
-uv run python -m openadapt_ml.cloud.local train \
-    --capture ./my_capture \
-    --open  # Opens training dashboard
-
-# 3. Compare human vs model predictions
-uv run python -m openadapt_ml.scripts.compare \
-    --capture ./my_capture \
-    --checkpoint checkpoints/model \
-    --open
-```
+# On the sending machine
+capture share send ./my_capture
+# Shows a code like: 7-guitarist-revenge
 
-### Cloud GPU Training
-
-For faster training with cloud GPUs:
-
-```bash
-# Train on Lambda Labs A10 (~$0.75/hr)
-uv run python -m openadapt_ml.cloud.lambda_labs train \
-    --capture ./my_capture \
-    --goal "Turn off Night Shift"
+# On the receiving machine
+capture share receive 7-guitarist-revenge
 ```
 
-See the [openadapt-ml documentation](https://github.com/OpenAdaptAI/openadapt-ml#6-cloud-gpu-training) for cloud setup.
-
-### Data Format
-
-OpenAdapt-ML converts captures to its Episode format automatically:
-
-```python
-from openadapt_ml.ingest.capture import capture_to_episode
+The `share` command compresses the recording, sends it via Magic Wormhole, and extracts it on the receiving end. No account or setup required - just share the code.
 
-episode = capture_to_episode("./my_capture")
-print(f"Loaded {len(episode.steps)} steps")
-print(f"Instruction: {episode.instruction}")
-```
+## Optional Extras
 
-The conversion maps capture event types to ML action types:
-- `mouse.singleclick` / `mouse.click` -> `CLICK`
-- `mouse.doubleclick` -> `DOUBLE_CLICK`
-- `mouse.drag` -> `DRAG`
-- `mouse.scroll` -> `SCROLL`
-- `key.type` -> `TYPE`
+| Extra | Features |
+|-------|----------|
+| `audio` | Audio capture + Whisper transcription |
+| `privacy` | PII scrubbing ([openadapt-privacy](https://github.com/OpenAdaptAI/openadapt-privacy)) |
+| `share` | Recording sharing via Magic Wormhole |
+| `all` | Everything |
 
 ---
 
@@ -290,6 +242,8 @@ uv run pytest
 ## Related Projects
 
 - [openadapt-ml](https://github.com/OpenAdaptAI/openadapt-ml) - Train and evaluate GUI automation models
+- [openadapt-privacy](https://github.com/OpenAdaptAI/openadapt-privacy) - PII detection and scrubbing for recordings
+- [openadapt-evals](https://github.com/OpenAdaptAI/openadapt-evals) - Benchmark evaluation for GUI agents
 - [Windows Agent Arena](https://github.com/microsoft/WindowsAgentArena) - Benchmark for Windows GUI agents
 
 ## License

openadapt_capture/__init__.py

Lines changed: 6 additions & 6 deletions
@@ -77,6 +77,12 @@
 
 # Browser events and bridge (optional - requires websockets)
 try:
+    from openadapt_capture.browser_bridge import (
+        BrowserBridge,
+        BrowserEventRecord,
+        BrowserMode,
+        run_browser_bridge,
+    )
     from openadapt_capture.browser_events import (
         BoundingBox,
         BrowserClickEvent,
@@ -93,12 +99,6 @@
         SemanticElementRef,
         VisibleElement,
     )
-    from openadapt_capture.browser_bridge import (
-        BrowserBridge,
-        BrowserEventRecord,
-        BrowserMode,
-        run_browser_bridge,
-    )
     _BROWSER_BRIDGE_AVAILABLE = True
 except ImportError:
    _BROWSER_BRIDGE_AVAILABLE = False

openadapt_capture/browser_bridge.py

Lines changed: 5 additions & 11 deletions
@@ -381,20 +381,14 @@ async def _handle_dom_event(self, data: dict) -> None:
         self._event_count += 1
 
         # Parse into typed event if possible
-        typed_event = self._parse_typed_event(event_type, payload, data)
+        self._parse_typed_event(event_type, payload, data)
 
         # Store in CaptureStorage if available
         if self.storage is not None:
-            # Store as JSON in the events table
-            # Note: We store the raw event, not Pydantic model to match storage patterns
-            try:
-                from openadapt_capture.events import BaseEvent
-                # Create a minimal event for storage compatibility
-                # Browser events don't fit the standard EventType enum
-                # so we store them as raw JSON in a custom way
-                pass  # Storage integration would go here
-            except ImportError:
-                pass
+            # Storage integration would go here
+            # Browser events don't fit the standard EventType enum
+            # so we store them as raw JSON in a custom way
+            pass
 
         # Notify callback
         if self.on_event is not None:

openadapt_capture/browser_events.py

Lines changed: 0 additions & 1 deletion
@@ -12,7 +12,6 @@
 
 from pydantic import BaseModel, Field
 
-
 # =============================================================================
 # Browser Event Types
 # =============================================================================

openadapt_capture/cli.py

Lines changed: 1 addition & 1 deletion
@@ -353,7 +353,7 @@ def share(action: str, path_or_code: str, output_dir: str = ".") -> None:
         capture share receive 7-guitarist-revenge
         capture share receive 7-guitarist-revenge ./recordings
     """
-    from openadapt_capture.share import send, receive
+    from openadapt_capture.share import receive, send
 
     if action == "send":
         send(path_or_code)
openadapt_capture/platform/__init__.py

Lines changed: 132 additions & 0 deletions
@@ -0,0 +1,132 @@
+"""Platform-specific implementations for GUI event capture.
+
+This module provides platform-specific implementations for:
+- Screen capture
+- Input event capture
+- Display information (resolution, DPI, pixel ratio)
+
+The module automatically selects the appropriate implementation based on
+the current platform (darwin, win32, linux).
+"""
+
+from __future__ import annotations
+
+import sys
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from typing import Protocol
+
+    class PlatformProvider(Protocol):
+        """Protocol for platform-specific providers."""
+
+        @staticmethod
+        def get_screen_dimensions() -> tuple[int, int]:
+            """Get screen dimensions in physical pixels."""
+            ...
+
+        @staticmethod
+        def get_display_pixel_ratio() -> float:
+            """Get display pixel ratio (physical/logical)."""
+            ...
+
+        @staticmethod
+        def is_accessibility_enabled() -> bool:
+            """Check if accessibility permissions are enabled."""
+            ...
+
+
+def get_platform() -> str:
+    """Get the current platform identifier.
+
+    Returns:
+        'darwin' for macOS, 'win32' for Windows, 'linux' for Linux.
+    """
+    return sys.platform
+
+
+def get_platform_provider() -> "PlatformProvider":
+    """Get the platform-specific provider for the current OS.
+
+    Returns:
+        Platform provider instance for the current operating system.
+
+    Raises:
+        NotImplementedError: If the platform is not supported.
+    """
+    platform = get_platform()
+
+    if platform == "darwin":
+        from openadapt_capture.platform.darwin import DarwinPlatform
+        return DarwinPlatform()
+    elif platform == "win32":
+        from openadapt_capture.platform.windows import WindowsPlatform
+        return WindowsPlatform()
+    elif platform.startswith("linux"):
+        from openadapt_capture.platform.linux import LinuxPlatform
+        return LinuxPlatform()
+    else:
+        raise NotImplementedError(f"Platform not supported: {platform}")
+
+
+def get_screen_dimensions() -> tuple[int, int]:
+    """Get screen dimensions in physical pixels.
+
+    This returns the actual screenshot pixel dimensions, which may be
+    larger than logical dimensions on HiDPI/Retina displays.
+
+    Returns:
+        Tuple of (width, height) in physical pixels.
+    """
+    try:
+        provider = get_platform_provider()
+        return provider.get_screen_dimensions()
+    except (NotImplementedError, ImportError):
+        # Fallback to generic implementation
+        try:
+            from PIL import ImageGrab
+            screenshot = ImageGrab.grab()
+            return screenshot.size
+        except Exception:
+            return (1920, 1080)  # Default fallback
+
+
+def get_display_pixel_ratio() -> float:
+    """Get the display pixel ratio (physical/logical).
+
+    This is the ratio of physical pixels to logical pixels.
+    For example, 2.0 for Retina displays on macOS.
+
+    Returns:
+        Pixel ratio (e.g., 1.0 for standard displays, 2.0 for Retina).
+    """
+    try:
+        provider = get_platform_provider()
+        return provider.get_display_pixel_ratio()
+    except (NotImplementedError, ImportError):
+        return 1.0
+
+
+def is_accessibility_enabled() -> bool:
+    """Check if accessibility permissions are enabled.
+
+    On macOS, this checks if the application has accessibility permissions
+    required for keyboard and mouse event capture.
+
+    Returns:
+        True if accessibility is enabled, False otherwise.
+    """
+    try:
+        provider = get_platform_provider()
+        return provider.is_accessibility_enabled()
+    except (NotImplementedError, ImportError):
+        return True  # Assume enabled on unknown platforms
+
+
+__all__ = [
+    "get_platform",
+    "get_platform_provider",
+    "get_screen_dimensions",
+    "get_display_pixel_ratio",
+    "is_accessibility_enabled",
+]
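The provider-with-fallback pattern this new module uses can be sketched in isolation. Here `screen_dimensions` and its injectable `provider` are illustrative stand-ins for the module's functions, not the package API:

```python
import sys
from typing import Callable, Optional

def screen_dimensions(
    provider: Optional[Callable[[], tuple[int, int]]] = None,
) -> tuple[int, int]:
    """Try the platform provider; fall back to a safe default on failure."""
    try:
        if provider is None:
            # Mirrors get_platform_provider() raising on unsupported platforms.
            raise NotImplementedError(f"no provider for {sys.platform}")
        return provider()
    except (NotImplementedError, ImportError):
        return (1920, 1080)  # same last-resort default as the module

print(screen_dimensions())                      # default fallback
print(screen_dimensions(lambda: (2880, 1800)))  # a Retina-like provider
```

Catching only `NotImplementedError` and `ImportError` keeps the fallback narrow: an unsupported OS or a missing platform dependency degrades gracefully, while genuine bugs in a provider still surface.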
