The workflow-name-based badge URL was showing "no status" because GitHub requires at least one workflow run on the specified branch before that badge resolves. The filename-based URL format (actions/workflows/publish.yml/badge.svg) is more reliable and works regardless of when the workflow last ran.
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
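For reference, the filename-based badge form described above looks like the following sketch (OWNER/REPO are placeholders, not the actual repository path):

```markdown
![publish](https://github.com/OWNER/REPO/actions/workflows/publish.yml/badge.svg)
```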
- ci: Remove build_command from semantic-release config (db01f86)

  The python-semantic-release action runs in a Docker container where uv is not available. Let the workflow handle building instead.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
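After this change, the relevant pyproject.toml section might look like the sketch below. The `version_toml` path is an assumption about where this project tracks its version; the point is that with no `build_command`, python-semantic-release only bumps the version and tags, leaving the build to the workflow:

```toml
[tool.semantic_release]
# No build_command here: the GitHub Actions workflow builds the
# distribution (e.g. with uv) before the publish step runs.
version_toml = ["pyproject.toml:project.version"]
```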
- Switch to python-semantic-release for automated versioning (a2dd7d5)

  Replaces manual tag-triggered publish with python-semantic-release:
  - Automatic version bumping based on conventional commits (feat: -> minor, fix:/perf: -> patch)
  - Creates GitHub releases automatically
  - Publishes to PyPI on release

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- uitars: Fix Dockerfile for vLLM deployment (5d457ed)
  - Fix CMD format: vLLM image has ENTRYPOINT, CMD should be args only
  - Fix --limit-mm-per-prompt format: use KEY=VALUE instead of JSON
  - Reduce max-model-len from 32768 to 8192 to fix CUDA OOM on L4 24GB
  - Remove model pre-download (causes disk space issues; download at runtime)
  - Increase health check start-period to 600s for model download

  Also adds CLI commands:
  - cleanup: docker system prune for disk space recovery
  - wait: poll for server health with configurable timeout
  - setup_autoshutdown: create CloudWatch/Lambda infrastructure
  - build --clean: option to clean up before building
  - logs: fix stderr capture

  Updates CLAUDE.md with non-interactive operations requirements.
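The CMD fixes above can be sketched as follows. This is illustrative, not the actual Dockerfile: MODEL_ID is a placeholder, and it assumes the upstream vllm/vllm-openai base image, which already defines an ENTRYPOINT that launches the OpenAI-compatible server, so CMD must contain arguments only:

```dockerfile
FROM vllm/vllm-openai:latest

# Base image supplies the ENTRYPOINT (the vLLM OpenAI API server),
# so CMD is exec-form arguments only -- no executable.
# --limit-mm-per-prompt uses KEY=VALUE form, not JSON.
# 8192 context fits an L4's 24 GB where 32768 hit CUDA OOM.
CMD ["--model", "MODEL_ID", \
     "--max-model-len", "8192", \
     "--limit-mm-per-prompt", "image=1"]
```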
- Prepare for PyPI publishing and update Gemini models (7be8b1f)
  - Add PyPI metadata: maintainers, classifiers, keywords, project URLs
  - Create LICENSE file (MIT)
  - Add GitHub workflow for trusted PyPI publishing
  - Update Google provider to use current Gemini models (3.x, 2.5.x)
  - Remove deprecated models (2.0, 1.5) that are retired or retiring
  - Update tests to reflect new model names

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add literature review, experiment plan, and evaluation harness (13e69f8)
  - Add literature review: UI-TARS (61.6%), OmniParser (39.6%), ScreenSeekeR (+254%)
  - Add experiment plan: comparison of 6 methods across 3 datasets
  - Add evaluation harness with metrics and dataset formats
  - Update README with documentation links
  - Add test assets from OmniParser deployment
  - Fix Dockerfile for Conda ToS and PaddleOCR compatibility
  - Add deploy CLI commands: logs, ps, build, run, test
- Move outdated robust_detection to legacy, update evaluation format (2f32d31)
  - Move robust_detection.md to docs/legacy/ (superseded by ScreenSeekeR approach)
  - Update evaluation.md Sections 6-7 to align with the new experiment plan
  - Compare OmniParser vs UI-TARS instead of baseline vs robust transforms
- readme: Add CLI usage examples with output (4d5f392)
  - Add status, ps, logs command output examples
  - Show deploy workflow with .env setup
  - Document all available commands
- readme: Use uv sync for dev setup (8087ca5)
- Add UI-TARS deployment and client (da67e66)
  - Add UITarsSettings config class for UI-TARS deployment
  - Create deploy/uitars module with vLLM-based Dockerfile
  - Implement UITarsClient for grounding via OpenAI-compatible API
  - Add GroundingResult dataclass with coordinate conversion
  - Include smart_resize() for Qwen2.5-VL coordinate scaling
  - Add [uitars] optional dependency group (openai)
  - Update CLAUDE.md with UI-TARS CLI commands
  - Update README.md with usage examples and API docs
  - Add uitars_deployment_design.md with full design spec
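The smart_resize() coordinate scaling mentioned above follows the resizing logic published with Qwen2-VL/Qwen2.5-VL: round each side to a multiple of the patch-merge factor (28) while keeping the total pixel count inside a budget. A minimal sketch, assuming the upstream default factor and pixel limits (this module's actual implementation may differ):

```python
import math


def smart_resize(height: int, width: int, factor: int = 28,
                 min_pixels: int = 56 * 56,
                 max_pixels: int = 14 * 14 * 4 * 1280) -> tuple[int, int]:
    """Return (h_bar, w_bar): dimensions rounded to multiples of `factor`,
    rescaled so h_bar * w_bar stays within [min_pixels, max_pixels]."""
    h_bar = round(height / factor) * factor
    w_bar = round(width / factor) * factor
    if h_bar * w_bar > max_pixels:
        # Too many pixels: shrink both sides by the same ratio, rounding down.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    elif h_bar * w_bar < min_pixels:
        # Too few pixels: grow both sides by the same ratio, rounding up.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor
    return h_bar, w_bar
```

Since the model emits coordinates in the resized space, converting back to the original screenshot is a simple rescale, e.g. `x_orig = x_model * width / w_bar`.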
- deploy: Add auto-shutdown and fix PaddleOCR compatibility (ab758df)
  - Pin PaddleOCR to v2.8.1 for API compatibility
  - Add auto-shutdown for cost management
  - Add config.py and .env.example
- deploy: Add CLI commands and fix transformers version (9e6398b)
  - Add logs, ps, build, run, test CLI commands
  - Add CLAUDE.md with deployment instructions
  - Pin transformers==4.44.2 for Florence-2 compatibility
- eval: Add evaluation framework for comparing grounding methods (064a431)

  Implement comprehensive evaluation framework:
  - Dataset schema with AnnotatedElement, Sample, Dataset classes
  - Synthetic UI dataset generator (buttons, icons, text, links)
  - Evaluation methods for OmniParser and UI-TARS
  - Cropping strategies: baseline, fixed, ScreenSeekeR-style
  - Metrics: detection rate, IoU, latency by size/type
  - Results storage and multi-method comparison
  - Visualization: charts (matplotlib) and console tables
  - CLI: generate, run, compare, list commands

  Usage:
  - `python -m openadapt_grounding.eval generate --type synthetic --count 500`
  - `python -m openadapt_grounding.eval run --method omniparser --dataset synthetic`
  - `python -m openadapt_grounding.eval compare --charts-dir evaluation/charts`

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
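The IoU metric named above is the standard intersection-over-union of predicted and annotated bounding boxes. A self-contained sketch (the framework's own metric code is not shown here and may differ in box convention):

```python
def iou(box_a: tuple, box_b: tuple) -> float:
    """Intersection-over-union of two (x1, y1, x2, y2) boxes.

    Returns 0.0 for disjoint boxes, 1.0 for identical ones.
    """
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so non-overlapping boxes contribute no intersection.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```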
- eval: Add synthetic_hard evaluation dataset (c97cf70)

  Add a more challenging synthetic evaluation dataset with 48 samples for testing VLM API providers and grounding methods. Contains synthetic UI screenshots with annotations for element localization testing.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- providers: Add VLM API providers for Claude, GPT, and Gemini (9a766d1)

  Add a unified provider abstraction for Visual Language Model APIs:
  - Base provider class with coordinate normalization and response parsing
  - Anthropic provider for Claude models (claude-sonnet-4-20250514)
  - OpenAI provider for GPT models (gpt-4o)
  - Google provider for Gemini models (gemini-2.0-flash-exp)

  Features:
  - Lazy loading with optional dependencies per provider
  - Factory function get_provider() with name aliases
  - Coordinate extraction from model responses with regex fallback
  - Image encoding utilities (base64 conversion)
  - Comprehensive test suite with mocking

  Optional dependencies added to pyproject.toml:
  - providers-anthropic, providers-openai, providers-google
  - providers (all providers combined)

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
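The name aliasing and regex-fallback coordinate extraction described above can be sketched as follows. The alias table, function names, and regex are hypothetical illustrations of the idea, not the actual `get_provider()` implementation in this package:

```python
import re

# Hypothetical alias table mapping friendly model-family names to
# provider module names, mirroring the aliases described above.
_ALIASES = {
    "claude": "anthropic",
    "gpt": "openai",
    "gemini": "google",
}


def resolve_provider_name(name: str) -> str:
    """Map a model-family alias (e.g. 'claude') to its provider name."""
    name = name.lower()
    return _ALIASES.get(name, name)


# Matches the first "(x, y)" or "x, y" pair, with optional decimals.
_COORD_RE = re.compile(r"\(?\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*\)?")


def extract_point(text: str, width: int, height: int) -> tuple[float, float]:
    """Regex fallback: pull the first (x, y) pair out of a free-form
    model response and normalize it to [0, 1] against the image size."""
    m = _COORD_RE.search(text)
    if m is None:
        raise ValueError(f"no coordinates found in: {text!r}")
    x, y = float(m.group(1)), float(m.group(2))
    return x / width, y / height
```

Lazy loading would then amount to importing the resolved provider module (and its optional SDK dependency) only inside the factory, so that installing one provider extra does not require the others.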