AppiumTestDistribution · saikrishna321 · May 25, 2026
diff --git a/.agents/skills/use-appclaw-agent-cli/SKILL.md b/.agents/skills/use-appclaw-agent-cli/SKILL.md
@@ -0,0 +1,52 @@
+---
+name: use-appclaw-agent-cli
+description: >
+  Use the appclaw-agent CLI to directly open, inspect, and interact with a
+  mobile app via terminal commands — without writing a YAML flow. Trigger this
+  skill when the user asks to open an app, tap or fill a UI element, check
+  visibility, or perform any one-off device interaction that does not require a
+  reusable flow file.
+---
+
+# AppClaw Agent CLI
+
+When the user asks you to operate or inspect a mobile device interactively
+(open an app, tap a button, check visibility, etc.) using terminal commands
+rather than a YAML flow:
+
+1. Verify that `appclaw-agent` is installed and run `appclaw-agent help workflow`.
+2. Use a descriptive named session for the task.
+3. Inspect with `snapshot -i --json` before choosing a target.
+4. Prefer returned `@eN` references or durable selectors for interaction.
+5. Request a new snapshot after each state-changing action.
+6. Use `--vision` only when explicitly requested or when visual targeting is required and configured.
+
+## Scrolling — direction reference
+
+**Always use `scroll`, never `swipe`.** `scroll` and `swipe` are aliases in the parser, but `scroll` reads unambiguously — `scroll down` means scroll down, `scroll up` means scroll up.
+
+| Goal                                | Command                                                 |
+| ----------------------------------- | ------------------------------------------------------- |
+| See content **below** (scroll down) | `appclaw-agent --session <name> scroll down --json`     |
+| See content **above** (scroll up)   | `appclaw-agent --session <name> scroll up --json`       |
+| Scroll down within an element       | `appclaw-agent --session <name> scroll @eN down --json` |
+| Scroll up within an element         | `appclaw-agent --session <name> scroll @eN up --json`   |
+
+**Never use `swipe`**: `swipe up` is ambiguous. A model's training data usually associates it with scrolling down, while AppClaw treats it as `scroll up`. Using `scroll` removes the confusion.
+
+**Never use `swipe @eN direction`**: element-scoped swipe crashes (`swipeElement is not a function`). Use `scroll @eN direction` instead. As the final workflow step, close the named session when the task is complete.
+
+## Assertions must always be visual
+
+**Never use DOM presence (`is visible`, snapshot element checks) as the sole assertion.** The DOM may contain elements that are off-screen, scrolled out of view, or clipped — DOM presence does not mean the user can see it.
+
+For every assertion or verification step:
+
+1. Take a screenshot: `appclaw-agent --session <name> screenshot /tmp/<name>.png`
+2. Read the screenshot image with the Read tool and visually analyze what is actually rendered on screen.
+3. Base your pass/fail verdict **only on what you can see in the screenshot**, not on DOM presence.
+4. If the target content is not clearly visible in the screenshot, the assertion **fails** — even if a DOM element exists for it.
+
+This applies to any check phrased as "verify X is present", "confirm X appears", "assert X is visible", or similar.
+
+The installed CLI help is the source of truth for supported commands.
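
The numbered workflow in the skill file above can be sketched as a dry run. This is a minimal sketch rather than documented CLI behavior: the `run` helper and the session name `login-check` are hypothetical, the script only prints the commands it would issue, and the subcommands are limited to those that appear in this diff (`open`, `snapshot`, `press`, `scroll`, `screenshot`, `close`). `appclaw-agent help workflow` remains the authority on what is actually supported.

```sh
#!/bin/sh
# Dry-run sketch: prints each command instead of executing it.
# SESSION and run() are hypothetical names used for illustration only.
SESSION="login-check"

run() {
  # "$*" joins the arguments with single spaces for display.
  echo "appclaw-agent --session $SESSION $*"
}

run open com.example.app --platform android   # named session for the task
run snapshot -i --json                        # inspect before choosing a target
run press @e1 --json                          # act on a returned @eN reference
run snapshot -i --json                        # re-inspect after a state change
run scroll down --json                        # scroll, never swipe
run screenshot "/tmp/$SESSION.png"            # capture evidence for a visual assertion
run close                                     # close the session when done
```

Replacing the `echo` in `run` with a direct invocation would execute the commands for real; that step assumes the CLI is installed and a device is attached.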
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -18,10 +18,10 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '22'
 
       - name: Install dependencies
-        run: npm install --no-package-lock
+        run: npm ci
 
       - name: Format check (Prettier)
         run: npm run format:check
@@ -43,13 +43,13 @@ jobs:
 
       - uses: actions/setup-node@v4
         with:
-          node-version: '20'
+          node-version: '22'
           cache: npm
           cache-dependency-path: vscode-extension/package-lock.json
 
       - name: Install dependencies
         working-directory: vscode-extension
-        run: npm ci
+        run: npm install --no-package-lock
 
       - name: Build
         working-directory: vscode-extension
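
The two jobs above now install dependencies differently, and `npm ci` only works when a lockfile is present. As a hedged illustration of that constraint (the `install_cmd` helper is hypothetical and not part of this workflow), a script could pick the command based on whether `package-lock.json` exists in the target directory:

```sh
#!/bin/sh
# Hypothetical helper: print the install command suited to a directory.
# npm ci needs an existing lockfile; otherwise fall back to a plain install.
install_cmd() {
  if [ -f "$1/package-lock.json" ]; then
    echo "npm ci"
  else
    echo "npm install --no-package-lock"
  fi
}

install_cmd .
install_cmd vscode-extension
```

The second job still sets `cache-dependency-path: vscode-extension/package-lock.json`, so the cache key assumes that file is committed even though the install step no longer reads it.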

diff --git a/.github/workflows/publish-agent.yml b/.github/workflows/publish-agent.yml
@@ -0,0 +1,43 @@
+name: Release appclaw-agent
+
+on:
+  push:
+    branches: [main]
+  workflow_dispatch:
+
+permissions:
+  contents: write
+  issues: write
+  pull-requests: write
+  id-token: write
+
+jobs:
+  release:
+    name: Semantic Release
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: packages/appclaw-agent
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          persist-credentials: false
+
+      - uses: actions/setup-node@v4
+        with:
+          node-version: '22'
+          registry-url: 'https://registry.npmjs.org'
+
+      - name: Install dependencies
+        run: npm install --no-package-lock
+
+      - name: Build
+        run: npm run build
+
+      - name: Semantic Release
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+        run: npx semantic-release
diff --git a/README.md b/README.md
@@ -73,7 +73,7 @@ Screenshot-first mode using Stark (df-vision + Gemini) for element location. Req
 ```env
 LLM_PROVIDER=gemini
 LLM_API_KEY=your-gemini-api-key
-LLM_MODEL=gemini-3.1-flash-lite-preview
+LLM_MODEL=gemini-3.1-flash-lite
 AGENT_MODE=vision
 ```
 
@@ -285,7 +285,7 @@ All configuration is via `.env`:
 | **LLM**               |           |                                                                                                       |
 | `LLM_PROVIDER`        | `gemini`  | LLM provider (`anthropic`, `openai`, `gemini`, `groq`, `ollama`)                                      |
 | `LLM_API_KEY`         | —         | API key for your provider (not used for local Ollama; see `OLLAMA_*` for cloud URL / auth)            |
-| `LLM_MODEL`           | (auto)    | Model override (e.g. `gemini-3.1-flash-lite-preview`, `claude-sonnet-4-20250514`)                     |
+| `LLM_MODEL`           | (auto)    | Model override (e.g. `gemini-3.1-flash-lite`, `claude-sonnet-4-20250514`)                             |
 | `OLLAMA_BASE_URL`     | (default) | Ollama API base URL (e.g. remote or Docker). Empty = `http://127.0.0.1:11434` (`LLM_PROVIDER=ollama`) |
 | `OLLAMA_API_KEY`      | —         | Optional Bearer token for Ollama Cloud or authenticated endpoints (`LLM_PROVIDER=ollama`)             |
 | `AGENT_MODE`          | `vision`  | `dom` (XML locators) or `vision` (screenshot-first)                                                   |
@@ -377,6 +377,30 @@ This installs two skills:
 
 Skills are auto-discovered if you're working inside a clone of this repo.
 
+## Agent-Driven Device CLI
+
+For Claude Code, Gemini CLI, Codex CLI, and other agents that can run terminal
+commands, install the separate agent-native CLI:
+
+```sh
+npm install -g appclaw-agent
+appclaw-agent help workflow
+```
+
+`appclaw-agent` maintains named device sessions across commands and returns
+compact UI references for deterministic interaction:
+
+```sh
+appclaw-agent --session login open com.example.app --platform android
+appclaw-agent --session login snapshot -i --json
+appclaw-agent --session login press @e1 --json
+appclaw-agent --session login close
+```
+
+Install the `use-appclaw-agent-cli` skill to teach a supported agent this
+workflow. Vision operations are available explicitly through `--vision` when
+AppClaw vision is configured.
+
 ## License
 
 Licensed under the Apache License, Version 2.0. See `LICENSE` for the full text.
diff --git a/landing/index.html b/landing/index.html
@@ -1852,7 +1852,12 @@ <h1 class="reveal reveal-delay-1">
         <div class="providers-label">Where you run it</div>
         <p class="run-surface-text">
           <a href="https://www.npmjs.com/package/appclaw" target="_blank" rel="noopener">CLI</a>
-          (<code>npx appclaw</code>) for terminals and CI, or the
+          (<code>npx appclaw</code>) for terminals and CI,
+          <a href="https://www.npmjs.com/package/appclaw-agent" target="_blank" rel="noopener"
+            >Agent CLI</a
+          >
+          (<code>npm i -g appclaw-agent</code>) for AI coding agents like Claude Code and Gemini
+          CLI, or the
           <a
             href="https://marketplace.visualstudio.com/items?itemName=AppClaw.appclaw"
             target="_blank"
@@ -2328,7 +2333,8 @@ <h2 class="reveal reveal-delay-1">Ready to automate your<br />mobile apps?</h2>
           </a>
         </div>
         <div class="cta-meta reveal">
-          CLI<span>·</span>Cursor &amp; VS Code<span>·</span>Apache 2.0<span>·</span>BYO LLM Key
+          CLI<span>·</span>Agent CLI<span>·</span>Cursor &amp; VS Code<span>·</span>Apache
+          2.0<span>·</span>BYO LLM Key
         </div>
       </div>
     </section>