firede
diff --git a/README.md b/README.md
Lines changed: 131 additions & 50 deletions
@@ -1,101 +1,147 @@
 # agent-fetch
 
-A Go CLI that always tries to return Markdown for web pages in AI-agent workflows.
+Fetch web pages as clean Markdown for AI-agent workflows.
 
 [中文](./README.zh.md) | **English**
 
-## Why
+[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](./LICENSE)
+[![Go](https://github.com/firede/agent-fetch/actions/workflows/ci.yml/badge.svg)](https://github.com/firede/agent-fetch/actions/workflows/ci.yml)
+[![Release](https://img.shields.io/github/v/release/firede/agent-fetch)](https://github.com/firede/agent-fetch/releases)
 
-Web fetch results are often raw HTML/JS/CSS, which is noisy for LLMs. This tool wraps a fallback pipeline so agents can expect Markdown output.
+## Highlights
 
-If you use tools like Codex or Claude Code, note that they may already include built-in HTML simplification/fetching. Whether you still need `agent-fetch` depends on your workflow.
+- **Markdown-first output pipeline** -- readability extraction + HTML-to-Markdown conversion, so agents receive clean text instead of noisy HTML/JS/CSS
+- **Headless browser fallback** -- renders JavaScript-heavy pages (SPAs, dynamic dashboards) when static extraction falls short
+- **Custom request headers** -- send `Authorization`, `Cookie`, or any header to access authenticated endpoints
+- **Multi-URL batch fetching** -- fetch multiple pages concurrently with structured, per-URL output
 
-## Behavior
+## Quick Start
 
-`agent-fetch` uses four modes:
-
-- `auto` (default):
-  - Request with `Accept: text/markdown`
-  - If response is `text/markdown` or already looks like Markdown, return it
-  - Else do static HTML extraction + convert to Markdown (inject `title`/`description` front matter by default)
-  - If static result quality is too low, fallback to headless browser render and convert (also inject front matter by default)
-- `static`: never uses browser fallback
-- `browser`: always uses headless browser
-- `raw`: send `Accept: text/markdown`, then print that single HTTP response body as-is (no fallback/conversion)
-- `--meta` (default `true`): control whether non-`raw` outputs include front matter (`title`/`description`). For `auto`/`static` direct markdown responses, it may do one extra HTML request to collect metadata.
-- One or more URL arguments are supported. For multiple URLs, requests run concurrently (configurable via `--concurrency`) and output is emitted in input order.
-
-## Runtime dependency
-
-`browser` mode requires a Chrome/Chromium browser available on the host.
+```bash
+# Install
+go install github.com/firede/agent-fetch/cmd/agent-fetch@latest
 
-`auto` mode may fall back to browser rendering, so it can also require Chrome/Chromium on some pages.
+# Fetch a page
+agent-fetch https://example.com
+```
 
-Use `--mode static` or `--mode raw` to avoid browser dependency.
+Or download a prebuilt binary from [Releases](https://github.com/firede/agent-fetch/releases).
 
-## Install (with Go)
+## How It Works
 
-If Go is already installed locally:
+In the default `auto` mode, agent-fetch runs a three-stage fallback pipeline:
 
-```bash
-go install github.com/firede/agent-fetch/cmd/agent-fetch@latest
+```
+Request with Accept: text/markdown
+        |
+        v
+  Markdown response? --yes--> Return as-is
+        | no
+        v
+  Static HTML extraction
+  + Markdown conversion --quality OK?--> Return
+        | no
+        v
+  Headless browser render
+  + extraction + conversion --> Return
 ```
 
-Install a specific version:
+This means most pages are handled without a browser, keeping things fast, while JS-heavy pages still get rendered correctly.
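
The last two stages of this pipeline can be approximated by hand with the documented `--mode static` and `--mode browser` flags. The sketch below is only an illustration: `fetch_with_fallback` and the `AGENT_FETCH` override are hypothetical names, and the "non-empty output" check is an assumed stand-in for the tool's internal quality check, whose actual criteria are not described here.

```shell
# Hypothetical helper: try static extraction first, then fall back to the browser.
# AGENT_FETCH is an assumed override so the binary can be swapped out (e.g. in tests).
fetch_with_fallback() {
  url=$1
  cmd=${AGENT_FETCH:-agent-fetch}
  # Stand-in for stage 2: treat a failed or empty static result as "low quality".
  if out=$("$cmd" --mode static "$url") && [ -n "$out" ]; then
    printf '%s\n' "$out"
    return 0
  fi
  # Stage 3: headless browser render (needs Chrome/Chromium on the host).
  "$cmd" --mode browser "$url"
}
```

In normal use `--mode auto` already performs this fallback. A wrapper like this is only useful if you want your own criterion for when to pay the cost of a browser render.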
 
-```bash
-go install github.com/firede/agent-fetch/cmd/agent-fetch@v0.2.0
-```
+## Modes
 
-Make sure `$(go env GOPATH)/bin` (usually `~/go/bin`) is in your `PATH`.
+| Mode | Behavior | Browser needed |
+|------|----------|----------------|
+| `auto` (default) | Three-stage fallback: native Markdown -> static extraction -> browser render | Only when static quality is low |
+| `static` | Static HTML extraction only, no browser | No |
+| `browser` | Always use headless Chrome/Chromium | Yes |
+| `raw` | Send `Accept: text/markdown`, return HTTP body verbatim | No |
 
-## Install (from Releases)
+## Installation
 
-1. Download the archive for your platform from the [GitHub Releases](https://github.com/firede/agent-fetch/releases) page.
-2. Extract it and make the binary executable:
+### From Releases
+
+1. Download the archive for your platform from [GitHub Releases](https://github.com/firede/agent-fetch/releases).
+2. Extract and make the binary executable:
 
 ```bash
 chmod +x ./agent-fetch
 ```
 
-### macOS note
+3. Move the binary to a directory on your `PATH`, or run it directly:
 
-Current release binaries are not notarized by Apple (no Apple Developer notarization yet), so Gatekeeper may show:
+```bash
+./agent-fetch https://example.com
+```
 
-`“agent-fetch” cannot be opened because Apple cannot check it for malicious software.`
+#### macOS note
 
-For local validation, remove the quarantine attribute and run:
+Release binaries are not yet notarized by Apple, so Gatekeeper may block execution. Remove the quarantine attribute to proceed:
 
 ```bash
 xattr -dr com.apple.quarantine ./agent-fetch
-./agent-fetch https://example.com
 ```
 
-## Agent Skills
+### With Go
+
+```bash
+go install github.com/firede/agent-fetch/cmd/agent-fetch@latest
+```
 
-Skill location: [`skills/agent-fetch`](./skills/agent-fetch/SKILL.md).
+Install a specific version:
 
-It includes installation guidance and usage instructions for `agent-fetch`.
+```bash
+go install github.com/firede/agent-fetch/cmd/agent-fetch@v0.3.0
+```
+
+Ensure `$(go env GOPATH)/bin` (usually `~/go/bin`) is in your `PATH`.
 
 ## Usage
 
 ```bash
-agent-fetch <url> [url ...]
+agent-fetch [options] <url> [url ...]
 ```
 
-Common flags:
+### Flags
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--mode` | `auto` | Fetch mode: `auto` \| `static` \| `browser` \| `raw` |
+| `--meta` | `true` | Prepend `title`/`description` front matter (use `--meta=false` to disable) |
+| `--timeout` | `20s` | HTTP request timeout (applies to static/auto modes) |
+| `--browser-timeout` | `30s` | Page-load timeout (applies to browser/auto modes) |
+| `--network-idle` | `1200ms` | Wait time after last network activity before capturing content |
+| `--wait-selector` | | CSS selector to wait for before capturing, e.g. `article` |
+| `--header` | | Custom request header, repeatable. e.g. `--header 'Authorization: Bearer token'` |
+| `--user-agent` | `agent-fetch/0.1` | User-Agent header |
+| `--max-body-bytes` | `8388608` | Max response bytes to read (default is 8 MiB) |
+| `--concurrency` | `4` | Max concurrent fetches for multi-URL requests |
+
+### Examples
 
 ```bash
-agent-fetch --mode auto --timeout 20s --browser-timeout 30s https://example.com
+# Default auto mode
+agent-fetch https://example.com
+
+# Force browser rendering for a JS-heavy page
 agent-fetch --mode browser --wait-selector 'article' https://example.com
+
+# Static extraction without front matter
 agent-fetch --mode static --meta=false https://example.com
+
+# Get raw HTTP response body
 agent-fetch --mode raw https://example.com
-agent-fetch --mode static --concurrency 4 https://example.com https://example.org
+
+# Authenticated request
 agent-fetch --header 'Authorization: Bearer <token>' https://example.com
+
+# Batch fetch with concurrency control
+agent-fetch --concurrency 4 https://example.com https://example.org
 ```
 
-Fetched content is printed to `stdout` (`raw` mode prints the single HTTP response body unprocessed).  
-For multiple URLs, output uses task markers so each result maps back to its input URL:
+## Multi-URL Batch
+
+When multiple URLs are provided, requests run concurrently (controlled by `--concurrency`) and output is emitted in input order using task markers:
 
 ```text
 <!-- count: 3, succeeded: 2, failed: 1 -->
@@ -106,7 +152,42 @@ For multiple URLs, output uses task markers so each result maps back to its inpu
 <!-- error[2]: ... -->
 ```
 
-Exit code is `0` when all tasks succeed, `1` when any task fails, and `2` for argument/usage errors.
+Exit codes: `0` all succeeded, `1` any task failed, `2` argument/usage error.
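
A caller can branch on these exit codes and, for batch runs, read the summary marker on the first output line. The following is a minimal sketch, assuming the summary line always has the `<!-- count: N, succeeded: N, failed: N -->` shape shown in the example output above. `describe_exit` and `failed_count` are hypothetical helper names, not part of agent-fetch.

```shell
# Hypothetical helper: map an agent-fetch exit code to a short label.
describe_exit() {
  case $1 in
    0) echo "all succeeded" ;;
    1) echo "at least one task failed" ;;
    2) echo "argument/usage error" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Hypothetical helper: read batch output on stdin and print the "failed" count
# from the summary marker. Assumes the marker format shown in the example output.
failed_count() {
  sed -n 's/^<!-- count: [0-9]*, succeeded: [0-9]*, failed: \([0-9]*\) -->$/\1/p' | head -n 1
}
```

For instance, after running `agent-fetch` on several URLs with output redirected to `out.md`, calling `describe_exit $?` reports the overall outcome, and `failed_count < out.md` prints how many URLs failed.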
+
+## Agent Integration
+
+This project ships a [SKILL.md](./skills/agent-fetch/SKILL.md) that can be used with coding agents that support skill files. Point your skill directory to `skills/agent-fetch` and the agent will be able to invoke `agent-fetch` when its built-in fetch capability is insufficient.
+
+`agent-fetch` takes URLs as command-line arguments and writes Markdown to stdout, making it easy to integrate into any agent pipeline or shell-based tool call:
+
+```bash
+result=$(agent-fetch --mode static https://example.com)
+```
+
+## When Do You Need This?
+
+The table below compares agent-fetch with the built-in web-fetch capabilities found in some coding agents. Actual built-in capabilities vary by product and version.
+
+| Scenario | Built-in web fetch | agent-fetch |
+|----------|:------------------:|:-----------:|
+| Basic page fetch with HTML simplification | Yes | Yes |
+| JavaScript-rendered pages (SPAs) | Varies | Yes (headless browser) |
+| Custom headers (auth, cookies) | Varies | Yes (`--header`) |
+| No AI summarization (outputs extracted body as-is) | Varies | Yes (subject to `--max-body-bytes`) |
+| Batch fetch multiple URLs concurrently | Varies | Yes (`--concurrency`) |
+| Wait for a CSS selector before capturing | Varies | Yes (`--wait-selector`) |
+| Works outside coding agents (CLI, CI/CD) | N/A | Yes (standalone CLI) |
+
+**How built-in web fetch typically works:** Tools like Claude Code's WebFetch and Codex's built-in fetch retrieve a page over HTTP, convert the HTML to Markdown, and then pass the content through an AI model that may summarize or truncate it to fit the context window. This pipeline is fast and sufficient for most pages, but it usually does not execute JavaScript (so SPA or JS-rendered pages may return incomplete content), does not support custom request headers, and processes one URL at a time.
+
+- **No built-in web fetch available** (other agent frameworks, CLI pipelines, CI/CD) -- use agent-fetch as your primary fetch tool.
+- **Built-in web fetch available** -- use agent-fetch as a complement for JS-heavy pages, authenticated endpoints, batch fetching, or when you need the extracted content without summarization.
+
+## Runtime Dependencies
+
+`browser` mode requires Chrome or Chromium on the host. `auto` mode needs it only when it falls back to browser rendering.
+
+Use `--mode static` or `--mode raw` to avoid the browser dependency entirely.
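
A wrapper script can choose a mode based on whether a browser appears to be installed. This is a rough sketch: the executable names in `BROWSER_CANDIDATES` are common Linux names and are an assumption, `pick_mode` is a hypothetical helper, and agent-fetch itself may locate Chrome/Chromium differently (for example on macOS, where the browser is usually not on `PATH`).

```shell
# Hypothetical helper: print "auto" if a likely browser binary is on PATH, else "static".
# The candidate names are assumptions; override BROWSER_CANDIDATES for your platform.
pick_mode() {
  # Unquoted on purpose so the space-separated list splits into words.
  for name in ${BROWSER_CANDIDATES:-google-chrome chromium chromium-browser}; do
    if command -v "$name" >/dev/null 2>&1; then
      echo auto
      return 0
    fi
  done
  echo static
}
```

It could then be used as `agent-fetch --mode "$(pick_mode)" https://example.com`, so hosts without a browser never attempt the browser fallback.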
 
 ## Build