
Commit 9a598d7

VinciGit00 and claude committed
docs: add monitor.activity(), fix API base URL, update skill to just-scrape
- Add monitor.activity() tick-history docs to Python SDK, JS SDK, and MCP server
- Fix API base URL from /v2 to /api/v2 across all SDK and MCP docs
- Rename SGAI_TIMEOUT_S → SGAI_TIMEOUT (with legacy alias noted)
- Update Claude Code skill to reference just-scrape CLI and new install paths
- Bump MCP server tool count from 17 → 18

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d06063c commit 9a598d7

5 files changed

Lines changed: 116 additions & 53 deletions


integrations/claude-code-skill.mdx

Lines changed: 45 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -6,14 +6,16 @@ icon: '/logo/claude-color.svg'
66

77
## Overview
88

9-
The ScrapeGraphAI [Claude Code Skill](https://github.com/ScrapeGraphAI/skill) gives AI coding agents full access to ScrapeGraphAI's web scraping, search, and crawling APIs. Once installed, agents like Claude Code, Cursor, Copilot, and Cline can scrape websites, extract structured data, and crawl pages — all from natural language prompts.
9+
The ScrapeGraphAI Claude Code Skill ships with [just-scrape](https://github.com/ScrapeGraphAI/just-scrape), the official CLI for the **v2 API**. Once installed, agents like Claude Code, Cursor, Copilot, Cline, and Windsurf can scrape websites, extract structured data, search the web, crawl sites, and set up page-change monitors — all from natural language prompts.
10+
11+
The skill wires `just-scrape` into your agent's skill directory so the agent knows when and how to invoke the CLI.
1012

1113
<Card
1214
title="GitHub Repository"
1315
icon="github"
14-
href="https://github.com/ScrapeGraphAI/skill"
16+
href="https://github.com/ScrapeGraphAI/just-scrape"
1517
>
16-
View the skill source code and documentation
18+
Browse the CLI and skill source
1719
</Card>
1820

1921
## Installation
@@ -22,21 +24,25 @@ The ScrapeGraphAI [Claude Code Skill](https://github.com/ScrapeGraphAI/skill) gi
2224

2325
### Option 1: Install via skills.sh (Recommended)
2426

25-
The fastest way to install. Requires [Node.js](https://nodejs.org).
27+
The fastest way to install. Requires [Node.js](https://nodejs.org) or [Bun](https://bun.sh).
2628

2729
```bash
28-
npx skills add ScrapeGraphAI/skill
30+
bunx skills add https://github.com/ScrapeGraphAI/just-scrape
31+
# or
32+
npx skills add https://github.com/ScrapeGraphAI/just-scrape
2933
```
3034

31-
This clones the skill and symlinks it into your `~/.claude/skills/` directory automatically.
35+
This symlinks `skills/just-scrape/SKILL.md` into your `~/.claude/skills/` directory automatically.
36+
37+
You can also browse the published skill at [skills.sh/scrapegraphai/just-scrape/just-scrape](https://skills.sh/scrapegraphai/just-scrape/just-scrape).
3238

3339
### Option 2: Manual install
3440

3541
Clone the repository and create the symlink yourself:
3642

3743
```bash
38-
git clone https://github.com/ScrapeGraphAI/skill.git ~/.claude/skills/scrapegraphai
39-
ln -sf ~/.claude/skills/scrapegraphai/SKILL.md ~/.claude/skills/scrapegraphai.md
44+
git clone https://github.com/ScrapeGraphAI/just-scrape.git ~/.claude/skills/just-scrape
45+
ln -sf ~/.claude/skills/just-scrape/skills/just-scrape/SKILL.md ~/.claude/skills/just-scrape.md
4046
```
4147

4248
### Option 3: Project-level install
@@ -45,58 +51,54 @@ Install the skill for a single project only:
4551

4652
```bash
4753
mkdir -p .claude/skills
48-
git clone https://github.com/ScrapeGraphAI/skill.git .claude/skills/scrapegraphai
49-
ln -sf .claude/skills/scrapegraphai/SKILL.md .claude/skills/scrapegraphai.md
54+
git clone https://github.com/ScrapeGraphAI/just-scrape.git .claude/skills/just-scrape
55+
ln -sf .claude/skills/just-scrape/skills/just-scrape/SKILL.md .claude/skills/just-scrape.md
5056
```
5157

5258
</Steps>
5359

5460
## Setup
5561

56-
Set your ScrapeGraphAI API key as an environment variable:
62+
Install the CLI and set your ScrapeGraphAI API key:
5763

5864
```bash
65+
npm install -g just-scrape@latest
5966
export SGAI_API_KEY="sgai-..."
6067
```
6168

6269
<Note>
63-
Get your API key from the [dashboard](https://scrapegraphai.com/dashboard).
70+
Get your API key from the [dashboard](https://scrapegraphai.com/dashboard). The CLI also accepts the key via a `.env` file, `~/.scrapegraphai/config.json`, or an interactive prompt.
6471
</Note>
6572

66-
## What's Included
67-
68-
The skill installs the following files:
69-
70-
| File | Description |
71-
|------|-------------|
72-
| `SKILL.md` | Main skill file with API reference, examples, and decision guide |
73-
| `references/api-endpoints.md` | Full parameter tables for all endpoints |
74-
| `references/sdk-examples.md` | Python and JavaScript SDK examples |
75-
| `references/advanced-features.md` | Stealth mode, schemas, scrolling, pagination, and more |
76-
7773
## Capabilities
7874

75+
The skill maps to the v2 API surface via `just-scrape`:
76+
7977
<CardGroup cols={2}>
80-
<Card title="SmartScraper" icon="wand-magic-sparkles">
81-
Extract structured data from any webpage using natural language prompts
78+
<Card title="extract" icon="wand-magic-sparkles">
79+
Extract structured data from any URL using AI (`just-scrape extract`)
8280
</Card>
83-
<Card title="SearchScraper" icon="magnifying-glass">
84-
Search the web and extract results with AI or as markdown
81+
<Card title="search" icon="magnifying-glass">
82+
Search the web and extract structured results (`just-scrape search`)
8583
</Card>
86-
<Card title="Markdownify" icon="file-code">
87-
Convert any webpage into clean, formatted markdown
84+
<Card title="scrape" icon="file-code">
85+
Fetch a page in 8 formats: markdown, html, screenshot, branding, links, images, summary, json
8886
</Card>
89-
<Card title="SmartCrawler" icon="spider">
90-
Crawl multiple pages from a website with depth and path controls
87+
<Card title="markdownify" icon="file-lines">
88+
Convert any webpage into clean markdown (wraps `scrape -f markdown`)
9189
</Card>
92-
<Card title="Sitemap" icon="sitemap">
93-
Extract all URLs from a website's sitemap
90+
<Card title="crawl" icon="spider">
91+
Crawl multi-page sites with depth, link, and pattern controls
9492
</Card>
95-
<Card title="Agentic Scraper" icon="robot">
96-
Browser automation — login, click, navigate, fill forms, then extract
93+
<Card title="monitor" icon="clock">
94+
Schedule page-change monitors with cron intervals, webhooks, and activity polling
9795
</Card>
9896
</CardGroup>
9997

98+
<Note>
99+
Removed from v1: `sitemap`, `agentic_scraper`, `generate-schema`, `validate`. There is no direct replacement on v2.
100+
</Note>
101+
100102
## Example Prompts
101103

102104
Once the skill is installed, you can use natural language prompts directly in your AI coding agent:
@@ -118,14 +120,18 @@ Crawl https://example.com/blog with depth 2 and extract the title and summary fr
118120
```
119121

120122
```text
121-
Get all URLs from the sitemap of https://example.com
123+
Monitor https://store.example.com/pricing every hour and webhook me when it changes
124+
```
125+
126+
```text
127+
Create a 30m monitor on https://example.com and poll its activity feed, printing new ticks as they come in
122128
```
123129

124130
```text
125-
Log into https://example.com/dashboard, click "Reports", and extract the table data
131+
Fetch a full-page screenshot and branding assets for https://example.com
126132
```
127133

128-
The agent will automatically select the right ScrapeGraphAI endpoint, handle authentication, poll for async results, and return structured data.
134+
The agent will automatically select the right `just-scrape` command, handle authentication, poll for async results (crawls), and return structured data.
129135

130136
## Supported Agents
131137

@@ -146,7 +152,7 @@ Need help with the skill?
146152
<Card
147153
title="GitHub Issues"
148154
icon="github"
149-
href="https://github.com/ScrapeGraphAI/skill/issues"
155+
href="https://github.com/ScrapeGraphAI/just-scrape/issues"
150156
>
151157
Report bugs and request features
152158
</Card>

sdks/javascript.mdx

Lines changed: 28 additions & 2 deletions
@@ -349,6 +349,32 @@ await monitor.resume("your-api-key", cronId);
349349
await monitor.delete("your-api-key", cronId);
350350
```
351351
352+
#### monitor.activity() — poll tick history
353+
354+
Paginate through per-run ticks for a monitor (what changed on each scheduled run).
355+
356+
```javascript
357+
import { monitor } from "scrapegraph-js";
358+
359+
const activity = await monitor.activity("your-api-key", cronId, { limit: 20 });
360+
361+
if (activity.status === "success") {
362+
for (const tick of activity.data?.ticks ?? []) {
363+
const changed = tick.changed ? "CHANGED" : "no change";
364+
console.log(`[${tick.createdAt}] ${tick.status} - ${changed} (${tick.elapsedMs}ms)`);
365+
}
366+
367+
if (activity.data?.nextCursor) {
368+
const next = await monitor.activity("your-api-key", cronId, {
369+
limit: 20,
370+
cursor: activity.data.nextCursor,
371+
});
372+
}
373+
}
374+
```
375+
376+
Params: `limit` (1–100, default `20`), `cursor` (opaque pagination token). Each tick has `id`, `createdAt`, `status`, `changed`, `elapsedMs`, and `diffs` with per-format deltas.
377+
352378
### getCredits
353379
354380
Check your account credit balance.
@@ -436,9 +462,9 @@ if (result.status === "success") {
436462
437463
| Variable | Description | Default |
438464
|----------|-------------|---------|
439-
| `SGAI_API_URL` | Override API base URL | `https://api.scrapegraphai.com/v2` |
465+
| `SGAI_API_URL` | Override API base URL | `https://api.scrapegraphai.com/api/v2` |
440466
| `SGAI_DEBUG` | Enable debug logging (`"1"`) | off |
441-
| `SGAI_TIMEOUT_S` | Request timeout in seconds | `120` |
467+
| `SGAI_TIMEOUT` | Request timeout in seconds | `120` |
442468
443469
## Support
444470

sdks/python.mdx

Lines changed: 28 additions & 6 deletions
@@ -71,12 +71,12 @@ class ApiResult(BaseModel, Generic[T]):
7171

7272
### Environment Variables
7373

74-
| Variable | Description | Default |
75-
| ----------------- | -------------------------------------------- | ------------------------------------ |
76-
| `SGAI_API_KEY` | Your ScrapeGraphAI API key ||
77-
| `SGAI_API_URL` | Override API base URL | `https://api.scrapegraphai.com/v2` |
78-
| `SGAI_TIMEOUT_S` | Request timeout in seconds | `120` |
79-
| `SGAI_DEBUG` | Enable debug logging (set to `"1"`) | off |
74+
| Variable | Description | Default |
75+
| --------------- | -------------------------------------------- | --------------------------------------- |
76+
| `SGAI_API_KEY` | Your ScrapeGraphAI API key | |
77+
| `SGAI_API_URL` | Override API base URL | `https://api.scrapegraphai.com/api/v2` |
78+
| `SGAI_TIMEOUT` | Request timeout in seconds | `120` |
79+
| `SGAI_DEBUG` | Enable debug logging (set to `"1"`) | off |
8080

8181
The client supports context managers for automatic session cleanup:
8282

@@ -310,6 +310,28 @@ sgai.monitor.resume(cron_id)
310310
sgai.monitor.delete(cron_id)
311311
```
312312

313+
#### `monitor.activity()` — poll tick history
314+
315+
Paginate through the per-run ticks a monitor has produced (what changed on each scheduled run).
316+
317+
```python
318+
from scrapegraph_py import MonitorActivityRequest
319+
320+
act = sgai.monitor.activity(cron_id, MonitorActivityRequest(limit=20))
321+
322+
if act.status == "success":
323+
for tick in act.data.ticks:
324+
status = "CHANGED" if tick.changed else "no change"
325+
print(f"[{tick.created_at}] {tick.status} - {status} ({tick.elapsed_ms}ms)")
326+
327+
if act.data.next_cursor:
328+
more = sgai.monitor.activity(
329+
cron_id, MonitorActivityRequest(limit=20, cursor=act.data.next_cursor),
330+
)
331+
```
332+
333+
`MonitorActivityRequest` fields: `limit` (1–100, default `20`) and optional `cursor` for pagination. Each `MonitorTickEntry` exposes `id`, `created_at`, `status`, `changed`, `elapsed_ms`, and a `diffs` model with per-format deltas.
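The cursor-based pagination described above can be wrapped in a small generator that keeps requesting pages until no cursor is returned. The following is a minimal sketch, not the SDK's own helper: `Tick`, `Page`, `iter_ticks`, and `fake_fetch` are hypothetical names introduced for illustration, and the in-memory pages stand in for real API responses.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass
class Tick:
    # Hypothetical stand-in holding a subset of the tick fields listed above.
    id: str
    changed: bool


@dataclass
class Page:
    # Hypothetical stand-in for one page of activity results.
    ticks: List[Tick] = field(default_factory=list)
    next_cursor: Optional[str] = None


def iter_ticks(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Tick]:
    """Yield ticks from every page, following next_cursor until it is empty."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.ticks
        if not page.next_cursor:
            break
        cursor = page.next_cursor


# In-memory pages used only to exercise the loop; no network call is made.
_FAKE_PAGES = {
    None: Page([Tick("t1", False), Tick("t2", True)], next_cursor="c1"),
    "c1": Page([Tick("t3", True)], next_cursor=None),
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return _FAKE_PAGES[cursor]


changed_ids = [t.id for t in iter_ticks(fake_fetch) if t.changed]
print(changed_ids)
```

With the real client, `fetch_page` would presumably wrap `sgai.monitor.activity(cron_id, MonitorActivityRequest(limit=20, cursor=cursor))` and return `act.data` after checking `act.status`; passing `cursor=None` for the first page is an assumption based on the field being optional.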
334+
313335
#### `MonitorCreateRequest` fields
314336

315337
| Field | Type | Required | Description |

services/mcp-server.mdx

Lines changed: 12 additions & 4 deletions
@@ -22,8 +22,8 @@ A production‑ready Model Context Protocol (MCP) server that connects LLMs to t
2222

2323
## Key Features
2424

25-
- Full v2 API coverage: scrape, extract, search, crawl (+ stop/resume), monitor lifecycle, credits, history, and schema generation
26-
- Uses the v2 API base URL (`https://api.scrapegraphai.com/v2`) with the `SGAI-APIKEY` header — wire format matches [scrapegraph-py v2](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84)
25+
- Full v2 API coverage: scrape, extract, search, crawl (+ stop/resume), monitor lifecycle (+ activity polling), credits, history, and schema generation
26+
- Uses the v2 API base URL (`https://api.scrapegraphai.com/api/v2`) with the `SGAI-APIKEY` header — wire format matches [scrapegraph-py v2](https://github.com/ScrapeGraphAI/scrapegraph-py/pull/84)
2727
- Remote HTTP MCP endpoint and local Python server support
2828
- Works with Cursor, Claude Desktop, and any MCP‑compatible client
2929
- Robust error handling, timeouts, and production‑tested reliability
@@ -176,9 +176,10 @@ The server reads the ScrapeGraph API key from `SGAI_API_KEY` (local) or the `X-A
176176
| Variable | Description | Default |
177177
|---|---|---|
178178
| `SGAI_API_KEY` | ScrapeGraph API key ||
179-
| `SGAI_API_URL` | Override the v2 API base URL | `https://api.scrapegraphai.com/v2` |
180-
| `SGAI_TIMEOUT_S` | Request timeout in seconds | `120` |
179+
| `SGAI_API_URL` | Override the v2 API base URL | `https://api.scrapegraphai.com/api/v2` |
180+
| `SGAI_TIMEOUT` | Request timeout in seconds | `120` |
181181
| `SCRAPEGRAPH_API_BASE_URL` | Legacy alias for `SGAI_API_URL` (still honored) ||
182+
| `SGAI_TIMEOUT_S` | Legacy alias for `SGAI_TIMEOUT` (still honored) ||
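One way to honor both the current variable names and their legacy aliases is a simple fallback chain. This is a sketch only: the precedence shown (current name first, then the legacy alias, then the documented default) is an assumption, and `resolve_settings` is a hypothetical helper, not part of the server.

```python
import os
from typing import Mapping

DEFAULT_API_URL = "https://api.scrapegraphai.com/api/v2"
DEFAULT_TIMEOUT_S = 120.0


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve base URL and timeout, preferring current names over legacy aliases.

    The precedence order is an assumption made for this sketch.
    """
    api_url = (
        env.get("SGAI_API_URL")
        or env.get("SCRAPEGRAPH_API_BASE_URL")
        or DEFAULT_API_URL
    )
    raw_timeout = env.get("SGAI_TIMEOUT") or env.get("SGAI_TIMEOUT_S")
    timeout = float(raw_timeout) if raw_timeout else DEFAULT_TIMEOUT_S
    # Strip a trailing slash so paths can be appended uniformly.
    return {"api_url": api_url.rstrip("/"), "timeout": timeout}


print(resolve_settings({"SGAI_TIMEOUT_S": "30"}))
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the process environment.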
182183

183184
## Available Tools
184185

@@ -316,8 +317,15 @@ monitor_get(monitor_id: str)
316317
monitor_pause(monitor_id: str)
317318
monitor_resume(monitor_id: str)
318319
monitor_delete(monitor_id: str)
320+
monitor_activity(
321+
monitor_id: str,
322+
limit: int | None = None, # 1–100, default 20
323+
cursor: str | None = None, # pagination cursor
324+
)
319325
```
320326

327+
`monitor_activity` returns the tick history (`id`, `createdAt`, `status`, `changed`, `elapsedMs`, `diffs`) plus a `nextCursor` when more results are available — mirrors `sgai.monitor.activity()` in the SDKs.
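For orientation, the tool's arguments could map onto the `GET /monitor/:id/activity` route and the `/api/v2` base URL documented in this commit roughly as follows. This is a sketch under stated assumptions: that `limit` and `cursor` travel as query parameters with those exact names is assumed rather than confirmed here, and `activity_url` is a hypothetical helper. No request is sent.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.scrapegraphai.com/api/v2"


def activity_url(
    monitor_id: str,
    limit: Optional[int] = None,
    cursor: Optional[str] = None,
) -> str:
    """Build the activity URL; the query parameter names are assumed."""
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Only include parameters the caller actually set.
    params = {k: v for k, v in (("limit", limit), ("cursor", cursor)) if v is not None}
    url = f"{BASE_URL}/monitor/{quote(monitor_id, safe='')}/activity"
    return f"{url}?{urlencode(params)}" if params else url


print(activity_url("abc", limit=20))
```

A real call would also need the `SGAI-APIKEY` header mentioned under Key Features.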
328+
321329
### Account tools
322330

323331
#### credits

services/mcp-server/introduction.mdx

Lines changed: 3 additions & 2 deletions
@@ -22,8 +22,8 @@ The Model Context Protocol (MCP) is a standardized way for AI assistants to acce
2222
## Key Features
2323

2424
<CardGroup cols={2}>
25-
<Card title="17 Powerful Tools" icon="tools">
26-
Scrape, extract, search, crawl, generate schemas, monitor scheduled jobs, and manage your account
25+
<Card title="18 Powerful Tools" icon="tools">
26+
Scrape, extract, search, crawl, generate schemas, monitor scheduled jobs (with activity polling), and manage your account
2727
</Card>
2828
<Card title="Remote & Local" icon="server">
2929
Use the hosted HTTP endpoint or run locally via Python
@@ -59,6 +59,7 @@ The MCP server exposes the following tools via API v2:
5959
| **monitor_pause** | Pause a running monitor (POST /monitor/:id/pause) |
6060
| **monitor_resume** | Resume a paused monitor (POST /monitor/:id/resume) |
6161
| **monitor_delete** | Delete a monitor (DELETE /monitor/:id) |
62+
| **monitor_activity** | Poll tick history for a monitor with pagination (GET /monitor/:id/activity) |
6263

6364
<Note>
6465
Removed from v1: `sitemap`, `agentic_scrapper`, `markdownify_status`, `smartscraper_status` (no v2 API equivalents).
