Skip to content

feat!: migrate CLI to scrapegraph-js v2 API#13

Open
VinciGit00 wants to merge 2 commits intomainfrom
feat/migrate-sdk-v2
Open

feat!: migrate CLI to scrapegraph-js v2 API#13
VinciGit00 wants to merge 2 commits intomainfrom
feat/migrate-sdk-v2

Conversation

@VinciGit00
Copy link
Copy Markdown
Member

@VinciGit00 VinciGit00 commented Mar 31, 2026

Summary

  • Migrate the CLI to the scrapegraph-js v2 SDK surface per feat!: migrate scrapegraph-js to api v2 scrapegraph-js#11
  • Rename smart-scraperextract, search-scrapersearch
  • Remove commands dropped from the v2 API: agentic-scraper, generate-schema, sitemap, validate
  • Add client factory using the new scrapegraphai({ apiKey }) pattern
  • Update scrape with --format flag (markdown, html, screenshot, branding)
  • Update crawl to use crawl.start/crawl.status polling lifecycle
  • Replace --stealth boolean with --mode fetch mode enum (auto, fast, js, direct+stealth, js+stealth)
  • All commands now use try/catch (v2 throws on error) and self-timed elapsed

Breaking Changes

Old command / flag New command / flag Notes
smart-scraper extract Renamed
search-scraper search Renamed
scrape scrape Gains --format flag
crawl crawl New v2 polling model
--stealth --mode direct+stealth Fetch mode enum replaces boolean
agentic-scraper Removed from API
generate-schema Removed from API
sitemap Removed from API
validate Removed from API

Fetch Modes

Mode Description
auto Automatic selection (default)
fast Fastest, no JS rendering
js Full JS rendering
direct+stealth Direct fetch with anti-bot bypass
js+stealth JS rendering with anti-bot bypass

Test plan

  • bun run dev -- extract https://example.com -p "Extract title" --json
  • bun run dev -- search "test query" --json
  • bun run dev -- scrape https://example.com --json
  • bun run dev -- scrape https://example.com -f html --json
  • bun run dev -- scrape https://example.com -m js --json
  • bun run dev -- markdownify https://example.com --json
  • bun run dev -- crawl https://example.com --max-pages 5 --json
  • bun run dev -- credits --json
  • bun run dev -- history extract --json
  • bunx tsc --noEmit passes
  • bun run build succeeds

Note: scrapegraph-js is pinned to commit b570a57 (includes fetch mode enum). Update to ^2.0.0 once the SDK is published to npm.

🤖 Generated with Claude Code

VinciGit00 and others added 2 commits March 31, 2026 11:48
Align the CLI with ScrapeGraphAI/scrapegraph-js#11 (v2 SDK migration):

- Rename smart-scraper → extract, search-scraper → search
- Remove commands dropped from the API: agentic-scraper, generate-schema, sitemap, validate
- Add client factory (src/lib/client.ts) using the new scrapegraphai({ apiKey }) pattern
- Update scrape command with --format flag (markdown, html, screenshot, branding)
- Update crawl to use crawl.start/status polling lifecycle
- Update history to use v2 service names and parameters
- All commands now use try/catch (v2 throws on error) and self-timed elapsed

BREAKING CHANGE: CLI commands have been renamed and removed to match the v2 API surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns CLI with scrapegraph-js v2 SDK change (b570a57) that replaced
stealth/render booleans with a unified fetch mode enum:
auto, fast, js, direct+stealth, js+stealth.

- All commands: --stealth boolean → --mode <mode> string
- Pin SDK to commit b570a57 (includes fetch mode change)
- Update README and SKILL.md with new flag syntax

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@VinciGit00
Copy link
Copy Markdown
Member Author

CLI Validation with scrapegraph-js@2.0.0 (feat/sdk-v2-migration)

Built scrapegraph-js from the feat/sdk-v2-migration branch locally and tested all CLI commands against localhost:3002.

Results

Command Status Notes
credits ✅ Pass Returns balance correctly (remaining, used, plan)
scrape ✅ Pass Returns markdown for https://example.com in ~1.1s
extract ✅ Pass Extracts structured JSON with prompt — title & description correct
markdownify ✅ Pass Converts page to markdown in ~430ms
search ✅ Pass Returns search results with URLs and content in ~3.8s
history ✅ Pass Lists request history, both interactive and --json mode work
crawl (start) ✅ Pass Job starts and returns ID; status polling returns "Crawl not found" — server-side issue, not SDK/CLI

Setup

# Built SDK from PR branch
git clone --branch feat/sdk-v2-migration https://github.com/ScrapeGraphAI/scrapegraph-js.git
cd scrapegraph-js && bun install && bun run build

# Symlinked into CLI
ln -s /tmp/scrapegraph-js node_modules/scrapegraph-js

# Ran CLI against local server
SGAI_API_URL=http://localhost:3002 SGAI_API_KEY=<key> bun run src/cli.ts <command>

All v2 SDK methods work correctly through the CLI. The only issue found (crawl status polling) is server-side, not related to the SDK or CLI code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant