Commit 54c47ca

Merge pull request #7 from ScrapeGraphAI/fix/align-types-with-api-docs
fix: align SDK types with actual API responses
2 parents b400fe4 + 7e398b8 commit 54c47ca

127 files changed, with 2082 additions and 20057 deletions.
Lines changed: 1 addition & 0 deletions
@@ -1 +1,2 @@
 Create a PR on github with an accurate description following our naming convention for the current changes. $ARGUMENTS
+

.claude/skills/create-skill.md

Lines changed: 0 additions & 84 deletions
This file was deleted.
Lines changed: 14 additions & 7 deletions
@@ -36,8 +36,8 @@ git log -1 --pretty=%s # Last commit message
 
 - Check current branch: `git branch --show-current`
 - If on main/master/next, create feature branch with conventional naming
-- Switch to new branch: `git checkout -b <username>/<features|fixes>/<branch-name>`
-- Current branch convention `<username>/<features|fixes>/<branch-name>`
+- Branch convention: `<username>/<type>/<description>` (e.g., `fzuppichini/features/new-feature`)
+- Switch to new branch: `git checkout -b <username>/<type>/<description>`
 
 2. **Analyze & Stage**:
 
@@ -160,21 +160,28 @@ When updating existing PRs, use these comment templates to preserve the original
 1. Create branch and make changes
 2. Stage, commit, push → triggers PR creation
 3. Each subsequent push triggers update comment
+4. By default assume the PR is *wip* (work in progress) so open it appropriately
 
 ### Commit Message Conventions
 
+See **[docs/GIT_STYLE.md](docs/GIT_STYLE.md)** for full guide.
+
 - `feat:` - New features
 - `fix:` - Bug fixes
 - `refactor:` - Code refactoring
 - `docs:` - Documentation changes
 - `test:` - Test additions/modifications
 - `chore:` - Maintenance tasks
 - `style:` - Formatting changes
+- `content:` - Content changes (blog, copy)
+- `perf:` - Performance improvements
 
 ### Branch Naming Conventions
 
-- `feature/description` - New features
-- `fix/bug-description` - Bug fixes
-- `refactor/component-name` - Code refactoring
-- `docs/update-readme` - Documentation updates
-- `test/add-unit-tests` - Test additions
+Always use `<username>/<type>/<description>` format:
+
+- `username/features/description` - New features
+- `username/fix/description` - Bug fixes
+- `username/refactor/description` - Code refactoring
+- `username/docs/description` - Documentation updates
+- `username/test/description` - Test additions
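
The `<username>/<type>/<description>` convention in the diff above could be checked mechanically. The following is a minimal sketch, not part of this commit: `BRANCH_TYPES`, `BRANCH_PATTERN`, and `isValidBranchName` are hypothetical names, the type list is taken from the bullets above and assumed to be exhaustive, and the allowed characters per segment are also an assumption.

```typescript
// Types listed under "Branch Naming Conventions" in the diff.
// Treating this list as exhaustive is an assumption.
const BRANCH_TYPES = ["features", "fix", "refactor", "docs", "test"] as const;

// <username>/<type>/<description>
// The character classes for each segment are assumed, not taken from the source.
const BRANCH_PATTERN = new RegExp(
  `^[a-z0-9][a-z0-9-]*/(${BRANCH_TYPES.join("|")})/[a-z0-9][a-z0-9._-]*$`,
);

// Returns true when the branch name has exactly three segments
// and the middle segment is one of the known types.
function isValidBranchName(branch: string): boolean {
  return BRANCH_PATTERN.test(branch);
}

console.log(isValidBranchName("fzuppichini/features/new-feature")); // true
console.log(isValidBranchName("feature/description")); // false (old two-segment style)
```

A check like this would reject the old `feature/description` style that the diff removes, since it lacks the username segment.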

.github/workflows/ci.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    name: Test
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: oven-sh/setup-bun@v2
+      - run: bun install
+      - run: bun run test
+
+  lint:
+    name: Lint & Typecheck
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: oven-sh/setup-bun@v2
+      - run: bun install
+      - run: bun run check

.gitignore

Lines changed: 7 additions & 1 deletion
@@ -1 +1,7 @@
-node_modules
+node_modules
+dist/
+.DS_Store
+bun.lock
+*.tsbuildinfo
+.env
+doc/

ISSUES.md

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+# Issues Found During Example Testing
+
+## SDK Bugs
+
+### ~~1. Health endpoint uses wrong base URL~~ FIXED
+- **File**: `src/scrapegraphai.ts`, `checkHealth()`
+- **Problem**: The health endpoint lives at `https://api.scrapegraphai.com/healthz` (no `/v1` prefix), but the SDK was sending `GET /v1/healthz` which returned 404.
+- **Fix**: `checkHealth()` now uses `HEALTH_URL` (root domain, no `/v1` prefix).
+
+### ~~7. Crawl poll response nested inside `result` wrapper~~ FIXED
+- **File**: `src/scrapegraphai.ts`, `submitAndPoll()`
+- **Problem**: The crawl poll API returns `{ status: "success", result: { status: "done", pages: [...], crawled_urls: [...] } }`. The SDK was returning the outer wrapper as `data`, so `data.pages` and `data.crawled_urls` were `undefined`.
+- **Fix**: Added `unwrapResult()` that detects the nested `result` object and promotes it to the top level. Also added `llm_result`, `credits_used`, `pages_processed`, `elapsed_time` to `CrawlResponse` type.
+
+## API-Side Issues
+
+### 2. Agentic Scraper returns 500
+- **Example**: `agenticscraper/agenticscraper_basic.ts`, `agenticscraper/agenticscraper_ai_extraction.ts`
+- **Error**: `Server error — try again later` (HTTP 500)
+- **Note**: Both basic and AI extraction modes fail. Likely an API deployment issue.
+
+### 3. Generate Schema — modify existing returns empty schema
+- **Example**: `schema/modify_existing_schema.ts`
+- **Error**: `generated_schema` comes back as `{}`
+- **Note**: Basic generation works fine. Modifying an existing schema returns empty. May be async and needs polling, or the API doesn't fully support modification yet.
+
+### 4. Crawl markdown mode returns 0 pages
+- **Example**: `crawl/crawl_markdown.ts`
+- **Error**: `extraction_mode: false` returns `{ pages: [] }` despite status `success`
+- **Note**: Extraction mode crawls (with prompt) work fine and return pages. Markdown-only mode seems broken on the API side.
+
+### 5. Scrape endpoint rejects uppercase country codes
+- **Example**: `scrape/scrape_stealth.ts`
+- **Error**: `Invalid country code` when sending `"US"` — must be lowercase `"us"`
+- **Note**: Fixed in the example. SDK could validate/lowercase this automatically.
+
+### 6. SearchScraper markdown mode returns empty result
+- **Example**: `searchscraper/searchscraper_markdown.ts`
+- **Error**: `result` is `{}` when `extraction_mode: false`, though `reference_urls` are populated
+- **Note**: The markdown content may be in a different response field, or the API doesn't support this mode correctly.
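
Two of the SDK-side behaviors described in ISSUES.md can be illustrated in a few lines. The sketch below is not the code from this commit: only the name `unwrapResult` and the response shape come from the issue text, while `Json`, `isPlainObject`, and `normalizeCountryCode` are hypothetical helpers. It assumes that "promoting" the nested `result` means returning the inner object in place of the wrapper; the actual SDK may merge fields or apply the unwrapping only inside `submitAndPoll()`.

```typescript
type Json = Record<string, unknown>;

// Hypothetical helper: true for non-null, non-array objects.
function isPlainObject(value: unknown): value is Json {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Issue 7 sketch: if the payload carries a nested `result` object,
// return that inner object; otherwise return the payload unchanged.
// Replacing (rather than merging with) the wrapper is an assumption.
function unwrapResult(payload: Json): Json {
  const inner = payload["result"];
  return isPlainObject(inner) ? inner : payload;
}

// Issue 5 sketch: the API rejects "US" and expects "us",
// so lowercase the code before sending it. Hypothetical helper.
function normalizeCountryCode(code: string): string {
  return code.trim().toLowerCase();
}

// Response shape taken from the issue description.
const polled: Json = {
  status: "success",
  result: {
    status: "done",
    pages: [{ url: "https://example.com" }],
    crawled_urls: ["https://example.com"],
  },
};

const data = unwrapResult(polled);
console.log(data["status"]); // "done"
console.log(Array.isArray(data["pages"])); // true
console.log(normalizeCountryCode("US")); // "us"
```

With the wrapper removed, `pages` and `crawled_urls` are reachable directly on the returned object, which is the symptom issue 7 describes as previously `undefined`.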
