Commit cc6fd33

fix(docs): re-apply v2 API migration to company-info cookbook notebook (#96)
PR #91 merged with the v1 API still in the file -- the migration script ran but Jupyter autosave overwrote it before commit. Re-applying:

- Client -> ScrapeGraphAI (auto-reads SGAI_API_KEY)
- smartscraper(website_url=, user_prompt=, output_schema=) -> extract(prompt, url=, schema=model_json_schema())
- Response shape: response['result'] -> response.data.json_data + status check
- Fix broken async example link
- Update dashboard URL: dashboard.scrapegraphai.com -> scrapegraphai.com/dashboard

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 068a296 commit cc6fd33
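
The response-shape change listed in the commit message can be sketched without the SDK. The helper below, `unwrap_result`, is hypothetical and only mirrors the status check the notebook now performs. The attribute names (`status`, `error`, `data.json_data`) and the `"success"` value come from the diff; the `SimpleNamespace` objects and the `"failed"` status value are stand-ins assumed for illustration, not the library's real response types.

```python
from types import SimpleNamespace


def unwrap_result(response):
    """Return the extracted JSON payload, or raise if the request failed.

    Hypothetical helper: only the attribute names are taken from the diff.
    """
    if response.status != "success":
        raise RuntimeError(response.error)
    return response.data.json_data


# Stand-in objects imitating the v2 response shape shown in the diff.
ok_response = SimpleNamespace(
    status="success",
    error=None,
    data=SimpleNamespace(json_data={"company_name": "ScrapeGraphAI"}),
)
# "failed" is an assumed status value; the diff only shows "success".
failed_response = SimpleNamespace(status="failed", error="rate limited", data=None)

print(unwrap_result(ok_response))
try:
    unwrap_result(failed_response)
except RuntimeError as exc:
    print("extraction failed:", exc)
```

Checking the status before touching `data` avoids an `AttributeError` on a failed request, which is presumably why the notebook raises early.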

1 file changed

Lines changed: 29 additions & 112 deletions

cookbook/company-info/scrapegraph_sdk.ipynb

@@ -6,7 +6,7 @@
 "id": "jEkuKbcRrPcK"
 },
 "source": [
-"## 🕷️ Extract Company Info with Official Scrapegraph SDK\n"
+"## \ud83d\udd77\ufe0f Extract Company Info with Official Scrapegraph SDK\n"
 ]
 },
 {
@@ -24,7 +24,7 @@
 "id": "IzsyDXEWwPVt"
 },
 "source": [
-"### 🔧 Install `dependencies`"
+"### \ud83d\udd27 Install `dependencies`"
 ]
 },
 {
@@ -45,7 +45,7 @@
 "id": "apBsL-L2KzM7"
 },
 "source": [
-"### 🔑 Import `ScrapeGraph` API key"
+"### \ud83d\udd11 Import `ScrapeGraph` API key"
 ]
 },
 {
@@ -54,7 +54,7 @@
 "id": "ol9gQbAFkh9b"
 },
 "source": [
-"You can find the Scrapegraph API key [here](https://dashboard.scrapegraphai.com/)"
+"You can find the Scrapegraph API key [here](https://scrapegraphai.com/dashboard)"
 ]
 },
 {
@@ -83,7 +83,7 @@
 "output_type": "stream",
 "text": [
 "SGAI_API_KEY not found in environment.\n",
-"Please enter your SGAI_API_KEY: ··········\n",
+"Please enter your SGAI_API_KEY: \u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\n",
 "SGAI_API_KEY has been set in the environment.\n"
 ]
 }
@@ -102,7 +102,7 @@
 "id": "jnqMB2-xVYQ7"
 },
 "source": [
-"### 📝 Defining an `Output Schema` for Webpage Content Extraction\n"
+"### \ud83d\udcdd Defining an `Output Schema` for Webpage Content Extraction\n"
 ]
 },
 {
@@ -237,7 +237,7 @@
 "id": "cDGH0b2DkY63"
 },
 "source": [
-"### 🚀 Initialize `SGAI Client` and start extraction"
+"### \ud83d\ude80 Initialize `SGAI Client` and start extraction"
 ]
 },
 {
@@ -246,7 +246,7 @@
 "id": "4SLJgXgcob6L"
 },
 "source": [
-"Initialize the client for scraping (there's also an async version [here](https://github.com/ScrapeGraphAI/scrapegraph-sdk/blob/main/scrapegraph-py/examples/async_smartscraper_example.py))"
+"Initialize the client for scraping (an async version using `AsyncScrapeGraphAI` is available [here](https://github.com/ScrapeGraphAI/scrapegraph-py/blob/main/examples/extract/extract_basic_async.py))."
 ]
 },
 {
@@ -257,10 +257,9 @@
 },
 "outputs": [],
 "source": [
-"from scrapegraph_py import Client\n",
+"from scrapegraph_py import ScrapeGraphAI\n",
 "\n",
-"# Initialize the client with explicit API key\n",
-"sgai_client = Client(api_key=sgai_api_key)"
+"sgai_client = ScrapeGraphAI()"
 ]
 },
 {
@@ -269,13 +268,7 @@
 "id": "M1KSXffZopUD"
 },
 "source": [
-"Here we use `Smartscraper` service to extract structured data using AI from a webpage.\n",
-"\n",
-"\n",
-"> If you already have an HTML file, you can upload it and use `Localscraper` instead.\n",
-"\n",
-"\n",
-"\n"
+"Use the `extract` method to pull structured data from a URL with AI. The same method also accepts raw `html=` or `markdown=` if you already have the page content."
 ]
 },
 {
@@ -286,11 +279,10 @@
 },
 "outputs": [],
 "source": [
-"# Request for Trending Repositories\n",
-"repo_response = sgai_client.smartscraper(\n",
-" website_url=\"https://scrapegraphai.com/\",\n",
-" user_prompt=\"Extract info about the company\",\n",
-" output_schema=CompanyInfoSchema,\n",
+"repo_response = sgai_client.extract(\n",
+" \"Extract info about the company\",\n",
+" url=\"https://scrapegraphai.com/\",\n",
+" schema=CompanyInfoSchema.model_json_schema(),\n",
 ")"
 ]
 },
@@ -323,91 +315,16 @@
 "id": "F1VfD8B4LPc8",
 "outputId": "8d7b2955-1569-4b3a-8ffe-014a8442dd12"
 },
-"outputs": [
-{
-"name": "stdout",
-"output_type": "stream",
-"text": [
-"Request ID: 87a7ea1a-9dd4-4d1d-ae76-b419ead57c11\n",
-"Company Info:\n",
-"{\n",
-" \"company_name\": \"ScrapeGraphAI\",\n",
-" \"description\": \"ScrapeGraphAI is a powerful AI scraping API designed for efficient web data extraction to power LLM applications and AI agents. It enables developers to perform intelligent AI scraping and extract structured information from websites using advanced AI techniques.\",\n",
-" \"founders\": [\n",
-" {\n",
-" \"name\": \"\",\n",
-" \"role\": \"Founder & Technical Lead\",\n",
-" \"linkedin\": \"https://www.linkedin.com/in/perinim/\"\n",
-" },\n",
-" {\n",
-" \"name\": \"Marco Vinciguerra\",\n",
-" \"role\": \"Founder & Software Engineer\",\n",
-" \"linkedin\": \"https://www.linkedin.com/in/marco-vinciguerra-7ba365242/\"\n",
-" },\n",
-" {\n",
-" \"name\": \"Lorenzo Padoan\",\n",
-" \"role\": \"Founder & Product Engineer\",\n",
-" \"linkedin\": \"https://www.linkedin.com/in/lorenzo-padoan-4521a2154/\"\n",
-" }\n",
-" ],\n",
-" \"logo\": \"https://scrapegraphai.com/images/scrapegraphai_logo.svg\",\n",
-" \"partners\": [\n",
-" \"PostHog\",\n",
-" \"AWS\",\n",
-" \"NVIDIA\",\n",
-" \"JinaAI\",\n",
-" \"DagWorks\",\n",
-" \"Browserbase\",\n",
-" \"ScrapeDo\",\n",
-" \"HackerNews\",\n",
-" \"Medium\",\n",
-" \"HackADay\"\n",
-" ],\n",
-" \"pricing_plans\": [\n",
-" {\n",
-" \"tier\": \"Free\",\n",
-" \"price\": \"$0\",\n",
-" \"credits\": 100\n",
-" },\n",
-" {\n",
-" \"tier\": \"Starter\",\n",
-" \"price\": \"$20/month\",\n",
-" \"credits\": 5000\n",
-" },\n",
-" {\n",
-" \"tier\": \"Growth\",\n",
-" \"price\": \"$100/month\",\n",
-" \"credits\": 40000\n",
-" },\n",
-" {\n",
-" \"tier\": \"Pro\",\n",
-" \"price\": \"$500/month\",\n",
-" \"credits\": 250000\n",
-" }\n",
-" ],\n",
-" \"contact_emails\": [\n",
-" \"contact@scrapegraphai.com\"\n",
-" ],\n",
-" \"social_links\": {\n",
-" \"linkedin\": \"https://www.linkedin.com/company/101881123\",\n",
-" \"twitter\": \"https://x.com/scrapegraphai\",\n",
-" \"github\": \"https://github.com/ScrapeGraphAI/Scrapegraph-ai\"\n",
-" },\n",
-" \"privacy_policy\": \"https://scrapegraphai.com/privacy\",\n",
-" \"terms_of_service\": \"https://scrapegraphai.com/terms\",\n",
-" \"api_status\": \"https://scrapegraphapi.openstatus.dev\"\n",
-"}\n"
-]
-}
-],
+"outputs": [],
 "source": [
 "import json\n",
 "\n",
-"# Print the response\n",
-"request_id = repo_response['request_id']\n",
-"result = repo_response['result']\n",
+"if repo_response.status != \"success\":\n",
+" raise RuntimeError(repo_response.error)\n",
+"\n",
+"result = repo_response.data.json_data\n",
 "\n",
-"print(f\"Request ID: {request_id}\")\n",
+"print(\"Tokens used:\", repo_response.data.usage)\n",
 "print(\"Company Info:\")\n",
 "print(json.dumps(result, indent=2))"
 ]
@@ -418,7 +335,7 @@
 "id": "2as65QLypwdb"
 },
 "source": [
-"### 💾 Save the output to a `CSV` file"
+"### \ud83d\udcbe Save the output to a `CSV` file"
 ]
 },
 {
@@ -1883,7 +1800,7 @@
 "id": "-1SZT8VzTZNd"
 },
 "source": [
-"## 🔗 Resources"
+"## \ud83d\udd17 Resources"
 ]
 },
 {
@@ -1893,13 +1810,13 @@
 },
 "source": [
 "\n",
-"- 🚀 **Get your API Key:** [ScrapeGraphAI Dashboard](https://dashboard.scrapegraphai.com) \n",
-"- 🐙 **GitHub:** [ScrapeGraphAI GitHub](https://github.com/scrapegraphai) \n",
-"- 💼 **LinkedIn:** [ScrapeGraphAI LinkedIn](https://www.linkedin.com/company/scrapegraphai/) \n",
-"- 🐦 **Twitter:** [ScrapeGraphAI Twitter](https://twitter.com/scrapegraphai) \n",
-"- 💬 **Discord:** [Join our Discord Community](https://discord.gg/uJN7TYcpNa) \n",
+"- \ud83d\ude80 **Get your API Key:** [ScrapeGraphAI Dashboard](https://scrapegraphai.com/dashboard) \n",
+"- \ud83d\udc19 **GitHub:** [ScrapeGraphAI GitHub](https://github.com/scrapegraphai) \n",
+"- \ud83d\udcbc **LinkedIn:** [ScrapeGraphAI LinkedIn](https://www.linkedin.com/company/scrapegraphai/) \n",
+"- \ud83d\udc26 **Twitter:** [ScrapeGraphAI Twitter](https://twitter.com/scrapegraphai) \n",
+"- \ud83d\udcac **Discord:** [Join our Discord Community](https://discord.gg/uJN7TYcpNa) \n",
 "\n",
-"Made with ❤️ by the [ScrapeGraphAI](https://scrapegraphai.com) Team \n"
+"Made with \u2764\ufe0f by the [ScrapeGraphAI](https://scrapegraphai.com) Team \n"
 ]
 }
 ],
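
Many of the changed lines in this diff differ only in how emoji are written: the old file stores them as literal characters, the new one as `\uXXXX` escapes. That pattern matches what Python's `json.dumps` emits with its default `ensure_ascii=True`. Whether the migration script serialized the notebook this way is an assumption, since the script is not part of this commit. A small sketch showing that the two forms decode to the same text:

```python
import json

# Spider emoji plus variation selector, as used in the notebook title.
title = "## \U0001F577\uFE0F Extract Company Info"

escaped = json.dumps(title)                      # default ensure_ascii=True
literal = json.dumps(title, ensure_ascii=False)  # keeps the emoji as written

print(escaped)
print(literal)

# Both serializations round-trip to the identical string.
assert json.loads(escaped) == json.loads(literal) == title
```

If the script does use `json.dumps`, passing `ensure_ascii=False` when writing the notebook would likely keep future diffs limited to the substantive changes.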
