## Summary
- Fix inlined code examples in Introduction and Quick Start
- Standardize all guides to follow the same pattern: intro paragraph,
Introduction, Example Actor, Conclusion, Additional resources
- Move "Running webserver" from concepts to guides in v1.7 and v2.7
- Add linked library names, Apify template links, and official
documentation references to all guides
- Update quick-start links to guides and concepts
## Test plan
- [x] Executed locally and manually checked
- [x] CI passes
---
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python. It provides useful features like Actor lifecycle management, local storage emulation, and Actor event handling.
docs/03_guides/01_beautifulsoup_httpx.mdx (6 additions, 0 deletions)
@@ -28,3 +28,9 @@ Below is a simple Actor that recursively scrapes titles from all linked websites
## Conclusion
In this guide, you learned how to use [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/) with [HTTPX](https://www.python-httpx.org/) in your Apify Actors. By combining these libraries, you can efficiently extract data from HTML or XML files, making it easy to build web scraping tasks in Python. See the [Actor templates](https://apify.com/templates/categories/python) to get started with your own scraping tasks. If you have questions or need assistance, feel free to reach out on our [GitHub](https://github.com/apify/apify-sdk-python) or join our [Discord community](https://discord.com/invite/jyEM2PRvMU). Happy scraping!
docs/03_guides/02_parsel_impit.mdx (6 additions, 0 deletions)
@@ -26,3 +26,9 @@ The following example shows a simple Actor that recursively scrapes titles from
## Conclusion
In this guide, you learned how to use [Parsel](https://github.com/scrapy/parsel) with [Impit](https://github.com/apify/impit) in your Apify Actors. By combining these libraries, you get a powerful and efficient solution for web scraping: [Parsel](https://github.com/scrapy/parsel) provides excellent CSS selector and XPath support for data extraction, while [Impit](https://github.com/apify/impit) offers a fast and simple HTTP client built by Apify. This combination makes it easy to build scalable web scraping tasks in Python. See the [Actor templates](https://apify.com/templates/categories/python) to get started with your own scraping tasks. If you have questions or need assistance, feel free to reach out on our [GitHub](https://github.com/apify/apify-sdk-python) or join our [Discord community](https://discord.com/invite/jyEM2PRvMU). Happy scraping!
In this guide, you'll learn how to use [Playwright](https://playwright.dev) for web scraping in your Apify Actors.
+## Introduction
[Playwright](https://playwright.dev) is a tool for web automation and testing that can also be used for web scraping. It allows you to control a web browser programmatically and interact with web pages just as a human would.
Some of the key features of Playwright for web scraping include:
@@ -19,8 +23,6 @@ Some of the key features of Playwright for web scraping include:
- **Powerful selectors** - Playwright provides a variety of powerful selectors that allow you to target specific elements on a web page, including CSS selectors, XPath, and text matching.
- **Emulation of user interactions** - Playwright allows you to emulate user interactions like clicking, scrolling, filling out forms, and even typing in text, which can be useful for scraping websites that have dynamic content or require user input.
-## Using Playwright in Actors
To create Actors which use Playwright, start from the [Playwright & Python](https://apify.com/templates/categories/python) Actor template.
On the Apify platform, the Actor will already have Playwright and the necessary browsers preinstalled in its Docker image, including the tools and setup necessary to run browsers in headful mode.
@@ -55,3 +57,9 @@ It uses Playwright to open the pages in an automated Chrome browser, and to extr
## Conclusion
In this guide you learned how to create Actors that use Playwright to scrape websites. Playwright is a powerful tool that can be used to manage browser instances and scrape websites that require JavaScript execution. See the [Actor templates](https://apify.com/templates/categories/python) to get started with your own scraping tasks. If you have questions or need assistance, feel free to reach out on our [GitHub](https://github.com/apify/apify-sdk-python) or join our [Discord community](https://discord.com/invite/jyEM2PRvMU). Happy scraping!
In this guide, you'll learn how to use [Selenium](https://www.selenium.dev/) for web scraping in your Apify Actors.
+## Introduction
[Selenium](https://www.selenium.dev/) is a tool for web automation and testing that can also be used for web scraping. It allows you to control a web browser programmatically and interact with web pages just as a human would.
Some of the key features of Selenium for web scraping include:
@@ -21,8 +25,6 @@ including CSS selectors, XPath, and text matching.
- **Emulation of user interactions** - Selenium allows you to emulate user interactions like clicking, scrolling, filling out forms,
and even typing in text, which can be useful for scraping websites that have dynamic content or require user input.
-## Using Selenium in Actors
To create Actors which use Selenium, start from the [Selenium & Python](https://apify.com/templates/categories/python) Actor template.
On the Apify platform, the Actor will already have Selenium and the necessary browsers preinstalled in its Docker image,
@@ -44,3 +46,8 @@ It uses Selenium ChromeDriver to open the pages in an automated Chrome browser,
## Conclusion
In this guide you learned how to use Selenium for web scraping in Apify Actors. You can now create your own Actors that use Selenium to scrape dynamic websites and interact with web pages just like a human would. See the [Actor templates](https://apify.com/templates/categories/python) to get started with your own scraping tasks. If you have questions or need assistance, feel free to reach out on our [GitHub](https://github.com/apify/apify-sdk-python) or join our [Discord community](https://discord.com/invite/jyEM2PRvMU). Happy scraping!
docs/03_guides/05_crawlee.mdx (9 additions, 0 deletions)
@@ -44,3 +44,12 @@ The [`PlaywrightCrawler`](https://crawlee.dev/python/api/class/PlaywrightCrawler
## Conclusion
In this guide, you learned how to use the [Crawlee](https://crawlee.dev/python) library in your Apify Actors. By using the [`BeautifulSoupCrawler`](https://crawlee.dev/python/api/class/BeautifulSoupCrawler), [`ParselCrawler`](https://crawlee.dev/python/api/class/ParselCrawler), and [`PlaywrightCrawler`](https://crawlee.dev/python/api/class/PlaywrightCrawler) crawlers, you can efficiently scrape static or dynamic web pages, making it easy to build web scraping tasks in Python. See the [Actor templates](https://apify.com/templates/categories/python) to get started with your own scraping tasks. If you have questions or need assistance, feel free to reach out on our [GitHub](https://github.com/apify/apify-sdk-python) or join our [Discord community](https://discord.com/invite/jyEM2PRvMU). Happy scraping!
In this guide, you'll learn how to use the [Scrapy](https://scrapy.org/) framework in your Apify Actors.
+## Introduction
[Scrapy](https://scrapy.org/) is an open-source web scraping framework for Python. It provides tools for defining scrapers, extracting data from web pages, following links, and handling pagination. With the Apify SDK, Scrapy projects can be converted into Apify [Actors](https://docs.apify.com/platform/actors), integrated with Apify [storages](https://docs.apify.com/platform/storage), and executed on the Apify [platform](https://docs.apify.com/platform).
In this guide, you'll learn how to run a web server inside your Apify Actor. This is useful for monitoring Actor progress, creating custom APIs, or serving content during the Actor run.
+## Introduction
Each Actor run on the Apify platform is assigned a unique hard-to-guess URL (for example `https://8segt5i81sokzm.runs.apify.net`), which enables HTTP access to an optional web server running inside the Actor run's container.
The URL is available in the following places:
@@ -17,10 +21,18 @@ The URL is available in the following places:
The web server running inside the container must listen at the port defined by the `Actor.configuration.container_port` property. When running Actors locally, the port defaults to `4321`, so the web server will be accessible at `http://localhost:4321`.
-## Example
+## Example Actor

-The following example demonstrates how to start a simple web server in your Actor, which will respond to every GET request with the number of items that the Actor has processed so far:
+The following example demonstrates how to start a simple web server in your Actor, which will respond to every GET request with the number of items that the Actor has processed so far:
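The example source itself is not included in this diff. As a rough sketch of the idea, assuming only the Python standard library, the server below answers every GET request with a processed-item counter. The names `state`, `ProgressHandler`, `start_server`, and `fetch_status` are hypothetical and chosen for illustration. The port is passed as a parameter so the sketch runs without the SDK; inside an Actor it would come from `Actor.configuration.container_port`.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical shared state: how many items have been processed so far.
state = {"processed_items": 0}


class ProgressHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the current processed-item count."""

    def do_GET(self) -> None:
        body = f"Processed items: {state['processed_items']}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Suppress per-request logging to keep the output quiet.
        pass


def start_server(port: int) -> ThreadingHTTPServer:
    """Start the server in a daemon thread and return it.

    Assumption: in an Actor, `port` would be
    `Actor.configuration.container_port` (4321 when running locally).
    """
    server = ThreadingHTTPServer(("", port), ProgressHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(port: int) -> str:
    """Send one GET request to the local server and return the body."""
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request("GET", "/")
        return connection.getresponse().read().decode()
    finally:
        connection.close()
```

Running the server in a background thread leaves the main flow free to do the actual work and update the counter. Calling `start_server(4321)` and opening `http://localhost:4321` matches the local default port described above.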
In this guide, you learned how to run a web server inside your Apify Actor. By leveraging the container URL and port provided by the platform, you can expose HTTP endpoints for monitoring, reporting, or serving content during Actor execution. If you have questions or need assistance, feel free to reach out on our [GitHub](https://github.com/apify/apify-sdk-python) or join our [Discord community](https://discord.com/invite/jyEM2PRvMU).