Commit 916a62d

vdusek and claude authored
docs: standardize language, differentiate guides, and add closing notes (#851)
## Summary

- Rewrite opening paragraphs for 5 concept pages (proxy management, interacting with other Actors, API access, logging, configuration) to follow a consistent "what it does + why it matters" pattern
- Standardize all 7 guide titles from gerund form ("Using X") to imperative form ("Use X"), including "Running webserver" → "Run a web server"
- Differentiate Playwright and Selenium feature lists: Playwright now highlights auto-waiting, locator API, and network interception; Selenium highlights its broad ecosystem, WebDriver protocol, and flexible selection strategies
- Standardize example intro phrasing to "The following example shows..." across Crawlee, Scrapy, and webserver guides; fix a stray backtick typo in the Crawlee guide
- Remove duplicate opening sentence in the Crawlee guide
- Add closing sentences with API reference/docs links to 5 concept pages that ended abruptly after code blocks

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent bbd0d21 commit 916a62d

13 files changed, with 41 additions and 37 deletions

docs/02_concepts/05_proxy_management.mdx

Lines changed: 4 additions & 3 deletions
@@ -13,10 +13,9 @@ import ApifyProxyConfig from '!!raw-loader!roa-loader!./code/05_apify_proxy_conf
 import CustomProxyFunctionExample from '!!raw-loader!roa-loader!./code/05_custom_proxy_function.py';
 import ProxyActorInputExample from '!!raw-loader!roa-loader!./code/05_proxy_actor_input.py';
 import ProxyHttpxExample from '!!raw-loader!roa-loader!./code/05_proxy_httpx.py';
+import ApiLink from '@site/src/components/ApiLink';

-[IP address blocking](https://en.wikipedia.org/wiki/IP_address_blocking) is one of the oldest and most effective ways of preventing access to a website. It is therefore paramount for a good web scraping library to provide easy to use but powerful tools which can work around IP blocking. The most powerful weapon in your anti IP blocking arsenal is a [proxy server](https://en.wikipedia.org/wiki/Proxy_server).
-
-With the Apify SDK, you can use your own proxy servers, proxy servers acquired from third-party providers, or you can rely on [Apify Proxy](https://apify.com/proxy) for your scraping needs.
+The Apify SDK provides built-in proxy management through the <ApiLink to="class/ProxyConfiguration">`ProxyConfiguration`</ApiLink> class, supporting both [Apify Proxy](https://apify.com/proxy) and custom proxy servers. Proxies are essential for web scraping to avoid [IP address blocking](https://en.wikipedia.org/wiki/IP_address_blocking) and distribute requests across multiple addresses.

 ## Quick start

@@ -107,3 +106,5 @@ Make sure you have the `httpx` library installed:
 ```bash
 pip install httpx
 ```
+
+For full details on proxy configuration options, see the <ApiLink to="class/ProxyConfiguration">`ProxyConfiguration`</ApiLink> API reference and the [Apify Proxy documentation](https://docs.apify.com/proxy).
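
The rewritten opening paragraph mentions distributing requests across multiple addresses. As a rough illustration of that idea only, here is a minimal standard-library sketch of round-robin rotation over a list of custom proxy URLs. It does not show how `ProxyConfiguration` is implemented; the helper name `make_proxy_rotator` and the `example.com` proxy URLs are hypothetical placeholders.

```python
from itertools import cycle
from typing import Iterator


def make_proxy_rotator(proxy_urls: list[str]) -> Iterator[str]:
    """Hypothetical helper: yield the given proxy URLs endlessly, in round-robin order."""
    if not proxy_urls:
        raise ValueError('At least one proxy URL is required.')
    return cycle(proxy_urls)


# Placeholder proxy URLs (assumptions for illustration, not real servers).
PROXY_URLS = [
    'http://proxy-1.example.com:8000',
    'http://proxy-2.example.com:8000',
    'http://proxy-3.example.com:8000',
]

rotator = make_proxy_rotator(PROXY_URLS)

# Pick a proxy for each of five hypothetical requests; the list wraps around
# after the third request, so consecutive requests use different addresses.
assignments = [next(rotator) for _ in range(5)]
```

In a real Actor, the documented `ProxyConfiguration` class is the place to look for this behavior; the sketch only makes the rotation concept concrete.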

docs/02_concepts/06_interacting_with_other_actors.mdx

Lines changed: 4 additions & 1 deletion
@@ -10,8 +10,9 @@ import InteractingStartExample from '!!raw-loader!roa-loader!./code/06_interacti
 import InteractingCallExample from '!!raw-loader!roa-loader!./code/06_interacting_call.py';
 import InteractingCallTaskExample from '!!raw-loader!roa-loader!./code/06_interacting_call_task.py';
 import InteractingMetamorphExample from '!!raw-loader!roa-loader!./code/06_interacting_metamorph.py';
+import ApiLink from '@site/src/components/ApiLink';

-There are several methods that interact with other Actors and Actor tasks on the Apify platform.
+The Apify SDK lets you start, call, and transform (metamorph) other Actors directly from your Actor code. This is useful for composing complex workflows from smaller, reusable Actors.

 ## Actor start

@@ -50,3 +51,5 @@ For example, imagine you have an Actor that accepts a hotel URL on input, and th
 <RunnableCodeBlock className="language-python" language="python">
 {InteractingMetamorphExample}
 </RunnableCodeBlock>
+
+For the full list of methods for interacting with other Actors, see the <ApiLink to="class/Actor">`Actor`</ApiLink> API reference.

docs/02_concepts/07_webhooks.mdx

Lines changed: 2 additions & 0 deletions
@@ -30,3 +30,5 @@ To ensure that duplicate ad-hoc webhooks won't get created in a case of Actor re
 <RunnableCodeBlock className="language-python" language="python">
 {WebhookPreventingExample}
 </RunnableCodeBlock>
+
+For more information about webhooks, including event types and payloads, see the [Apify webhooks documentation](https://docs.apify.com/platform/integrations/webhooks).

docs/02_concepts/08_access_apify_api.mdx

Lines changed: 3 additions & 3 deletions
@@ -9,9 +9,7 @@ import RunnableCodeBlock from '@site/src/components/RunnableCodeBlock';
 import ActorClientExample from '!!raw-loader!roa-loader!./code/08_actor_client.py';
 import ActorNewClientExample from '!!raw-loader!roa-loader!./code/08_actor_new_client.py';

-The Apify SDK contains many useful features for making Actor development easier. However, it does not cover all the features the Apify API offers.
-
-For working with the Apify API directly, you can use the provided instance of the [Apify API Client](https://docs.apify.com/api/client/python) library.
+The Apify SDK provides a built-in instance of the [Apify API Client](https://docs.apify.com/api/client/python) for accessing Apify platform features beyond what the SDK covers directly.

 ## Actor client

@@ -30,3 +28,5 @@ If you want to create a completely new instance of the client, for example, to g
 <RunnableCodeBlock className="language-python" language="python">
 {ActorNewClientExample}
 </RunnableCodeBlock>
+
+For the full API client documentation, see the [Apify API Client for Python](https://docs.apify.com/api/client/python).

docs/02_concepts/09_logging.mdx

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ import LoggerUsageExample from '!!raw-loader!roa-loader!./code/09_logger_usage.p
 import RedirectLog from '!!raw-loader!roa-loader!./code/09_redirect_log.py';
 import RedirectLogExistingRun from '!!raw-loader!roa-loader!./code/09_redirect_log_existing_run.py';

-The Apify SDK is logging useful information through the [`logging`](https://docs.python.org/3/library/logging.html) module from Python's standard library, into the logger with the name `apify`.
+The Apify SDK logs through Python's standard [`logging`](https://docs.python.org/3/library/logging.html) module, using the `apify` logger. Configuring log levels and formatting helps you debug Actors locally and monitor them on the platform.

 ## Automatic configuration
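
Because the new logging intro states that the SDK logs through the standard `logging` module under the `apify` logger, a manual setup can be sketched with the standard library alone. The format string and the child logger name `apify.example` are assumptions chosen for illustration; the SDK's own automatic configuration may differ.

```python
import logging

# The logger name `apify` comes from the logging page; everything configured
# on it also applies to its child loggers.
apify_logger = logging.getLogger('apify')
apify_logger.setLevel(logging.DEBUG)

# Send records to stderr with a simple, illustrative format.
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(levelname)s %(name)s: %(message)s'))
apify_logger.addHandler(handler)

# `apify.example` is a hypothetical child logger name. It has no level of its
# own, so it inherits DEBUG from `apify` and its records reach the handler above.
child_logger = logging.getLogger('apify.example')
child_logger.debug('Debug output is now visible.')
```

Raising the level on the `apify` logger (for example to `logging.WARNING`) quiets all of its child loggers in the same way.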

docs/02_concepts/10_configuration.mdx

Lines changed: 4 additions & 3 deletions
@@ -7,10 +7,9 @@ description: Customize Actor behavior through the Configuration class or environ
 import RunnableCodeBlock from '@site/src/components/RunnableCodeBlock';

 import ConfigExample from '!!raw-loader!roa-loader!./code/10_config.py';
+import ApiLink from '@site/src/components/ApiLink';

-The [`Actor`](../../reference/class/Actor) class gets configured using the [`Configuration`](../../reference/class/Configuration) class, which initializes itself based on the provided environment variables.
-
-If you're using the Apify SDK in your Actors on the Apify platform, or Actors running locally through the Apify CLI, you don't need to configure the `Actor` class manually, unless you have some specific requirements, everything will get configured automatically.
+The <ApiLink to="class/Actor">`Actor`</ApiLink> class is configured through the <ApiLink to="class/Configuration">`Configuration`</ApiLink> class, which reads its settings from environment variables. When running on the Apify platform or through the Apify CLI, configuration is automatic — manual setup is only needed for custom requirements.

 If you need some special configuration, you can adjust it either through the `Configuration` class directly, or by setting environment variables when running the Actor locally.

@@ -33,3 +32,5 @@ This Actor run will not persist its local storages to the filesystem:
 ```bash
 APIFY_PERSIST_STORAGE=0 apify run
 ```
+
+For the full list of configuration options, see the <ApiLink to="class/Configuration">`Configuration`</ApiLink> API reference.
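
To make the "reads its settings from environment variables" idea concrete, the following standard-library sketch shows one way a boolean setting such as `APIFY_PERSIST_STORAGE` could be interpreted. The helper `read_bool_env` is hypothetical, and the accepted spellings (`1` and `true`) are an assumption; the real `Configuration` class may parse values differently.

```python
import os


def read_bool_env(name: str, default: bool) -> bool:
    """Hypothetical helper: interpret an environment variable as a boolean.

    Unset or empty variables fall back to `default`. Treating only '1' and
    'true' as truthy is an assumption made for this sketch.
    """
    raw = os.environ.get(name)
    if raw is None or raw.strip() == '':
        return default
    return raw.strip().lower() in {'1', 'true'}


# Simulate the effect of `APIFY_PERSIST_STORAGE=0 apify run` from the page's
# example by setting the variable inside this process.
os.environ['APIFY_PERSIST_STORAGE'] = '0'
persist_storage = read_bool_env('APIFY_PERSIST_STORAGE', default=True)
```

With the variable set to `0`, `persist_storage` is `False`; with the variable unset, the helper returns the supplied default.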

docs/03_guides/01_beautifulsoup_httpx.mdx

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 id: beautifulsoup-httpx
-title: Using BeautifulSoup with HTTPX
+title: Use BeautifulSoup with HTTPX
 description: Build an Apify Actor that scrapes web pages using BeautifulSoup and HTTPX.
 ---

docs/03_guides/02_parsel_impit.mdx

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 id: parsel-impit
-title: Using Parsel with Impit
+title: Use Parsel with Impit
 description: Build an Apify Actor that scrapes web pages using Parsel selectors and the Impit HTTP client.
 ---

docs/03_guides/03_playwright.mdx

Lines changed: 6 additions & 5 deletions
@@ -1,6 +1,6 @@
 ---
 id: playwright
-title: Using Playwright
+title: Use Playwright
 description: Build an Apify Actor that scrapes dynamic web pages using Playwright browser automation.
 ---

@@ -19,10 +19,11 @@ In this guide, you'll learn how to use [Playwright](https://playwright.dev) for

 Some of the key features of Playwright for web scraping include:

-- **Cross-browser support** - Playwright supports the latest versions of major browsers like Chrome, Firefox, and Safari, so you can choose the one that suits your needs the best.
-- **Headless mode** - Playwright can run in headless mode, meaning that the browser window is not visible on your screen while it is scraping, which can be useful for running scraping tasks in the background or in containers without a display.
-- **Powerful selectors** - Playwright provides a variety of powerful selectors that allow you to target specific elements on a web page, including CSS selectors, XPath, and text matching.
-- **Emulation of user interactions** - Playwright allows you to emulate user interactions like clicking, scrolling, filling out forms, and even typing in text, which can be useful for scraping websites that have dynamic content or require user input.
+- **Cross-browser support** - Playwright supports Chromium, Firefox, and WebKit with a single API, ensuring consistent behavior across all browsers.
+- **Auto-waiting** - Playwright automatically waits for elements to be ready before performing actions, reducing flaky scripts and eliminating the need for manual sleep calls.
+- **Headless and headful modes** - Playwright can run with or without a visible browser window, making it suitable for both local development and containerized environments.
+- **Powerful selectors** - Playwright provides CSS selectors, XPath, text matching, and its own resilient locator API for targeting elements on a page.
+- **Network interception** - Playwright can intercept and modify network requests, allowing you to block unnecessary resources or mock API responses during scraping.

 To create Actors which use Playwright, start from the [Playwright & Python](https://apify.com/templates/categories/python) Actor template.

docs/03_guides/04_selenium.mdx

Lines changed: 6 additions & 10 deletions
@@ -1,6 +1,6 @@
 ---
 id: selenium
-title: Using Selenium
+title: Use Selenium
 description: Build an Apify Actor that scrapes dynamic web pages using Selenium WebDriver.
 ---

@@ -16,15 +16,11 @@ In this guide, you'll learn how to use [Selenium](https://www.selenium.dev/) for

 Some of the key features of Selenium for web scraping include:

-- **Cross-browser support** - Selenium supports the latest versions of major browsers like Chrome, Firefox, and Safari,
-so you can choose the one that suits your needs the best.
-- **Headless mode** - Selenium can run in headless mode,
-meaning that the browser window is not visible on your screen while it is scraping,
-which can be useful for running scraping tasks in the background or in containers without a display.
-- **Powerful selectors** - Selenium provides a variety of powerful selectors that allow you to target specific elements on a web page,
-including CSS selectors, XPath, and text matching.
-- **Emulation of user interactions** - Selenium allows you to emulate user interactions like clicking, scrolling, filling out forms,
-and even typing in text, which can be useful for scraping websites that have dynamic content or require user input.
+- **Broad ecosystem** - Selenium has a large community and extensive documentation, with support for multiple programming languages beyond Python.
+- **WebDriver protocol** - Selenium uses the W3C WebDriver protocol, providing standardized browser automation that works with Chrome, Firefox, Edge, and Safari.
+- **Headless and headful modes** - Selenium can run with or without a visible browser window, making it suitable for both local development and containerized environments.
+- **Flexible element selection** - Selenium provides CSS selectors, XPath, ID, class name, and other strategies for locating elements on a page.
+- **User interaction emulation** - Selenium allows you to emulate user actions like clicking, scrolling, filling out forms, and typing, which is useful for scraping dynamic websites.

 To create Actors which use Selenium, start from the [Selenium & Python](https://apify.com/templates/categories/python) Actor template.
