The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python. It provides useful features like Actor lifecycle management, local storage emulation, and Actor event handling.
12
+
The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python. It gives you everything you need to build an Actor and run it both locally and on the [Apify platform](https://docs.apify.com/platform), including:
13
+
14
+
- **Actor lifecycle management** — initialization, graceful shutdown, status messages, rebooting, and metamorphing.
15
+
- **Storage access** — datasets, key-value stores, and request queues, with automatic local emulation when running outside the platform.
16
+
- **Actor input** — convenient access to the Actor input, including automatic decryption of secret fields.
17
+
- **Events & state persistence** — react to platform events (system info, migration, abort) and persist state across migrations and restarts.
18
+
- **Proxy management** — Apify Proxy and custom proxies, with session and tiered-proxy support.
19
+
- **Platform interaction** — start, call, and abort other Actors and tasks, create webhooks, and reach the full Apify API client.
20
+
- **Monetization** — charge users with the pay-per-event pricing model.
21
+
- **Framework integrations** — first-class support for [Crawlee](../guides/crawlee) and [Scrapy](../guides/scrapy), with guides for [Playwright](../guides/playwright) and others.
13
22
14
23
<CodeBlock className="language-python">
15
24
{IntroductionExample}
@@ -29,7 +38,7 @@ Explore the Guides section in the sidebar for a deeper understanding of the SDK'
29
38
30
39
## Installation
31
40
32
-
The Apify SDK for Python requires Python version 3.10 or above. It is typically installed when you create a new Actor project using the [Apify CLI](https://docs.apify.com/cli). To install it manually in an existing project, use:
41
+
The Apify SDK for Python requires Python version 3.11 or above. It is typically installed when you create a new Actor project using the [Apify CLI](https://docs.apify.com/cli). To install it manually in an existing project, use:
docs/01_introduction/quick-start.mdx (+6 -3: 6 additions & 3 deletions)
@@ -59,15 +59,15 @@ The Actor's runtime dependencies are specified in the `requirements.txt` file, w
59
59
The Actor's source code is in the `src` folder. This folder contains two important files:
60
60
61
61
- `main.py` - which contains the main function of the Actor
62
-
- `__main__.py` - which is the entrypoint of the Actor package setting up the Actor [logger](../concepts/logging) and executing the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).
62
+
- `__main__.py` - which is the entrypoint of the Actor package, executing the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).
63
63
64
64
<Tabs>
65
65
<TabItem value="main.py" label="main.py" default>
66
66
<CodeBlock className="language-python">
67
67
{MainExample}
68
68
</CodeBlock>
69
69
</TabItem>
70
-
<TabItem value="__main__.py" label="__main.py__">
70
+
<TabItem value="__main__.py" label="__main__.py">
71
71
<CodeBlock className="language-python">
72
72
{UnderscoreMainExample}
73
73
</CodeBlock>
@@ -79,13 +79,15 @@ We recommend keeping the entrypoint for the Actor in the `src/__main__.py` file.
79
79
80
80
## Next steps
81
81
82
+
Now that you can create and run an Actor locally, explore the rest of the SDK's features and its framework integrations.
83
+
82
84
### Concepts
83
85
84
86
To learn more about the features of the Apify SDK and how to use them, check out the Concepts section in the sidebar:
85
87
86
88
- [Actor lifecycle](../concepts/actor-lifecycle)
87
89
- [Actor input](../concepts/actor-input)
88
-
- [Working with storages](../concepts/storages)
90
+
- [Storages](../concepts/storages)
89
91
- [Actor events & state persistence](../concepts/actor-events)
90
92
- [Proxy management](../concepts/proxy-management)
91
93
- [Interacting with other Actors](../concepts/interacting-with-other-actors)
@@ -94,6 +96,7 @@ To learn more about the features of the Apify SDK and how to use them, check out
docs/02_concepts/01_actor_lifecycle.mdx (+1 -1: 1 addition & 1 deletion)
@@ -106,4 +106,4 @@ Update the status only when the user's understanding of progress changes - avoid
106
106
107
107
## Conclusion
108
108
109
-
This page has presented the full Actor lifecycle: initialization, execution, error handling, rebooting, shutdown and status messages. You've seen how the SDK supports both context-based and manual control patterns. For deeper dives, explore the <ApiLink to="">reference docs</ApiLink>, [guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx), and [platform documentation](https://docs.apify.com/platform).
109
+
This page has presented the full Actor lifecycle: initialization, execution, error handling, rebooting, shutdown and status messages. You've seen how the SDK supports both context-based and manual control patterns. For deeper dives, explore the <ApiLink to="class/Actor">`Actor`</ApiLink> API reference, [guides](../guides/beautifulsoup-httpx), and [platform documentation](https://docs.apify.com/platform).
docs/02_concepts/02_actor_input.mdx (+5 -1: 5 additions & 1 deletion)
@@ -12,7 +12,7 @@ import ApiLink from '@theme/ApiLink';
12
12
13
13
The Actor gets its [input](https://docs.apify.com/platform/actors/running/input) from the input record in its default [key-value store](https://docs.apify.com/platform/storage/key-value-store).
14
14
15
-
To access it, instead of reading the record manually, you can use the <ApiLink to="class/Actor#get_input">`Actor.get_input`</ApiLink> convenience method. It will get the input record key from the Actor configuration, read the record from the default key-value store, and decrypt any [secret input fields](https://docs.apify.com/platform/actors/development/secret-input).
15
+
To access it, instead of reading the record manually, you can use the <ApiLink to="class/Actor#get_input">`Actor.get_input`</ApiLink> convenience method. It will get the input record key from the Actor configuration, read the record from the default key-value store, and decrypt any [secret input fields](https://docs.apify.com/platform/actors/development/secret-input).
16
16
17
17
For example, if an Actor received a JSON input with two fields, `{ "firstNumber": 1, "secondNumber": 2 }`, this is how you might process it:
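A minimal sketch of that processing, assuming the `apify` package is installed; the `add_numbers` helper is a hypothetical name introduced here for illustration and is not part of the SDK:

```python
def add_numbers(actor_input: dict) -> int:
    """Sum the two numeric fields of the example input, defaulting each to 0."""
    first_number = actor_input.get('firstNumber', 0)
    second_number = actor_input.get('secondNumber', 0)
    return first_number + second_number


async def main() -> None:
    # Imported inside main() so the helper above works without the SDK installed.
    from apify import Actor

    async with Actor:
        # Actor.get_input() may return None when no input record exists,
        # so fall back to an empty dict.
        actor_input = await Actor.get_input() or {}
        Actor.log.info('Sum: %s', add_numbers(actor_input))
```

To try it outside a generated project, you could run `main()` with `asyncio.run(main())`.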
No special handling is needed in your code — when you call <ApiLink to="class/Actor#get_input">`Actor.get_input`</ApiLink>, encrypted fields are automatically decrypted using the Actor's private key, which is provided by the platform via environment variables. You receive the plaintext values directly.
36
36
37
+
## Conclusion
38
+
39
+
This page has shown how to read Actor input with <ApiLink to="class/Actor#get_input">`Actor.get_input`</ApiLink>, how to load URL sources with <ApiLink to="class/ApifyRequestList">`ApifyRequestList`</ApiLink>, and how secret input fields are decrypted automatically when you read them.
40
+
37
41
For more details on Actor input and how to define input schemas, see the [Actor input](https://docs.apify.com/platform/actors/running/input) and [input schema](https://docs.apify.com/platform/actors/development/input-schema) documentation on the Apify platform.
docs/02_concepts/03_storages.mdx (+10 -6: 10 additions & 6 deletions)
@@ -1,6 +1,6 @@
1
1
---
2
2
id: storages
3
-
title: Working with storages
3
+
title: Storages
4
4
description: Use datasets, key-value stores, and request queues to persist Actor data.
5
5
---
6
6
@@ -45,11 +45,11 @@ Each dataset item, key-value store record, or request in a request queue is then
45
45
46
46
When developing locally, opening any storage will by default use local storage. To change this behavior and to use remote storage you have to use `force_cloud=True` argument in <ApiLink to="class/Actor#open_dataset">`Actor.open_dataset`</ApiLink>, <ApiLink to="class/Actor#open_request_queue">`Actor.open_request_queue`</ApiLink> or <ApiLink to="class/Actor#open_key_value_store">`Actor.open_key_value_store`</ApiLink>. Proper use of this argument allows you to work with both local and remote storages.
47
47
48
-
Calling another remote Actor and accessing its default storage is typical use-case for using `force-cloud=True` argument to open remote Actor's storages.
48
+
Calling another remote Actor and accessing its default storage is a typical use-case for using `force_cloud=True` argument to open remote Actor's storages.
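As a rough sketch of that use case, assuming the `apify` package is installed and that the run object returned by `Actor.call` exposes `default_dataset_id`; the `read_remote_items` function name is hypothetical:

```python
async def read_remote_items(actor_id: str) -> list[dict]:
    """Call another Actor and read its default dataset from the platform."""
    # Imported lazily so this module can be loaded without the SDK installed.
    from apify import Actor

    async with Actor:
        # Start the other Actor on the platform and wait for it to finish.
        run = await Actor.call(actor_id)
        if run is None:
            return []

        # force_cloud=True opens the remote dataset even during a local run.
        dataset = await Actor.open_dataset(
            id=run.default_dataset_id,
            force_cloud=True,
        )
        page = await dataset.get_data()
        return page.items
```

Without `force_cloud=True`, a local run would look for the dataset in local storage instead of on the platform.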
49
49
50
50
### Local storage persistence
51
51
52
-
By default, the storage contents are persisted across multiple Actor runs. To clean up the Actor storages before the running the Actor, use the `--purge` flag of the [`apify run`](https://docs.apify.com/cli/docs/reference#apify-run) command of the Apify CLI.
52
+
By default, the storage contents are persisted across multiple Actor runs. To clean up the Actor storages before running the Actor, use the `--purge` flag of the [`apify run`](https://docs.apify.com/cli/docs/reference#apify-run) command of the Apify CLI.
53
53
54
54
```bash
55
55
apify run --purge
@@ -106,8 +106,8 @@ To get an iterator of the data, you can use the <ApiLink to="class/Dataset#itera
106
106
### Exporting items
107
107
108
108
You can also export the dataset items into a key-value store, as either a CSV or a JSON record,
109
-
using the <ApiLink to="class/Dataset#export_to_csv">`Dataset.export_to_csv`</ApiLink>
110
-
or <ApiLink to="class/Dataset#export_to_json">`Dataset.export_to_json`</ApiLink> method.
109
+
using the <ApiLink to="class/Dataset#export_to">`Dataset.export_to`</ApiLink> method with the
110
+
`content_type` argument set to `'csv'` or `'json'`.
@@ -183,6 +183,10 @@ To check if all the requests in the queue are handled, you can use the <ApiLink
183
183
184
184
## Storage clients
185
185
186
-
Behind the scenes, the SDK uses storage clients to communicate with the storage backend. The appropriate client is selected automatically based on the runtime environment — on the Apify platform, data is persisted via the Apify API, while local runs use the filesystem. For most use cases, you don't need to think about storage clients at all. If you want to learn more about how storage clients work, the available implementations, or how to configure them, see the [Crawlee storage clients guide](https://crawlee.dev/python/docs/guides/storage-clients). The Apify-specific clients are available in the `apify.storage_clients` module.
186
+
Behind the scenes, the SDK uses storage clients to communicate with the storage backend. The appropriate client is selected automatically based on the runtime environment — on the Apify platform, data is persisted via the Apify API, while local runs use the filesystem. For most use cases, you don't need to think about storage clients at all. To learn about the available implementations, how to switch between a single and shared request queue, or how to configure a custom client, see the [Storage clients](./storage-clients) page. For a deeper look at how storage clients work internally, see the [Crawlee storage clients guide](https://crawlee.dev/python/docs/guides/storage-clients).
187
+
188
+
## Conclusion
189
+
190
+
This page has covered the three storage types — datasets, key-value stores, and request queues — how they are emulated on the local filesystem, how to open named and unnamed storages, and how to read from and write to each through the `Actor` shortcuts and the storage classes.
187
191
188
192
For comprehensive information about storage on the Apify platform, see the [storage documentation](https://docs.apify.com/platform/storage), including the pages on [datasets](https://docs.apify.com/platform/storage/dataset), [key-value stores](https://docs.apify.com/platform/storage/key-value-store), and [request queues](https://docs.apify.com/platform/storage/request-queue).
docs/02_concepts/04_actor_events.mdx (+18 -14: 18 additions & 14 deletions)
@@ -14,6 +14,8 @@ During its runtime, the Actor receives Actor events sent by the Apify platform o
14
14
15
15
## Event types
16
16
17
+
A listener can optionally receive a single argument — a Pydantic model with the event's data. The table below lists the events, the type of that data object, and when each event is emitted.
18
+
17
19
<table>
18
20
<thead>
19
21
<tr>
@@ -25,25 +27,23 @@ During its runtime, the Actor receives Actor events sent by the Apify platform o
Emitted by the SDK (not the platform) when the Actor is about to exit. You can use this event to perform final cleanup tasks,
79
79
such as closing external connections or sending notifications, before the Actor shuts down.
@@ -103,4 +103,8 @@ You can optionally specify a `key` (the key-value store key under which the stat
103
103
{UseStateExample}
104
104
</RunnableCodeBlock>
105
105
106
+
## Conclusion
107
+
108
+
This page has described the events emitted during a run — `SYSTEM_INFO`, `MIGRATING`, `ABORTING`, `PERSIST_STATE`, and `EXIT` — how to handle them with <ApiLink to="class/Actor#on">`Actor.on`</ApiLink>, and how to persist state automatically with <ApiLink to="class/Actor#use_state">`Actor.use_state`</ApiLink>.
109
+
106
110
For more details on platform events and state persistence, see the [system events](https://docs.apify.com/platform/actors/development/programming-interface/system-events) and [state persistence](https://docs.apify.com/platform/actors/development/state-persistence) documentation on the Apify platform.
docs/02_concepts/05_proxy_management.mdx (+9 -3: 9 additions & 3 deletions)
@@ -22,7 +22,7 @@ The Apify SDK provides built-in proxy management through the <ApiLink to="class/
22
22
23
23
If you want to use Apify Proxy locally, make sure that you run your Actors via the Apify CLI and that you are [logged in](https://docs.apify.com/cli/docs/installation#login-with-your-apify-account) with your Apify account in the CLI.
@@ -38,7 +38,7 @@ If you want to use Apify Proxy locally, make sure that you run your Actors via t
38
38
39
39
All your proxy needs are managed by the <ApiLink to="class/ProxyConfiguration">`ProxyConfiguration`</ApiLink> class. You create an instance using the <ApiLink to="class/Actor#create_proxy_configuration">`Actor.create_proxy_configuration()`</ApiLink> method. Then you generate proxy URLs using the <ApiLink to="class/ProxyConfiguration#new_url">`ProxyConfiguration.new_url()`</ApiLink> method.
40
40
41
-
### Apify proxy vs. your own proxies
41
+
### Apify Proxy vs. your own proxies
42
42
43
43
The `ProxyConfiguration` class covers both Apify Proxy and custom proxy URLs, so that you can easily switch between proxy providers. However, some features of the class are available only to Apify Proxy users, mainly because Apify Proxy is what one would call a super-proxy. It's not a single proxy server, but an API endpoint that allows connection through millions of different IP addresses. So the class essentially has two modes: Apify Proxy or Your proxy.
44
44
@@ -54,7 +54,7 @@ When no `session_id` is provided, your custom proxy URLs are rotated round-robin
54
54
{ProxyRotationExample}
55
55
</RunnableCodeBlock>
56
56
57
-
### Apify proxy configuration
57
+
### Apify Proxy configuration
58
58
59
59
With Apify Proxy, you can select specific proxy groups to use, or countries to connect from. For even finer control, you can also target a specific subdivision (e.g. a US state) using the `subdivision_code` parameter alongside `country_code`. This allows you to get better proxy performance after some initial research.
60
60
@@ -106,6 +106,8 @@ You can then use that input to create the proxy configuration:
106
106
107
107
## Using the generated proxy URLs
108
108
109
+
`ProxyConfiguration` only generates proxy URLs — it does not make requests itself. Pass a generated URL to whichever HTTP client your Actor uses to route requests through the proxy.
110
+
109
111
### HTTPX
110
112
111
113
To use the generated proxy URLs with the `httpx` library, use the [`proxies`](https://www.python-httpx.org/advanced/#http-proxying) argument:
@@ -120,4 +122,8 @@ Make sure you have the `httpx` library installed:
120
122
pip install httpx
121
123
```
122
124
125
+
## Conclusion
126
+
127
+
This page has explained how to manage proxies with the <ApiLink to="class/ProxyConfiguration">`ProxyConfiguration`</ApiLink> class — using Apify Proxy or your own servers, keeping sessions sticky across requests, configuring tiered proxy rotation, and feeding proxy settings from Actor input.
128
+
123
129
For full details on proxy configuration options, see the <ApiLink to="class/ProxyConfiguration">`ProxyConfiguration`</ApiLink> API reference and the [Apify Proxy documentation](https://docs.apify.com/proxy).