Commit 10203bc

refactor!: Adapt to apify-client v3 (#719)
### Description

- This PR updates the SDK to work with `apify-python-client` v3, which introduces fully typed API clients generated from OpenAPI specifications.
- See apify/apify-client-python#604 for more details.

### Issues

- Closes: #736
- Closes: #770
- Closes: #697
- Closes: #853

### Testing

- The existing SDK tests pass with `apify-python-client` v3.
1 parent a6c1a73 commit 10203bc

54 files changed

Lines changed: 1391 additions & 840 deletions

.rules.md

Lines changed: 6 additions & 5 deletions
@@ -4,7 +4,7 @@ This file provides guidance to programming agents when working with code in this

 ## Project Overview

-The Apify SDK for Python (`apify` package on PyPI) is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python. It provides Actor lifecycle management, storage access (datasets, key-value stores, request queues), event handling, proxy configuration, and pay-per-event charging. It builds on top of the [Crawlee](https://crawlee.dev/python) web scraping framework and the [Apify API Client](https://docs.apify.com/api/client/python). Supports Python 3.10–3.14. Build system: hatchling.
+The Apify SDK for Python (`apify` package on PyPI) is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python. It provides Actor lifecycle management, storage access (datasets, key-value stores, request queues), event handling, proxy configuration, and pay-per-event charging. It builds on top of the [Crawlee](https://crawlee.dev/python) web scraping framework and the [Apify API Client](https://docs.apify.com/api/client/python). Supports Python 3.11–3.14. Build system: hatchling.

 ## Common Commands

@@ -46,7 +46,7 @@ uv run poe e2e-tests
 ## Code Style

 - **Formatter/Linter**: Ruff (line length 120, single quotes for inline, double quotes for docstrings)
-- **Type checker**: ty (targets Python 3.10)
+- **Type checker**: ty (targets Python 3.11)
 - **All ruff rules enabled** with specific ignores — see `pyproject.toml` `[tool.ruff.lint]` for the full ignore list
 - Tests are exempt from docstring rules (`D`), assert warnings (`S101`), and private member access (`SLF001`)
 - Unused imports are allowed in `__init__.py` files (re-exports)
@@ -71,7 +71,7 @@ uv run poe e2e-tests

 - **`_proxy_configuration.py`**: `ProxyConfiguration` manages Apify proxy setup (residential, datacenter, groups, country targeting).

-- **`_models.py`**: Pydantic models for API data structures (Actor runs, webhooks, pricing info, etc.).
+- **`_webhook.py`**: The `Webhook` dataclass (ad-hoc / persistent webhook definition) and the `to_client_representations` helper. Response and data models are no longer defined in the SDK — they come from `apify-client` v3 (e.g. `Run`, the Actor pricing-info models).

 ### Storage Clients (`src/apify/storage_clients/`)

@@ -101,8 +101,9 @@ Optional integration (`apify[scrapy]` extra) providing Scrapy scheduler, middlew
 ### Key Dependencies

 - **`crawlee`** — Base framework providing storage abstractions, event system, configuration, service locator pattern
-- **`apify-client`** — HTTP client for the Apify API (`ApifyClientAsync`)
-- **`apify-shared`** — Shared constants and utilities (`ApifyEnvVars`, `ActorEnvVars`, etc.)
+- **`apify-client`** — HTTP client for the Apify API (`ApifyClientAsync`); also the source of response and data models (`Run`, pricing info, webhook representations)
+
+The SDK no longer depends on `apify-shared`. The platform env-var enums (`ApifyEnvVars`, `ActorEnvVars`) are vendored in `apify._consts` and re-exported from the top-level `apify` package.

 ## Testing

docs/02_concepts/code/04_use_state.py

Lines changed: 3 additions & 1 deletion
@@ -9,8 +9,10 @@ async def main() -> None:
         # On restart or migration, the state is loaded from the KVS.
         state = await Actor.use_state(default_value={'processed_items': 0})

-        # Resume from previous state
+        # Resume from the persisted state (stored as JSON, so narrow the type).
         start_index = state['processed_items']
+        if not isinstance(start_index, int):
+            start_index = 0
         Actor.log.info(f'Resuming from item {start_index}')

         # Do some work and update the state — it is persisted automatically

docs/02_concepts/code/05_custom_proxy_function.py

Lines changed: 4 additions & 1 deletion
@@ -7,8 +7,11 @@

 async def custom_new_url_function(
     session_id: str | None = None,
-    _: Request | None = None,
+    request: Request | None = None,
 ) -> str | None:
+    # Pick a proxy URL based on the session and/or the request being proxied.
+    if request is not None:
+        Actor.log.debug(f'Selecting a proxy URL for {request.url}.')
     if session_id is not None:
         return f'http://my-custom-proxy-supporting-sessions.com?session-id={session_id}'
     return 'http://my-custom-proxy-not-supporting-sessions.com'

docs/02_concepts/code/07_webhook.py

Lines changed: 2 additions & 2 deletions
@@ -1,13 +1,13 @@
 import asyncio

-from apify import Actor, Webhook, WebhookEventType
+from apify import Actor, Webhook


 async def main() -> None:
     async with Actor:
         # Create a webhook that will be triggered when the Actor run fails.
         webhook = Webhook(
-            event_types=[WebhookEventType.ACTOR_RUN_FAILED],
+            event_types=['ACTOR.RUN.FAILED'],
             request_url='https://example.com/run-failed',
         )

docs/02_concepts/code/07_webhook_preventing.py

Lines changed: 5 additions & 4 deletions
@@ -1,18 +1,19 @@
 import asyncio

-from apify import Actor, Webhook, WebhookEventType
+from apify import Actor, Webhook


 async def main() -> None:
     async with Actor:
-        # Create a webhook that will be triggered when the Actor run fails.
+        # Create a webhook with an idempotency key to prevent duplicates on retries.
         webhook = Webhook(
-            event_types=[WebhookEventType.ACTOR_RUN_FAILED],
+            event_types=['ACTOR.RUN.FAILED'],
             request_url='https://example.com/run-failed',
+            idempotency_key=Actor.configuration.actor_run_id,
         )

         # Add the webhook to the Actor.
-        await Actor.add_webhook(webhook, idempotency_key=Actor.configuration.actor_run_id)
+        await Actor.add_webhook(webhook)

         # Raise an error to simulate a failed run.
         raise RuntimeError('I am an error and I know it!')

docs/04_upgrading/upgrading_to_v4.md

Lines changed: 130 additions & 0 deletions
@@ -68,3 +68,133 @@ run = await Actor.call('user/actor', timeout='inherit')
 The deprecated `latest_sdk_version`, `log_format`, and `standby_port` fields have been removed from `Configuration`:
 - In place of `standby_port`, use `web_server_port`.
 - `latest_sdk_version` and `log_format` don't have replacement. SDK version checking isn't supported for the Python SDK and the log format should be adjusted in code instead.
+
+## Built on `apify-client` v3
+
+The SDK is now built on [`apify-client`](https://docs.apify.com/api/client/python) v3 and no longer depends on `apify-shared`. The sections below cover the user-visible consequences; see the client's [Upgrading to v3](https://docs.apify.com/api/client/python/docs/upgrading/upgrading-to-v3) guide for the full list of changes in the client itself.
+
+### Environment variable enums moved
+
+If you imported the platform environment-variable enums from `apify_shared.consts` (`ApifyEnvVars`, `ActorEnvVars`), import them from `apify` instead — they are now vendored in the SDK and re-exported from the top-level package.
+
+```python
+# Before (v3)
+from apify_shared.consts import ApifyEnvVars
+
+# After (v4)
+from apify import ApifyEnvVars
+```
+
+## Typed responses
+
+`Actor.start`, `Actor.abort`, `Actor.call`, and `Actor.call_task` now return `apify_client._models.Run` instead of the SDK-side `ActorRun`. Both are [Pydantic](https://docs.pydantic.dev/latest/) models with the same snake_case fields, so field access is unchanged — only the type and import path differ. The SDK no longer ships its own response models (`apify._models` has been removed); response shapes come from `apify-client`.
+
+## Literal string aliases instead of StrEnum classes
+
+Generated enum-like types are now [`Literal`](https://docs.python.org/3/library/typing.html#typing.Literal) string aliases instead of `StrEnum` classes. Pass plain strings instead of enum members.
+
+- `apify.WebhookEventType` is now a `Literal[...]` instead of a `StrEnum`. Use plain string values (`'ACTOR.RUN.FAILED'`) instead of enum members.
+- `apify_shared.consts.ActorEventTypes` (a `StrEnum`) is replaced by `apify.ActorEventTypes`, now a `Literal['systemInfo', 'persistState', 'migrating', 'aborting']`. For runtime values, use `apify.Event` (re-exported from Crawlee) instead of enum members.
+
+```python
+# Before (v3)
+from apify import Actor
+from apify_shared.consts import ActorEventTypes
+
+Actor.on(ActorEventTypes.SYSTEM_INFO, callback)
+
+# After (v4)
+from apify import Actor, Event
+
+Actor.on(Event.SYSTEM_INFO, callback)
+```
+
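Code that used to check enum membership needs another way to validate a string, because a `Literal` alias has no members at runtime. A minimal sketch, assuming a hypothetical two-value stand-in alias (`EventTypeSketch`) rather than the real `apify.WebhookEventType`, whose full value list is not shown in this diff; `typing.get_args` is standard library:

```python
from typing import Literal, get_args

# Hypothetical stand-in for a generated Literal alias; not the real apify type.
EventTypeSketch = Literal['ACTOR.RUN.FAILED', 'ACTOR.RUN.SUCCEEDED']

# get_args() returns the strings listed in a Literal alias as a tuple.
ALLOWED_EVENT_TYPES = get_args(EventTypeSketch)


def is_known_event_type(value: str) -> bool:
    """Return True if `value` is one of the strings listed in the alias."""
    return value in ALLOWED_EVENT_TYPES


print(is_known_event_type('ACTOR.RUN.FAILED'))
print(is_known_event_type('ACTOR_RUN_FAILED'))
```

A static type checker still catches misspelled literals at call sites; the runtime check is only useful for strings that arrive from input or configuration.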
+## Actor pricing info models
+
+The Actor pricing-info models exposed through `Actor.configuration.actor_pricing_info` (`FreeActorPricingInfo`, `FlatPricePerMonthActorPricingInfo`, `PricePerDatasetItemActorPricingInfo`, `PayPerEventActorPricingInfo`, and the nested `ActorChargeEvent` / `PricingPerEvent`) are now thin subclasses of the corresponding `apify-client` models instead of standalone SDK copies. The discriminated-union shape is unchanged, so existing access (`pricing_model`, per-event titles and prices) keeps working; the models now expose the full `apify-client` field set, and a charge event's `event_price_usd` is optional (it is unset for tier-priced events). `ChargingManager.get_pricing_info()` is unchanged.
+
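Since `event_price_usd` may now be unset, code that formats or sums prices has to handle `None`. A minimal sketch, assuming a hypothetical stand-in dataclass (`ChargeEventSketch`); the field name `event_title` is an assumption, and the real `apify-client` model carries more fields:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChargeEventSketch:
    """Hypothetical stand-in for a charge event; not the apify-client model."""

    event_title: str  # assumed field name
    event_price_usd: Optional[float] = None  # unset for tier-priced events


def describe_price(event: ChargeEventSketch) -> str:
    """Format the flat price, or say that the event has no flat price."""
    if event.event_price_usd is None:
        return f'{event.event_title}: tier-priced'
    return f'{event.event_title}: {event.event_price_usd} USD'


print(describe_price(ChargeEventSketch('Result', 0.005)))
print(describe_price(ChargeEventSketch('Page opened')))
```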
+## `Webhook` API simplified
+
+The `Webhook` model has been slimmed down to only the fields a user sets when defining a webhook. Server-populated response fields (`id`, `created_at`, `modified_at`, `user_id`, `is_ad_hoc`, `condition`, `last_dispatch`, `stats`) and the unused `WebhookCondition` helper class have been removed. The `description` and `should_interpolate_strings` fields have also been removed — they are not part of the ad-hoc webhook representation (`event_types`, `request_url`, `payload_template`, `headers_template`) that `Actor.start` / `Actor.call` / `Actor.call_task` and `Actor.add_webhook` now send. `Webhook` is now a plain `@dataclass` instead of a Pydantic `BaseModel` — construct it with snake_case kwargs; `.model_dump()` / `.model_validate()` are gone.
+
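For code that previously serialised or copied a webhook through Pydantic methods, the standard `dataclasses` helpers are a possible replacement. A minimal sketch, assuming a hypothetical stand-in dataclass (`WebhookSketch`) with a few of the fields named above; it is not the real `apify.Webhook`, and the dictionary it produces is not claimed to match what the SDK sends to the API:

```python
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass
class WebhookSketch:
    """Hypothetical stand-in mirroring a few fields named in the text above."""

    event_types: list[str]
    request_url: str
    payload_template: Optional[str] = None


webhook = WebhookSketch(
    event_types=['ACTOR.RUN.FAILED'],
    request_url='https://example.com/run-failed',
)

# asdict() plays the role that .model_dump() had on a Pydantic model.
as_dict = asdict(webhook)

# replace() builds a modified copy and leaves the original untouched.
retargeted = replace(webhook, request_url='https://example.com/other')

print(as_dict)
print(retargeted.request_url)
```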
+The retry and idempotency kwargs that used to live on `Actor.add_webhook` have moved onto the `Webhook` instance itself.
+
+```python
+# Before (v3)
+from apify import Actor, Webhook
+
+await Actor.add_webhook(
+    Webhook(event_types=['ACTOR.RUN.FAILED'], request_url='https://example.com'),
+    ignore_ssl_errors=False,
+    do_not_retry=False,
+    idempotency_key='my-key',
+)
+
+# After (v4)
+from apify import Actor, Webhook
+
+await Actor.add_webhook(
+    Webhook(
+        event_types=['ACTOR.RUN.FAILED'],
+        request_url='https://example.com',
+        ignore_ssl_errors=False,
+        do_not_retry=False,
+        idempotency_key='my-key',
+    )
+)
+```
+
+The `idempotency_key` kwarg form on `Actor.add_webhook` still works for one more release but emits a `DeprecationWarning` and will be removed in v5.0. The `ignore_ssl_errors` and `do_not_retry` kwargs have been removed outright — set them on the `Webhook` instance.
+
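One way to find call sites that still use the deprecated kwarg is to escalate `DeprecationWarning` to an exception while tests run. A minimal sketch using only the standard `warnings` module; `add_webhook_sketch` is a hypothetical stand-in that warns the way the paragraph above describes, and its message text is invented for illustration:

```python
import warnings
from typing import Optional


def add_webhook_sketch(webhook: object, *, idempotency_key: Optional[str] = None) -> None:
    """Hypothetical stand-in for the deprecated kwarg form; not the SDK method."""
    if idempotency_key is not None:
        warnings.warn(
            'Pass idempotency_key on the Webhook instance instead.',
            DeprecationWarning,
            stacklevel=2,
        )


def uses_deprecated_kwarg() -> bool:
    """Return True if the call below triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Escalate DeprecationWarning to an exception inside this block only.
        warnings.simplefilter('error', DeprecationWarning)
        try:
            add_webhook_sketch(object(), idempotency_key='my-key')
        except DeprecationWarning:
            return True
    return False


print(uses_deprecated_kwarg())
```

The same escalation is available for a whole run through Python's `-W error::DeprecationWarning` command-line option.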
+`apify.WebhookCondition` is no longer exported; the SDK now binds the webhook to the current Actor run internally.
+
+The `webhooks` argument on `Actor.start`, `Actor.call`, and `Actor.call_task` still accepts `list[Webhook]` and the fields used at the call site (`event_types`, `request_url`, `payload_template`, `headers_template`) are unchanged.
+
+## `Actor.new_client`: `timeout` scales all tiers
+
+`apify-client` v3 split its single timeout into four tiers (short / medium / long / max). `Actor.new_client(timeout=...)` still takes a single `timedelta`; the SDK uses it as the medium-tier baseline and scales the other tiers proportionally (short = `timeout / 6`, long = `timeout * 12`, max = `timeout * 12`). The public signature is unchanged — no migration needed.
+
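The scaling rule can be worked through with plain `timedelta` arithmetic. A minimal sketch of the ratios stated above; the helper name `scale_timeout_tiers` and the dictionary shape are hypothetical, and the SDK's internal implementation is not shown in this diff:

```python
from datetime import timedelta


def scale_timeout_tiers(timeout: timedelta) -> dict[str, timedelta]:
    """Derive the four tiers from one baseline using the stated ratios."""
    return {
        'short': timeout / 6,
        'medium': timeout,
        'long': timeout * 12,
        'max': timeout * 12,
    }


tiers = scale_timeout_tiers(timedelta(seconds=30))
for name, value in tiers.items():
    print(f'{name}: {value.total_seconds():.0f} s')
```

With a 30-second baseline this gives 5 s, 30 s, 360 s, and 360 s for the short, medium, long, and max tiers.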
+## Using the client from `Actor.new_client`
+
+`Actor.new_client()` (and the `Actor.apify_client` property) now returns an `apify-client` v3 `ApifyClientAsync`. When you use that client directly, the client's v3 breaking changes apply — the most impactful ones are below. See the client's [Upgrading to v3](https://docs.apify.com/api/client/python/docs/upgrading/upgrading-to-v3) guide for the complete reference.
+
+### 404 raises `NotFoundError` on ambiguous endpoints
+
+Direct `.get(id)` and `.delete(id)` calls still swallow 404 into `None`. But where a 404 could mean either the parent or the sub-resource is missing, the client now raises `NotFoundError` instead of returning `None`.
+
+```python
+# Before (v3)
+client = Actor.new_client()
+
+# Returned None on 404.
+dataset = await client.run('some-run-id').dataset().get()
+
+# After (v4)
+from apify_client.errors import NotFoundError
+
+client = Actor.new_client()
+
+# Raises NotFoundError; handle it explicitly.
+try:
+    dataset = await client.run('some-run-id').dataset().get()
+except NotFoundError:
+    dataset = None
+```
+
+### Keyword-only arguments
+
+Secondary parameters on several client methods can no longer be passed positionally.
+
+```python
+# Before (v3)
+await client.key_value_store('my-store').set_record('my-key', {'data': 1}, 'application/json')
+await client.run('my-run').charge('my-event', 5)
+
+# After (v4)
+await client.key_value_store('my-store').set_record('my-key', {'data': 1}, content_type='application/json')
+await client.run('my-run').charge('my-event', count=5)
+```
+
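The failure mode for an unmigrated positional call is a `TypeError` raised at call time, which is how Python treats any parameter declared after a bare `*`. A minimal sketch of that mechanism; `set_record_sketch` is a hypothetical stand-in, not the client method, and its return value exists only to make the example observable:

```python
from typing import Any, Optional


def set_record_sketch(key: str, value: Any, *, content_type: Optional[str] = None) -> dict[str, Any]:
    """Hypothetical stand-in: parameters after `*` are keyword-only."""
    return {'key': key, 'value': value, 'content_type': content_type}


# The keyword form is accepted.
record = set_record_sketch('my-key', {'data': 1}, content_type='application/json')

# The positional form is rejected with TypeError when the call is made.
try:
    set_record_sketch('my-key', {'data': 1}, 'application/json')
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

print(record['content_type'], positional_rejected)
```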
+### Async `iterate_*` are no longer coroutine functions
+
+`DatasetClientAsync.iterate_items()` and `KeyValueStoreClientAsync.iterate_keys()` are now plain `def` functions returning `AsyncIterator[T]`. Consumer code (`async for ...`) is unchanged; if you annotate the call's return value, change `AsyncGenerator[T, None]` to `AsyncIterator[T]`.
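The shape described here, a plain `def` that hands back an async iterator, can be illustrated without the client. A minimal sketch, assuming a hypothetical stand-in (`iterate_items_sketch`) that yields integers instead of dataset items:

```python
import asyncio
import inspect
from collections.abc import AsyncIterator


def iterate_items_sketch(limit: int) -> AsyncIterator[int]:
    """Hypothetical stand-in: a plain `def` that returns an async iterator."""

    async def _generate() -> AsyncIterator[int]:
        for index in range(limit):
            yield index

    return _generate()


async def collect() -> list[int]:
    # No `await` on the call itself; `async for` consumes the iterator.
    return [item async for item in iterate_items_sketch(3)]


items = asyncio.run(collect())
print(items, inspect.iscoroutinefunction(iterate_items_sketch))
```

Annotating the result as `AsyncIterator[int]` matches what the function declares; the consuming `async for` loop looks the same as it would for an async generator.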

pyproject.toml

Lines changed: 2 additions & 3 deletions
@@ -34,8 +34,7 @@ keywords = [
     "scraping",
 ]
 dependencies = [
-    "apify-client>=2.3.0,<3.0.0",
-    "apify-shared>=2.0.0,<3.0.0",
+    "apify-client>=3.0.0,<4.0.0",
     "crawlee>=1.0.4,<2.0.0",
     "cachetools>=5.5.0",
     "cryptography>=42.0.0",
@@ -197,7 +196,7 @@ builtins-ignorelist = ["id"]

 [tool.ruff.lint.isort]
 known-local-folder = ["apify"]
-known-first-party = ["apify_client", "apify_shared", "crawlee"]
+known-first-party = ["apify_client", "crawlee"]

 [tool.ruff.lint.pylint]
 max-branches = 18

src/apify/__init__.py

Lines changed: 7 additions & 2 deletions
@@ -1,6 +1,6 @@
 from importlib import metadata

-from apify_shared.consts import WebhookEventType
+from apify_client._literals import WebhookEventType
 from crawlee import Request
 from crawlee.events import (
     Event,
@@ -14,13 +14,18 @@

 from apify._actor import Actor
 from apify._configuration import Configuration
-from apify._models import Webhook
+from apify._consts import ActorEnvVars, ApifyEnvVars
 from apify._proxy_configuration import ProxyConfiguration, ProxyInfo
+from apify._webhook import Webhook
+from apify.events._types import ActorEventTypes

 __version__ = metadata.version('apify')

 __all__ = [
     'Actor',
+    'ActorEnvVars',
+    'ActorEventTypes',
+    'ApifyEnvVars',
     'Configuration',
     'Event',
     'EventAbortingData',
