Commit 6aaf8b4

vdusek and claude authored
docs: Add missing content to concept pages (#699)
## Summary

- **Pagination page**: Added "Generator-based iteration" section documenting `iterate_items()` as the recommended approach, with sync/async code examples and a mention of `iterate_keys()` for key-value stores.
- **Async support & Error handling pages**: Added closing notes with `<ApiLink>` references to `ApifyClientAsync` and `ApifyApiError` so pages no longer end abruptly after the code block.
- **Timeouts page** (new): Added `docs/02_concepts/11_timeouts.mdx` covering the tiered timeout system (short/medium/long/no_timeout), global configuration via constructor, per-call overrides with `timedelta` or tier literals, and interaction with the retry system.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 905ab7c commit 6aaf8b4

8 files changed

+166
-6
lines changed

docs/02_concepts/01_async_support.mdx

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,6 @@ description: Use the async client for non-blocking API calls with Python asyncio
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import CodeBlock from '@theme/CodeBlock';
-
 import ApiLink from '@site/src/components/ApiLink';
 
 import AsyncSupportExample from '!!raw-loader!./code/01_async_support.py';
@@ -19,3 +18,5 @@ The following example shows how to run an Actor asynchronously and stream its lo
 <CodeBlock className="language-python">
 {AsyncSupportExample}
 </CodeBlock>
+
+For the full async client API, see the <ApiLink to="class/ApifyClientAsync">`ApifyClientAsync`</ApiLink> reference.

docs/02_concepts/04_error_handling.mdx

Lines changed: 2 additions & 1 deletion
@@ -7,7 +7,6 @@ description: Handle API errors with the ApifyApiError exception and automatic da
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import CodeBlock from '@theme/CodeBlock';
-
 import ApiLink from '@site/src/components/ApiLink';
 
 import ErrorAsyncExample from '!!raw-loader!./code/04_error_async.py';
@@ -27,3 +26,5 @@ When you use the Apify client, it automatically extracts all relevant data from
 </CodeBlock>
 </TabItem>
 </Tabs>
+
+For a complete list of error classes, see the <ApiLink to="class/ApifyApiError">`ApifyApiError`</ApiLink> reference.

docs/02_concepts/08_pagination.mdx

Lines changed: 24 additions & 4 deletions
@@ -13,6 +13,9 @@ import ApiLink from '@site/src/components/ApiLink';
 import PaginationAsyncExample from '!!raw-loader!./code/08_pagination_async.py';
 import PaginationSyncExample from '!!raw-loader!./code/08_pagination_sync.py';
 
+import IterateItemsAsyncExample from '!!raw-loader!./code/08_iterate_items_async.py';
+import IterateItemsSyncExample from '!!raw-loader!./code/08_iterate_items_sync.py';
+
 Most methods named `list` or `list_something` in the Apify client return a <ApiLink to="class/ListPage">`ListPage`</ApiLink> object. This object provides a consistent interface for working with paginated data and includes the following properties:
 
 - `items` - The main results you're looking for.
@@ -38,8 +41,25 @@ The following example shows how to fetch all items from a dataset using paginati
 </TabItem>
 </Tabs>
 
-This approach lets you efficiently retrieve large datasets through pagination while keeping memory usage under control.
+The <ApiLink to="class/ListPage">`ListPage`</ApiLink> interface offers several key benefits. Its consistent structure ensures predictable results for most `list` methods, providing a uniform way to work with paginated data. It also offers flexibility, allowing you to customize the `limit` and `offset` parameters to control data fetching according to your needs. Additionally, it provides scalability, enabling you to efficiently handle large datasets through pagination. This approach ensures efficient data retrieval while keeping memory usage under control, making it ideal for managing and processing large collections.
+
+## Generator-based iteration
+
+For most use cases, `iterate_items()` is the recommended way to process all items in a dataset. It handles pagination automatically using a Python generator, fetching items in batches behind the scenes so you don't need to manage offsets or limits yourself.
+
+<Tabs>
+<TabItem value="AsyncExample" label="Async client" default>
+<CodeBlock className="language-python">
+{IterateItemsAsyncExample}
+</CodeBlock>
+</TabItem>
+<TabItem value="SyncExample" label="Sync client">
+<CodeBlock className="language-python">
+{IterateItemsSyncExample}
+</CodeBlock>
+</TabItem>
+</Tabs>
+
+`iterate_items()` accepts the same filtering parameters as `list_items()` (`clean`, `fields`, `omit`, `unwind`, `skip_empty`, `skip_hidden`), so you can combine automatic pagination with data filtering.
 
-:::tip
-For most use cases, prefer `iterate_items()` which handles pagination automatically and yields items one by one.
-:::
+Similarly, `KeyValueStoreClient` provides an `iterate_keys()` method for iterating over all keys in a key-value store without manual pagination.
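
The generator-based pagination described in this page can be illustrated with a minimal, library-independent sketch. It is not the Apify client's implementation: `FAKE_DATASET`, `fetch_page`, and `iterate_all` are hypothetical names, and the in-memory list stands in for a remote dataset. The point is only to show how a generator can hide the `offset`/`limit` bookkeeping from the caller.

```python
from collections.abc import Iterator
from typing import Any

# Stand-in for a remote dataset (assumption); a real client would fetch pages over HTTP.
FAKE_DATASET = [{'id': i} for i in range(7)]


def fetch_page(offset: int, limit: int) -> list[dict[str, Any]]:
    """Hypothetical page fetcher: return at most `limit` items starting at `offset`."""
    return FAKE_DATASET[offset : offset + limit]


def iterate_all(page_size: int = 3) -> Iterator[dict[str, Any]]:
    """Yield every item, requesting the next page only once the current one is consumed."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            # An empty page means the dataset is exhausted.
            return
        yield from page
        offset += len(page)


# The caller sees a flat stream of items and never touches offsets or limits.
ids = [item['id'] for item in iterate_all()]
```

Because the generator is lazy, only one page is held in memory at a time, which is the same property the pagination page attributes to batch-wise fetching.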

docs/02_concepts/11_timeouts.mdx

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+---
+id: timeouts
+title: Timeouts
+description: Configure the tiered timeout system for controlling how long API requests can take.
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+import CodeBlock from '@theme/CodeBlock';
+import ApiLink from '@site/src/components/ApiLink';
+
+import TimeoutsAsyncExample from '!!raw-loader!./code/11_timeouts_async.py';
+import TimeoutsSyncExample from '!!raw-loader!./code/11_timeouts_sync.py';
+
+The Apify client uses a tiered timeout system to set appropriate time limits for different types of API requests. Each tier has a default value suited to its use case:
+
+| Tier | Default | Purpose |
+|---|---|---|
+| `short` | 5 seconds | Fast CRUD operations (get, update, delete) |
+| `medium` | 30 seconds | Batch, list, and data transfer operations |
+| `long` | 360 seconds | Long-polling, streaming, and heavy operations |
+| `no_timeout` | | Disables the timeout entirely |
+
+Every client method has a pre-assigned tier that matches the expected duration of the underlying API call. You generally don't need to change these unless you're working with unusually large payloads or slow network conditions.
+
+## Configuring default timeouts
+
+You can override the default values for each tier in the <ApiLink to="class/ApifyClient">`ApifyClient`</ApiLink> or <ApiLink to="class/ApifyClientAsync">`ApifyClientAsync`</ApiLink> constructor. The `timeout_max` parameter sets an upper cap on the timeout for any individual API request, limiting exponential growth during retries.
+
+<Tabs>
+<TabItem value="AsyncExample" label="Async client" default>
+<CodeBlock className="language-python">
+{TimeoutsAsyncExample}
+</CodeBlock>
+</TabItem>
+<TabItem value="SyncExample" label="Sync client">
+<CodeBlock className="language-python">
+{TimeoutsSyncExample}
+</CodeBlock>
+</TabItem>
+</Tabs>
+
+## Per-call overrides
+
+Most client methods accept a `timeout` parameter that overrides the default tier for that specific call. You can pass either a `timedelta` for an exact duration or a tier literal (`'short'`, `'medium'`, `'long'`, `'no_timeout'`) to switch tiers.
+
+```python
+from datetime import timedelta
+
+# Use an exact timeout for this call.
+client.dataset('id').list_items(timeout=timedelta(seconds=120))
+
+# Switch to a different tier.
+client.dataset('id').list_items(timeout='long')
+
+# Disable the timeout entirely.
+client.dataset('id').list_items(timeout='no_timeout')
+```
+
+## Interaction with retries
+
+Timeouts work together with the [retry system](/api/client/python/docs/concepts/retries). When a request times out, it counts as a failed attempt and triggers a retry (up to `max_retries`). The timeout applies to each individual attempt, not the total time across all retries.
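
Since the timeout applies per attempt, the worst-case time spent waiting on timeouts grows with the number of retries. The sketch below illustrates that arithmetic under stated assumptions: the timeout doubles on each retry (the growth factor is an assumption, the page only says growth is exponential and capped by `timeout_max`), one initial attempt is followed by up to `max_retries` retries, and delays between attempts are ignored. `attempt_timeouts` is a hypothetical helper, not part of the client.

```python
from datetime import timedelta


def attempt_timeouts(
    base: timedelta,
    timeout_max: timedelta,
    max_retries: int,
    growth: float = 2.0,  # assumed growth factor, not taken from the client
) -> list[timedelta]:
    """Return the timeout applied to each attempt: the first try plus each retry."""
    return [
        # Grow the timeout exponentially, but never beyond the configured cap.
        min(base * growth**attempt, timeout_max)
        for attempt in range(max_retries + 1)
    ]


schedule = attempt_timeouts(
    base=timedelta(seconds=30),
    timeout_max=timedelta(seconds=100),
    max_retries=3,
)

# Upper bound on time spent waiting for timeouts if every attempt times out.
worst_case = sum(schedule, timedelta())
```

With these illustrative numbers the per-attempt timeouts are 30, 60, 100, and 100 seconds, so a call whose every attempt times out could wait up to 290 seconds in total even though no single attempt exceeds `timeout_max`.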
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+from apify_client import ApifyClientAsync
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+async def main() -> None:
+    apify_client = ApifyClientAsync(TOKEN)
+    dataset_client = apify_client.dataset('dataset-id')
+
+    # Iterate through all items automatically.
+    async for item in dataset_client.iterate_items():
+        print(item)
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+from apify_client import ApifyClient
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+def main() -> None:
+    apify_client = ApifyClient(TOKEN)
+    dataset_client = apify_client.dataset('dataset-id')
+
+    # Iterate through all items automatically.
+    for item in dataset_client.iterate_items():
+        print(item)
+
+
+if __name__ == '__main__':
+    main()
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+from datetime import timedelta
+
+from apify_client import ApifyClientAsync
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+async def main() -> None:
+    # Configure default timeout tiers globally.
+    apify_client = ApifyClientAsync(
+        token=TOKEN,
+        timeout_short=timedelta(seconds=10),
+        timeout_medium=timedelta(seconds=60),
+        timeout_long=timedelta(seconds=600),
+        timeout_max=timedelta(seconds=600),
+    )
+
+    dataset_client = apify_client.dataset('dataset-id')
+
+    # Override the timeout for a single call using a timedelta.
+    items = await dataset_client.list_items(timeout=timedelta(seconds=120))
+
+    # Or use a tier literal to select a predefined timeout.
+    items = await dataset_client.list_items(timeout='long')
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+from datetime import timedelta
+
+from apify_client import ApifyClient
+
+TOKEN = 'MY-APIFY-TOKEN'
+
+
+def main() -> None:
+    # Configure default timeout tiers globally.
+    apify_client = ApifyClient(
+        token=TOKEN,
+        timeout_short=timedelta(seconds=10),
+        timeout_medium=timedelta(seconds=60),
+        timeout_long=timedelta(seconds=600),
+        timeout_max=timedelta(seconds=600),
+    )
+
+    dataset_client = apify_client.dataset('dataset-id')
+
+    # Override the timeout for a single call using a timedelta.
+    items = dataset_client.list_items(timeout=timedelta(seconds=120))
+
+    # Or use a tier literal to select a predefined timeout.
+    items = dataset_client.list_items(timeout='long')
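
The two per-call forms used in these examples (a `timedelta` or a tier literal) suggest a simple resolution step. The following is a minimal sketch of how such an argument could be mapped to a concrete duration, using the default values from the tier table on the timeouts page. `DEFAULT_TIERS` and `resolve_timeout` are hypothetical names for illustration and are not the client's internals.

```python
from datetime import timedelta
from typing import Literal, Optional, Union

Tier = Literal['short', 'medium', 'long', 'no_timeout']

# Defaults as listed in the tier table; None stands for "no timeout".
DEFAULT_TIERS: dict[str, Optional[timedelta]] = {
    'short': timedelta(seconds=5),
    'medium': timedelta(seconds=30),
    'long': timedelta(seconds=360),
    'no_timeout': None,
}


def resolve_timeout(
    timeout: Union[timedelta, Tier],
    tiers: dict[str, Optional[timedelta]] = DEFAULT_TIERS,
) -> Optional[timedelta]:
    """Map a per-call `timeout` argument to a duration; None means no limit."""
    if isinstance(timeout, timedelta):
        # An exact duration is used as given.
        return timeout
    if timeout not in tiers:
        raise ValueError(f'Unknown timeout tier: {timeout!r}')
    # A tier literal selects the configured value for that tier.
    return tiers[timeout]


exact = resolve_timeout(timedelta(seconds=120))
tiered = resolve_timeout('long')
unlimited = resolve_timeout('no_timeout')
```

Passing a customized `tiers` mapping mirrors what the constructor arguments `timeout_short`, `timeout_medium`, and `timeout_long` do in the examples: the literal stays the same while the duration behind it changes.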
