docs/core/map.md
33 additions & 82 deletions
@@ -28,8 +28,6 @@

 **Concurrency control** - Limiting how many items process simultaneously using `max_concurrency` in `MapConfig`.

-**Item batching** - Grouping multiple items together for processing as a single unit to optimize efficiency.
-
 **Completion criteria** - Rules that determine when a map operation succeeds or fails based on individual item results.

 [↑ Back to top](#table-of-contents)
@@ -42,7 +40,7 @@ Use map operations to:
 - Transform collections with automatic checkpointing
 - Process lists of items in parallel
 - Handle large datasets with resilience
-- Control concurrency and batching behavior
+- Control concurrency behavior
 - Define custom success/failure criteria

 Map operations use `context.map()` to process collections efficiently. Each item becomes an independent operation that executes in parallel with other items.
@@ -55,7 +53,6 @@ Map operations use `context.map()` to process collections efficiently. Each item
 - **Independent checkpointing** - Each item's result is saved separately
 - **Partial completion** - Completed items don't reprocess on replay
 - **Concurrency control** - Limit simultaneous processing with `max_concurrency`
-- **Batching support** - Group items for efficient processing
 **Use max_concurrency wisely** - Too much concurrency can overwhelm external services or exhaust Lambda resources. Start conservative and increase as needed.

-**Batch small operations** - If each item processes quickly (<100ms), batching reduces overhead:
-
-```python
-config = MapConfig(
-    item_batcher=ItemBatcher(max_items_per_batch=10)
-)
-```
-
 **Optimize map functions** - Keep map functions lightweight. Move heavy computation into steps within the map function.

 **Use appropriate completion criteria** - Fail fast with `tolerated_failure_count` to avoid processing remaining items when many fail.

 **Monitor checkpoint size** - Large result objects increase checkpoint size and Lambda memory usage. Return only necessary data.

-**Consider memory limits** - Processing thousands of items creates many checkpoints. Monitor Lambda memory and adjust batch size or concurrency.
+**Consider memory limits** - Processing thousands of items creates many checkpoints. Monitor Lambda memory and adjust concurrency.

-**Profile your workload** - Test with representative data to find optimal concurrency and batch settings.
+**Profile your workload** - Test with representative data to find optimal concurrency settings.

 [↑ Back to top](#table-of-contents)

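The "start conservative and increase as needed" advice for `max_concurrency` can be illustrated outside the SDK. The following is a minimal sketch that assumes plain Python with `concurrent.futures`; `run_bounded` and `process_item` are hypothetical names for illustration and are not part of `context.map()` or `MapConfig`. It caps how many items run at once and records the peak number of simultaneous workers, which is the kind of measurement that helps when profiling a workload.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Shared counters used only to observe how many items run simultaneously.
_lock = threading.Lock()
_active = 0
peak_active = 0


def process_item(item: int) -> int:
    """Hypothetical per-item work: square the input while tracking concurrency."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # stand-in for a call to an external service
    with _lock:
        _active -= 1
    return item * item


def run_bounded(items, func, max_concurrency):
    """Apply func to each item with at most max_concurrency running at once.

    Results come back in input order. This is an analogy for a concurrency
    limit, not the SDK's implementation.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(func, items))


results = run_bounded(range(20), process_item, max_concurrency=3)
```

Raising the limit in small steps while watching `peak_active`, latency, and downstream error rates is one way to find a setting that does not overwhelm external services.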
@@ -443,7 +398,7 @@ A: Map operations process a collection of similar items using the same function.

 **Q: How many items can I process?**

-A: There's no hard limit, but consider Lambda's memory and timeout constraints. For very large collections (10,000+ items), use batching or process in chunks.
+A: There's no hard limit, but consider Lambda's memory and timeout constraints. For very large collections (10,000+ items), consider processing in chunks.

 **Q: Do items process in order?**

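The updated answer suggests processing very large collections in chunks. A minimal sketch of that idea, assuming plain Python: `chunked` is a hypothetical helper, not an SDK function, and the chunk size of 10,000 is only an example value. Each chunk could then be handed to its own map operation.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # input exhausted
        yield chunk


# Split 25,000 items into chunks of up to 10,000 items each.
chunks = list(chunked(range(25_000), 10_000))
chunk_sizes = [len(chunk) for chunk in chunks]
```

Because the helper consumes its input lazily, it also works with generators that would not fit in memory as a single list.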
@@ -471,10 +426,6 @@ for item_result in batch_result.results:

 A: Yes, you can call `context.map()` inside a map function to process nested collections.

-**Q: How does batching work?**
-
-A: When you configure `item_batcher`, multiple items are grouped together and passed as a `BatchedInput` to your map function. Process all items in `batch.items`.
-
 **Q: What's the difference between serdes and item_serdes?**

 A: `item_serdes` serializes individual item results as they complete. `serdes` serializes the entire `BatchResult` at the end. Use both for custom serialization at different levels.