12 | 12 | * [Comparing](#comparing) |
13 | 13 | * [Aggregation](#aggregation) |
14 | 14 | * [Transforming](#transforming) |
| 15 | +* [Evaluation strategies](#evaluation-strategies) |
15 | 16 | * [FAQ](#faq) |
16 | 17 | * [License](#license) |
17 | 18 | * [Contributing](#contributing) |
@@ -101,24 +102,28 @@ final class Invoices extends Collection |
101 | 102 | } |
102 | 103 | ``` |
103 | 104 |
|
104 | | -<div id='writing'></div> |
105 | | - |
106 | | -#### Creating from a closure |
107 | | - |
108 | | -The `createLazyFromClosure` method creates a lazy collection backed by a closure that produces an iterable. The |
109 | | -closure is invoked each time the collection is iterated, enabling safe re-iteration over generators or other single-use |
110 | | -iterables. |
| 105 | +### Creating collections |
111 | 106 |
|
112 | 107 | ```php |
113 | 108 | use TinyBlocks\Collection\Collection; |
114 | 109 |
|
115 | | -$collection = Collection::createLazyFromClosure(factory: static function (): iterable { |
| 110 | +$eager = Collection::createFrom(elements: [1, 2, 3]); |
| 111 | + |
| 112 | +$eagerFromClosure = Collection::createFromClosure(factory: static function (): array { |
| 113 | + return [1, 2, 3]; |
| 114 | +}); |
| 115 | + |
| 116 | +$lazy = Collection::createLazyFrom(elements: [1, 2, 3]); |
| 117 | + |
| 118 | +$lazyFromClosure = Collection::createLazyFromClosure(factory: static function (): iterable { |
116 | 119 | yield 1; |
117 | 120 | yield 2; |
118 | 121 | yield 3; |
119 | 122 | }); |
120 | 123 | ``` |
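
The closure-based factories matter most for generators, which PHP allows to be traversed only once. The sketch below uses plain PHP rather than the library's classes to illustrate the idea: keeping the closure and invoking it on every traversal produces a fresh generator each time. `ReiterableSource` is a hypothetical name introduced for this example and is not part of the library.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library: wraps a factory closure and
 * invokes it on every traversal, so each pass receives a fresh iterable.
 */
final class ReiterableSource implements IteratorAggregate
{
    public function __construct(private readonly Closure $factory)
    {
    }

    public function getIterator(): Traversable
    {
        // Calling the factory again creates a brand-new generator.
        yield from ($this->factory)();
    }
}

$source = new ReiterableSource(static function (): iterable {
    yield 1;
    yield 2;
    yield 3;
});

// Both passes succeed because the closure is re-invoked for each one.
$firstPass = iterator_to_array($source, false);
$secondPass = iterator_to_array($source, false);
```

Passing a generator object directly instead of a closure would allow only the first pass; a second traversal of an already consumed generator throws an exception.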
121 | 124 |
|
| 125 | +<div id='writing'></div> |
| 126 | + |
122 | 127 | ## Writing |
123 | 128 |
|
124 | 129 | These methods enable adding, removing, and modifying elements in the Collection. |
@@ -407,6 +412,133 @@ These methods allow the Collection's elements to be transformed or converted int |
407 | 412 | $collection->toJson(keyPreservation: KeyPreservation::DISCARD); |
408 | 413 | ``` |
409 | 414 |
|
| 415 | +<div id='evaluation-strategies'></div> |
| 416 | + |
| 417 | +## Evaluation strategies |
| 418 | + |
| 419 | +The time and space complexity of every operation depends on the evaluation strategy chosen at creation time. |
| 420 | +Calling `createFrom`, `createFromEmpty`, or `createFromClosure` produces a collection backed by an `EagerPipeline`. |
| 421 | +Calling `createLazyFrom`, `createLazyFromEmpty`, or `createLazyFromClosure` produces a collection backed by a |
| 422 | +`LazyPipeline`. All subsequent operations on that collection inherit the behavior of the chosen pipeline. |
| 423 | + |
| 424 | +This is analogous to how `java.util.ArrayList` and `java.util.LinkedList` both implement `java.util.List`, but each |
| 425 | +operation has different costs depending on which concrete class backs the list. |
| 426 | + |
| 427 | +### Eager pipeline |
| 428 | + |
| 429 | +When the collection is created eagerly, elements are stored in a plain PHP array. This array is the source of truth |
| 430 | +for all operations. |
| 431 | + |
| 432 | +**Creation.** Factory methods like `createFrom` call `iterator_to_array` on the input, consuming all elements |
| 433 | +immediately. Time: O(n). Space: O(n). |
| 434 | + |
| 435 | +**Transforming operations.** Every call to a transforming method (`add`, `filter`, `map`, `sort`, etc.) calls |
| 436 | +`pipe()` internally, which executes `iterator_to_array($operation->apply($this->elements))`. This means the |
| 437 | +operation is applied to all elements immediately and the result is stored in a new array. The time cost depends |
| 438 | +on the operation (O(n) for filter, O(n log n) for sort), and the space cost is always O(n) because a new array |
| 439 | +is allocated. |
| 440 | + |
| 441 | +**Access operations.** Methods like `count`, `first`, `last`, and `getBy` read the internal array directly. |
| 442 | +`count` calls PHP's native `count()` on the array. `first` and `last` use `array_key_first` and `array_key_last`. |
| 443 | +`getBy` uses `array_key_exists`. All are O(1) time and O(1) space. |
| 444 | + |
| 445 | +**Terminal operations.** Methods like `contains`, `reduce`, `each`, `equals`, and `findBy` iterate over the |
| 446 | +collection. Since the elements are already materialized, the iteration itself is O(n). No additional |
| 447 | +materialization cost is incurred. |
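
The eager behaviour described above can be sketched in plain PHP. `EagerSketch` and its closure-based `pipe()` signature are assumptions made for illustration; the library's actual `EagerPipeline` and `Operation` types are not reproduced here.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for an eager pipeline; not the library's class. */
final class EagerSketch
{
    private function __construct(private readonly array $elements)
    {
    }

    public static function from(iterable $elements): self
    {
        // Creation consumes the whole input: O(n) time, O(n) space.
        return new self(is_array($elements) ? $elements : iterator_to_array($elements));
    }

    public function pipe(Closure $operation): self
    {
        // The operation runs over every element now; the result is a new array.
        $result = $operation($this->elements);

        return new self(is_array($result) ? $result : iterator_to_array($result));
    }

    public function count(): int
    {
        // O(1): PHP arrays store their size.
        return count($this->elements);
    }

    public function first(): mixed
    {
        // O(1): no iteration is needed to find the first key.
        $key = array_key_first($this->elements);

        return $key === null ? null : $this->elements[$key];
    }
}

$calls = 0;

$evens = EagerSketch::from([1, 2, 3, 4])->pipe(
    static function (iterable $elements) use (&$calls): Generator {
        foreach ($elements as $key => $element) {
            $calls++;
            if ($element % 2 === 0) {
                yield $key => $element;
            }
        }
    }
);

// The filter has already visited all four elements, before any read happens.
$callsAfterPipe = $calls;
```

In this sketch `$callsAfterPipe` is 4 as soon as `pipe()` returns, and the later `count()` and `first()` calls only read the stored array.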
| 448 | + |
| 449 | +### Lazy pipeline |
| 450 | + |
| 451 | +When the collection is created lazily, nothing is computed at creation time. The source (iterable or closure) is |
| 452 | +stored by reference, and operations are accumulated as stages in an array. |
| 453 | + |
| 454 | +**Creation.** Factory methods like `createLazyFrom` store a reference to the iterable. `createLazyFromClosure` |
| 455 | +stores the closure without invoking it. Time: O(1). Space: O(1). |
| 456 | + |
| 457 | +**Transforming operations.** Every call to a transforming method calls `pipe()`, which appends the operation to |
| 458 | +the internal `$stages` array. No elements are processed. Time: O(1). Space: O(1). The actual cost is deferred |
| 459 | +to the moment the collection is consumed. |
| 460 | + |
| 461 | +**Consumption.** When the collection is iterated (explicitly or through `count`, `toArray`, `reduce`, etc.), |
| 462 | +`process()` is called. It invokes the source closure (if applicable), then chains all stages into a generator |
| 463 | +pipeline. Elements flow one at a time through every stage: each element passes through stage 0, then stage 1, |
| 464 | +then stage 2, and so on, before the next element enters the pipeline. For k streaming stages, total time is |
| 465 | +O(n * k). |
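
The element-by-element flow can be made visible with a small trace. The sketch below is plain PHP under stated assumptions: `LazySketch`, its `fromClosure()` factory, and the closure-based stages are hypothetical stand-ins, not the library's `LazyPipeline` or `Operation` types.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for a lazy pipeline; not the library's class. */
final class LazySketch implements IteratorAggregate
{
    private function __construct(
        private readonly Closure $source,
        private readonly array $stages = []
    ) {
    }

    public static function fromClosure(Closure $source): self
    {
        // The closure is stored, not invoked.
        return new self($source);
    }

    public function pipe(Closure $stage): self
    {
        // Only records the stage; no element is touched.
        return new self($this->source, [...$this->stages, $stage]);
    }

    public function getIterator(): Traversable
    {
        // Invoke the source, then wrap it in one generator per stage.
        $elements = ($this->source)();
        foreach ($this->stages as $stage) {
            $elements = $stage($elements);
        }

        yield from $elements;
    }
}

$trace = [];

$double = static function (iterable $elements) use (&$trace): Generator {
    foreach ($elements as $element) {
        $trace[] = "double($element)";
        yield $element * 2;
    }
};

$increment = static function (iterable $elements) use (&$trace): Generator {
    foreach ($elements as $element) {
        $trace[] = "increment($element)";
        yield $element + 1;
    }
};

$pipeline = LazySketch::fromClosure(static fn(): array => [1, 2])
    ->pipe($double)
    ->pipe($increment);

$traceBeforeIteration = $trace; // nothing has run yet
$result = iterator_to_array($pipeline, false);
```

After iteration, `$trace` reads `double(1)`, `increment(2)`, `double(2)`, `increment(4)`: the first element passes through both stages before the second element enters the pipeline.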
| 466 | + |
| 467 | +**Access operations.** `count` calls `iterator_count`, which consumes the entire generator: O(n). `first` and |
| 468 | +`isEmpty` pull a single element from the generator: O(1) unless a barrier or a selective stage such as `filter` forces more of the source to be consumed. `last` and `getBy` iterate the generator: O(n) worst case. |
| 469 | + |
| 470 | +**Barrier operations.** Most operations are streaming: they process one element at a time without accumulating |
| 471 | +state. Two operations are exceptions. `sort` must consume all input (via `iterator_to_array`), sort it, then |
| 472 | +yield the sorted result: O(n log n) time, O(n) space. `groupBy` must accumulate all elements into a groups |
| 473 | +array, then yield: O(n) time, O(n) space. When a barrier exists in a lazy pipeline, it forces full evaluation |
| 474 | +of all preceding stages before any subsequent stage can process an element. This means that calling `first()` |
| 475 | +on a lazy collection that has a `sort()` in its pipeline still costs O(n log n), because the sort barrier must |
| 476 | +consume everything first. |
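
The effect of a barrier on `first()` can be observed by counting how many elements are pulled from the source. This is a minimal sketch using plain generator closures; `$source`, `$double`, `$sort`, and `$first` are hypothetical helpers written for this example, not library functions.

```php
<?php

declare(strict_types=1);

$pulled = 0;

// Source generator that counts how many elements downstream stages pull.
$source = static function () use (&$pulled): Generator {
    foreach ([5, 3, 8, 1] as $value) {
        $pulled++;
        yield $value;
    }
};

// Streaming stage: handles one element at a time.
$double = static function (iterable $elements): Generator {
    foreach ($elements as $element) {
        yield $element * 2;
    }
};

// Barrier stage: materializes the whole input before yielding anything.
$sort = static function (iterable $elements): Generator {
    $all = iterator_to_array($elements, false);
    sort($all);
    yield from $all;
};

// Returns the first yielded element, or null when there is none.
$first = static function (iterable $elements): mixed {
    foreach ($elements as $element) {
        return $element;
    }

    return null;
};

// Streaming only: a single element is pulled from the source.
$firstStreaming = $first($double($source()));
$pulledStreaming = $pulled;

// With a sort barrier: every element is pulled before the first is yielded.
$pulled = 0;
$firstSorted = $first($sort($double($source())));
$pulledSorted = $pulled;
```

Here the streaming pipeline pulls one element to answer `first`, while the pipeline containing the sort stage pulls all four.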
| 477 | + |
| 478 | +### Complexity reference |
| 479 | + |
| 480 | +The table below summarizes the time and space complexity of each method under both strategies. Each value was |
| 481 | +derived by tracing the execution path from `Collection` through the `Pipeline` into the underlying `Operation`. |
| 482 | +The column "Why" references the pipeline behavior described above. |
| 483 | + |
| 484 | +#### Factory methods |
| 485 | + |
| 486 | +| Method | Time | Space | Why | |
| 487 | +|-------------------------|------|-------|------------------------------------------------------| |
| 488 | +| `createFrom` | O(n) | O(n) | Calls `iterator_to_array` on the input. | |
| 489 | +| `createFromEmpty` | O(1) | O(1) | Creates an empty array. | |
| 490 | +| `createFromClosure` | O(n) | O(n) | Invokes the closure, then calls `iterator_to_array`. | |
| 491 | +| `createLazyFrom` | O(1) | O(1) | Stores the iterable reference without iterating. | |
| 492 | +| `createLazyFromEmpty` | O(1) | O(1) | Stores an empty array reference. | |
| 493 | +| `createLazyFromClosure` | O(1) | O(1) | Stores the closure without invoking it. | |
| 494 | + |
| 495 | +#### Transforming methods |
| 496 | + |
| 497 | +For lazy collections, all transforming methods are O(1) time and O(1) space at call time because `pipe()` only |
| 498 | +appends a stage. The cost shown below is for eager collections, where `pipe()` materializes immediately. |
| 499 | + |
| 500 | +| Method | Time | Space | Why | |
| 501 | +|-------------|------------|----------|------------------------------------------------------------------------------------------| |
| 502 | +| `add` | O(n + m) | O(n + m) | Yields all existing elements, then the m new ones. | |
| 503 | +| `merge` | O(n + m) | O(n + m) | Yields all elements from both collections. | |
| 504 | +| `filter` | O(n) | O(n) | Tests each element against the predicate. | |
| 505 | +| `map` | O(n * t) | O(n) | Applies t transformations to each element. | |
| 506 | +| `flatten` | O(n + s) | O(n + s) | Iterates each element; expands nested iterables by one level. s = total nested elements. | |
| 507 | +| `remove` | O(n) | O(n) | Tests each element for equality. | |
| 508 | +| `removeAll` | O(n) | O(n) | Tests each element against the predicate. | |
| 509 | +| `sort` | O(n log n) | O(n) | Materializes all elements, sorts via `uasort` or `ksort`, then yields. Barrier. | |
| 510 | +| `slice` | O(n) | O(n) | Iterates up to offset + length elements. | |
| 511 | +| `groupBy` | O(n) | O(n) | Accumulates all elements into a groups array, then yields. Barrier. | |
| 512 | + |
| 513 | +#### Access methods |
| 514 | + |
| 515 | +These delegate directly to the pipeline. The cost differs between eager and lazy because eager reads the |
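| 516 | +internal array, while lazy must evaluate the generator. The lazy costs assume streaming stages only; a barrier such as `sort` or `groupBy` adds its own cost. |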
| 517 | + |
| 518 | +| Method | Eager | Lazy | Why | |
| 519 | +|-----------|-------|------|------------------------------------------------------------------------| |
| 520 | +| `count` | O(1) | O(n) | Eager: `count($array)`. Lazy: `iterator_count($generator)`. | |
| 521 | +| `first` | O(1) | O(1) | Eager: `array_key_first`. Lazy: first yield from the generator. | |
| 522 | +| `last` | O(1) | O(n) | Eager: `array_key_last`. Lazy: iterates all to reach the last element. | |
| 523 | +| `getBy` | O(1) | O(n) | Eager: `array_key_exists`. Lazy: iterates until the index. | |
| 524 | +| `isEmpty` | O(1) | O(1) | Checks if the first element exists. | |
| 525 | + |
| 526 | +#### Terminal methods |
| 527 | + |
| 528 | +These iterate the collection to produce a result. The iteration itself costs the same under both strategies: |
| 529 | +eager collections walk the materialized array, while lazy collections additionally pay the deferred cost of their stages at this point. |
| 530 | + |
| 531 | +| Method | Time | Space | Why | |
| 532 | +|----------------|----------|-------|-----------------------------------------------------------------| |
| 533 | +| `contains` | O(n) | O(1) | Iterates until the element is found or the end is reached. | |
| 534 | +| `findBy` | O(n * p) | O(1) | Tests p predicates per element until a match. | |
| 535 | +| `each` | O(n * a) | O(1) | Applies a actions to every element. | |
| 536 | +| `equals` | O(n) | O(1) | Walks two generators in parallel, comparing element by element. | |
| 537 | +| `reduce` | O(n) | O(1) | Folds all elements into a single carry value. | |
| 538 | +| `joinToString` | O(n) | O(n) | Accumulates into an intermediate array, then calls `implode`. | |
| 539 | +| `toArray` | O(n) | O(n) | Iterates all elements into a new array. | |
| 540 | +| `toJson` | O(n) | O(n) | Calls `toArray`, then `json_encode`. | |
| 541 | + |
410 | 542 | <div id='faq'></div> |
411 | 543 |
|
412 | 544 | ## FAQ |
@@ -434,13 +566,12 @@ recreate the `Collection`. |
434 | 566 |
|
435 | 567 | ### 03. What is the difference between eager and lazy evaluation? |
436 | 568 |
|
437 | | -- **Eager evaluation** (`createFrom` / `createFromEmpty`): Elements are materialized immediately into an array, enabling |
438 | | - constant-time access by index, count, and repeated iteration. |
| 569 | +- **Eager evaluation** (`createFrom` / `createFromEmpty` / `createFromClosure`): Elements are materialized immediately |
| 570 | +  into an array, enabling constant-time `count`, `first`, `last`, and access by index, as well as repeated iteration. |
439 | 571 |
|
440 | 572 | - **Lazy evaluation** (`createLazyFrom` / `createLazyFromEmpty` / `createLazyFromClosure`): Elements are processed |
441 | | - on-demand through generators, |
442 | | - consuming memory only as each element is yielded. Ideal for large datasets or pipelines where not all elements need to |
443 | | - be materialized. |
| 573 | + on-demand through generators, consuming memory only as each element is yielded. Ideal for large datasets or pipelines |
| 574 | + where not all elements need to be materialized. |
444 | 575 |
|
445 | 576 | <div id='license'></div> |
446 | 577 |
|
|