
Commit 0f94840

zxqfd555 authored and Manul from Pathway committed
dynamic worker count scaling (#9730)
GitOrigin-RevId: 69c2c0dbbb8f551d71031a9efe25001f3b3d8a1b
1 parent fc7aa0d commit 0f94840

20 files changed: 751 additions & 54 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -8,7 +8,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html)
 ### Added
 - `pw.io.kafka.read` and `pw.io.kafka.write` connectors now support OAUTHBEARER authentication.
 - `pw.io.mongodb.write` connector now supports an `output_table_type` parameter with two modes: `stream_of_changes` (default) and `snapshot`. In `snapshot` mode, the connector maintains the current state of the Pathway table in MongoDB using the `_id` field as the primary key, while `stream_of_changes` preserves the existing behavior by writing all events with `time` and `diff` flags to reflect transactional minibatches and the nature of each change.
-
+- Workers can now automatically scale up or down based on pipeline load, using a configurable monitoring window. This feature requires persistence to be enabled and can be configured via `worker_scaling_enabled` and `workload_tracking_window_ms` in `pw.persistence.Config`. Please refer to the tutorial for more details.
 
 ## [0.29.0] - 2026-01-22
```

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
---
title: 'Dynamic Worker Scaling'
description: 'This page describes how to set up dynamic worker count scaling in Pathway'
---

# Dynamic Worker Scaling

Programs built with Pathway can scale their workload by splitting heavy pipelines into parallel units of execution, enabling effective parallelization and higher throughput for computationally intensive workloads.
To run computations in parallel, you can use the `pathway spawn` command and specify the number of workers to use.

This way:

- Each worker executes in parallel.
- Workers can be implemented either as **processes** or as **threads**.
- By increasing the number of workers, you can split computations into more parallel parts and achieve better throughput.
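For example, to start a pipeline on four workers (the script name here is a placeholder):

```bash
pathway spawn -n 4 python pipeline.py
```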
When you start your program this way, it runs **continuously with exactly `N` workers** for its entire lifetime. This works well for setups with a consistent load profile, but it has limitations when the load profile is volatile: if the system load is low, resources may be underutilized, while if the load increases, the fixed number of workers may become a bottleneck. In other words, the worker count doesn't adapt automatically: you have a **static execution model**.

In many real-world scenarios, it is desirable for the number of workers to be dynamic: under low load, the number of active workers should decrease to save resources, while under high load, it should increase proportionally to handle the demand. To support this use case, Pathway provides **built-in autoscaling mechanisms** that allow worker counts to grow and shrink automatically based on workload.

## How Dynamic Scaling Works

Understanding the basics of the scaling mechanism is useful for setting it up efficiently.
When you launch a Pathway computation via `pathway spawn`, an orchestrator creates and manages the workers. It receives information from the workers, based on which the number of workers can be adjusted.

Each worker, if configured, tracks its own load profile. This tracking is based on the time spent on computation over a sliding window (2 minutes by default, configurable by the user), as well as the worker's idle time and the intervals the scheduler expected the computations to take. After a full window has elapsed, a pattern of excessive idle time leads to worker termination, with an exit code informing the orchestrator that the engine should be scaled down. Conversely, if computations consistently fall behind, the orchestrator determines that the computation should be restarted with a larger number of workers.
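The exact accounting is internal to the engine, but conceptually the per-worker decision boils down to comparing compute time against idle time within the window. The following sketch is a mental model only; the names and thresholds are hypothetical, and the real engine also takes the scheduler's expected intervals into account:

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    compute_ms: float  # time spent computing within the sliding window
    idle_ms: float     # time spent parked/idle within the sliding window


def scaling_signal(stats: WindowStats, low: float = 0.2, high: float = 0.9) -> str:
    total = stats.compute_ms + stats.idle_ms
    if total == 0:
        return "keep"  # no observations yet
    utilization = stats.compute_ms / total
    if utilization < low:
        # Mostly idle: the worker exits with a code telling the
        # orchestrator to restart the computation with fewer workers.
        return "scale_down"
    if utilization > high:
        # Consistently behind: restart with more workers.
        return "scale_up"
    return "keep"
```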
After receiving such a signal, the orchestrator adjusts the number of workers and restarts the computation. The sharding and work-splitting mechanisms are updated for the new worker count.

To ensure computations resume from the point reached when the decision to change the number of workers was made, **data persistence is required**. Scaling is also configured within the data persistence settings.

Last but not least, note that the described procedure implies a full restart of the computation graph. Persistence mitigates this, but it does not eliminate restart costs.
### Worker Count Adjustment Rules

As mentioned, the number of workers is recalculated by the orchestration process. The following rules, illustrated by the sketch after this list, govern how the adjustment works:

- **Increasing workers:** The orchestrator doubles the current number of workers.
  - Example: If you have 4 workers and scaling up is required, the system will increase the count to 8.
- **Decreasing workers:** The orchestrator halves the current number of workers.
  - Example: If you have 4 workers and scaling down is required, the system will reduce the count to 2.

In any case, you can't have fewer than one worker. Therefore, even if the pipeline is very light and only one worker is running, the orchestrator cannot reduce the number further.
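These rules fit in a few lines of code. The helper below is purely illustrative, not part of the Pathway API; the `max_workers` cap mirrors the free-license limit described in the next subsection:

```python
def next_worker_count(current: int, scale_up: bool, max_workers: int = 8) -> int:
    if scale_up:
        # Scaling up doubles the worker count, up to the license cap.
        return min(current * 2, max_workers)
    # Scaling down halves the worker count, but never below one worker.
    return max(current // 2, 1)


assert next_worker_count(4, scale_up=True) == 8
assert next_worker_count(4, scale_up=False) == 2
assert next_worker_count(1, scale_up=False) == 1  # can't drop below one worker
assert next_worker_count(8, scale_up=True) == 8   # free tier caps at 8
```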
The scaling process only increases or decreases the number of **processes**; threads are **not used for dynamic scaling** in this mechanism. So if your initial configuration uses thread workers, or both threads and processes, scaling will only change the number of processes. For example, if you launched the computation with one process containing two thread workers, upscaling will lead to two processes with two thread workers each. Downscaling from this initial configuration, on the other hand, won't be possible, since the number of processes is already one.
### License Limitations

You need a Pathway license for scaling to work. You can obtain your free Pathway Scale license [here](/get-license). The page contains instructions for getting the license and using it in the pipeline.

Please note that the free scaling license limits the maximum worker count to **8**. On the free tier, once the system is running 8 workers, it won't scale up further, even if the computation falls behind; no scaling decision can push the worker count above 8.
## Configuring and Running

With these restrictions in mind, you are ready to configure and run auto-scaling.

First, you need to create a persistence configuration to preserve the computation state and progress between worker restarts; without it, the computation would start over from the beginning after every restart. A simple version looks as follows:
```python
import pathway as pw
from pathway.internals import api

persistence_config = pw.persistence.Config(
    backend=pw.persistence.Backend.filesystem(your_persistent_storage_path),  # e.g., /tmp/Pathway-Cache
    persistence_mode=api.PersistenceMode.OPERATOR_PERSISTING,
)
```
Please note the `persistence_mode` parameter: in high-load scenarios, it is crucial to use `OPERATOR_PERSISTING`. This mode allows the system to dump only the state of internal computation structures, avoiding the heavy recomputations that might otherwise be needed if upscaling becomes necessary.

However, this configuration does not yet include scaling settings; in this form, scaling remains disabled. You need to enable it by setting the `worker_scaling_enabled` flag:
```python
import pathway as pw
from pathway.internals import api

persistence_config = pw.persistence.Config(
    backend=pw.persistence.Backend.filesystem(your_persistent_storage_path),  # e.g., /tmp/Pathway-Cache
    persistence_mode=api.PersistenceMode.OPERATOR_PERSISTING,
    worker_scaling_enabled=True,
)
```
With this setting, the program will track the workload and be capable of scaling up and down. By default, statistics are computed over a two-minute window. You can change this by specifying the number of milliseconds in the `workload_tracking_window_ms` parameter:
```python
import pathway as pw
from pathway.internals import api

persistence_config = pw.persistence.Config(
    backend=pw.persistence.Backend.filesystem(your_persistent_storage_path),  # e.g., /tmp/Pathway-Cache
    persistence_mode=api.PersistenceMode.OPERATOR_PERSISTING,
    worker_scaling_enabled=True,
    workload_tracking_window_ms=300000,  # 5 minutes
)
```
Keep in mind that this value should not be too small. At startup, data sources may not kick off immediately and can take several seconds to begin providing data. During these initial seconds, the graph will be underloaded because there is no computation to perform without input. Therefore, make sure the window accommodates this startup period and is at least 20-30 seconds long.
Once you have configured the persistence settings, pass the object as the `persistence_config` parameter of the `pw.run` method:

```python
pw.run(persistence_config=persistence_config)
```
Finally, you can spawn the execution using a console command, for example: `pathway spawn -n 2 python pipeline.py`. This command starts the pipeline with two initial workers, each running as a separate process.
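Putting the pieces together, a complete `pipeline.py` might look like the sketch below. It is modeled on the load-test pipeline added in this commit; the storage path, the message rate, and the trivial length computation are illustrative choices, not requirements:

```python
import time

import pathway as pw
from pathway.internals import api


class NumberStream(pw.io.python.ConnectorSubject):
    """Emits a steady stream of JSON messages to keep the pipeline busy."""

    def run(self):
        n = 0
        while True:
            self.next_json({"number": n})
            n += 1
            time.sleep(0.001)  # roughly 1000 messages per second


table = pw.io.python.read(subject=NumberStream(), format="raw")
table = table.select(length=pw.apply(len, pw.this.data))
pw.io.null.write(table)

pw.run(
    persistence_config=pw.persistence.Config(
        backend=pw.persistence.Backend.filesystem("/tmp/pathway-cache"),
        persistence_mode=api.PersistenceMode.OPERATOR_PERSISTING,
        worker_scaling_enabled=True,
        workload_tracking_window_ms=120000,  # 2 minutes (the default)
    ),
)
```

Started with `pathway spawn -n 2 python pipeline.py`, this gives the orchestrator two process workers to start from; under sustained load it can double them, and under prolonged idleness halve them.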
## Conclusion

To manage a **dynamic number of workers**, follow these steps:

1. **Configure persistence**
   It is strongly recommended to use **operator persistence** (`OPERATOR_PERSISTING`) to ensure that computation state is safely stored between worker restarts and that the restarts are as fast as possible.

2. **Enable worker scaling**
   In your persistence configuration, set the `worker_scaling_enabled` flag. By default, scaling is disabled.

3. **Adjust the workload tracking window**
   Set an appropriate `workload_tracking_window_ms` to control how the orchestrator evaluates workload patterns, or leave the default of two minutes.

4. **Start the computation**
   Launch your pipeline with `pathway spawn`. You can also specify the **initial number of workers** at startup:

```bash
pathway spawn -n <initial_worker_count> python pipeline.py
```
If you have any questions, feel free to reach out on [Discord](http://discord.com/invite/pathway) or open an issue on our [GitHub](https://github.com/pathwaycom/pathway/issues/).

external/timely-dataflow/timely/src/worker.rs

Lines changed: 25 additions & 5 deletions
```diff
@@ -206,6 +206,17 @@ pub trait AsWorker : Scheduler {
     fn logging(&self) -> Option<crate::logging::TimelyLogger> { self.log_register().get("timely") }
 }
 
+/// Contains statistics from `step_or_park` method.
+#[derive(Debug)]
+pub struct WorkerStepStats {
+    /// Denotes if more computational steps are needed.
+    pub has_more_work: bool,
+
+    /// Contains the duration spent on computation, excluding
+    /// time spent during possible thread parking.
+    pub compute_duration: Duration,
+}
+
 /// A `Worker` is the entry point to a timely dataflow computation. It wraps a `Allocate`,
 /// and has a list of dataflows that it manages.
 pub struct Worker<A: Allocate> {
@@ -299,7 +310,7 @@ impl<A: Allocate> Worker<A> {
     ///     worker.step();
     /// });
     /// ```
-    pub fn step(&mut self) -> bool {
+    pub fn step(&mut self) -> WorkerStepStats {
         self.step_or_park(Some(Duration::from_secs(0)))
     }
 
@@ -330,7 +341,7 @@ impl<A: Allocate> Worker<A> {
     ///     worker.step_or_park(Some(Duration::from_secs(1)));
     /// });
     /// ```
-    pub fn step_or_park(&mut self, duration: Option<Duration>) -> bool {
+    pub fn step_or_park(&mut self, duration: Option<Duration>) -> WorkerStepStats {
 
         {   // Process channel events. Activate responders.
             let mut allocator = self.allocator.borrow_mut();
@@ -367,7 +378,7 @@ impl<A: Allocate> Worker<A> {
             (x, y) => x.or(y),
         };
 
-        if delay != Some(Duration::new(0,0)) {
+        let compute_duration = if delay != Some(Duration::new(0,0)) {
 
             // Log parking and flush log.
             if let Some(l) = self.logging().as_mut() {
@@ -381,8 +392,12 @@ impl<A: Allocate> Worker<A> {
 
             // Log return from unpark.
             self.logging().as_mut().map(|l| l.log(crate::logging::ParkEvent::unpark()));
+
+            // Nothing happens, the thread was parked all the time
+            Duration::ZERO
         }
         else {   // Schedule active dataflows.
+            let computation_started_at = Instant::now();
 
             let active_dataflows = &mut self.active_dataflows;
             self.activations
@@ -404,12 +419,17 @@ impl<A: Allocate> Worker<A> {
                     }
                 }
             }
-        }
+
+            computation_started_at.elapsed()
+        };
 
         // Clean up, indicate if dataflows remain.
         self.logging.borrow_mut().flush();
         self.allocator.borrow_mut().release();
-        !self.dataflows.borrow().is_empty()
+        WorkerStepStats {
+            has_more_work: !self.dataflows.borrow().is_empty(),
+            compute_duration
+        }
     }
 
     /// Calls `self.step()` as long as `func` evaluates to true.
```
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@

```python
import pytest

from pathway.tests.utils import UniquePortDispenser

# The configuration differs from other tests: each test must be given a unique
# range of 8 consecutive ports (the maximum number of workers), not just a
# single port as in other tests.
PORT_DISPENSER = UniquePortDispenser(
    range_start=1000,
    worker_range_size=600,
    step_size=8,
)


@pytest.fixture
def port(testrun_uid):
    yield PORT_DISPENSER.get_unique_port(testrun_uid)
```
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@

```python
import argparse
import json
import logging
import time

import pathway as pw
from pathway.internals import api


class StreamerSubject(pw.io.python.ConnectorSubject):

    def __init__(self, rate: int, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.rate = rate
        self.current_number = 10000000000000

    def run(self):
        while True:
            second_started_at = time.time()
            for index in range(self.rate):
                event_time = time.time()
                next_json = {
                    "number": self.current_number,
                }
                self.current_number += 1

                self.next_json(next_json)
                if index > 0 or index == self.rate - 1:
                    expected_duration = index * 1.0 / self.rate
                    actual_duration = event_time - second_started_at
                    if actual_duration > expected_duration + 0.1:
                        logging.warning(
                            "The streaming severely falls behind the target frequency"
                        )
                    elif actual_duration < expected_duration:
                        time.sleep(expected_duration - actual_duration)


@pw.udf(deterministic=True)
def is_prime(event_json) -> bool:
    number = json.loads(event_json)["number"]
    if number < 2:
        return False

    is_prime_flag = number % 2 != 0
    i = 3
    while i * i <= number and is_prime_flag:
        if number % i == 0:
            is_prime_flag = False
            break
        i += 2

    return is_prime_flag


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--rate", type=int, required=True)
    parser.add_argument("--persistent-storage-path", type=str, required=True)
    args = parser.parse_args()

    table = pw.io.python.read(subject=StreamerSubject(rate=args.rate), format="raw")
    table = table.select(prime=is_prime(pw.this.data))
    pw.io.null.write(table)

    pw.run(
        persistence_config=pw.persistence.Config(
            backend=pw.persistence.Backend.filesystem(args.persistent_storage_path),
            persistence_mode=api.PersistenceMode.OPERATOR_PERSISTING,
            worker_scaling_enabled=True,
            workload_tracking_window_ms=60000,
        ),
        monitoring_level=pw.MonitoringLevel.NONE,
    )
```
