Skip to content

feat: add --max-requests for worker process recycling#1354

Open
sansyrox wants to merge 6 commits into
mainfrom
feat/process-recycling
Open

feat: add --max-requests for worker process recycling#1354
sansyrox wants to merge 6 commits into
mainfrom
feat/process-recycling

Conversation

@sansyrox
Copy link
Copy Markdown
Member

@sansyrox sansyrox commented Mar 27, 2026

Summary

  • Adds --max-requests CLI option (and Config.max_requests) to recycle worker processes after N requests, similar to Gunicorn's max_requests. Helps contain memory leaks from long-running Python processes.
  • Adds a global atomic request counter in Rust (REQUEST_COUNT), incremented on every HTTP request in the index handler. Exposed to Python via get_request_count().
  • In child processes, a periodic check (every 5s via loop.call_later) stops the event loop when the threshold is exceeded, allowing the shutdown handler to fire and the process to exit cleanly.
  • In the parent (multi-process mode), a supervisor loop detects exited workers and respawns them with fresh state.
  • Includes graceful shutdown (terminate + join with timeout) since the supervisor loop requires it.

Test plan

  • Existing integration tests pass (no behavior change when max_requests is not set)
  • Manual test: start with --max-requests 50 --processes 2, send 100+ requests, verify server stays alive and worker PIDs change
  • Manual test: verify --max-requests with single process (no recycling, since single-process blocks in spawn_process)
  • Verify get_request_count() returns accurate count from Python

Made with Cursor

Summary by CodeRabbit

  • New Features

    • Added a --max-requests CLI option to recycle worker processes after a configurable number of handled requests.
    • Added a callable to report the number of requests handled by a worker.
  • Improvements

    • Workers now auto-recycle and are automatically replaced when the limit is set.
    • Server stops a worker once its request limit is reached.
    • Shutdown and signal handling for workers is more robust.

Add support for recycling worker processes after a configurable number
of requests, similar to Gunicorn's max_requests. This helps contain
memory leaks from long-running Python processes.

- Add global atomic request counter in Rust, incremented on every HTTP
  request and exposed to Python via get_request_count().
- Add --max-requests CLI argument to Config.
- In child processes, schedule a periodic check (every 5s) that stops
  the event loop when the request count exceeds the threshold.
- In the parent process (multi-process mode), run a supervisor loop
  that detects exited workers and respawns them with fresh state.
- Also includes graceful shutdown (terminate + join with timeout)
  since the supervisor loop requires it.

Made-with: Cursor
@vercel
Copy link
Copy Markdown

vercel Bot commented Mar 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
robyn Ready Ready Preview, Comment Apr 12, 2026 8:28pm

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a CLI --max-requests, threads it into the process pool, exposes a Rust-backed get_request_count() to Python, increments a global request counter per HTTP request, and implements per-worker request-count-based shutdown/replacement and coordinated termination.

Changes

Cohort / File(s) Summary
CLI & Config
robyn/argument_parser.py
Add --max-requests CLI flag validated as positive int; store on Config.max_requests (`int
Package Init
robyn/__init__.py
Forward self.config.max_requests into run_processes(...) when starting the server.
Process Pool
robyn/processpool.py
Add max_requests parameter to run_processes, init_processpool, spawn_process; implement SIGINT/SIGTERM shutdown flag, worker termination/join/kill escalation, and worker recycling by detecting exited workers, cloning socket and spawning replacements with max_requests.
Type Stubs
robyn/robyn.pyi
Add get_request_count() -> int; update Server.start(...) signature to accept `max_requests: int
Rust bindings & server
src/lib.rs, src/server.rs
Add atomic REQUEST_COUNT and get_request_count() accessor; Actix middleware to increment counter per request; expose accessor via PyO3; extend Server::start with max_requests and stop server when global count reaches limit.

Sequence Diagram

sequenceDiagram
    participant Client
    participant HTTP as Rust HTTP Server
    participant Worker as Python Worker Process
    participant PoolMgr as Process Pool Manager

    Client->>HTTP: send HTTP request
    HTTP->>HTTP: REQUEST_COUNT += 1
    HTTP-->>Client: respond

    loop periodic poll (if max_requests set)
        Worker->>HTTP: call get_request_count()
        HTTP-->>Worker: return N
        alt N >= max_requests
            Worker->>Worker: initiate graceful shutdown
            Worker-->>PoolMgr: exit observed
            PoolMgr->>PoolMgr: clone socket & spawn replacement (pass max_requests)
        else
            Worker->>Worker: continue serving
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I tally hops beneath the moonlit trees,
When counts grow tall I bow and take my leave.
New paws arrive to mind the busy glen,
We trade the watch and start the dance again.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main feature: adding a --max-requests CLI option for worker process recycling, which matches the primary objective.
Description check ✅ Passed The PR description covers the main changes, implementation details, and includes a comprehensive test plan addressing all key scenarios and functional areas.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/process-recycling

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
robyn/processpool.py (1)

135-151: ⚠️ Potential issue | 🔴 Critical

Don't enable recycling in the inline worker path.

When processes == 1 or on Windows, init_processpool() executes spawn_process() in the current process and returns no supervisor pool. Passing max_requests through this branch makes the only server process stop itself once the threshold is hit, with nothing to replace it. That contradicts the single-process behavior described in the PR objective.

🛠️ Minimal fix
     if sys.platform.startswith("win32") or processes == 1:
+        if max_requests and max_requests > 0:
+            logger.warning("--max-requests is ignored when worker recycling is unavailable.")
         spawn_process(
             directories,
             request_headers,
             routes,
             global_middlewares,
@@
             response_headers,
             excluded_response_headers_paths,
             client_timeout,
             keep_alive_timeout,
-            max_requests,
+            None,
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@robyn/processpool.py` around lines 135 - 151, In the branch guarded by "if
sys.platform.startswith('win32') or processes == 1" (the inline worker path),
stop passing the recycling/max-requests behavior into spawn_process; call
spawn_process without forwarding max_requests (or explicitly pass None/0) so the
single inline server cannot self-terminate once the threshold is hit. Update the
spawn_process invocation in that branch (referenced as spawn_process) to
omit/neutralize the max_requests argument while leaving the other parameters
unchanged so recycling only applies to the supervisor-managed pool.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@robyn/argument_parser.py`:
- Around line 87-94: The --max-requests argument currently accepts non-positive
integers which silently become "disabled" later; update the parser for the
argument (the parser.add_argument call that sets dest="max_requests") to
validate at parse time by rejecting values <= 0: implement a custom type or
argparse check that raises an argparse.ArgumentTypeError for values <= 0 so the
CLI fails fast with a clear error message (e.g., "max-requests must be a
positive integer"); ensure the validation is applied where max_requests is
defined so downstream code (e.g., logic in process pool that expects > 0) no
longer sees ambiguous inputs.

In `@robyn/processpool.py`:
- Around line 84-109: The worker-recycling loop is replacing dead Process
objects in process_pool without reaping them, causing zombie processes; modify
the loop in the block that iterates over process_pool (where Process is
constructed with target=spawn_process and socket.try_clone()) to call
process.join() on the exited process before creating/assigning new_process and
replacing process_pool[i]; ensure you still log the replacement (logger.info)
and then start new_process and assign it into process_pool[i] after the join
completes.

In `@src/server.rs`:
- Around line 490-491: The request counter (REQUEST_COUNT.fetch_add(...)) is
only called inside index(), so requests served by Files handlers, WebSocket
routes, or the default_service const-route path bypass it and under-report via
get_request_count(); move this increment into an app-wide Actix middleware (or
wrapper) that executes for every incoming HTTP request before route dispatch.
Implement a middleware struct (or use actix_web::middleware::Logger-style
middleware) that calls REQUEST_COUNT.fetch_add(1, Relaxed) for each request,
register it on the App (so it wraps all services including default_service,
Files handlers, and WS routes), and remove the manual increment from index() to
avoid double-counting.

---

Outside diff comments:
In `@robyn/processpool.py`:
- Around line 135-151: In the branch guarded by "if
sys.platform.startswith('win32') or processes == 1" (the inline worker path),
stop passing the recycling/max-requests behavior into spawn_process; call
spawn_process without forwarding max_requests (or explicitly pass None/0) so the
single inline server cannot self-terminate once the threshold is hit. Update the
spawn_process invocation in that branch (referenced as spawn_process) to
omit/neutralize the max_requests argument while leaving the other parameters
unchanged so recycling only applies to the supervisor-managed pool.
🪄 Autofix (Beta)

❌ Autofix failed (check again to retry)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b4ae53f6-8ba4-4604-83fe-879e3fa30025

📥 Commits

Reviewing files that changed from the base of the PR and between a54ff96 and 62ce87c.

📒 Files selected for processing (6)
  • robyn/__init__.py
  • robyn/argument_parser.py
  • robyn/processpool.py
  • robyn/robyn.pyi
  • src/lib.rs
  • src/server.rs

Comment thread robyn/argument_parser.py
Comment thread robyn/processpool.py
Comment thread robyn/processpool.py
Comment thread src/server.rs Outdated
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Mar 27, 2026

Merging this PR will not alter performance

✅ 189 untouched benchmarks


Comparing feat/process-recycling (cfe0649) with main (41c94df)

Open in CodSpeed

Replace `type=int` with a custom type callback that raises
argparse.ArgumentTypeError for zero and negative values.

Made-with: Cursor
… inline workers

- Call process.join() on exited workers before spawning replacements to
  prevent zombie processes in the recycling loop.
- Move REQUEST_COUNT increment from index() into an app-wide actix
  middleware so requests served by Files handlers, WebSocket routes, and
  the const-route fast path are all counted toward max_requests.
- Omit max_requests when running in single-process/Windows mode where
  there is no supervisor to respawn the worker.

Made-with: Cursor
The Python-side _check_max_requests callback (loop.call_later) was
scheduled after server.start(), but start() blocks internally on
event_loop.run_forever() and never returns — making lines 270-283
dead code that never executes.

Move the check into a background task on the actix runtime where it
has direct access to REQUEST_COUNT and can call server_handle.stop(true)
for a graceful shutdown. This also eliminates the cross-runtime
coordination problem between Python asyncio and Rust actix.

Made-with: Cursor
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 28, 2026

Caution

Failed to replace (edit) comment. This is likely due to insufficient permissions or the comment being deleted.

Error details
{"name":"HttpError","status":404,"request":{"method":"PATCH","url":"https://api.github.com/repos/sparckles/Robyn/issues/comments/4148992722","headers":{"accept":"application/vnd.github.v3+json","user-agent":"octokit.js/0.0.0-development octokit-core.js/7.0.6 Node.js/24","authorization":"token [REDACTED]","content-type":"application/json; charset=utf-8"},"body":{"body":"<!-- This is an auto-generated comment: autofix status by CodeRabbit -->\n> [!NOTE]\n> Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.\n\n⚠️ **Branch updated during autofix.**\n\nThe branch was updated while autofix was in progress. Please try again.\n<!-- autofix-run-id: 734fa9c1-3009-4e47-94d9-86eb86e84333 -->"},"request":{"retryCount":1,"signal":{}}},"response":{"url":"https://api.github.com/repos/sparckles/Robyn/issues/comments/4148992722","status":404,"headers":{"access-control-allow-origin":"*","access-control-expose-headers":"ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset","content-encoding":"gzip","content-security-policy":"default-src 'none'","content-type":"application/json; charset=utf-8","date":"Sat, 28 Mar 2026 23:22:35 GMT","referrer-policy":"origin-when-cross-origin, strict-origin-when-cross-origin","server":"github.com","strict-transport-security":"max-age=31536000; includeSubdomains; preload","transfer-encoding":"chunked","vary":"Accept-Encoding, Accept, X-Requested-With","x-accepted-github-permissions":"issues=write; pull_requests=write","x-content-type-options":"nosniff","x-frame-options":"deny","x-github-api-version-selected":"2022-11-28","x-github-media-type":"github.v3; format=json","x-github-request-id":"8990:C671F:10A2E33:43B46B0:69C862BA","x-ratelimit-limit":"5300","x-ratelimit-remaining":"5104","x-ratelimit-reset":"1774743288","x-ratelimit-resource":"core","x-ratelimit-used":"196","x-xss-protection":"0"},"data":{"message":"Not Found","documentation_url":"https://docs.github.com/rest/issues/comments#update-an-issue-comment","status":"404"}}}

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
robyn/processpool.py (1)

33-55: ⚠️ Potential issue | 🟠 Major

Normalize non-positive max_requests before handing it to children.

Line 82 treats recycling as enabled only for values > 0, but Lines 54, 105, and 173 still forward the raw value into Rust. With 0, each worker satisfies REQUEST_COUNT >= limit on its first poll and shuts down immediately. Normalize max_requests to None unless it is strictly positive, and reuse that normalized value for both initial spawns and replacements.

Minimal normalization
 def run_processes(
     url: str,
     port: int,
     directories: List[Directory],
@@
     max_requests: Optional[int] = None,
 ) -> List[Process]:
     import time
+    effective_max_requests = (
+        max_requests if max_requests is not None and max_requests > 0 else None
+    )

     socket = SocketHeld(url, port)

     process_pool = init_processpool(
@@
-        max_requests,
+        effective_max_requests,
     )
@@
-    if max_requests and max_requests > 0 and len(process_pool) > 0:
+    if effective_max_requests is not None and len(process_pool) > 0:
         while not shutting_down:
             for i, process in enumerate(process_pool):
                 if not process.is_alive() and not shutting_down:
@@
-                            max_requests,
+                            effective_max_requests,
                         ),
                     )

 def init_processpool(
@@
     max_requests: Optional[int] = None,
 ) -> List[Process]:
     process_pool: List = []
+    effective_max_requests = (
+        max_requests if max_requests is not None and max_requests > 0 else None
+    )
@@
-                max_requests,
+                effective_max_requests,
             ),
         )

Also applies to: 82-105, 155-174

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@robyn/processpool.py` around lines 33 - 55, Normalize the max_requests
parameter to None unless it's strictly positive and use that normalized value
everywhere you pass it to child processes: create a local
normalized_max_requests = max_requests if max_requests and max_requests > 0 else
None, then replace raw uses of max_requests when calling init_processpool and
any subsequent spawn/replacement logic so children receive
normalized_max_requests; update all call sites that currently forward
max_requests (including the initial init_processpool invocation and worker
respawn paths) to use normalized_max_requests and ensure the existing > 0 check
(REQUEST_COUNT logic) stays consistent with this normalization.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/server.rs`:
- Around line 340-355: The spawned task that checks REQUEST_COUNT currently
calls handle.stop(true) but never stops the Python event loop, leaving
run_forever() blocked; after handle.stop(true).await in that task, acquire the
Python GIL and call event_loop.call_method0("stop") (or call a shutdown helper
that calls event_loop.call_method0("stop") / event_loop.call_method0("close"))
so the blocking event_loop.call_method0("run_forever") can return, or
alternatively call std::process::exit(0) to terminate the worker; update the
spawned task (where server_handle/handle and REQUEST_COUNT are used) to perform
this additional stop/exit so the worker fully recycles.

---

Outside diff comments:
In `@robyn/processpool.py`:
- Around line 33-55: Normalize the max_requests parameter to None unless it's
strictly positive and use that normalized value everywhere you pass it to child
processes: create a local normalized_max_requests = max_requests if max_requests
and max_requests > 0 else None, then replace raw uses of max_requests when
calling init_processpool and any subsequent spawn/replacement logic so children
receive normalized_max_requests; update all call sites that currently forward
max_requests (including the initial init_processpool invocation and worker
respawn paths) to use normalized_max_requests and ensure the existing > 0 check
(REQUEST_COUNT logic) stays consistent with this normalization.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 469d3a95-d214-437d-ac1f-5985479fab01

📥 Commits

Reviewing files that changed from the base of the PR and between e2ddef7 and 7546da5.

📒 Files selected for processing (3)
  • robyn/processpool.py
  • robyn/robyn.pyi
  • src/server.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • robyn/robyn.pyi

Comment thread src/server.rs
Three bugs fixed:

1. The processes==1 path never passed max_requests to spawn_process,
   so the Rust server never knew about the request limit.

2. The processes==1 path ran spawn_process inline (not forked), so
   process_pool was empty and the parent recycling loop never ran.
   Now when max_requests is set, even processes==1 goes through the
   fork path so the parent can watch and respawn the child.

3. After the Actix server stops (handle.stop(true)), the Python
   event loop (loop.run_forever()) never returns because it has no
   knowledge of the Rust thread stopping. Added std::process::exit(0)
   in the Rust server thread after server.await completes when
   max_requests is configured, so the child process exits cleanly
   and the parent can detect it and spawn a replacement.

Made-with: Cursor
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@robyn/processpool.py`:
- Around line 82-110: The truthy checks on max_requests are wrong: normalize by
testing for None explicitly so a value of 0 does not disable supervision;
replace conditions like "if max_requests and max_requests > 0 and
len(process_pool) > 0:" with "if max_requests is not None and len(process_pool)
> 0:" (and similarly update the other branch around the Windows/inline path at
the block referencing process_pool and spawn_process) so recycling supervision
is enabled for any non-None max_requests value.
- Around line 82-110: The parent loop currently sleeps for 5 seconds which
delays detection of exited workers and causes outages; change the recycling loop
so exited processes are reaped and replaced immediately by calling
process.join() as soon as process.is_alive() is False (or use
process.join(timeout=0) to non-blockingly reap), spawn the replacement Process
(using Process(target=spawn_process, ...), socket.try_clone(), etc.) right away,
and reduce or remove the coarse time.sleep(5) (replace with a short poll
interval like 0.1s if needed) so worker replacement happens immediately instead
of waiting on the 5s boundary.

In `@src/server.rs`:
- Around line 344-358: The periodic 5s polling allows a hot worker to far
overshoot max_requests; instead, detect the threshold immediately when
incrementing REQUEST_COUNT: in the code path that does
REQUEST_COUNT.fetch_add(1, Relaxed) (the one-shot transition referenced), check
the returned previous value and if previous + 1 >= max_requests (or previous <
max_requests && previous + 1 == max_requests) call
server_handle.clone().stop(true).await (or otherwise schedule an immediate
shutdown) so the worker begins recycling right when the limit is crossed; keep
or remove the existing spawn loop but ensure the atomic fetch_add branch is the
primary immediate trigger.
- Around line 361-366: The spawned Actix thread must not call
std::process::exit(0) when max_requests.is_some(); instead signal the Python
event loop to perform a graceful shutdown and return from the async block so the
process can continue cleanup. Replace the hard exit in the block that awaits
server (the code referencing server.await and max_requests) with logic that
notifies the Python shutdown mechanism used by your project (e.g., trigger the
existing shutdown_handler or send the same inter-thread signal the Python side
expects) and allow the spawned task to finish normally; ensure the Python-side
shutdown_handler and its event loop cleanup in robyn/processpool.py still run
and that any Events.SHUTDOWN handlers are invoked.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 58f09024-f20d-497f-a17d-52282788a423

📥 Commits

Reviewing files that changed from the base of the PR and between 7546da5 and cfe0649.

📒 Files selected for processing (2)
  • robyn/processpool.py
  • src/server.rs

Comment thread robyn/processpool.py
Comment on lines +82 to +110
if max_requests and max_requests > 0 and len(process_pool) > 0:
while not shutting_down:
for i, process in enumerate(process_pool):
if not process.is_alive() and not shutting_down:
process.join()
logger.info("Worker process exited (recycling), spawning replacement.")
copied_socket = socket.try_clone()
new_process = Process(
target=spawn_process,
args=(
directories,
request_headers,
routes,
global_middlewares,
route_middlewares,
web_sockets,
event_handlers,
copied_socket,
workers,
response_headers,
excluded_response_headers_paths,
client_timeout,
keep_alive_timeout,
max_requests,
),
)
new_process.start()
process_pool[i] = new_process
time.sleep(5)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Normalize max_requests before these truthiness branches.

Line 82 and Line 136 use truthiness, but src/server.rs enables recycling for any non-None value. 0 therefore disables supervision here while still enabling immediate shutdown in Rust. Line 136 also forces all Windows runs through the inline path, so there is no supervisor to replace that exit. Both cases can turn recycling into a permanent shutdown.

💡 Minimal guardrail
 def run_processes(
@@
     keep_alive_timeout: int = 20,
     max_requests: Optional[int] = None,
 ) -> List[Process]:
+    if max_requests is not None and max_requests <= 0:
+        max_requests = None
@@
-    if max_requests and max_requests > 0 and len(process_pool) > 0:
+    if max_requests is not None and len(process_pool) > 0:
         while not shutting_down:
             ...
 def init_processpool(
@@
     keep_alive_timeout: int = 20,
     max_requests: Optional[int] = None,
 ) -> List[Process]:
+    if max_requests is not None and max_requests <= 0:
+        max_requests = None
+    if sys.platform.startswith("win32") and max_requests is not None:
+        raise RuntimeError("--max-requests is not supported on Windows yet")
@@
-    if sys.platform.startswith("win32") or (processes == 1 and not max_requests):
+    if sys.platform.startswith("win32") or (processes == 1 and max_requests is None):
         spawn_process(...)

Also applies to: 133-152

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@robyn/processpool.py` around lines 82 - 110, The truthy checks on
max_requests are wrong: normalize by testing for None explicitly so a value of 0
does not disable supervision; replace conditions like "if max_requests and
max_requests > 0 and len(process_pool) > 0:" with "if max_requests is not None
and len(process_pool) > 0:" (and similarly update the other branch around the
Windows/inline path at the block referencing process_pool and spawn_process) so
recycling supervision is enabled for any non-None max_requests value.

⚠️ Potential issue | 🟠 Major

Worker replacement starts late enough to cause outages.

The parent only notices exited workers on the next 5-second sleep boundary at Line 110. In processes=1 mode that is a full outage on every recycle; if several workers hit the limit together, it can briefly drop the whole server. Please wait on process sentinels, or at least poll much more frequently, so replacements start immediately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@robyn/processpool.py` around lines 82 - 110, The parent loop currently sleeps
for 5 seconds which delays detection of exited workers and causes outages;
change the recycling loop so exited processes are reaped and replaced
immediately by calling process.join() as soon as process.is_alive() is False (or
use process.join(timeout=0) to non-blockingly reap), spawn the replacement
Process (using Process(target=spawn_process, ...), socket.try_clone(), etc.)
right away, and reduce or remove the coarse time.sleep(5) (replace with a short
poll interval like 0.1s if needed) so worker replacement happens immediately
instead of waiting on the 5s boundary.

Comment thread src/server.rs
Comment on lines +344 to +358
if let Some(limit) = max_requests {
let handle = server_handle.clone();
actix_web::rt::spawn(async move {
loop {
actix_web::rt::time::sleep(std::time::Duration::from_secs(5)).await;
if REQUEST_COUNT.load(Relaxed) >= limit {
log::info!(
"Max requests ({}) reached, worker shutting down for recycling.",
limit
);
handle.stop(true).await;
break;
}
}
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

The recycle trigger is enforced far too late under load.

Line 348 only wakes every 5 seconds, so a hot worker can serve far more than max_requests before Line 354 starts shutdown. That weakens the leak-containment goal of this feature. You already have the exact threshold crossing at Line 90 via fetch_add; using that one-shot transition would keep the limit meaningful.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server.rs` around lines 344 - 358, The periodic 5s polling allows a hot
worker to far overshoot max_requests; instead, detect the threshold immediately
when incrementing REQUEST_COUNT: in the code path that does
REQUEST_COUNT.fetch_add(1, Relaxed) (the one-shot transition referenced), check
the returned previous value and if previous + 1 >= max_requests (or previous <
max_requests && previous + 1 == max_requests) call
server_handle.clone().stop(true).await (or otherwise schedule an immediate
shutdown) so the worker begins recycling right when the limit is crossed; keep
or remove the existing spawn loop but ensure the atomic fetch_add branch is the
primary immediate trigger.

Comment thread src/server.rs
Comment on lines +361 to +366
server.await.unwrap();

if max_requests.is_some() {
log::info!("Actix server stopped, exiting worker process for recycling.");
std::process::exit(0);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Show the recycle exit path and the only Python-side shutdown/cleanup locations.
rg -n -C2 'std::process::exit\(0\)|shutdown_handler|run_forever|loop.close' src/server.rs robyn/processpool.py

Repository: sparckles/Robyn

Length of output: 2553


🏁 Script executed:

#!/bin/bash
# Get broader context around the exit call and the event loop handling
sed -n '355,380p' src/server.rs | cat -n

Repository: sparckles/Robyn

Length of output: 1215


🏁 Script executed:

#!/bin/bash
# Also check the full structure of the run method to understand task spawning/awaiting
sed -n '150,175p' src/server.rs | cat -n

Repository: sparckles/Robyn

Length of output: 1298


🏁 Script executed:

#!/bin/bash
# Get full context of the run method structure, including what spawns/awaits the async block
sed -n '140,200p' src/server.rs | cat -n

Repository: sparckles/Robyn

Length of output: 2774


🏁 Script executed:

#!/bin/bash
# Check what happens in the Python processpool.py around the run_forever and finally block
sed -n '265,285p' robyn/processpool.py | cat -n

Repository: sparckles/Robyn

Length of output: 426


Hard exit(0) from the spawned Actix thread terminates the process before Python shutdown handlers and event loop cleanup can execute.

When std::process::exit(0) is called on line 365 within the spawned thread's async block, it kills the entire process immediately. This prevents the Python event loop at line 370 from ever running, which means:

  • The shutdown_handler check at line 372 is never reached
  • The Python event loop cleanup in robyn/processpool.py line 276 (the finally block with loop.close()) is skipped
  • Any registered Events.SHUTDOWN handlers are never invoked

A graceful recycle must signal shutdown to the Python event loop instead of terminating the process directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server.rs` around lines 361 - 366, The spawned Actix thread must not call
std::process::exit(0) when max_requests.is_some(); instead signal the Python
event loop to perform a graceful shutdown and return from the async block so the
process can continue cleanup. Replace the hard exit in the block that awaits
server (the code referencing server.await and max_requests) with logic that
notifies the Python shutdown mechanism used by your project (e.g., trigger the
existing shutdown_handler or send the same inter-thread signal the Python side
expects) and allow the spawned task to finish normally; ensure the Python-side
shutdown_handler and its event loop cleanup in robyn/processpool.py still run
and that any Events.SHUTDOWN handlers are invoked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant