perf: replace platform-specific CPU/memory probing with gopsutil by johanjanssens · Pull Request #2306 · php/frankenphp

johanjanssens · 2026-03-25T18:52:22Z

Replace cpu_unix.go (POSIX clock_gettime), cpu_windows.go (noop), memory_linux.go (syscall.Sysinfo), and memory_others.go (returns 0) with cross-platform implementations using gopsutil/v4

The CPU probe changes from ProbeCPUs (blocking 120ms sleep measuring CLOCK_PROCESS_CPUTIME_ID) to ProbeLoad (instant process CPU percentage check via /proc or platform API). This removes the 120ms latency in the scaling path and the CGO dependency for CPU probing.

Memory detection now works on all platforms (was Linux-only, returned 0 on macOS/Windows which disabled automatic max_threads calculation).

Replace cpu_unix.go (POSIX clock_gettime), cpu_windows.go (noop), memory_linux.go (syscall.Sysinfo), and memory_others.go (returns 0) with cross-platform implementations using gopsutil/v4. The CPU probe changes from ProbeCPUs (blocking 120ms sleep measuring CLOCK_PROCESS_CPUTIME_ID) to ProbeLoad (instant process CPU percentage check via /proc or platform API). This removes the 120ms latency in the scaling path and the CGO dependency for CPU probing. Memory detection now works on all platforms (was Linux-only, returned 0 on macOS/Windows which disabled automatic max_threads calculation).

AlliBalliBaba · 2026-03-26T09:12:58Z

+
+	// Get CPU percent for this process
+	// Note: this returns percent across all cores, so 100% per core * number of cores is the max
+	cpuPercent, err := proc.CPUPercent()


How does proc.CPUPercent() acutally work? Doesn't it take the total CPU time since the process was created? So you'd still have to wait for some time between NewProcess and CPUPercent.

You're right, my bad, not sure how I missed this, CPUPercent() returns the lifetime average since process start, which is useless.

Switched to Percent(0) which returns the CPU delta since the last call (non-blocking): https://pkg.go.dev/github.com/shirou/gopsutil/v4/process#Process.Percent

Wouldn't you still have to call cpuProc.Percent(100 * time.Millisecond) to get the CPU time in the last 100ms? Otherwise you're getting the CPU time since the last call, which might have been several minutes ago.

True, its a tradeoff, but the impact is minimal.

Percent(100ms) blocks the scaling goroutine for 100ms per scale decision. This is what I am trying to avoid, in the FrankenAsync POC I am dispatching PHP scripts internally to separate threads within a single request. In this scenario I like threads to scale up instantly, not wait 100ms each.

Percent(0) returns the delta since last call. Under scaling pressure, calls come every ~5ms (minStallTime), so the window stays tight. The only potential gap would the first call after idle returns 0, which could wrongly allow or prevents one scale-up. After that it's accurate.

Its a tradeoff between instant scaling, with potentially being off 1 thread, or delayed scaling, which adds 100msec unwanted latency to internal sub-requests.

Oh I get it now, ideally we should also be able to somehow differentiate between scaling a http worker and something regarding async/parallelism, with parallelism you probably would not want to wait at all.

Can you share a link to your POC again? Might also make sense to somehow combine your approach with #2287, which tries to do something similar with 'background workers'.

You can find the info and links to the POC I did here: #2223

I do feel this is unrelated though from that work, it just emerged while working on that. Also why I submitted it as separate PR. Reasoning:

The 120ms in the current ProbeCPUs isn't intentional scaling delay; it is a side effect of how CPU is measured (sleep, then compare). The probe's only job is "is CPU overloaded?": yes or no.

The scaling loop already has a hardcoded minStallTime (5ms) as the gate for when to consider scaling. If that needs tuning, it could be made configurable, which would allow to properly control the behavior.

Percent(0) is non-blocking, cross-platform, and self-regulates under load since the scaling goroutine calls it every ~5ms when threads are stalled.

This PR also matters for the Windows support shipped in v1.12.0, the current code has no CPU probing on Windows at all.

Yeah I'm in favor of making this cross-platform 👍 . The stalling is still kind of intentional to keep scaling from consuming too many resources at once.

It's a bit unfortunate that this will take past CPU consumption as metric instead of current one. Maybe we could also repeatedly measure the metric in the downscaling loop. There are also some k6 tests in the repo to simulate hanging while scaling.

Good idea! I added a probeCPULoad() call in deactivateThreads() to keep the Percent(0) baseline fresh. Delta window is now at most 5s instead of potentially stale.

dunglas · 2026-03-25T21:28:33Z

@@ -0,0 +1,15 @@
+package memory


Maybe could we totally remove this package and inline the call to the lib?

For sure. The memory package is gone, now and calls have been inlined.

Remove internal/cpu package and inline probeCPULoad() into scaling.go. Switch from CPUPercent() (lifetime average) to Percent(0) (delta since last call, non-blocking) for accurate current load measurement. Use sync.Once for process initialization.

Remove internal/memory package and call gopsutil mem.VirtualMemory() directly. Both internal packages are now eliminated.

Call getCPUUsage() in deactivateThreads() so Percent(0) always has a recent reference point. The delta window for scale-up decisions is now at most 5s (downScaleCheckTime) instead of potentially minutes.

johanjanssens · 2026-03-28T15:48:36Z

Based on feedback from @AlliBalliBaba created a prototype for a pluggable scaling policy system in a https://github.com/johanjanssens/frankenphp/tree/scaling-policy see johanjanssens@2b64563

This separates measurement from decision with a ScalingPolicy interface:

  type ScalingPolicy interface {                            
      ShouldScaleUp(metrics ScalingMetrics) bool
      ShouldScaleDown(metrics ScalingMetrics) bool                                                        
  }

Ships with DefaultScalingPolicy (matches current behavior exactly) and ImmediateScalingPolicy (no stall
delay, for parallelism workloads). Pluggable via WithScalingPolicy(). The CPU probe just measures, the policy decides.

Design doc: https://github.com/johanjanssens/frankenphp/blob/scaling-policy/scaling-policy.md

Happy to keep this PR focused on the cross-platform probe, or expand if, or do a new/follow-up PR.

AlliBalliBaba · 2026-04-06T17:09:13Z

Yeah I think we should decouple scaling based on incoming HTTP requests from scaling based on parallelism. With HTTP scaling you sometimes want to be more conservative.

What you are trying to solve with parallelism is currently also being worked on in #2287. We should probably find some common API, that PR is still a bit too specific and there's more room for optimization here in core. If you have a 'non-http' thread, scaling should be made instant.

dunglas · 2026-04-08T13:06:40Z

Could you rebase, please? Otherwise, is this PR ready for merge (it LGTM)?

Copilot

Pull request overview

This PR updates FrankenPHP’s autoscaling heuristics to use cross-platform CPU and memory probing via gopsutil/v4, removing platform-specific CPU/memory implementations (and the prior CPU probe’s 120ms sleep/CGO dependency) while enabling memory-based thread sizing on non-Linux platforms.

Changes:

Replace CPU “probe over time” (POSIX clock_gettime / Windows fallback) with a non-blocking per-process CPU percent probe (gopsutil/process).
Replace Linux-only syscall.Sysinfo memory detection with gopsutil/mem.VirtualMemory() for broader platform support.
Remove internal/cpu and internal/memory implementations and add gopsutil (and indirect deps) to module files.

Reviewed changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
scaling.go	Switch autoscaling CPU gating to `gopsutil/process.Percent(0)` and refresh baseline during downscaling checks.
phpmainthread.go	Compute automatic `maxThreads` using `gopsutil/mem.VirtualMemory()` (available memory).
internal/memory/memory_others.go	Removed (previously returned 0 on non-Linux).
internal/memory/memory_linux.go	Removed (previously used `syscall.Sysinfo`).
internal/cpu/cpu_windows.go	Removed (previously always allowed scaling after sleep).
internal/cpu/cpu_unix.go	Removed (previously used POSIX `clock_gettime` + sleep).
go.mod / go.sum	Add `github.com/shirou/gopsutil/v4` and update dependency checksums.
caddy/go.mod / caddy/go.sum	Update indirect dependencies to reflect top-level `gopsutil` addition.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-08T13:10:59Z

+func probeCPULoad(maxLoadFactor float64) bool {
+	cpuProcOnce.Do(func() {
+		cpuProc, _ = process.NewProcess(int32(os.Getpid()))
+	})
+	if cpuProc == nil {
+		return false
+	}


probeCPULoad initializes cpuProc via cpuProcOnce.Do but ignores the error from process.NewProcess. If that call fails, cpuProc remains nil and (because of sync.Once) will never be retried, permanently disabling autoscaling decisions that depend on CPU probing. Consider capturing the init error and retrying initialization on later calls (or only marking initialization as done after a successful NewProcess).

Copilot · 2026-04-08T13:11:00Z

+	if cpuProc == nil {
+		return false
+	}
+
+	cpuPercent, err := cpuProc.Percent(0)
+	if err != nil {
+		return false
+	}
+
+	return cpuPercent < float64(cpuCount)*100.0*maxLoadFactor


On CPU probe failures (cpuProc == nil or Percent returns an error), probeCPULoad returns false which prevents any scale-up from happening. This can leave the system stuck at the initial thread count on platforms/environments where process CPU stats are unavailable. Consider falling back to allowing scaling when probing fails (and optionally logging the failure once) so autoscaling remains functional.

Copilot · 2026-04-08T13:11:00Z

+// probeCPULoad checks if current process CPU usage is below a threshold.
+// Uses Percent(0) which returns the delta since the last call (non-blocking).
+// See: https://pkg.go.dev/github.com/shirou/gopsutil/v4/process#Process.Percent
+func probeCPULoad(maxLoadFactor float64) bool {
+	cpuProcOnce.Do(func() {
+		cpuProc, _ = process.NewProcess(int32(os.Getpid()))
+	})
+	if cpuProc == nil {
+		return false
+	}
+
+	cpuPercent, err := cpuProc.Percent(0)
+	if err != nil {
+		return false
+	}
+
+	return cpuPercent < float64(cpuCount)*100.0*maxLoadFactor


Process.Percent(0) returns CPU usage delta since the last call; on the first call it commonly returns 0 (it only establishes a baseline). That means the first autoscaling decision will almost always treat CPU as idle and may scale up even under heavy load. Consider warming the baseline once during autoscaling init (or treating the first Percent(0) result as “unknown” and skipping scaling until a second sample is available).

Copilot · 2026-04-08T13:11:00Z

+	vmStat, err := mem.VirtualMemory()
+	totalSysMemory := uint64(0)
+	if err == nil {
+		totalSysMemory = vmStat.Available
+	}


totalSysMemory is now populated from vmStat.Available, not total system memory. This makes the name/comment misleading and can confuse future changes (it’s effectively available memory at the time of the call). Consider renaming the variable (e.g., availableSysMemory) and/or adjusting the comment to match the actual semantics.

henderkes · 2026-04-21T08:10:23Z

@johanjanssens could you rebase please? Is this good for a merge?

johanjanssens and others added 2 commits March 25, 2026 16:50

go mod tidy

eeeb4ef

AlliBalliBaba reviewed Mar 26, 2026

View reviewed changes

dunglas reviewed Mar 26, 2026

View reviewed changes

johanjanssens added 3 commits March 26, 2026 17:38

refactor: inline memory probe into phpmainthread.go

7d87306

Remove internal/memory package and call gopsutil mem.VirtualMemory() directly. Both internal packages are now eliminated.

refactor: probe CPU in downscale loop to keep baseline fresh

663c71c

Call getCPUUsage() in deactivateThreads() so Percent(0) always has a recent reference point. The delta window for scale-up decisions is now at most 5s (downScaleCheckTime) instead of potentially minutes.

johanjanssens mentioned this pull request Mar 31, 2026

feat: background workers = non-HTTP workers with shared state #2287

Open

henderkes mentioned this pull request Apr 1, 2026

FrankenPHP plugin fails to build on FreeBSD #2327

Closed

dunglas requested a review from Copilot April 8, 2026 13:06

Copilot started reviewing on behalf of dunglas April 8, 2026 13:06 View session

Copilot AI reviewed Apr 8, 2026

View reviewed changes

Conversation

johanjanssens commented Mar 25, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

johanjanssens Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

johanjanssens commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AlliBalliBaba commented Apr 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dunglas commented Apr 8, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

henderkes commented Apr 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

johanjanssens Mar 27, 2026 •

edited

Loading

johanjanssens commented Mar 28, 2026 •

edited

Loading

AlliBalliBaba commented Apr 6, 2026 •

edited

Loading