fix(proactor_threads): fix that proactor_threads is not respected in k8s env#7030
Conversation
c58b415 to
94810e6
Compare
|
@dranikpg would you be so kind to fill in some reviewers? I am not so familiar with GitHub, and it seems I cannot add any after the PR is created. Thanks! |
6086c40 to
bd85b67
Compare
|
@starek4 thanks for the contribution. I am trying to understand - the tests are not related to the fix, correct? They are just "fixing" the current behavior with cli args and the env vars? |
|
Also, can I reproduce the problem locally with (Sorry, not a k8s expert). |
|
@romange, you can definitely reproduce the issue locally (and also the fix), let me show you how:
And about the tests, you are right. It is only about passing the arguments. |
|
Thanks! Let me few days to review it (currently on Passover vacation). Meanwhile can you pls edit your comment above and add what is the actual behavior before the fix for each of the cases you wrote down? |
|
Enjoy your vacation! I have added the before and after fix behaviour. |
|
The fix is correct and minimal. When However, the tests don't validate the actual bug scenario. They run on a normal machine where Suggested: integration test with real cgroup limitsThe existing CI runs pytest inside Docker containers where cgroup-thread-detection:
runs-on: ubuntu-latest
needs: [build]
steps:
- uses: actions/checkout@v6
# Reuse the binary built by the build job (requires uploading it as an artifact there)
- uses: actions/download-artifact@v7
with:
name: dragonfly-binary
path: build
- name: Test cgroup CPU auto-detection
run: |
chmod +x build/dragonfly
# Scenario 1: --cpus=2, no flag -> should auto-detect 2 threads
CID=$(docker run --rm -d --cpus=2 --security-opt seccomp=unconfined \
-v "$PWD/build/dragonfly:/dragonfly" -p 16380:6379 ubuntu:24.04 \
/dragonfly --port 6379 --maxmemory 1G --dbfilename "" --noversion_check)
#
# Wait for PING to return PONG
#
THREADS=$(redis-cli -p 16380 INFO server | grep -oP 'thread_count:\K\d+')
docker rm -f "$CID"
echo "Scenario 1: expected 2, got $THREADS"
[ "$THREADS" -eq 2 ]
# Scenario 2: --cpus=2 + --proactor_threads=4 -> flag must win
CID=$(docker run --rm -d --cpus=2 --security-opt seccomp=unconfined \
-v "$PWD/build/dragonfly:/dragonfly" -p 16381:6379 ubuntu:24.04 \
/dragonfly --port 6379 --proactor_threads 4 --maxmemory 1G --dbfilename "" --noversion_check)
#
# Wait for PING to return PONG
#
THREADS=$(redis-cli -p 16381 INFO server | grep -oP 'thread_count:\K\d+')
docker rm -f "$CID"
echo "Scenario 2: expected 4, got $THREADS"
[ "$THREADS" -eq 4 ]Minor suggestionAdd a if (absl::GetFlag(FLAGS_proactor_threads) > 0) {
LOG(INFO) << "Using proactor_threads=" << absl::GetFlag(FLAGS_proactor_threads)
<< " (overriding cgroup-derived " << max_available_threads << ")";
max_available_threads = 0;
} |
c8f0bc5 to
04e2a23
Compare
|
@starek4 you need to install redis-tools or valkey-tools to be able to ping the server |
|
@romange yep, no worries, will do it when I have time. |
b232107 to
8d96ea5
Compare
|
@romange it should work now, tested on my forked actions |
|
@starek4 please squash and sign your commits |
3e2d6c2 to
17d2038
Compare
17d2038 to
f7c2807
Compare
|
@romange done |
|
thank you! |
Fixing issue: #4251
When a pod has CPU resource limits set, UpdateResourceLimitsIfInsideContainer derives a thread count from the cgroup CPU quota and passes it directly to ProactorPool. Because helio's ProactorPool constructor only checks FLAGS_proactor_threads when its pool_size argument is 0, a non-zero cgroup-derived value silently overrides whatever the user set via --proactor_threads (or DFLY_proactor_threads).
This caused pods to start with a thread count dictated by the CPU limit rather than the explicitly configured value, often failing the maxmemory >= threads * 256MB check on startup.
Changes:
Behavior:
Tests: