
Add pthread CPU affinity support #3122

Merged
wasphin merged 3 commits into apache:master from wenjiecn:bind_thread on Oct 31, 2025

@wenjiecn
Contributor

@wenjiecn wenjiecn commented Oct 23, 2025

What problem does this PR solve?

Issue Number: #1140

Problem Summary:

What is changed and the side effects?

Changed:
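Added a gflag:
cpu_set: the set of CPUs the user wants to bind worker threads to. It defaults to an empty string, which means CPU binding is disabled.
When binding is enabled, the CPU set is parsed into the _cpus array.
After each call to pthread_create(&_workers[i], NULL, worker_thread, arg);, the new worker is bound to the CPU that corresponds to its worker_id, that is, _cpus[worker_id % _cpus.size()].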
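
As a rough illustration of the binding rule described above, the sketch below picks a CPU round-robin from an already parsed list and pins a thread to it with pthread_setaffinity_np, which is a Linux/glibc extension. The helper names pick_cpu and bind_thread_to_cpu are hypothetical and are not taken from this PR; the code in task_control.cpp may be organized differently.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for pthread_setaffinity_np and the CPU_* macros
#endif
#include <pthread.h>
#include <sched.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>

// Hypothetical helper: applies the rule _cpus[worker_id % _cpus.size()].
unsigned pick_cpu(const std::vector<unsigned>& cpus, std::size_t worker_id) {
    assert(!cpus.empty());  // callers only bind when the CPU list is non-empty
    return cpus[worker_id % cpus.size()];
}

// Hypothetical helper: restricts `thread` to a single CPU.
// Returns 0 on success, otherwise an errno-style error code.
int bind_thread_to_cpu(pthread_t thread, unsigned cpu) {
    if (cpu >= CPU_SETSIZE) {
        return EINVAL;  // CPU id does not fit into a fixed-size cpu_set_t
    }
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}
```

Under these assumptions, the creating thread would call bind_thread_to_cpu(_workers[i], pick_cpu(_cpus, i)) right after a successful pthread_create, and skip the call entirely when the CPU list is empty.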

Side effects:

  • Performance effects:

  • Breaking backward compatibility:


Check List:

Comment thread src/bthread/task_control.cpp Outdated
DEFINE_bool(task_group_set_worker_name, true,
"Whether to set the name of the worker thread");
DEFINE_bool(thread_affinity, false, "Whether to Bind Cores");
DEFINE_string(cpu_set , "",
Contributor

Maybe you can combine these two flags into one flag. If cpu_set is an empty string, then the thread affinity feature is off.

Contributor Author

@wenjiecn wenjiecn Oct 23, 2025

done

@wwbmmm wwbmmm requested a review from Copilot October 23, 2025 05:38

Copilot AI left a comment

Pull Request Overview

This PR adds pthread CPU affinity support to enable binding worker threads to specific CPU cores for better performance control. The implementation allows users to control thread-to-CPU binding through configuration flags.

Key Changes:

  • Added two gflags: thread_affinity to enable/disable CPU binding and cpu_set to specify target CPUs
  • Implemented CPU set parsing with support for range notation (e.g., "0-3,5,6-7")
  • Modified worker thread creation to bind threads to CPUs in round-robin fashion when affinity is enabled
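
The range notation mentioned in the list above ("0-3,5,6-7") could be parsed along the following lines. This is a minimal sketch, not the PR's parse_cpuset: the function name parse_cpu_list, the helper parse_uint, the kMaxCpus bound, and the return convention (0 on success, -1 on malformed input) are all assumptions made for illustration. An empty string is accepted and yields an empty list, which matches the "empty cpu_set means affinity is off" behavior discussed in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed upper bound on CPU ids, chosen only for this sketch.
static const unsigned kMaxCpus = 1024;

// Parses a non-empty run of decimal digits; rejects anything else.
static bool parse_uint(const std::string& s, unsigned* out) {
    if (s.empty() || s.size() > 9) return false;  // 9 digits cannot overflow
    unsigned v = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return false;
        v = v * 10 + static_cast<unsigned>(c - '0');
    }
    *out = v;
    return true;
}

// Expands "0-3,5,6-7" into {0,1,2,3,5,6,7}.
// Returns 0 on success and -1 on malformed input (cpus is cleared then).
int parse_cpu_list(const std::string& value, std::vector<unsigned>* cpus) {
    assert(cpus != nullptr);
    cpus->clear();
    if (value.empty()) return 0;  // empty string: no CPUs, affinity disabled
    std::size_t pos = 0;
    while (pos <= value.size()) {
        std::size_t comma = value.find(',', pos);
        if (comma == std::string::npos) comma = value.size();
        const std::string item = value.substr(pos, comma - pos);
        const std::size_t dash = item.find('-');
        unsigned first = 0;
        unsigned last = 0;
        bool ok;
        if (dash == std::string::npos) {
            ok = parse_uint(item, &first);  // single CPU such as "5"
            last = first;
        } else {
            ok = parse_uint(item.substr(0, dash), &first) &&
                 parse_uint(item.substr(dash + 1), &last) &&
                 first <= last;             // range such as "0-3"
        }
        if (!ok || last >= kMaxCpus) {
            cpus->clear();
            return -1;
        }
        for (unsigned c = first; c <= last; ++c) cpus->push_back(c);
        pos = comma + 1;
    }
    return 0;
}
```

This sketch keeps duplicates and the order given by the user, so "0,0" would yield two entries. Whether the merged code deduplicates or sorts is not stated in this thread.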

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/bthread/task_control.h | Added CPU affinity helper functions and static _cpus vector to store parsed CPU set |
| src/bthread/task_control.cpp | Implemented CPU set parsing, thread binding logic, and integrated affinity support into worker thread initialization |

@yanglimingcn
Contributor

Could this step be moved into functions such as startfn?

@wenjiecn
Contributor Author

wenjiecn commented Oct 23, 2025

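> Could this step be moved into functions such as startfn?

@yanglimingcn
startfn is a hook meant for user-defined logic.
With run_tagged_worker_startfn(tag), users can bind the thread group under tag[i] to their own cpus[tag[i]] if they wish.
What I am providing here is a ready-made implementation, so that users do not have to write it themselves.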

@yanglimingcn
Contributor

How is affinity set for threads that are added dynamically?

@wenjiecn
Contributor Author

wenjiecn commented Oct 23, 2025

> How is affinity set for threads that are added dynamically?

@yanglimingcn
Worker threads are added dynamically through TaskControl::add_workers, where I added:

if (!_cpus.empty()) {
    arg->cpuId = (i + old_concurency) % _cpus.size();
}

No matter how many workers are added, they are all stored in _workers, so it is enough to guarantee that _workers[i] maps to _cpus[i % _cpus.size()].
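
The invariant stated above can be checked with a small stand-alone model. The WorkerPool struct below is a hypothetical stand-in for the bookkeeping in TaskControl, not the real class: it only records which index into the CPU list each worker receives, and it offsets new workers by the current worker count in the same way as the quoted snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the worker bookkeeping, for illustration only.
struct WorkerPool {
    std::vector<unsigned> cpus;          // parsed CPU list, e.g. {0, 1, 2}
    std::vector<std::size_t> cpu_index;  // cpu_index[w]: index into cpus for worker w

    // The i-th worker of a new batch gets (i + old_concurrency) % cpus.size(),
    // so the round-robin sequence continues where the previous batch stopped.
    void add_workers(std::size_t num) {
        assert(!cpus.empty());
        const std::size_t old_concurrency = cpu_index.size();
        for (std::size_t i = 0; i < num; ++i) {
            cpu_index.push_back((i + old_concurrency) % cpus.size());
        }
    }
};
```

Starting with three CPUs, adding four workers and then three more leaves every worker w with cpu_index[w] == w % 3, the same result as creating all seven at once.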

@yanglimingcn
Contributor

Got it.

@wenjiecn wenjiecn force-pushed the bind_thread branch 3 times, most recently from 3bd580f to 2169ba1 Compare October 26, 2025 11:21
@wenjiecn wenjiecn closed this Oct 27, 2025
@wenjiecn wenjiecn reopened this Oct 27, 2025
Comment thread src/bthread/task_control.h Outdated

static int parse_cpuset(std::string value, std::vector<unsigned>& cpus);

static inline void bind_thread(pthread_t pthread, unsigned cpuId) {
Contributor

Why is this function defined as inline?

Contributor Author

Because the function body is simple.

@yanglimingcn
Contributor

Have you benchmarked this with and without CPU binding?

@wenjiecn
Contributor Author

wenjiecn commented Oct 30, 2025

> Have you benchmarked this with and without CPU binding?

@yanglimingcn

Without CPU binding, connection_type=single

./echo_server -minloglevel=4 --ip_port=*.*.*.*:*
./rpc_press -proto="./echo.proto" -method=example.EchoService.Echo -server=*.*.*.*:* -input=.'{"message":"hello"} {"message":"world"}' -qps=0 -timeout_ms=3000 -thread_num=200

[Latency]
avg 531 us
50% 521 us
70% 597 us
90% 757 us
95% 817 us
97% 846 us
99% 907 us
99.9% 1132 us
99.99% 1360 us
max 2250 us

[Latency]
avg 585 us
50% 569 us
70% 649 us
90% 797 us
95% 843 us
97% 877 us
99% 982 us
99.9% 1142 us
99.99% 1291 us
max 2706 us

[Latency]
avg 606 us
50% 590 us
70% 680 us
90% 814 us
95% 880 us
97% 926 us
99% 1033 us
99.9% 1184 us
99.99% 1935 us
max 3452 us

With CPU binding, connection_type=single

numactl --cpunodebind=0 --membind=0 ./echo_server -minloglevel=4 --ip_port=*.*.*.*:* --cpu_set=0-79
numactl --cpunodebind=0 --membind=0 ./rpc_press -proto="./echo.proto" -method=example.EchoService.Echo -server=*.*.*.*:* -input=.'{"message":"hello"} {"message":"world"}' -qps=0 -timeout_ms=3000 -thread_num=200 --cpu_set=0-10
[Latency]
avg 511 us
50% 505 us
70% 576 us
90% 697 us
95% 795 us
97% 843 us
99% 891 us
99.9% 1146 us
99.99% 2124 us
max 7936 us
[Latency]
avg 522 us
50% 516 us
70% 585 us
90% 725 us
95% 807 us
97% 830 us
99% 899 us
99.9% 1145 us
99.99% 1726 us
max 4164 us
[Latency]
avg 516 us
50% 504 us
70% 576 us
90% 703 us
95% 792 us
97% 823 us
99% 871 us
99.9% 1255 us
99.99% 2252 us
max 8437 us

Without CPU binding, connection_type=pooled

./echo_server -minloglevel=4 --ip_port=*.*.*.*:*
./rpc_press -proto="./echo.proto" -method=example.EchoService.Echo -server=*.*.*.*:* -input=.'{"message":"hello"} {"message":"world"}' -qps=0 -timeout_ms=3000 -thread_num=200
[Latency]
avg 1151 us
50% 1135 us
70% 1212 us
90% 1333 us
95% 1403 us
97% 1444 us
99% 1545 us
99.9% 1636 us
99.99% 4142 us
max 5198 us

[Latency]
avg 1322 us
50% 1281 us
70% 1373 us
90% 1534 us
95% 1619 us
97% 1696 us
99% 1767 us
99.9% 1892 us
99.99% 2074 us
max 6483 us

[Latency]
avg 1494 us
50% 1514 us
70% 1590 us
90% 1718 us
95% 1783 us
97% 1817 us
99% 1945 us
99.9% 2074 us
99.99% 3520 us
max 4119 us

With CPU binding, connection_type=pooled

numactl --cpunodebind=0 --membind=0 ./echo_server -minloglevel=4 --ip_port=*.*.*.*:* --cpu_set=0-79
numactl --cpunodebind=0 --membind=0 ./rpc_press -proto="./echo.proto" -method=example.EchoService.Echo -server=*.*.*.*:* -input=.'{"message":"hello"} {"message":"world"}' -qps=0 -timeout_ms=3000 -thread_num=200 --cpu_set=0-10
[Latency]
avg 810 us
50% 789 us
70% 847 us
90% 956 us
95% 1007 us
97% 1048 us
99% 1119 us
99.9% 1209 us
99.99% 1286 us
max 2332 us

[Latency]
avg 809 us
50% 790 us
70% 849 us
90% 966 us
95% 1020 us
97% 1054 us
99% 1120 us
99.9% 1217 us
99.99% 1249 us
max 1425 us

[Latency]
avg 809 us
50% 788 us
70% 851 us
90% 971 us
95% 1011 us
97% 1054 us
99% 1121 us
99.9% 1230 us
99.99% 1260 us
max 1784 us

@yanglimingcn
Contributor

The gain in single mode is less pronounced than in pooled mode.
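
To put rough numbers on that observation, the sketch below averages the three "avg" latency values reported for each configuration and computes the relative reduction. The function names mean_us and reduction_pct are hypothetical, and the calculation looks only at the avg rows, not at the percentile or max rows.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Mean of the per-run "avg" latencies, in microseconds.
double mean_us(const std::vector<double>& runs) {
    assert(!runs.empty());
    return std::accumulate(runs.begin(), runs.end(), 0.0) /
           static_cast<double>(runs.size());
}

// Relative latency reduction from `before` to `after`, in percent.
double reduction_pct(double before, double after) {
    return (before - after) / before * 100.0;
}
```

With the figures posted in this thread, single mode goes from mean_us({531, 585, 606}) = 574 us to mean_us({511, 522, 516}), about 516.3 us, a reduction of roughly 10%. Pooled mode goes from mean_us({1151, 1322, 1494}), about 1322.3 us, to mean_us({810, 809, 809}), about 809.3 us, a reduction of roughly 38.8%. The bound runs were also launched under numactl, so these numbers reflect CPU binding together with NUMA node pinning, and the 99.99% and max rows in single mode do not show the same improvement.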

@yanglimingcn
Contributor

Please resolve the merge conflicts.

@yanglimingcn
Contributor

LGTM

@wasphin
Member

wasphin commented Oct 31, 2025

LGTM

@wasphin wasphin merged commit fe63d79 into apache:master Oct 31, 2025
17 checks passed