lua-perf is an eBPF-based performance profiling tool that supports Lua 5.3, Lua 5.4, and Lua 5.5.
- Provides performance analysis for mixed `C` and `Lua` code, as well as pure `C` code.
- Uses stack sampling, which has minimal performance impact on the target process and makes the tool suitable for production environments.
- Performs stack backtracing in kernel space using `eh-frame`, eliminating the need for the target process to be built with the `-fno-omit-frame-pointer` option to preserve stack frame pointers.
- Automatically detects the Lua version of the target process; no manual specification is required.
- Precisely locates the `L` variable via DWARF debug information, supporting GCC/Clang optimization levels O0 through O3.
To use lua-perf, make sure you meet the following requirements:
- The installed kernel version needs to be `5.17` or above.
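The kernel requirement above can be checked before running the tool. The following is a minimal sketch; `kernel_at_least` is a hypothetical helper written for illustration and is not part of lua-perf. It assumes the release string starts with `MAJOR.MINOR`, as the output of `uname -r` normally does.

```shell
#!/bin/sh
# kernel_at_least RELEASE MAJOR MINOR
# Hypothetical helper (not part of lua-perf): succeeds if RELEASE,
# e.g. "6.1.0-13-amd64", is at least MAJOR.MINOR.
kernel_at_least() {
    release=$1
    want_major=$2
    want_minor=$3
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] && return 0
    [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]
}

# Example use:
#   kernel_at_least "$(uname -r)" 5 17 || echo "kernel older than 5.17" >&2
```

Only the major and minor numbers are compared, which is enough for the `5.17` threshold stated above.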
To generate flame graphs, you need to use lua-perf in conjunction with the FlameGraph tool. Here's how you can do it:
- First, run `sudo lua-perf -p <pid> -f <HZ>` to sample the call stacks of the target process and generate a `perf.fold` file in the current directory. `<pid>` is the process ID of the target process, which can be running inside a Docker container or on the host machine. `<HZ>` is the stack sampling frequency, with a default value of `100` (100 samples per second).
- Next, convert the `perf.fold` file to a flame graph by running `./FlameGraph/flamegraph.pl perf.fold > perf.svg`.
- Finally, you will find the generated flame graph, `perf.svg`, in the current directory.
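For a quick look at the samples without rendering an SVG, the folded file can also be summarized on the command line. The sketch below assumes the file uses the folded-stack format that `flamegraph.pl` consumes, one `frame1;frame2;... count` entry per line; `leaf_samples` is a hypothetical helper, not something lua-perf or FlameGraph provides.

```shell
#!/bin/sh
# leaf_samples FILE
# Hypothetical helper: sums the sample counts per innermost (leaf) frame
# and prints "count frame" lines, highest count first.
# Assumes each line of FILE is "frame1;frame2;...;frameN count".
leaf_samples() {
    awk '
        NF == 0 { next }                   # skip blank lines
        {
            count = $NF                    # last field is the sample count
            stack = $0
            sub(/ [0-9]+$/, "", stack)     # drop the trailing count
            n = split(stack, frames, ";")  # frames[n] is the leaf frame
            total[frames[n]] += count
        }
        END { for (f in total) print total[f], f }
    ' "$1" | sort -k1,1nr -k2
}

# Example use:
#   leaf_samples perf.fold | head -n 20
```

This only approximates "self time" per function; the flame graph remains the better view for understanding full call paths.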
Here's an example flame graph:
The BPF program uses `bpf_trace_printk` to print logs. If you suspect that the performance sampling is behaving abnormally, you can view the logs with the following commands:
sudo mount -t tracefs nodev /sys/kernel/tracing
sudo cat /sys/kernel/debug/tracing/trace_pipe
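Depending on the system, `trace_pipe` may be reachable under `/sys/kernel/tracing` or under `/sys/kernel/debug/tracing`. The following sketch picks whichever path exists; `first_existing` is a hypothetical helper introduced for illustration. Because tracefs is usually only accessible to root, the existence check needs to run as root as well.

```shell
#!/bin/sh
# first_existing PATH...
# Hypothetical helper: prints the first argument that exists on disk
# and succeeds, or prints nothing and fails if none of them exist.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example use (as root):
#   pipe=$(first_existing /sys/kernel/tracing/trace_pipe \
#                         /sys/kernel/debug/tracing/trace_pipe) && cat "$pipe"
```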
lua-perf currently has the following known issues:
- Lack of support for `CFA_expression`, which may cause stack backtracing to fail in extreme cases.
- The analysis of `CFA` instructions does not currently handle `vdso`, causing stack backtracing failures for function calls in `vdso`.
- The process of merging C stacks and Lua stacks uses a heuristic strategy, which may have flaws in extreme cases (none have been found so far).
The following tasks are planned for lua-perf:
- Support for `CFA_expression`
- Support for `vdso`
- Optimization of the merging strategy for C stacks and Lua stacks