fs/inode: add dynamic FD backtrace threshold control #18767
xiaoqizhan wants to merge 1 commit into apache:master
Conversation
When CONFIG_FS_BACKTRACE is enabled, collecting a stack trace for every new file descriptor adds avoidable overhead to tasks that use only a small number of FDs. Add a dynamic threshold option so backtrace capture is enabled only after the per-task FD count reaches the configured limit.

Signed-off-by: zhanxiaoqi <zhanxiaoqi@bytedance.com>
struct fd fl_prefds[CONFIG_NFILE_DESCRIPTORS_PER_BLOCK];

#if CONFIG_FS_BACKTRACE > 0 && defined(CONFIG_FS_BACKTRACE_DYNAMIC)
  atomic_t fl_open_count; /* Current open file descriptor count */
Once FS_BACKTRACE is enabled, each fd always reserves space for the backtrace. What is the benefit of skipping the backtrace capture?
Even after the FS_BACKTRACE macro is enabled, space will still be reserved for backtrace operations. My intention here is not to save memory but to reduce performance overhead, since capturing a backtrace incurs significant cost, especially in scenarios involving frequent opening and closing of files.
This is a debugging feature; why do you care about the performance at the cost of making the code more complex? I would suggest you optimize your code to reduce the frequency of open/close, or switch to frame-pointer-based backtrace, which is faster than backtracing through the unwind table.
What I mentioned earlier about frequently opening and closing FDs is just one scenario; it may not apply to my specific application.
Using FP-based stack backtracking can speed up stack capture, but it still incurs performance overhead.
Essentially, fdleak focuses on leaked FDs rather than normal FD allocations. The FDs a task opens at startup are usually not leaked; some tasks may open 50 FDs by default, while others may open only 10.
This patch focuses on detecting leaked FDs, and lets each task configure its own threshold for what counts as a leak. This was not supported in the original implementation.
Have you measured how much improvement (in percent) this patch yields? I am afraid it cannot deliver a visible (measurable) performance improvement, since open is a very slow fs operation.
- I acknowledge that the open() system call is indeed slow.
- The key difference: the slowness of open() comes from necessary business logic, whereas the overhead introduced by sched_backtrace is unnecessary debugging cost, especially for normal tasks with no FD leaks.
- According to actual test data, when opening real filesystem files, sched_backtrace accounts for about 1% of the total open time in most cases, and occasionally reaches 20%.
- FD allocation is not limited to open(). Other operations such as dup, signalfd, timerfd, eventfd, and socket also allocate FDs, and these calls are inherently fast, so adding sched_backtrace to them becomes a significant bottleneck. For example, in my local tests on socket FD allocation, stack tracing (with CONFIG_FS_BACKTRACE=6) introduced an average performance overhead of 25% (1/4).
- In an RTOS, avoiding unnecessary CPU consumption and execution-time jitter is extremely important. The dynamic threshold mechanism balances high performance in normal scenarios against effective debugging capability when exceptions occur.

Below is the log output for part of the test data (with CONFIG_FS_BACKTRACE=6). The value after count= is the execution time consumed by the function.
[ 5] [240] xqzhan FS_ADD_BACKTRACE count=240 usec
[ 5] [240] xqzhan open:path=/data/test.bin, count=26900 usec
[19] [252] xqzhan FS_ADD_BACKTRACE count=170 usec
[19] [252] xqzhan open:path=/dev/kmsg, count=800 usec
[ 5] [240] xqzhan FS_ADD_BACKTRACE count=280 usec
[ 5] [240] xqzhan open:path=/data/dem_test, count=28600 usec
[ 5] [240] xqzhan FS_ADD_BACKTRACE count=320 usec
[ 5] [240] xqzhan open:path=/data/dem_recovery, count=31000 usec
[17] [224] xqzhan FS_ADD_BACKTRACE count=200 usec
[17] [224] xqzhan socket:count=800 usec
[20] [104] xqzhan FS_ADD_BACKTRACE count=100 usec
[20] [104] xqzhan socket:count=600 usec
[ 9] [250] xqzhan FS_ADD_BACKTRACE count=200 usec
[ 9] [250] xqzhan socket:count=700 usec
[62] [224] xqzhan FS_ADD_BACKTRACE count=300 usec
[62] [224] xqzhan socket:count=1200 usec
If you really want to reduce the cost, I would suggest adding a flag (see how TCB_FLAG_HEAP_CHECK is done) in task_group_s::tg_flags to disable/enable backtrace for the whole process, since that is simpler and more general.
When CONFIG_FS_BACKTRACE is enabled, collecting a stack trace for every new file descriptor adds avoidable overhead to tasks that use only a small number of FDs.
Add a dynamic threshold option so backtrace capture is enabled only after the per-task FD count reaches the configured limit.
Summary
This patch introduces a dynamic backtrace threshold control mechanism for file descriptors. It reduces performance overhead on tasks that only allocate a small number of file descriptors and never exceed typical limits or experience FD leaks. The mechanism counts the open file descriptors per task (fl_open_count) and only begins filling the backtrace array (f_backtrace) once the per-task file descriptor count reaches a configurable threshold (CONFIG_FS_BACKTRACE_DEFAULT_THRESHOLD), preserving the ability to diagnose resource leaks when they actually occur.
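A plausible Kconfig shape for the options named in the summary is sketched below; the prompt text, default value, and dependency structure are assumptions, only the option names come from the patch.

```
config FS_BACKTRACE_DYNAMIC
	bool "Dynamic FD backtrace threshold"
	default n
	---help---
		Capture file descriptor backtraces only after a task's
		open FD count reaches the configured threshold.

config FS_BACKTRACE_DEFAULT_THRESHOLD
	int "Default per-task FD backtrace threshold"
	default 20
	depends on FS_BACKTRACE_DYNAMIC
```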
Impact
Testing
Host Machine: Ubuntu Linux (x86_64)
Target: Verified on generic simulators and local boards (e.g. sim:nsh).
Tested fdlist_copy (task creation) to ensure backtrace thresholds and states are correctly propagated/initialized for child tasks.