Skip to content

node_exporter_metrics: Add file path to error messages#11839

Open
rossigee wants to merge 3 commits into
fluent:masterfrom
rossigee:improve-node-exporter-errors
Open

node_exporter_metrics: Add file path to error messages#11839
rossigee wants to merge 3 commits into
fluent:masterfrom
rossigee:improve-node-exporter-errors

Conversation

@rossigee

@rossigee rossigee commented May 22, 2026

Copy link
Copy Markdown

Problem: When the node_exporter_metrics input plugin fails to read hardware sensors (hwmon, thermal zones, cpufreq), the error logs only show:

[error] [ne_utils.c:132 errno=61] No data available
[error] [ne_utils.c:132 errno=6] No such device or address

This is useless for debugging because users have no way to know which file/path is missing.

Solution: Add the file path to error messages in ne_utils_file_read_uint64, ne_utils_file_read_sds, and ne_utils_file_read_lines functions.

Example after fix:

[error] [input:node_exporter_metrics:node_exporter_metrics.0] could not open '/sys/class/hwmon/hwmon4/fan1_label'
[error] [input:node_exporter_metrics:node_exporter_metrics.0] could not read from '/sys/class/thermal/thermal_zone6/temp'

Closes #11838

Summary by CodeRabbit

  • Refactor
    • Improved consistency across node exporter metrics collectors by standardizing filesystem read behavior to provide more context-aware error reporting for CPU, memory, network, disk, processes, thermal, load, and VM statistics.
  • Bug Fixes
    • Fixed load average collection to base its failure path on the actual /proc/loadavg read result rather than a stale/uninitialized value.

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6eb7077e-6e50-4f7b-9cf9-b0086bb42c2e

📥 Commits

Reviewing files that changed from the base of the PR and between 507ee88 and b5eb359.

📒 Files selected for processing (1)
  • plugins/in_node_exporter_metrics/ne_utils.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/in_node_exporter_metrics/ne_utils.h

📝 Walkthrough

Walkthrough

This PR refactors three core filesystem-reading utility functions to accept a context pointer for improved error reporting. All 15 collector modules in the node exporter plugin are updated to pass the context at call sites, enabling better error diagnostics through plugin logging.

Changes

Utility functions context propagation

Layer / File(s) Summary
Core utility function signatures and error reporting
plugins/in_node_exporter_metrics/ne_utils.h, plugins/in_node_exporter_metrics/ne_utils.c
ne_utils_file_read_uint64, ne_utils_file_read_lines, and ne_utils_file_read_sds now accept struct flb_ne *ctx as the first parameter. Error handling uses flb_plg_error(ctx->ins, ...) when context is provided, otherwise falls back to flb_errno().
CPU and thermal zone metrics collection
plugins/in_node_exporter_metrics/ne_cpu_linux.c, plugins/in_node_exporter_metrics/ne_thermalzone_linux.c
Thermal zone reads (core ID, physical package ID, throttle counters) and zone/device numeric value reads are updated to pass ctx to utility functions.
CPU frequency metrics formatting
plugins/in_node_exporter_metrics/ne_cpufreq_linux.c
Blank-line formatting is added before each frequency metric read statement for improved code organization.
Storage and file descriptor metrics
plugins/in_node_exporter_metrics/ne_diskstats_linux.c, plugins/in_node_exporter_metrics/ne_filefd_linux.c, plugins/in_node_exporter_metrics/ne_stat_linux.c, plugins/in_node_exporter_metrics/ne_meminfo_linux.c
File-read return values are explicitly assigned to ret before error checks in diskstats, filefd, and stat; memory info reads are adjusted in configure and update paths.
Network interface and protocol statistics
plugins/in_node_exporter_metrics/ne_netdev_linux.c, plugins/in_node_exporter_metrics/ne_netstat_linux.c, plugins/in_node_exporter_metrics/ne_sockstat_linux.c
Network device (/net/dev), protocol statistics (/net/snmp), and socket counts (/net/sockstat, /net/sockstat6) reads are updated to include context.
Hardware monitoring sensor values
plugins/in_node_exporter_metrics/ne_hwmon_linux.c
Sensor input values, temperature thresholds (max/crit), and chip name reads are updated to pass ctx to utility functions.
Process and system load metrics
plugins/in_node_exporter_metrics/ne_processes_linux.c, plugins/in_node_exporter_metrics/ne_loadavg_linux.c, plugins/in_node_exporter_metrics/ne_vmstat_linux.c
Thread/process stat reads, kernel sysfs values (threads-max, pid_max), load average, and virtual memory stats are updated to pass context; return value assignment is corrected in loadavg.
NVMe device entry values
plugins/in_node_exporter_metrics/ne_nvme_linux.c
NVMe entry value reads are updated to pass ctx as the first argument to ne_utils_file_read_lines.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • fluent/fluent-bit#10719: Updates ne_hwmon_linux.c call sites to match the refactored ne_utils_file_read_* helper signatures.
  • fluent/fluent-bit#10723: Both PRs touch the ne_stat_linux.c stat_update() path for reading /stat—this PR changes the ne_utils_file_read_lines call to pass ctx, while the related PR adds error logging on failure.
  • fluent/fluent-bit#11253: Both PRs modify plugins/in_node_exporter_metrics/ne_utils.c, specifically ne_utils_file_read_lines()—this PR changes its signature and error handling while the related PR adjusts internal buffer sizes.

Suggested labels

backport to v4.0.x

Suggested reviewers

  • edsiper
  • cosmo0920

Poem

🐰 Contexts flow through utilities bright,
Error paths now log the light,
Collectors dance in rhythmic grace,
Context passed from place to place! 🔍

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the primary change: adding file paths to error messages in the node_exporter_metrics plugin for improved debugging.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 44bf2d2665

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread plugins/in_node_exporter_metrics/ne_utils.c
Comment thread plugins/in_node_exporter_metrics/ne_utils.c

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/in_node_exporter_metrics/ne_utils.c`:
- Around line 135-141: The code frees the SDS string p via flb_sds_destroy(p)
before calling read(fd, ...) and then uses p in the error log
flb_plg_error(ctx->ins, "could not read from '%s'", p), leading to a
use-after-free; to fix, delay calling flb_sds_destroy(p) until after the read()
result is handled (i.e., after the if (bytes == -1) block and any flb_plg_error
that references p), and apply the same change to the second identical block
around the other read() (the one at the 280-286 region) so both read/error paths
log p while it remains valid.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 780dcfd7-8ada-4a79-b714-4c89712daf30

📥 Commits

Reviewing files that changed from the base of the PR and between f6126eb and 44bf2d2.

📒 Files selected for processing (17)
  • plugins/in_node_exporter_metrics/ne_cpu_linux.c
  • plugins/in_node_exporter_metrics/ne_cpufreq_linux.c
  • plugins/in_node_exporter_metrics/ne_diskstats_linux.c
  • plugins/in_node_exporter_metrics/ne_filefd_linux.c
  • plugins/in_node_exporter_metrics/ne_hwmon_linux.c
  • plugins/in_node_exporter_metrics/ne_loadavg_linux.c
  • plugins/in_node_exporter_metrics/ne_meminfo_linux.c
  • plugins/in_node_exporter_metrics/ne_netdev_linux.c
  • plugins/in_node_exporter_metrics/ne_netstat_linux.c
  • plugins/in_node_exporter_metrics/ne_nvme_linux.c
  • plugins/in_node_exporter_metrics/ne_processes_linux.c
  • plugins/in_node_exporter_metrics/ne_sockstat_linux.c
  • plugins/in_node_exporter_metrics/ne_stat_linux.c
  • plugins/in_node_exporter_metrics/ne_thermalzone_linux.c
  • plugins/in_node_exporter_metrics/ne_utils.c
  • plugins/in_node_exporter_metrics/ne_utils.h
  • plugins/in_node_exporter_metrics/ne_vmstat_linux.c

Comment thread plugins/in_node_exporter_metrics/ne_utils.c Outdated
Improve error logs to indicate which file path failed when reading
metrics, making it easier to debug missing hardware sensors or
permissions issues.

Signed-off-by: Ross Golder <ross@golder.org>
@rossigee rossigee force-pushed the improve-node-exporter-errors branch from 44bf2d2 to 790aeda Compare June 19, 2026 00:40
Signed-off-by: Ross Golder <ross@golder.org>
@rossigee rossigee force-pushed the improve-node-exporter-errors branch from 790aeda to 507ee88 Compare June 19, 2026 00:45

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
plugins/in_node_exporter_metrics/ne_utils.h (1)

44-44: ⚡ Quick win

Wrap this prototype to match the 100-char line-length target.

This declaration exceeds the repository style target; split arguments across lines like the adjacent prototypes for consistency.

As per coding guidelines: "Use 4-space tabs/indentation and target 100 chars per line; 120 chars max."

Suggested diff
-int ne_utils_file_read_lines(struct flb_ne *ctx, const char *mount, const char *path, struct mk_list *list);
+int ne_utils_file_read_lines(struct flb_ne *ctx,
+                             const char *mount,
+                             const char *path,
+                             struct mk_list *list);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_node_exporter_metrics/ne_utils.h` at line 44, The function
prototype ne_utils_file_read_lines exceeds the repository's 100-character
line-length target. Refactor this prototype by splitting the function arguments
across multiple lines, using 4-space indentation, similar to the style of
adjacent prototypes in the file. Ensure the reformatted declaration respects the
100-character target with a maximum of 120 characters per line.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@plugins/in_node_exporter_metrics/ne_utils.h`:
- Line 44: The function prototype ne_utils_file_read_lines exceeds the
repository's 100-character line-length target. Refactor this prototype by
splitting the function arguments across multiple lines, using 4-space
indentation, similar to the style of adjacent prototypes in the file. Ensure the
reformatted declaration respects the 100-character target with a maximum of 120
characters per line.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f1f2db50-6a07-4f02-a3ab-60e84042ef26

📥 Commits

Reviewing files that changed from the base of the PR and between 44bf2d2 and 790aeda.

📒 Files selected for processing (17)
  • plugins/in_node_exporter_metrics/ne_cpu_linux.c
  • plugins/in_node_exporter_metrics/ne_cpufreq_linux.c
  • plugins/in_node_exporter_metrics/ne_diskstats_linux.c
  • plugins/in_node_exporter_metrics/ne_filefd_linux.c
  • plugins/in_node_exporter_metrics/ne_hwmon_linux.c
  • plugins/in_node_exporter_metrics/ne_loadavg_linux.c
  • plugins/in_node_exporter_metrics/ne_meminfo_linux.c
  • plugins/in_node_exporter_metrics/ne_netdev_linux.c
  • plugins/in_node_exporter_metrics/ne_netstat_linux.c
  • plugins/in_node_exporter_metrics/ne_nvme_linux.c
  • plugins/in_node_exporter_metrics/ne_processes_linux.c
  • plugins/in_node_exporter_metrics/ne_sockstat_linux.c
  • plugins/in_node_exporter_metrics/ne_stat_linux.c
  • plugins/in_node_exporter_metrics/ne_thermalzone_linux.c
  • plugins/in_node_exporter_metrics/ne_utils.c
  • plugins/in_node_exporter_metrics/ne_utils.h
  • plugins/in_node_exporter_metrics/ne_vmstat_linux.c
✅ Files skipped from review due to trivial changes (7)
  • plugins/in_node_exporter_metrics/ne_sockstat_linux.c
  • plugins/in_node_exporter_metrics/ne_meminfo_linux.c
  • plugins/in_node_exporter_metrics/ne_thermalzone_linux.c
  • plugins/in_node_exporter_metrics/ne_filefd_linux.c
  • plugins/in_node_exporter_metrics/ne_netstat_linux.c
  • plugins/in_node_exporter_metrics/ne_cpufreq_linux.c
  • plugins/in_node_exporter_metrics/ne_netdev_linux.c
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/in_node_exporter_metrics/ne_nvme_linux.c
  • plugins/in_node_exporter_metrics/ne_vmstat_linux.c
  • plugins/in_node_exporter_metrics/ne_cpu_linux.c
  • plugins/in_node_exporter_metrics/ne_processes_linux.c
  • plugins/in_node_exporter_metrics/ne_hwmon_linux.c
  • plugins/in_node_exporter_metrics/ne_utils.c

Signed-off-by: Ross Golder <ross@golder.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

node_exporter_metrics: Error messages don't include file path, making debugging impossible

1 participant