
in_ebpf: Implement dns trace #11735

Open
cosmo0920 wants to merge 2 commits into master from
cosmo0920-implement-dns-trace

Conversation

@cosmo0920
Contributor

@cosmo0920 cosmo0920 commented Apr 22, 2026

This PR implements an eBPF trace for DNS queries.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
sudo bin/fluent-bit -i ebpf -ptrace=trace_dns -o stdout -v
  • Debug log output from testing the change
Fluent Bit v5.0.4
* Copyright (C) 2015-2026 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____ 
|  ___| |                | |   | ___ (_) |         |  ___||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   _|___ \ | |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \|  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V //\__/ /\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)\___/


[2026/04/22 13:08:57.904] [ info] Configuration:
[2026/04/22 13:08:57.925] [ info]  flush time     | 1.000000 seconds
[2026/04/22 13:08:57.930] [ info]  grace          | 5 seconds
[2026/04/22 13:08:57.931] [ info]  daemon         | 0
[2026/04/22 13:08:57.931] [ info] ___________
[2026/04/22 13:08:57.931] [ info]  inputs:
[2026/04/22 13:08:57.931] [ info]      ebpf
[2026/04/22 13:08:57.932] [ info] ___________
[2026/04/22 13:08:57.932] [ info]  filters:
[2026/04/22 13:08:57.932] [ info] ___________
[2026/04/22 13:08:57.932] [ info]  outputs:
[2026/04/22 13:08:57.932] [ info]      stdout.0
[2026/04/22 13:08:57.933] [ info] ___________
[2026/04/22 13:08:57.933] [ info]  collectors:
[2026/04/22 13:08:58.002] [ info] [fluent bit] version=5.0.4, commit=dfdc57cc69, pid=355581
[2026/04/22 13:08:58.011] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2026/04/22 13:08:58.016] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2026/04/22 13:08:58.017] [ info] [simd    ] SSE2
[2026/04/22 13:08:58.017] [ info] [cmetrics] version=2.1.2
[2026/04/22 13:08:58.017] [ info] [ctraces ] version=0.7.1
[2026/04/22 13:08:58.033] [ info] [input:ebpf:ebpf.0] initializing
[2026/04/22 13:08:58.034] [ info] [input:ebpf:ebpf.0] storage_strategy='memory' (memory only)
[2026/04/22 13:08:58.035] [debug] [ebpf:ebpf.0] created event channels: read=21 write=22
[2026/04/22 13:08:58.036] [debug] [input:ebpf:ebpf.0] initializing eBPF input plugin
[2026/04/22 13:08:58.041] [debug] [input:ebpf:ebpf.0] processing trace: trace_dns
[2026/04/22 13:08:58.041] [debug] [input:ebpf:ebpf.0] setting up trace configuration for: trace_dns
[2026/04/22 13:09:03.453] [debug] [input:ebpf:ebpf.0] attaching BPF program for trace: trace_dns
[2026/04/22 13:09:03.463] [debug] [input:ebpf:ebpf.0] registering trace handler for: trace_dns
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: dns_connect_pending
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: dns_sockets
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: dns_queries
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: events
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: dns_recv
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: gadget_heap
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: gadget_mntns_filter_map
[2026/04/22 13:09:03.464] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: trace_dn.rodata
[2026/04/22 13:09:03.465] [trace] [input:ebpf:ebpf.0 at /home/cosmo/GitHub/fluent-bit/plugins/in_ebpf/in_ebpf.c:49] found BPF map: trace_dn.bss
[2026/04/22 13:09:03.468] [ info] [input:ebpf:ebpf.0] registered trace handler for: trace_dns
[2026/04/22 13:09:03.468] [ info] [input:ebpf:ebpf.0] trace configuration completed for: trace_dns
[2026/04/22 13:09:03.469] [debug] [input:ebpf:ebpf.0] setting up collector with poll interval: 1000 ms
[2026/04/22 13:09:03.471] [ info] [input:ebpf:ebpf.0] eBPF input plugin initialized successfully
[2026/04/22 13:09:03.475] [debug] [stdout:stdout.0] created event channels: read=49 write=50
[2026/04/22 13:09:03.512] [ info] [sp] stream processor started
[2026/04/22 13:09:03.514] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2026/04/22 13:09:03.520] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:03.527] [ info] [output:stdout:stdout.0] worker #0 started
[2026/04/22 13:09:03.664] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:03.924] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:03.936] [debug] [input:ebpf:ebpf.0] collecting events from ring buffers
[2026/04/22 13:09:03.937] [debug] [input:ebpf:ebpf.0] consuming events from ring buffer trace_dns
[2026/04/22 13:09:03.938] [debug] [input:ebpf:ebpf.0] successfully consumed events from ring buffer trace_dns
[2026/04/22 13:09:03.939] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:04.164] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:04.414] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:04.664] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:04.914] [debug] [input:ebpf:ebpf.0] collecting events from ring buffers
[2026/04/22 13:09:04.915] [debug] [input:ebpf:ebpf.0] consuming events from ring buffer trace_dns
[2026/04/22 13:09:04.964] [trace] [input chunk] update output instances with new chunk size diff=186, records=1, input=ebpf.0
[2026/04/22 13:09:04.968] [trace] [input chunk] update output instances with new chunk size diff=190, records=1, input=ebpf.0
[2026/04/22 13:09:04.968] [debug] [input:ebpf:ebpf.0] successfully consumed events from ring buffer trace_dns
[2026/04/22 13:09:04.968] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:05.164] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:05.414] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:05.414] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:05.664] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:05.923] [trace] [task 0x6653660] created (id=0)
[2026/04/22 13:09:05.929] [debug] [task] created task=0x6653660 id=0 OK
[2026/04/22 13:09:05.930] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[0] ebpf.0: [[1776830944.927134652, {}], {"event_type"=>"dns", "pid"=>4719, "tid"=>343834, "comm"=>"DNS Res~ver #52", "query"=>"sync-1-us-west1-g.sync.services.mozilla.com", "query_type"=>1, "txid"=>11237, "response"=>0, "rcode"=>0, "latency_ns"=>0, "error_raw"=>0}]
[1] ebpf.0: [[1776830944.966563667, {}], {"event_type"=>"dns", "pid"=>4719, "tid"=>343834, "comm"=>"DNS Res~ver #52", "query"=>"sync-1-us-west1-g.sync.services.mozilla.com", "query_type"=>1, "txid"=>11237, "response"=>1, "rcode"=>0, "latency_ns"=>10443380, "error_raw"=>0}]
[2026/04/22 13:09:05.945] [debug] [out flush] cb_destroy coro_id=0
[2026/04/22 13:09:05.946] [trace] [coro] destroy coroutine=0x833dbf0 data=0x833dc10
[2026/04/22 13:09:05.946] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:05.931] [debug] [input:ebpf:ebpf.0] collecting events from ring buffers
[2026/04/22 13:09:05.946] [debug] [input:ebpf:ebpf.0] consuming events from ring buffer trace_dns
[2026/04/22 13:09:05.947] [debug] [input:ebpf:ebpf.0] successfully consumed events from ring buffer trace_dns
[2026/04/22 13:09:05.948] [trace] [engine] [task event] task_id=0 out_id=0 return=OK
[2026/04/22 13:09:05.954] [debug] [task] destroy task=0x6653660 (task_id=0)
[2026/04/22 13:09:05.958] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:06.164] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:06.414] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:06.664] [trace] [sched] 0 timer coroutines destroyed
^C[2026/04/22 13:09:06] [engine] caught signal (SIGINT)
[2026/04/22 13:09:06.819] [trace] [engine] flush enqueued data
[2026/04/22 13:09:06.819] [ warn] [engine] service will shutdown in max 5 seconds
[2026/04/22 13:09:06.820] [ info] [engine] pausing all inputs..
[2026/04/22 13:09:06.821] [ info] [input] pausing ebpf.0
[2026/04/22 13:09:06.822] [debug] [input:ebpf:ebpf.0] collector paused
[2026/04/22 13:09:06.823] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:06.914] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:06.920] [ info] [engine] service has stopped (0 pending tasks)
[2026/04/22 13:09:06.920] [ info] [input] pausing ebpf.0
[2026/04/22 13:09:06.921] [debug] [input:ebpf:ebpf.0] collector paused
[2026/04/22 13:09:06.923] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2026/04/22 13:09:06.923] [trace] [sched] 0 timer coroutines destroyed
[2026/04/22 13:09:06.930] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2026/04/22 13:09:07.144] [ info] [input:ebpf:ebpf.0] eBPF input plugin exited

  • Attached Valgrind output that shows no leaks or memory corruption was found
==355581== 
==355581== HEAP SUMMARY:
==355581==     in use at exit: 0 bytes in 0 blocks
==355581==   total heap usage: 3,784 allocs, 3,784 frees, 17,106,722 bytes allocated
==355581== 
==355581== All heap blocks were freed -- no leaks are possible
==355581== 

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features
    • Added DNS tracing: correlates queries and responses, measures latency, records response codes and query type, captures raw query data, and automatically tracks socket lifecycle for accurate pairing.
  • Chores
    • Updated the "Trace" configuration description to list supported trace examples (dns, exec, malloc, signal, vfs, tcp) and note multi-use support.

@coderabbitai

coderabbitai Bot commented Apr 22, 2026

Walkthrough

This PR adds complete DNS query/response tracing to the eBPF input plugin. It introduces kernel-space syscall probes (connect, sendto, recvfrom) that capture and correlate DNS transactions, and user-space event decoding that extracts query names and response metadata into Fluent Bit log records.

Changes

DNS eBPF Trace Support

| Layer / File(s) | Summary |
| --- | --- |
| Event Data Shape: plugins/in_ebpf/traces/includes/common/events.h, plugins/in_ebpf/traces/includes/common/encoder.h | Added struct dns_event with DNS fields and buffers; added DNS_* size macros; extended the event union to include DNS; mapped EVENT_TYPE_DNS to "dns". |
| eBPF Kernel Tracing: plugins/in_ebpf/traces/dns/bpf.c | Added helpers and probes: validate/parse DNS headers, trace connect/sendto/recvfrom, track socket eligibility and in-flight queries in BPF maps, correlate responses by (pid,fd,txid), compute latency, emit query and response events, and added license string. |
| User-space Event Handler: plugins/in_ebpf/traces/dns/handler.h, plugins/in_ebpf/traces/dns/handler.c | Added handler header and implementation: decode_dns_query_name(), encode_dns_event() to build Fluent Bit records with query, query_type, txid, response, rcode, latency_ns, error_raw, and trace_dns_handler() to validate/dispatch events and append encoded logs. |
| Trace Registration & Config: plugins/in_ebpf/traces/traces.h, plugins/in_ebpf/in_ebpf.c | Included DNS trace skeleton and handler, added DEFINE_GET_BPF_OBJECT(trace_dns), registered trace_dns in trace_table, and updated plugin Trace option help text to list DNS among examples. |
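
The user-space handler is described as decoding the raw query bytes into a readable name. The PR's decode_dns_query_name() is not shown on this page, so the following is only a minimal sketch of the general technique, assuming the standard DNS wire format (length-prefixed labels ending in a zero byte) and rejecting compression pointers. The name sketch_decode_qname and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the PR's decode_dns_query_name(): decode the
 * QNAME starting at raw[offset] into a dotted, NUL-terminated string.
 * Returns the decoded length, or -1 on malformed or unsupported input.
 */
static int sketch_decode_qname(const uint8_t *raw, size_t raw_len,
                               size_t offset, char *out, size_t out_size)
{
    size_t pos = offset;
    size_t written = 0;

    if (out_size == 0) {
        return -1;
    }

    while (pos < raw_len) {
        uint8_t label_len = raw[pos++];

        if (label_len == 0) {
            /* Root label: the name is complete. */
            out[written] = '\0';
            return (int) written;
        }
        if (label_len > 63) {
            /* Compression pointer or reserved bits: not handled here. */
            return -1;
        }
        if (pos + label_len > raw_len) {
            /* Label runs past the captured bytes. */
            return -1;
        }
        /* Need room for an optional dot, the label, and the final NUL. */
        if (written + (written > 0 ? 1 : 0) + label_len + 1 > out_size) {
            return -1;
        }
        if (written > 0) {
            out[written++] = '.';
        }
        memcpy(out + written, raw + pos, label_len);
        written += label_len;
        pos += label_len;
    }

    /* Ran out of bytes before reaching the root label. */
    return -1;
}
```

In a query packet the name starts right after the 12-byte header, so a caller would pass offset 12. Truncated captures return -1 rather than a partial name, which is one possible design choice; the PR may handle truncation differently.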

Sequence Diagram

sequenceDiagram
  participant App as Application
  participant Kernel as eBPF Kernel
  participant Maps as BPF Maps
  participant Handler as Fluent Bit DNS Handler
  participant FLB as Fluent Bit Input

  App->>Kernel: connect/sendto/recvfrom syscalls
  Kernel->>Maps: store pending connect / dns_queries / dns_recv
  Kernel->>Handler: emit EVENT_TYPE_DNS (query/response)
  Handler->>Handler: decode query name, assemble record
  Handler->>FLB: flb_input_log_append(record)
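
The walkthrough mentions header validation, correlation by transaction ID, and latency measurement. As a rough illustration only, the sketch below parses the fixed 12-byte DNS header in plain user-space C and computes a latency from two timestamps. It assumes the standard header layout (transaction ID in bytes 0 and 1, QR flag in the top bit of byte 2, RCODE in the low four bits of byte 3); the struct and function names are hypothetical and are not taken from bpf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the header fields the trace reports. */
struct sketch_dns_header {
    uint16_t txid;      /* transaction ID, used to pair query and response */
    uint8_t  response;  /* 0 = query, 1 = response (QR bit) */
    uint8_t  rcode;     /* response code, 0 = NOERROR */
    uint16_t qdcount;   /* number of questions */
};

/* Parse the fixed 12-byte DNS header. Returns 0 on success, -1 if too short. */
static int sketch_parse_dns_header(const uint8_t *buf, size_t len,
                                   struct sketch_dns_header *hdr)
{
    if (len < 12) {
        return -1;
    }
    hdr->txid = (uint16_t) ((buf[0] << 8) | buf[1]);
    hdr->response = (uint8_t) ((buf[2] >> 7) & 0x1);
    hdr->rcode = (uint8_t) (buf[3] & 0x0f);
    hdr->qdcount = (uint16_t) ((buf[4] << 8) | buf[5]);
    return 0;
}

/* Latency between sending a query and receiving its response. */
static uint64_t sketch_latency_ns(uint64_t sent_ns, uint64_t recv_ns)
{
    /* Guard against clock anomalies instead of wrapping around. */
    return recv_ns >= sent_ns ? recv_ns - sent_ns : 0;
}
```

A response would be matched to its query when the parsed txid equals the one stored for the same process and socket, after which the stored send time feeds sketch_latency_ns.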

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • edsiper

Poem

🐰 I nibble bytes where txids play,

Queries hop and latencies sway,
Kernel whispers, maps keep score,
Userland logs what came before,
Fluent Bit hums—DNS ballet.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'in_ebpf: Implement dns trace' accurately and concisely describes the main change: adding DNS tracing functionality to the eBPF input plugin. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
@cosmo0920 cosmo0920 force-pushed the cosmo0920-implement-dns-trace branch from 2cc2b16 to 56d1e77 Compare May 7, 2026 06:20
@cosmo0920 cosmo0920 marked this pull request as ready for review May 7, 2026 06:41
@cosmo0920 cosmo0920 requested a review from edsiper as a code owner May 7, 2026 06:41

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 56d1e77459


Comment thread plugins/in_ebpf/traces/dns/bpf.c
Comment thread plugins/in_ebpf/traces/dns/bpf.c Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
plugins/in_ebpf/traces/dns/bpf.c (2)

58-63: dns_queries entries are never expired for unanswered or response-missed queries.

Entries are added in trace_dns_sendto_enter and removed in trace_dns_recvfrom_exit. If a DNS query receives no response (UDP loss, timeout, or the response is received via a non-recvfrom syscall such as read/recv), the entry persists indefinitely. With max_entries: 16384, a workload with bursts of unanswered queries can saturate the map and silently drop new DNS query tracking.

Consider implementing a periodic cleanup mechanism, or supplementing with a recvmsg/read tracepoint to broaden response capture.
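
One way to picture the periodic cleanup idea is a timestamp-based sweep. The sketch below is a plain user-space model, not BPF code and not the PR's implementation: it assumes each tracked query stores its send time and that entries older than a chosen TTL can be dropped. All names (sketch_query_state, sketch_expire_stale) and the TTL value used by a caller are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one in-flight query slot. */
struct sketch_query_state {
    uint64_t sent_ns;  /* when the query was sent */
    int      in_use;   /* non-zero while the query awaits a response */
};

/* A query is stale when it has waited longer than ttl_ns. */
static int sketch_is_stale(uint64_t now_ns, uint64_t sent_ns, uint64_t ttl_ns)
{
    return now_ns >= sent_ns && (now_ns - sent_ns) > ttl_ns;
}

/* Sweep the table, free stale slots, and return how many were freed. */
static size_t sketch_expire_stale(struct sketch_query_state *table, size_t n,
                                  uint64_t now_ns, uint64_t ttl_ns)
{
    size_t removed = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (table[i].in_use &&
            sketch_is_stale(now_ns, table[i].sent_ns, ttl_ns)) {
            table[i].in_use = 0;
            removed++;
        }
    }
    return removed;
}
```

An LRU-style map, which the review also suggests, would avoid the sweep entirely at the cost of evicting by recency rather than by age.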

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_ebpf/traces/dns/bpf.c` around lines 58 - 63, dns_queries map
entries (map dns_queries) are never removed when a DNS query never gets a
recvfrom response, which can saturate the map added in trace_dns_sendto_enter
and only removed in trace_dns_recvfrom_exit; change the strategy by either (A)
switching the map type from BPF_MAP_TYPE_HASH to BPF_MAP_TYPE_LRU_HASH (adjust
dns_queries declaration and max_entries remains) so old/unanswered entries are
automatically evicted, or (B) implement expiry logic by storing a timestamp in
struct dns_query_state and adding a periodic cleanup path (or extend probes) to
remove stale entries, and/or add additional probes (trace_dns_recvmsg_exit /
trace_dns_read_exit) to remove entries when responses arrive via other syscalls;
update dns_queries usage in trace_dns_sendto_enter and trace_dns_recvfrom_exit
accordingly.

157-174: is_dns_destination only matches IPv4 (AF_INET); IPv6 DNS traffic is silently ignored.

Applications using AF_INET6 resolvers (including dual-stack hosts querying ::1 or a v6 DNS server) will not have their sockets registered in dns_sockets, and their sendto calls won't be intercepted unless an explicit addr is passed. This means DNS traces will silently produce incomplete records on IPv6-capable systems.
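
To illustrate the dual-family check being asked for, here is a user-space sketch of the branching logic. It is not the BPF helper from the PR: it uses memcpy where the BPF program would use bpf_probe_read_user, assumes the Linux sockaddr layout (family field first), and the names sketch_is_dns_destination and SKETCH_DNS_PORT are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#define SKETCH_DNS_PORT 53  /* hypothetical stand-in for the PR's DNS_PORT */

/* Return 1 if addr is an IPv4 or IPv6 destination on the DNS port, else 0. */
static int sketch_is_dns_destination(const void *addr, socklen_t addrlen)
{
    sa_family_t family;

    if (addr == NULL || addrlen < sizeof(family)) {
        return 0;
    }
    /* Read only the family first, then branch on it. */
    memcpy(&family, addr, sizeof(family));

    if (family == AF_INET) {
        struct sockaddr_in dst4;

        if (addrlen < sizeof(dst4)) {
            return 0;
        }
        memcpy(&dst4, addr, sizeof(dst4));
        return ntohs(dst4.sin_port) == SKETCH_DNS_PORT;
    }

    if (family == AF_INET6) {
        struct sockaddr_in6 dst6;

        if (addrlen < sizeof(dst6)) {
            return 0;
        }
        memcpy(&dst6, addr, sizeof(dst6));
        return ntohs(dst6.sin6_port) == SKETCH_DNS_PORT;
    }

    return 0;
}
```

Validating addrlen before each full copy keeps a short buffer tagged as AF_INET6 from being read past its end.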

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_ebpf/traces/dns/bpf.c` around lines 157 - 174, is_dns_destination
currently only handles AF_INET; modify it to detect and handle AF_INET6 as well
by first reading the sa_family from user addr (using bpf_probe_read_user on the
family/first bytes) and then branching: for AF_INET, read a struct sockaddr_in
and check dst.sin_family == AF_INET && bpf_ntohs(dst.sin_port) == DNS_PORT; for
AF_INET6, read a struct sockaddr_in6 and check dst6.sin6_family == AF_INET6 &&
bpf_ntohs(dst6.sin6_port) == DNS_PORT. Ensure you validate addrlen before each
full read (>= sizeof(struct sockaddr_in) or >= sizeof(struct sockaddr_in6)) and
keep existing AF_INET behavior intact; use the same DNS_PORT, AF_INET6, struct
sockaddr_in6, is_dns_destination symbols to locate and update the logic.
plugins/in_ebpf/traces/includes/common/events.h (1)

136-146: ⚡ Quick win

char query[DNS_NAME_MAX] is never written or read — dead field.

The BPF programs (bpf.c) only populate query_raw and query_raw_len; query is never written. The handler (handler.c) decodes into a local query_name buffer and never reads ev->details.dns.query. This 128-byte field is dead weight in every DNS ring-buffer slot.
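
The size cost can be checked with sizeof on two versions of the struct. This is a sketch under stated assumptions: the 128-byte name size comes from the comment above, while the value 256 for the raw buffer is a placeholder chosen for illustration because the actual DNS_QUERY_RAW_MAX is not shown on this page. Fixed-width stdint types stand in for the kernel __u8/__u16/__u64 types.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DNS_NAME_MAX      128  /* per the review comment above */
#define SKETCH_DNS_QUERY_RAW_MAX 256  /* assumed value, for illustration only */

/* Layout with the unused name buffer still present. */
struct sketch_dns_event_with_query {
    uint16_t txid;
    uint16_t query_type;
    uint8_t  rcode;
    uint8_t  response;
    uint16_t query_raw_len;
    uint64_t latency_ns;
    int      error_raw;
    char     query[SKETCH_DNS_NAME_MAX];
    uint8_t  query_raw[SKETCH_DNS_QUERY_RAW_MAX];
};

/* Same layout with the unused buffer removed. */
struct sketch_dns_event_without_query {
    uint16_t txid;
    uint16_t query_type;
    uint8_t  rcode;
    uint8_t  response;
    uint16_t query_raw_len;
    uint64_t latency_ns;
    int      error_raw;
    uint8_t  query_raw[SKETCH_DNS_QUERY_RAW_MAX];
};

/* Bytes saved per event by dropping the unused field. */
static unsigned long sketch_bytes_saved(void)
{
    return (unsigned long) (sizeof(struct sketch_dns_event_with_query) -
                            sizeof(struct sketch_dns_event_without_query));
}
```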

🔧 Proposed fix
 struct dns_event {
     __u16 txid;
     __u16 query_type;
     __u8 rcode;
     __u8 response;
     __u16 query_raw_len;
     __u64 latency_ns;
     int error_raw;
-    char query[DNS_NAME_MAX];
     __u8 query_raw[DNS_QUERY_RAW_MAX];
 };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_ebpf/traces/includes/common/events.h` around lines 136 - 146, The
struct dns_event contains a dead field char query[DNS_NAME_MAX] that is never
populated by bpf.c nor read by handler.c; remove that unused field from the
struct dns_event definition in events.h to reduce ring-buffer slot size, and
then rebuild to ensure no remaining references rely on dns.query; verify bpf.c
continues to populate query_raw and query_raw_len and handler.c continues to
decode into its local query_name buffer (update any tests or comments mentioning
dns.query if present).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/in_ebpf/traces/dns/handler.c`:
- Around line 202-217: The encoder is only reset on the success path, so ensure
flb_log_event_encoder_reset(encoder) is always called before returning from the
handler: after a failing encode_dns_event (ret == -1) and after a failing
flb_input_log_append (ret == -1). Update the control flow in the function that
calls encode_dns_event and flb_input_log_append (references: encode_dns_event,
flb_input_log_append, flb_log_event_encoder_reset, event_ctx->ins, encoder) to
perform a single cleanup/exit path (e.g., a local label or common return block)
that calls flb_log_event_encoder_reset(encoder) and then returns the appropriate
error code so the encoder cannot retain stale state across invocations.
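
The single cleanup path described above can be sketched with a common exit label. The code below uses mock stand-ins rather than the real Fluent Bit encoder API, so every type and function here (sketch_encoder, sketch_encode, sketch_append, sketch_handle_event) is hypothetical; only the control-flow pattern is the point.

```c
#include <assert.h>

/* Mock encoder: tracks whether stale state is pending and counts resets. */
struct sketch_encoder {
    int pending;
    int resets;
};

static void sketch_encoder_reset(struct sketch_encoder *enc)
{
    enc->pending = 0;
    enc->resets++;
}

/* Mock encode step: leaves state in the encoder, may fail on request. */
static int sketch_encode(struct sketch_encoder *enc, int fail)
{
    enc->pending = 1;
    return fail ? -1 : 0;
}

/* Mock append step: may fail on request. */
static int sketch_append(struct sketch_encoder *enc, int fail)
{
    (void) enc;
    return fail ? -1 : 0;
}

/* Every return path goes through the cleanup label, so reset always runs. */
static int sketch_handle_event(struct sketch_encoder *enc,
                               int fail_encode, int fail_append)
{
    int ret;

    ret = sketch_encode(enc, fail_encode);
    if (ret == -1) {
        goto cleanup;
    }

    ret = sketch_append(enc, fail_append);

cleanup:
    sketch_encoder_reset(enc);
    return ret;
}
```

With this shape, adding a new failure point later only requires another goto cleanup, not another copy of the reset call.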

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f49fe630-76c2-49e6-958f-ec90c541ae9a

📥 Commits

Reviewing files that changed from the base of the PR and between 54a9ebc and 56d1e77.

📒 Files selected for processing (7)
  • plugins/in_ebpf/in_ebpf.c
  • plugins/in_ebpf/traces/dns/bpf.c
  • plugins/in_ebpf/traces/dns/handler.c
  • plugins/in_ebpf/traces/dns/handler.h
  • plugins/in_ebpf/traces/includes/common/encoder.h
  • plugins/in_ebpf/traces/includes/common/events.h
  • plugins/in_ebpf/traces/traces.h

Comment thread plugins/in_ebpf/traces/dns/handler.c Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/in_ebpf/traces/dns/bpf.c`:
- Around line 55-59: The code currently stores the user pointer in
dns_connect_pending at sys_enter_connect and reads user memory at
sys_exit_connect, leading to TOCTOU; instead, in sys_enter_connect (where struct
dns_connect_args and dns_connect_pending are used) perform a single safe copy of
the needed destination fields (socket family and port) from userspace into
kernel/BPF-managed storage and store only those scalar values (e.g., family,
port, and fd) in dns_connect_pending; then in sys_exit_connect consume only
those stored scalar values in sys_exit_connect (do not dereference the original
userspace addr pointer). Update references to dns_connect_pending,
sys_enter_connect, and sys_exit_connect so exit logic uses the copied
family/port values rather than reading user memory.
- Around line 61-67: The dns_queries BPF map is a plain BPF_MAP_TYPE_HASH and
can retain stale entries when responses are lost, leading to saturation; change
the map to use an eviction-capable type (e.g., BPF_MAP_TYPE_LRU_HASH) or similar
LRU/TTL-backed map and keep the same key/value types and max_entries so stale
entries are evicted automatically; update the dns_queries definition (symbol:
dns_queries) accordingly and ensure the existing explicit removal logic (used
around the response handling code paths) remains compatible with the new map
type.
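
The first finding above (copy the destination once at syscall entry, then use only the stored scalars at exit) can be modeled in user space as follows. This is a sketch, not the PR's BPF code: memcpy stands in for the single user-memory read, the pending record stands in for a dns_connect_pending map value, success is assumed to mean a return value of 0, and all sketch_* names are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Scalars captured at entry; no user pointer is kept. */
struct sketch_connect_pending {
    int      fd;
    uint16_t family;
    uint16_t port;   /* host byte order */
};

/* Entry side: copy family and port exactly once. Returns 0 on success. */
static int sketch_connect_enter(int fd, const void *user_addr,
                                socklen_t addrlen,
                                struct sketch_connect_pending *pending)
{
    sa_family_t family;

    if (user_addr == NULL || addrlen < sizeof(family)) {
        return -1;
    }
    memcpy(&family, user_addr, sizeof(family));

    if (family == AF_INET && addrlen >= sizeof(struct sockaddr_in)) {
        struct sockaddr_in a4;

        memcpy(&a4, user_addr, sizeof(a4));
        pending->port = ntohs(a4.sin_port);
    }
    else if (family == AF_INET6 && addrlen >= sizeof(struct sockaddr_in6)) {
        struct sockaddr_in6 a6;

        memcpy(&a6, user_addr, sizeof(a6));
        pending->port = ntohs(a6.sin6_port);
    }
    else {
        return -1;
    }

    pending->fd = fd;
    pending->family = family;
    return 0;
}

/* Exit side: decide from stored scalars only. Returns 1 to track the socket. */
static int sketch_connect_exit(const struct sketch_connect_pending *pending,
                               int syscall_ret)
{
    return syscall_ret == 0 && pending->port == 53;
}
```

Because the exit side never touches the caller's buffer, changing that buffer between entry and exit cannot alter the decision.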
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8f24cbc2-d99c-4dee-b951-666d551e3bd5

📥 Commits

Reviewing files that changed from the base of the PR and between 56d1e77 and 2dbc4d5.

📒 Files selected for processing (2)
  • plugins/in_ebpf/traces/dns/bpf.c
  • plugins/in_ebpf/traces/dns/handler.c

Comment thread plugins/in_ebpf/traces/dns/bpf.c
Comment thread plugins/in_ebpf/traces/dns/bpf.c
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
plugins/in_ebpf/traces/dns/bpf.c (2)

367-374: 💤 Low value

Optional: make raw_len upper-bound explicit for BPF-verifier clarity.

raw_len is clamped to DNS_QUERY_RAW_MAX via the ternary on Line 367, but the subsequent cast to __u16 can obscure that bound from some older BPF verifiers. An explicit clamp after the cast makes the scalar range unambiguous without changing behaviour.

🔧 Proposed guard
     raw_len = (__u16) (len > DNS_QUERY_RAW_MAX ? DNS_QUERY_RAW_MAX : len);
+    if (raw_len > DNS_QUERY_RAW_MAX) {
+        raw_len = DNS_QUERY_RAW_MAX;
+    }
     state.query_raw_len = raw_len;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_ebpf/traces/dns/bpf.c` around lines 367 - 374, The ternary cast to
__u16 for raw_len may not make the upper bound obvious to older BPF verifiers;
explicitly clamp raw_len to DNS_QUERY_RAW_MAX after the cast so its scalar range
is unambiguous. Update the code around raw_len and state.query_raw_len (the
computation that currently uses DNS_QUERY_RAW_MAX and the subsequent assignment
to state.query_raw_len) to add an explicit check like "if (raw_len >
DNS_QUERY_RAW_MAX) raw_len = DNS_QUERY_RAW_MAX" before using raw_len in
bpf_probe_read_user(state.query_raw, raw_len, payload), leaving behavior
unchanged but making the verifier happy. Ensure references remain to raw_len,
DNS_QUERY_RAW_MAX, state.query_raw_len, and bpf_probe_read_user.

287-303: 💤 Low value

trace_dns_close_enter – redundant double cast on Line 298.

(__s32) ((int) ctx->args[0]) applies two equivalent 32-bit signed casts. int and __s32 are the same width; the inner (int) cast is superfluous.

🔧 Proposed simplification
-    key.fd = (__s32) ((int) ctx->args[0]);
+    key.fd = (int) ctx->args[0];
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/in_ebpf/traces/dns/bpf.c` around lines 287 - 303, In
trace_dns_close_enter, the fd assignment does an unnecessary double cast;
replace key.fd = (__s32) ((int) ctx->args[0]); with a single 32-bit cast (e.g.,
key.fd = (__s32) ctx->args[0]; or cast to (int32_t)) to remove the redundant
(int) cast while keeping the intended conversion for dns_socket_key.key.fd and
the bpf_map_delete_elem(&dns_sockets, &key) call intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1b540f4c-4fc1-40b3-8b06-f0a440df756b

📥 Commits

Reviewing files that changed from the base of the PR and between 2dbc4d5 and ba73369.

📒 Files selected for processing (2)
  • plugins/in_ebpf/traces/dns/bpf.c
  • plugins/in_ebpf/traces/dns/handler.c
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/in_ebpf/traces/dns/handler.c
