Fix missed DHCP responses caused by pcap timeout race #1
Open
konurnordberg wants to merge 1 commit into
Pull request overview
This PR adjusts libpcap capture configuration in dhcp-probe to reduce the chance of missing DHCP/BOOTP replies due to buffering and the SIGALRM/pcap_breakloop() timing window.
Changes:
- Introduces a helper to derive a shorter pcap_open_live() packet buffer timeout from response_wait_time.
- Switches pcap_open_live() to use the derived timeout instead of the full response window value.
Comment on lines +1316 to +1321
    int pcap_timeout;

    if (pcap_timeout < PCAP_PACKET_BUFFER_TIMEOUT_MIN_MS)
        pcap_timeout = PCAP_PACKET_BUFFER_TIMEOUT_MIN_MS;

    return pcap_timeout;
Author
Comment is outdated, disregard
dhcp-probe uses response_wait_time as the total window for listening for DHCP/BOOTP replies, enforced by alarm() and pcap_breakloop(). The timeout passed to pcap_open_live(), however, is a packet buffer timeout.

When both values are the same, replies can remain buffered inside libpcap until dhcp_probe's alarm fires. In that case pcap_dispatch() can return due to pcap_breakloop() before the buffered reply is delivered to process_response(), so the alert path is never reached. This race condition causes dhcp-probe to fail to detect DHCP servers nearly 100 percent of the time (at least on certain networks).

Use a shorter pcap buffer timeout derived from response_wait_time so pcap_dispatch() wakes periodically and processes captured replies before the overall response window expires.

Change-Id: If4b9d6076447fcecdbe069277325d067dfb0fd55
d092d83 to ee46e7f
Problem
dhcp-probe uses response_wait_time as the total window for receiving DHCP/BOOTP replies after sending a probe packet. That window is enforced with alarm() and pcap_breakloop().

The same value was also passed to pcap_open_live() as the libpcap timeout. That timeout is a packet buffer timeout, not the total application-level response window. In practice, replies can be captured by pcap but remain buffered until the alarm fires. When the alarm calls pcap_breakloop(), pcap_dispatch() may return before delivering the buffered packet to process_response(). This is an undesirable race condition.

The result is that rogue DHCP replies can be missed and the configured alert program is not called. In the test network described below, replies were missed the vast majority of the time: only about 1-3% of illegal DHCP responses were successfully processed by the tool.
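To make the timing argument concrete, here is a deliberately simplified toy model, not libpcap's actual behaviour. It assumes a buffered reply is handed to the callback at the first multiple of the buffer timeout at or after its arrival, and that a hand-off landing exactly on the alarm loses to pcap_breakloop(). The function name reply_delivered() and the 40 ms arrival time used in the note below are hypothetical.

```c
#include <stdbool.h>

/* Toy model only. Assumptions (not libpcap guarantees):
 *  - a buffered packet is handed off at the first multiple of the buffer
 *    timeout at or after its arrival time;
 *  - a hand-off that coincides with the alarm is lost to pcap_breakloop().
 * All times are in milliseconds, measured from the start of the window. */
bool reply_delivered(int arrival_ms, int buffer_timeout_ms, int window_ms)
{
    /* Round the arrival time up to the next buffer-timeout boundary. */
    int wakeups = (arrival_ms + buffer_timeout_ms - 1) / buffer_timeout_ms;
    int delivery_ms = wakeups * buffer_timeout_ms;

    /* The reply only counts if it is handed off before the alarm fires. */
    return delivery_ms < window_ms;
}
```

Under these assumptions, a reply arriving 40 ms into a 10000 ms window is missed when the buffer timeout equals the window (hand-off at 10000 ms) and delivered when the buffer timeout is 500 ms (hand-off at 500 ms). Real buffering behaviour is platform dependent, so this only illustrates why equal values are fragile.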
Fix
Use a shorter pcap packet buffer timeout derived from response_wait_time so pcap_dispatch() wakes periodically and processes any buffered packets before dhcp_probe's overall response window expires.

Specifically, set the pcap packet buffer timeout to 1/20 of response_wait_time. This was determined experimentally to be a good balance between waking pcap_dispatch() frequently enough to process responses and avoiding resource overuse.
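A minimal sketch of such a helper follows, assuming response_wait_time is expressed in milliseconds. The constant PCAP_PACKET_BUFFER_TIMEOUT_MIN_MS is named in the reviewed diff, but its value is not shown there, so the 10 ms used here is a placeholder. The divisor macro and the function name derive_pcap_timeout_ms() are hypothetical names introduced for illustration.

```c
/* Placeholder value: the diff names this constant but does not show it. */
#define PCAP_PACKET_BUFFER_TIMEOUT_MIN_MS 10
/* Hypothetical name for the 1/20 ratio described in the text. */
#define PCAP_PACKET_BUFFER_TIMEOUT_DIVISOR 20

/* Derive the pcap packet buffer timeout (ms) from the response window (ms). */
int derive_pcap_timeout_ms(int response_wait_time_ms)
{
    int pcap_timeout = response_wait_time_ms / PCAP_PACKET_BUFFER_TIMEOUT_DIVISOR;

    /* Clamp to a floor so that short windows never produce a zero timeout. */
    if (pcap_timeout < PCAP_PACKET_BUFFER_TIMEOUT_MIN_MS)
        pcap_timeout = PCAP_PACKET_BUFFER_TIMEOUT_MIN_MS;

    return pcap_timeout;
}
```

With these placeholder values, a 10000 ms response window gives a 500 ms buffer timeout, so pcap_dispatch() would have roughly twenty chances to deliver buffered replies before the alarm fires. The floor matters because the libpcap documentation recommends against a timeout of zero, whose behaviour varies by platform.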
Testing
Tested on a network consisting of a legal DHCP server (with its IP address defined as a legal_server), an intentionally added third-party DHCP server, and a switch connecting both devices. The dhcp-probe tool was run from both the switch and the legal DHCP server; both experiments produced the same results.

The configuration file set response_wait_time to 10 seconds, cycle_time to 30 seconds, legal_server to the IP address of the legal server, and alert_program_name2 to a shell script that writes to a log file observed in real time.

Before this change, replies were visible in pcap stats but alerts were triggered only rarely and intermittently. After this change, replies are delivered to process_response() and the alert program is triggered consistently.
Alternative considered
A more direct solution on newer libpcap versions would be to use immediate mode via pcap_create(), pcap_set_immediate_mode(), and pcap_activate(), so packets are delivered as soon as they arrive.

This patch keeps the change smaller and preserves the existing pcap_open_live() capture setup, while still avoiding the race between libpcap's packet buffer timeout and dhcp_probe's response alarm.