Skip to content

Amazon S3: Mismatch when reading HTTP header from GCS#8791

Open
gouyelliot wants to merge 2 commits into
fluent:masterfrom
gouyelliot:bug/http-header-reading
Open

Amazon S3: Mismatch when reading HTTP header from GCS#8791
gouyelliot wants to merge 2 commits into
fluent:masterfrom
gouyelliot:bug/http-header-reading

Conversation

@gouyelliot
Copy link
Copy Markdown

@gouyelliot gouyelliot commented May 3, 2024

This commit includes the HTTP separator when looking for the header values in the response payload, to avoid mismatching header ending with the same name. See the linked issue for more info and example.

Fixes #8790


Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes
    • Improved HTTP header parsing so Transfer-Encoding, Content-Length, and Connection headers are reliably detected only at line boundaries, preventing false matches and ensuring correct extraction of header values.

Review Change Stack

@vg-meesho
Copy link
Copy Markdown

Hey @PettitWesley, can we get this PR reviewed and merged? Currently, there is no GCS Output Plugin but this fix could help as a workaround to upload to GCS bucket meanwhile.

Copy link
Copy Markdown

@sunhubs sunhubs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why such a bug fix for function is not merged as soon as possible

@vg-meesho
Copy link
Copy Markdown

@PettitWesley @sparrc @singholt @swapneils

Bumping up for review

@vladst3f
Copy link
Copy Markdown

still relevant and needed !

@Daniel-Vaz
Copy link
Copy Markdown

+1 for this

@gouyelliot gouyelliot force-pushed the bug/http-header-reading branch from 4f44e42 to 64b59cb Compare April 20, 2026 00:51
@gouyelliot gouyelliot requested a review from cosmo0920 as a code owner April 20, 2026 00:51
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 20, 2026

Caution

Review failed

The head commit changed during the review from ffff2dc to b739cb6.

📝 Walkthrough

Walkthrough

Fixed HTTP header matching in the HTTP client to constrain header detection to line boundaries by prefixing search patterns with carriage return and line feed (\r\n). This prevents partial header name matches (e.g., x-goog-stored-content-length matching Content-Length) from causing socket read timeouts.

Changes

HTTP Client Header Lookup Fix

Layer / File(s) Summary
Pointer advancement in header parsing
src/flb_http_client.c
header_lookup() now advances the match pointer by header_len immediately on match so EOL scanning and value length computation start at the header value.
Restrict header pattern matching to CRLF-prefixed lines
src/flb_http_client.c
Search patterns for Transfer-Encoding, Content-Length, and Connection were changed to include a leading \r\n, ensuring matches occur only at line boundaries.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A curious bug hopped right into view,
Headers matching when they shouldn't do!
With line breaks added, the fix is clear,
No more mistaken headers to fear!
Content-Length now finds the right place,
At line boundaries, at proper pace! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly relates to the main change: fixing HTTP header mismatch issues when reading from GCS responses, which is precisely what the code modifications address.
Linked Issues check ✅ Passed The code changes fully implement the fix described in #8790 by searching for headers with CRLF prefixes to prevent suffix-matching, resolving the x-goog-stored-content-length false match that caused timeouts.
Out of Scope Changes check ✅ Passed All changes in src/flb_http_client.c are directly scoped to fixing the header lookup function as described in the linked issue; no out-of-scope modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gouyelliot
Copy link
Copy Markdown
Author

I can confirm that this bug is still present in 5.0.3. I rebased my branch on the commit of the release.

@gouyelliot gouyelliot force-pushed the bug/http-header-reading branch from 64b59cb to b739cb6 Compare May 19, 2026 04:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Amazon S3: Mismatch when reading HTTP header from GCS

8 participants