Passing http request headers in the event metadata for http source#6671

Merged: dlvenable merged 1 commit into opensearch-project:main from divbok:main on Apr 8, 2026

Conversation

@divbok divbok commented Mar 24, 2026

Collaborator

Description

Adds a new metadata_headers configuration option to the HTTP source plugin that allows users to extract HTTP request headers and attach them as event metadata. This enables downstream processors and sinks to route, filter, or enrich events based on HTTP headers.

How it works

  • Whitelist mode: Configure specific header names to extract:
    metadata_headers: ["X-Tenant-Id", "X-Region"]
  • Disabled (default): When metadata_headers is not set or null, no headers are extracted.

Header key normalization

  • All header keys are lowercased in event metadata regardless of how they were sent by the client or configured in the YAML. This is because Armeria (the underlying HTTP server) normalizes all header names to lowercase per the HTTP/2 specification. For example, a header sent as X-Tenant-Id is stored in metadata as x-tenant-id.

Sensitive header filtering

A hardcoded blocklist of sensitive headers is always filtered out, regardless of configuration. This includes authorization, cookie, x-api-key, x-amz-security-token, x-amz-credential, and others.

Multi-value headers

Headers with multiple values (e.g., X-Forwarded-For appearing twice) are stored as a List. Single-value headers are stored as a plain String.

Current limitations

This feature is currently only supported for the in-memory buffer (BlockingBuffer). The byte buffer / Kafka buffer path does not propagate metadata headers yet.
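
The normalization and multi-value rules above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the class name HeaderMetadataSketch is hypothetical, the request headers are modeled as a plain Map rather than an Armeria request so the example has no dependencies, and sensitive header filtering is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the extractor; not the class added in this PR.
final class HeaderMetadataSketch {

    // Returns the configured headers found on the request, keyed by lowercase name.
    static Map<String, Object> extract(final Collection<String> configuredNames,
                                       final Map<String, List<String>> requestHeaders) {
        // Index the request by lowercase name so lookups ignore the client's casing.
        final Map<String, List<String>> byLowerName = new LinkedHashMap<>();
        requestHeaders.forEach((name, values) ->
                byLowerName.computeIfAbsent(name.toLowerCase(Locale.ROOT), k -> new ArrayList<>())
                        .addAll(values));

        final Map<String, Object> metadata = new LinkedHashMap<>();
        for (final String configured : configuredNames) {
            final String key = configured.toLowerCase(Locale.ROOT);
            final List<String> values = byLowerName.get(key);
            if (values == null || values.isEmpty()) {
                continue; // header not present on this request
            }
            // A single value is stored as a String, several values as a List.
            metadata.put(key, values.size() == 1 ? values.get(0) : List.copyOf(values));
        }
        return metadata;
    }

    public static void main(final String[] args) {
        final Map<String, List<String>> request = Map.of(
                "X-Tenant-Id", List.of("team-alpha"),
                "X-Forwarded-For", List.of("10.0.0.1", "10.0.0.2"),
                "Accept", List.of("*/*"));
        System.out.println(extract(List.of("X-Tenant-Id", "X-Forwarded-For"), request));
    }
}
```

Headers that are not in the configured list (Accept in the example) never reach the metadata, which matches the whitelist behavior described under "How it works".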

Example pipeline

log-pipeline:
  source:
    http:
      port: 2021
      path: "/log/ingest"
      metadata_headers: ["X-Tenant-Id", "X-Region"]
  route:
    - tenant_alpha: 'getMetadata("headers/x-tenant-id") == "team-alpha"'
    - us_west: 'getMetadata("headers/x-region") == "us-west-2"'
  sink:
    - file:
        path: "/tmp/alpha.log"
        routes: ["tenant_alpha"]
    - file:
        path: "/tmp/us_west.log"
        routes: ["us_west"]

curl -X POST http://localhost:2021/log/ingest \
  -H "Content-Type: application/json" \
  -H "X-Tenant-Id: team-alpha" \
  -H "X-Region: us-west-2" \
  -d '[{"message": "hello"}]'

Issues Resolved

Resolves #6239

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@divbok divbok changed the title from "Passing http request headers as metadata in the event for http source" to "Passing http request headers in the event metadata for http source" on Mar 24, 2026

@dlvenable dlvenable left a comment

Member

Thanks @divbok for this contribution!

return Collections.emptyMap();
}

final boolean includeAll = metadataHeaders.size() == 1 && "*".equals(metadataHeaders.get(0));

Member

I'm not sure about supporting * without supporting regex in general. I think if we wanted this, we should add metadata_headers_regex instead. This would be consistent with other processors that support literals and regex.

Collaborator Author

I will add the regex config in a follow up PR
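
For context, a rough idea of what matching literals alongside a regex option could look like. This is not part of this PR: the metadata_headers_regex name comes only from the review suggestion above, and the class HeaderNameMatcherSketch and its behavior are assumptions for illustration.

```java
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical matcher combining literal header names with regex patterns.
final class HeaderNameMatcherSketch {
    private final Set<String> literalNames;
    private final List<Pattern> patterns;

    HeaderNameMatcherSketch(final Collection<String> literals, final Collection<String> regexes) {
        // Literals are lowercased once, since header names arrive lowercased.
        this.literalNames = literals.stream()
                .map(name -> name.toLowerCase(Locale.ROOT))
                .collect(Collectors.toUnmodifiableSet());
        // Patterns are compiled once at construction, not per request.
        this.patterns = regexes.stream()
                .map(regex -> Pattern.compile(regex, Pattern.CASE_INSENSITIVE))
                .collect(Collectors.toUnmodifiableList());
    }

    // True when the name equals a literal or fully matches any pattern.
    boolean matches(final String headerName) {
        final String lower = headerName.toLowerCase(Locale.ROOT);
        return literalNames.contains(lower)
                || patterns.stream().anyMatch(p -> p.matcher(lower).matches());
    }
}
```

With this shape, the "*" shortcut would not need special handling: a user who wants every header could configure the pattern ".*" explicitly.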

final Map<String, Object> headers = new HashMap<>();
for (String headerName : headerNames) {
    if (isSensitiveHeader(headerName)) {
        LOG.warn("Skipping sensitive header '{}' from metadata_headers configuration", headerName);

Member

I think we don't need any logging here. Maybe a TRACE if you want.

return headers;
}

static boolean isSensitiveHeader(final String headerName) {

Member

Put this in its own class and test it there.
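
A minimal sketch of what a standalone class for this check might look like. The class name SensitiveHeadersSketch is hypothetical, and the set contains only the five names listed in the PR description; the actual blocklist is said to include others that are not shown here.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical holder for the sensitive-header check, separated out so it
// can be unit tested without constructing an HTTP request.
final class SensitiveHeadersSketch {
    // Only the names given in the PR description; the real list is longer.
    private static final Set<String> BLOCKED = Set.of(
            "authorization",
            "cookie",
            "x-api-key",
            "x-amz-security-token",
            "x-amz-credential");

    private SensitiveHeadersSketch() {
        // static utility, not meant to be instantiated
    }

    // Case-insensitive membership test; a null name is treated as not sensitive.
    static boolean isSensitive(final String headerName) {
        return headerName != null && BLOCKED.contains(headerName.toLowerCase(Locale.ROOT));
    }
}
```

Keeping the check in its own class means its tests only need strings as input, for example asserting that "Authorization" is rejected and "X-Tenant-Id" is allowed.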

public void processRequestAttachesHeadersToEventMetadata() throws Exception {
    final String tenantId = UUID.randomUUID().toString();
    final String region = UUID.randomUUID().toString();
    BlockingBuffer<Record<Log>> blockingBuffer = new BlockingBuffer<>(TEST_BUFFER_CAPACITY, 8, "test-pipeline");

Member

Use a mocked Buffer instead. Then you don't need to use read. Verify that it is called and use an ArgumentCaptor.

return HttpResponse.of(HttpStatus.OK);
}

private Map<String, Object> extractHeaders(final AggregatedHttpRequest aggregatedHttpRequest) {

Member

I tend to think that this should go into its own class. You can unit test this more easily. Then verify an interaction with the mock in the unit tests for LogHTTPService.

return new Record<>(log);
.getThis();
if (!headerAttributes.isEmpty()) {
builder.withEventMetadataAttributes(headerAttributes);

Member

This approach will put all the headers as metadata in the root of the metadata. I wonder if it would make more sense to have these under headers in the metadata. Do we support nested metadata? If not, then I guess this is the approach we should take.
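
To make the two layouts concrete, here is a small sketch of nesting the extracted headers under a single headers key and resolving a path such as headers/x-tenant-id, the form used in the example pipeline's route conditions. The class NestedMetadataSketch and its lookup helper are hypothetical and only model the idea with plain maps; they are not Data Prepper's metadata implementation.

```java
import java.util.Map;

// Hypothetical illustration of root-level versus nested header metadata.
final class NestedMetadataSketch {

    // Wraps the extracted headers under one "headers" attribute instead of
    // spreading them across the root of the metadata attributes.
    static Map<String, Object> nestUnderHeaders(final Map<String, Object> extractedHeaders) {
        return Map.of("headers", Map.copyOf(extractedHeaders));
    }

    // Resolves a slash-separated path; returns null when any segment is missing.
    static Object lookup(final Map<String, Object> attributes, final String path) {
        Object current = attributes;
        for (final String segment : path.split("/")) {
            if (!(current instanceof Map)) {
                return null; // path goes deeper than the data does
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }

    public static void main(final String[] args) {
        final Map<String, Object> attributes = nestUnderHeaders(Map.of("x-tenant-id", "team-alpha"));
        System.out.println(lookup(attributes, "headers/x-tenant-id"));
    }
}
```

Nesting keeps header names from colliding with other metadata attributes that might already live at the root.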

github-actions Bot commented Mar 26, 2026

✅ License Header Check Passed

All newly added files have proper license headers. Great work! 🎉

@divbok divbok force-pushed the main branch 2 times, most recently from 79c50d1 to 788c85b on March 26, 2026 16:18

private PluginModel codec;

@JsonProperty("metadata_headers")
private List<String> metadataHeaders;

@oeyh oeyh Apr 2, 2026

Collaborator

Suggest using an empty list as the default. Otherwise it is null by default and needs a null check wherever it is used.

Suggested change
- private List<String> metadataHeaders;
+ private List<String> metadataHeaders = Collections.emptyList();


HttpResponse processRequest(final AggregatedHttpRequest aggregatedHttpRequest) throws Exception {
    final HttpData content = aggregatedHttpRequest.content();
    final Map<String, Object> extractedHeaders = httpHeaderExtractor.extractHeaders(aggregatedHttpRequest);

Collaborator

All records in a batch share the same extractedHeaders map instance in their metadata. If a downstream processor mutates the map, the change affects every other record in the batch, which could be surprising. We can wrap extractedHeaders in Collections.unmodifiableMap() to prevent accidental in-place mutation.
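
A small sketch of the effect of that wrapping, using only the JDK. The class name SharedHeadersSketch and the sample values are made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demonstration of sharing one read-only header map across a batch.
final class SharedHeadersSketch {

    // Returns a read-only view that can be attached to every record in a batch.
    static Map<String, Object> shareable(final Map<String, Object> extractedHeaders) {
        return Collections.unmodifiableMap(extractedHeaders);
    }

    public static void main(final String[] args) {
        final Map<String, Object> extracted = new HashMap<>();
        extracted.put("x-tenant-id", "team-alpha");

        final Map<String, Object> shared = shareable(extracted);
        try {
            // Simulates a downstream processor trying to edit the shared map.
            shared.put("x-tenant-id", "team-beta");
        } catch (final UnsupportedOperationException e) {
            System.out.println("mutation rejected; value is still " + shared.get("x-tenant-id"));
        }
    }
}
```

Note that Collections.unmodifiableMap() returns a view and protects only the map itself. Multi-value headers stored as a List would need to be immutable too (for example via List.copyOf) to be fully safe to share.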

assertEquals(HTTPSourceConfig.DEFAULT_LOG_INGEST_URI, sourceConfig.getPath());
assertEquals(HTTPSourceConfig.DEFAULT_PORT, sourceConfig.getDefaultPort());
assertEquals(HTTPSourceConfig.DEFAULT_LOG_INGEST_URI, sourceConfig.getDefaultPath());
assertNull(sourceConfig.getMetadataHeaders());

Member

I believe this is causing the failure:

HTTPSourceConfigTest > testDefault() FAILED
    org.opentest4j.AssertionFailedError: expected: <null> but was: <[]>

This is now an empty collection by default which is good.


return new Record<>(log);
.getThis();
if (!headerAttributes.isEmpty()) {

Member

How can we consolidate this block with line 138? We seem to have a split in how we create records that could result in discrepancies.

import java.util.stream.Collectors;


/*

Member

Did you mean to remove this?

Collaborator Author

Added it back.

        final Buffer<Record<Log>> buffer,
        final PluginMetrics pluginMetrics,
        final InputCodec codec) {
    this(bufferWriteTimeoutInMillis, buffer, pluginMetrics, codec, new HttpHeaderExtractor(null));

Member

You should use an empty collection for consistency. new HttpHeaderExtractor() or new HttpHeaderExtractor(Collections.emptySet())

private final List<String> metadataHeaders;

public HttpHeaderExtractor(final List<String> metadataHeaders) {
    this.metadataHeaders = metadataHeaders;

Member

Require this to be non-null.

Collaborator Author

Done

}

public Map<String, Object> extractHeaders(final AggregatedHttpRequest aggregatedHttpRequest) {
    if (metadataHeaders == null || metadataHeaders.isEmpty()) {

Member

Remove metadataHeaders == null after other changes I suggested.


private final List<String> metadataHeaders;

public HttpHeaderExtractor(final List<String> metadataHeaders) {

Member

This can be Collection<String>. You don't require ordering.
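
Taken together, the constructor suggestions in this thread (accept a Collection, require non-null, drop the later null check) might look roughly like the sketch below. The class name ExtractorConstructorSketch and the isEnabled helper are hypothetical, and copying into a lowercase Set is an assumption of this sketch rather than something the PR states.

```java
import java.util.Collection;
import java.util.Locale;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical constructor shape; not the HttpHeaderExtractor merged in this PR.
final class ExtractorConstructorSketch {
    private final Set<String> metadataHeaders;

    // Accepts any Collection, since ordering of the configured names is irrelevant.
    ExtractorConstructorSketch(final Collection<String> metadataHeaders) {
        // Fail fast on null so no later method needs its own null check.
        Objects.requireNonNull(metadataHeaders, "metadataHeaders must not be null");
        this.metadataHeaders = metadataHeaders.stream()
                .map(name -> name.toLowerCase(Locale.ROOT))
                .collect(Collectors.toUnmodifiableSet());
    }

    // An empty collection simply means the feature is disabled.
    boolean isEnabled() {
        return !metadataHeaders.isEmpty();
    }
}
```

Callers that previously passed null would pass an empty collection instead, which keeps the "disabled" case on the same code path as every other configuration.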

Signed-off-by: Divyansh Bokadia <dbokadia@amazon.com>

@dlvenable dlvenable left a comment

Member

Thank you for the contribution @divbok !

@dlvenable dlvenable merged commit 1ac64aa into opensearch-project:main Apr 8, 2026
71 of 72 checks passed

Development

Successfully merging this pull request may close these issues.

Support ingesting http headers with http source

3 participants