Commit 1bbb508

xrmx and herin049 authored
urllib3: add support for capturing client headers (open-telemetry#4050)
* urllib3: add support for capture of client headers

* Add tests

* Add changelog

* Please pylint

* Apply suggestions from code review

Co-authored-by: Lukas Hering <40302054+herin049@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Lukas Hering <40302054+herin049@users.noreply.github.com>

* Fix formatting

* Update the docstring

---------

Co-authored-by: Lukas Hering <40302054+herin049@users.noreply.github.com>
1 parent 8ab7c50 commit 1bbb508

3 files changed

Lines changed: 487 additions & 0 deletions

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -31,6 +31,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   ([#3959](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/3959))
 - `opentelemetry-instrumentation-httpx`: add ability to capture custom headers
   ([#4047](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4047))
+- `opentelemetry-instrumentation-urllib3`: add ability to capture custom headers
+  ([#4050](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4050))
 - `opentelemetry-instrumentation-urllib`: add ability to capture custom headers
   ([#4051](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/4051))

instrumentation/opentelemetry-instrumentation-urllib3/src/opentelemetry/instrumentation/urllib3/__init__.py

Lines changed: 143 additions & 0 deletions
@@ -88,6 +88,97 @@ def response_hook(
 
 will exclude requests such as ``https://site/client/123/info`` and ``https://site/xyz/healthcheck``.
 
+Capture HTTP request and response headers
+*****************************************
+You can configure the agent to capture specified HTTP headers as span attributes, according to the
+`semantic conventions <https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-client-span>`_.
+
+Request headers
+***************
+To capture HTTP request headers as span attributes, set the environment variable
+``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST`` to a comma-delimited list of HTTP header names.
+
+For example, using the environment variable,
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST="content-type,custom_request_header"
+
+will extract ``content-type`` and ``custom_request_header`` from the request headers and add them as span attributes.
+
+Request header names in urllib3 are case-insensitive. So, giving the header name as ``CUStom-Header`` in the environment
+variable will capture the header named ``custom-header``.
+
+Regular expressions may also be used to match multiple headers that correspond to the given pattern. For example:
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST="Accept.*,X-.*"
+
+This would match all request headers that start with ``Accept`` or ``X-``.
+
+To capture all request headers, set ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST`` to ``".*"``.
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST=".*"
+
+The name of the added span attribute will follow the format ``http.request.header.<header_name>`` where ``<header_name>``
+is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
+single-item list containing all the header values.
+
+For example:
+``http.request.header.custom_request_header = ["<value1>", "<value2>"]``
+
+Response headers
+****************
+To capture HTTP response headers as span attributes, set the environment variable
+``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE`` to a comma-delimited list of HTTP header names.
+
+For example, using the environment variable,
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE="content-type,custom_response_header"
+
+will extract ``content-type`` and ``custom_response_header`` from the response headers and add them as span attributes.
+
+Response header names in urllib3 are case-insensitive. So, giving the header name as ``CUStom-Header`` in the environment
+variable will capture the header named ``custom-header``.
+
+Regular expressions may also be used to match multiple headers that correspond to the given pattern. For example:
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE="Content.*,X-.*"
+
+This would match all response headers that start with ``Content`` or ``X-``.
+
+To capture all response headers, set ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE`` to ``".*"``.
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE=".*"
+
+The name of the added span attribute will follow the format ``http.response.header.<header_name>`` where ``<header_name>``
+is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a
+list containing the header values.
+
+For example:
+``http.response.header.custom_response_header = ["<value1>", "<value2>"]``
+
+Sanitizing headers
+******************
+In order to prevent storing sensitive data such as personally identifiable information (PII), session keys, passwords,
+etc., set the environment variable ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS``
+to a comma-delimited list of HTTP header names to be sanitized.
+
+Regexes may be used, and all header names will be matched in a case-insensitive manner.
+
+For example, using the environment variable,
+::
+
+    export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS=".*session.*,set-cookie"
+
+will replace the value of headers such as ``session-id`` and ``set-cookie`` with ``[REDACTED]`` in the span.
+
+Note:
+    The environment variable names used to capture HTTP headers are still experimental, and thus are subject to change.
+
 API
 ---
 """
@@ -142,8 +233,15 @@ def response_hook(
 )
 from opentelemetry.trace import Span, SpanKind, Tracer, get_tracer
 from opentelemetry.util.http import (
+    OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST,
+    OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE,
+    OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS,
     ExcludeList,
+    get_custom_header_attributes,
+    get_custom_headers,
     get_excluded_urls,
+    normalise_request_header_name,
+    normalise_response_header_name,
     parse_excluded_urls,
     sanitize_method,
 )
@@ -212,6 +310,9 @@ def _instrument(self, **kwargs):
                     to adding it as a span attribute.
                 ``excluded_urls``: A string containing a comma-delimited
                     list of regexes used to exclude URLs from tracking
+                ``captured_request_headers``: An optional sequence of header names to capture from the request headers
+                ``captured_response_headers``: An optional sequence of header names to capture from the response headers
+                ``sensitive_headers``: An optional sequence of captured header names to redact
         """
         # initialize semantic conventions opt-in if needed
         _OpenTelemetrySemanticConventionStability._initialize()
@@ -278,6 +379,7 @@ def _instrument(self, **kwargs):
             response_size_histogram_new = (
                 create_http_client_response_body_size(meter)
             )
+
         _instrument(
             tracer,
             duration_histogram_old,
@@ -295,6 +397,24 @@ def _instrument(self, **kwargs):
                 else parse_excluded_urls(excluded_urls)
             ),
             sem_conv_opt_in_mode=sem_conv_opt_in_mode,
+            captured_request_headers=kwargs.get(
+                "captured_request_headers",
+                get_custom_headers(
+                    OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST
+                ),
+            ),
+            captured_response_headers=kwargs.get(
+                "captured_response_headers",
+                get_custom_headers(
+                    OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_RESPONSE
+                ),
+            ),
+            sensitive_headers=kwargs.get(
+                "sensitive_headers",
+                get_custom_headers(
+                    OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS
+                ),
+            ),
         )
 
     def _uninstrument(self, **kwargs):
@@ -321,7 +441,11 @@ def _instrument(
     url_filter: _UrlFilterT = None,
     excluded_urls: ExcludeList = None,
     sem_conv_opt_in_mode: _StabilityMode = _StabilityMode.DEFAULT,
+    captured_request_headers: typing.Optional[list[str]] = None,
+    captured_response_headers: typing.Optional[list[str]] = None,
+    sensitive_headers: typing.Optional[list[str]] = None,
 ):
+    # pylint: disable=too-many-locals
     def instrumented_urlopen(wrapped, instance, args, kwargs):
         if not is_http_instrumentation_enabled():
             return wrapped(*args, **kwargs)
@@ -345,6 +469,15 @@ def instrumented_urlopen(wrapped, instance, args, kwargs):
         )
         _set_http_url(span_attributes, url, sem_conv_opt_in_mode)
 
+        span_attributes.update(
+            get_custom_header_attributes(
+                headers,
+                captured_request_headers,
+                sensitive_headers,
+                normalise_request_header_name,
+            )
+        )
+
         with (
             tracer.start_as_current_span(
                 span_name, kind=SpanKind.CLIENT, attributes=span_attributes
@@ -402,6 +535,16 @@ def instrumented_urlopen(wrapped, instance, args, kwargs):
                 sem_conv_opt_in_mode,
             )
 
+            if span.is_recording():
+                span.set_attributes(
+                    get_custom_header_attributes(
+                        response.headers,
+                        captured_response_headers,
+                        sensitive_headers,
+                        normalise_response_header_name,
+                    )
+                )
+
             return response
 
     wrapt.wrap_function_wrapper(

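The capture rules described in the new docstring can be sketched in plain Python: a case-insensitive regex match on the header name, a normalized attribute key (lowercase, with ``-`` replaced by ``_``), and redaction when the name matches a sanitize pattern. This is a minimal illustration, not the code in ``opentelemetry.util.http``. The function ``capture_headers`` and its parameters are hypothetical names, and wrapping each value in a one-element list is an assumption taken from the docstring's "single item list" wording.

```python
import re
from typing import Mapping


def capture_headers(
    headers: Mapping[str, str],
    patterns: list[str],
    sensitive: list[str],
    prefix: str = "http.request.header.",
) -> dict[str, list[str]]:
    """Hypothetical stand-in for the header-capture step (not library code)."""
    if not patterns:
        return {}
    # Every configured entry is treated as a regex that must match the whole
    # header name, ignoring case.
    capture_re = re.compile("|".join(patterns), re.IGNORECASE)
    redact_re = (
        re.compile("|".join(sensitive), re.IGNORECASE) if sensitive else None
    )
    attributes: dict[str, list[str]] = {}
    for name, value in headers.items():
        if not capture_re.fullmatch(name):
            continue
        # Normalize the name: lowercase, with "-" replaced by "_".
        key = prefix + name.lower().replace("-", "_")
        if redact_re is not None and redact_re.fullmatch(name):
            value = "[REDACTED]"
        # Assumption: one string per header, wrapped in a one-element list.
        attributes[key] = [value]
    return attributes


attrs = capture_headers(
    {
        "Content-Type": "application/json",
        "X-Session-Id": "abc123",
        "Host": "example.com",
    },
    patterns=["content-type", "X-.*"],
    sensitive=[".*session.*"],
)
```

Here ``Host`` is dropped because no pattern matches it, ``Content-Type`` is stored under ``http.request.header.content_type``, and ``X-Session-Id`` is captured with its value replaced by ``[REDACTED]``.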

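The ``kwargs.get("captured_request_headers", get_custom_headers(...))`` pattern in the diff means that a keyword passed to ``instrument()`` takes precedence over the environment variable whenever the key is present at all. The sketch below reproduces only that precedence rule. ``headers_from_env`` and ``resolve_captured_headers`` are hypothetical helpers written for this illustration, and the comma splitting and whitespace stripping are assumptions about how the variable is parsed.

```python
import os
from typing import Optional

ENV_REQUEST = "OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_CLIENT_REQUEST"


def headers_from_env(env_var: str) -> list[str]:
    """Hypothetical helper: split a comma-delimited variable into entries."""
    raw = os.environ.get(env_var, "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def resolve_captured_headers(
    kwargs: dict, key: str, env_var: str
) -> Optional[list[str]]:
    # Same shape as the diff: the environment value is only the default, so
    # the keyword wins whenever the key is present, even if it is None.
    return kwargs.get(key, headers_from_env(env_var))


os.environ[ENV_REQUEST] = "content-type, X-.*"

from_env = resolve_captured_headers({}, "captured_request_headers", ENV_REQUEST)
from_kwarg = resolve_captured_headers(
    {"captured_request_headers": ["accept"]},
    "captured_request_headers",
    ENV_REQUEST,
)
explicit_none = resolve_captured_headers(
    {"captured_request_headers": None},
    "captured_request_headers",
    ENV_REQUEST,
)
```

One consequence of this shape is that passing ``None`` explicitly does not fall back to the environment variable; only omitting the keyword does.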