Flask instrumentation: http.server.active_requests gauge leaks on exception (missing try/finally) #4431

@liyaka

Description


The Flask instrumentation's _rewrapped_app function increments active_requests_counter before calling wsgi_app(), but the decrement is not wrapped in try/finally. If an exception propagates out of wsgi_app(), the counter is never decremented, leaving a permanent gauge leak.

This is particularly impactful when using this metric for Kubernetes HPA (Horizontal Pod Autoscaler) — leaked counters cause HPA to see phantom load and refuse to scale down.

Location

instrumentation/opentelemetry-instrumentation-flask/src/opentelemetry/instrumentation/flask/__init__.py

Lines 351–425 in _rewrapped_app:

active_requests_counter.add(1, active_requests_count_attrs)    # L351 - increment
# ...
result = wsgi_app(wrapped_app_environ, _start_response)        # L398 - if this raises...
# ...duration recording (L400-424)...
active_requests_counter.add(-1, active_requests_count_attrs)   # L425 - NEVER REACHED
return result
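The leak can be demonstrated in isolation with a minimal stand-in for the counter (FakeCounter and handle_request_buggy are hypothetical names for this sketch, not the OTel API):

```python
class FakeCounter:
    """Minimal stand-in for an OTel UpDownCounter: tracks a running total."""
    def __init__(self):
        self.value = 0

    def add(self, amount, attrs=None):
        self.value += amount


def handle_request_buggy(counter, wsgi_app):
    counter.add(1)        # increment before the call (L351)
    result = wsgi_app()   # if this raises (L398)...
    counter.add(-1)       # ...this decrement (L425) is never reached
    return result


counter = FakeCounter()

def crashing_app():
    raise RuntimeError("worker crashed")

try:
    handle_request_buggy(counter, crashing_app)
except RuntimeError:
    pass

print(counter.value)  # gauge is stuck at 1: a permanent leak
```

Each crashed request adds another leaked increment, so under repeated failures the gauge ratchets upward for the life of the process.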

Expected behavior

The decrement should always execute, regardless of whether wsgi_app() raises an exception.

The WSGI instrumentation already has the correct pattern

instrumentation/opentelemetry-instrumentation-wsgi/src/opentelemetry/instrumentation/wsgi/__init__.py:

try:
    with trace.use_span(span):
        iterable = self.wsgi(environ, start_response)
        return _end_span_after_iterating(iterable, span, token)
except Exception as ex:
    raise
finally:
    self.active_requests_counter.add(-1, active_requests_count_attrs)  # always runs

Suggested fix

Wrap lines 398–425 in try/finally:

active_requests_counter.add(1, active_requests_count_attrs)
request_route = None
# ...
try:
    result = wsgi_app(wrapped_app_environ, _start_response)
    # ...duration histogram recording...
    return result
finally:
    active_requests_counter.add(-1, active_requests_count_attrs)
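Under the same stand-in counter, the try/finally shape keeps the gauge balanced even when the app raises (a sketch of the pattern, not the actual patch):

```python
class FakeCounter:
    """Minimal stand-in for an OTel UpDownCounter: tracks a running total."""
    def __init__(self):
        self.value = 0

    def add(self, amount, attrs=None):
        self.value += amount


def handle_request_fixed(counter, wsgi_app):
    counter.add(1)
    try:
        return wsgi_app()
    finally:
        counter.add(-1)  # always runs, even when wsgi_app raises


counter = FakeCounter()

def crashing_app():
    raise RuntimeError("worker crashed")

try:
    handle_request_fixed(counter, crashing_app)
except RuntimeError:
    pass

print(counter.value)  # back to 0: no leak
```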

How to reproduce

  1. Run a Flask app with opentelemetry-instrument and gunicorn
  2. Send requests that trigger exceptions propagating past Flask's error handlers (e.g., OOM-killed gunicorn workers, or middleware errors)
  3. Observe http.server.active_requests gauge — it increments but never decrements for crashed requests
  4. The gauge stays permanently elevated until the process is restarted
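One easy way to make a view exception escape wsgi_app() (step 2 above) is Flask's PROPAGATE_EXCEPTIONS config, which disables the conversion of unhandled exceptions into 500 responses; the route name and demo helper below are illustrative only:

```python
from flask import Flask

app = Flask(__name__)
# With PROPAGATE_EXCEPTIONS, unhandled view exceptions propagate out of
# wsgi_app() instead of becoming a 500 response -- the exact path on which
# the active_requests decrement is skipped.
app.config["PROPAGATE_EXCEPTIONS"] = True

@app.route("/boom")
def boom():
    raise RuntimeError("unhandled")

def demo():
    """Return True if the exception escapes the WSGI layer."""
    client = app.test_client()
    try:
        client.get("/boom")
        return False
    except RuntimeError:
        return True

print(demo())
```

With the instrumentation installed, every such request leaves http.server.active_requests one higher than before.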

Impact

  • HPA scaling: Kubernetes HPA using this metric sees phantom active requests → refuses to scale down → wasted resources
  • Monitoring: Dashboards show incorrect active request counts
  • Alerting: False positives on active request alerts

Environment

  • opentelemetry-instrumentation-flask==0.57b0 (also confirmed present on main / 0.62b0)
  • gunicorn with --workers 1 --threads N --worker-class gthread
  • OTEL SDK 1.36.0, OTLP HTTP exporter, delta temporality
