
[opentelemetry-instrumentation-grpc] Add support for metrics#4621

Open
lorenzoronzani wants to merge 9 commits into open-telemetry:main from lorenzoronzani:feature/add-metrics-to-grpc

Conversation


@lorenzoronzani lorenzoronzani commented May 22, 2026

Description

This PR adds metrics support to the gRPC server and client instrumentation, following the RPC semantic conventions.

Server & Client:

  • Added an optional meter_provider argument that is used to create the duration histogram.
  • Added methods that record metrics inside the interceptors.
  • Added tests covering the common success and failure paths.

I also updated the aio components so that they support the new metrics.
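
As a rough illustration of the recording pattern described above, here is a minimal, library-free sketch. `FakeHistogram` and `call_with_duration` are hypothetical names used only for this example and are not part of this PR or of the OpenTelemetry API; mapping every unclassified failure to `UNKNOWN` is likewise an assumption.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeHistogram:
    """Hypothetical stand-in for an OpenTelemetry histogram instrument."""

    name: str
    points: list = field(default_factory=list)

    def record(self, amount, attributes=None):
        # Store each measurement together with a copy of its attributes.
        self.points.append((amount, dict(attributes or {})))


def call_with_duration(histogram, full_method, handler, clock=time.perf_counter):
    """Run handler() and record its duration whether it returns or raises."""
    start = clock()
    status = "OK"
    try:
        return handler()
    except Exception:
        # Assumption: failures without a more specific code count as UNKNOWN.
        status = "UNKNOWN"
        raise
    finally:
        # The finally block runs on both paths, so one data point is always
        # recorded per call.
        histogram.record(
            clock() - start,
            {
                "rpc.system.name": "grpc",
                "rpc.method": full_method,
                "rpc.response.status_code": status,
            },
        )
```

Recording in `finally` keeps the success and failure paths symmetric; the injectable `clock` parameter exists only to make the sketch easy to test.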

Fixes #3375

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

My environment does not allow me to install all of the required packages, so I ran the tests with the following workaround:

  • Run uv run --no-sync pytest instrumentation/opentelemetry-instrumentation-grpc/tests/ --ignore=instrumentation/opentelemetry-instrumentation-grpc/tests/protobuf/ -q

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated (I need help with this item)

@lorenzoronzani lorenzoronzani marked this pull request as draft May 22, 2026 08:12
@lorenzoronzani lorenzoronzani marked this pull request as ready for review May 22, 2026 09:55
@lorenzoronzani lorenzoronzani requested a review from a team as a code owner May 22, 2026 09:55
@aabmass aabmass requested a review from Copilot May 22, 2026 15:01

Copilot AI left a comment


Pull request overview

Adds OpenTelemetry RPC duration metrics to the opentelemetry-instrumentation-grpc package (sync + asyncio), aligning with the RPC metrics semantic conventions by emitting rpc.client.call.duration and rpc.server.call.duration histograms and validating them via new unit tests.

Changes:

  • Added duration histogram creation and recording to gRPC client/server interceptors (sync + grpc.aio), including metric attributes like rpc.system.name, rpc.method, and rpc.response.status_code.
  • Extended public interceptor factories to accept meter_provider, and plumbed meter/target information into client interceptors for server.address/server.port.
  • Added new metric-focused test suites for sync and asyncio client/server paths, plus a changelog entry.
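
To make the target plumbing mentioned above more concrete, the following is a best-effort sketch of splitting a channel target string into `server.address` and `server.port` values. `parse_target` is a hypothetical helper written for this example, not the function in this PR, and it assumes only a few common target forms (`host:port`, `dns:///host:port`, `ipv4:`/`ipv6:` prefixes, bracketed IPv6, and `unix:` sockets); other resolver forms are not handled.

```python
def parse_target(target):
    """Best-effort split of a gRPC channel target into (address, port)."""
    if not target:
        return None, None
    # Unix domain sockets have no port; keep the whole target as the address.
    if target.startswith(("unix:", "unix-abstract:")):
        return target, None
    # Drop a simple resolver prefix when present (assumed set of prefixes).
    for prefix in ("dns:///", "ipv4:", "ipv6:"):
        if target.startswith(prefix):
            target = target[len(prefix):]
            break
    if target.startswith("["):
        # Bracketed IPv6 literal, for example "[::1]:50051".
        host, _, rest = target[1:].partition("]")
        port = rest[1:] if rest.startswith(":") else ""
    elif target.count(":") == 1:
        host, _, port = target.partition(":")
    else:
        # No port, or an unbracketed IPv6 literal that cannot be split safely.
        host, port = target, ""
    return host or None, int(port) if port.isdigit() else None
```

Returning `None` for a port that cannot be determined lets the caller omit `server.port` instead of reporting a guessed value.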

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_server.py | Records rpc.server.call.duration with semconv attributes. |
| instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_client.py | Records rpc.client.call.duration, parses channel target for server attributes. |
| instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_aio_server.py | Adds duration recording to aio server interceptor paths. |
| instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_aio_client.py | Adds duration recording hooks for aio client interceptor paths. |
| instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/__init__.py | Exposes meter_provider/target plumbing for interceptor factories and client instrumentors. |
| instrumentation/opentelemetry-instrumentation-grpc/tests/test_server_interceptor_metrics.py | New sync server duration metric tests (OK + error + streaming). |
| instrumentation/opentelemetry-instrumentation-grpc/tests/test_client_interceptor_metrics.py | New sync client duration metric tests (OK + error + streaming). |
| instrumentation/opentelemetry-instrumentation-grpc/tests/test_aio_server_interceptor_metrics.py | New aio server duration metric tests. |
| instrumentation/opentelemetry-instrumentation-grpc/tests/test_aio_client_interceptor_metrics.py | New aio client duration metric tests. |
| .changelog/4621.added | Changelog entry for adding gRPC RPC duration metrics. |
Comments suppressed due to low confidence (3)

instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_client.py:239

  • If invoker(...) raises a non-grpc.RpcError exception (e.g., a local serialization / interceptor error before the RPC completes), status_code remains StatusCode.OK, so the metric will incorrectly report an OK response. Consider mapping non-grpc.RpcError failures to grpc.StatusCode.UNKNOWN (or skip recording) so rpc.response.status_code/error.type reflect failure.
            except Exception as exc:
                if isinstance(exc, grpc.RpcError):
                    status_code = exc.code()
                    span.set_attribute(
                        RPC_GRPC_STATUS_CODE,
                        status_code.value[0],
                    )
                span.set_status(
                    Status(
                        status_code=StatusCode.ERROR,
                        description=f"{type(exc).__name__}: {exc}",
                    )
                )
                span.record_exception(exc)
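
One possible shape for the mapping suggested in this comment is sketched below. It deliberately avoids importing grpc: `StatusCode` and `FakeRpcError` are hypothetical stand-ins for `grpc.StatusCode` and `grpc.RpcError`, and the helper names as well as the choice of `error.type` values are assumptions made for illustration, not the PR's implementation.

```python
import enum


class StatusCode(enum.Enum):
    """Hypothetical stand-in for grpc.StatusCode (only the members used here)."""

    OK = "OK"
    UNAVAILABLE = "UNAVAILABLE"
    UNKNOWN = "UNKNOWN"


class FakeRpcError(Exception):
    """Hypothetical stand-in for grpc.RpcError exposing code()."""

    def __init__(self, code):
        super().__init__(code.name)
        self._code = code

    def code(self):
        return self._code


def status_for_exception(exc):
    """Pick the status code to report on the metric for a finished call."""
    if exc is None:
        return StatusCode.OK
    if isinstance(exc, FakeRpcError):
        return exc.code()
    # Local failures (serialization, interceptor bugs) never got a gRPC
    # status, so report UNKNOWN instead of leaving the default OK in place.
    return StatusCode.UNKNOWN


def metric_attributes(full_method, exc=None):
    """Build metric attributes; error.type is added only on failure."""
    status = status_for_exception(exc)
    attributes = {
        "rpc.method": full_method,
        "rpc.response.status_code": status.name,
    }
    if exc is not None:
        # Assumption: use the status name for RPC errors and the exception
        # class name for everything else.
        is_rpc_error = isinstance(exc, FakeRpcError)
        attributes["error.type"] = status.name if is_rpc_error else type(exc).__qualname__
    return attributes
```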

instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_server.py:402

  • Same as unary path: on an uncaught exception in a streaming handler, the wrapped context’s status code will remain OK, but gRPC will return UNKNOWN. This will cause rpc.server.call.duration to be tagged with rpc.response.status_code=OK. Consider ensuring a non-OK status (e.g., UNKNOWN) is recorded when exceptions escape the handler.
                    self._record_duration(
                        handler_call_details,
                        start_time,
                        context._code,
                    )

instrumentation/opentelemetry-instrumentation-grpc/src/opentelemetry/instrumentation/grpc/_aio_server.py:160

  • In the async streaming server interceptor, uncaught exceptions will leave context._self_code as OK, so the duration metric will be tagged with rpc.response.status_code=OK even though gRPC returns UNKNOWN. Consider updating the wrapped context code on exception before recording metrics.
                    except Exception as error:
                        # pylint:disable=unidiomatic-typecheck
                        if type(error) != Exception:  # noqa: E721
                            span.record_exception(error)
                        raise error

                    finally:
                        self._record_duration(
                            handler_call_details,
                            start_time,
                            context._self_code,
                        )
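
The suggestion above could be sketched roughly as follows, using only asyncio. `FakeContext` and `record_stream` are hypothetical names for this example; the real wrapped context in this PR keeps its status in `context._self_code`, and the sketch assumes a plain string code for brevity.

```python
import asyncio


class FakeContext:
    """Hypothetical stand-in for the wrapped servicer context."""

    def __init__(self):
        self.code = "OK"


async def record_stream(handler, context, record):
    """Yield responses from an async streaming handler and record the status once."""
    try:
        async for response in handler(context):
            yield response
    except Exception:
        # An exception escaped the handler. gRPC would report UNKNOWN unless
        # the handler already set a non-OK code (for example through abort),
        # so update the code before the metric is recorded.
        if context.code == "OK":
            context.code = "UNKNOWN"
        raise
    finally:
        record(context.code)
```

Updating the code in the `except` block, before `finally` runs, means the recorded attribute matches what the client observes.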

Comment on lines 242 to +246
if result is None:
span.end()
self._record_duration(
client_info.full_method, start_time, status_code
)
Comment on lines +362 to +366
    self._record_duration(
        handler_call_details,
        start_time,
        context._code,
    )
Comment on lines 113 to +127
    except Exception as error:
        # Bare exceptions are likely to be gRPC aborts, which
        # we handle in our context wrapper.
        # Here, we're interested in uncaught exceptions.
        # pylint:disable=unidiomatic-typecheck
        if type(error) != Exception:  # noqa: E721
            span.record_exception(error)
        raise error

    finally:
        self._record_duration(
            handler_call_details,
            start_time,
            context._self_code,
        )
Comment on lines 474 to 486
    def wrapper_fn(self, original_func, instance, args, kwargs):
        channel = original_func(*args, **kwargs)
        tracer_provider = kwargs.get("tracer_provider")
        request_hook = self._request_hook
        response_hook = self._response_hook
        target = args[0] if args else None
        return intercept_channel(
            channel,
            client_interceptor(
                tracer_provider=tracer_provider,
                tracer_provider=self._tracer_provider,
                filter_=self._filter,
                request_hook=request_hook,
                response_hook=response_hook,
                request_hook=self._request_hook,
                response_hook=self._response_hook,
                meter_provider=self._meter_provider,
                target=target,
            ),
Comment on lines 543 to 556
    def insecure(*args, **kwargs):
        kwargs = self._add_interceptors(tracer_provider, kwargs)

        target = args[0] if args else None
        kwargs = self._add_interceptors(
            tracer_provider, meter_provider, target, kwargs
        )
        return self._original_insecure(*args, **kwargs)

    def secure(*args, **kwargs):
        kwargs = self._add_interceptors(tracer_provider, kwargs)

        target = args[0] if args else None
        kwargs = self._add_interceptors(
            tracer_provider, meter_provider, target, kwargs
        )
        return self._original_secure(*args, **kwargs)

Comment on lines +638 to +642
    meter = get_meter(
        __name__,
        __version__,
        meter_provider,
    )
Comment on lines +48 to +56
    def test_unary_call_records_duration_metric(self):
        """A unary client RPC produces an rpc.client.call.duration histogram."""
        simple_method(self._stub)

        metrics = self.get_sorted_metrics()
        duration_metric = next(
            (m for m in metrics if m.name == RPC_CLIENT_CALL_DURATION),
            None,
        )
@lorenzoronzani
Author

Just an update:

I am working through the review comments. Unfortunately, I didn't have time to do that earlier.
