why does `LogRecord` store the entire context it was called with

When my team memory profiled our instrumented Python applications, we noticed that logs generated during a span held on to more allocated memory than the `Span` itself did. While troubleshooting, we noticed that the constructor for a `LogRecord` defined here

https://github.com/open-telemetry/opentelemetry-python/blob/fb94553d6959b4c793a98dd586ca9790a685c06b/opentelemetry-api/src/opentelemetry/_logs/_internal/__init__.py#L99-L121

explicitly holds on to to the `Context` as an attribute of a `LogRecord` (see line 121), after that context has been used to gather the current span's identifiers/flags. Is there a reason for doing this? We're not entirely sure if this is the exact cause of our memory observations yet, but storing the context like this ties that context's reference to a Span, any Baggage on the context, and any other attributes we've written to the context as well, until the `LogRecord` is successfully exported..

Multiple other sdk implementations don't behave like this, only storing the `SpanContext` extracted from `Context`. 

Go: https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/log/logger.go#L87
JS: https://github.com/open-telemetry/opentelemetry-js/blob/356ddeefb599493837596cb710f2f93e3b279e29/experimental/packages/sdk-logs/src/LogRecordImpl.ts#L87

So it appears that Python is the odd man out here. I get that https://github.com/open-telemetry/opentelemetry-python/issues/4328 had wanted to make baggage accessible from a log record, so that might have been the motivation? But why the difference between impls?

At the same time, I think the spec is unclear about what should be done here. I see that https://github.com/open-telemetry/opentelemetry-specification/blob/v1.39.0/specification/logs/api.md#emit-a-logrecord declares that a LogRecord must define an optional Context argument in its constructor that must be 'associated with the Record'. But that doesn't imply that the context must be stored as an attribute on the record. 

	def __init__(
	self,
	*,
	timestamp: Optional[int] = None,
	observed_timestamp: Optional[int] = None,
	context: Optional[Context] = None,
	trace_id: Optional[int] = None,
	span_id: Optional[int] = None,
	trace_flags: Optional["TraceFlags"] = None,
	severity_text: Optional[str] = None,
	severity_number: Optional[SeverityNumber] = None,
	body: AnyValue = None,
	attributes: Optional[_ExtendedAttributes] = None,
	event_name: Optional[str] = None,
	) -> None:
	if not context:
	context = get_current()
	span_context = get_current_span(context).get_span_context()
	self.timestamp = timestamp
	if observed_timestamp is None:
	observed_timestamp = time_ns()
	self.observed_timestamp = observed_timestamp
	self.context = context

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

why does `LogRecord` store the entire context it was called with #4957

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

why does LogRecord store the entire context it was called with #4957

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

why does `LogRecord` store the entire context it was called with #4957