Unsatisfactory behavior of snapshot selector

There are at least three issues with the mechanism for selecting traces for collecting "snapshots":

- In distributed traces, the selection made by the head service is inconsistent with the selections made by the other services. I ran a test in which service A calls service B, with a selection probability of 10%. Out of 4781 traces, service A was selected 480 times and service B 448 times, but both services were selected in only 39 traces. That is close to what two independent 10% decisions would produce (480 × 448 / 4781 ≈ 45), not what a consistent per-trace decision would give.
- Traces that originate from spans of a kind other than SERVER or CONSUMER (for example, spans produced by POJO instrumentation) are never selected, although their downstream calls may still be selected.
- Downstream services are selected with the same algorithm as the TraceIdRatioBased sampler, which can skew metrics if that sampler is also used for sampling.
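
To put the numbers from the first issue in context, here is a small arithmetic sketch. It only uses the counts reported above; the function name `expected_overlap` is made up for illustration. It compares the observed overlap with what fully independent decisions would give and with the best case for a consistent per-trace decision.

```python
def expected_overlap(total: int, selected_a: int, selected_b: int) -> float:
    """Expected number of traces selected by both services,
    assuming the two selection decisions are statistically independent."""
    return selected_a * selected_b / total


# Counts reported in the test run above.
total, selected_a, selected_b, observed_both = 4781, 480, 448, 39

independent = expected_overlap(total, selected_a, selected_b)
# If both services made the same per-trace decision, the overlap could be
# as large as the smaller of the two counts.
consistent_upper_bound = min(selected_a, selected_b)

print(f"observed overlap:              {observed_both}")
print(f"expected if independent:       {independent:.1f}")
print(f"upper bound if consistent:     {consistent_upper_bound}")
```

The observed value sits next to the independent estimate and far from the consistent bound, which is what the first issue describes.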

I believe the selection mechanism needs to be redesigned.
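
One possible reading of the third issue can be illustrated with a simplified model. This is an assumption-laden sketch, not the SDK's or the agent's actual code: `ratio_decision` is a hypothetical stand-in for a trace-ID threshold test in the style of a ratio-based sampler, and both the sampler and the selector are assumed to apply it to the same trace ID.

```python
import random

MASK_63 = (1 << 63) - 1


def ratio_decision(trace_id: int, ratio: float) -> bool:
    """Hypothetical threshold test: accept when the low 63 bits of the
    trace ID fall below ratio * 2**63 (a simplified ratio-based decision)."""
    return (trace_id & MASK_63) < ratio * (1 << 63)


def selected_fraction_among_sampled(
    sample_ratio: float, select_ratio: float, n: int = 100_000, seed: int = 0
) -> float:
    """Fraction of sampled traces that are also selected for snapshots
    when both decisions reuse the same threshold test on the trace ID."""
    rng = random.Random(seed)
    sampled = selected = 0
    for _ in range(n):
        trace_id = rng.getrandbits(128)
        if ratio_decision(trace_id, sample_ratio):
            sampled += 1
            if ratio_decision(trace_id, select_ratio):
                selected += 1
    return selected / sampled


# Sampler keeps 50%, selector targets 10%: about 20% of the kept traces
# are selected, not 10%.
print(selected_fraction_among_sampled(0.5, 0.1))
# Sampler and selector both at 10%: every kept trace is selected.
print(selected_fraction_among_sampled(0.1, 0.1))
```

Under these assumptions the two decisions are fully correlated, so the share of selected traces among the sampled ones is min(select_ratio, sample_ratio) / sample_ratio rather than select_ratio.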


Unsatisfactory behavior of snapshot selector #2689