Introducing Otel apm service map processor #6409

Closed
san81 wants to merge 35 commits into opensearch-project:main from san81:otel-apm-service-map-processor
@san81 san81 commented Jan 15, 2026

Collaborator

Description

Introduces a new OpenTelemetry (OTel) Application Performance Monitoring (APM) service map processor:

  • New otel-apm-service-map-processor plugin with processor, models, and utilities
  • OpenSearch index templates for APM service map
  • New utility classes in otel-trace-raw-processor
  • Comprehensive test coverage

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

neeraj-nkm and others added 30 commits December 2, 2025 12:33
Implements time-series-based APM service map generation from OpenTelemetry traces, using a three-window sliding architecture with off-heap storage for scalability. Generates service relationship events and performance metrics.
…e instead of an instant milliseconds type

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Implements time-series-based APM service map generation from OpenTelemetry traces, using a three-window sliding architecture with off-heap storage for scalability. Generates service relationship events and performance metrics.
…e instead of an instant milliseconds type

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
…in workerPartition (opensearch-project#6270)

Signed-off-by: Vecheka Chhourn <vecheka@amazon.com>
…ect#6408)

- Add read_timeout field to ClientOptions with default 60s
- Configure NettyNioAsyncHttpClient with read timeout
- Update README with client configuration examples
- Enables configurable read timeout for Lambda function calls

Signed-off-by: Aiswarya Sadananda Rao <aiswarao@amazon.com>
Co-authored-by: Aiswarya Sadananda Rao <aiswarao@amazon.com>
…search-project#6411)

Support building Docker multi-architecture images and releasing these in the GitHub Actions release project. Continues to build the local architecture with the existing docker release task. Resolves opensearch-project#6405, opensearch-project#6410.

Also stops using the Palantir Docker plugin and uses Docker buildx directly. Resolves opensearch-project#5313.

Signed-off-by: David Venable <dlv@amazon.com>
…Config (opensearch-project#6419)

Signed-off-by: Kennedy Onyia <kennedy.onyia@gmail.com>
…search-project#6420)

Fixes a bug with the invalidEventHandles counter in the PipelineRunner. This metric was being incremented for any event that is not a default event (i.e., for aggregate events), even when there was no need to discard the event. With this change, the counter is incremented only when aggregate events should be released but are not. We probably need some deeper investigation into how to properly release aggregate events, but for now this metric will be more accurate.

Also improves some code to reduce unnecessary variables, use final modifiers, and improve legibility.

Signed-off-by: David Venable <dlv@amazon.com>
…365 (opensearch-project#6401)

Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
…es when attaching events to the aggregate group. (opensearch-project#6431)

There is a possible synchronization issue in the aggregate processor. It currently calls attachToEventAcknowledgementSet on the aggregate group outside of any locks. It is possible that thread 1 gets this group and thread 2 then closes the group. If thread 1 then attaches the event to that group, thread 2 may still reset it. The solution is to move attachToEventAcknowledgementSet into the locks.
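
The fix described above can be illustrated with a minimal sketch. This is not the Data Prepper code: `SketchAggregateGroup`, `SketchAggregateProcessor`, `attach`, and `concludeAndReset` are hypothetical names, and a plain list of event IDs stands in for the acknowledgement set. The only point it shows is that attaching an event and concluding/resetting the group take the same lock, so an attach cannot land between "group concluded" and "group reset".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an aggregate group; NOT the Data Prepper class.
class SketchAggregateGroup {
    private final Lock lock = new ReentrantLock();
    private final List<String> attachedEventIds = new ArrayList<>();

    Lock getLock() {
        return lock;
    }

    // Stands in for attaching an event to the group's acknowledgement set.
    // The caller is expected to hold the group's lock.
    void attach(final String eventId) {
        attachedEventIds.add(eventId);
    }

    // Returns everything attached so far and resets the group.
    // The caller is expected to hold the group's lock.
    List<String> concludeAndReset() {
        final List<String> released = new ArrayList<>(attachedEventIds);
        attachedEventIds.clear();
        return released;
    }
}

// Hypothetical processor showing the attach step performed inside the lock.
class SketchAggregateProcessor {
    private final SketchAggregateGroup group = new SketchAggregateGroup();

    // Attach under the same lock that guards conclude/reset.
    void handleEvent(final String eventId) {
        final Lock lock = group.getLock();
        lock.lock();
        try {
            group.attach(eventId);
        } finally {
            lock.unlock();
        }
    }

    // Conclude and reset while holding the lock, so no attach can interleave.
    List<String> concludeGroup() {
        final Lock lock = group.getLock();
        lock.lock();
        try {
            return group.concludeAndReset();
        } finally {
            lock.unlock();
        }
    }
}
```

With both paths guarded by one lock, an event is either included in the concluded batch or attached to the freshly reset group; it is never attached to a group that is about to be reset underneath it.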

Signed-off-by: David Venable <dlv@amazon.com>
chrisale000 and others added 5 commits January 27, 2026 11:19
…etricsRecorder (opensearch-project#6428)

Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
…project#6432)

This change adds support for configurable lease interval in the crawler
source plugin, allowing users to customize the leader scheduler's lease
interval instead of using a hardcoded value.

Changes:
- Added getLeaseInterval() method to CrawlerSourceConfig interface with
  default value of 1 minute
- Modified CrawlerSourcePlugin to use the configurable lease interval
  from the source configuration
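
A minimal sketch of the pattern this commit message describes, an interface method with a one-minute default that implementations may override. The `Sketch*` names are hypothetical, and the `java.time.Duration` return type is an assumption; the commit message only states the method name and the default value.

```java
import java.time.Duration;

// Hypothetical mirror of a source config interface; the return type is assumed.
interface SketchCrawlerSourceConfig {
    Duration DEFAULT_LEASE_INTERVAL = Duration.ofMinutes(1);

    // Default of 1 minute, used when an implementation does not override it.
    default Duration getLeaseInterval() {
        return DEFAULT_LEASE_INTERVAL;
    }
}

// A config that keeps the default lease interval.
class SketchDefaultConfig implements SketchCrawlerSourceConfig {
}

// A config that supplies its own lease interval.
class SketchCustomConfig implements SketchCrawlerSourceConfig {
    private final Duration leaseInterval;

    SketchCustomConfig(final Duration leaseInterval) {
        this.leaseInterval = leaseInterval;
    }

    @Override
    public Duration getLeaseInterval() {
        return leaseInterval;
    }
}
```

Putting the default on the interface means existing implementations keep the previous behavior without code changes, while a plugin that needs a shorter or longer lease overrides a single method.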

Signed-off-by: Alexander Christensen <alchrisk@amazon.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
…a-prepper into otel-apm-service-map-processor
@dlvenable

Member

Replaced by #6479.

@dlvenable dlvenable closed this May 6, 2026

9 participants