The app can emit distributed traces to either Elastic APM or Jaeger via OpenTelemetry. Tracing is off by default and activated by one of two Spring profiles, each paired with its own Docker Compose overlay.
- Backends — pick one
- Stack components
- Run with Elastic APM
- Run with Jaeger
- Run without tracing (default)
- Ports & URLs
- Exploring traces in Kibana — guided tour
- Same traces in Jaeger
- Useful filters
- Why Axon splits work across multiple traces
| Backend | Spring profile | Compose overlay | Footprint | UI | When to use |
|---|---|---|---|---|---|
| Elastic APM | observability-elastic | docker-compose.observability-elastic.yaml | ~2 GB RAM | Kibana APM (rich, service map, KQL) | Production-realistic UX, queries, dashboards |
| Jaeger | observability-jaeger | docker-compose.observability-jaeger.yaml | ~150 MB RAM | Jaeger UI (simple waterfall) | Quick spin-up, demos, CI, constrained machines |
The two profiles are alternatives — pick the backend you want for a given session. The base observability profile (sampling, OTLP/HTTP transport, Axon OpenTelemetrySpanFactory wiring) is inherited by both via Spring's spring.profiles.group, so switching backends is a one-line change.
| Component | Role |
|---|---|
| axon-tracing-opentelemetry | Instruments command/event/query handlers, aggregates, repositories, event store |
| micrometer-tracing-bridge-otel | Spring Boot's official bridge from Micrometer Observation to OpenTelemetry |
| opentelemetry-exporter-otlp | Pushes traces over OTLP/HTTP to the chosen backend |
| Elastic APM 9.x | Receives OTLP, stores in Elasticsearch, visualizes in Kibana |
| Jaeger 2.x | Receives OTLP directly, in-memory storage, lightweight UI |
Activation surface:
- Base profile observability: common tracing config (sampling, transport, Axon span factory bean). Never activated directly.
- Profile observability-elastic: base + Elastic APM endpoint
- Profile observability-jaeger: base + Jaeger endpoint
- Profile groups in application.yaml make the two child profiles automatically include the base.
To run with Elastic APM:

- Start the base stack + Elastic observability stack:

      docker compose -f docker-compose.yaml -f docker-compose.observability-elastic.yaml up -d

  Wait ~60s for Elasticsearch and Kibana to be ready.
- Run the app with the observability-elastic profile:

      SPRING_PROFILES_ACTIVE=observability-elastic ./mvnw spring-boot:run

  Or:

      ./mvnw spring-boot:run -Dspring-boot.run.profiles=observability-elastic

- Generate some traffic via Swagger UI at http://localhost:3773/swagger-ui/index.html.
- Open Kibana APM and follow the guided tour below.
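The fixed ~60s wait can be replaced by polling until Kibana answers. A minimal sketch, assuming a POSIX shell and curl; wait_until is a hypothetical helper name that is not part of this repo, and the attempt count and delay are arbitrary:

```shell
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Success of the probe ends the wait immediately.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (assumption: Kibana returns an HTTP error status until it is ready,
# which curl -f reports as a failure):
#   wait_until 30 5 curl -fsS http://localhost:5601 && echo "Kibana is up"
```

Passing the probe as a command keeps the helper independent of curl, so the same loop could wait on any other readiness check.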
To run with Jaeger:

- Start the base stack + Jaeger:

      docker compose -f docker-compose.yaml -f docker-compose.observability-jaeger.yaml up -d

  Jaeger v2 starts in seconds, so the UI is ready almost immediately.
- Run the app with the observability-jaeger profile:

      SPRING_PROFILES_ACTIVE=observability-jaeger ./mvnw spring-boot:run

  Or:

      ./mvnw spring-boot:run -Dspring-boot.run.profiles=observability-jaeger

- Generate some traffic via Swagger UI at http://localhost:3773/swagger-ui/index.html.
- Open Jaeger UI at http://localhost:16686 and see Same traces in Jaeger below.
To run without tracing (the default):

    docker compose up -d
    ./mvnw spring-boot:run

No observability-* profile = no traces emitted, no extra containers needed. Useful for daily development.
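The three run modes differ only in the Compose overlay and the Spring profile, so they can be driven by a single variable. A sketch, assuming the file and profile names used in this guide; compose_files and spring_profile are hypothetical helper names, not scripts shipped with the project:

```shell
# Print the docker compose -f arguments for a backend: elastic, jaeger or none.
compose_files() {
  case "$1" in
    elastic|jaeger)
      echo "-f docker-compose.yaml -f docker-compose.observability-$1.yaml" ;;
    none)
      echo "-f docker-compose.yaml" ;;
    *)
      echo "unknown backend: $1" >&2
      return 1 ;;
  esac
}

# Print the Spring profile for a backend (empty when tracing is off).
spring_profile() {
  case "$1" in
    elastic|jaeger) echo "observability-$1" ;;
    none) echo "" ;;
    *)
      echo "unknown backend: $1" >&2
      return 1 ;;
  esac
}

# Example (the unquoted substitution is deliberate so the arguments split):
#   backend=jaeger
#   docker compose $(compose_files "$backend") up -d
#   SPRING_PROFILES_ACTIVE=$(spring_profile "$backend") ./mvnw spring-boot:run
# For backend=none, run ./mvnw spring-boot:run without setting the variable.
```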
| Service | Port | URL | Used by profile |
|---|---|---|---|
| Kibana (APM UI) | 5601 | http://localhost:5601 | observability-elastic |
| Kibana → APM → Services | 5601 | http://localhost:5601/app/apm/services | observability-elastic |
| Kibana → APM → Traces | 5601 | http://localhost:5601/app/apm/traces | observability-elastic |
| Elasticsearch | 9200 | http://localhost:9200 | observability-elastic |
| Elastic APM Server (OTLP receiver) | 8200 | http://localhost:8200/v1/traces | observability-elastic |
| Jaeger UI | 16686 | http://localhost:16686 | observability-jaeger |
| Jaeger OTLP receiver (HTTP) | 4318 | http://localhost:4318/v1/traces | observability-jaeger |
| Jaeger OTLP receiver (gRPC) | 4317 | grpc://localhost:4317 | observability-jaeger |
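To check an OTLP/HTTP receiver without starting the app, a hand-written span can be posted to its URL from the table. This is a sketch, assuming curl is installed and that the receiver accepts OTLP's JSON encoding; otlp_payload, the service name otlp-smoke-test and the two IDs are made up for illustration:

```shell
# Build a minimal OTLP/JSON payload holding a single span.
# Arguments: service name, span name, trace id (32 hex chars),
# span id (16 hex chars), start and end time in Unix nanoseconds.
otlp_payload() {
  cat <<EOF
{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"$1"}}]},"scopeSpans":[{"scope":{"name":"manual-smoke-test"},"spans":[{"traceId":"$3","spanId":"$4","name":"$2","kind":1,"startTimeUnixNano":"$5","endTimeUnixNano":"$6"}]}]}]}
EOF
}

# Example against the Jaeger receiver (a half-second span starting now):
#   now=$(date +%s)
#   otlp_payload otlp-smoke-test ping \
#     5b8efff798038103d269b633813fc60c eee19b7ec3c1b174 \
#     "${now}000000000" "${now}500000000" |
#   curl -sS -X POST -H 'Content-Type: application/json' \
#     --data-binary @- http://localhost:4318/v1/traces
```

If the request is accepted, a service named otlp-smoke-test should appear in the Jaeger UI service dropdown. Whether the Elastic APM Server URL accepts the identical request has not been verified here.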
A quick walkthrough of what you can see and where to click. Examples below were captured after running the full chain BuildDwelling → IncreaseAvailableCreatures → RecruitCreature → GetDwellingById through Swagger UI with the observability-elastic profile active.
Kibana → ☰ → Observability → APM → Services — landing page lists every service emitting traces. After firing a few requests you should see heroesofddd with average latency, throughput, and error rate.
Click heroesofddd → Transactions tab. Each row is a distinct "entry point" — both HTTP endpoints (auto-instrumented by Spring Web) and Axon's async boundaries (each command/event/query handler is a top-level transaction because Axon hops across the gRPC bus and async event processors).
Kibana → APM → Traces shows every individual trace tree.
For this project you'll see roots like:
- http put /games/{gameId}/dwellings/{dwellingId}: HTTP root
- CommandBus.handleDistributedCommand(RecruitCreature): command handling on the aggregate side after the gRPC hop to Axon Server
- StreamingEventProcessor.process(CreatureRecruited): projector / automation
- CommandBus.handleCommand(AddCreatureToArmy): command emitted by the WhenCreatureRecruitedThenAddToArmy automation
- QueryBus.processQueryMessage(GetDwellingById): query side
Click any CommandBus.handleCommand(...) trace and Kibana renders a waterfall like this — the full call path of the Axon command handler, including aggregate loading and event publication:
What you're looking at:
✓ CommandBus.handleDistributedCommand(RecruitCreature) 22 ms ← gRPC server side
└─ CommandBus.dispatchCommand(RecruitCreature) 22 ms
└─ CommandBus.handleCommand(RecruitCreature) 22 ms
├─ Repository.load 10 ms ← event sourcing
│ ├─ Repository.obtainLock 41 μs
│ └─ Repository.initializeState(Dwelling) 1.0 ms ← rehydrate aggregate
├─ Dwelling.decide(RecruitCreature) 3.2 ms ← AGGREGATE business logic
├─ EventBus.publishEvent(CreatureRecruited) 29 μs
└─ EventBus.commitEvents 5.8 ms ← persist to event store
This is exactly the layered shape from Event Sourcing theory — repository → aggregate → event publication — rendered as data, not as a diagram in a slide.
Click any Axon span (e.g. CommandBus.handleCommand(RecruitCreature)) → Metadata tab. The flyout shows OpenTelemetry attributes — including correlation data injected by GameConfiguration.gameDataProvider:
labels.axon_metadata_gameId = scenario-1 ← from gameDataProvider
labels.axon_metadata_playerId = player-1 ← from gameDataProvider
labels.axon_message_id = 2201ae5d-3871-45b2-a661-...
labels.axon_message_name = com.dddheroes…RecruitCreature
labels.axon_message_type = GrpcBackedCommandMessage
labels.axon_payload_type = com.dddheroes…RecruitCreature
This is the practical payoff: filtering traces by labels.axon_metadata_gameId : "scenario-1" in the Kibana search bar isolates every span — across every aggregate, processor and projector — that participated in one game session.
With observability-jaeger active, the same trace data is produced — Jaeger just renders it differently:
- Open http://localhost:16686.
- Service dropdown → pick heroesofddd.
- Operation dropdown → e.g. CommandBus.handleCommand(RecruitCreature).
- Click Find Traces → see the same waterfall tree (Repository.load, Dwelling.decide, EventBus.publishEvent, …).
- Click any span → "Tags" panel shows the same OTel attributes as Kibana labels, with dot-notation: axon.message.id, axon.message.name, axon.metadata.gameId, axon.metadata.playerId, etc.
Trade-offs vs Kibana APM:
- ✅ Lighter: one container, instant startup, no Elasticsearch index management
- ✅ Simpler: direct trace search by service / operation / tag / duration
- ❌ No service map: Kibana shows topology between services; Jaeger v2 OSS doesn't
- ❌ No KQL: Jaeger uses a simpler tag-equality search (tag: axon.metadata.gameId=scenario-1) rather than full KQL
- ❌ No persistence by default: all-in-one stores traces in memory; restart loses them
- ❌ No metrics/logs correlation: Kibana correlates traces with the rest of the Elastic stack
Paste into the Kibana search bar (KQL) at the top of any APM page:
| Goal | KQL |
|---|---|
| Traces for one game | `labels.axon_metadata_gameId : "scenario-1"` |
| Only command handlers | `transaction.name : CommandBus.handleCommand*` |
| Only one aggregate's decisions | `span.name : Dwelling.decide*` |
| Only automation reactions | `span.name : *Processor.react*` |
| Only event publications | `span.name : EventBus.publishEvent*` |
In the Jaeger UI search form, the Tags field accepts space-separated key=value pairs:
| Goal | Tag query |
|---|---|
| Traces for one game | axon.metadata.gameId=scenario-1 |
| One specific player's traces | axon.metadata.playerId=player-1 |
| Combine | axon.metadata.gameId=scenario-1 axon.metadata.playerId=player-1 |
(Operation-level filtering — e.g. "only Dwelling.decide spans" — is done via the Operation dropdown, not the Tags field.)
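The same tag filters can be scripted. The Jaeger UI loads its search results from /api/traces on the UI port and expects the tags as a JSON object; that endpoint is internal to the UI rather than a stable public API, so treat the following as a sketch. tags_to_json is a hypothetical helper, and it does not escape quotes or backslashes inside values:

```shell
# Turn key=value arguments (the form used in the Tags field) into a JSON object.
tags_to_json() {
  out=""
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first '='
    value=${pair#*=}    # text after the first '='
    # Prefix a comma for every pair except the first.
    out="${out}${out:+,}\"${key}\":\"${value}\""
  done
  printf '{%s}\n' "$out"
}

# Example (assumes curl and a running Jaeger from this guide):
#   curl -sS -G http://localhost:16686/api/traces \
#     --data-urlencode service=heroesofddd \
#     --data-urlencode "tags=$(tags_to_json axon.metadata.gameId=scenario-1)"
```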
You'll notice that an HTTP request often produces two or three separate trace trees rather than one giant tree. That's expected. Axon hops over async boundaries that don't preserve OpenTelemetry context: the gRPC call to Axon Server (server-side handleDistributedCommand starts a new root) and the asynchronous event processors (each process(Event) is its own root). Inside one boundary, however, the tree is complete — as the waterfall above shows. Behavior is identical in both Kibana APM and Jaeger.
To stitch sessions together end-to-end, use the axon_metadata_gameId (Kibana) / axon.metadata.gameId (Jaeger) tag filter described above.
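The two spellings differ only in notation: the Kibana label is the OTel attribute name with dots replaced by underscores and a labels. prefix. A small sketch of that mapping, with to_kibana_label and game_filters as hypothetical helper names:

```shell
# Convert an OTel attribute name to the label name shown in Kibana.
to_kibana_label() {
  printf 'labels.%s\n' "$(printf '%s' "$1" | tr '.' '_')"
}

# Print the per-game filter for both UIs, given a game id.
game_filters() {
  attr=axon.metadata.gameId
  printf 'Kibana KQL: %s : "%s"\n' "$(to_kibana_label "$attr")" "$1"
  printf 'Jaeger tag: %s=%s\n' "$attr" "$1"
}

game_filters scenario-1
# Kibana KQL: labels.axon_metadata_gameId : "scenario-1"
# Jaeger tag: axon.metadata.gameId=scenario-1
```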




