feat(cloudwatch-logs): Add Entity config support to sink#6864
feat(cloudwatch-logs): Add Entity config support to sink#6864bagmarnikhil wants to merge 4 commits into
Conversation
✅ License Header Check PassedAll newly added files have proper license headers. Great work! 🎉 |
Add an optional 'entity' configuration block on the CloudWatch Logs sink that attaches CloudWatch Entity metadata (key_attributes + attributes) to every PutLogEvents request, enabling entity-based correlation in CloudWatch. When 'entity' is omitted the sink behaves identically to before. When configured, an SDK Entity is built and passed to the dispatcher via the builder; the dispatcher conditionally sets it on each PLE request. If CloudWatch rejects the entity, the request is still considered successful (events are released), a WARN log records the RejectedEntityInfo error type, and a new 'cloudWatchLogsEntityRejected' counter is incremented as the primary alarmable signal. Validation is intentionally minimal — @notempty on key_attributes only. AWS-owned limits (max entries, allowed key names, length caps) are enforced server-side and surfaced via the rejection metric, avoiding client-side drift when the Entity API contract changes. Tests follow TDD: each test was written before the code that makes it pass. New unit tests cover EntityConfig defaults, deserialization, and @notempty validation; @Valid cascade from CloudWatchLogsSinkConfig; the new counter; dispatcher entity-on-request, no-entity, and rejection paths; and sink wiring of EntityConfig -> Entity -> dispatcher. TestSinkOperationWithEntity integration test exercises the SDK end-to-end. Acceptance criteria from the spec are met: backward compatibility, entity attached when configured, non-fatal rejection with metric and log, all existing tests green, new tests cover new code paths, license headers on new files. Refs: data-prepper-plugins/cloudwatch-logs/specs/entity-config.md Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
- Remove duplicate @test annotations from auto-merge - Add Hamcrest imports for entity assertion matchers - Update EXPECTED_DISPATCHER_ARITY to 11 (entity field added) Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
f18af61 to
60dc039
Compare
…rejection check - Default keyAttributes and attributes to Collections.emptyMap() - Return Collections.unmodifiableMap() from both getters - Move entity rejection metric before releaseEventHandles - Add unmodifiable map and createResources exception tests Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
dlvenable
left a comment
There was a problem hiding this comment.
Thanks @bagmarnikhil for this contribution!
| @NotEmpty | ||
| private Map<String, String> keyAttributes = Collections.emptyMap(); | ||
|
|
||
| @JsonProperty("attributes") |
| dlqObjects = getDlqObjectsFromResponse(putLogEventsResponse); | ||
| } | ||
| if (putLogEventsResponse != null && putLogEventsResponse.rejectedEntityInfo() != null) { | ||
| cloudWatchLogsMetrics.increaseEntityRejectedCounter(1); |
There was a problem hiding this comment.
A misconfigured entity will return rejectedEntityInfo on every successful PutLogEvents. This WARN will fire per batch under load; consider NOISY marker or rate-limited logging like the SDK exception path.
| public class EntityConfig { | ||
|
|
||
| @JsonProperty("key_attributes") | ||
| @NotEmpty |
There was a problem hiding this comment.
Thanks for the suggestion. I intentionally kept validation minimal here — the Entity API has a complex set of constraints (key names restricted to 5 allowed values, min 2 / max 4 entries for keyAttributes, value lengths 1–512, max 10 attributes, etc.).
Replicating a subset client-side (e.g. @notblank on values) creates a false sense of completeness while still drifting if AWS changes the contract. The rejection metric (cloudWatchLogsEntityRejected) + WARN log are designed to be the primary signal for misconfiguration.
This avoids coupling to server-side limits that may evolve.
| private int retryCount; | ||
| private boolean createLogGroup; | ||
| private boolean createLogStream; | ||
| private Entity entity; |
There was a problem hiding this comment.
Inner Uploader declares its private final fields but the outer dispatcher leaves entity (and the older fields above) non-final. Flag if you want the new field to follow the inner class' immutability convention.
There was a problem hiding this comment.
The existing fields in the outer dispatcher (retryCount, createLogGroup, createLogStream, etc.) are all non-final. Making only entity final would be inconsistent with the rest of the class.
Happy to make it final if we want to refactor all the outer fields to final in a follow-up, but keeping it consistent with the existing style for now.
| assertThat(entityConfig.getAttributes(), aMapWithSize(0)); | ||
| } | ||
|
|
||
| @Test |
There was a problem hiding this comment.
GIVEN_entity_config_with_empty_key_attributes… and GIVEN_entity_config_with_default_key_attributes… exercise the same constraint (empty map default vs explicitly-set empty map). Collapse into one parameterized test
Add an optional 'entity' configuration block on the CloudWatch Logs sink that attaches CloudWatch Entity metadata (key_attributes + attributes) to every PutLogEvents request, enabling entity-based correlation in CloudWatch.
When 'entity' is omitted the sink behaves identically to before. When configured, an SDK Entity is built and passed to the dispatcher via the builder; the dispatcher conditionally sets it on each PLE request.
If CloudWatch rejects the entity, the request is still considered successful (events are released), a WARN log records the RejectedEntityInfo error type, and a new 'cloudWatchLogsEntityRejected' counter is incremented as the primary alarmable signal.
Validation is intentionally minimal — @notempty on key_attributes only. AWS-owned limits (max entries, allowed key names, length caps) are enforced server-side and surfaced via the rejection metric, avoiding client-side drift when the Entity API contract changes.
Tests follow TDD: each test was written before the code that makes it pass. New unit tests cover EntityConfig defaults, deserialization, and @notempty validation; @Valid cascade from CloudWatchLogsSinkConfig; the new counter; dispatcher entity-on-request, no-entity, and rejection paths; and sink wiring of EntityConfig -> Entity -> dispatcher. TestSinkOperationWithEntity integration test exercises the SDK end-to-end.
Acceptance criteria from the spec are met: backward compatibility, entity attached when configured, non-fatal rejection with metric and log, all existing tests green, new tests cover new code paths, license headers on new files.
Refs: data-prepper-plugins/cloudwatch-logs/specs/entity-config.md
Description
[Describe what this change achieves]
Issues Resolved
Resolves #6860
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.