Add wrap_entries processor to mutate-event-processors #6665

Merged
kkondaka merged 7 commits into opensearch-project:main from nishantKadivar:main
Apr 14, 2026

nishantKadivar commented Mar 22, 2026

Contributor

Add wrap_entries processor to mutate-event-processors

Description

Adds a new wrap_entries processor that wraps each element of a primitive array into an object using a configured key name. This enables downstream processors like add_entries and delete_entries with iterate_on, which require List<Map<String, Object>> and cannot operate on primitive arrays (strings, numbers, booleans).

For example, ["alice", "bob"] becomes [{"name": "alice"}, {"name": "bob"}].
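The transformation can be illustrated with a minimal, standalone Java sketch. This is not the processor's code: `WrapEntriesSketch` and its `wrap` helper are hypothetical names, and a plain `List` stands in for the event field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WrapEntriesSketch {
    // Hypothetical helper (not the processor's actual method):
    // wraps each element of the list into a single-entry map under `key`.
    static List<Map<String, Object>> wrap(final List<?> source, final String key) {
        final List<Map<String, Object>> result = new ArrayList<>(source.size());
        for (final Object element : source) {
            final Map<String, Object> wrapped = new HashMap<>();
            wrapped.put(key, element);
            result.add(wrapped);
        }
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(wrap(List.of("alice", "bob"), "name"));
        // prints [{name=alice}, {name=bob}]
    }
}
```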

Issues Resolved

#6611

Configuration Options

| Option | Required | Type | Default | Description |
|---|---|---|---|---|
| source | Yes | String | | Key of the primitive array to transform (JSON Pointer) |
| target | No | String | source | Key to write the result to (defaults to in-place) |
| key | Yes | String | | Key name in each resulting object |
| exclude_null_empty_values | No | Boolean | false | Filter out null/empty elements before wrapping |
| append_if_target_exists | No | Boolean | false | Append to existing target array instead of overwriting |
| map_entries_when | No | String | null | Conditional expression evaluated at root event level |
| tags_on_failure | No | List | null | Tags to add on processing failure |

Check List

  • New functionality includes testing
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

End-to-End Test Results

Tested manually using Data Prepper's file source and file sink.

Pipeline Configuration

map-entries-pipeline:
  source:
    file:
      path: "pipelines/input.json"
      format: "json"
      record_type: "event"
  processor:
    - wrap_entries:
        source: "/names"
        key: "name"
        map_entries_when: '/type == "users"'
    - wrap_entries:
        source: "/tags"
        key: "value"
        map_entries_when: '/type == "tagged"'
    - wrap_entries:
        source: "/items"
        target: "/inventory_items"
        key: "product"
        map_entries_when: '/type == "inventory"'
    - wrap_entries:
        source: "/scores"
        key: "score"
        map_entries_when: '/type == "grades"'
  sink:
    - file:
        path: "pipelines/output.json"

Input (5 records)

{"names": ["alice", "bob", "charlie"], "type": "users"}
{"tags": ["critical", "network", "security"], "source": "firewall", "type": "tagged"}
{"list_a": [{"url": "http://api.example.com/v1"}, {"url": "http://api.example.com/v2"}], "list_b": [{"url": "http://cdn.example.com/assets"}], "type": "urls"}
{"items": ["laptop", "monitor", "keyboard", "mouse"], "department": "IT", "type": "inventory"}
{"scores": [95, 82, 78, 91, 88], "subject": "math", "type": "grades"}

Output (actual results)

{"names":[{"name":"alice"},{"name":"bob"},{"name":"charlie"}],"type":"users"}
{"tags":[{"value":"critical"},{"value":"network"},{"value":"security"}],"source":"firewall","type":"tagged"}
{"list_a":[{"url":"http://api.example.com/v1"},{"url":"http://api.example.com/v2"}],"list_b":[{"url":"http://cdn.example.com/assets"}],"type":"urls"}
{"items":["laptop","monitor","keyboard","mouse"],"department":"IT","type":"inventory","inventory_items":[{"product":"laptop"},{"product":"monitor"},{"product":"keyboard"},{"product":"mouse"}]}
{"scores":[{"score":95},{"score":82},{"score":78},{"score":91},{"score":88}],"subject":"math","type":"grades"}

E2E Scenarios

| # | Scenario | Result |
|---|---|---|
| 1 | String array in-place wrap (type=users) | PASS |
| 2 | String array wrap with different key (type=tagged) | PASS |
| 3 | Condition not met — no mutation (type=urls) | PASS |
| 4 | Separate target preserves original (type=inventory) | PASS |
| 5 | Integer array wrap (type=grades) | PASS |

Adds a new map_entries processor that wraps each element of a primitive
array into an object using a configured key name. This enables downstream
processors like add_entries and delete_entries with iterate_on, which
require List of Map and cannot operate on primitive arrays.

Example: ["alice", "bob"] -> [{"name": "alice"}, {"name": "bob"}]

Configuration options:
- source (required): key of the primitive array to transform
- target (optional): key to write result to (defaults to source)
- key (required): key name in each resulting object
- exclude_null_empty_values: filter out null/empty elements
- append_if_target_exists: append to existing target array
- map_entries_when: conditional expression
- tags_on_failure: tags on processing failure

Includes 21 unit tests and 3 config tests.

Signed-off-by: Nishant Kadivar <nimahesx@amazon.com>

if (!event.containsKey(source)) {
    LOG.warn(EVENT, "Source key [{}] does not exist in event [{}], skipping.", source, event);
    addTagsOnFailure(event);

Contributor

Is a missing source key really a failure? In a pipeline processing mixed event types, lots of events won't have this key — tagging all of them as failed seems noisy. The other mutate-event processors (SelectEntriesProcessor, FilterListProcessor) treat missing keys as a silent no-op. I'd drop the addTagsOnFailure here and make it a debug-level log instead, reserving failure tags for actual errors like "source exists but isn't a list."
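The distinction the reviewer draws can be sketched as follows. This is a hedged illustration only: `MissingSourceSketch`, `Outcome`, and `classify` are hypothetical names, and a plain `Map` stands in for the Data Prepper event.

```java
import java.util.List;
import java.util.Map;

public class MissingSourceSketch {
    // Hypothetical outcomes, used only to illustrate the review suggestion.
    enum Outcome { SKIPPED, FAILED, WRAPPED }

    // A plain Map stands in for the event; this is not the Data Prepper Event API.
    static Outcome classify(final Map<String, Object> event, final String source) {
        if (!event.containsKey(source)) {
            // Missing key: silent no-op (debug-level log, no failure tags).
            return Outcome.SKIPPED;
        }
        if (!(event.get(source) instanceof List)) {
            // Key exists but is not a list: an actual error, so failure tags apply.
            return Outcome.FAILED;
        }
        return Outcome.WRAPPED;
    }

    public static void main(final String[] args) {
        System.out.println(classify(Map.of("type", "urls"), "names"));           // prints SKIPPED
        System.out.println(classify(Map.of("names", "alice"), "names"));         // prints FAILED
        System.out.println(classify(Map.of("names", List.of("alice")), "names")); // prints WRAPPED
    }
}
```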

continue;
}
}
final Map<String, Object> wrapped = new HashMap<>(1);

Contributor

Since this map is never modified after the put, Collections.singletonMap(key, element) would be a better fit — immutable, less memory, and communicates that it's a single-entry map.

"for valid expression syntax", config.getMapEntriesWhen()));
}

if (config.getTagsOnFailure() != null) {

Contributor

Nit: none of the other mutate-event processors validate tag contents like this (AddEntryProcessor, SelectEntriesProcessor, etc.). Not wrong, but it's inconsistent.

nishantKadivar, Mar 27, 2026

Contributor Author

To be consistent with the other processors, I have removed the tag content validation.

result.add(wrapped);
}

if (result.isEmpty()) {

Contributor

When exclude_null_empty_values: true and all elements get filtered out, this early return leaves the original primitive array untouched. But if even one element survives, the target gets overwritten with an object array. This means the same processor produces different output types depending on the data:

  • Input [null, ""] → output stays [null, ""] (primitives)
  • Input [null, "ok"] → output becomes [{"key": "ok"}] (objects)

Downstream processors like add_entries with iterate_on expect a consistent List — they'll break on the primitive case. I'd remove this early return and let it fall through to event.put(effectiveTarget, result) so the output is always [] when nothing survives filtering.

Contributor Author

I have removed the if (result.isEmpty()) { return; } check and let it fall through to event.put(effectiveTarget, result). That way the output is always List<Map<String, Object>> — either [] or [{"key": "val"}, ...] — regardless of whether filtering removed everything.
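A minimal sketch of the agreed behaviour, under stated assumptions: `FilteredWrapSketch` and its methods are hypothetical, a plain `List` stands in for the event field, and "empty" is assumed here to mean an empty string (the processor's actual definition may differ).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilteredWrapSketch {
    // Assumption: "empty" means an empty string; the real processor may define it differently.
    static boolean isNullOrEmpty(final Object element) {
        return element == null || (element instanceof String && ((String) element).isEmpty());
    }

    static List<Map<String, Object>> wrap(final List<?> source, final String key,
                                          final boolean excludeNullEmptyValues) {
        final List<Map<String, Object>> result = new ArrayList<>();
        for (final Object element : source) {
            if (excludeNullEmptyValues && isNullOrEmpty(element)) {
                continue; // filtered out before wrapping
            }
            final Map<String, Object> wrapped = new HashMap<>();
            wrapped.put(key, element);
            result.add(wrapped);
        }
        // No early return on an empty result: the caller always gets a list of maps.
        return result;
    }

    public static void main(final String[] args) {
        System.out.println(wrap(Arrays.asList(null, ""), "key", true));   // prints []
        System.out.println(wrap(Arrays.asList(null, "ok"), "key", true)); // prints [{key=ok}]
    }
}
```

Either way the result is a list of maps, so a downstream consumer that iterates over it sees zero or more objects and never a primitive.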

…ty filtered results

- Change missing source key from warn+tags_on_failure to silent debug
  log, consistent with other mutate-event processors
- Remove early return when all elements are filtered out by
  exclude_null_empty_values, ensuring consistent List<Map> output type
- Update test and README to reflect new behavior

Signed-off-by: Nishant Kadivar <nimahesx@amazon.com>
@kkondaka

Collaborator

Looks good. Not sure about the name of the processor. Looks like it's doing list to list with each list item becoming a map?

github-actions Bot commented Mar 30, 2026

✅ License Header Check Passed

All newly added files have proper license headers. Great work! 🎉

@nishantKadivar

Contributor Author

@kkondaka - The name map_entries was chosen because the processor maps each entry in the list into a key-value structure, aligning with the naming of other mutate-event processors.
Open to suggestions.

@nishantKadivar

Contributor Author

The following newly added files are missing required license headers:

data-prepper-plugins/mutate-event-processors/src/test/resources/org/opensearch/dataprepper/plugins/processor/mutateevent/map_entries_when_filters_records.yaml
Please add the appropriate license header to each file and push your changes.

See the license header requirements: https://github.com/opensearch-project/data-prepper/blob/main/CONTRIBUTING.md#license-headers

Regarding this violation: we did not add license headers to these YAML files because we observed that none of the existing processors' YAML files include them, and we wanted to stay consistent across processors. Let me know if they need to be included.

@dlvenable

Member

> Looks good. Not sure about the name of the processor. Looks like it's doing list to list with each list item becoming a map?

I think this is a good point. "Map entries" sounds like it is running a map function. The word map can be confused with a verb.

dlvenable left a comment

Member

Thank you @nishantKadivar for this contribution. I have a few comments. We should change the name as well.

return new Record<>(JacksonEvent.builder().withEventType("event").withData(data).build());
}

// --- Constructor validation delegation ---

Member

Let's avoid these comments for sections. Remove them or use @Nested classes.

@@ -0,0 +1,10 @@
test-pipeline:

Member

This also needs a header.

continue;
}
}
result.add(Collections.singletonMap(key, element));

Member

I suspect that JacksonEvent will take this as-is and not remap. The singletonMap is immutable and thus would fail to mutate in downstream processors. Let's create a new map and put that.
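The immutability concern itself is easy to check in isolation. The sketch below only shows that a map from `Collections.singletonMap` rejects `put`, while a `HashMap` accepts it. Whether JacksonEvent copies the map on insertion is not tested here, and `SingletonMapSketch` and `canAddEntry` are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SingletonMapSketch {
    // Returns true if a new entry can be added to the map, false if the map is immutable.
    static boolean canAddEntry(final Map<String, Object> map) {
        try {
            map.put("extra", 1);
            return true;
        } catch (final UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        final Map<String, Object> immutable = Collections.singletonMap("name", "alice");
        final Map<String, Object> mutable = new HashMap<>();
        mutable.put("name", "alice");

        System.out.println(canAddEntry(immutable)); // prints false
        System.out.println(canAddEntry(mutable));   // prints true
    }
}
```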

@dlvenable

Member

> Looks good. Not sure about the name of the processor. Looks like it's doing list to list with each list item becoming a map?

I think this is a good point. "Map entries" sounds like it is running a map function. The word map can be confused with a verb.

The key verb you use in the README is "wrap." This processor name should start with wrap_ since that is the action it is taking.


@Test
void doExecute_with_string_array_wraps_each_element_into_object_in_place() {
final Record<Event> record = createEvent(Map.of("names", Arrays.asList("alpha", "beta")));

Member

Let's add some test cases when the names are not primitives.

e.g.

List.of(Map.of("name", "alpha"), Map.of("name", "beta"))

and

List.of(List.of("alpha1", "beta1"), List.of("alpha2", "beta2"))

Contributor Author

I have added test cases for both scenarios.

}

@Test
void doExecute_with_map_elements_wraps_each_map_into_object() {

Member

Is this the desirable behavior? I'm ok to approve now, but we should follow up and discuss if we want to have this before we release 2.16.

Contributor Author

Hello @dlvenable,
I have raised issue #6753 to discuss the wrap_entries behaviour.
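For context, the behaviour in question can be sketched as below. The test name doExecute_with_map_elements_wraps_each_map_into_object suggests that non-primitive elements are wrapped like any other value, but that reading is an assumption. `NonPrimitiveWrapSketch` and `wrap` are hypothetical names rather than the processor's code, and the inputs follow the two cases suggested in the review.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NonPrimitiveWrapSketch {
    // Hypothetical helper: wraps every element, primitive or not, under `key`.
    static List<Map<String, Object>> wrap(final List<?> source, final String key) {
        final List<Map<String, Object>> result = new ArrayList<>(source.size());
        for (final Object element : source) {
            final Map<String, Object> wrapped = new HashMap<>();
            wrapped.put(key, element);
            result.add(wrapped);
        }
        return result;
    }

    public static void main(final String[] args) {
        // Elements that are already maps end up nested one level deeper.
        System.out.println(wrap(List.of(Map.of("name", "alpha"), Map.of("name", "beta")), "name"));
        // prints [{name={name=alpha}}, {name={name=beta}}]

        // Elements that are lists become the value of the configured key.
        System.out.println(wrap(List.of(List.of("alpha1", "beta1"), List.of("alpha2", "beta2")), "name"));
        // prints [{name=[alpha1, beta1]}, {name=[alpha2, beta2]}]
    }
}
```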

@nishantKadivar nishantKadivar changed the title Add map_entries processor to mutate-event-processors Add wrap_entries processor to mutate-event-processors Apr 11, 2026
@kkondaka kkondaka merged commit 79455e6 into opensearch-project:main Apr 14, 2026
73 of 76 checks passed