Add substring expression functions #6621

Merged
oeyh merged 1 commit into opensearch-project:main from bagmarnikhil:feature/substring-expression-functions
Mar 17, 2026

@bagmarnikhil

Contributor

Description

The expression language has no way to extract a portion of a string by delimiter. Existing string processors mutate fields in-place but cannot produce a value for assignment via value_expression.

Add four new expression functions:

  • substringAfter(s, d): text after the first occurrence of d
  • substringBefore(s, d): text before the first occurrence of d
  • substringAfterLast(s, d): text after the last occurrence of d
  • substringBeforeLast(s, d): text before the last occurrence of d

Both arguments accept JSON Pointers or string literals. If the delimiter is not found, the original string is returned. If the source resolves to null, null is returned.
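The described semantics can be sketched in plain Java. This is a minimal illustration, not the code from this PR: the class name `SubstringSketch` is hypothetical, JSON Pointer resolution against an event is left out, and the delimiter is assumed to be non-null. The description does not say how an empty delimiter is handled, so this sketch simply inherits whatever `indexOf` and `lastIndexOf` do in that case.

```java
// Hypothetical helper class; it only mirrors the behavior described above.
final class SubstringSketch {
    private SubstringSketch() {
    }

    // Text after the first occurrence of delimiter.
    static String substringAfter(final String source, final String delimiter) {
        if (source == null) {
            return null; // null source yields null
        }
        final int index = source.indexOf(delimiter);
        // Delimiter not found: return the original string.
        return index < 0 ? source : source.substring(index + delimiter.length());
    }

    // Text before the first occurrence of delimiter.
    static String substringBefore(final String source, final String delimiter) {
        if (source == null) {
            return null;
        }
        final int index = source.indexOf(delimiter);
        return index < 0 ? source : source.substring(0, index);
    }

    // Text after the last occurrence of delimiter.
    static String substringAfterLast(final String source, final String delimiter) {
        if (source == null) {
            return null;
        }
        final int index = source.lastIndexOf(delimiter);
        return index < 0 ? source : source.substring(index + delimiter.length());
    }

    // Text before the last occurrence of delimiter.
    static String substringBeforeLast(final String source, final String delimiter) {
        if (source == null) {
            return null;
        }
        final int index = source.lastIndexOf(delimiter);
        return index < 0 ? source : source.substring(0, index);
    }
}
```

With the source `"a/b/c"` and the delimiter `"/"`, the four methods return `"b/c"`, `"a"`, `"c"`, and `"a/b"` respectively. With a delimiter that does not occur, such as `"#"`, each returns `"a/b/c"` unchanged.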

Issues Resolved

Resolves #6612

Check List

  • [Y ] New functionality includes testing.
  • [ N] New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • [Y ] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dlvenable

Member

@bagmarnikhil, please rebase this onto the latest main and use the approach that we just added in #6626. That change alters how functions receive arguments, so it directly impacts your PR.

throw new RuntimeException("substringAfter() takes exactly two arguments");
}

final String[] strArgs = new String[NUMBER_OF_ARGS];

Member

We should have an abstract class for substring functions as they all share the same basic behavior.

abstract class AbstractSubstringExpressionFunction implements ExpressionFunction

You can use getFunctionName() in the user-facing error messages.
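A rough sketch of what such a base class might look like, under loud assumptions: `SketchExpressionFunction` is a hypothetical, simplified stand-in for Data Prepper's `ExpressionFunction` (the real interface has a different `evaluate` signature and resolves JSON Pointers against the event), and every class name below is invented for illustration. The point is only the shape: shared argument validation, null handling, and error messages built from `getFunctionName()`, with one small hook per function.

```java
import java.util.List;

// Hypothetical stand-in for ExpressionFunction; the real interface differs.
interface SketchExpressionFunction {
    String getFunctionName();

    Object evaluate(List<Object> args);
}

// Shared behavior for all substring functions (illustrative names only).
abstract class AbstractSubstringSketchFunction implements SketchExpressionFunction {
    private static final int NUMBER_OF_ARGS = 2;

    @Override
    public Object evaluate(final List<Object> args) {
        if (args.size() != NUMBER_OF_ARGS) {
            // getFunctionName() keeps the message correct for every subclass.
            throw new RuntimeException(getFunctionName() + "() takes exactly two arguments");
        }
        final Object source = args.get(0);
        if (source == null) {
            return null; // null source yields null
        }
        final Object delimiter = args.get(1);
        if (!(source instanceof String) || !(delimiter instanceof String)) {
            throw new RuntimeException(getFunctionName() + "() arguments must be strings");
        }
        return extract((String) source, (String) delimiter);
    }

    // Each concrete function supplies only its extraction rule.
    protected abstract String extract(String source, String delimiter);
}

// One concrete function; the other three would differ only in extract().
final class SubstringAfterSketchFunction extends AbstractSubstringSketchFunction {
    @Override
    public String getFunctionName() {
        return "substringAfter";
    }

    @Override
    protected String extract(final String source, final String delimiter) {
        final int index = source.indexOf(delimiter);
        return index < 0 ? source : source.substring(index + delimiter.length());
    }
}
```

In this sketch, calling `evaluate` with a single argument throws a `RuntimeException` whose message starts with `substringAfter()`, without the subclass repeating any validation code.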

Contributor Author

Updated the PR based on the suggested refactor.

@bagmarnikhil bagmarnikhil force-pushed the feature/substring-expression-functions branch from 48e3bfb to 0d6c6b3 Compare March 13, 2026 18:23
The expression language has no way to extract a portion of a string
by delimiter. Existing string processors mutate fields in-place but
cannot produce a value for assignment via value_expression.

Add four new expression functions:

- substringAfter(s, d): text after the first occurrence of d
- substringBefore(s, d): text before the first occurrence of d
- substringAfterLast(s, d): text after the last occurrence of d
- substringBeforeLast(s, d): text before the last occurrence of d

Both arguments accept JSON Pointers or string literals. If the
delimiter is not found, the original string is returned. If the
source resolves to null, null is returned.

Resolve opensearch-project#6612

Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
@bagmarnikhil bagmarnikhil force-pushed the feature/substring-expression-functions branch from 0d6c6b3 to e545d87 Compare March 13, 2026 18:29
Comment thread docs/expression_syntax.md
- If the IP address is in the range of any given CIDR blocks, the function evaluates to true; otherwise, the function evaluates to false.
- The function supports both IPv4 and IPv6 addresses.
For example, `cidrContains(/sourceIp,"192.0.2.0/24","10.0.1.0/16")` evaluates to true if the event has `sourceIp` field with value "192.0.2.5".
* `substringAfter()`

Member

Please create a follow-up PR to the documentation website here (https://github.com/opensearch-project/documentation-website).

This documentation is what users will use (https://docs.opensearch.org/latest/data-prepper/pipelines/functions)

Contributor Author

Created a pull request for documentation: opensearch-project/documentation-website#12094

bagmarnikhil added a commit to bagmarnikhil/documentation-website that referenced this pull request Mar 13, 2026
Add documentation for four new Data Prepper expression functions:
substringAfter, substringBefore, substringAfterLast, and
substringBeforeLast. These functions extract portions of a string
by delimiter and were added in opensearch-project/data-prepper#6621.

Update the functions index page to include the new functions.

Resolves: opensearch-project/data-prepper#6612
Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
kolchfa-aws added a commit to opensearch-project/documentation-website that referenced this pull request Mar 17, 2026
* Add substring expression function documentation

Add documentation for four new Data Prepper expression functions:
substringAfter, substringBefore, substringAfterLast, and
substringBeforeLast. These functions extract portions of a string
by delimiter and were added in opensearch-project/data-prepper#6621.

Update the functions index page to include the new functions.

Resolves: opensearch-project/data-prepper#6612
Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>

* Doc review

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

---------

Signed-off-by: Nikhil Bagmar <nikhilbagmar73@gmail.com>
Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Co-authored-by: Fanit Kolchina <kolchfa@amazon.com>
opensearch-trigger-bot Bot pushed a commit to opensearch-project/documentation-website that referenced this pull request Mar 17, 2026
(cherry picked from commit 08ff56d)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@oeyh oeyh merged commit 6692596 into opensearch-project:main Mar 17, 2026
70 of 72 checks passed
@bagmarnikhil bagmarnikhil deleted the feature/substring-expression-functions branch May 21, 2026 21:14
aryasoni98 pushed a commit to aryasoni98/documentation-website that referenced this pull request Jun 29, 2026
…2094)

Signed-off-by: Arya Soni <aryasoni98@gmail.com>
kkondaka pushed a commit to kkondaka/kk-data-prepper-f2 that referenced this pull request Jul 1, 2026

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add substring expression functions (substringAfter, substringBefore, substringAfterLast, substringBeforeLast)

4 participants