ENH: encryption extension #5581

Merged
chenqi0805 merged 110 commits into opensearch-project:main from chenqi0805:enh/encryption-extension-3
May 29, 2025

Conversation

Collaborator

@chenqi0805 chenqi0805 commented Apr 2, 2025

Description

This PR

  • implements the encryption pipeline extension
  • supports the KMS encryption engine in the extension
  • supports encryption key rotation by
    • periodically polling for a new key stored in either an S3 folder path or a local file directory
    • allowing the user to trigger encrypted data key rotation by calling the /encryption/rotate API on the Data Prepper server
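
The polling half of the rotation described above could be sketched roughly as follows. This is only an illustration for the local-directory case, not the code merged in this PR: `DirectoryKeyPoller`, its method names, and the rule that the file whose name sorts last is the newest key are all assumptions made for the example. An on-demand trigger (such as a handler behind /encryption/rotate) could plausibly call the same `rotate()` method, but the PR description does not confirm how the two paths are wired.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.stream.Stream;

/** Hypothetical helper (not a class from this PR): tracks the newest key file in a directory. */
class DirectoryKeyPoller implements AutoCloseable {
    private final Path keyDirectory;
    private final AtomicReference<String> currentKey = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "key-poller");
                thread.setDaemon(true); // do not keep the JVM alive just for polling
                return thread;
            });

    DirectoryKeyPoller(final Path keyDirectory) {
        this.keyDirectory = keyDirectory;
    }

    /** Reads the key file whose name sorts last (assumed newest) and makes it current. */
    void rotate() throws IOException {
        try (Stream<Path> files = Files.list(keyDirectory)) {
            final Optional<Path> newest = files
                    .filter(Files::isRegularFile)
                    .max(Comparator.comparing((Path path) -> path.getFileName().toString()));
            if (newest.isPresent()) {
                currentKey.set(Files.readString(newest.get()).trim());
            }
        }
    }

    /** Polls on a fixed schedule; a failed poll keeps the previously loaded key. */
    void start(final long period, final TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                rotate();
            } catch (IOException | UncheckedIOException e) {
                System.err.println("Key poll failed, keeping current key: " + e.getMessage());
            }
        }, 0, period, unit);
    }

    /** Returns the most recently loaded key, or empty if none has been loaded yet. */
    Optional<String> currentKey() {
        return Optional.ofNullable(currentKey.get());
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Holding the key in an `AtomicReference` lets readers pick up a rotated key without locking, and swallowing poll failures means a transient read error does not discard a working key.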

Issues Resolved

Resolves #5335
Resolves #5336

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

chenqi0805 and others added 30 commits April 2, 2025 14:19
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…nsearch-project#5272)

Signed-off-by: RashmiRam <ras.xena@gmail.com>
Signed-off-by: George Chen <qchea@amazon.com>
…t#5344)

* Handling end to end acknowledgement

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

* introduced boolean to control end to end Acknowledgment

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

* acknowledgments on case

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

---------

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
* add batch size field for jira source

Signed-off-by: Maxwell Brown <mxwelwbr@amazon.com>

* remove unused config fields

Signed-off-by: Maxwell Brown <mxwelwbr@amazon.com>

* add interface function to simplify batchSize code

Signed-off-by: Maxwell Brown <mxwelwbr@amazon.com>

* default batch size comments

Signed-off-by: Maxwell Brown <mxwelwbr@amazon.com>

---------

Signed-off-by: Maxwell Brown <mxwelwbr@amazon.com>
Signed-off-by: Maxwell Brown <55033421+Galactus22625@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
…ject#5358)

Signed-off-by: Taylor Gray <tylgry@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…project#5310)

* First working version

Signed-off-by: Hai Yan <oeyh@amazon.com>

* More progress and update existing unit tests

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Add unit tests

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Remove and rename classes

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Remove test code

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Address review comments

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Address minor issues

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Group MySQL and Postgres stream states

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Address more comments

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Fix Java21 build

Signed-off-by: Hai Yan <oeyh@amazon.com>

---------

Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
* Handling end to end acknowledgement

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

* checking pointing leader state for every one minute

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

* corresponding test cases fix

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

---------

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
…roject#5362)

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Taylor Gray <tylgry@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…arch-project#5320)

* lambda processor should retry for certain class of exceptions

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Address Comment on complete codec

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Add retryCondidition to lambda Client

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Address comments

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Address comments and add UT and IT

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Address comment on completeCodec

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

---------

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…partition.assignment.strategy, close consumer on shutdown (opensearch-project#5373)

Signed-off-by: Taylor Gray <tylgry@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Improve Jira logging

Signed-off-by: Maxwell Brown <mxwelwbr@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
* injectable plugin metrics

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

* removed an unused parameter

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

* fixing a flaky test

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>

---------

Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…pensearch-project#5361)

* initial refactoring

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

* refactored sqs-source to use sqs-common

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

* refactored SqsWorker to use the common library

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

* minor changes

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

* another small fix

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

* added unit tests for sqs-common

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

* updated tests

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>

---------

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>
Co-authored-by: Jeremy Michael <jsusanto@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
* schema revisions, add json aliases

Signed-off-by: Katherine Shen <katshen@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…pensearch-project#5375)

* Initial commit

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Update unit tests

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Add more metrics

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Add more tests

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Address review comments

Signed-off-by: Hai Yan <oeyh@amazon.com>

* Address review comments

Signed-off-by: Hai Yan <oeyh@amazon.com>

---------

Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
* Add cloudwatch logs sink

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>

* Addressed review comments

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>

---------

Signed-off-by: Krishna Kondaka <krishkdk@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…bles. (opensearch-project#5417)

Signed-off-by: David Venable <dlv@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…m from the CODEOWNERS, so this keeps these in sync. (opensearch-project#5419)

Signed-off-by: David Venable <dlv@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…DelayTimer Metric for Auto-Scaling (opensearch-project#5409)

Signed-off-by: Jeremy Michael <jsusanto@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Hai Yan <oeyh@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
… messages that have been received many times (opensearch-project#5408)

Signed-off-by: Taylor Gray <tylgry@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
…opensearch-project#5420)

Signed-off-by: David Venable <dlv@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
* Zero Buffer Implementation and Tests

Signed-off-by: Mohammed Aghil Puthiyottil <57040494+MohammedAghil@users.noreply.github.com>

* Moved ZeroBuffer Implementation into data-prepper-core and addressed comments

Signed-off-by: Mohammed Aghil Puthiyottil <57040494+MohammedAghil@users.noreply.github.com>

* Modified ZeroBufferTests to use MockitoExtension and addressed comments

Signed-off-by: Mohammed Aghil Puthiyottil <57040494+MohammedAghil@users.noreply.github.com>

---------

Signed-off-by: Mohammed Aghil Puthiyottil <57040494+MohammedAghil@users.noreply.github.com>
Signed-off-by: George Chen <qchea@amazon.com>
* Fix merge conflict

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Address concurrency/synchronization comment

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Fix InMemoryBufferSynchronized and Add IT

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Address timeout threshold comment

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Add IT for timeout threshold

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

* Fix checkstyle

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>

---------

Signed-off-by: Srikanth Govindarajan <srigovs@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
implementation libs.commons.io
implementation libs.caffeine
implementation 'org.hibernate.validator:hibernate-validator:8.0.2.Final'
testImplementation(platform("org.junit:junit-bom:5.10.0"))
Member

You don't need these two lines.

testImplementation("org.junit.jupiter:junit-jupiter")
}

test {
Member

You don't need this.

* SPDX-License-Identifier: Apache-2.0
*/

plugins {
Member

You don't need this block.


package org.opensearch.dataprepper.model.encryption;

import com.sun.net.httpserver.HttpHandler;
Member

Let's find a way to keep this out of data-prepper-api.

A simple solution would be to create a coupling between data-prepper-core and extension-plugin. I recommend this over exposing this in data-prepper-api. Longer-term we can have a better approach for extending the server.

this.awsCredentialsConfig = awsCredentialsConfig;
}

public AwsCredentialsProvider getOrDefault() {
Member

This code looks similar to what we already support in the aws-plugin.

Collaborator Author

This class was refactored out from kafka-plugins and I did not see similar code in aws-plugin. The plan is to replace those classes in kafka-plugins with the counterparts here.

Or do you think it is better to refactor it into aws-plugin and let encryption-plugin depend on aws-plugin?

Collaborator

@oeyh oeyh May 12, 2025

Similar methods are provided by the AwsCredentialsSupplier interface, with the detailed logic handled through CredentialsProviderFactory.

The Kafka plugins also use AwsCredentialsSupplier in AwsContext.

Collaborator Author

I refactored AwsContext into aws-plugin-api so that it can be reused by other plugins.

Collaborator

@dlvenable Can you also take a look? I'm good with other changes. Approving.

Signed-off-by: George Chen <qchea@amazon.com>
@chenqi0805 chenqi0805 requested a review from dlvenable April 15, 2025 15:32
Collaborator

@oeyh oeyh left a comment

Some comments on minor issues below.

One question: when the data key rotates, how does Data Prepper track which data key was used to encrypt a given piece of data, so that it knows which data key to use for decryption?

import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;

class EncryptionContext {
Collaborator

Is there a reason this class is not public?

Collaborator Author

This class does not need to be visible to other plugins.

Signed-off-by: George Chen <qchea@amazon.com>
…on-3

Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
oeyh previously approved these changes May 20, 2025
Member

@dlvenable dlvenable left a comment

Thank you @chenqi0805 !


import java.util.UUID;

public class AwsContext {
Member

This class seems decoupled from the rest of the AWS plugin.

Maybe the AwsPlugin should provide this in a new provider.

public void apply(final ExtensionPoints extensionPoints) {
  var extensionsProvider = new AwsExtensionProvider(defaultAwsCredentialsSupplier);
  // register the same provider instance that is handed to AwsContext below
  extensionPoints.addExtensionProvider(extensionsProvider);
  // new
  extensionPoints.addExtensionProvider(new AwsContext(extensionsProvider));
}

Now, it might look like this:

public interface AwsContext {
  AwsCredentialsProvider getOrDefault();
  Region getRegionOrDefault();
}

Collaborator Author

Discussed offline. As an intermediate step to unblock the PR, I made AwsContext an interface and AwsContextImpl a temporary implementation. Once issue #2825 is resolved, the AwsContext bean will be provided by aws-plugin.
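
For readers following the thread, the interface-plus-temporary-implementation split might look roughly like the sketch below. It is not the merged code: `CredentialsProvider` and the `String` region are stand-ins for the AWS SDK types named in the review (`AwsCredentialsProvider`, `Region`) so the example compiles without the SDK, and the constructor arguments and fallback rule are assumptions.

```java
import java.util.Objects;

/** Stand-in for the AWS SDK credentials provider type, so the sketch compiles without the SDK. */
interface CredentialsProvider {
    String name();
}

/** Same shape as the interface proposed in the review, with stand-in return types. */
interface AwsContext {
    CredentialsProvider getOrDefault();

    String getRegionOrDefault();
}

/** Hypothetical temporary implementation: configured values win, defaults fill the gaps. */
final class AwsContextImpl implements AwsContext {
    private final CredentialsProvider configuredProvider; // null when not configured
    private final String configuredRegion;                // null when not configured
    private final CredentialsProvider defaultProvider;
    private final String defaultRegion;

    AwsContextImpl(final CredentialsProvider configuredProvider, final String configuredRegion,
                   final CredentialsProvider defaultProvider, final String defaultRegion) {
        this.configuredProvider = configuredProvider;
        this.configuredRegion = configuredRegion;
        this.defaultProvider = Objects.requireNonNull(defaultProvider);
        this.defaultRegion = Objects.requireNonNull(defaultRegion);
    }

    @Override
    public CredentialsProvider getOrDefault() {
        return configuredProvider != null ? configuredProvider : defaultProvider;
    }

    @Override
    public String getRegionOrDefault() {
        return configuredRegion != null ? configuredRegion : defaultRegion;
    }
}
```

Because callers depend only on the `AwsContext` interface, the temporary implementation could later be swapped for one supplied by aws-plugin without touching the plugins that consume it.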

Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: George Chen <qchea@amazon.com>
@chenqi0805 chenqi0805 requested a review from dlvenable May 29, 2025 20:39
Member

@dlvenable dlvenable left a comment

Thanks @chenqi0805 !

@chenqi0805 chenqi0805 requested a review from oeyh May 29, 2025 20:54
@chenqi0805 chenqi0805 merged commit d568fa4 into opensearch-project:main May 29, 2025
69 of 74 checks passed
@chenqi0805 chenqi0805 deleted the enh/encryption-extension-3 branch May 29, 2025 21:11
jeffreyAaron pushed a commit to jeffreyAaron/data-prepper that referenced this pull request Jun 13, 2025
ADD: encryption extension

Signed-off-by: George Chen <qchea@amazon.com>
Signed-off-by: Jeffrey Aaron Jeyasingh <jeffreyaaron06@gmail.com>
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support loading encryption keys from an S3 bucket
Encryption extension for client-side encryption