[Feature-18070][Task] Add Amazon EMR Serverless task plugin #18069

Merged
SbloodyS merged 29 commits into apache:dev from norrishuang:dev on Apr 10, 2026

@norrishuang
Contributor

@norrishuang norrishuang commented Mar 14, 2026

Was this PR generated or assisted by AI?

YES. The implementation was assisted by AI (Claude) for code generation, with human review, testing and verification on a real AWS EMR Serverless environment.

Purpose of the pull request

Add a new task plugin for Amazon EMR Serverless, enabling users to submit, monitor, and cancel Spark/Hive jobs on EMR Serverless applications directly from DolphinScheduler workflows.
Unlike the existing EMR on EC2 task plugin which manages EC2-based clusters, EMR Serverless is a serverless runtime that requires no cluster infrastructure management and automatically scales compute resources on demand.
Close 18070

Brief change log

Backend (new module: dolphinscheduler-task-emr-serverless)

  • EmrServerlessTask — extends AbstractRemoteTask, implements submit/track/cancel lifecycle via AWS SDK v1 (StartJobRun, GetJobRun, CancelJobRun)
  • EmrServerlessParameters — task parameter model (applicationId, executionRoleArn, jobName, startJobRunRequestJson)
  • EmrServerlessTaskChannel / EmrServerlessTaskChannelFactory — SPI registration via @AutoService, registered as EMR_SERVERLESS
  • EmrServerlessTaskException — dedicated exception class
  • Authentication: reuses aws.emr.* config from aws.yaml, falls back to DefaultAWSCredentialsProviderChain
  • Supports failover recovery via appIds (jobRunId)
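
As a rough illustration of the submit/track/cancel lifecycle and the appIds-based failover listed above, here is a minimal, self-contained sketch. It does not use the AWS SDK: `JobRunClient`, `LifecycleSketch`, `ScriptedClient`, and the plain state strings are hypothetical stand-ins for the StartJobRun, GetJobRun, and CancelJobRun calls, not the plugin's actual code. It also assumes that appIds holds only the jobRunId.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the three EMR Serverless calls named in the change log.
interface JobRunClient {
    String startJobRun(String applicationId);                       // returns a jobRunId
    String getJobRunState(String applicationId, String jobRunId);   // may return null
    void cancelJobRun(String applicationId, String jobRunId);
}

final class LifecycleSketch {
    private static final Set<String> FINAL_STATES = Set.of("SUCCESS", "FAILED", "CANCELLED");

    private final JobRunClient client;
    private final String applicationId;
    private String appIds; // assumption: stores only the jobRunId

    LifecycleSketch(JobRunClient client, String applicationId, String recoveredAppIds) {
        this.client = client;
        this.applicationId = applicationId;
        this.appIds = recoveredAppIds;
    }

    // Submit a new job run unless a jobRunId was recovered after a failover.
    void submit() {
        if (appIds == null || appIds.isEmpty()) {
            appIds = client.startJobRun(applicationId);
        }
    }

    // Poll until a final state is seen; a real task would sleep between polls.
    String track(int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            String state = client.getJobRunState(applicationId, appIds);
            if (state != null && FINAL_STATES.contains(state)) {
                return state;
            }
        }
        throw new IllegalStateException("No final state after " + maxPolls + " polls");
    }

    // Cancel only when there is a job run to cancel.
    void cancel() {
        if (appIds != null && !appIds.isEmpty()) {
            client.cancelJobRun(applicationId, appIds);
        }
    }
}

// Scripted fake that replays a fixed sequence of states, for demonstration only.
final class ScriptedClient implements JobRunClient {
    private final Deque<String> states;
    int startCalls = 0;
    int cancelCalls = 0;

    ScriptedClient(String... states) {
        this.states = new ArrayDeque<>(List.of(states));
    }

    @Override
    public String startJobRun(String applicationId) {
        startCalls++;
        return "job-run-1";
    }

    @Override
    public String getJobRunState(String applicationId, String jobRunId) {
        // Keep returning the last scripted state once the script is exhausted.
        return states.size() > 1 ? states.poll() : states.peek();
    }

    @Override
    public void cancelJobRun(String applicationId, String jobRunId) {
        cancelCalls++;
    }
}
```

Constructing `LifecycleSketch` with a non-empty recovered value models the failover path: `submit()` then skips StartJobRun and tracking resumes on the existing job run.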

Frontend

  • use-emr-serverless.ts (fields) — form fields for Application Id, Execution Role Arn, Job Name, StartJobRunRequest JSON editor
  • use-emr-serverless.ts (tasks) — task model definition
  • Registered in task type constants, store, format-data, i18n (en_US/zh_CN)
  • Task icon (reuses EMR icon)

Documentation

  • Chinese doc: docs/docs/zh/guide/task/emr-serverless.md
  • English doc: docs/docs/en/guide/task/emr-serverless.md
  • Includes: overview, task parameters, Spark/Hive JSON examples, AWS auth config, job state transitions, screenshots

Verify this pull request

This change added tests and can be verified as follows:

  • Added EmrServerlessTaskTest with 11 unit tests covering: success/failed/cancelled lifecycle, full state chain, submit error handling, null GetJobRun response, cancel with/without jobRunId, failover recovery, parameter validation, and invalid JSON handling.
  • Manually verified by deploying to an EC2 instance in Standalone mode and successfully submitting a Spark job to a real AWS EMR Serverless application.

@boring-cyborg

boring-cyborg Bot commented Mar 14, 2026

Thanks for opening this pull request! Please check out our contributing guidelines. (https://github.com/apache/dolphinscheduler/blob/dev/docs/docs/en/contribute/join/pull-request.md)

@github-actions github-actions Bot added UI ui and front end related backend test document labels Mar 14, 2026
@norrishuang norrishuang changed the title [Feature][Task] Add Amazon EMR Serverless task plugin [Feature-18070][Task] Add Amazon EMR Serverless task plugin Mar 14, 2026
Member

@SbloodyS SbloodyS left a comment

Please add api-test or e2e for this. @norrishuang

@norrishuang
Contributor Author

Please add api-test or e2e for this. @norrishuang

Comprehensive unit tests have already been included for the EMR Serverless task plugin, covering job submission, state polling, success/failure/cancellation handling, failover recovery, parameter validation, and invalid input scenarios. Since this task plugin depends on AWS EMR Serverless, running api-test or e2e in the CI Docker environment would require AWS credentials and a running EMR Serverless application. I'm happy to add an api-test or e2e if there is a recommended approach for handling AWS authentication in CI. Could you share any guidance on this?

@norrishuang
Contributor Author

Thank you for the feedback @SbloodyS!

I have enhanced the unit tests to provide comprehensive coverage of the EMR Serverless task plugin. The test suite now includes 15 test cases covering:

  • Full job lifecycle: job submission → state polling → success/failure/cancelled
  • Exception handling: submission failures, polling failures, null job run responses
  • Cancel operation: cancel running job, cancel with empty jobRunId edge case
  • Failover recovery: restore job run ID from appIds after worker restart
  • Parameter validation: missing required fields, invalid JSON input
  • State mapping: all final states → exit code mapping
  • Application ID retrieval: getApplicationIds()

The tests use Mockito to mock EmrServerlessClient, following the same pattern as AliyunServerlessSparkTaskTest in the codebase. Since this plugin depends on AWS EMR Serverless, running actual e2e tests in the CI Docker environment would require AWS credentials and a running EMR Serverless application, which is not feasible in the standard CI setup.

Commit: norrishuang/dolphinscheduler@44f43eb

@SbloodyS
Member

Unit testing is not enough. You can refer to dolphinscheduler-api-test and dolphinscheduler-e2e modules. @norrishuang

@norrishuang
Contributor Author

Thank you for the guidance @SbloodyS!

I have added an api-test for the EMR Serverless task plugin. Since this plugin depends on AWS EMR Serverless (a cloud service), running actual e2e tests in CI would require real AWS credentials and a running EMR Serverless application. To solve this, I used WireMock to mock the AWS EMR Serverless HTTP API — it's open-source and works entirely offline.

What was added (commit: norrishuang/dolphinscheduler@b96944c):

  1. EmrServerlessTaskAPITest — api-test that exercises the full task execution flow via DolphinScheduler REST API:

    • Login → create project → import workflow definition → online workflow → trigger execution → assert success
  2. docker-compose.yaml — spins up DolphinScheduler standalone + WireMock:

    • WireMock mocks POST /applications/*/jobruns (StartJobRun) and GET /applications/*/jobruns/* (GetJobRun → SUCCESS)
    • DS connects to WireMock via EMR_SERVERLESS_ENDPOINT=http://wiremock:8080
  3. Fixed deprecated ObjectMapper.configure() calls by switching to JsonMapper.builder() pattern (addresses the CodeQL warning)

Please let me know if any adjustments are needed.

@github-actions github-actions Bot added the e2e e2e test label Mar 23, 2026
@SbloodyS
Member

SbloodyS commented Mar 23, 2026

Yes. Using WireMock is good for now. You can continue coding. @norrishuang

norrishuang added a commit to norrishuang/dolphinscheduler that referenced this pull request Mar 27, 2026
@norrishuang
Contributor Author

Hi @SbloodyS, I noticed the OWASP Dependency Check CI has been failing on the dev branch consistently (not just on this PR). Is this a known issue? Do I need to take any action on my side to get this PR reviewed?

@SbloodyS
Member

Hi @SbloodyS, I noticed the OWASP Dependency Check CI has been failing on the dev branch consistently (not just on this PR). Is this a known issue? Do I need to take any action on my side to get this PR reviewed?

You can just ignore it for now.

@norrishuang norrishuang requested a review from SbloodyS March 31, 2026 22:01
@github-actions github-actions Bot added the CI&CD label Apr 1, 2026
@norrishuang
Contributor Author

Hi @SbloodyS, thanks for the review comments! I've addressed all three points:

  1. CI registration — Added EmrServerlessTaskAPITest to .github/workflows/api-test.yml

  2. Failed case — Added a failed workflow test case (testEmrServerlessFailedWorkflowInstance) with a dedicated WireMock mapping (get-job-run-failed.json). To avoid URL pattern conflicts between the success and failure scenarios, I used distinct jobRunIds (test-job-run-id-success / test-job-run-id-failed) with precise URL pattern matching, so WireMock routes each request deterministically.

  3. EMR version — Updated releaseLabel to emr-7.12.0 in the WireMock mock responses.
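
The routing idea in point 2 can be illustrated without WireMock itself. The snippet below is only a sketch using `java.util.regex`: the class name, the `route` helper, and the application ID `app-1` in the usage note are hypothetical, and the regular expressions are an assumed equivalent of the URL patterns in the actual WireMock mapping files (such as get-job-run-failed.json), which are not shown in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

final class JobRunRouteSketch {
    // One precise pattern per scenario, so a request path can match at most one mapping.
    private static final Map<Pattern, String> ROUTES = new LinkedHashMap<>();

    static {
        ROUTES.put(Pattern.compile("/applications/[^/]+/jobruns/test-job-run-id-success"), "SUCCESS");
        ROUTES.put(Pattern.compile("/applications/[^/]+/jobruns/test-job-run-id-failed"), "FAILED");
    }

    // Returns the mocked job state for a GetJobRun path, or empty if nothing matches.
    static Optional<String> route(String path) {
        for (Map.Entry<Pattern, String> entry : ROUTES.entrySet()) {
            // matches() requires the whole path to match, not just a prefix.
            if (entry.getKey().matcher(path).matches()) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }
}
```

With these patterns, `route("/applications/app-1/jobruns/test-job-run-id-failed")` resolves to the failed mapping only, whereas a single wildcard such as `/applications/.*/jobruns/.*` would match both scenarios and leave the choice to mapping priority.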

All CI checks are now passing. Please take another look when you have a chance!

@norrishuang norrishuang requested a review from SbloodyS April 7, 2026 01:38
- Replace deprecated PropertyNamingStrategy.UpperCamelCaseStrategy with
  PropertyNamingStrategies.UPPER_CAMEL_CASE (fixes SonarCloud warning)
- Remove redundant applicationId field; read directly from emrServerlessParameters
- Store only jobRunId in appIds (applicationId always available from parameters)
- Simplify failover recovery: jobRunId = getAppIds() directly
- Remove @since dev-SNAPSHOT javadoc tag
- Update test to use new appIds format (jobRunId only)
@norrishuang
Contributor Author

Hi @SbloodyS, thank you for the detailed review! I've addressed all the feedback in the latest commit:

  1. UpperCamelCaseStrategy deprecated — Replaced with PropertyNamingStrategies.UPPER_CAMEL_CASE (Jackson 2.12+ API)

  2. Remove @since dev-SNAPSHOT — Removed the javadoc tag

  3. Redundant applicationId field — Removed the field; now reading directly from emrServerlessParameters.getApplicationId() throughout

  4. Store jobRunId directly — Changed setAppIds() to store only jobRunId (applicationId is always available from parameters). Updated failover recovery accordingly: jobRunId = getAppIds()

  5. SonarCloud ObjectMapper.configure warning — Already using JsonMapper.builder().configure(...) (the non-deprecated builder API); no ObjectMapper.configure(Feature, boolean) calls exist

  6. Empty suggestion on line 73 — Removed the @since dev-SNAPSHOT tag as suggested

All 15 unit tests pass. Please take another look when you have a chance.

@norrishuang norrishuang requested a review from SbloodyS April 8, 2026 12:25
@SbloodyS
Member

SbloodyS commented Apr 8, 2026

There are still many unaddressed comments. @norrishuang
#18069 (comment)
#18069 (comment)

- Use JobRunState enum constants in WAITING_STATES set
- Replace switch/case with JobRunState enum in mapStateToExitCode()
- Update EmrServerlessTaskTest to use JobRunState enum
- Apply spotless format fixes
- All 15 unit tests pass
- Remove unnecessary 'public' modifiers from @Test and @BeforeEach methods (S5786)
- Add no-op comments to empty TaskCallBack methods (S1186)
- Remove unnecessary 'throws Exception' declarations where not needed (S1130)
- All 15 unit tests pass
@norrishuang
Contributor Author

Hi @SbloodyS, I've addressed all the remaining review comments. Replaced hardcoded state strings with JobRunState enum constants throughout (WAITING_STATES, mapStateToExitCode(), and unit tests).
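
For readers following along, the enum-based approach might look roughly like the sketch below. It is not the merged code: the nested `JobRunState` enum is a local stand-in for the SDK type, and the membership of `WAITING_STATES`, the exit-code values, and the mapping of CANCELLED to a kill code are all assumptions made for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

final class StateMappingSketch {
    // Local stand-in for the SDK's JobRunState enum, so the sketch compiles on its own.
    enum JobRunState { SUBMITTED, PENDING, SCHEDULED, RUNNING, SUCCESS, FAILED, CANCELLING, CANCELLED }

    // Assumed exit-code values; the real constants live in DolphinScheduler, not here.
    static final int EXIT_CODE_SUCCESS = 0;
    static final int EXIT_CODE_FAILURE = -1;
    static final int EXIT_CODE_KILL = 137;

    // Assumption: every non-final state counts as "still waiting".
    static final Set<JobRunState> WAITING_STATES = EnumSet.of(
            JobRunState.SUBMITTED,
            JobRunState.PENDING,
            JobRunState.SCHEDULED,
            JobRunState.RUNNING,
            JobRunState.CANCELLING);

    // Maps a final state to an exit code using enum constants instead of raw strings.
    static int mapStateToExitCode(JobRunState state) {
        if (state == null) {
            return EXIT_CODE_FAILURE;
        }
        switch (state) {
            case SUCCESS:
                return EXIT_CODE_SUCCESS;
            case CANCELLED:
                return EXIT_CODE_KILL;
            default:
                return EXIT_CODE_FAILURE;
        }
    }
}
```

Comparing enum constants lets the compiler catch a misspelled state name, which a hardcoded string such as "SUCESS" would not.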

Could you take another look? Thanks!

@SbloodyS SbloodyS added this to the 3.4.2 milestone Apr 9, 2026
@SbloodyS SbloodyS added the feature new feature label Apr 9, 2026
- Add WorkflowInstancePage to EmrServerlessTaskAPITest
- Poll workflow instance state in success test: assert final state is SUCCESS
- Poll workflow instance state in failed test: assert final state is FAILURE/STOP
- Pattern follows DependentTaskAPITest
@norrishuang norrishuang requested a review from SbloodyS April 9, 2026 12:44
norrishuang added a commit to norrishuang/dolphinscheduler that referenced this pull request Apr 9, 2026
- Remove deprecated PropertyNamingStrategies.UPPER_CAMEL_CASE usage
- Remove empty javadoc line
- Fix infinite loop in API tests with 120s timeout
- Reduce polling interval from 5s to 2s
- Add explicit workflow instance final state assertions

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove deprecated PropertyNamingStrategies.UPPER_CAMEL_CASE usage
- Remove empty javadoc line
- Fix infinite loop in API tests with 120s timeout
- Reduce polling interval from 5s to 2s
- Add explicit workflow instance final state assertions
- Use throws Exception instead of try-catch to propagate errors clearly
- Replace while(true) with for-loop (60 iterations x 2s = 120s timeout)
- Use boolean completed flag with explicit assertTrue assertion
- Fail immediately with state info when unexpected terminal state reached
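
The bounded polling described in this commit message could be sketched as follows. This is an illustration under assumptions, not the test code from the PR: `BoundedPollSketch`, `awaitState`, and `isTerminal` are hypothetical names, the state is supplied by a plain `Supplier<String>` instead of the workflow instance API, and the set of terminal state names is assumed.

```java
import java.util.function.Supplier;

final class BoundedPollSketch {
    // Polls up to maxAttempts times; with 60 attempts and a 2000 ms interval that is about 120 s.
    static boolean awaitState(Supplier<String> currentState, String expected,
                              int maxAttempts, long intervalMillis) {
        boolean completed = false;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String state = currentState.get();
            if (expected.equals(state)) {
                completed = true;
                break;
            }
            // Fail fast, with the observed state, instead of waiting out the full timeout.
            if (isTerminal(state)) {
                throw new IllegalStateException("Unexpected terminal state: " + state);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
        return completed;
    }

    // Assumed terminal state names for a workflow instance.
    static boolean isTerminal(String state) {
        return "SUCCESS".equals(state) || "FAILURE".equals(state) || "STOP".equals(state);
    }
}
```

A test would then assert on the returned flag, for example `assertTrue(awaitState(stateSupplier, "SUCCESS", 60, 2000L))`, so a timeout shows up as a failed assertion and not as a hung CI job.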
@norrishuang norrishuang requested a review from SbloodyS April 10, 2026 01:41
Member

@SbloodyS SbloodyS left a comment

LGTM

@SbloodyS SbloodyS added the first time contributor First-time contributor label Apr 10, 2026
@sonarqubecloud

Quality Gate failed

Failed conditions
0.0% Coverage on New Code (required ≥ 60%)

See analysis details on SonarQube Cloud

@SbloodyS SbloodyS merged commit 8077c2b into apache:dev Apr 10, 2026
122 of 124 checks passed
@boring-cyborg

boring-cyborg Bot commented Apr 10, 2026

Awesome work, congrats on your first merged pull request!

Labels

backend CI&CD document e2e e2e test feature new feature first time contributor First-time contributor test UI ui and front end related

Development

Successfully merging this pull request may close these issues.

[Feature][Task] Support Amazon EMR Serverless task plugin

3 participants