Commit f59ba03

Merge branch 'dev' into fix-18074-sql-parameter-type-passing
2 parents b3268c1 + c9e373e commit f59ba03

126 files changed

Lines changed: 2293 additions & 963 deletions

.github/ISSUE_TEMPLATE/bug-report.yml

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -104,6 +104,7 @@ body:
104104
- 3.3.2
105105
- 3.4.0
106106
- 3.4.1
107+
- 3.4.2
107108
validations:
108109
required: true
109110

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -60,3 +60,4 @@ ds_schema_check_test
6060
docs/superpowers/
6161
.claude/worktrees
6262
CLAUDE.local.md
63+
AGENT.local.md

AGENT.md

Lines changed: 161 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,161 @@
1+
# AGENT.md - Apache DolphinScheduler
2+
3+
Apache DolphinScheduler is a distributed, visual DAG workflow-scheduling platform. This is the monorepo: backend servers (master / worker / api / alert), a Vue 3 frontend, plugin families for tasks / datasources / storage / alerting / scheduling, and the release tooling.
4+
5+
**This file is an agent-facing project index, adapted from `CLAUDE.md`.** Module-specific details currently live in each module's `CLAUDE.md`; use those files as the source of truth and do not duplicate module contents here.
6+
7+
---
8+
9+
## Tech stack (project-wide)
10+
11+
- **Java 1.8** (do not assume 11+ APIs; `dolphinscheduler-api-test` is the only Java 11 island).
12+
- **Spring Boot 2.6.1** across servers, **Jetty** (Tomcat is excluded transitively).
13+
- **MyBatis-Plus** for ORM; **HikariCP** for the metadata DB pool, **Druid** inside user-facing datasource plugins.
14+
- **Quartz** for cron scheduling (via `scheduler-plugin`).
15+
- **Netty / gRPC** for inter-server RPC (see `extract-base`).
16+
- **Vue 3 + Vite + TypeScript + Naive UI** for the frontend.
17+
- **Maven** multi-module reactor (26 modules in root `pom.xml` + 2 test modules).
18+
- **Zookeeper 3.8** by default for the registry (Etcd and JDBC also supported).
19+
20+
## Runnable services
21+
22+
A production deployment runs **four independent services** (plus an external registry and metadata DB). A fifth entry point, `StandaloneServer`, embeds all four in one JVM for development.
23+
24+
| Service | Module | Main class | Default ports |
25+
|---------|--------|------------|---------------|
26+
| **API** | [`dolphinscheduler-api`](dolphinscheduler-api/CLAUDE.md) | `org.apache.dolphinscheduler.api.ApiApplicationServer` | `12345` (HTTP / UI + REST) |
27+
| **Master** | [`dolphinscheduler-master`](dolphinscheduler-master/CLAUDE.md) | `org.apache.dolphinscheduler.server.master.MasterServer` | `5679` (RPC) |
28+
| **Worker** | [`dolphinscheduler-worker`](dolphinscheduler-worker/CLAUDE.md) | `org.apache.dolphinscheduler.server.worker.WorkerServer` | `1235` (RPC) |
29+
| **Alert** | [`dolphinscheduler-alert`](dolphinscheduler-alert/CLAUDE.md) (to `-alert-server`) | `org.apache.dolphinscheduler.alert.AlertServer` | `50053` (HTTP), `50052` (RPC) |
30+
| Standalone (dev only) | [`dolphinscheduler-standalone-server`](dolphinscheduler-standalone-server/CLAUDE.md) | `org.apache.dolphinscheduler.StandaloneServer` | `12345` + `50052` (API + alert; master/worker use in-JVM calls) |
31+
32+
Every service is a `@SpringBootApplication` on Jetty and implements `IStoppable`. Scale Master / Worker / Alert horizontally; coordination happens via the registry (Zookeeper by default). API is stateless and also scales horizontally behind a load balancer.
33+
34+
Ports are overridable via `server.port` / service-specific keys in each service's `application.yaml`.
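
Because every default port in the table can be overridden, co-locating services on one host can produce collisions. A minimal sketch of a collision check, assuming the defaults from the table above; the `(service, listener)` keys and both function names are hypothetical and are not DolphinScheduler configuration keys:

```python
from collections import defaultdict

# Defaults copied from the "Runnable services" table.
# The (service, listener) tuples are illustrative labels only.
DEFAULT_PORTS = {
    ("api", "http"): 12345,
    ("master", "rpc"): 5679,
    ("worker", "rpc"): 1235,
    ("alert", "http"): 50053,
    ("alert", "rpc"): 50052,
}


def effective_ports(overrides=None):
    """Return the default ports with any per-listener overrides applied."""
    ports = dict(DEFAULT_PORTS)
    for listener, port in (overrides or {}).items():
        if listener not in DEFAULT_PORTS:
            raise KeyError(f"unknown listener: {listener!r}")
        ports[listener] = port
    return ports


def find_conflicts(ports):
    """Map each port claimed by more than one listener to those listeners."""
    by_port = defaultdict(list)
    for listener, port in ports.items():
        by_port[port].append(listener)
    return {p: sorted(ls) for p, ls in by_port.items() if len(ls) > 1}
```

With no overrides `find_conflicts(effective_ports())` is empty; moving the worker RPC listener onto `5679` would report a clash with the master.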
35+
36+
## Build & run
37+
38+
```bash
39+
# Full build (release profile; produces dist tarball)
40+
./mvnw clean install -Prelease
41+
42+
# Zookeeper 3.4 legacy
43+
./mvnw clean install -Prelease -Dzk-3.4
44+
45+
# Skip UI build (faster iteration on backend only)
46+
./mvnw -pl '!dolphinscheduler-ui' clean install
47+
48+
# Build one module (+ its required siblings)
49+
./mvnw -pl dolphinscheduler-master -am clean install
50+
51+
# Format (Spotless is configured)
52+
./mvnw spotless:apply
53+
54+
# Standalone server (after building)
55+
cd dolphinscheduler-standalone-server/target && ./bin/start.sh
56+
```
57+
58+
Binary artifact: `dolphinscheduler-dist/target/apache-dolphinscheduler-*-bin.tar.gz`.
59+
60+
## Test
61+
62+
```bash
63+
# Unit tests for one module
64+
./mvnw -pl dolphinscheduler-master test
65+
66+
# API integration tests (separate reactor, requires Docker)
67+
mvn -pl dolphinscheduler-api-test/dolphinscheduler-api-test-case test
68+
69+
# E2E browser tests (Selenium + Docker)
70+
mvn -pl dolphinscheduler-e2e/dolphinscheduler-e2e-case test
71+
72+
# Apple Silicon: add -Dm1_chip=true to the Docker-driven suites
73+
```
74+
75+
---
76+
77+
## Module index
78+
79+
Click into a module's `CLAUDE.md` for details. Each description is one line here on purpose.
80+
81+
### Core execution
82+
83+
- [`dolphinscheduler-master`](dolphinscheduler-master/CLAUDE.md) - workflow orchestration engine; consumes `Command`s, runs the DAG state machine, dispatches to workers.
84+
- [`dolphinscheduler-worker`](dolphinscheduler-worker/CLAUDE.md) - runs physical tasks dispatched from master; hosts task plugins.
85+
- [`dolphinscheduler-task-executor`](dolphinscheduler-task-executor/CLAUDE.md) - reusable task-lifecycle framework embedded by the worker.
86+
- [`dolphinscheduler-alert`](dolphinscheduler-alert/CLAUDE.md) - alert server + channel plugins (email, Feishu, DingTalk, ...).
87+
88+
### API layer
89+
90+
- [`dolphinscheduler-api`](dolphinscheduler-api/CLAUDE.md) - REST API server (entry point for UI, Python SDK, external clients).
91+
- [`dolphinscheduler-api-test`](dolphinscheduler-api-test/CLAUDE.md) - integration tests against the REST API (Docker Compose + Testcontainers).
92+
- [`dolphinscheduler-authentication`](dolphinscheduler-authentication/CLAUDE.md) - Actuator-endpoint auth + AWS credential helpers (NOT the main login path).
93+
94+
### Shared libraries
95+
96+
- [`dolphinscheduler-common`](dolphinscheduler-common/CLAUDE.md) - foundation utilities (everything depends on this).
97+
- [`dolphinscheduler-dao`](dolphinscheduler-dao/CLAUDE.md) - MyBatis DAO layer + SQL migration scripts.
98+
- [`dolphinscheduler-service`](dolphinscheduler-service/CLAUDE.md) - business logic between DAO and the servers.
99+
- [`dolphinscheduler-spi`](dolphinscheduler-spi/CLAUDE.md) - Service-Provider Interface root (every plugin depends on this).
100+
- [`dolphinscheduler-extract`](dolphinscheduler-extract/CLAUDE.md) - RPC interface contracts between servers.
101+
- [`dolphinscheduler-eventbus`](dolphinscheduler-eventbus/CLAUDE.md) - in-process event-bus abstractions.
102+
- [`dolphinscheduler-registry`](dolphinscheduler-registry/CLAUDE.md) - pluggable registry (Zookeeper / Etcd / JDBC).
103+
- [`dolphinscheduler-meter`](dolphinscheduler-meter/CLAUDE.md) - metrics (Prometheus) + server load-protection primitives.
104+
105+
### Plugin families
106+
107+
- [`dolphinscheduler-task-plugin`](dolphinscheduler-task-plugin/CLAUDE.md) - task-type plugins (shell, SQL, Spark, Flink, K8s, EMR, ...). 33 concrete plugins.
108+
- [`dolphinscheduler-datasource-plugin`](dolphinscheduler-datasource-plugin/CLAUDE.md) - user-facing datasource plugins (MySQL, Hive, Trino, Snowflake, ...). 28 concrete plugins.
109+
- [`dolphinscheduler-storage-plugin`](dolphinscheduler-storage-plugin/CLAUDE.md) - resource storage (S3, HDFS, OSS, GCS, ABS, OBS, COS).
110+
- [`dolphinscheduler-scheduler-plugin`](dolphinscheduler-scheduler-plugin/CLAUDE.md) - cron scheduler (Quartz today).
111+
- [`dolphinscheduler-dao-plugin`](dolphinscheduler-dao-plugin/CLAUDE.md) - metadata-DB dialect support (MySQL / PostgreSQL / H2).
112+
113+
### Build, ops, tools
114+
115+
- [`dolphinscheduler-bom`](dolphinscheduler-bom/CLAUDE.md) - Maven BOM; central dependency version pinning.
116+
- [`dolphinscheduler-dist`](dolphinscheduler-dist/CLAUDE.md) - assembles the release tarball + Docker images.
117+
- [`dolphinscheduler-standalone-server`](dolphinscheduler-standalone-server/CLAUDE.md) - all-in-one JVM with H2 (dev / smoke tests).
118+
- [`dolphinscheduler-tools`](dolphinscheduler-tools/CLAUDE.md) - CLIs for schema upgrade + resource / lineage migration.
119+
- [`dolphinscheduler-microbench`](dolphinscheduler-microbench/CLAUDE.md) - JMH micro-benchmarks.
120+
- [`dolphinscheduler-yarn-aop`](dolphinscheduler-yarn-aop/CLAUDE.md) - AspectJ weaver capturing YARN ApplicationIds.
121+
122+
### Frontend & E2E
123+
124+
- [`dolphinscheduler-ui`](dolphinscheduler-ui/CLAUDE.md) - Vue 3 frontend.
125+
- [`dolphinscheduler-e2e`](dolphinscheduler-e2e/CLAUDE.md) - Selenium browser tests.
126+
127+
---
128+
129+
## Architecture overview
130+
131+
A **user** hits the UI, which calls the API server. The API server writes to the **metadata DB** and, for runtime operations (start / kill / pause workflow), talks to the **master** over RPC. The master consumes `t_ds_command` rows, runs the workflow state machine, and dispatches tasks to **workers**. Workers execute task plugins (shell, SQL, Spark, ...) and stream lifecycle events back to master. Failures and SLA breaches flow to the **alert server**, which fans out through alert plugins. **Registry** (Zookeeper / Etcd / JDBC) provides service discovery, leader election, and distributed locks. **Storage plugins** back the resource center and distributed-task artifacts. **Quartz** (via scheduler plugin) fires scheduled workflows, which become new `Command` rows.
132+
133+
## Where things live (quick lookup)
134+
135+
| Looking for... | Start here |
136+
|----------------|------------|
137+
| A REST endpoint | `dolphinscheduler-api/src/main/java/.../api/controller/` |
138+
| Workflow execution logic | `dolphinscheduler-master/src/main/java/.../server/master/engine/` |
139+
| Task execution logic | `dolphinscheduler-worker` + the specific `task-plugin/<type>` |
140+
| How "X" is stored | `dolphinscheduler-dao/src/main/java/.../dao/entity/` |
141+
| SQL schema / upgrade | `dolphinscheduler-dao/src/main/resources/sql/` |
142+
| RPC contract between servers | `dolphinscheduler-extract/dolphinscheduler-extract-<role>` |
143+
| UI page source | `dolphinscheduler-ui/src/views/<feature>/` |
144+
| API call in the UI | `dolphinscheduler-ui/src/service/modules/<resource>.ts` |
145+
| Version of a dependency | `dolphinscheduler-bom/pom.xml` |
146+
147+
## Project-wide conventions
148+
149+
- **Formatting**: Run `./mvnw spotless:apply` before every commit/push. Spotless covers Java sources, `pom.xml`, and Markdown files; CI runs `./mvnw spotless:check` and will fail PRs that are not formatted. Java imports are ordered; license headers are enforced.
150+
- **Commit style**: `[Type-ISSUE_ID][Scope] Subject`, e.g. `[Fix-18168][Worker] ...`. All types except `Chore` require an issue ID. See [commit-message.md](docs/docs/en/contribute/join/commit-message.md) for the full convention.
151+
- **Branching**: `dev` is the main integration branch (not `main`/`master`).
152+
- **PRs must link a GitHub issue** and keep their scope tight: one module / one concern. For `Chore` commits, no issue ID is required by the commit convention.
153+
- **Do not break wire / DB compatibility** silently. Changes to `extract-*` RPC interfaces, `dao` entities, enum values, and `spi.DbType` ripple to deployed clusters mid-upgrade.
154+
- **Only one registry / storage / DB dialect is active at runtime**. Code paths that check "which one" belong inside the plugin SPI, not sprinkled through services.
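
The commit-style rule in the list above can be checked mechanically. The following is a rough sketch only: the regular expression is an assumption derived from the single example `[Fix-18168][Worker] ...`, the function name is hypothetical, and the authoritative list of types lives in `commit-message.md`, not here.

```python
import re

# Assumed shape: [Type-ISSUE_ID][Scope] Subject, with the issue ID optional
# at the syntax level so the Chore exception can be handled separately.
COMMIT_RE = re.compile(
    r"^\[(?P<type>[A-Za-z]+)(?:-(?P<issue>\d+))?\]"
    r"\[(?P<scope>[^\[\]]+)\] (?P<subject>\S.*)$"
)


def check_commit_subject(line):
    """Return None if the subject line passes, else a short reason string."""
    m = COMMIT_RE.match(line)
    if m is None:
        return "does not match [Type-ISSUE_ID][Scope] Subject"
    # Per the convention above, only Chore may omit the issue ID.
    if m.group("issue") is None and m.group("type") != "Chore":
        return f"type {m.group('type')} requires an issue ID"
    return None
```

Under these assumptions `[Chore][CI] Bump action version` passes, while `[Fix][Worker] Handle kill` is rejected for lacking an issue ID.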
155+
156+
## External references
157+
158+
- Release docs (version-specific): https://dolphinscheduler.apache.org/en-us/docs
159+
- GitHub issues: https://github.com/apache/dolphinscheduler/issues
160+
- Python SDK: https://dolphinscheduler.apache.org/python/main/index.html
161+
- Contribution guide: [`docs/docs/en/contribute/join/contribute.md`](docs/docs/en/contribute/join/contribute.md)

AGENTS.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
<!--
2+
SPDX-License-Identifier: Apache-2.0
3+
4+
Licensed under the Apache License, Version 2.0 (the "License");
5+
you may not use this file except in compliance with the License.
6+
You may obtain a copy of the License at
7+
8+
https://www.apache.org/licenses/LICENSE-2.0
9+
10+
Unless required by applicable law or agreed to in writing, software
11+
distributed under the License is distributed on an "AS IS" BASIS,
12+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
See the License for the specific language governing permissions and
14+
limitations under the License.
15+
-->
16+
17+
# Agent Guide for dolphinscheduler
18+
19+
This file is read by automated agents (security scanners, code
20+
analyzers, AI assistants) operating on this repository.
21+
22+
## Security
23+
24+
Security model: [SECURITY.md](./SECURITY.md)
25+
26+
Agents that scan this repository should consult `SECURITY.md` and the
27+
threat model it links before reporting issues.
28+
29+
DolphinScheduler's security model defines user roles, trust boundaries, and known non-findings.

SECURITY.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
# Security Policy
2+
3+
## Reporting a Vulnerability
4+
5+
`apache/dolphinscheduler` follows the [Apache Software Foundation security process](https://www.apache.org/security/). Please report suspected
6+
vulnerabilities privately to `security@apache.org`; do not open public
7+
GitHub issues or pull requests for security reports.
8+
9+
## Threat Model
10+
11+
What the project treats as in scope and out of scope, the security
12+
properties it provides and disclaims, the adversary model, and how
13+
findings are triaged are documented in https://github.com/apache/dolphinscheduler/blob/dev/docs/docs/en/contribute/join/security-model.md.

docs/configs/index.md.jsx

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,8 @@ import docs330Config from '../../../site_config/docs3-3-0-alpha';
6868
import docs331Config from '../../../site_config/docs3-3-1';
6969
import docs332Config from '../../../site_config/docs3-3-2';
7070
import docs340Config from '../../../site_config/docs3-4-0';
71+
import docs341Config from '../../../site_config/docs3-4-1';
72+
import docs342Config from '../../../site_config/docs3-4-2';
7173
import docsDevConfig from '../../../site_config/docsdev';
7274

7375
const docsSource = {
@@ -113,6 +115,7 @@ const docsSource = {
113115
'3.3.2': docs332Config,
114116
'3.4.0': docs340Config,
115117
'3.4.1': docs341Config,
118+
'3.4.2': docs342Config,
116119
dev: docsDevConfig,
117120
};
118121

docs/configs/site.js

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,7 @@ export default {
2424
port: 8080,
2525
domain: 'dolphinscheduler.apache.org',
2626
copyToDist: ['asset', 'img', 'file', '.asf.yaml', 'sitemap.xml', '.nojekyll', '.htaccess', 'googled0df7b96f277a143.html'],
27-
docsLatest: '3.4.1',
27+
docsLatest: '3.4.2',
2828
defaultSearch: 'google', // default search engine
2929
defaultLanguage: 'en-us',
3030
'en-us': {
@@ -45,7 +45,7 @@ export default {
4545
children: [
4646
{
4747
key: 'docs0',
48-
text: 'latest(3.4.1)',
48+
text: 'latest(3.4.2)',
4949
link: '/en-us/docs/latest/user_doc/about/introduction.html',
5050
},
5151
{
@@ -173,7 +173,7 @@ export default {
173173
children: [
174174
{
175175
key: 'docs0',
176-
text: '最新版本latest(3.4.1)',
176+
text: '最新版本latest(3.4.2)',
177177
link: '/zh-cn/docs/latest/user_doc/about/introduction.html',
178178
},
179179
{
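
The version bump touches three places that appear to move together: the `docsSource` map in `index.md.jsx`, `docsLatest` in `site.js`, and the two "latest" menu labels. A small sketch of a consistency check follows, under the assumption (not stated in the diff) that `docsLatest` must be a key of `docsSource` and must appear in each label; the function name and its inputs are hypothetical.

```python
def check_docs_versions(docs_latest, docs_source_keys, menu_labels):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    # Assumption: the latest version needs a matching docsSource entry.
    if docs_latest not in docs_source_keys:
        problems.append(f"docsLatest {docs_latest} has no docsSource entry")
    # Assumption: every 'latest' menu label embeds the version in parentheses.
    for label in menu_labels:
        if f"({docs_latest})" not in label:
            problems.append(
                f"menu label {label!r} does not mention {docs_latest}"
            )
    return problems
```

Feeding it the post-commit values (`'3.4.2'`, the keys shown in the hunk, and both updated labels) yields no problems.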

docs/docs/en/faq.md

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -238,7 +238,7 @@ export PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$PYTHON_LAUNCHER:$JAVA_HOME/bin:$HI
238238

239239
## Q:Worker Task will generate a child process through sudo -u tenant sh xxx.command, will kill when kill
240240

241-
A We will add the kill task in 1.0.4 and kill all the various child processes generated by the task.
241+
A: We will add the kill task in 1.0.4 and kill all the various child processes generated by the task.
242242

243243
---
244244

@@ -370,7 +370,7 @@ A: The license of mysql jdbc connector is not compatible with apache v2 license,
370370
<p align="center">
371371
<img src="https://user-images.githubusercontent.com/16174111/81312485-476e9380-90b9-11ea-9aad-ed009db899b1.png" width="60%" />
372372
</p>
373-
A This bug have fix in dev and in Requirement/TODO list.
373+
A: This bug has been fixed in dev and is in the Requirement/TODO list.
374374

375375
---
376376

@@ -636,7 +636,7 @@ sed -i 's/Defaults requirett/#Defaults requirett/g' /etc/sudoers
636636

637637
## Q:Deploy for multiple YARN clusters
638638

639-
A:By deploying different worker in different yarn clusters,the steps are as follows(eg: AWS EMR):
639+
A: By deploying different worker in different yarn clusters, the steps are as follows(eg: AWS EMR):
640640

641641
1. Deploying the worker server on the master node of the EMR cluster
642642

@@ -648,7 +648,7 @@ A:By deploying different worker in different yarn clusters,the steps are as
648648

649649
## Q:Update process definition error: Duplicate key TaskDefinition
650650

651-
A:Before DS 2.0.4 (after 2.0.0-alpha), there may be a problem of duplicate keys TaskDefinition due to version switching, which may cause the update workflow to fail; you can refer to the following SQL to delete duplicate data, taking MySQL as an example: (Note: Before operating, be sure to back up the original data, the SQL from pr[#8408](https://github.com/apache/dolphinscheduler/pull/8408))
651+
A: Before DS 2.0.4 (after 2.0.0-alpha), there may be a problem of duplicate keys TaskDefinition due to version switching, which may cause the update workflow to fail; you can refer to the following SQL to delete duplicate data, taking MySQL as an example: (Note: Before operating, be sure to back up the original data, the SQL from pr[#8408](https://github.com/apache/dolphinscheduler/pull/8408))
652652

653653
```SQL
654654
DELETE FROM t_ds_process_task_relation_log WHERE id IN
@@ -736,7 +736,7 @@ DELETE FROM t_ds_task_definition_log WHERE id IN
736736

737737
## Q:Upgrade from 2.0.1 to 2.0.5 using PostgreSQL database failed
738738

739-
A:The repair can be completed by executing the following SQL in the database:
739+
A: The repair can be completed by executing the following SQL in the database:
740740

741741
```SQL
742742
update t_ds_version set version='2.0.1';
@@ -746,7 +746,7 @@ update t_ds_version set version='2.0.1';
746746

747747
## Q:Can not find python-gateway-server in distribute package
748748

749-
A:After version 3.0.0-alpha, Python gateway server integrate into API server, and Python gateway service will start when you
749+
A: After version 3.0.0-alpha, Python gateway server integrate into API server, and Python gateway service will start when you
750750
start API server. If you want disabled when Python gateway service you could change API server configuration in path
751751
`api-server/conf/application.yaml` and change attribute `python-gateway.enabled : false`.
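
Most of the FAQ hunks above change only the answer prefix, turning a colon with no following space (a full-width `:` in the hunk header for line 648) into `A: `. A minimal sketch of that one normalization follows; it is an illustration rather than the tooling used for this commit, and it deliberately leaves the other wording fixes in these hunks alone.

```python
import re

# Matches a leading "A" followed by an ASCII or full-width colon
# and any whitespace after it.
ANSWER_PREFIX = re.compile(r"^A[::]\s*")


def normalize_answer_prefix(line):
    """Rewrite a leading 'A:' or 'A:' so it reads 'A: ' exactly once."""
    return ANSWER_PREFIX.sub("A: ", line, count=1)
```

Lines that already start with `A: ` come back unchanged, and lines that merely begin with the letter `A` are not touched.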
752752
