diff --git a/public/llms.txt b/public/llms.txt index 26d6bc51616..f7463771037 100644 --- a/public/llms.txt +++ b/public/llms.txt @@ -28,6 +28,7 @@ This section covers two distinct things: (1) guidance for AI agents interacting - [AI Agents](https://kestra.io/docs/ai-tools/ai-agents.md): Build autonomous orchestration patterns where agents decide which tasks to run based on runtime context - [AI Workflows](https://kestra.io/docs/ai-tools/ai-workflows.md): Patterns for building AI-native workflows — LLM calls, tool use, and multi-step inference pipelines — using Kestra tasks - [RAG Workflows](https://kestra.io/docs/ai-tools/ai-rag-workflows.md): Retrieval-augmented generation patterns — indexing, chunking, embedding, and querying — orchestrated as Kestra flows +- [MCP Server](https://kestra.io/docs/ai-tools/mcp-server.md): Expose Kestra flows as MCP tools for AI agents — configure MCP servers, connect Claude Desktop/Code/Cursor, and understand OSS vs EE auth options ## Authoring flows @@ -38,7 +39,8 @@ Use this section when writing, editing, or understanding the structure of a Kest - [Flowable tasks](https://kestra.io/docs/workflow-components/tasks/flowable-tasks.md): Control flow primitives — `Sequential`, `Parallel`, `ForEach`, `ForEachItem`, `Switch`, `If`, `DAG`, `LoopUntil`, `Subflow`, `AllowFailure`, `Pause`, `WorkingDirectory` - [Inputs](https://kestra.io/docs/workflow-components/inputs.md): Typed runtime parameters (`STRING`, `INT`, `BOOLEAN`, `FILE`, `JSON`, `ARRAY`, `ENUM`, `DATETIME`, etc.) 
with optional defaults and validation - [Outputs](https://kestra.io/docs/workflow-components/outputs.md): Reference task outputs with `{{ outputs.task_id.attribute }}`, dynamic task outputs with `{{ outputs.task_id[taskrun.value].attribute }}`, and sibling outputs inside loops -- [Triggers](https://kestra.io/docs/workflow-components/triggers.md): Start flows automatically — Schedule (cron), Flow (react to another flow's completion), Webhook, Polling, and Realtime triggers +- [Triggers](https://kestra.io/docs/workflow-components/triggers.md): Start flows automatically — Schedule (cron), Flow (react to another flow's completion), Webhook, Polling, Realtime, and MCP Tool triggers +- [MCP Tool Trigger](https://kestra.io/docs/workflow-components/triggers/mcp-tool-trigger.md): Register a flow as a named MCP tool — `toolName`, `title`, `toolDescription`, `mcpServer`, and `annotations` properties; flow inputs/outputs auto-mapped to JSON schema - [Variables](https://kestra.io/docs/workflow-components/variables.md): Flow-level named values referenced as `{{ vars.name }}`; useful for values reused across multiple tasks - [Subflows](https://kestra.io/docs/workflow-components/subflows.md): Call another flow as a task, pass inputs, wait for completion, and consume its outputs - [Errors](https://kestra.io/docs/workflow-components/errors.md): `errors` block for flow-level error handling tasks; `AllowFailure` for marking individual tasks as non-fatal @@ -197,6 +199,7 @@ Every page in the Kestra documentation. 
Use this section to enumerate all availa - [AI Copilot in Kestra – Generate and Edit Flows](https://kestra.io/docs/ai-tools/ai-copilot.md) - [RAG Workflows in Kestra – Retrieval-Augmented Generation](https://kestra.io/docs/ai-tools/ai-rag-workflows.md) - [AI Workflows in Kestra: Orchestrate with Any LLM](https://kestra.io/docs/ai-tools/ai-workflows.md) +- [MCP Server in Kestra – Expose Flows as AI Tools](https://kestra.io/docs/ai-tools/mcp-server.md) - [API Reference: Enterprise and Open Source Editions](https://kestra.io/docs/api-reference.md) - [Cloud & Enterprise API Reference for Kestra](https://kestra.io/docs/api-reference/enterprise.md) - [SDK Language Clients for the Kestra API](https://kestra.io/docs/api-reference/kestra-sdk.md) @@ -654,4 +657,5 @@ Every page in the Kestra documentation. Use this section to enumerate all availa - [Realtime Trigger in Kestra – Millisecond Eventing](https://kestra.io/docs/workflow-components/triggers/realtime-trigger.md) - [Schedule Trigger in Kestra – Cron-Based Scheduling](https://kestra.io/docs/workflow-components/triggers/schedule-trigger.md) - [Webhook Trigger in Kestra – Start Flows via HTTP](https://kestra.io/docs/workflow-components/triggers/webhook-trigger.md) +- [MCP Tool Trigger in Kestra – Expose Flows as AI Tools](https://kestra.io/docs/workflow-components/triggers/mcp-tool-trigger.md) - [Variables in Kestra – Reuse Values Across Flows](https://kestra.io/docs/workflow-components/variables.md) diff --git a/src/contents/blogs/release-1-2/index.md b/src/contents/blogs/release-1-2/index.md index c5804754e81..d388e22e441 100644 --- a/src/contents/blogs/release-1-2/index.md +++ b/src/contents/blogs/release-1-2/index.md @@ -360,7 +360,7 @@ id: vm_provisioning namespace: company.team checks: - - condition: "{{ kv('VMs') | length < 2 }}" + - when: "{{ kv('VMs') | length < 2 }}" message: "You have provisioned too many VMs" style: ERROR behavior: BLOCK_EXECUTION diff --git 
a/src/contents/docs/03.tutorial/04.triggers/index.md b/src/contents/docs/03.tutorial/04.triggers/index.md index b26b8d325ff..1da4f273f45 100644 --- a/src/contents/docs/03.tutorial/04.triggers/index.md +++ b/src/contents/docs/03.tutorial/04.triggers/index.md @@ -40,10 +40,9 @@ triggers: - id: flow_trigger type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionFlow + dependsOn: + - flowId: first_flow namespace: company.team - flowId: first_flow ``` :::alert{type="info"} diff --git a/src/contents/docs/03.tutorial/05.flowable/index.md b/src/contents/docs/03.tutorial/05.flowable/index.md index 86c8fdd5f18..d0b33b4c7e6 100644 --- a/src/contents/docs/03.tutorial/05.flowable/index.md +++ b/src/contents/docs/03.tutorial/05.flowable/index.md @@ -20,7 +20,7 @@ For example, you can use the [If task](/plugins/core/flow/io.kestra.plugin.core. The example below redesigns the flow to use a `SELECT` input for product category rather than a `STRING` URI, while still calling [dummyjson](https://dummyjson.com). An API request is made based on the selected category — `beauty` or `notebooks` (one does not exist). -The `check_products` If task has a `condition` of `"{{ json(outputs.api.body).products | length > 0 }}"` (i.e., checking whether the API body is not empty and contains at least one product). The log message then depends on whether the actual product category exists or not. The `then` property defines the action for a true condition, and the `else` property defines the action for a false result. +The `check_products` If task has a `condition` of `"{{ fromJson(outputs.api.body).products | length > 0 }}"` (i.e., checking whether the API body is not empty and contains at least one product). The log message then depends on whether the actual product category exists or not. The `then` property defines the action for a true condition, and the `else` property defines the action for a false result. 
```yaml id: getting_started @@ -41,11 +41,11 @@ tasks: - id: check_products type: io.kestra.plugin.core.flow.If - condition: "{{ json(outputs.api.body).products | length > 0 }}" + condition: "{{ fromJson(outputs.api.body).products | length > 0 }}" then: - id: log_status type: io.kestra.plugin.core.log.Log - message: "Found {{ json(outputs.api.body).products | length }} products for category {{ inputs.category }}" + message: "Found {{ fromJson(outputs.api.body).products | length }} products for category {{ inputs.category }}" - id: python type: io.kestra.plugin.scripts.python.Script containerImage: python:slim @@ -87,31 +87,27 @@ Execute the flow twice, once with `beauty` and once with `notebooks` to examine A common orchestration pattern is operating on a set of values. Kestra offers several approaches depending on your use case. The standalone examples below demonstrate each type. -### ForEach +### Loop -The **ForEach** flowable task executes a group of tasks for each value in the list. There are many ways to implement ForEach for complex looping operations, possibly incorporating conditional flowable tasks or subtasks. See more examples in the [ForEach documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreach). +The `Loop` flowable task iterates over a list of values and runs child tasks for each item. Each iteration runs as an isolated sub-execution. Access the current value with `{{ item.value }}` and the zero-based index with `{{ item.index }}`. -As an introduction to the feature, the below example demonstrates using ForEach to make an API call to [OpenLibrary](https://openlibrary.org/dev/docs/api/search) to get a list of associated titles for each author in the list. The values are defined as a JSON string or an array, i.e., a list of string values `["value1", "value2"]` or a list of key-value pairs `[{"key": "value1"}, {"key": "value2"}]`. 
- -You can access the current iteration value using the variable `{{ taskrun.value }}`: +Values can be a static list, a JSON array string, a map, or an ION file URI. The example below makes an API call for each author in the list: ```yaml -id: for_loop_example +id: loop_example namespace: tutorial tasks: - - id: for_each - type: io.kestra.plugin.core.flow.ForEach + - id: loop + type: io.kestra.plugin.core.flow.Loop values: ["pynchon", "dostoyevsky", "hedayat"] tasks: - id: api type: io.kestra.plugin.core.http.Request - uri: "https://openlibrary.org/search.json?author={{ taskrun.value }}&sort=new" + uri: "https://openlibrary.org/search.json?author={{ item.value }}&sort=new" ``` -After execution, the Gantt view shows separate runs for each of the three listed authors in the task. - -![forEach example](./for-each-author.png) +After execution, the Gantt view shows a separate task group for each author. See the [Loop documentation](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md#loop) for output collection, nested loops, error handling, and map-reduce patterns. ### LoopUntil @@ -147,11 +143,11 @@ This flow checks an HTTP endpoint every 30 seconds and stops either when it retu A common orchestration requirement is executing independent processes **in parallel**. For example, you can process data for each partition in parallel. This can significantly speed up the processing time. -The flow below uses the `ForEach` flowable task to execute a list of `tasks` in parallel. +The flow below uses the `Loop` flowable task with `concurrencyLimit: 0` to process all partitions simultaneously. -1. The `concurrencyLimit` property with value `0` makes the list of `tasks` to execute in parallel. +1. The `concurrencyLimit` property set to `0` removes the cap on parallel iterations. 2. The `values` property defines the list of items to iterate over. -3. The `tasks` property defines the list of tasks to execute for each item in the list. 
You can access the iteration value using the `{{ taskrun.value }}` variable. +3. The `tasks` property defines the child tasks for each iteration. Access the iteration value with `{{ item.value }}`. ```yaml id: python_partitions @@ -171,7 +167,7 @@ tasks: Kestra.outputs({'partitions': partitions}) - id: processPartitions - type: io.kestra.plugin.core.flow.ForEach + type: io.kestra.plugin.core.flow.Loop concurrencyLimit: 0 values: '{{ outputs.getPartitions.vars.partitions }}' tasks: @@ -186,7 +182,7 @@ tasks: import time from kestra import Kestra - filename = '{{ taskrun.value }}' + filename = '{{ item.value }}' print(f"Reading and processing partition {filename}") nr_rows = random.randint(1, 1000) processing_time = random.randint(1, 20) diff --git a/src/contents/docs/03.tutorial/06.errors/index.md b/src/contents/docs/03.tutorial/06.errors/index.md index 219c35ce6e1..69ceb5b11b7 100644 --- a/src/contents/docs/03.tutorial/06.errors/index.md +++ b/src/contents/docs/03.tutorial/06.errors/index.md @@ -71,11 +71,11 @@ tasks: - id: check_products type: io.kestra.plugin.core.flow.If - condition: "{{ json(outputs.api.body).products | length > 0 }}" + condition: "{{ fromJson(outputs.api.body).products | length > 0 }}" then: - id: log_status type: io.kestra.plugin.core.log.Log - message: "Found {{ json(outputs.api.body).products | length }} products for category {{ inputs.category }}" + message: "Found {{ fromJson(outputs.api.body).products | length }} products for category {{ inputs.category }}" - id: python type: io.kestra.plugin.scripts.python.Script containerImage: python:slim @@ -141,14 +141,9 @@ tasks: triggers: - id: listen type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionStatus - in: - - FAILED - - WARNING - - type: io.kestra.plugin.core.condition.ExecutionNamespace - namespace: company.team - prefix: true + dependsOn: + - states: [FAILED, WARNING] + when: "{{ namespace | startsWith('company.team') }}" ``` Adding 
this flow ensures you receive a Slack alert for any flow failure in the `company.team` namespace. @@ -242,11 +237,11 @@ tasks: - id: check_products type: io.kestra.plugin.core.flow.If - condition: "{{ json(outputs.api.body).products | length > 0 }}" + condition: "{{ fromJson(outputs.api.body).products | length > 0 }}" then: - id: log_status type: io.kestra.plugin.core.log.Log - message: "Found {{ json(outputs.api.body).products | length }} products for category {{ inputs.category }}" + message: "Found {{ fromJson(outputs.api.body).products | length }} products for category {{ inputs.category }}" - id: python type: io.kestra.plugin.scripts.python.Script containerImage: python:slim diff --git a/src/contents/docs/05.workflow-components/01.tasks/00.flowable-tasks/index.md b/src/contents/docs/05.workflow-components/01.tasks/00.flowable-tasks/index.md index 1b9500672ac..bc5bbbaac8e 100644 --- a/src/contents/docs/05.workflow-components/01.tasks/00.flowable-tasks/index.md +++ b/src/contents/docs/05.workflow-components/01.tasks/00.flowable-tasks/index.md @@ -140,139 +140,325 @@ tasks: For more details, check out the [If Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.if). -### ForEach +### Loop -This task executes a group of tasks for each value in the list. +The `Loop` task iterates over a set of values and runs child tasks for each item. Unlike `ForEach`, each iteration runs in an isolated sub-execution with its own context. -In the following example, the variable is static, but it could also be generated from a previous task output, starting any number of subtasks. +`values` accepts a list, a JSON array string, a map, or an ION file URI. When `values` is a URI, Kestra performs one iteration per line of the file. 
```yaml -id: foreach_example +id: loop-basic namespace: company.team tasks: - - id: for_each - type: io.kestra.plugin.core.flow.ForEach + - id: loop + type: io.kestra.plugin.core.flow.Loop values: ["value 1", "value 2", "value 3"] tasks: - - id: before_if - type: io.kestra.plugin.core.debug.Return - format: "Before if {{ taskrun.value }}" - - id: if - type: io.kestra.plugin.core.flow.If - condition: '{{ taskrun.value == "value 2" }}' - then: - - id: after_if - type: io.kestra.plugin.core.debug.Return - format: "After if {{ parent.taskrun.value }}" + - id: log + type: io.kestra.plugin.core.log.Log + message: "index={{ item.index }} value={{ item.value }}" ``` -In this execution, you can access: +Inside each iteration, use the `item` variable to access the iteration context: + +| Expression | Description | +|---|---| +| `{{ item.index }}` | Zero-based iteration index | +| `{{ item.value }}` | Current iteration value | +| `{{ item.key }}` | Current map key when `values` is a map; not set for list or URI values | +| `{{ item.parent.index }}` | Index of the nearest enclosing loop (nested loops only) | +| `{{ item.parent.value }}` | Value of the nearest enclosing loop (nested loops only) | +| `{{ item.parents[n].value }}` | Value of the nth ancestor loop, counting from innermost | -- The iteration value i.e., the index of a loop (the loop index starts at 0) using the syntax `{{ taskrun.iteration }}` -- The output of a sibling task using the syntax `{{ outputs.sibling[taskrun.value].value }}` +For more details on `item`, see [loop iteration context](../../../expressions/index.mdx#loop-iteration-context) in the expressions reference. -This example shows how to run tasks in parallel for each value in the list. All child tasks of the parallel task run in parallel. However, due to the `concurrencyLimit` property set to 2, only two parallel task groups run at any given time. 
+#### Iterating over objects + +When `values` contains a list of objects, each `item.value` is a JSON string. Use `fromJson(item.value).field` to read fields — `item.value.field` does not work. ```yaml -id: parallel_tasks_example -namespace: company.team +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: + - { id: 101, email: "a@example.com" } + - { id: 102, email: "b@example.com" } + fetchType: AUTO + outputs: + - id: user_id + type: INT + value: "{{ fromJson(item.value).id }}" + - id: email + type: STRING + value: "{{ fromJson(item.value).email }}" + tasks: + - id: log_user + type: io.kestra.plugin.core.log.Log + message: "User {{ fromJson(item.value).id }} -> {{ fromJson(item.value).email }}" +``` + +#### Concurrent execution +By default (`concurrencyLimit: 1`), iterations run one at a time in order. Set `concurrencyLimit` to a higher value to run multiple iterations simultaneously, or `0` for no limit. + +```yaml tasks: - - id: for_each - type: io.kestra.plugin.core.flow.ForEach + - id: loop + type: io.kestra.plugin.core.flow.Loop values: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] - concurrencyLimit: 2 + concurrencyLimit: 0 tasks: - id: parallel type: io.kestra.plugin.core.flow.Parallel tasks: - - id: log - type: io.kestra.plugin.core.log.Log - message: Processing {{ parent.taskrun.value }} - - id: shell - type: io.kestra.plugin.scripts.shell.Commands - commands: - - sleep {{ parent.taskrun.value }} + - id: log + type: io.kestra.plugin.core.log.Log + message: "Processing {{ item.value }}" + - id: shell + type: io.kestra.plugin.scripts.shell.Commands + commands: + - "echo done {{ item.value }}" ``` -For more information on handling outputs generated from `ForEach`, check out the [dedicated loop how-to guide](../../../15.how-to-guides/loop/index.md) and the [Best Practices for ForEach and ForEachItem](../../../14.best-practices/11.foreach-and-foreachitem/index.md) guide, including how to access [sibling task outputs 
correctly](../../../14.best-practices/11.foreach-and-foreachitem/index.md#example-use-sibling-outputs-correctly-inside-foreach) inside the loop. +#### Failure propagation -For processing items, or forwarding processing to a subflow, [ForEachItem](#foreachitem) is better suited. +By default (`transmitFailed: true`), a failed iteration causes the Loop task itself to fail. Set `transmitFailed: false` to let the loop continue even when individual iterations fail. -:::alert{type="info"} -For more details, refer to the [ForEach Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreach). -::: +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["ok", "fail", "ok"] + transmitFailed: false + tasks: + - id: maybe_fail + type: io.kestra.plugin.core.flow.If + condition: '{{ item.value == "fail" }}' + then: + - id: do_fail + type: io.kestra.plugin.core.execution.Fail + else: + - id: success + type: io.kestra.plugin.core.log.Log + message: "OK: {{ item.value }}" +``` -### ForEachItem +#### Error handling per iteration -This task iterates over a list of items and runs a subflow for each item, or for each batch of items. +Use `errors:` to run tasks when an iteration fails, and `finally:` to run a block once after all iterations complete regardless of outcome. `errors:` requires `transmitFailed: false` — with the default `transmitFailed: true`, a failed iteration stops the loop before `errors:` can run. `finally:` always runs regardless of the `transmitFailed` setting. 
```yaml - - id: each - type: io.kestra.plugin.core.flow.ForEachItem - items: "{{ inputs.file }}" # could be also an output variable {{ outputs.extract.uri }} - inputs: - file: "{{ taskrun.items }}" # items of the batch - batch: - rows: 4 - namespace: company.team - flowId: subflow - revision: 1 # optional (default: latest) - wait: true # wait for the subflow execution - transmitFailed: true # fail the task run if the subflow execution fails - labels: # optional labels to pass to the subflow to be executed - key: value +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: + - ok + - boom + - ok + transmitFailed: false + tasks: + - id: maybe_fail + type: io.kestra.plugin.scripts.shell.Commands + commands: + - | + if [ "{{ item.value }}" = "boom" ]; then + echo "failing on {{ item.value }}" >&2 + exit 1 + fi + echo "ok {{ item.value }}" + errors: + - id: handle_error + type: io.kestra.plugin.core.log.Log + message: "Iteration {{ item.index }} ({{ item.value }}) failed" + finally: + - id: cleanup + type: io.kestra.plugin.core.log.Log + message: "Loop completed (with or without failures)" ``` -This executes the subflow `company.team.subflow` for each batch of items. -To pass the batch of items to a subflow, you can use inputs. The example above uses an input of `FILE` type called `file` that takes the URI of an internal storage file containing the batch of items. +#### Nested loops -The next example shows you how to access the outputs from each subflow executed. The ForEachItem automatically merges the URIs of the outputs from each subflow into a single file. The URI of this file is available through the `subflowOutputs` output. +Loops can be nested to any depth. Because `item` is bound to the loop execution rather than individual task runs, flowable tasks nested inside a loop can access `item` directly without a `parent.` prefix. 
+ +`item.parents[0]` is the immediate parent loop (same as `item.parent`), `item.parents[1]` is the next outer loop, and so on. ```yaml -id: for_each_item +tasks: + - id: outer + type: io.kestra.plugin.core.flow.Loop + values: ["bucket1", "bucket2"] + tasks: + - id: middle + type: io.kestra.plugin.core.flow.Loop + values: [2025, 2026] + tasks: + - id: inner + type: io.kestra.plugin.core.flow.Loop + values: ["Jan", "Feb", "Mar"] + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "bucket={{ item.parents[1].value }} year={{ item.parent.value }} month={{ item.value }}" +``` + +#### Loop outputs + +By default, task outputs produced inside a loop are not accessible to tasks that run after the loop. Use the `outputs` property on the Loop task to explicitly declare which values to expose. + +```yaml +id: loop-outputs namespace: company.team tasks: - - id: generate - type: io.kestra.plugin.scripts.shell.Script - script: | - for i in $(seq 1 10); do echo "$i" >> data; done - outputFiles: - - data - - - id: for_each_item - type: io.kestra.plugin.core.flow.ForEachItem - items: "{{ outputs.generate.outputFiles.data }}" - batch: - rows: 4 - wait: true - flowId: my_subflow - namespace: company.team - inputs: - value: "{{ taskrun.items }}" + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["a", "b", "c"] + fetchType: AUTO + outputs: + - id: result + type: STRING + value: "{{ outputs.process.value }}" + tasks: + - id: process + type: io.kestra.plugin.core.debug.Return + format: "processed {{ item.value }}" - - id: for_each_outputs + - id: summary type: io.kestra.plugin.core.log.Log - message: "{{ outputs.forEachItem_merge.subflowOutputs }}" # Log the URI of the file containing the URIs of the outputs from each subflow + message: "Loop ran {{ outputs.loop.iterationCount }} iterations" ``` -:::alert{type="info"} -For more details, refer to the [ForEachItem Task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreachitem). 
-::: +The loop also exposes monitoring outputs regardless of whether `outputs` is declared: + +| Output | Description | +|---|---| +| `iterationCount` | Total number of iterations | +| `runningIterations` | Iterations still in progress | +| `terminatedIterations` | Iterations that have finished | + +The `fetchType` property controls how iteration outputs are collected: `FETCH` returns them inline in the execution context (suitable for small iteration counts), `STORE` writes them to internal storage and exposes a URI (preferred for large iteration counts), and `AUTO` (the default) chooses based on whether `values` is a URI. + +#### Processing large files + +When `values` is a list of URIs from a [`Split`](/plugins/core/storage/io.kestra.plugin.core.storage.split) task, each iteration receives one chunk URI as `item.value`. Combine `Split`, `Loop`, and `Concat` to implement a map-reduce pattern: split a large file into chunks, process each chunk in parallel, then merge the per-chunk outputs into a single result. + +Passing `values: "{{ outputs.split.uris }}"` where `outputs.split.uris` is a **list** is different from passing a single file URI. When `values` is a list, each `item.value` is one element of that list. When `values` is a single URI string, Kestra iterates line-by-line through the file. 
+ +```yaml +id: map-reduce +namespace: company.team -#### `ForEach` vs `ForEachItem` +tasks: + - id: download + type: io.kestra.plugin.core.http.Download + uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv + + - id: to_ion + type: io.kestra.plugin.serdes.csv.CsvToIon + from: "{{ outputs.download.uri }}" + + - id: split + type: io.kestra.plugin.core.storage.Split + from: "{{ outputs.to_ion.uri }}" + rows: 25 + + - id: per_chunk + type: io.kestra.plugin.core.flow.Loop + values: "{{ outputs.split.uris }}" + concurrencyLimit: 4 + fetchType: FETCH + outputs: + - id: data + type: STRING + value: "{{ outputs.aggregate.uri }}" + tasks: + - id: aggregate + type: io.kestra.plugin.transform.Aggregate + from: "{{ item.value }}" + outputType: STORE + groupBy: [customer_email] + aggregates: + orders: + expr: count() + type: INT + revenue: + expr: sum(todecimal(total)) + type: DECIMAL + + - id: concat + type: io.kestra.plugin.core.storage.Concat + files: "{{ loopOutputs(outputs.per_chunk.outputs, 'data') }}" + extension: .ion + + - id: reduce + type: io.kestra.plugin.transform.Aggregate + from: "{{ outputs.concat.uri }}" + outputType: STORE + groupBy: [data.customer_email] + aggregates: + orders: + expr: sum(orders) + type: INT + revenue: + expr: sum(revenue) + type: DECIMAL +``` + +Use `fetchType: FETCH` to collect per-iteration output URIs inline, then pass them to `Concat` via `loopOutputs(outputs.per_chunk.outputs, 'data')`. + +#### Accessing loop outputs in a script task + +The following example runs a Python task inside a loop to compute a value, then reads the collected results in a subsequent Python task using the monitoring output and the Kestra Python SDK. 
+ +```yaml +id: loop-python-outputs +namespace: company.team + +tasks: + - id: process_items + type: io.kestra.plugin.core.flow.Loop + values: [1, 2, 3, 4, 5] + outputs: + - id: squared + type: INT + value: "{{ outputs.compute.vars.result }}" + tasks: + - id: compute + type: io.kestra.plugin.scripts.python.Script + dependencies: + - kestra + script: | + from kestra import Kestra + n = {{ item.value }} + Kestra.outputs({"result": n * n}) + + - id: analyze + type: io.kestra.plugin.scripts.python.Script + dependencies: + - kestra + script: | + from kestra import Kestra + + iteration_count = {{ outputs.process_items.iterationCount }} + + # outputs.process_items.outputs is a list of iteration results: + # [{"item": {"value": "1", "iteration": 1}, "outputs": {"squared": 1}}, ...] + all_outputs = {{ outputs.process_items.outputs | toJson }} + + squared_values = [iteration["outputs"]["squared"] for iteration in all_outputs] + + print(f"Processed {iteration_count} items") + print(f"Squared values: {squared_values}") + print(f"Sum of squares: {sum(squared_values)}") + + Kestra.outputs({"total": sum(squared_values)}) +``` -Both `ForEach` and `ForEachItem` are similar, but there are specific use cases that suit one over the other: -- `ForEach` generates a lot of [Task Runs](../02.taskruns/index.md) which can impact performance. -- `ForEachItem` generates separate executions using [Subflows](../../10.subflows/index.md) for the group of tasks. This scales better for larger datasets. +`outputs.process_items.iterationCount` is always available after the loop finishes. `outputs.process_items.outputs` is a list of iteration results — each entry contains an `item` object (with `value`, `iteration`, and `key`) and an `outputs` map of the declared output values. To access the first iteration's output in an expression, use `outputs.process_items.outputs[0].outputs.squared`. 
To extract one output across all iterations as a list, use the `loopOutputs()` function: `{{ loopOutputs(outputs.process_items.outputs, 'squared') }}`. -Read more about performance optimization in our [best practices guides](../../../14.best-practices/0.flows/index.md#tasks-in-the-same-execution). +For more details, see the [Loop task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.loop). -
- -
### LoopUntil diff --git a/src/contents/docs/05.workflow-components/01.tasks/02.taskruns/index.md b/src/contents/docs/05.workflow-components/01.tasks/02.taskruns/index.md index ce6dbb36caa..7dbab795105 100644 --- a/src/contents/docs/05.workflow-components/01.tasks/02.taskruns/index.md +++ b/src/contents/docs/05.workflow-components/01.tasks/02.taskruns/index.md @@ -70,123 +70,33 @@ The logs show the following: } ``` -## Task run values +## Loop iteration context -Some [Flowable tasks](../00.flowable-tasks/index.md), such as [ForEach](../00.flowable-tasks/index.md) and [ForEachItem](../00.flowable-tasks/index.md#foreachitem), group tasks together. You can use `{{ taskrun.value }}` to access the value of a specific task run. - -In the example below, `foreach` iterates twice over the values `[1, 2]`: - -```yaml -id: loop -namespace: company.team - -tasks: - - id: foreach - type: io.kestra.plugin.core.flow.ForEach - values: [1, 2] - tasks: - - id: log - type: io.kestra.plugin.core.log.Log - message: - - "{{ taskrun }}" - - "{{ taskrun.value }}" - - "{{ taskrun.id }}" - - "{{ taskrun.startDate }}" - - "{{ taskrun.attemptsCount }}" - - "{{ taskrun.parentId }}" - - "{{ taskrun.iteration }}" -``` -This produces two separate log entries, one with `1` and the other with `2`. - -### Parent task run values - -You can also use the `{{ parent.taskrun.value }}` expression to access a task run value from a parent task within nested flowable child tasks: +Inside a [Loop](../00.flowable-tasks/index.md#loop) task, each iteration runs as an isolated sub-execution. Use `{{ item.value }}` and `{{ item.index }}` to access the current iteration value and zero-based index from any task inside that sub-execution, including tasks nested inside `If`, `Parallel`, or other flowable tasks. 
```yaml id: loop namespace: company.team tasks: - - id: foreach - type: io.kestra.plugin.core.flow.ForEach - values: [1, 2] + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: [1, 2, 3] tasks: - id: log type: io.kestra.plugin.core.log.Log - message: "{{ taskrun.value }}" - - id: if - type: io.kestra.plugin.core.flow.If - condition: "{{ true }}" - then: - - id: log_parent - type: io.kestra.plugin.core.log.Log - message: "{{ parent.taskrun.value }}" + message: | + value={{ item.value }} + index={{ item.index }} + taskrun.id={{ taskrun.id }} + taskrun.startDate={{ taskrun.startDate }} + taskrun.attemptsCount={{ taskrun.attemptsCount }} + taskrun.parentId={{ taskrun.parentId }} ``` -This iterates through the `log` and `if` tasks twice as there are two items in `values` property. The `log_parent` task logs the parent task run value as `1` and then `2`. - -### Parent vs. parents in nested Flowable tasks - -With nested [Flowable tasks](../00.flowable-tasks/index.md), only the immediate parent is available through `taskrun.value`. To access a parent task higher up the tree, you can use the `parent` and the `parents` expressions. 
- -The following flow shows a more complex example with nested flowable parent tasks: - -```yaml -id: each_switch -namespace: company.team - -tasks: - - id: simple - type: io.kestra.plugin.core.log.Log - message: - - "{{ task.id }}" - - "{{ taskrun.startDate }}" - - - id: hierarchy_1 - type: io.kestra.plugin.core.flow.ForEach - values: ["caseA", "caseB"] - tasks: - - id: hierarchy_2 - type: io.kestra.plugin.core.flow.Switch - value: "{{ taskrun.value }}" - cases: - caseA: - - id: hierarchy_2_a - type: io.kestra.plugin.core.debug.Return - format: "{{ task.id }}" - caseB: - - id: hierarchy_2_b_first - type: io.kestra.plugin.core.debug.Return - format: "{{ task.id }}" - - - id: hierarchy_2_b_second - type: io.kestra.plugin.core.flow.ForEach - values: ["case1", "case2"] - tasks: - - id: switch - type: io.kestra.plugin.core.flow.Switch - value: "{{ taskrun.value }}" - cases: - case1: - - id: switch_1 - type: io.kestra.plugin.core.log.Log - message: - - "{{ parents[0].taskrun.value }}" - - "{{ parents[1].taskrun.value }}" - case2: - - id: switch_2 - type: io.kestra.plugin.core.log.Log - message: - - "{{ parents[0].taskrun.value }}" - - "{{ parents[1].taskrun.value }}" - - id: simple_again - type: io.kestra.plugin.core.log.Log - message: - - "{{ task.id }}" - - "{{ taskrun.startDate }}" -``` +For nested loops, `{{ item.parent.value }}` accesses the immediate enclosing loop's value, and `{{ item.parents[n].value }}` accesses deeper ancestors (`[0]` = immediate parent, `[1]` = grandparent, and so on). -The `parent` variable gives direct access to the first parent, while the `parents[INDEX]` gives you access to the parent higher up the tree. +See [Loop iteration context](../../../expressions/01.context/index.mdx#loop-iteration-context) in the expressions reference for the full `item` variable table. 
:::collapse{title="Task Run JSON Object Example"} ```json diff --git a/src/contents/docs/05.workflow-components/04.variables/index.md b/src/contents/docs/05.workflow-components/04.variables/index.md index 37ea6c61d44..e84f360b316 100644 --- a/src/contents/docs/05.workflow-components/04.variables/index.md +++ b/src/contents/docs/05.workflow-components/04.variables/index.md @@ -229,11 +229,7 @@ triggers: backfill: start: 2023-11-11T00:00:00Z cron: "0 11 * * MON" # at 11:00 every Monday - conditions: # only first Monday of the month - - type: io.kestra.plugin.core.condition.DayWeekInMonth - date: "{{ trigger.date }}" - dayOfWeek: "MONDAY" - dayInMonth: "FIRST" + when: "{{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') }}" # only first Monday of the month ``` diff --git a/src/contents/docs/05.workflow-components/05.inputs/index.md b/src/contents/docs/05.workflow-components/05.inputs/index.md index 347bef63a17..dbd6fd4b910 100644 --- a/src/contents/docs/05.workflow-components/05.inputs/index.md +++ b/src/contents/docs/05.workflow-components/05.inputs/index.md @@ -170,7 +170,7 @@ Here is the list of supported data types: - `TIME`: Must be a valid full [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) time without the timezone from a text string such as `10:15:30`. - `DURATION`: Must be a valid full [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration from a text string such as `PT5M6S`. - `FILE`: Either a file uploaded at execution time as `Content-Type: multipart/form-data` with `Content-Disposition: form-data; name="<NAME>"; filename="<FILENAME>"` (where `<NAME>` is the input name and `<FILENAME>` is the original filename of the file being uploaded), or a default file referenced via the universal file protocol using `nsfile:///path/to/file` (namespace file) or `file:///path/to/file` (local file from an allowed path). `FILE` type inputs also have the `allowedFileExtensions` property to control which types of files can be uploaded.
-- `JSON`: Must be a valid JSON string and will be converted to a typed form. +- `JSON`: Must be a valid JSON string and will be converted to a typed form. Accepts an optional `jsonSchema` property (JSON Schema Draft 2020-12) to validate the structure of the input value at execution time. - `YAML`: Must be a valid YAML string. - `URI`: Must be a valid URI and will be kept as a string. - `SECRET`: Encrypted string stored in the database. It is decrypted at runtime and can be used in all tasks. The value of a `SECRET` input is masked in the UI and in the execution context. Note that you need to set the [encryption key](../../configuration/05.security-and-secrets/index.md) in your [Kestra configuration](../../configuration/index.mdx) before using it. @@ -197,13 +197,43 @@ Below is the list of available properties for all inputs regardless of their typ Kestra validates the `type` of each input. In addition to the type validation, some input types can be configured with validation rules that are enforced at execution time. -- `STRING`: A `validator` property allows the addition of a validation [regex](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html). +- `STRING`: A `validator` property allows the addition of a validation [regex](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html). Validator patterns are subject to a 10-second timeout; executions that exceed it are rejected with an error. The timeout is configurable via [`kestra.regex.timeout`](../../configuration/05.security-and-secrets/index.md#regex-timeout). +- `SECRET`: Supports the same `validator` regex property as `STRING`, with the same 10-second timeout applied before the value is encrypted. This ensures the secret is never stored if the pattern is unsafe. - `INT`: `min` and `max` define the allowed range. - `FLOAT`: `min` and `max` define the allowed range. - `DURATION`: `min` and `max` define the allowed range. 
- `DATE`: `after` and `before` properties help you ensure that the input value is within the allowed date range. - `TIME`: `after` and `before` properties help you ensure that the input value is within the allowed time range. - `DATETIME`: `after` and `before` properties help you ensure that the input value is within the allowed date and time range. +- `JSON`: A `jsonSchema` property accepts a JSON Schema Draft 2020-12 string. If provided, the input value is validated against the schema at execution time. If the value does not conform, the execution is rejected before it starts. + +### Example: use JSON schema validation + +```yaml +id: json_schema_validation +namespace: company.team + +inputs: + - id: payload + type: JSON + jsonSchema: | + { + "$schema": "https://json-schema.org/draft/2020-12/schema", + "type": "object", + "required": ["name"], + "properties": { + "name": { "type": "string" } + }, + "additionalProperties": false + } + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Hello, {{ inputs.payload.name }}!" +``` + +If you pass `{"name": 42}`, the execution will be rejected with a constraint violation before any task runs. If you pass `{"name": "Alice"}`, the flow proceeds normally. ### Example: use input validators in your flows diff --git a/src/contents/docs/05.workflow-components/06.outputs/index.md b/src/contents/docs/05.workflow-components/06.outputs/index.md index 5b9c6d1490d..ce3f08f3da1 100644 --- a/src/contents/docs/05.workflow-components/06.outputs/index.md +++ b/src/contents/docs/05.workflow-components/06.outputs/index.md @@ -200,169 +200,134 @@ outputs: Note how the Ternary Operator `{{ condition ? value_if_true : value_if_false }}` is used in the output expression `{{ tasks.main.state != 'SKIPPED' ? outputs.main.value : outputs.fallback.value }}` to return the output of the `main` task if it is not skipped, otherwise, it returns the output of the `fallback` task. 
-## Dynamic variables (Each tasks) +## Loop outputs and iteration context -### Current taskrun value +### Current iteration value -In dynamic flows (for example, with an **Each** loop), variables are passed to tasks dynamically. You can access the current taskrun value with `{{ taskrun.value }}` like this: +Inside a [Loop](../01.tasks/00.flowable-tasks/index.md#loop) task, each iteration runs as an isolated sub-execution. Use `{{ item.value }}` to access the current value and `{{ item.index }}` for the zero-based position. ```yaml -id: taskrun_value_example +id: loop_value_example namespace: company.team tasks: - - id: each - type: io.kestra.plugin.core.flow.ForEach + - id: loop + type: io.kestra.plugin.core.flow.Loop values: ["alpha", "beta", "gamma"] tasks: - id: inner type: io.kestra.plugin.core.debug.Return - format: "{{ task.id }} > {{ taskrun.value }} > {{ taskrun.startDate }}" + format: "{{ task.id }} > {{ item.value }} > {{ item.index }}" ``` -The **Outputs** tab contains the output for each of the inner task. - -![taskrun_value_example](./taskrun_value_example.png) +The **Outputs** tab shows the output for each iteration of the inner task. ### Loop over a list of JSON objects -Within the loop, the `value` is always a JSON string, so the `{{ taskrun.value }}` is the current element as JSON string. To access properties, you need to wrap it in the `fromJson()` function to have a JSON object allowing to access each property easily. +When `values` contains objects, each `item.value` is a JSON string. Use `fromJson(item.value).field` to access properties — `item.value.field` does not work. 
```yaml -id: loop_sequentially_over_list +id: loop_json_objects namespace: company.team tasks: - - id: each - type: io.kestra.plugin.core.flow.ForEach + - id: loop + type: io.kestra.plugin.core.flow.Loop values: - {"key": "my-key", "value": "my-value"} - {"key": "my-complex", "value": {"sub": 1, "bool": true}} tasks: - id: inner type: io.kestra.plugin.core.debug.Return - format: "{{ fromJson(taskrun.value).key }} > {{ fromJson(taskrun.value).value }}" + format: "{{ fromJson(item.value).key }} > {{ fromJson(item.value).value }}" ``` +### Access outputs from loop iterations -### Specific outputs for dynamic tasks - -Dynamic tasks are tasks that run other tasks a certain number of times. A dynamic task runs multiple iterations of a set of sub-tasks. - -For example, **ForEach** produces other tasks dynamically depending on its `values` property. - -It is possible to reach each iteration output of dynamic tasks by using the following syntax: +By default, task outputs produced inside a loop are not visible to tasks that run after it. Declare an `outputs:` block on the Loop task to surface values explicitly. ```yaml -id: output_sample +id: loop_outputs namespace: company.team tasks: - - id: each - type: io.kestra.plugin.core.flow.ForEach + - id: loop + type: io.kestra.plugin.core.flow.Loop values: ["s1", "s2", "s3"] + fetchType: AUTO + outputs: + - id: result + type: STRING + value: "{{ outputs.sub.value }}" tasks: - id: sub type: io.kestra.plugin.core.debug.Return - format: "{{ task.id }} > {{ taskrun.value }} > {{ taskrun.startDate }}" + format: "{{ task.id }} > {{ item.value }}" - id: use type: io.kestra.plugin.core.debug.Return - format: "Previous task produced output: {{ outputs.sub.s1.value }}" + format: "First result: {{ outputs.loop.outputs[0].outputs.result }}" ``` -The `outputs.sub.s1.value` variable reaches the `value` of the `sub` task of the `s1` iteration. 
- -### Previous task lookup - -It is also possible to locate a specific dynamic task by its `value`: - -```yaml -id: dynamic_looping -namespace: company.team - -tasks: - - id: each - type: io.kestra.plugin.core.flow.ForEach - values: ["alpha", "beta", "gamma"] - tasks: - - id: inner - type: io.kestra.plugin.core.debug.Return - format: "{{ taskrun.value }}" - - - id: end - type: io.kestra.plugin.core.debug.Return - format: "{{ task.id }} > {{ outputs.inner['alpha'].value }}" -``` - -It uses the format `outputs.TASKID[VALUE].ATTRIBUTE`. The special bracket `[]` in `[VALUE]` is called the subscript notation; it enables using special chars like space or '-' in task identifiers or output attributes. - -### Lookup in sibling tasks +After the loop, `outputs.<loop_task_id>.outputs` is a list of per-iteration results — each entry has an `item` object (with `value`, `index`, and `key`) and an `outputs` map of the declared output values. -Sometimes it is useful to access outputs from other tasks in the same task tree, known as sibling tasks. +- Access one iteration by index: `outputs.<loop_task_id>.outputs[n].outputs.<output_id>` +- Extract one field across all iterations as a list: `{{ loopOutputs(outputs.<loop_task_id>.outputs, '<output_id>') }}` -If the task tree is static, for example when using the [Sequential](/plugins/core/flow/io.kestra.plugin.core.flow.sequential) task, you can use the `{{ outputs.task_id.value }}` notation where `task_id` is the identifier of the sibling task, as you would outside of the task tree. +### Sibling task outputs inside a loop -For example: +Inside a Loop iteration, sibling task outputs are accessed with the plain `outputs.task_id.attribute` notation — each iteration runs in its own isolated sub-execution, so there is no ambiguity about which iteration's output you are reading.
```yaml -id: sibling_tasks +id: loop_with_sibling_tasks namespace: company.team tasks: - - id: sequential - type: io.kestra.plugin.core.flow.Sequential + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["alpha", "beta", "gamma"] tasks: - id: first type: io.kestra.plugin.core.output.OutputValues values: - data: "hello from task 1" + data: "First value: {{ item.value }}" - id: second type: io.kestra.plugin.core.output.OutputValues values: data: "{{ outputs.first.values.data }}" - - - id: log_siblings - type: io.kestra.plugin.core.log.Log - message: "{{ outputs.second.values.data }}" ``` -If the task tree is dynamic, for example when using the [ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach) task, you need to use `{{ outputs.task_id[taskrun.value] }}` to access the current tree task. `taskrun.value` is a special variable that holds the current value of the ForEach task. - -For example: +For static task trees using [Sequential](/plugins/core/flow/io.kestra.plugin.core.flow.sequential), the same `{{ outputs.task_id.value }}` notation applies outside of a loop. ```yaml -id: loop_with_sibling_tasks +id: sibling_tasks namespace: company.team tasks: - - id: foreach - type: io.kestra.plugin.core.flow.ForEach - values: ["alpha", "beta", "gamma"] + - id: sequential + type: io.kestra.plugin.core.flow.Sequential tasks: - id: first type: io.kestra.plugin.core.output.OutputValues values: - data: "First value: {{ taskrun.value }}" + data: "hello from task 1" - id: second type: io.kestra.plugin.core.output.OutputValues values: - data: "{{ outputs.first[taskrun.value].values.data }}" + data: "{{ outputs.first.values.data }}" - - id: log_output_from_foreach + - id: log_siblings type: io.kestra.plugin.core.log.Log - message: "{{ outputs.second['alpha'].values.data }}" + message: "{{ outputs.second.values.data }}" ``` -You can also use the `currentEachOutput` function to access the current tree task. 
See [Function Reference](../../expressions/04.functions/index.mdx) for more details. - :::alert{type="warning"} -Accessing sibling task outputs is impossible on [Parallel](/plugins/core/flow/io.kestra.plugin.core.flow.parallel) as it runs tasks in parallel. +Accessing sibling task outputs is impossible in [Parallel](/plugins/core/flow/io.kestra.plugin.core.flow.parallel) as tasks run simultaneously. ::: -For more examples and guidance on accessing sibling outputs inside `ForEach`, including how to read them both inside and outside the loop, see [Best Practices for ForEach and ForEachItem](../../14.best-practices/11.foreach-and-foreachitem/index.md#example-use-sibling-outputs-correctly-inside-foreach). +For more output patterns including map-reduce and large-file processing, see [Loop best practices](../../14.best-practices/11.loop/index.md). ## Outputs preview diff --git a/src/contents/docs/05.workflow-components/07.checks/index.md b/src/contents/docs/05.workflow-components/07.checks/index.md index 98102289bff..07594deb341 100644 --- a/src/contents/docs/05.workflow-components/07.checks/index.md +++ b/src/contents/docs/05.workflow-components/07.checks/index.md @@ -1,17 +1,15 @@ --- title: Checks in Kestra – Pre-Execution Validations h1: Validate Inputs Before Any Task Runs with Checks -description: Implement Checks in Kestra for pre-execution validation. Guard your workflows by enforcing conditions on inputs before any task begins execution. +description: Use checks to enforce conditions on inputs before any task runs, blocking or failing executions that don't meet your criteria. sidebarTitle: Checks icon: /src/contents/docs/icons/flow.svg version: ">= 1.2.0" --- -Add pre-execution validations that can block or fail an execution before any tasks run. +Checks are pre-execution validations that block or fail an execution before any tasks run. 
-## Add checks to validate inputs before execution - -`checks` are flow-level assertions evaluated when validating inputs and before creating a new execution. Each check defines a boolean `condition` and a `message` shown when the condition is false. You can choose how Kestra reacts (block, fail, or still create the execution) and how the message is styled in the UI. +`checks` are flow-level assertions evaluated when validating inputs and before creating a new execution. Each check defines a boolean `when` expression and a `message` shown when the expression evaluates to false. You can choose how Kestra reacts (block, fail, or still create the execution) and how the message is styled in the UI. Checks are useful to enforce business rules on inputs (e.g., allowed values, date windows, required flags) or to nudge users with warnings before they launch a run. @@ -19,21 +17,19 @@ Checks are useful to enforce business rules on inputs (e.g., allowed values, dat Each item in `checks` supports the following properties: -- `condition` *(required)*: Pebble expression that must evaluate to a boolean. For example, you can design checks against Inputs, Key-Value pairs, or other [expression](../../expressions/index.mdx) accessible workflow components. +- `when` *(required)*: Pebble expression that must evaluate to a boolean. Checks can reference inputs, key-value pairs, and other components accessible via [expressions](../../expressions/index.mdx). - `message` *(required)*: Text displayed when the condition is false. - `style` *(optional, default `INFO`)*: Visual style for the message. One of `ERROR`, `SUCCESS`, `WARNING`, `INFO`. - `behavior` *(optional, default `BLOCK_EXECUTION`)*: How the flow should react when the condition is false. One of: - `BLOCK_EXECUTION`: Do not create the execution. - `FAIL_EXECUTION`: Create the execution immediately in a failed state. - - `CREATE_EXECUTION`: Allow execution creation even if the check fails. 
+ - `CREATE_EXECUTION`: Create the execution even when the check fails. -When clicking **Execute**, with an `ERROR` message display set in the flow code, the modal will display the `message` as soon as an input is set that doesn't satisfy the check like below: +When you click **Execute**, the modal displays the `message` as soon as an input fails a check: ![Failed Check](./checks-fail.png) ---- - -### Multiple checks +## Multiple checks If several checks fail, the most restrictive behavior wins in this priority order: `BLOCK_EXECUTION` → `FAIL_EXECUTION` → `CREATE_EXECUTION`. This lets you mix hard stops with softer warnings in the same flow. @@ -53,7 +49,7 @@ inputs: checks: - message: "Sorry, this flow can only be executed with 'Kestra'" - condition: "{{ (inputs.name | upper) == 'KESTRA' }}" + when: "{{ (inputs.name | upper) == 'KESTRA' }}" style: ERROR behavior: BLOCK_EXECUTION @@ -86,13 +82,13 @@ inputs: checks: # Block risky prod runs outside the allowed window - message: "Prod runs are only allowed between 06:00 and 22:00 UTC" - condition: "{{ inputs.environment != 'prod' or (inputs.run_date | date('HH') | number >= 6 and inputs.run_date | date('HH') | number < 22) }}" + when: "{{ inputs.environment != 'prod' or (inputs.run_date | date('HH') | number >= 6 and inputs.run_date | date('HH') | number < 22) }}" style: ERROR behavior: BLOCK_EXECUTION # Warn if the payload is not the approved source - message: "Non-approved source detected. Use https://dummyjson.com when possible." 
- condition: "{{ inputs.payload_url | startsWith('https://dummyjson.com') }}" + when: "{{ inputs.payload_url | startsWith('https://dummyjson.com') }}" style: WARNING behavior: CREATE_EXECUTION diff --git a/src/contents/docs/05.workflow-components/07.triggers/01.schedule-trigger/index.md b/src/contents/docs/05.workflow-components/07.triggers/01.schedule-trigger/index.md index b2ce0151149..b8d56800ff0 100644 --- a/src/contents/docs/05.workflow-components/07.triggers/01.schedule-trigger/index.md +++ b/src/contents/docs/05.workflow-components/07.triggers/01.schedule-trigger/index.md @@ -52,11 +52,7 @@ triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "0 11 * * 1" - conditions: - - type: io.kestra.plugin.core.condition.DayWeekInMonth - date: "{{ trigger.date }}" - dayOfWeek: "MONDAY" - dayInMonth: "FIRST" + when: "{{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') }}" ``` A schedule that runs daily at midnight US Eastern time: @@ -95,26 +91,15 @@ You can use this expression to make your **manual execution work**: `{{ trigger. ::: -## Schedule conditions +## Refining schedules with `when` -When a `cron` expression alone is not sufficient (e.g., only first Monday of the month, only weekends), you can refine schedules using `conditions`. +When a `cron` expression alone is not sufficient (e.g., only first Monday of the month, only weekends), you can refine schedules using a `when` Pebble expression. -You **must** use the `{{ trigger.date }}` expression on the property `date` of the current schedule. +You can use the `{{ trigger.date }}` expression to access the current schedule date within the `when` expression. The [date and calendar helper functions](/docs/expressions#date-and-calendar-helpers) in the expressions reference cover all available date functions such as `isDayWeekInMonth()`, `dayOfWeek()`, `isWeekend()`, and `isPublicHoliday()`. 
-This condition will be evaluated and `{{ trigger.previous }}` and `{{ trigger.next }}` will reflect the date **with** the conditions applied. +The `when` expression is evaluated and `{{ trigger.previous }}` and `{{ trigger.next }}` will reflect the date **with** the condition applied. -The list of core conditions that can be used are: - - - [DateTimeBetween](/plugins/core/condition/io.kestra.plugin.core.condition.datetimebetween) - - [DayWeek](/plugins/core/condition/io.kestra.plugin.core.condition.dayweek) - - [DayWeekInMonth](/plugins/core/condition/io.kestra.plugin.core.condition.dayweekinmonth) - - [Not](/plugins/core/condition/io.kestra.plugin.core.condition.not) - - [Or](/plugins/core/condition/io.kestra.plugin.core.condition.or) - - [Weekend](/plugins/core/condition/io.kestra.plugin.core.condition.weekend) - - [PublicHoliday](/plugins/core/condition/io.kestra.plugin.core.condition.publicholiday) - - [TimeBetween](/plugins/core/condition/io.kestra.plugin.core.condition.timebetween) - -Here's an example using the `DayWeek` condition: +Here's an example using a day-of-week check: ```yaml id: conditions @@ -129,9 +114,7 @@ triggers: - id: schedule type: io.kestra.plugin.core.trigger.Schedule cron: "@hourly" - conditions: - - type: io.kestra.plugin.core.condition.DayWeek - dayOfWeek: "THURSDAY" + when: "{{ dayOfWeek(trigger.date) == 'THURSDAY' }}" ``` ## Recover missed schedules diff --git a/src/contents/docs/05.workflow-components/07.triggers/02.flow-trigger/index.md b/src/contents/docs/05.workflow-components/07.triggers/02.flow-trigger/index.md index d39c244a2a2..99e12adf362 100644 --- a/src/contents/docs/05.workflow-components/07.triggers/02.flow-trigger/index.md +++ b/src/contents/docs/05.workflow-components/07.triggers/02.flow-trigger/index.md @@ -10,31 +10,26 @@ Trigger one flow based on the execution of another flow. A Flow trigger runs a flow after another flow completes, enabling event-driven workflows and dependencies across teams. 
+ ```yaml type: io.kestra.plugin.core.trigger.Flow ``` -Kestra can automatically start a flow as soon as another flow ends. This allows you to create dependencies between flows, even when those flows are owned by different teams. +Flow triggers let you create dependencies between flows, even when those flows are owned by different teams. Check the [Flow trigger](/plugins/core/trigger/io.kestra.plugin.core.trigger.flow) documentation for the list of all properties. -## Preconditions - -A Flow trigger requires preconditions to filter which upstream executions can trigger the flow, often within a defined time window. - -:::alert{type="info"} -[Pebble expressions](../../../expressions/index.mdx) cannot be used in Flow Trigger (pre)conditions. You must declaratively define any condition variables. -::: +## Upstream flow dependencies -### Filters +The `dependsOn` property is a list of upstream flow entries that must all complete in matching states before the trigger fires. -- `flows`: A list of preconditions to meet, in the form of upstream flows +### Basic single upstream flow -The example below shows a Flow trigger that runs when `flow_a` completes successfully. +The example below triggers `flow_b` when the `extract` flow from the `company.team` namespace completes successfully: ```yaml id: flow_b -namespace: kestra.sandbox +namespace: company.team tasks: - id: hello @@ -42,191 +37,320 @@ tasks: message: "Hello World!"
triggers: - - id: upstream_dependancy + - id: after_extract type: io.kestra.plugin.core.trigger.Flow - preconditions: - id: flow_trigger - flows: - - namespace: kestra.sandbox - flowId: flow_a - states: [SUCCESS] + dependsOn: + - flowId: extract + namespace: company.team + states: [SUCCESS] ``` -:::alert{type="info"} -It is [best practice](../../../14.best-practices/0.flows/index.md#flow-trigger-on-state-change) when using a flow trigger to use `preconditions.flows.states` rather than the `states` task property when defining state conditions for one specific flow. -::: +### Multiple upstream flows + +List multiple entries under `dependsOn`. All entries must be satisfied before the trigger fires: + +```yaml +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: stg_sales + namespace: company.team + - flowId: stg_marketing + namespace: company.team +``` + +### Entry properties -- `where`: filter executions based on fields like `FLOW_ID`, `NAMESPACE`, `STATE`, and `EXPRESSION`. +| Property | Type | Description | +|-------------|-----------------------|----------------------------------------------------------------------------------------------------------------------| +| `flowId` | `String` | The ID of the upstream flow to match. Omit to match any flow (combine with `when` to narrow the scope). | +| `namespace` | `String` | The namespace of the upstream flow. Exact match only — use `when` for prefix or pattern matching. | +| `states` | `List` | States that satisfy this entry. Defaults to `[SUCCESS, WARNING]`. | +| `labels` | `Map` | Key-value pairs that must all be present on the upstream execution's labels. | +| `when` | `String` | A Pebble expression evaluated against the upstream execution. 
The entry is satisfied only when this evaluates to true.| -For example, the following Flow Trigger triggers on execution from flows in FAILED or WARNING states in namespaces starting with "company": +### Prefix and pattern matching + +When `namespace` is set, Kestra matches it exactly. To match a range of namespaces or flows, omit `namespace` and use `when` with a Pebble expression: ```yaml triggers: - id: alert_on_failure type: io.kestra.plugin.core.trigger.Flow - states: - - FAILED - - WARNING - preconditions: - id: company_namespace - where: - - id: company - filters: - - field: NAMESPACE - type: STARTS_WITH - value: company + dependsOn: + - states: [FAILED, WARNING] + when: "{{ namespace | startsWith('company') }}" ``` -### Time Window & SLA +## Conditional guard with `when` -The `timeWindow` property lets you define how Kestra evaluates upstream flow executions over time. It supports several modes: +Like all triggers, the Flow trigger supports a top-level `when` Pebble expression. It is evaluated before `dependsOn` — if it returns a falsy value, the trigger does not fire regardless of upstream state: -- `DURATION_WINDOW`: This is the default type. It uses a start time (windowAdvance) and end time (window) that are moving forward to the next interval whenever the evaluation time reaches the end time, based on the defined duration window. +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + when: "{{ labels.env == 'production' }}" + dependsOn: + - flowId: extract + namespace: company.team +``` + +## Time window -For example, with a 1-day window (`window: PT1D`, the default), SLA conditions are evaluated over a 24-hour period starting at midnight each day. If you set `windowAdvance: PT6H`, the window will start at 6 AM each day. 
If you set `windowAdvance: PT6H` and you also override `window: PT6H`, the window will start at 6 AM and last for 6 hours — as a result, Kestra will check the SLA conditions during the following time periods: `06:00` to `12:00`, `12:00` to `18:00`, `18:00` to `00:00`, and `00:00` to `06:00`, and so on. +The `window` property controls how long Kestra accumulates upstream executions before evaluating whether all `dependsOn` entries are satisfied. -- `SLIDING_WINDOW`: This option also evaluates SLA conditions over a fixed time window, but it always goes backward from the current time. For example, a sliding window of 1 hour (window: PT1H) will evaluate executions for the past hour (so between now and one hour before now). It uses a default window of 1 day. +### Deadline -For example, the flow below evaluates every hour if the flow `flow_a` is in SUCCESS state. If so, it triggers the `flow_b` passing corresponding inputs (reading `flow_a` outputs). +All upstream flows must complete before a fixed time each day. 
The deadline string must include a timezone offset: ```yaml -id: flow_b -namespace: kestra.sandbox +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: stg_sales + namespace: company.team + - flowId: stg_marketing + namespace: company.team + window: + deadline: "09:00:00+01:00" +``` -inputs: - - id: value_from_a - type: STRING +### Daily time range -tasks: - - id: hello - type: io.kestra.plugin.core.log.Log - message: "{{ inputs.value_from_a }}" +Only executions that completed within a specific time range each day are counted: +```yaml triggers: - - id: upstream_dep + - id: after_staging type: io.kestra.plugin.core.trigger.Flow - inputs: - value_from_a: "{{ trigger.outputs.return_value }}" - preconditions: - id: test - flows: - - namespace: kestra.sandbox - flowId: flow_a - states: [SUCCESS] - timeWindow: - type: SLIDING_WINDOW - window: PT1H + dependsOn: + - flowId: stg_sales + namespace: company.team + - flowId: stg_marketing + namespace: company.team + window: + from: "06:00:00" + to: "12:00:00" ``` -For reference, below is `flow_a`: +### Fixed interval + +`every` defines the window size and `offset` shifts its start relative to midnight: ```yaml -id: flow_a -namespace: kestra.sandbox +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: stg_sales + namespace: company.team + window: + every: PT1H + offset: PT30M +``` -tasks: - - id: hello - type: io.kestra.plugin.core.log.Log - message: Hello World! 
🚀 +### Lookback -outputs: - - id: return_value - type: STRING - value: "Flow A run succesfully" +Count executions that completed within the past duration, relative to the current evaluation time: + +```yaml +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: stg_sales + namespace: company.team + window: + lookback: PT1H +``` + +### Fire once per window + +Set `fireOnce: true` to ensure the trigger fires at most once per window, even if conditions are satisfied multiple times: + +```yaml +window: + deadline: "09:00:00+01:00" + fireOnce: true ``` +## Scoped trigger outputs + +When a Flow trigger fires, upstream execution outputs are available under `trigger.outputs`. Outputs are scoped by flow ID to avoid key collisions when multiple upstream flows are involved: -- `DAILY_TIME_DEADLINE`: This option enforces SLA conditions that must be met before a specific cutoff time each day. With the string property deadline, you can configure a daily cutoff for checking conditions. For example, deadline: `09:00:00.00Z` means that the defined SLA conditions should be met from midnight until 9 AM each day; otherwise, the flow will not be triggered. +``` +trigger.outputs.<flowId>.<outputId> +``` -For the example, this trigger definition only triggers the flow if `flow_a` is in SUCCESS state before `9:00` AM every day. +For example, to pass an output from an upstream flow named `extract`: ```yaml triggers: - - id: upstream_dep + - id: after_extract type: io.kestra.plugin.core.trigger.Flow - preconditions: - id: should_be_success_by_nine - flows: - - namespace: kestra.sandbox - flowId: flow_a - states: [SUCCESS] - timeWindow: - type: DAILY_TIME_DEADLINE - deadline: "09:00:00.00Z" + inputs: + date: "{{ trigger.outputs.extract.date }}" + dependsOn: + - flowId: extract + namespace: company.team ``` -- `DAILY_TIME_WINDOW`: This option enforces SLA conditions that must be met within a specific daily time range.
For example, a window from `startTime: "06:00:00"` to `endTime: "09:00:00"` evaluates executions within that interval each day. This option is particularly useful for declarative definition of freshness conditions when building data pipelines. For example, if you only need one successful execution within a given time range to guarantee that some data has been successfully refreshed in order for you to proceed with the next steps of your pipeline, this option can be more useful than a strict DAG-based approach. Usually, each failure in your flow would block the entire pipeline, whereas with this option, you can proceed with the next steps of the pipeline as soon as the data is successfully refreshed at least once within the given time range. +:::alert{type="warning"} +The output scoping format changed in Kestra 2.0. If you previously used `trigger.outputs.{outputId}` (a flat map), update your expressions to the new `trigger.outputs.{flowId}.{outputId}` format. +::: + +## Label-based filtering + +Use the `labels` map on a `dependsOn` entry to restrict which upstream executions are counted. All specified labels must be present on the upstream execution: ```yaml triggers: - - id: upstream_dep + - id: after_prod type: io.kestra.plugin.core.trigger.Flow - inputs: - value_from_a: "{{ trigger.outputs.return_value }}" - preconditions: - id: test - flows: - - namespace: kestra.sandbox - flowId: flow_a - states: [SUCCESS] - timeWindow: - type: DAILY_TIME_WINDOW - startTime: "06:00:00" - endTime: "12:00:00" + dependsOn: + - namespace: company.team + labels: + env: production + states: [SUCCESS] ``` +## Filtering with `when` expressions +Use `when` on a `dependsOn` entry to apply arbitrary Pebble conditions against the upstream execution context. -## Example +Filter on an output value: -This example triggers the `silver_layer` flow once the `bronze_layer` flow finishes successfully by 9 AM. The deadline time string must include the timezone offset.
This ensures that no new executions are triggered past the deadline. Here is the `silver_layer` flow: +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: extract + namespace: company.team + when: "{{ outputs.row_count > 0 }}" +``` + +Filter on retry attempts: + +```yaml +triggers: + - id: after_flaky + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: flaky_pipeline + namespace: company.team + states: [SUCCESS] + when: "{{ hasRetryAttempt == true }}" +``` + +## Example: data pipeline with SLA deadline + +This example triggers the `silver_layer` flow once the `bronze_layer` flow finishes successfully by 9 AM: ```yaml id: silver_layer namespace: company.team + tasks: - id: transform_data type: io.kestra.plugin.core.log.Log message: deduplication, cleaning, and minor aggregations + triggers: - id: flow_trigger type: io.kestra.plugin.core.trigger.Flow - preconditions: - id: bronze_layer - timeWindow: - type: DAILY_TIME_DEADLINE - deadline: "09:00:00+01:00" - flows: - - namespace: company.team - flowId: bronze_layer - states: [SUCCESS] + dependsOn: + - flowId: bronze_layer + namespace: company.team + states: [SUCCESS] + window: + deadline: "09:00:00+01:00" ``` -## Example: Alerting +## Example: alerting on failure -This example creates a `System Flow` to send a Slack alert on any failure or warning state within the `company` namespace. This example uses the Slack webhook secret to notify the `#general` channel about the failed flow. 
+This example creates a system flow that sends a Slack alert on any failure or warning state within the `company` namespace: ```yaml id: alert namespace: system + tasks: - id: send_alert type: io.kestra.plugin.slack.notifications.SlackExecution - url: "{{secret('SLACK_WEBHOOK')}}" # format: https://hooks.slack.com/services/xzy/xyz/xyz + url: "{{secret('SLACK_WEBHOOK')}}" channel: "#general" executionId: "{{trigger.executionId}}" + triggers: - id: alert_on_failure type: io.kestra.plugin.core.trigger.Flow - states: - - FAILED - - WARNING - preconditions: - id: company_namespace - where: - - id: company - filters: - - field: NAMESPACE - type: STARTS_WITH - value: company + dependsOn: + - states: [FAILED, WARNING] + when: "{{ namespace | startsWith('company') }}" +``` + +## Example: mixed success and failure triggers + +You can define multiple Flow triggers on the same flow to react differently to upstream success vs. failure: + +```yaml +triggers: + - id: on_completion + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: flow_a + namespace: company.team + states: [SUCCESS] + - id: on_failure + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: flow_a + namespace: company.team + states: [FAILED] ``` + +## Example: passing upstream outputs downstream + +Reference upstream outputs using the scoped path `trigger.outputs.{flowId}.{outputId}`: + +```yaml +id: flow_b +namespace: company.team + +inputs: + - id: value_from_a + type: STRING + +tasks: + - id: hello + type: io.kestra.plugin.core.log.Log + message: "{{ inputs.value_from_a }}" + +triggers: + - id: upstream_dep + type: io.kestra.plugin.core.trigger.Flow + inputs: + value_from_a: "{{ trigger.outputs.flow_a.return_value }}" + dependsOn: + - flowId: flow_a + namespace: company.team + states: [SUCCESS] +``` + +:::alert{type="info"} +`dependsOn` condition IDs are derived from a stable hash of each entry's `namespace`, `flowId`, `when`, `states`, and `labels`.
Reordering entries in the list does not reset accumulated window state. +::: + +## Input rendering failures create FAILED executions + +If an `inputs` expression on a Flow trigger fails to render — for example, because an upstream output key does not exist — Kestra creates a `FAILED` execution instead of silently dropping the event. This makes failures visible in the UI and actionable via alerting. + +## Removed: `preconditions` and `conditions` + +The `preconditions` and `conditions` properties are removed in Kestra 2.0. Flows that still use them will fail to parse after upgrading. Migrate to `dependsOn`. + +See the [trigger conditions migration guide](../../../11.migration-guide/v2.0.0/trigger-conditions-redesign/index.md) for a complete before/after reference. diff --git a/src/contents/docs/05.workflow-components/07.triggers/03.webhook-trigger/index.md b/src/contents/docs/05.workflow-components/07.triggers/03.webhook-trigger/index.md index 66f729e865e..ea0fb438007 100644 --- a/src/contents/docs/05.workflow-components/07.triggers/03.webhook-trigger/index.md +++ b/src/contents/docs/05.workflow-components/07.triggers/03.webhook-trigger/index.md @@ -16,7 +16,6 @@ Each webhook URL requires a secret `key` to secure it. This prevents unauthorize type: io.kestra.plugin.core.trigger.Webhook ``` -A Webhook trigger enables triggering a flow from a webhook URL. When you create the trigger, you must provide a `key`. This `key` is embedded in the webhook URL: `/api/v1/main/executions/webhook/{namespace}/{flowId}/{key}`. For security, use a randomly generated string rather than something easy to guess. Kestra accepts `GET`, `POST`, and `PUT` requests on the webhook URL. Both the request body and headers are automatically available as variables inside your flow. @@ -58,7 +57,7 @@ You can also copy the formed Webhook URL from the **Triggers** tab. ## Webhook response -By default, a webhook trigger answers with JSON. 
When you need the caller to wait for a custom response (e.g., validation handshakes that require `text/plain`), enable `wait` and set the `responseContentType` to `text/plain`. +By default, a webhook trigger responds with JSON. When you need the caller to wait for a custom response (e.g., validation handshakes that require `text/plain`), enable `wait` and set the `responseContentType` to `text/plain`. ```yaml triggers: @@ -70,7 +69,6 @@ triggers: responseContentType: text/plain # optional, defaults to application/json ``` -Behavior: - `wait: true` keeps the HTTP connection open until the flow finishes or hits the trigger’s timeout. - `returnOutputs: true` returns the flow outputs as the HTTP response body (JSON by default). Override with `responseContentType` for plaintext or other formats. @@ -82,6 +80,28 @@ If your flow uses trigger variables (such as `{{ trigger.body }})`, you can test See the [Webhook trigger plugin documentation](/plugins/core/trigger/io.kestra.plugin.core.trigger.webhook) for a full list of properties and outputs. +## Filtering webhook executions with `when` + +Use the `when` property to conditionally fire the trigger based on the request body or headers. The `when` value is a [Pebble expression](../../../expressions/index.mdx) evaluated against the incoming request. If the expression evaluates to a falsy value, Kestra ignores the request and no execution is created. 
+ +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: 4wjtkzwVGBM9yKnjm3yv8r + when: "{{ trigger.body.hello == 'world' }}" +``` + +You can combine multiple criteria in a single expression using `and` / `or`: + +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: 4wjtkzwVGBM9yKnjm3yv8r + when: "{{ trigger.body.event == 'push' and trigger.headers['x-github-event'] == 'push' }}" +``` + ### Return flow outputs in the webhook response To send task outputs back to the caller in the HTTP response, configure the Webhook trigger to wait for the execution and return outputs. The flow must expose at least one `outputs` entry. diff --git a/src/contents/docs/05.workflow-components/07.triggers/04.polling-trigger/index.md b/src/contents/docs/05.workflow-components/07.triggers/04.polling-trigger/index.md index 92c1625e35a..70e340b3631 100644 --- a/src/contents/docs/05.workflow-components/07.triggers/04.polling-trigger/index.md +++ b/src/contents/docs/05.workflow-components/07.triggers/04.polling-trigger/index.md @@ -12,6 +12,8 @@ Polling triggers repeatedly check an external system at a fixed interval. When n Kestra provides polling triggers for a wide variety of systems, including databases, message queues, cloud storage, and FTP servers. +Polling triggers are not limited to external connectors. Some plugins also provide script-based polling triggers, allowing you to run code on an interval and emit only when a condition matches. For example, the script plugins for Python, Shell, Ruby, Go, and Node provide `ScriptTrigger` and `CommandsTrigger` variants for polling with code or commands. + The polling frequency is controlled by the `interval` property. When triggered, the flow has access to the polling results through the `trigger` variable, making the retrieved data immediately available for downstream tasks. 
## Example diff --git a/src/contents/docs/05.workflow-components/07.triggers/06.mcp-tool-trigger/index.md b/src/contents/docs/05.workflow-components/07.triggers/06.mcp-tool-trigger/index.md new file mode 100644 index 00000000000..43427f2c1f6 --- /dev/null +++ b/src/contents/docs/05.workflow-components/07.triggers/06.mcp-tool-trigger/index.md @@ -0,0 +1,101 @@ +--- +title: MCP Tool Trigger in Kestra – Expose Flows as AI Tools +h1: Expose Flows as MCP Tools with the McpToolTrigger +description: Use the McpToolTrigger to expose Kestra flows as tools on an MCP server. AI agents can discover and invoke them automatically, with inputs and outputs mapped to a JSON schema. +sidebarTitle: MCP Tool Trigger +icon: /src/contents/docs/icons/flow.svg +version: ">= 2.0.0" +--- + +Expose a flow as a named tool on a Kestra MCP server. + +The `McpToolTrigger` makes any flow discoverable and callable by MCP-compatible AI agents such as Claude Desktop, Claude Code, and Cursor. Flow inputs are automatically converted to a JSON schema tool spec so the AI agent knows exactly what parameters to pass. Each invocation creates a new flow execution tagged with `system.from:mcp` for observability. + +```yaml +type: io.kestra.plugin.core.trigger.McpToolTrigger +``` + +:::alert{type="info"} +Every tenant has a `default` MCP server provisioned on startup, so the trigger works without creating a server first. See [MCP Server](../../../ai-tools/mcp-server/index.md) to create additional servers and connect AI agent clients. +::: + +## Example + +```yaml +id: hello_world +namespace: company.team + +inputs: + - id: user + type: STRING + defaults: John Doe + description: "The name of the user to greet." + +tasks: + - id: greet + type: io.kestra.plugin.core.output.OutputValues + values: + greeting: "Hello, {{ inputs.user }}!" 
+ +outputs: + - id: greeting + type: STRING + value: "{{ outputs.greet.values.greeting }}" + +triggers: + - id: mcp + type: io.kestra.plugin.core.trigger.McpToolTrigger + toolName: hello_world + title: Hello World greeting tool + toolDescription: Returns a personalised greeting. Call this when the user asks for a greeting. + mcpServer: default +``` + +When deployed, an MCP client connected to the `default` server will discover a tool named `hello_world`. It will accept a `user` parameter (typed as `string` from the flow input) and return a `greeting` string in the tool response. + +## Properties + +| Property | Required | Default | Description | +|---|---|---|---| +| `toolName` | Yes | — | Tool identifier shown to the AI agent. Must match `^[a-z0-9][a-z0-9_-]*$`. | +| `title` | Yes | — | Human-readable name shown to the AI agent. | +| `toolDescription` | No | — | Description of the tool shown to the AI agent, used to decide when to invoke it. | +| `mcpServer` | No | `"default"` | ID of the MCP server to register this tool on. Must match the `id` of an existing [MCP server](../../../ai-tools/mcp-server/index.md). | +| `annotations.readOnly` | No | `false` | Hint that this tool does not modify its environment. | +| `annotations.destructive` | No | `true` | Hint that this tool may perform destructive updates. Only meaningful when `readOnly` is `false`. | +| `annotations.openWorld` | No | `false` | Hint that this tool may interact with entities outside its closed domain. | +| `annotations.idempotent` | No | `false` | Hint that calling the tool repeatedly with the same arguments has no additional effect. Only meaningful when `readOnly` is `false`. | + +Annotations are informational hints for MCP clients. They do not affect execution behavior. + +A flow can be registered on exactly one MCP server at a time via the `mcpServer` property. Multiple flows can share the same server, each appearing as a separate tool. 
+ +### Writing effective tool descriptions + +The `toolDescription` is what the AI agent reads to decide whether to call your tool. Describe *when* and *why* to invoke the flow, not just what it does. For example: + +```yaml +toolDescription: > + Returns a personalised greeting for a named user. + Call this tool whenever the user asks to be greeted or wants a welcome message. +``` + +## Input and output mapping + +Flow inputs and outputs are automatically mapped to the MCP tool's input and output schema. + +- Each flow `input` becomes a tool parameter. The `description` field on the input is passed to the AI agent as the parameter description. +- Flow `outputs` are returned in the tool response. Each output's `displayName` is used as the label in the response. + +To constrain the structure of a `JSON`-type input, use the `jsonSchema` property. See [JSON input validation](../../05.inputs/index.md#input-validation). + +## Observability + +Every execution created via MCP carries two [system labels](../../../06.concepts/system-labels/index.md): + +| Label | Value | +|---|---| +| `system.from` | `mcp` | +| `system.mcpServerId` | The `id` of the MCP server that invoked the tool | + +Filter executions by `system.from: mcp` in the Executions view to see all MCP-triggered runs. diff --git a/src/contents/docs/05.workflow-components/07.triggers/index.mdx b/src/contents/docs/05.workflow-components/07.triggers/index.mdx index 3663edd070d..b7eaffbaeac 100644 --- a/src/contents/docs/05.workflow-components/07.triggers/index.mdx +++ b/src/contents/docs/05.workflow-components/07.triggers/index.mdx @@ -21,23 +21,22 @@ A trigger is a mechanism that automatically starts the execution of a flow. > -Triggers can be either scheduled or event-based, giving you flexibility in how you automate workflow execution. +Triggers can be either scheduled or event-based. ## Trigger types -Kestra supports both **scheduled** and **external** events. 
- -Kestra supports five core trigger types: +Kestra supports six core trigger types: - [Schedule trigger](./01.schedule-trigger/index.md) allows you to execute your flow on a regular cadence e.g. using a CRON expression and custom scheduling conditions. - [Flow trigger](./02.flow-trigger/index.md) allows you to execute your flow when another flow finishes its execution (based on a configurable list of states). - [Webhook trigger](./03.webhook-trigger/index.md) allows you to execute your flow based on an HTTP request emitted by a webhook. - [Polling trigger](./04.polling-trigger/index.md) allows you to execute your flow by polling external systems for the presence of data. - [Realtime trigger](./05.realtime-trigger/index.md) allows you to execute your flow when events happen with millisecond latency. +- [MCP Tool trigger](./06.mcp-tool-trigger/index.md) allows you to expose your flow as a named tool on a Kestra MCP server, making it callable by AI agents such as Claude Desktop, Claude Code, and Cursor. Many other triggers are available from the plugins, such as triggers based on file detection events, e.g. the [S3 trigger](/plugins/plugin-aws/s3/io.kestra.plugin.aws.s3.trigger), or a new message arrival in a message queue, such as the [SQS](/plugins/plugin-aws/sqs/io.kestra.plugin.aws.sqs.realtimetrigger) or [Kafka trigger](/plugins/plugin-kafka/io.kestra.plugin.kafka.trigger). -### Trigger Common Properties +### Trigger common properties The following properties are common to all triggers: @@ -49,14 +48,15 @@ The following properties are common to all triggers: | `disabled` | Set it to `true` to disable execution of the trigger. | | `allowConcurrent` | Set it to `true` to allow multiple executions from this trigger to run at the same time. | | `workerGroup.key` | To execute this trigger on a specific Worker Group (EE). | +| `when` | A Pebble expression that must evaluate to `true` for the trigger to fire. 
| ## Trigger variables Triggers expose metadata through expressions. For example: -– `{{ trigger.date }}` returns the current date for the [Schedule trigger](./01.schedule-trigger/index.md) -– `{{ trigger.uri }}` returns the file or message for file detection or message arrival events -– `{{ trigger.rows }}` provides query results for triggers like [PostgreSQL Query](/plugins/plugin-jdbc-postgres/io.kestra.plugin.jdbc.postgresql.trigger) trigger. +- `{{ trigger.date }}` returns the current date for the [Schedule trigger](./01.schedule-trigger/index.md) +- `{{ trigger.uri }}` returns the file or message for file detection or message arrival events +- `{{ trigger.rows }}` provides query results for triggers like the [PostgreSQL Query](/plugins/plugin-jdbc-postgres/io.kestra.plugin.jdbc.postgresql.trigger) trigger This example will log the date when the trigger executes the flow: @@ -97,28 +97,56 @@ triggers: cron: "@hourly" ``` -## Conditions +## `when` and `dependsOn` + +Kestra 2.0 replaces the old `conditions` list on triggers with two composable properties: + +### `when` — Pebble expression (all triggers) + +Every trigger type supports a `when` property. It accepts a [Pebble expression](../../expressions/index.mdx) that is evaluated at trigger time. If the expression evaluates to a falsy value (`false`, `0`, empty string), the trigger does not fire. + +Use `when` to express time-based conditions on Schedule triggers, to filter Webhook payloads, or to add a global guard on any trigger type: + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + when: "{{ not isWeekend(trigger.date) and not isPublicHoliday(trigger.date, 'FR') }}" +``` + +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: 4wjtkzwVGBM9yKnjm3yv8r + when: "{{ trigger.body.hello == 'world' }}" +``` -Conditions are criteria that determine when a trigger should create a new execution. 
Usually, they limit the scope of a trigger to a specific set of cases. +For calendar-based scheduling, helper functions like `isWeekend()`, `isPublicHoliday()`, `isDayWeekInMonth()`, and `dayOfWeek()` are available. See [date and calendar helpers](../../expressions/index.mdx#date-and-calendar-helpers) in the expressions reference. -For example, you can restrict a Flow trigger to a specific namespace prefix or execution status, and you can restrict a Schedule trigger to a specific time of the week or month. +### `dependsOn` — upstream flow dependencies (Flow trigger only) -You can pass a list of conditions; in this case, all the conditions must match to enable the current action. +The [Flow trigger](./02.flow-trigger/index.md) replaces both `conditions` and `preconditions` with a `dependsOn` list. Each entry declares one upstream flow that must complete in a matching state. All entries must be satisfied before the trigger fires. -Available conditions include: +```yaml +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: stg_sales + namespace: company.team + - flowId: stg_marketing + namespace: company.team + window: + deadline: "09:00:00+01:00" +``` -- [HasRetryAttempt](/plugins/core/condition/io.kestra.plugin.core.condition.hasretryattempt) -- [MultipleCondition](/plugins/core) -- [Not](/plugins/core/condition/io.kestra.plugin.core.condition.not) -- [Or](/plugins/core/condition/io.kestra.plugin.core.condition.or) -- [ExecutionFlow](/plugins/core/condition/io.kestra.plugin.core.condition.executionflow) -- [ExecutionNamespace](/plugins/core/condition/io.kestra.plugin.core.condition.executionnamespace) -- [ExecutionLabels](/plugins/core/condition/io.kestra.plugin.core.condition.executionlabels) -- [ExecutionStatus](/plugins/core/condition/io.kestra.plugin.core.condition.executionstatus) -- [ExecutionOutputs](/plugins/core/condition/io.kestra.plugin.core.condition.executionoutputs) -- 
[Expression](/plugins/core/condition/io.kestra.plugin.core.condition.expression) +See the [Flow trigger documentation](./02.flow-trigger/index.md) for the full `dependsOn` and `window` property reference. -You can also find datetime related conditions [on the Schedule trigger page](./01.schedule-trigger/index.md#schedule-conditions). +:::alert{type="warning"} +The `conditions` list is removed in Kestra 2.0. Flows that still use it will fail to parse after upgrading. See the [trigger conditions migration guide](../../11.migration-guide/v2.0.0/trigger-conditions-redesign/index.md) for before/after examples. +::: ## Unlocking, enabling, and disabling triggers @@ -213,12 +241,12 @@ When you add that flow to Kestra, you'll see that no Executions are created. To ## The `stopAfter` property -Kestra 0.15 introduced a generic `stopAfter` property which is a list of states that will disable the trigger after the flow execution has reached one of the states in the list. +The `stopAfter` property is a list of states that disable the trigger after the flow execution reaches one of those states. This property is most useful with `Schedule` triggers and polling-based triggers such as HTTP, JDBC, or File Detection. :::alert{type="info"} -Note that we don't handle any automatic trigger reenabling logic. After a trigger has been disabled due to the `stopAfter` state condition, you can take some action based on it and manually reenable the trigger. +Kestra does not automatically re-enable a trigger after it has been disabled by `stopAfter`. You must re-enable it manually once you are ready to resume. ::: ### Pause the schedule trigger after a failed execution @@ -268,7 +296,7 @@ triggers: - SUCCESS ``` -Let's break down the above example: +This example works as follows: 1. The HTTP trigger will poll the API endpoint every 30 seconds to check if the price of a product is below $110. 2. 
If the condition is met, the Execution will be created diff --git a/src/contents/docs/05.workflow-components/09.plugin-defaults/index.md b/src/contents/docs/05.workflow-components/09.plugin-defaults/index.md index e2745ae8726..e69bb48f91c 100644 --- a/src/contents/docs/05.workflow-components/09.plugin-defaults/index.md +++ b/src/contents/docs/05.workflow-components/09.plugin-defaults/index.md @@ -15,7 +15,7 @@ They work like default function arguments, helping you avoid repetition when tas -## Plugin Defaults on a flow-level +## Plugin defaults at the flow level You can define plugin defaults in the `pluginDefaults` section to avoid repeating properties across multiple tasks of the same type. For example: @@ -66,22 +66,11 @@ pluginDefaults: containerImage: python:slim ``` -In this example, Docker and Python configurations are defined once in `pluginDefaults`, instead of being repeated in every task. This approach helps to streamline the configuration process and reduce the chances of errors caused by inconsistent settings across different tasks. - -:::alert{type="info"} -If you move required attributes into `pluginDefaults`, the UI code editor may show warnings about missing arguments, because defaults are only resolved at runtime. As long as `pluginDefaults` contains the relevant arguments, you can save the flow and ignore the warning displayed in the editor. - -![pluginDefaultsWarning](./warning.png) - -::: - -### `forced` attribute in `pluginDefaults` - -Setting `forced: true` in `pluginDefaults` ensures that default values override any properties defined directly in the task. By default, the value of the `forced` attribute is `false`. +In this example, Docker and Python configurations are defined once in `pluginDefaults` rather than repeated in every task. ## Plugin defaults in a global configuration -Plugin defaults can also be defined globally in your Kestra configuration, applying the same values across all flows. 
This is useful when you want to apply the same defaults across multiple flows. Let's say that you want to centrally manage the default values for the `io.kestra.plugin.aws` plugin to reuse the same credentials and region across all your flows. You can add the following to your Kestra configuration: +Plugin defaults can also be defined globally in your Kestra configuration, applying the same values across all flows. To centrally manage credentials for the `io.kestra.plugin.aws` plugin, add the following to your Kestra configuration: ```yaml kestra: @@ -94,7 +83,13 @@ kestra: region: "us-east-1" ``` -If you want to set defaults only for a specific task, you can do that too: +Global plugin defaults must be configured under `kestra.plugins.defaults`. + +:::alert{type="info"} +The legacy `kestra.tasks.defaults` property is still supported for backward compatibility, but it is deprecated. Use `kestra.plugins.defaults` for all new configurations. +::: + +To set defaults for a specific task type only: ```yaml kestra: @@ -126,17 +121,35 @@ kestra: This is equivalent to writing the same nested structure directly in a task. The `forced: true` attribute ensures these defaults override any values set at the task level. +### Precedence of global, flow, and task values + +Kestra applies plugin defaults in this order: + +1. Global plugin defaults from `kestra.plugins.defaults` +2. Flow-level `pluginDefaults` +3. Properties defined directly on the task + +That means flow-level defaults override global defaults, and task properties override flow-level defaults. + +Global configuration and namespace-level plugin defaults support a `forced` property. Setting `forced: true` on a global or namespace-level default makes it override any value set directly on the task. This is intended for governance use cases — for example, enforcing a specific task runner across all flows in a namespace. 
+ ## Plugin Defaults Enterprise Edition :::alert{type="info"} In the [Enterprise Edition](../../07.enterprise/index.mdx) or [Kestra Cloud](/cloud), plugin defaults can be configured directly in the UI under the **Plugin Defaults** tab of a Namespace. ::: -You can create them via form or directly as YAML code for the Namespace: +You can create them from the Namespace UI using the guided form or YAML editor: ![Plugin Default Form Creation](./plugin-default-creation.png) -Or click on **YAML** and, for example, paste the following: +The add/edit dialog lets you: + +- choose a predefined plugin type or enter a custom plugin type +- switch between a form view and a YAML view +- preview the generated YAML for an existing plugin default + +If you switch to **YAML**, you can paste content such as: ```yaml - type: io.kestra.plugin.aws.s3.Upload @@ -148,8 +161,12 @@ Or click on **YAML** and, for example, paste the following: ### Inherited Plugin Defaults -Plugin Defaults are inherited from the parent Namespace to children Namespaces. In the example above, the image shows the Plugin Default was created in the `kestra.company` Namespace. Navigating to the **Plugin Defaults** tab of a child Namespace, for example `kestra.company.data`, shows the parent Namespace's Plugin Defaults. This avoids having to recreate Plugin Defaults across children Namespaces, but it still allows for the children Namespaces to maintain their own isolated defaults if needed. +Plugin Defaults are inherited from the parent Namespace to children Namespaces. In the example above, the image shows the Plugin Default was created in the `kestra.company` Namespace. Navigating to the **Plugin Defaults** tab of a child Namespace, for example `kestra.company.data`, shows the parent Namespace's Plugin Defaults together with the Namespace they come from. 
This avoids having to recreate Plugin Defaults across children Namespaces, but it still allows for the children Namespaces to maintain their own isolated defaults if needed. ![Plugin Default Inheritance](./inherited-plugin-defaults.png) +### Import and export + +From the Namespace **Plugin Defaults** tab, you can also export the current Namespace plugin defaults to YAML and import them into another Namespace. This is useful when promoting a curated set of defaults across environments or teams. +
diff --git a/src/contents/docs/05.workflow-components/09.plugin-defaults/warning.png b/src/contents/docs/05.workflow-components/09.plugin-defaults/warning.png deleted file mode 100644 index 0e6fc02c8cc..00000000000 Binary files a/src/contents/docs/05.workflow-components/09.plugin-defaults/warning.png and /dev/null differ diff --git a/src/contents/docs/05.workflow-components/18.sla/index.md b/src/contents/docs/05.workflow-components/18.sla/index.md index 67f6555e4d5..75e439cd7ec 100644 --- a/src/contents/docs/05.workflow-components/18.sla/index.md +++ b/src/contents/docs/05.workflow-components/18.sla/index.md @@ -109,12 +109,10 @@ tasks: triggers: - id: alert_on_failure type: io.kestra.plugin.core.trigger.Flow - labels: - sla: miss - states: - - FAILED - - WARNING - - CANCELLED + dependsOn: + - labels: + sla: miss + states: [FAILED, WARNING, CANCELLED] ``` :::alert{type="info"} diff --git a/src/contents/docs/06.concepts/02.namespace-files/index.md b/src/contents/docs/06.concepts/02.namespace-files/index.md index d559bb14326..1bad6f506a6 100644 --- a/src/contents/docs/06.concepts/02.namespace-files/index.md +++ b/src/contents/docs/06.concepts/02.namespace-files/index.md @@ -35,7 +35,7 @@ tasks: tasks: - id: return type: io.kestra.plugin.core.debug.Return - format: "{{ json(taskrun.value) }}" + format: "{{ fromJson(taskrun.value) }}" triggers: - id: query_trigger diff --git a/src/contents/docs/06.concepts/05.kv-store/index.md b/src/contents/docs/06.concepts/05.kv-store/index.md index f6abb8e085b..043f26994bf 100644 --- a/src/contents/docs/06.concepts/05.kv-store/index.md +++ b/src/contents/docs/06.concepts/05.kv-store/index.md @@ -132,7 +132,7 @@ tasks: values: my_key: "{{ kv('my_key') }}" simple_string: "{{ kv('simple_string') }}" - favorite_song: "{{ json(kv('json_kv')).song }}" + favorite_song: "{{ fromJson(kv('json_kv')).song }}" ``` You can use the `io.kestra.plugin.core.kv.Set` task to create or modify any KV pair. 
When modifying existing values, you can leverage the `overwrite` boolean parameter to control whether to overwrite the existing value or fail if a value for that key already exists. By default, the `overwrite` parameter is set to `true` so that the existing value is always updated. @@ -215,9 +215,9 @@ tasks: ### Read and parse JSON-type values from KV pairs -To parse JSON values in Kestra's templated expressions, make sure to wrap the `kv()` call in the `json()` function like the following: `"{{ json(kv('your_json_key')).json_property }}"`. +To parse JSON values in Kestra's templated expressions, wrap the `kv()` call in the `fromJson()` function: `"{{ fromJson(kv('your_json_key')).json_property }}"`. -The following example demonstrates how to parse values from JSON-type KV pairs in a flow: +This example sets a JSON KV pair and reads individual fields using `fromJson()`: ```yaml id: kv_json_flow namespace: company.team @@ -239,10 +239,10 @@ tasks: - id: parse_json_kv type: io.kestra.plugin.core.log.Log message: - - "Author: {{ json(kv('favorite_song')).author }}" - - "Song: {{ json(kv('favorite_song')).song }}" - - "Album name: {{ json(kv('favorite_song')).album.name }}" - - "Album release date: {{ json(kv('favorite_song')).album.release_date }}" + - "Author: {{ fromJson(kv('favorite_song')).author }}" + - "Song: {{ fromJson(kv('favorite_song')).song }}" + - "Album name: {{ fromJson(kv('favorite_song')).album.name }}" + - "Album release date: {{ fromJson(kv('favorite_song')).album.release_date }}" - id: get type: io.kestra.plugin.core.kv.Get @@ -250,7 +250,7 @@ tasks: - id: parse_json_from_kv type: io.kestra.plugin.core.log.Log - message: "Country: {{ json(outputs.get.value).album.name }}" + message: "Album name: {{ fromJson(outputs.get.value).album.name }}" ``` diff --git a/src/contents/docs/06.concepts/system-labels/index.md b/src/contents/docs/06.concepts/system-labels/index.md index ad533e022cb..5755329767b 100644 --- 
a/src/contents/docs/06.concepts/system-labels/index.md +++ b/src/contents/docs/06.concepts/system-labels/index.md @@ -84,3 +84,17 @@ Once this label is set, the editor for this flow will be disabled in the UI. :::alert{type="info"} In the Enterprise Edition, updating a read-only flow server-side is restricted to service accounts or API keys. ::: + +--- + +### `system.from` + +- Automatically set on every execution created by a Kestra MCP server +- Value is always `mcp` +- Use this label to filter all executions triggered by AI agents via the [McpToolTrigger](../../05.workflow-components/07.triggers/06.mcp-tool-trigger/index.md) + +### `system.mcpServerId` + +- Automatically set on every execution created by a Kestra MCP server +- Value is the `id` of the MCP server that invoked the tool +- Use this label together with `system.from: mcp` to identify which server triggered a specific execution diff --git a/src/contents/docs/07.enterprise/02.governance/07.namespace-management/index.md b/src/contents/docs/07.enterprise/02.governance/07.namespace-management/index.md index c44f11e6220..85b01d00932 100644 --- a/src/contents/docs/07.enterprise/02.governance/07.namespace-management/index.md +++ b/src/contents/docs/07.enterprise/02.governance/07.namespace-management/index.md @@ -70,10 +70,20 @@ When building new flows in a Namespace, Namespace secrets are accessible from th Plugin Defaults can also be defined at the Namespace level. These plugin defaults are then applied for all tasks of the corresponding type defined in the flows under the same Namespace. -On the Namespaces page, select the Namespace where you want to define the plugin defaults and navigate to the **Plugin defaults** tab. You can add the plugin defaults here and save the changes by clicking on the **Save** button at the bottom of the page. +On the Namespaces page, select the Namespace where you want to define the plugin defaults and navigate to the **Plugin Defaults** tab. 
![Define Plugin Defaults](./plugindefaults-namespaces.png) +From there, you can: + +- add a plugin default with a guided form +- switch between predefined plugin types and a custom plugin type +- switch between form mode and YAML mode +- preview the YAML for an existing plugin default +- export plugin defaults from the current Namespace +- import plugin defaults from a YAML file +- inspect inherited plugin defaults together with the parent Namespace they come from + You can reference secrets and variables defined with the same Namespace in the plugin defaults. In the example below, you no longer need to add the `password` property for the MySQL query task as it's defined in your Namespace-level `pluginDefaults`: @@ -91,6 +101,8 @@ tasks: fetchOne: true ``` +Namespace-level plugin defaults are inherited by child Namespaces. This makes it possible to define shared defaults once in a parent Namespace and let child Namespaces reuse them while still adding their own overrides when needed. + ### Default service account for SDK plugins Namespaces can now provide **default authentication credentials** that [SDK-based plugins](/plugins/plugin-kestra) use to run tasks such as [List all Namespaces](/plugins/plugin-kestra/kestra-namespaces/io.kestra.plugin.kestra.namespaces.list). This allows tasks relying on the [Kestra SDK](../../../api-reference/kestra-sdk/index.mdx) to call the API without hard-coding credentials inside the flow. 
@@ -192,7 +204,7 @@ github: token: "{{ secret('GITHUB_TOKEN') }}" ``` -Then, create another file for `task_defaults_marketing.yml`: +Then, create another file for `plugin_defaults_marketing.yml`: ```yaml - type: io.kestra.plugin.aws @@ -213,7 +225,7 @@ resource "kestra_namespace" "marketing" { namespace_id = "marketing" description = "Namespace for the marketing team" variables = file("variables_marketing.yml") - task_defaults = file("task_defaults_marketing.yml") + plugin_defaults = file("plugin_defaults_marketing.yml") } ``` @@ -293,10 +305,7 @@ kestra_password = "your-kestra-password" ``` ## Allowed Namespaces -When you navigate to any Namespace and go to the Edit tab, you can explicitly configure which Namespaces are allowed to access flows and other resources related to that Namespace. By default, all Namespaces are allowed: - -![allowed-namespaces](./allowed-namespaces.png) -However, you can restrict that access if you want only specific Namespaces (or no Namespace at all) to trigger its corresponding resources. +When you navigate to any Namespace and go to the **Edit** tab, you can explicitly configure which Namespaces are allowed to access flows and other resources related to that Namespace. -![allowed-namespaces-2](./allowed-namespaces-2.png) +By default, **all Namespaces** are allowed. To restrict access, **select specific Namespaces** — access automatically extends to each selected Namespace's children. 
diff --git a/src/contents/docs/07.enterprise/02.governance/custom-blueprints/index.md b/src/contents/docs/07.enterprise/02.governance/custom-blueprints/index.md index 833e09c56a9..c97eed779b1 100644 --- a/src/contents/docs/07.enterprise/02.governance/custom-blueprints/index.md +++ b/src/contents/docs/07.enterprise/02.governance/custom-blueprints/index.md @@ -225,3 +225,91 @@ pluginDefaults: password: '{{ secret("ORACLE_USERNAME") }}' ``` ::: + +## Version control for Custom Blueprints + +Custom Blueprints can be version-controlled with Git using two dedicated tasks from the `plugin-ee-git` plugin: + +- [PushBlueprints](/plugins/plugin-ee-git/io.kestra.plugin.ee.git.PushBlueprints) commits and pushes blueprints from Kestra to a Git repository. +- [SyncBlueprints](/plugins/plugin-ee-git/io.kestra.plugin.ee.git.SyncBlueprints) syncs blueprints from a Git repository into Kestra, treating Git as the single source of truth. + +These tasks mirror the [PushFlows and SyncFlows patterns](../../../version-control-cicd/04.git/index.md) used for flows, applied to Custom Blueprints. + +### Push blueprints to Git + +Use `PushBlueprints` to export your blueprints from Kestra into a Git repository. This is useful for creating backups, reviewing changes via pull requests, or promoting blueprints across environments. + +Each blueprint is written as a YAML file to the target `gitDirectory` (default: `_blueprints`). Use the `blueprints` property with glob patterns to push only a subset of blueprints. + +```yaml +id: push_blueprints +namespace: system + +tasks: + - id: commit_and_push + type: io.kestra.plugin.ee.git.PushBlueprints + url: https://github.com/your-org/blueprints-repo + username: git_username + password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" + branch: main + commitMessage: "push blueprints from {{ flow.namespace ~ '.' 
~ flow.id }}" + +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 * * * *" +``` + +The task outputs a `commitId`, a `commitURL`, and a `blueprints` URI pointing to a diff report that lists the number of lines added, deleted, and changed per file. + +### Sync blueprints from Git + +Use `SyncBlueprints` to pull blueprints from Git into Kestra. This is the recommended pattern when Git is your single source of truth — for example, when platform teams manage approved blueprint libraries centrally and deploy them across multiple Kestra instances. + +By default, `SyncBlueprints` only adds and updates blueprints. Set `delete: true` to also remove any blueprints present in Kestra but absent in Git. + +```yaml +id: sync_blueprints_from_git +namespace: system + +tasks: + - id: git + type: io.kestra.plugin.ee.git.SyncBlueprints + url: https://github.com/your-org/blueprints-repo + branch: main + username: git_username + password: "{{ secret('GITHUB_ACCESS_TOKEN') }}" + delete: true + dryRun: true + +triggers: + - id: every_full_hour + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 * * * *" +``` + +Set `dryRun: true` to preview what would change without applying it. The `blueprints` output URI contains a row-per-blueprint report showing each blueprint's `syncState`: `ADDED`, `UPDATED`, `UNCHANGED`, or `DELETED`. + +Use caution with `delete: true` — it removes all blueprints not present in Git, not just those that differ. + +### Blueprint YAML file format + +Both tasks read and write blueprints as YAML files. 
Each file represents one blueprint: + +```yaml +id: my-blueprint-id +title: My Blueprint Title +description: Optional description of what this blueprint does +tags: + - tag1 + - tag2 +flow: | + id: my-flow + namespace: company.team + tasks: + - id: hello + type: io.kestra.plugin.core.log.Log + message: Hello World +``` + +The `id` field controls how blueprints are matched on sync: if a blueprint with that ID already exists in Kestra, it is updated; if not, it is created with that ID. If `id` is omitted, a new blueprint is created with an auto-generated ID. diff --git a/src/contents/docs/07.enterprise/02.governance/logshipper/index.md b/src/contents/docs/07.enterprise/02.governance/logshipper/index.md index 136bf2fe8f2..d6b7e62edbf 100644 --- a/src/contents/docs/07.enterprise/02.governance/logshipper/index.md +++ b/src/contents/docs/07.enterprise/02.governance/logshipper/index.md @@ -18,7 +18,7 @@ Manage and distribute logs across your entire infrastructure. Log Shipper can distribute Kestra logs from across your instance to an external logging platform. Log synchronization fetches logs and batches them into optimized chunks automatically. The batch process is done intelligently through defined synchronization points. Once batched, the Log Shipper delivers consistent and reliable data to your monitoring platform. -Log Shipper is built on top of [Kestra plugins](/plugins), ensuring it can integrate with popular logging platforms and expand as more plugins are developed. Supported observability platforms include ElasticSearch, Datadog, New Relic, Azure Monitor, Google Operational Suite, AWS Cloudwatch, Splunk, OpenSearch, and OpenTelemetry. +Log Shipper is built on top of [Kestra plugins](/plugins), ensuring it can integrate with popular logging platforms and expand as more plugins are developed. 
Supported observability platforms include ElasticSearch, Datadog, New Relic, Azure Monitor, Google Operational Suite, AWS Cloudwatch, Splunk, OpenSearch, OpenTelemetry, and Dash0. ## Log shipper properties @@ -490,6 +490,32 @@ tasks: chunk: 1000 ``` +### Dash0 + +This example exports logs to [Dash0](https://www.dash0.com/) via OTLP/HTTP. Set `endpoint` to the ingestion URL for your Dash0 region. Set `dataset` to route logs to a named dataset, or omit it to use the Dash0 `default` dataset. + +```yaml +id: log_shipper +namespace: company.team + +triggers: + - id: daily + type: io.kestra.plugin.core.trigger.Schedule + cron: "@daily" + +tasks: + - id: log_export + type: io.kestra.plugin.ee.core.log.LogShipper + logLevelFilter: INFO + lookbackPeriod: P1D + logExporters: + - id: dash0LogExporter + type: io.kestra.plugin.ee.dash0.LogExporter + endpoint: https://ingress.eu-west-1.aws.dash0.com/v1/logs + authToken: "{{ secret('DASH0_AUTH_TOKEN') }}" + dataset: my-dataset +``` + ## Audit log shipper To send [Audit Logs](../06.audit-logs/index.md) to an external system, there is the Audit Log Shipper task type. The Audit Log Shipper task extracts logs from the Kestra backend and loads them to desired destinations including Datadog, Elasticsearch, New Relic, OpenTelemetry, AWS CloudWatch, Google Operational Suite, and Azure Monitor. 
diff --git a/src/contents/docs/07.enterprise/02.governance/read-only-secrets/index.md b/src/contents/docs/07.enterprise/02.governance/read-only-secrets/index.md index 2b725734d7d..43f13b84879 100644 --- a/src/contents/docs/07.enterprise/02.governance/read-only-secrets/index.md +++ b/src/contents/docs/07.enterprise/02.governance/read-only-secrets/index.md @@ -202,11 +202,11 @@ After saving the flow and executing, we can see that Kestra successfully accesse ## Filter secrets by tags -When integrating an external secrets manager in read-only mode, you can filter which secrets are visible in Kestra by matching [tags](../secrets-manager/index.md#default-tags). This is supported for AWS Secrets Manager, Azure Key Vault, and Google Secret Manager. +When integrating an external secrets manager in read-only mode, you can filter which secrets are visible in Kestra by matching [tags](../secrets-manager/index.md#default-tags). Set `read-only: true` and configure `filter-on-tags` with the key/value pairs to match. -- Set `read-only: true` and configure `filter-on-tags.tags` as a map of key/value pairs to match. - -Below are example configurations for AWS Secrets Manager, Azure Key Vault, and Google Secret Manager: +:::alert{type="info"} +AWS Secrets Manager, Azure Key Vault, and Google Secret Manager use a nested `tags` sub-key under `filter-on-tags`. All other providers accept `filter-on-tags` as a flat map of key/value pairs. 
+::: ```yaml kestra: @@ -241,6 +241,168 @@ kestra: application: kestra-production ``` +```yaml +kestra: + secret: + type: vault + read-only: true + vault: + filter-on-tags: + application: kestra-production +``` + +```yaml +kestra: + secret: + type: cyberark + read-only: true + cyberark: + filter-on-tags: + application: kestra-production +``` + +```yaml +kestra: + secret: + type: doppler + read-only: true + doppler: + filter-on-tags: + application: kestra-production +``` + +```yaml +kestra: + secret: + type: 1password + read-only: true + 1password: + filter-on-tags: + application: kestra-production +``` + +```yaml +kestra: + secret: + type: beyondtrust + read-only: true + beyondtrust: + filter-on-tags: + application: kestra-production +``` + +```yaml +kestra: + secret: + type: delinea + read-only: true + delinea: + filter-on-tags: + application: kestra-production +``` + +## Exclude secrets by tags + +Use `excluded-tags` to hide secrets from Kestra based on their tags. Any secret whose tags match at least one key-value pair in `excluded-tags` is excluded from Kestra's view, even if it would otherwise be included by `filter-on-tags`. This filter applies only when `read-only: true` is set. + +When both `filter-on-tags` and `excluded-tags` are configured, a secret must match all entries in `filter-on-tags` and must not match any entry in `excluded-tags`. 
+ +The following examples exclude secrets tagged `hidden: "true"` for each supported provider: + +```yaml +kestra: + secret: + type: aws-secret-manager + read-only: true + aws-secret-manager: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: azure-key-vault + read-only: true + azure-key-vault: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: google-secret-manager + read-only: true + google-secret-manager: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: vault + read-only: true + vault: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: cyberark + read-only: true + cyberark: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: doppler + read-only: true + doppler: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: 1password + read-only: true + 1password: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: beyondtrust + read-only: true + beyondtrust: + excluded-tags: + hidden: "true" +``` + +```yaml +kestra: + secret: + type: delinea + read-only: true + delinea: + excluded-tags: + hidden: "true" +``` + +:::alert{type="info"} +AWS Secrets Manager does not support negative tag filtering in its `ListSecrets` API. Kestra evaluates `excluded-tags` client-side after fetching the secret list from AWS. For CyberArk, Doppler, 1Password, Vault, BeyondTrust, and Delinea, both `filter-on-tags` and `excluded-tags` are also evaluated client-side. +::: + ## Filter secrets by prefix For AWS Secrets Manager, you can also filter secrets by a name prefix when using read-only mode. Use `filter-on-prefix.prefix` to select secrets whose names start with the given prefix and `filter-on-prefix.keep-prefix` to control whether the prefix is kept in the Kestra secret key. 
diff --git a/src/contents/docs/07.enterprise/02.governance/unit-tests/index.md b/src/contents/docs/07.enterprise/02.governance/unit-tests/index.md index f1d9af0ee7c..fa68bd28f0f 100644 --- a/src/contents/docs/07.enterprise/02.governance/unit-tests/index.md +++ b/src/contents/docs/07.enterprise/02.governance/unit-tests/index.md @@ -34,10 +34,11 @@ The following diagram illustrates the structure of flows and unit tests together ## Configuration -Unit tests are written in YAML like flows. A test is made up of `testCases`, and each test case is made up of `fixtures` and `assertions`. Fixtures can target **files**, **inputs**, **tasks**, or **triggers** depending on what you need to mock or override. Like flows, you can write unit tests as code, in No Code, or with the [AI Copilot](../../../ai-tools/ai-copilot/index.md). +Unit tests are written in YAML like flows. A test is made up of `testCases`, and each test case is made up of `fixtures`, `assertions`, and an optional `expectedState`. Fixtures can target **files**, **inputs**, **tasks**, or **triggers** depending on what you need to mock or override. Like flows, you can write unit tests as code, in No Code, or with the [AI Copilot](../../../ai-tools/ai-copilot/index.md). - A **fixture** refers to the setup required before a test runs, such as initializing objects or configuring environments, to ensure the test has a consistent starting state. - An **assertion** is a statement that checks if a specific condition is true during the test. If the condition is false, the test fails, indicating an issue with the code being tested, while true indicates the expectation is met. +- **expectedState** sets the terminal state the flow must reach for the test to pass. It defaults to `SUCCESS`; set it to `FAILED`, `WARNING`, `KILLED`, or any other valid state to test intentional failure paths. Common fixture types: - **files**: provide inline files or namespace file URIs the flow can read. 
@@ -419,6 +420,44 @@ In this example: This approach allows you to test the complete flow logic while avoiding the overhead and complexity of executing actual scripts during testing. +## Assert expected failure state + +Some flows are designed to fail when conditions are not met — for example, a validation guard that uses `io.kestra.plugin.core.execution.Fail` to reject invalid inputs. The `expectedState` property on a test case lets you assert that a flow ends in a specific terminal state. It defaults to `SUCCESS`; set it to `FAILED`, `WARNING`, `KILLED`, or any other valid state. + +The following flow fails when the supplied quantity is not positive: + +```yaml +id: order_validation +namespace: company.team + +inputs: + - id: quantity + type: INT + +tasks: + - id: validate_quantity + type: io.kestra.plugin.core.execution.Fail + condition: "{{ inputs.quantity <= 0 }}" + errorMessage: "Order quantity must be greater than zero" +``` + +The test asserts that passing a negative value causes the expected failure: + +```yaml +id: order_validation_tests +namespace: company.team +flowId: order_validation +testCases: + - id: invalid_quantity_should_fail + type: io.kestra.core.tests.flow.UnitTest + expectedState: FAILED + fixtures: + inputs: + quantity: -1 +``` + +When `expectedState` is set, the test passes only if the execution ends in exactly that state. If it ends in a different state, the test fails and reports both the expected and actual states. + ## Available assertion operators While the above example uses `isNotNull` and `contains` as assertion operators, there are many more that can be used when designing unit tests for your flows. 
The complete list is as follows: diff --git a/src/contents/docs/07.enterprise/02.governance/worker-isolation/index.md b/src/contents/docs/07.enterprise/02.governance/worker-isolation/index.md index 942b42dc0cd..1c550cf5cf0 100644 --- a/src/contents/docs/07.enterprise/02.governance/worker-isolation/index.md +++ b/src/contents/docs/07.enterprise/02.governance/worker-isolation/index.md @@ -76,7 +76,7 @@ For [Bash tasks](/plugins/plugin-script-shell/io.kestra.plugin.scripts.shell.scr ```yaml kestra: - tasks: + plugins: defaults: - type: io.kestra.plugin.scripts.shell.Commands forced: true @@ -86,6 +86,8 @@ kestra: type: io.kestra.plugin.scripts.runner.docker.Docker ``` +`kestra.tasks.defaults` still works for backward compatibility, but `kestra.plugins.defaults` is the current and recommended property. + Forced plugin defaults: - Ensure a property is set globally for a task, and no task can override it. - Are critical for security and governance — for example, to enforce Shell tasks to run as Docker containers. diff --git a/src/contents/docs/07.enterprise/03.auth/invitations/index.md b/src/contents/docs/07.enterprise/03.auth/invitations/index.md index a6c1cdc7b1f..239caadf6df 100644 --- a/src/contents/docs/07.enterprise/03.auth/invitations/index.md +++ b/src/contents/docs/07.enterprise/03.auth/invitations/index.md @@ -41,6 +41,8 @@ You can check the box to **Create user directly (skip invitation)** if one is no When a user receives an invitation, they can click on the link in the email to accept it. The user will be redirected to the Kestra login page, where they set up their account (i.e., create a password), or log in using SSO if it's enabled. +If password-based login is enabled, the password they choose must satisfy the instance password policy configured under `kestra.security.basic-auth`. See [Security and Secrets configuration](../../../configuration/05.security-and-secrets/index.md) for the available password policy settings. 
+ ## Invite expiration time Users have 7 days to accept the invitation. After this period, the invitation will expire and must be reissued. diff --git a/src/contents/docs/07.enterprise/03.auth/rbac/index.md b/src/contents/docs/07.enterprise/03.auth/rbac/index.md index 33cf4f0e873..df48b9ebcc5 100644 --- a/src/contents/docs/07.enterprise/03.auth/rbac/index.md +++ b/src/contents/docs/07.enterprise/03.auth/rbac/index.md @@ -74,6 +74,7 @@ A Permission is a resource that can be accessed by a User or Group. Open the fol - `APP` - `AI_COPILOT` - `APPEXECUTION` +- `MCP_SERVER` - `TEST` - `ASSET` - `USER` @@ -107,6 +108,20 @@ Example (Flows): For a complete CRUD-to-endpoint mapping for every permission, see the [Permissions Reference](./permissions-reference/index.md). ::: +### MCP server permissions + +`MCP_SERVER` is a first-class RBAC resource that controls access to [Kestra MCP servers](../../../ai-tools/mcp-server/index.md). Supported actions are `VIEW`, `LIST`, `CREATE`, `UPDATE`, and `DELETE`. + +Default role assignments: + +| Role | Actions granted | +|---|---| +| Admin | All (`VIEW`, `LIST`, `CREATE`, `UPDATE`, `DELETE`) | +| Editor / Developer | All (`VIEW`, `LIST`, `CREATE`, `UPDATE`, `DELETE`) | +| Viewer | `VIEW`, `LIST` | + +In addition to these permissions, access to a **private** MCP server is also flow-scoped: a user can connect to a private server only if they have `FLOW.EXECUTE` on at least one namespace that contains a flow with an `McpToolTrigger` pointing at that server. + ### Currently supported roles Currently, Kestra only creates an **Admin** role by default. That role grants full access to **all resources**. 
diff --git a/src/contents/docs/07.enterprise/03.auth/sso/ldap/index.md b/src/contents/docs/07.enterprise/03.auth/sso/ldap/index.md index 5cc2e82d8a4..fd56249209f 100644 --- a/src/contents/docs/07.enterprise/03.auth/sso/ldap/index.md +++ b/src/contents/docs/07.enterprise/03.auth/sso/ldap/index.md @@ -1,7 +1,7 @@ --- -title: "LDAP Authentication in Kestra: Directory Login" +title: "LDAP Authentication in Kestra: Directory Login and Group Sync" h1: Connect Your LDAP Directory for User Login and Group Sync -description: Enable LDAP authentication in Kestra. Connect your existing LDAP directory to manage user login and group synchronization securely. +description: Enable LDAP authentication in Kestra. Use your LDAP directory for user login, group synchronization, or both — including alongside an existing SSO provider. sidebarTitle: LDAP icon: /src/contents/docs/icons/admin.svg editions: ["EE"] @@ -10,10 +10,18 @@ version: "0.22.0" Enable LDAP authentication in Kestra to authenticate users against your existing directory and sync group memberships automatically. +## Configure LDAP authentication + +Enable LDAP authentication in Kestra to authenticate users against your existing directory, sync group memberships, or both. You can also use LDAP solely for group sync while keeping an existing SSO provider for login. +
+:::alert{type="warning"} +LDAP is a licensed feature. If `micronaut.security.ldap.default` is configured but your license does not include LDAP, Kestra will refuse to start with the error: `LDAP is not supported by your license`. Contact your Kestra account team to enable it. +::: + ## What is LDAP Lightweight directory access protocol (LDAP) allows applications to quickly query user information. Organizations use directories to store usernames, passwords, email addresses, and other static data. LDAP is an open, vendor-neutral protocol for accessing and managing that data. @@ -22,12 +30,22 @@ With Kestra, you can use an existing LDAP directory to authenticate users and sy ## Configuration -LDAP is configured under the security context of your [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) file. +LDAP is configured under the security context of your [Kestra Security and Secrets configuration](../../../../configuration/05.security-and-secrets/index.md) file. [LDAP with Micronaut](https://micronaut-projects.github.io/micronaut-security/4.11.3/guide/#ldap) supports `context`, `search`, and `groups` as core configuration properties supported out of the box. These properties define the connection context, user attribute mapping, and group filtering needed to synchronize users and their group memberships with Kestra. The `user-attributes` section maps LDAP attributes such as `givenName`, `sn`, and `mail` to the corresponding Kestra user properties (first name, last name, and email). +Below are example configurations with Kestra-specific properties on top of the Micronaut configuration. + +The `mode` property controls how Kestra uses the LDAP connection: + +| Mode | Description | +|---|---| +| `AUTHENTICATION` | LDAP handles user login only. No group sync. **This is the default.** | +| `AUTHENTICATION_AND_GROUP_SYNC` | LDAP handles both user login and group membership sync. 
| +| `GROUP_SYNC_ONLY` | LDAP is used only to resolve group memberships. Users log in via an existing SSO provider. | + The examples below extend the base Micronaut LDAP configuration with these Kestra-specific mappings. ### Unix configuration @@ -37,6 +55,7 @@ micronaut: security: ldap: default: + mode: AUTHENTICATION_AND_GROUP_SYNC # or AUTHENTICATION to skip group sync user-attributes: firstName: givenName lastName: sn @@ -58,6 +77,7 @@ micronaut: base: "ou=groups,dc=example,dc=org" filter: "{&(objectClass=posixGroup)(memberUid={0})}" filter-attribute: uid + attribute: cn ``` ### Windows configuration @@ -68,6 +88,7 @@ micronaut: ldap: default: enabled: true + mode: AUTHENTICATION_AND_GROUP_SYNC # or AUTHENTICATION to skip group sync user-attributes: firstName: givenName lastName: sn @@ -89,6 +110,7 @@ micronaut: base: "DC=domain,DC=local" filter: "(&(objectClass=group)(member={0}))" filter-attribute: dn + attribute: cn ``` Key points for Windows Active Directory: @@ -144,9 +166,51 @@ Get-ADGroupMember -Identity "CN=Auto,OU=Distro,OU=Groups,DC=kestra,DC=local" | S Replace the identity string with the DN of your target group. +### Group sync with SSO (GROUP_SYNC_ONLY) + +If your users already authenticate via SSO, Basic auth, or Passwordless, you can use LDAP solely to resolve group memberships without changing how users log in. Set `mode: GROUP_SYNC_ONLY` and configure the `groups` block. No `user-attributes` mapping is required. + +```yaml +micronaut: + security: + ldap: + default: + mode: GROUP_SYNC_ONLY + context: + server: "ldap://localhost:389" + manager-dn: "cn=admin,dc=kestra,dc=io" + manager-password: "LDAP_ADMIN_PASSWORD" + search: + base: "ou=users,dc=kestra,dc=io" + filter: "(mail={0})" + groups: + enabled: true + base: "ou=groups,dc=kestra,dc=io" + filter: "(member={0})" + attribute: cn +``` + +With this configuration: +- Users log in using their SSO provider. LDAP credentials are never checked. 
+- At each login, Kestra queries the LDAP directory for the user's group memberships and merges them with any groups sourced from OIDC claims. +- Groups found in LDAP are synced to Kestra using the same rules as standard LDAP group sync — new groups are created automatically, and membership is updated on login. + +Three `groups` properties control how Kestra reads group entries from the directory: +- `filter`: the LDAP search filter used to find groups for a user. `{0}` is replaced with the user's distinguished name (DN). +- `attribute`: the attribute on the group entry whose value becomes the Kestra group name. Defaults to `cn`. +- `filter-attribute`: the user entry attribute substituted into `{0}` in the group filter. Use `dn` for directories that store full DNs in group membership attributes (common in Active Directory). Use `uid` for POSIX-style directories. + +:::alert{type="info"} +`GROUP_SYNC_ONLY` mode requires that the user already exists in Kestra (created on first login). LDAP group sync fires on every subsequent login. +::: + +:::alert{type="warning"} +If the LDAP server is unreachable or misconfigured, group sync fails silently — the user logs in successfully but receives no LDAP-sourced groups. Check server connectivity and `groups` configuration if group assignments are not appearing after login. +::: + ## LDAP users in Kestra -Once LDAP is configured, when a user logs into Kestra for the first time, their credentials are validated against the LDAP directory, and a corresponding user is created in Kestra. If a matching account already exists in Kestra, the user is authenticated using their LDAP credentials. +Once LDAP is configured, when a user logs into Kestra for the first time using LDAP authentication, their credentials are validated against the LDAP directory and a corresponding user is created in Kestra. If a matching account already exists, the user is authenticated using their LDAP credentials. 
If they are a part of any groups specified in the directory, those groups will be added to Kestra. If the group already exists in Kestra, they will be automatically added. If a user is added to a group after their initial login, they must log out and log back in for the new group assignment to sync, as synchronization occurs only at login. Any user authenticated via LDAP will show `LDAP` as their Authentication method in the **IAM - Users** tab in Kestra. @@ -154,6 +218,10 @@ If they are a part of any groups specified in the directory, those groups will b Any updates to a user and their group access on the LDAP server will update in Kestra at the next synchronization (typically at the next login). +:::alert{type="info"} +Users who log in via SSO with `GROUP_SYNC_ONLY` mode show their SSO provider as their Authentication method in the IAM Users tab, not `LDAP`. The LDAP connection is used only to resolve group memberships in the background. +::: + :::alert{type="warning"} If a user is deleted from the LDAP server, they will lose access to Kestra at the next synchronization or login attempt. ::: diff --git a/src/contents/docs/10.administrator-guide/03.monitoring/index.md b/src/contents/docs/10.administrator-guide/03.monitoring/index.md index 0ba73487b95..20a2c025108 100644 --- a/src/contents/docs/10.administrator-guide/03.monitoring/index.md +++ b/src/contents/docs/10.administrator-guide/03.monitoring/index.md @@ -50,22 +50,17 @@ tasks: triggers: - id: listen type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionStatus - in: - - FAILED - - WARNING - - type: io.kestra.plugin.core.condition.ExecutionNamespace - namespace: company.analytics - prefix: true + dependsOn: + - states: [FAILED, WARNING] + when: "{{ namespace | startsWith('company.analytics') }}" ``` Adding this single flow will ensure that you receive a Slack alert on any flow failure in the `company.analytics` namespace. 
Here is an example alert notification: ![alert notification](../../03.tutorial/06.errors/alert-notification.png) -:::alert{type="warning"} -To send this alert on failure across multiple namespaces, add an `OrCondition` to the `conditions` list. See the example below: +:::alert{type="info"} +To alert on failures across multiple namespaces or specific flows, use `mode: ANY` with multiple `dependsOn` entries. The trigger fires when any entry is satisfied: ```yaml id: alert namespace: company.system @@ -80,53 +75,17 @@ tasks: triggers: - id: listen type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionStatus - in: - - FAILED - - WARNING - - type: io.kestra.plugin.core.condition.Or - conditions: - - type: io.kestra.plugin.core.condition.ExecutionNamespace - namespace: company.product - prefix: true - - type: io.kestra.plugin.core.condition.ExecutionFlow - flowId: cleanup - namespace: company.system -``` -::: - -The example above works correctly. However, if you list the conditions without using `OrCondition`, no alerts will be sent because Kestra will try to match all conditions simultaneously. Since there’s no overlap between them, the conditions cancel each other out. 
See the example below: - -```yaml -id: bad_example -namespace: company.monitoring -description: This example will not work - -tasks: - - id: send - type: io.kestra.plugin.slack.notifications.SlackExecution - url: "{{ secret('SLACK_WEBHOOK') }}" - channel: "#general" - executionId: "{{trigger.executionId}}" - -triggers: - - id: listen - type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionStatus - in: - - FAILED - - WARNING - - type: io.kestra.plugin.core.condition.ExecutionNamespace - namespace: company.product - prefix: true - - type: io.kestra.plugin.core.condition.ExecutionFlow - flowId: cleanup + mode: ANY + dependsOn: + - states: [FAILED, WARNING] + when: "{{ namespace | startsWith('company.product') }}" + - flowId: cleanup namespace: company.system + states: [FAILED, WARNING] ``` +::: -Here, there's no overlap between the two conditions. The first condition will only match executions in the `company.product` namespace, while the second condition will only match executions from the `cleanup` flow in the `company.system` namespace. To match executions from the `cleanup` flow in the `company.system` namespace **or** any execution in the `product` namespace, use `OrCondition`. +The example above fires when either any `company.product` flow fails or the specific `cleanup` flow in `company.system` fails. `mode: ANY` means the trigger fires as soon as one entry is satisfied — you do not need to combine everything into a single expression. 
## Monitoring diff --git a/src/contents/docs/10.administrator-guide/basic-auth-troubleshooting/index.md b/src/contents/docs/10.administrator-guide/basic-auth-troubleshooting/index.md index 136d70bdfcd..d534287dcf6 100644 --- a/src/contents/docs/10.administrator-guide/basic-auth-troubleshooting/index.md +++ b/src/contents/docs/10.administrator-guide/basic-auth-troubleshooting/index.md @@ -22,6 +22,8 @@ Since Basic Authentication is now required, the `enabled` flag is ignored and sh For production deployments, set a valid email address and a strong password in the configuration file. +If you use the Setup page to create credentials, the password must satisfy the password policy configured under `kestra.security.basic-auth`. See [Security and Secrets configuration](../../configuration/05.security-and-secrets/index.md) for the available password policy settings. + There are four possible scenarios for existing users. ### Scenario 1: The `enabled` flag is set to `true` diff --git a/src/contents/docs/10.administrator-guide/open-telemetry/index.md b/src/contents/docs/10.administrator-guide/open-telemetry/index.md index 5d1617bd94c..ae2b76a11b7 100644 --- a/src/contents/docs/10.administrator-guide/open-telemetry/index.md +++ b/src/contents/docs/10.administrator-guide/open-telemetry/index.md @@ -54,6 +54,35 @@ Kestra propagates the trace context so that traces are correlated: - Flow execution traces correlate with parent flows when the `Subflow` or `ForEachItem` task is used. - External HTTP calls include the standard propagation header for downstream correlation. +### Propagate trace context to scripts + +Scripts run in isolated containers, so OTel spans they generate start a new root trace by default. Pass `{{ trace.parent }}` as the `TRACEPARENT` environment variable to parent those spans under the Kestra task span. + +`{{ trace.parent }}` holds the W3C [traceparent](https://www.w3.org/TR/trace-context/) header; it is empty when tracing is disabled. 
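For orientation, a W3C `traceparent` value has four dash-separated lowercase hex fields: version, trace ID, parent span ID, and trace flags. The sketch below is a minimal, standard-library-only illustration of that layout; `parse_traceparent` is a hypothetical helper name, not part of Kestra or the OpenTelemetry SDK, and it does not reject the all-zero IDs that the specification treats as invalid.

```python
import re

# W3C traceparent layout: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Split a traceparent value into its fields; return None if empty or malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # Bit 0 of the flags byte is the "sampled" flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


# Sample value in the format used by the W3C Trace Context specification.
example = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
```

A script can use a check like this to skip span creation when `TRACEPARENT` is empty, which is the case when tracing is disabled.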
+ +```yaml +id: traced_script +namespace: company.team + +tasks: + - id: run_python + type: io.kestra.plugin.scripts.python.Script + env: + TRACEPARENT: "{{ trace.parent }}" + script: | + from opentelemetry.propagate import extract + from opentelemetry.sdk.trace import TracerProvider + import os + + ctx = extract({"traceparent": os.environ.get("TRACEPARENT", "")}) + tracer = TracerProvider().get_tracer(__name__) + + with tracer.start_as_current_span("my-span", context=ctx): + pass # spans here appear as children of the Kestra task span +``` + +`TRACEPARENT` is recognized by all major OTel SDKs and works the same way for Node.js, Bash, and any other script type. The variable is also usable in HTTP task headers and any other [expression-capable property](../../expressions/index.md#default-execution-context-variables). + ### Example: Jaeger with Docker Compose Enable [Jaeger](https://www.jaegertracing.io), an OpenTelemetry-compatible tracing platform, with Kestra in a Docker Compose configuration file: diff --git a/src/contents/docs/10.administrator-guide/prometheus-metrics/index.md b/src/contents/docs/10.administrator-guide/prometheus-metrics/index.md index 80805c8e8e8..8a8e41b3c69 100644 --- a/src/contents/docs/10.administrator-guide/prometheus-metrics/index.md +++ b/src/contents/docs/10.administrator-guide/prometheus-metrics/index.md @@ -40,6 +40,12 @@ Executor server exclusive: * `kestra_executor_execution_message_process_seconds_max` (gauge): Maximum observed duration of a single execution message processed by the Executor. * `kestra_executor_execution_started_count_total` (counter): The total number of executions started by the Executor. * `kestra_executor_flowable_execution_count_total` (counter): The total number of flowable tasks executed by the Executor +* `kestra_executor_loop_delay_duration_seconds` (summary): Execution delay loop duration inside the Executor. 
+* `kestra_executor_loop_delay_duration_seconds_max` (gauge): Maximum observed execution delay loop duration inside the Executor. +* `kestra_executor_loop_sla_duration_seconds` (summary): SLA monitor loop duration inside the Executor. +* `kestra_executor_loop_sla_duration_seconds_max` (gauge): Maximum observed SLA monitor loop duration inside the Executor. +* `kestra_executor_processing_flow_trigger_duration_seconds` (summary): Flow trigger processing duration inside the Executor. +* `kestra_executor_processing_flow_trigger_duration_seconds_max` (gauge): Maximum observed flow trigger processing duration inside the Executor. * `kestra_executor_taskrun_created_count_total` (counter): The total number of tasks created by the Executor. * `kestra_executor_taskrun_ended_count_total` (counter): The total number of tasks ended by the Executor. * `kestra_executor_taskrun_ended_duration_seconds` (summary): Task duration inside the Executor. diff --git a/src/contents/docs/10.administrator-guide/purge/index.md b/src/contents/docs/10.administrator-guide/purge/index.md index 2db0d6db334..76bb9468e34 100644 --- a/src/contents/docs/10.administrator-guide/purge/index.md +++ b/src/contents/docs/10.administrator-guide/purge/index.md @@ -11,7 +11,7 @@ Use purge tasks to remove old executions, logs, and key-value pairs, helping red To keep storage optimized, use [`io.kestra.plugin.core.execution.PurgeExecutions`](/plugins/core/execution/io.kestra.plugin.core.execution.purgeexecutions), [`io.kestra.plugin.core.log.PurgeLogs`](/plugins/core/log/io.kestra.plugin.core.log.purgelogs), and [`io.kestra.plugin.core.kv.PurgeKV`](/plugins/core/kv/io.kestra.plugin.core.kv.purgekv). - `PurgeExecutions`: deletes execution records -- `PurgeLogs`: removes both `Execution` and `Trigger` logs in bulk +- `PurgeLogs`: removes execution logs and non-execution logs (e.g.
trigger logs) in bulk; use `purgeExecutionLogs` and `purgeNonExecutionLogs` to target each type independently - `PurgeKV`: deletes expired keys globally for a specific namespace Together, these replace the legacy `io.kestra.plugin.core.storage.Purge` task with a **faster and more reliable process (~10x faster)**. @@ -45,6 +45,68 @@ triggers: cron: "@daily" ``` +### Selectively purge execution or trigger logs + +Both `purgeExecutionLogs` and `purgeNonExecutionLogs` default to `true`. Set either to `false` to exclude that log type — for example, to retain execution logs for debugging while still clearing trigger logs. + +Purge only trigger (non-execution) logs: + +```yaml +id: purge-trigger-logs +namespace: company.myteam + +tasks: + - id: purge_logs + type: io.kestra.plugin.core.log.PurgeLogs + endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}" + purgeExecutionLogs: false + +triggers: + - id: daily + type: io.kestra.plugin.core.trigger.Schedule + cron: "@daily" +``` + +Purge only execution logs: + +```yaml +id: purge-execution-logs +namespace: company.myteam + +tasks: + - id: purge_logs + type: io.kestra.plugin.core.log.PurgeLogs + endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}" + purgeNonExecutionLogs: false + +triggers: + - id: daily + type: io.kestra.plugin.core.trigger.Schedule + cron: "@daily" +``` + +The task outputs `executionLogsCount` and `nonExecutionLogsCount` alongside the existing `count` (total), so you can log or alert on how many of each type were removed. + +### Control deletion batch size + +By default, `PurgeLogs` deletes all matching rows in a single transaction. 
Use `batchSize` to split the deletion into smaller batches — useful when purging a large volume of logs to limit transaction size: + +```yaml +id: purge-logs-batched +namespace: company.myteam + +tasks: + - id: purge_logs + type: io.kestra.plugin.core.log.PurgeLogs + endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}" + batchSize: 1000 + +triggers: + - id: daily + type: io.kestra.plugin.core.trigger.Schedule + cron: "@daily" +``` + ## Purge Key-value pairs The example below purges expired Key-value pairs from the `company` Namespace. It's set up as a flow in the [`system`](../../06.concepts/system-flows/index.md) Namespace. diff --git a/src/contents/docs/10.administrator-guide/security-hardening/index.md b/src/contents/docs/10.administrator-guide/security-hardening/index.md index bbebf7e5757..5e090364369 100644 --- a/src/contents/docs/10.administrator-guide/security-hardening/index.md +++ b/src/contents/docs/10.administrator-guide/security-hardening/index.md @@ -22,6 +22,15 @@ Running workflows in isolated environments reduces the impact of potential malic - **Ephemeral compute** — use Kestra's native [Task Runners](../../07.enterprise/04.scalability/task-runners/index.md) to auto-scale ephemeral compute nodes, which are destroyed after each run to ensure no residual state. - **Minimum host permissions** - grant only the OS-level rights required for the runtime; avoid mounting cloud credential files or granting host-level IAM roles directly. +## Transport security (EE only) + +In distributed deployments, Worker Controllers communicate with Workers over gRPC. By default this channel is plaintext. Enterprise Edition supports TLS encryption and mutual TLS (mTLS) to authenticate both sides of the connection: + +- **One-way TLS** — the controller presents a certificate; workers verify it. Encrypts the channel without requiring worker certificates. +- **Mutual TLS (mTLS)** — both controller and worker present certificates. 
Use this when you need strong identity verification between components, not just encryption. + +See [gRPC TLS/mTLS configuration](../../configuration/06.enterprise-and-advanced/index.md#grpc-tlsmtls-ee-only) for setup instructions and a full property reference. + ## Plugin and code validation To prevent the execution of malicious code, you can implement several strategies: diff --git a/src/contents/docs/11.migration-guide/v2.0.0/checks-condition-renamed-when/index.md b/src/contents/docs/11.migration-guide/v2.0.0/checks-condition-renamed-when/index.md new file mode 100644 index 00000000000..22706c09647 --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/checks-condition-renamed-when/index.md @@ -0,0 +1,43 @@ +--- +title: Check.condition Renamed to when +sidebarTitle: condition → when (Checks) +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +editions: ["OSS", "EE"] +description: The condition property on flow-level checks has been renamed to when in Kestra 2.0.0, aligning it with the when property used across tasks and triggers. +--- + +Kestra 2.0.0 unifies all Pebble conditional expressions under a single property name: `when`. + +The `condition` property on flow-level `checks` is renamed to `when`. A deprecated alias keeps `condition` functional in 2.0.0 so existing flows continue to parse without changes. The alias is scheduled for removal in a future version — update your flows now to avoid a hard break later. + +## Before + +```yaml +checks: + - condition: "{{ inputs.environment == 'production' }}" + message: "This flow can only run in production" + behavior: BLOCK_EXECUTION + style: ERROR +``` + +## After + +```yaml +checks: + - when: "{{ inputs.environment == 'production' }}" + message: "This flow can only run in production" + behavior: BLOCK_EXECUTION + style: ERROR +``` + +The behavior is identical — the same Pebble rendering and `BLOCK_EXECUTION` / `FAIL_EXECUTION` / `CREATE_EXECUTION` logic apply. + +## Migration steps + +1. 
**Search your flows** for `condition:` inside `checks` blocks and replace each occurrence with `when:`. The property value and any Pebble expressions stay the same. +2. **Validate** by saving the updated flows in the Kestra UI or via the API. + +:::alert{type="warning"} +The `condition` alias will be removed in a future release. Flows that still use `condition` inside `checks` will fail to parse after the alias is dropped. +::: diff --git a/src/contents/docs/11.migration-guide/v2.0.0/foreach-loop/index.md b/src/contents/docs/11.migration-guide/v2.0.0/foreach-loop/index.md new file mode 100644 index 00000000000..262faddcf14 --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/foreach-loop/index.md @@ -0,0 +1,469 @@ +--- +title: ForEach and ForEachItem Replaced by Loop +sidebarTitle: ForEach Replaced by Loop +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +editions: ["OSS", "EE"] +description: ForEach and ForEachItem are removed in Kestra 2.0. The Loop task replaces both with isolated sub-executions per iteration, safer expression syntax, and built-in output collection. +--- + +`ForEach` and `ForEachItem` are removed in Kestra 2.0. The `Loop` task replaces both. + +:::alert{type="warning"} +Flows that still reference `io.kestra.plugin.core.flow.ForEach` or `io.kestra.plugin.core.flow.ForEachItem` will fail to parse after upgrading to 2.0.0. Complete this migration before upgrading. +::: + +This guide covers both migrations. If your flows use `ForEach`, follow the [Migrating `ForEach`](#migrating-foreach) section. If your flows use `ForEachItem`, follow the [Migrating `ForEachItem`](#migrating-foreachitem) section. Both must be complete before you upgrade. + +## Why the change + +`ForEach` ran all iterations as task runs inside the **same execution**. A flow iterating over a large dataset — thousands of files, rows, or API results — could generate tens of thousands of task runs in a single execution. 
That volume would exhaust executor memory and could bring down the entire Kestra instance. The failure was not isolated to the flow that caused it; it affected every other flow running at the same time. + +`ForEachItem` addressed this by dispatching each batch to a separate subflow execution, which kept the task run count manageable. But it required splitting your logic into a separate flow, passing data through inputs and outputs across a flow boundary, and managing the lifecycle of subflow executions. For simple per-item processing, this overhead was hard to justify. + +`Loop` fixes the stability problem and simplifies the model. Each iteration runs as an **isolated sub-execution**, so no single flow can generate unbounded task runs in one execution. The child tasks live inline — no separate flow required — and each iteration's failure is contained. A badly-sized loop degrades gracefully rather than destabilizing the instance. + +The expression syntax also improves. In `ForEach`, accessing the current value from inside a nested flowable required `parent.taskrun.value` or `parents[0].taskrun.value`. In `Loop`, `item` is bound to the sub-execution itself, so all tasks — including those inside nested `If`, `Parallel`, or other flowables — access it as `item.value` with no parent traversal required. + +## Expression quick reference + +The table below maps every `ForEach` expression to its `Loop` equivalent. The sections that follow show complete before-and-after examples for each pattern. 
+ +| ForEach expression | Loop equivalent | Notes | +|---|---|---| +| `taskrun.value` | `item.value` | | +| `taskrun.iteration` | `item.index` | Zero-based in both | +| `parent.taskrun.value` | `item.value` | No prefix needed — `item` is accessible from any depth inside a Loop sub-execution, including inside `If` or `Parallel` | +| `parents[0].taskrun.value` | `item.parent.value` | Only when inside an inner of two nested Loops; when used inside a nested flowable (If, Parallel) within a single Loop, it maps to `item.value` instead | +| `parents[1].taskrun.value` | `item.parents[1].value` | One level further up | +| `outputs.task_id[taskrun.value].value` | `outputs.task_id.value` | Inside the iteration; task outputs are scoped to the current sub-execution | +| `outputs.foreach_id[value].field` (after the loop) | `outputs.loop_id.outputs[n].outputs.output_id` (by index) or `loopOutputs(outputs.loop_id.outputs, 'output_id')` (all values as a list) | Outside the loop; outputs are now a list — key-based access by value string is no longer supported | + +## Pattern quick reference + +The table below maps the common 1.0 iteration patterns to their 2.0 equivalents. The expression quick reference above covers the variable renames; this table covers the broader structural changes. + +| 1.0 pattern | 2.0 replacement | What changed | +|---|---|---| +| `ForEach` | `Loop` | Same shape. Adds typed `outputs:`, `finally:`, and `item.parents[N]`. | +| `ForEach + If` | `Loop + If` | No change to the inner `If` task — update expressions only. | +| `ForEachItem` | `Loop + Subflow` | Isolation is now opt-in via `Subflow`. Same per-batch execution model, explicit instead of implicit. | +| `subflowOutputs` | `outputs: [{id, type, value}]` | Per-iteration outputs are declared, not auto-surfaced. | +| `taskrun.value` / `taskrun.iteration` | `item.value` / `item.index` | Variable rename only. 
| +| Built-in batch aggregation | `Concat + Aggregate` | Reduce is its own task — compose explicitly after the loop. | + +## Migrating `ForEach` + +For most flows, the `ForEach` migration has three parts: replace the task type, update expressions inside child tasks, and update any post-loop output access. The sections below cover each pattern with before-and-after examples. + +### Basic iteration + +Replace `io.kestra.plugin.core.flow.ForEach` with `io.kestra.plugin.core.flow.Loop`. Inside child tasks, replace `{{ taskrun.value }}` with `{{ item.value }}` and `{{ taskrun.iteration }}` with `{{ item.index }}`. + +**Before** + +```yaml +tasks: + - id: for_each + type: io.kestra.plugin.core.flow.ForEach + values: ["value 1", "value 2", "value 3"] + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "value={{ taskrun.value }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["value 1", "value 2", "value 3"] + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "index={{ item.index }} value={{ item.value }}" +``` + +### Nested flowables + +In `ForEach`, tasks inside a nested `If` or `Parallel` had to traverse up to the ForEach task run with `parent.taskrun.value` or `parents[0].taskrun.value`. In `Loop`, `item` is bound to the sub-execution and is accessible directly from any depth — no traversal needed. 
+ +**Before** + +```yaml +tasks: + - id: for_each + type: io.kestra.plugin.core.flow.ForEach + values: ["value 1", "value 2", "value 3"] + tasks: + - id: check + type: io.kestra.plugin.core.flow.If + condition: '{{ taskrun.value == "value 2" }}' + then: + - id: matched + type: io.kestra.plugin.core.log.Log + message: "Matched at {{ parents[0].taskrun.value }}" + else: + - id: skipped + type: io.kestra.plugin.core.log.Log + message: "Skipped: {{ parents[0].taskrun.value }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["value 1", "value 2", "value 3"] + tasks: + - id: check + type: io.kestra.plugin.core.flow.If + condition: '{{ item.value == "value 2" }}' + then: + - id: matched + type: io.kestra.plugin.core.log.Log + message: "Matched at index={{ item.index }}: {{ item.value }}" + else: + - id: skipped + type: io.kestra.plugin.core.log.Log + message: "Skipped: {{ item.value }}" +``` + +### Concurrent execution + +The `concurrencyLimit` property carries over unchanged. Update only the task type and the expressions inside child tasks. + +**Before** + +```yaml +tasks: + - id: for_each + type: io.kestra.plugin.core.flow.ForEach + values: [1, 2, 3, 4, 5] + concurrencyLimit: 3 + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Processing {{ taskrun.value }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: [1, 2, 3, 4, 5] + concurrencyLimit: 3 + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Processing {{ item.value }} (index={{ item.index }})" +``` + +### Nested loops + +Nested loops work the same structurally. The change is how you reference outer loop values: `item.parent.value` replaces `parents[0].taskrun.value`, and for deeper hierarchies `item.parents[n]` replaces `parents[n+1].taskrun.value`. 
+ +**Before** + +```yaml +tasks: + - id: outer + type: io.kestra.plugin.core.flow.ForEach + values: [1, 2, 3] + tasks: + - id: inner + type: io.kestra.plugin.core.flow.ForEach + values: ["a", "b"] + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "outer={{ parents[0].taskrun.value }} inner={{ taskrun.value }}" +``` + +**After** + +```yaml +tasks: + - id: outer + type: io.kestra.plugin.core.flow.Loop + values: [1, 2, 3] + tasks: + - id: inner + type: io.kestra.plugin.core.flow.Loop + values: ["a", "b"] + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "outer={{ item.parent.value }} inner={{ item.value }}" +``` + +### Failure handling + +`ForEach` required wrapping tasks in `AllowFailure` to continue past failures. `Loop` replaces this with a first-class `transmitFailed` property. Set `transmitFailed: false` on the Loop task and remove the `AllowFailure` wrapper entirely. + +**Before** + +```yaml +tasks: + - id: for_each + type: io.kestra.plugin.core.flow.ForEach + values: ["ok", "fail", "ok"] + tasks: + - id: guard + type: io.kestra.plugin.core.flow.AllowFailure + tasks: + - id: maybe_fail + type: io.kestra.plugin.core.flow.If + condition: '{{ taskrun.value == "fail" }}' + then: + - id: do_fail + type: io.kestra.plugin.core.execution.Fail + else: + - id: success + type: io.kestra.plugin.core.log.Log + message: "OK: {{ taskrun.value }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["ok", "fail", "ok"] + transmitFailed: false + tasks: + - id: maybe_fail + type: io.kestra.plugin.core.flow.If + condition: '{{ item.value == "fail" }}' + then: + - id: do_fail + type: io.kestra.plugin.core.execution.Fail + else: + - id: success + type: io.kestra.plugin.core.log.Log + message: "OK: {{ item.value }}" +``` + +### Iterating over a map + +`Loop` adds native support for map values — `ForEach` had no equivalent. 
If your flows previously worked around this limitation by serializing maps or splitting keys and values, you can simplify them. When `values` is a map, `item.key` holds the key and `item.value` holds the associated value. + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: + dev: http://dev.example.com + staging: http://staging.example.com + prod: http://prod.example.com + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "env={{ item.key }} url={{ item.value }}" +``` + +### Iterating over object lists + +If your `ForEach` flow iterated over a list of objects using `fromJson(taskrun.value).field`, the same pattern applies in `Loop` — only the variable name changes. + +**Before** + +```yaml +tasks: + - id: for_each + type: io.kestra.plugin.core.flow.ForEach + values: + - { id: 101, email: "a@example.com" } + - { id: 102, email: "b@example.com" } + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "User {{ fromJson(taskrun.value).id }} -> {{ fromJson(taskrun.value).email }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: + - { id: 101, email: "a@example.com" } + - { id: 102, email: "b@example.com" } + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "User {{ fromJson(item.value).id }} -> {{ fromJson(item.value).email }}" +``` + +`item.value` is always a string when list elements are not plain strings. Never access `item.value.field` directly — use `fromJson(item.value).field`. + +### Outputs + +In `ForEach`, task outputs from all iterations were automatically merged into a single map in the parent execution, keyed by `taskrun.value`. Any task after the loop could access `outputs.task_id[value].field` without any extra configuration. + +In `Loop`, each iteration runs in its own sub-execution. Task outputs inside an iteration are **not** visible outside the loop by default. 
You must explicitly declare which values to expose using the `outputs` property on the Loop task, and set `fetchType` to control how they are stored and accessed downstream: + +| `fetchType` | Downstream access | When to use | +|---|---|---| +| `FETCH` | `outputs.loop_id.outputs` — in-memory list of all iterations | Small iteration counts | +| `STORE` | `outputs.loop_id.uri` — URI to a file in internal storage | Large iteration counts | +| `FETCH_ONE` | Last iteration's outputs only | When each pass overwrites the previous result | +| `AUTO` | Default; picks automatically based on whether `values` is a URI | General use | +| `NONE` | Outputs not surfaced | When you only care about side effects | + +With `FETCH`, after the loop completes, `outputs.loop_id.outputs` is a list of iteration results — each entry has an `item` object (with `value`, `iteration`, and `key`) and an `outputs` map of the declared output values. Access a single iteration by index: `outputs.loop_id.outputs[n].outputs.output_id`. To extract one output across all iterations as a list, use the [`loopOutputs()` function](../../../expressions/04.functions/04.workflow/index.mdx#loopoutputs). With `STORE`, the results are written to internal storage and exposed as `outputs.loop_id.uri` — use this for loops with large iteration counts where loading everything into memory is not practical.
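To make the `FETCH` result shape concrete, here is a small plain-Python model of the structure described above. It is an illustration only, not Kestra code: `loop_outputs` is a hypothetical stand-in for the `loopOutputs()` expression function, the sample data is invented, and the sketch assumes results are listed in iteration order.

```python
# Illustrative model of a FETCH result: one entry per iteration, each holding
# the iteration's item and a map of the outputs declared on the Loop task.
iterations = [
    {"item": {"value": "a"}, "outputs": {"result": "processed a"}},
    {"item": {"value": "b"}, "outputs": {"result": "processed b"}},
    {"item": {"value": "c"}, "outputs": {"result": "processed c"}},
]


def loop_outputs(entries, output_id):
    """Collect one declared output across all iterations, keeping list order."""
    return [entry["outputs"][output_id] for entry in entries]


# Index-based access to a single iteration's declared output.
first_result = iterations[0]["outputs"]["result"]

# One declared output gathered across every iteration.
all_results = loop_outputs(iterations, "result")
```

The point of the model is that post-loop access is positional (a list index) rather than keyed by the iterated value, which is the main difference from `ForEach`.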
+ +**Before** + +```yaml +tasks: + - id: for_each + type: io.kestra.plugin.core.flow.ForEach + values: ["a", "b", "c"] + tasks: + - id: process + type: io.kestra.plugin.core.debug.Return + format: "processed {{ taskrun.value }}" + + - id: summary + type: io.kestra.plugin.core.log.Log + message: "Results: {{ outputs.process | jq('[.[].value]') }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["a", "b", "c"] + outputs: + - id: result + type: STRING + value: "{{ outputs.process.value }}" + tasks: + - id: process + type: io.kestra.plugin.core.debug.Return + format: "processed {{ item.value }}" + + - id: summary + type: io.kestra.plugin.core.log.Log + message: | + Loop ran {{ outputs.loop.iterationCount }} iterations. + All results: {{ loopOutputs(outputs.loop.outputs, 'result') }} + First result: {{ outputs.loop.outputs[0].outputs.result }} +``` + +## Migrating `ForEachItem` + +`ForEachItem` dispatched each batch to a separate subflow execution. `Loop` offers two migration paths depending on whether you need per-batch execution isolation. + +### Inline processing + +For flows that used `ForEachItem` with `batch.rows: 1`, the migration is a direct substitution: replace the `ForEachItem` block and its subflow with a single `Loop` task with inline tasks. When `values` is an internal storage URI, `Loop` iterates one line per iteration with `item.value` holding the line content. 
+ +**Before** + +```yaml +tasks: + - id: each_item + type: io.kestra.plugin.core.flow.ForEachItem + items: "{{ inputs.file }}" + batch: + rows: 1 + wait: true + namespace: company.team + flowId: process_item + inputs: + item: "{{ taskrun.items }}" +``` + +**After** + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: "{{ inputs.file }}" + tasks: + - id: process + type: io.kestra.plugin.core.log.Log + message: "Processing: {{ item.value }}" +``` + +### Isolated per-batch execution + +If your `ForEachItem` flows relied on subflow-level isolation — separate retries, separate logs, and failure containment per batch — preserve that isolation with `Loop` + `Subflow`. Split the source file into chunk URIs first, then loop over the URIs and call the child flow per batch. + +**Parent flow** + +```yaml +tasks: + - id: split + type: io.kestra.plugin.core.storage.Split + from: "{{ inputs.file }}" + rows: 100 + + - id: per_batch + type: io.kestra.plugin.core.flow.Loop + values: "{{ outputs.split.uris }}" + concurrencyLimit: 4 + fetchType: FETCH + outputs: + - id: result_uri + type: STRING + value: "{{ outputs.run_child.outputs.uri }}" + tasks: + - id: run_child + type: io.kestra.plugin.core.flow.Subflow + namespace: company.team + flowId: process_batch + wait: true + transmitFailed: true + inputs: + batch_uri: "{{ item.value }}" + + - id: concat + type: io.kestra.plugin.core.storage.Concat + files: "{{ loopOutputs(outputs.per_batch.outputs, 'result_uri') }}" + extension: .ion +``` + +**Child flow** (`process_batch`) + +```yaml +inputs: + - id: batch_uri + type: STRING + +tasks: + - id: process + type: io.kestra.plugin.core.debug.Return + format: "{{ inputs.batch_uri }}" + +outputs: + - id: uri + type: STRING + value: "{{ outputs.process.value }}" +``` + +The child flow surfaces its result through a flow-level `outputs:` declaration. 
The parent collects all result URIs with `loopOutputs(outputs.per_batch.outputs, 'result_uri')` and passes them to `Concat`. + +For flows that used `ForEachItem` with batch sizes larger than 1, set `rows` on the `Split` task to your batch size — each chunk URI passed to the child flow then contains that many rows. + +## Migration steps + +1. Search all flows for `io.kestra.plugin.core.flow.ForEach` and replace the task type with `io.kestra.plugin.core.flow.Loop`. +2. Replace every `{{ taskrun.value }}` inside the loop with `{{ item.value }}`. +3. Replace every `{{ taskrun.iteration }}` inside the loop with `{{ item.index }}`. +4. Remove `parent.taskrun.value` and `parents[0].taskrun.value` references inside nested flowables — `{{ item.value }}` works directly at any nesting depth. +5. Update post-loop output access: declare `outputs` on the Loop task and set `fetchType`. With `FETCH`, `outputs.<taskId>.outputs` is an in-memory list of iteration results — access a single entry by index with `outputs.<taskId>.outputs[n].outputs.<outputId>`, or extract one field across all iterations as a list with `{{ loopOutputs(outputs.<taskId>.outputs, '<outputId>') }}`. With `STORE`, results are written to internal storage and exposed as `outputs.<taskId>.uri`. See the [fetchType table](#outputs) in the Outputs section for all modes. +6. Search all flows for `io.kestra.plugin.core.flow.ForEachItem` and migrate to `Loop` with a URI value as described above. +7. Validate each updated flow by saving it in the Kestra UI or via the API and confirming no parse errors.
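Several of the migration steps above begin with a search. The following is a minimal Python sketch of that search, assuming each flow is available as plain YAML text. The function name, the pattern table, and the hint strings are illustrative and not part of Kestra. It only reports candidate lines; it cannot tell whether a `taskrun.value` reference actually sits inside a loop, so each hit still needs a manual review.

```python
import re

# Legacy patterns named in the migration steps, each mapped to a short hint.
# The trailing \b keeps the ForEach pattern from also matching ForEachItem.
LEGACY_PATTERNS = {
    r"io\.kestra\.plugin\.core\.flow\.ForEachItem\b": "migrate to Loop with a URI value",
    r"io\.kestra\.plugin\.core\.flow\.ForEach\b": "replace task type with Loop",
    r"\btaskrun\.value\b": "use item.value inside the loop",
    r"\btaskrun\.iteration\b": "use item.index inside the loop",
}


def find_legacy_references(flow_source: str) -> list[tuple[int, str]]:
    """Return (line_number, hint) pairs for every legacy pattern found."""
    findings = []
    for line_number, line in enumerate(flow_source.splitlines(), start=1):
        for pattern, hint in LEGACY_PATTERNS.items():
            if re.search(pattern, line):
                findings.append((line_number, hint))
    return findings
```

Running it over every flow source before upgrading gives a checklist of lines to revisit; the rewrite itself is better done by hand, because the correct replacement depends on where the reference sits.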
diff --git a/src/contents/docs/11.migration-guide/v2.0.0/index.mdx b/src/contents/docs/11.migration-guide/v2.0.0/index.mdx new file mode 100644 index 00000000000..ee5f13ce743 --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/index.mdx @@ -0,0 +1,25 @@ +--- +title: 2.0.0 +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +description: Migration guide for upgrading from Kestra 1.3 to 2.0.0, covering deprecated properties and trigger types that are now removed. +--- + +import ChildCard from "~/components/docs/ChildCard.astro" + +This guide covers what you need to change when upgrading from **Kestra 1.3** to **2.0.0**. + +:::alert{type="warning"} +**Start from Kestra 1.3.** This guide assumes you are already running the latest **1.3.x** release. If you are on an older version, complete the required metadata migrations before upgrading to 2.0.0: + +- [KV Store and Secrets metadata migration](../v1.1.0/kv-secrets-metadata-migration) (introduced in 1.1) +- [Namespace Files metadata migration](../v1.2.0/namespace-file-migration) (introduced in 1.2) + +If you are upgrading directly from 1.0, the [LTS migration guide (1.0 → 1.3)](../v1.3.0/lts-migration) consolidates all required steps in one pass. +::: + +## What changed in 2.0.0 + +Kestra 2.0.0 removes several task properties and trigger condition types that were deprecated throughout the 1.x series but continued to work as aliases. If you completed the 1.3 migration, the changes in this guide are all that remain before upgrading. 
+ + \ No newline at end of file diff --git a/src/contents/docs/11.migration-guide/v2.0.0/json-function-removed/index.md b/src/contents/docs/11.migration-guide/v2.0.0/json-function-removed/index.md new file mode 100644 index 00000000000..1b7c2f66f86 --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/json-function-removed/index.md @@ -0,0 +1,28 @@ +--- +title: json() Function Removed +sidebarTitle: json() → fromJson() +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +editions: ["OSS", "EE"] +description: The json() Pebble function has been removed in Kestra 2.0.0. Replace all calls to json() with fromJson() — the signature is identical. +--- + +The `json()` Pebble function has been removed in Kestra 2.0.0. Replace every call to `json(...)` with `fromJson(...)`. The function signature and behavior are identical. + +## Before + +```twig +{{ json(outputs.request.body).products[0].id }} +{{ json(kv('my_json_key')).field }} +``` + +## After + +```twig +{{ fromJson(outputs.request.body).products[0].id }} +{{ fromJson(kv('my_json_key')).field }} +``` + +## What to update + +Search your flows and templates for `json(` and replace each occurrence with `fromJson(`. The `json` Pebble test (`{% if x is json %}`) is unrelated and still works — only the function call form changes. diff --git a/src/contents/docs/11.migration-guide/v2.0.0/local-delete-recursive-default/index.md b/src/contents/docs/11.migration-guide/v2.0.0/local-delete-recursive-default/index.md new file mode 100644 index 00000000000..8128b2e88e5 --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/local-delete-recursive-default/index.md @@ -0,0 +1,41 @@ +--- +title: local.Delete recursive Default Changed to false +sidebarTitle: local.Delete recursive Default +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +editions: ["OSS", "EE"] +description: The recursive property of io.kestra.plugin.fs.local.Delete now defaults to false. 
Flows that delete a directory without setting recursive explicitly will stop removing subdirectory contents after upgrading. +--- + +The `recursive` property of `io.kestra.plugin.fs.local.Delete` now defaults to `false` instead of `true`. + +:::alert{type="warning"} +Flows that call `io.kestra.plugin.fs.local.Delete` on a directory without setting `recursive` explicitly will silently stop deleting subdirectory contents after upgrading to 2.0.0. No error is raised — the task will succeed but leave nested files in place. +::: + +## Why the change + +The previous default of `true` made directory deletions recursive without any explicit opt-in. A misconfigured `from` path could wipe an entire directory tree. The new default of `false` matches the behavior of every other `Delete` task in `plugin-fs` (SFTP, FTP, NFS, SMB) and requires an explicit opt-in for recursive deletion. + +## Migration steps + +1. Search all flows for tasks of type `io.kestra.plugin.fs.local.Delete`. +2. For each task where `from` points to a directory and `recursive` is not set, add `recursive: true` to preserve the previous behavior. +3. For tasks where `from` points to a single file, no change is needed — `recursive` has no effect on file targets. 
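The search in steps 1 and 2 can be sketched in Python. This is a minimal example that assumes the flow has already been parsed into plain dictionaries (for instance with a YAML library of your choice); the function name is hypothetical. It cannot tell whether `from` points to a file or a directory, so it lists every `local.Delete` task that leaves `recursive` unset and leaves that decision to you.

```python
DELETE_TYPE = "io.kestra.plugin.fs.local.Delete"


def tasks_missing_recursive(tasks):
    """Return the ids of local.Delete tasks that do not set `recursive`."""
    missing = []
    for task in tasks:
        if task.get("type") == DELETE_TYPE and "recursive" not in task:
            missing.append(task.get("id"))
        # Flowable tasks may nest child tasks under a `tasks` key.
        missing.extend(tasks_missing_recursive(task.get("tasks", [])))
    return missing
```

For each id returned, check whether `from` is a directory and add `recursive: true` only where the old behavior is wanted.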
+ +**Before** (recursive deletion happened implicitly) + +```yaml +- id: cleanup + type: io.kestra.plugin.fs.local.Delete + from: /data/uploads/processed/ +``` + +**After** (opt in to keep the same behavior) + +```yaml +- id: cleanup + type: io.kestra.plugin.fs.local.Delete + from: /data/uploads/processed/ + recursive: true +``` diff --git a/src/contents/docs/11.migration-guide/v2.0.0/plugin-defaults-forced-removed/index.md b/src/contents/docs/11.migration-guide/v2.0.0/plugin-defaults-forced-removed/index.md new file mode 100644 index 00000000000..9eae12835bf --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/plugin-defaults-forced-removed/index.md @@ -0,0 +1,58 @@ +--- +title: pluginDefaults.forced Removed from Flows +sidebarTitle: pluginDefaults.forced Removed from Flows +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +editions: ["OSS", "EE"] +description: The forced property is removed from flow-level pluginDefaults in Kestra 2.0. Use namespace-level Plugin Defaults or global configuration to enforce defaults that tasks cannot override. +--- + +The `forced` property is removed from flow-level `pluginDefaults` in Kestra 2.0. + +:::alert{type="warning"} +Flows that include `forced: true` inside a `pluginDefaults` block will fail to parse after upgrading to 2.0.0. Remove this property before upgrading. +::: + +## Why the change + +`forced: true` in a flow's `pluginDefaults` let a flow author override any value a task explicitly set. This created a security problem: a regular user editing a flow could use `forced: true` to override plugin defaults that a platform administrator had configured at the namespace or tenant level. Platform administrators are now solely responsible for enforcing defaults, and must do so at the namespace or global configuration level. + +Flow-level `pluginDefaults` continue to work for setting convenient defaults — they just can no longer override what a task explicitly declares. + +## Migration steps + +1. 
Search all flows for `pluginDefaults` blocks that include `forced: true`. +2. Remove the `forced: true` line from each flow. +3. If you need to prevent tasks from overriding a default, move that default to the namespace **Plugin Defaults** tab (Enterprise Edition) or to the `kestra.plugins.defaults` section of your global Kestra configuration. + +**Before** + +```yaml +pluginDefaults: + - type: io.kestra.plugin.scripts.runner.docker.Docker + forced: true + values: + pullPolicy: NEVER +``` + +**After** + +```yaml +pluginDefaults: + - type: io.kestra.plugin.scripts.runner.docker.Docker + values: + pullPolicy: NEVER +``` + +To enforce the value so tasks cannot override it, configure `forced: true` at the global or namespace level instead: + +```yaml +# kestra.yml — global configuration +kestra: + plugins: + defaults: + - type: io.kestra.plugin.scripts.runner.docker.Docker + forced: true + values: + pullPolicy: NEVER +``` diff --git a/src/contents/docs/11.migration-guide/v2.0.0/trigger-conditions-redesign/index.md b/src/contents/docs/11.migration-guide/v2.0.0/trigger-conditions-redesign/index.md new file mode 100644 index 00000000000..dcc6990cc41 --- /dev/null +++ b/src/contents/docs/11.migration-guide/v2.0.0/trigger-conditions-redesign/index.md @@ -0,0 +1,1053 @@ +--- +title: Trigger Conditions Redesign +sidebarTitle: Trigger Conditions Redesign +icon: /src/contents/docs/icons/migration-guide.svg +release: 2.0.0 +editions: ["OSS", "EE"] +description: The conditions list on triggers and the preconditions block on Flow triggers are removed in Kestra 2.0. All trigger types use a top-level when Pebble expression. Flow triggers also use dependsOn and window. +--- + +Kestra 2.0 replaces the `conditions` and `preconditions` system across all trigger types. + +- **All trigger types (Schedule, Webhook, HTTP, Flow, and others)** — the `conditions` list is removed in favor of a top-level `when` Pebble expression. 
+- **Flow triggers** — both `conditions` and `preconditions` are removed in favor of `dependsOn` (upstream flow entries) and `window` (time window configuration). +- **Flow trigger outputs** — scoped by flow ID: `trigger.outputs.<flowId>.<outputName>`. +- **Input rendering failures** — now create a `FAILED` execution instead of silently dropping the event. + +Both `conditions` and `preconditions` are removed in Kestra 2.0. Flows that still use them will fail to parse after upgrading. + +## `conditions` → `when` on all triggers + +All trigger types gain a top-level `when` property containing a Pebble expression. When the expression evaluates to `true`, the trigger fires; when `false`, it is skipped. This replaces the `conditions` list, which required a fully qualified Java type for every filtering need and did not compose cleanly across trigger types. + +### `when` expression context + +The variables available in a `when` expression depend on the trigger type: + +| Trigger type | Available variables | +|---|---| +| Schedule | `trigger.date`, `trigger.timestamp` | +| Webhook | `trigger.body`, `trigger.headers` | +| Flow | `namespace`, `flowId`, `state`, `labels`, `outputs`, `hasRetryAttempt` | + +:::alert{type="info"} +**Schedule date skipping:** When a Schedule trigger has a `when` expression, the scheduler evaluates it against each candidate date. If `when` evaluates to `false`, the scheduler skips that date and advances to the next cron-matching date. This is the same behavior as the previous `conditions` on Schedule triggers — `when` controls which scheduled dates fire, not just whether a single date fires. +::: + +### New Pebble helper functions + +These functions are introduced specifically for `when` expressions to replace verbose date formatting patterns: + +| Function | Signature | Description | +|---|---|---| +| `isPublicHoliday` | `isPublicHoliday(date, countryCode[, subDivision])` | Returns `true` if the date is a public holiday. Backed by Jollyday.
Optional third argument for sub-divisions (e.g. `'IDF'`). | +| `isDayWeekInMonth` | `isDayWeekInMonth(date, dayOfWeek, position)` | Returns `true` if the date is the Nth occurrence of a weekday in its month. `position` accepts `FIRST`, `SECOND`, `THIRD`, `FOURTH`, or `LAST`. | +| `isWeekend` | `isWeekend(date)` | Returns `true` if the date falls on Saturday or Sunday. | +| `dayOfWeek` | `dayOfWeek(date)` | Returns the day name as a string (`MONDAY`, `TUESDAY`, …, `SUNDAY`). | +| `hourOfDay` | `hourOfDay(date)` | Returns the hour as an integer (0–23). | +| `dayOfMonth` | `dayOfMonth(date)` | Returns the day of the month as an integer (1–31). | +| `monthOfYear` | `monthOfYear(date)` | Returns the month as an integer (1–12). | + +Existing Pebble filters (`startsWith`, `endsWith`, `date`) and operators (`and`, `or`, `not`, `==`, `!=`, `>`, `<`, `>=`, `<=`) cover the remaining use cases. + +### Schedule: specific day of week + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + conditions: + - type: io.kestra.plugin.core.condition.DayWeek + dayOfWeek: MONDAY +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + when: "{{ dayOfWeek(trigger.date) == 'MONDAY' }}" +``` + +### Schedule: weekends only + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + conditions: + - type: io.kestra.plugin.core.condition.Weekend +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + when: "{{ isWeekend(trigger.date) }}" +``` + +### Schedule: weekdays only (exclude weekends) + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + conditions: + - type: io.kestra.plugin.core.condition.Not + conditions: + - type: io.kestra.plugin.core.condition.Weekend 
+``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + when: "{{ not isWeekend(trigger.date) }}" +``` + +### Schedule: exclude Sundays + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + conditions: + - type: io.kestra.plugin.core.condition.Not + conditions: + - type: io.kestra.plugin.core.condition.DayWeek + dayOfWeek: SUNDAY +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 9 * * *" + when: "{{ dayOfWeek(trigger.date) != 'SUNDAY' }}" +``` + +### Schedule: public holidays + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + conditions: + - type: io.kestra.plugin.core.condition.PublicHoliday + country: FR +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + when: "{{ isPublicHoliday(trigger.date, 'FR') }}" +``` + +With a sub-division: `{{ isPublicHoliday(trigger.date, 'FR', 'IDF') }}`. 
+ +### Schedule: workdays only (not weekend, not public holiday) + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + conditions: + - type: io.kestra.plugin.core.condition.Not + conditions: + - type: io.kestra.plugin.core.condition.PublicHoliday + country: FR + - type: io.kestra.plugin.core.condition.Weekend +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + when: "{{ not isWeekend(trigger.date) and not isPublicHoliday(trigger.date, 'FR') }}" +``` + +### Schedule: first Monday of the month + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * 1" + conditions: + - type: io.kestra.plugin.core.condition.DayWeekInMonth + dayOfWeek: MONDAY + dayInMonth: FIRST +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * 1" + when: "{{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') }}" +``` + +### Schedule: date range + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "*/5 * * * *" + conditions: + - type: io.kestra.plugin.core.condition.DateTimeBetween + after: "2025-12-31T23:59:59Z" + before: "2026-06-30T23:59:59Z" +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "*/5 * * * *" + when: "{{ trigger.date > '2025-12-31T23:59:59Z' and trigger.date < '2026-06-30T23:59:59Z' }}" +``` + +### Schedule: specific hours only + +**Before** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 * * * *" + conditions: + - type: io.kestra.plugin.core.condition.TimeBetween + after: "08:00:00" + before: "17:00:00" +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 * * * *" + when: "{{ 
hourOfDay(trigger.date) >= 8 and hourOfDay(trigger.date) < 17 }}" +``` + +### Schedule: combining multiple conditions + +**Before** (first Monday of the month, skip public holidays in France) + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + conditions: + - type: io.kestra.plugin.core.condition.DayWeekInMonth + dayOfWeek: MONDAY + dayInMonth: FIRST + - type: io.kestra.plugin.core.condition.Not + conditions: + - type: io.kestra.plugin.core.condition.PublicHoliday + country: FR +``` + +**After** + +```yaml +triggers: + - id: schedule + type: io.kestra.plugin.core.trigger.Schedule + cron: "0 11 * * *" + when: "{{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') and not isPublicHoliday(trigger.date, 'FR') }}" +``` + +### Webhook: filter by body + +**Before** + +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: 4wjtkzwVGBM9yKnjm3yv8r + conditions: + - type: io.kestra.plugin.core.condition.Expression + expression: "{{ trigger.body.hello == 'world' }}" +``` + +**After** + +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: 4wjtkzwVGBM9yKnjm3yv8r + when: "{{ trigger.body.hello == 'world' }}" +``` + +### Webhook: filter by header and body + +**Before** + +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: myKey + conditions: + - type: io.kestra.plugin.core.condition.Expression + expression: "{{ trigger.headers['X-Event-Type'] == 'deploy' }}" + - type: io.kestra.plugin.core.condition.Expression + expression: "{{ trigger.body.environment == 'production' }}" +``` + +**After** + +```yaml +triggers: + - id: webhook + type: io.kestra.plugin.core.trigger.Webhook + key: myKey + when: "{{ trigger.headers['X-Event-Type'] == 'deploy' and trigger.body.environment == 'production' }}" +``` + +Multiple `Expression` conditions combine into a single `when` expression using `and` / `or`. 
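Merging several `Expression` conditions into one `when` string is mechanical enough to script. Below is a minimal Python sketch, assuming each old `expression` is a single `{{ ... }}` block; the helper name is hypothetical. It wraps each part in parentheses so that an inner `or` keeps its meaning once joined with `and`, which makes the output slightly more verbose than the hand-written example above.

```python
def combine_when(expressions, operator="and"):
    """Merge several '{{ ... }}' expressions into one `when` expression."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    if not expressions:
        raise ValueError("at least one expression is required")

    inner = []
    for expression in expressions:
        text = expression.strip()
        # Drop the surrounding Pebble delimiters, if present.
        if text.startswith("{{") and text.endswith("}}"):
            text = text[2:-2].strip()
        inner.append(text)

    if len(inner) == 1:
        return "{{ " + inner[0] + " }}"
    joined = f" {operator} ".join(f"({part})" for part in inner)
    return "{{ " + joined + " }}"
```

Review the generated string before committing it; the sketch does no parsing of the expressions themselves.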
+ +### What replaces what + +| Old condition type | New `when` expression | +|---|---| +| `DayWeek` (e.g. MONDAY) | `{{ dayOfWeek(trigger.date) == 'MONDAY' }}` | +| `Weekend` | `{{ isWeekend(trigger.date) }}` | +| `Not` > `Weekend` (weekdays only) | `{{ not isWeekend(trigger.date) }}` | +| `Not` > `DayWeek` SUNDAY (exclude Sundays) | `{{ dayOfWeek(trigger.date) != 'SUNDAY' }}` | +| `PublicHoliday` (country: FR) | `{{ isPublicHoliday(trigger.date, 'FR') }}` | +| `Not` > `PublicHoliday` + `Weekend` (workdays) | `{{ not isWeekend(trigger.date) and not isPublicHoliday(trigger.date, 'FR') }}` | +| `DayWeekInMonth` (MONDAY, FIRST) | `{{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') }}` | +| `DateTimeBetween` (after/before) | `{{ trigger.date > '2025-12-31T23:59:59Z' and trigger.date < '2026-06-30T23:59:59Z' }}` | +| `TimeBetween` (08:00-17:00) | `{{ hourOfDay(trigger.date) >= 8 and hourOfDay(trigger.date) < 17 }}` | +| `Expression` (custom Pebble) | Direct `when` expression, no wrapper needed | +| `Expression` on webhook body/headers | `{{ trigger.body.field == 'value' }}` or `{{ trigger.headers['X-Key'] == 'value' }}` | +| Multiple `Expression` conditions | Combined with `and` / `or` in a single `when` | + +For the full list of Pebble calendar helper functions (`isWeekend`, `isPublicHoliday`, `isDayWeekInMonth`, `hourOfDay`, etc.), see the [date and calendar helpers](../../../expressions/index.mdx#date-and-calendar-helpers) reference. + +## `conditions` and `preconditions` → `dependsOn` on Flow triggers + +Both `conditions` (execution-level types such as `ExecutionStatus`, `ExecutionFlow`, `ExecutionNamespace`) and `preconditions` (upstream flow lists with time windows) are replaced by a single `dependsOn` list. Each entry declares one upstream dependency with typed properties. + +### `dependsOn` entry properties + +| Property | Type | Default | Description | +|---|---|---|---| +| `flowId` | string | — | Exact flow ID to match. Omit to match any flow. 
| +| `namespace` | string | — | Exact namespace to match. Use `when` for prefix or pattern matching. | +| `states` | list | `[SUCCESS, WARNING]` | Execution states that satisfy this entry. | +| `labels` | map | — | Labels the upstream execution must carry (all must match). | +| `when` | string | — | Pebble expression for additional filtering on the upstream execution context. | + +Both `flowId` and `namespace` use exact matching: `namespace: company.team` matches only `company.team`, not `company.team.project`. For prefix or pattern matching, use `when` with `startsWith` or `endsWith`. + +:::alert{type="warning"} +The default `states` changed from `[SUCCESS, WARNING, PAUSED]` to `[SUCCESS, WARNING]`. If your flows relied on `PAUSED` being included by default, add it explicitly: `states: [SUCCESS, WARNING, PAUSED]`. +::: + +### Single upstream flow + +The `preconditions` block and the `conditions`-based approach both map to a single `dependsOn` entry. + +**Before (from `preconditions`)** + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + preconditions: + id: flows + flows: + - namespace: company.team + flowId: extract + states: [SUCCESS] +``` + +**Before (from `conditions`)** + +```yaml +triggers: + - id: on_completion + type: io.kestra.plugin.core.trigger.Flow + states: [SUCCESS] + conditions: + - type: io.kestra.plugin.core.condition.ExecutionFlow + namespace: company.team + flowId: extract +``` + +**After** + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: extract + namespace: company.team + states: [SUCCESS] +``` + +### Multiple upstream flows with a deadline + +**Before** + +```yaml +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + preconditions: + id: staging_deps + timeWindow: + type: DAILY_TIME_DEADLINE + deadline: "09:00:00+01:00" + flows: + - namespace: company.team + flowId: stg_sales + states: [SUCCESS] + - namespace: 
company.team + flowId: stg_marketing + states: [SUCCESS] +``` + +**After** + +```yaml +triggers: + - id: after_staging + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: stg_sales + namespace: company.team + - flowId: stg_marketing + namespace: company.team + window: + deadline: "09:00:00+01:00" +``` + +`states` defaults to `[SUCCESS, WARNING]`. `window` moves to the trigger level. See [Window configuration](#window-configuration) for all window types and the `onMiss` property. + +### Multiple upstream flows (from `multipleConditions`) + +**Before** + +```yaml +triggers: + - id: multiple_listen_flow + type: io.kestra.plugin.core.trigger.Flow + multipleConditions: + - id: multiple + window: P1D + windowAdvance: P0D + conditions: + flow_a: + type: io.kestra.plugin.core.condition.ExecutionFlow + namespace: company.team + flowId: multiplecondition_flow_a + flow_b: + type: io.kestra.plugin.core.condition.ExecutionFlow + namespace: company.team + flowId: multiplecondition_flow_b +``` + +**After** + +```yaml +triggers: + - id: multiple_listen_flow + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: multiplecondition_flow_a + namespace: company.team + states: [SUCCESS] + - flowId: multiplecondition_flow_b + namespace: company.team + states: [SUCCESS] + window: + every: PT1D +``` + +The arbitrary string keys (`flow_a`, `flow_b`) are dropped — `dependsOn` is always a list. The `windowAdvance` property is removed with no direct equivalent. 
+ +### Namespace-wide alerting (prefix matching) + +**Before** + +```yaml +triggers: + - id: alert_on_failure + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.ExecutionStatus + in: + - FAILED + - WARNING + - type: io.kestra.plugin.core.condition.ExecutionNamespace + namespace: company + comparison: PREFIX +``` + +**After** + +```yaml +triggers: + - id: alert_on_failure + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - states: [FAILED, WARNING] + when: "{{ namespace | startsWith('company') }}" +``` + +`namespace` in `dependsOn` is an exact match. Use `when` with `startsWith` for prefix matching. + +### Label-based filtering + +**Before** + +```yaml +triggers: + - id: after_prod + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.ExecutionStatus + in: [SUCCESS] + - type: io.kestra.plugin.core.condition.ExecutionLabels + labels: + env: production +``` + +**After** + +```yaml +triggers: + - id: after_prod + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - namespace: company.team + labels: + env: production + states: [SUCCESS] +``` + +### Conditional filtering with expressions + +**Before** + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + preconditions: + id: my_filter + where: + - id: flow1 + filters: + - field: NAMESPACE + type: STARTS_WITH + value: io.kestra.tests + - field: EXPRESSION + type: IS_TRUE + value: "{{ labels.some == 'label' }}" +``` + +**After** + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - when: "{{ namespace | startsWith('io.kestra.tests') }}" + states: [SUCCESS] + labels: + some: label +``` + +`labels` handles exact key-value matching declaratively. `when` handles everything else. 
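The declarative part of a `dependsOn` entry (exact `flowId` and `namespace`, default `states`, and labels that must all match) can be modelled in a few lines of Python. This is a sketch of the rules as described in this guide, not Kestra's implementation; the function and dictionary shapes are assumptions, and the `when` expression is deliberately not evaluated because that requires a Pebble engine.

```python
# Default states for a dependsOn entry, as stated in the properties table.
DEFAULT_STATES = ("SUCCESS", "WARNING")


def entry_matches(entry, execution):
    """Check the declarative properties of one dependsOn entry."""
    # flowId and namespace are exact matches; an omitted key matches anything.
    if "flowId" in entry and entry["flowId"] != execution["flowId"]:
        return False
    if "namespace" in entry and entry["namespace"] != execution["namespace"]:
        return False
    if execution["state"] not in entry.get("states", DEFAULT_STATES):
        return False
    # Every label on the entry must be present with the same value.
    required = entry.get("labels", {})
    actual = execution.get("labels", {})
    return all(actual.get(key) == value for key, value in required.items())
```

The exact-match rule is the part most likely to surprise: an entry with `namespace: company.team` does not match an execution in `company.team.project`, which is why prefix matching moves to `when`.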
+ +### Filtering on upstream execution outputs + +**Before** + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.ExecutionOutputs + expression: "{{ outputs.row_count > 0 }}" +``` + +**After** + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: extract + namespace: company.team + when: "{{ outputs.row_count > 0 }}" +``` + +### Filtering on retry attempts + +**Before** + +```yaml +triggers: + - id: after_flaky + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.HasRetryAttempt +``` + +**After** + +```yaml +triggers: + - id: after_flaky + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: flaky_pipeline + namespace: company.team + states: [SUCCESS] + when: "{{ hasRetryAttempt == true }}" +``` + +### Negation: trigger on any state except SUCCESS + +**Before** + +```yaml +triggers: + - id: on_non_success + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.Not + conditions: + - type: io.kestra.plugin.core.condition.ExecutionStatus + in: [SUCCESS] +``` + +**After (option 1 — explicit states)** + +```yaml +triggers: + - id: on_non_success + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: extract + namespace: company.team + states: [FAILED, WARNING, KILLED, CANCELLED] +``` + +**After (option 2 — `when` expression)** + +```yaml +triggers: + - id: on_non_success + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: extract + namespace: company.team + when: "{{ state != 'SUCCESS' }}" +``` + +### Mixed triggers: success and failure on the same upstream flow + +**Before** + +```yaml +triggers: + - id: on_completion + type: io.kestra.plugin.core.trigger.Flow + states: [SUCCESS] + conditions: + - type: io.kestra.plugin.core.condition.ExecutionFlow + namespace: company.team + flowId: flow_a + - 
id: on_failure + type: io.kestra.plugin.core.trigger.Flow + states: [FAILED] + preconditions: + id: flowsFailure + flows: + - namespace: company.team + flowId: flow_a + states: [FAILED] +``` + +**After** + +```yaml +triggers: + - id: on_completion + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: flow_a + namespace: company.team + states: [SUCCESS] + - id: on_failure + type: io.kestra.plugin.core.trigger.Flow + dependsOn: + - flowId: flow_a + namespace: company.team + states: [FAILED] +``` + +The `dependsOn` syntax is the same regardless of whether the original used `conditions` or `preconditions`. + +### Passing outputs downstream + +Flow trigger outputs are now scoped by flow ID. The path format is `trigger.outputs.<flowId>.<outputName>`. + +**Before** (flat map — all upstream outputs merged together) + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + inputs: + date: "{{ trigger.outputs.date }}" + preconditions: + id: flows + flows: + - namespace: company.team + flowId: extract + states: [SUCCESS] +``` + +**After** (scoped by flow ID) + +```yaml +triggers: + - id: after_extract + type: io.kestra.plugin.core.trigger.Flow + inputs: + date: "{{ trigger.outputs.extract.date }}" + dependsOn: + - flowId: extract + namespace: company.team +``` + +For multi-flow triggers, each upstream flow's outputs are accessed under its own key: + +```yaml +dependsOn: + - flowId: stg_sales + namespace: company.team + - flowId: stg_marketing + namespace: company.team +``` + +Access as `{{ trigger.outputs.stg_sales.row_count }}` and `{{ trigger.outputs.stg_marketing.row_count }}`. + +:::alert{type="warning"} +**Breaking change for multi-flow triggers.** Update all `trigger.outputs.<outputName>` references to `trigger.outputs.<flowId>.<outputName>`. For triggers with a single `dependsOn` entry, the unscoped form `{{ trigger.outputs.<outputName> }}` still works as a shorthand — no update required.
+::: + +#### ForEachItem chain + +When using Flow triggers to chain `ForEachItem` child flows, reference the child flow's outputs using its `flowId`. Because this flow ID contains hyphens, use bracket notation so that Pebble does not parse the hyphens as subtraction: + +**Before** + +```yaml +triggers: + - id: 01_complete + type: io.kestra.plugin.core.trigger.Flow + inputs: + testFile: "{{ trigger.outputs.myFile }}" + preconditions: + id: output_01_success + flows: + - namespace: io.kestra.tests.trigger.foreachitem + flowId: flow-trigger-for-each-item-child + states: [SUCCESS] +``` + +**After** + +```yaml +triggers: + - id: 01_complete + type: io.kestra.plugin.core.trigger.Flow + inputs: + testFile: "{{ trigger.outputs['flow-trigger-for-each-item-child'].myFile }}" + dependsOn: + - flowId: flow-trigger-for-each-item-child + namespace: io.kestra.tests.trigger.foreachitem +``` + +### `mode`: OR and N-of-M logic + +The `mode` property controls how `dependsOn` entries are combined when evaluating whether to fire. + +| Value | Behavior | Required properties | +|---|---|---| +| `ALL` (default) | Fires when all `dependsOn` entries are satisfied | — | +| `ANY` | Fires as soon as any one entry is satisfied | — | +| `AT_LEAST` | Fires when at least `minSatisfied` entries are satisfied | `minSatisfied` (integer ≥ 1, ≤ entry count) | + +#### OR logic: fire when any upstream completes + +Previously, OR logic required N separate Flow triggers. `mode: ANY` consolidates them into one.
+ +**Before** (two separate triggers) + +```yaml +triggers: + - id: on_salesforce + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.ExecutionFlow + namespace: company.sources + flowId: ingest_salesforce + - type: io.kestra.plugin.core.condition.ExecutionStatus + in: [SUCCESS] + - id: on_hubspot + type: io.kestra.plugin.core.trigger.Flow + conditions: + - type: io.kestra.plugin.core.condition.ExecutionFlow + namespace: company.sources + flowId: ingest_hubspot + - type: io.kestra.plugin.core.condition.ExecutionStatus + in: [SUCCESS] +``` + +**After** + +```yaml +triggers: + - id: react_to_any_source + type: io.kestra.plugin.core.trigger.Flow + mode: ANY + dependsOn: + - flowId: ingest_salesforce + namespace: company.sources + states: [SUCCESS] + - flowId: ingest_hubspot + namespace: company.sources + states: [SUCCESS] +``` + +`mode: ANY` fires as soon as either dependency is satisfied. The default `mode: ALL` requires every entry to be satisfied before the trigger fires. + +#### OR logic with a time window + +```yaml +triggers: + - id: daily_any_source + type: io.kestra.plugin.core.trigger.Flow + mode: ANY + dependsOn: + - flowId: ingest_salesforce + namespace: company.sources + - flowId: ingest_hubspot + namespace: company.sources + window: + deadline: "09:00:00" +``` + +Fire before 9 AM when either source completes. + +#### N of M: at least 2 out of 3 + +```yaml +triggers: + - id: partial_success + type: io.kestra.plugin.core.trigger.Flow + mode: AT_LEAST + minSatisfied: 2 + dependsOn: + - flowId: ingest_salesforce + namespace: company.sources + states: [SUCCESS] + - flowId: ingest_hubspot + namespace: company.sources + states: [SUCCESS] + - flowId: ingest_zendesk + namespace: company.sources + states: [SUCCESS] + window: + deadline: "09:00:00" +``` + +`mode: AT_LEAST` fires when `minSatisfied` entries are satisfied. `minSatisfied` must be ≥ 1 and ≤ the number of `dependsOn` entries. 
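The three `mode` values reduce to a comparison between the number of satisfied entries and a threshold. The following Python sketch models that rule as described in the table above; it is an illustration under that reading, not Kestra's scheduler code, and the function and parameter names are hypothetical.

```python
def should_fire(mode, satisfied_count, entry_count, min_satisfied=None):
    """Decide whether a Flow trigger fires, given how many entries are satisfied."""
    if mode == "ALL":
        return satisfied_count == entry_count
    if mode == "ANY":
        return satisfied_count >= 1
    if mode == "AT_LEAST":
        # minSatisfied must be an integer between 1 and the number of entries.
        if min_satisfied is None or not 1 <= min_satisfied <= entry_count:
            raise ValueError("minSatisfied must be >= 1 and <= the entry count")
        return satisfied_count >= min_satisfied
    raise ValueError(f"unknown mode: {mode}")
```

With three `dependsOn` entries and `minSatisfied: 2`, two satisfied entries are enough under `AT_LEAST` but not under `ALL`.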
+ +### What replaces what + +| Old property / condition type | New equivalent | +|---|---| +| `conditions` list on Flow trigger | `dependsOn` list | +| `preconditions` block | `dependsOn` list + `window` | +| `multipleConditions` block | `dependsOn` list + `window.every` | +| `ExecutionStatus` (`in: [SUCCESS]`) | `states: [SUCCESS]` on the `dependsOn` entry | +| `ExecutionFlow` (`flowId`, `namespace`) | `flowId` + `namespace` on the `dependsOn` entry | +| `ExecutionNamespace` (exact) | `namespace` on the `dependsOn` entry | +| `ExecutionNamespace` (`comparison: PREFIX`) | `when: "{{ namespace \| startsWith('...') }}"` on the entry | +| `ExecutionLabels` (`labels: {k: v}`) | `labels: {k: v}` on the `dependsOn` entry | +| `ExecutionOutputs` (`expression`) | `when` with `outputs.` on the entry | +| `HasRetryAttempt` | `when: "{{ hasRetryAttempt == true }}"` on the entry | +| `Not` > `ExecutionStatus` | Explicit `states` list or `when: "{{ state != 'SUCCESS' }}"` | +| Multiple triggers for OR logic | `mode: ANY` with `dependsOn` entries | +| `preconditions.resetOnSuccess: true` | `window.fireOnce: true` | +| `timeWindow.type: DAILY_TIME_DEADLINE` | `window.deadline` | +| `timeWindow.type: DAILY_TIME_WINDOW` | `window.from` + `window.to` | +| `timeWindow.type: DURATION_WINDOW` | `window.every` | +| `timeWindow.type: SLIDING_WINDOW` | `window.lookback` | + +## Window configuration + +The `window` property applies to Flow triggers and controls how Kestra accumulates upstream executions before evaluating `dependsOn` entries. Set exactly one property group per window; combining groups is a validation error. 
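As a hedged illustration of the one-group rule stated above, the sketch below shows a window that sets a single property group, followed by a commented-out variant that mixes two groups. The property names come from this guide; the claim that the second form is rejected follows from the rule above, and the exact validation message is not shown because it is not documented here.

```yaml
# Accepted: a single property group (daily time range).
window:
  from: "06:00:00"
  to: "12:00:00"

# Expected to be rejected at validation: `deadline` and `lookback`
# belong to different property groups and cannot be combined.
# window:
#   deadline: "09:00:00"
#   lookback: PT1H
```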
+ +| Window type | Properties | Behavior | +|---|---|---| +| Deadline | `deadline: "09:00:00+01:00"` | Upstream flows must complete by a fixed time each day | +| Daily time range | `from: "06:00:00"` + `to: "12:00:00"` | Only executions within a daily time range count | +| Fixed interval | `every: PT1D` + optional `offset: PT6H` | Recurring window of a fixed size, offset from midnight | +| Lookback | `lookback: PT1H` | Rolling window looking back from the current evaluation time | + +`fireOnce: true` can be added to any window type to limit the trigger to firing once per window period rather than every time conditions are met. + +### Deadline + +```yaml +window: + deadline: "09:00:00+01:00" +``` + +### Daily time range + +```yaml +window: + from: "06:00:00" + to: "12:00:00" +``` + +### Fixed interval + +```yaml +window: + every: PT1D + offset: PT6H +``` + +### Lookback + +```yaml +window: + lookback: PT1H +``` + +### Fire once per window + +```yaml +window: + deadline: "09:00:00+01:00" + fireOnce: true +``` + +Default is `false` — the trigger fires every time conditions are met within the window. + +### SLA misses with `onMiss` + +`onMiss` is a trigger-level property (peer to `window`) that declares what happens when the deadline passes without all dependencies being satisfied: + +```yaml +onMiss: + behavior: FAIL + labels: + sla: miss + reason: upstreamNotFinishedOnTime +``` + +`behavior: FAIL` creates a `FAILED` execution when the deadline passes. Labels are applied to that execution for downstream alerting. + +### Replacing `timeWindow` types + +| Old `timeWindow.type` | New `window` property | +|---|---| +| `DAILY_TIME_DEADLINE` | `deadline: "09:00:00+01:00"` | +| `DAILY_TIME_WINDOW` | `from: "06:00:00"` + `to: "12:00:00"` | +| `DURATION_WINDOW` | `every: PT1D` + optional `offset: PT6H` | +| `SLIDING_WINDOW` | `lookback: PT1H` | + +`preconditions.resetOnSuccess: true` maps to `window.fireOnce: true`. 
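The window fragments above can be assembled into one trigger. The following is a minimal sketch that uses only properties described in this guide (`dependsOn`, `window.deadline`, `window.fireOnce`, and `onMiss`); the trigger id is a hypothetical placeholder and the upstream flows are reused from the earlier examples, so check the result against the Flow trigger reference before relying on it.

```yaml
triggers:
  - id: daily_sources_ready            # hypothetical trigger id
    type: io.kestra.plugin.core.trigger.Flow
    dependsOn:
      - flowId: ingest_salesforce      # upstream flows reused from the examples above
        namespace: company.sources
        states: [SUCCESS]
      - flowId: ingest_hubspot
        namespace: company.sources
        states: [SUCCESS]
    window:
      deadline: "09:00:00+01:00"       # replaces timeWindow.type: DAILY_TIME_DEADLINE
      fireOnce: true                   # replaces preconditions.resetOnSuccess: true
    onMiss:                            # peer to `window`, not nested inside it
      behavior: FAIL
      labels:
        sla: miss
```

With the default `mode: ALL`, this trigger would fire once per day when both upstream flows succeed before the deadline, and otherwise produce a labeled `FAILED` execution.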
+ +## Behavior changes after upgrading + +### Silent failures → FAILED executions + +Previously, if an expression on a Flow trigger failed to render (for example, because an upstream output key did not exist), the trigger silently dropped the event and no execution was created. In Kestra 2.0, a `FAILED` execution is created instead, making failures visible in the UI and actionable via downstream alerting. + +No migration action is required. Review your Flow trigger `inputs` expressions to ensure they reference valid output keys and avoid unexpected `FAILED` executions after upgrading. + +### State store reset and in-flight events + +Previously, auto-generated condition keys (`condition_1`, `condition_2`, …) meant that reordering entries could reset accumulated window state. In Kestra 2.0, `dependsOn` entry keys are derived from each entry's `namespace` and `flowId`, making them order-independent. + +The trigger-level state store key also changes: the old scheme used `preconditions.id`; the new scheme uses `{flowId}/{triggerId}`. Existing accumulated state from `preconditions` will not be found after upgrading — in-flight multi-flow triggers re-evaluate from scratch. For most deployments this means at most one missed trigger cycle. + +Old-format events in the async queue are discarded gracefully (logged as a warning). No user action is required. + +## Migration steps + +1. **Replace `conditions:` on all triggers** with a `when:` Pebble expression. This applies to Schedule, Webhook, HTTP, and any other trigger type that used `conditions`. +2. **Replace `conditions:` and `preconditions:` on Flow triggers** with `dependsOn:` entries and (if applicable) `window:`. +3. **Check for `PAUSED` state dependencies.** The default `states` changed from `[SUCCESS, WARNING, PAUSED]` to `[SUCCESS, WARNING]`. Add `PAUSED` explicitly if your flows depended on it: `states: [SUCCESS, WARNING, PAUSED]`. +4. 
**Update `trigger.outputs` references** in multi-flow triggers from `trigger.outputs.{output}` to `trigger.outputs.{flowId}.{output}`. Single-flow triggers can keep the unscoped form. +5. **Update `timeWindow` to `window`** using the property mapping table above. +6. **Validate** by saving updated flows in the Kestra UI or via the API and confirming they parse without errors. diff --git a/src/contents/docs/14.best-practices/0.flows/index.md b/src/contents/docs/14.best-practices/0.flows/index.md index 761be303a8b..9f1812977f5 100644 --- a/src/contents/docs/14.best-practices/0.flows/index.md +++ b/src/contents/docs/14.best-practices/0.flows/index.md @@ -74,38 +74,16 @@ This helps prevent stalled executions and ensures resource efficiency. ## Flow trigger on state change -Kestra can automatically start a flow as soon as another flow completes. This makes it easy to create dependencies between flows, even when they are owned by different teams. For example, a flow can trigger based on the `state` of another flow’s execution. There are multiple ways to configure this behavior, but one approach is recommended as a best practice. - -Take the following two triggers polling one specific flow: one using `preconditions.flows.states` to define the required `states` and the other using the `states` property. - -**Option 1** - -```yaml -triggers: - - id: release - type: io.kestra.plugin.core.trigger.Flow - preconditions: - id: flows - flows: - - namespace: company.release - flowId: parent - states: - - SUCCESS -``` - -or **Option 2** +Kestra can automatically start a flow as soon as another flow completes. This makes it easy to create dependencies between flows, even when they are owned by different teams. 
Use `dependsOn` to declare the upstream flow and the required states: ```yaml triggers: - id: release type: io.kestra.plugin.core.trigger.Flow - states: - - SUCCESS - preconditions: - id: flows - flows: - - namespace: company.release - flowId: parent + dependsOn: + - namespace: company.release + flowId: parent + states: [SUCCESS] ``` -While both configurations will work, **Option 1** is the recommended approach. It is more performant and declarative compared to **Option 2**, especially when working with flow triggers dependent on state. +`states` defaults to `[SUCCESS, WARNING]`. Declare it explicitly when you need a different set. diff --git a/src/contents/docs/14.best-practices/11.foreach-and-foreachitem/index.md b/src/contents/docs/14.best-practices/11.foreach-and-foreachitem/index.md deleted file mode 100644 index df1bab59bad..00000000000 --- a/src/contents/docs/14.best-practices/11.foreach-and-foreachitem/index.md +++ /dev/null @@ -1,284 +0,0 @@ ---- -title: "ForEach vs ForEachItem in Kestra: When to Use Each" -h1: "ForEach vs ForEachItem: Scaling and Output Access" -sidebarTitle: ForEach vs ForEachItem -icon: /src/contents/docs/icons/best-practices.svg -description: Learn when to use ForEach or ForEachItem in Kestra, how they scale differently, and how to access their outputs correctly in downstream tasks. ---- - -Use `ForEach` and `ForEachItem` for different scaling and orchestration patterns. - -## Choose the right loop primitive - -Both tasks iterate over multiple items, but they do it in different ways: - -- `ForEach` creates child task runs inside the same execution. -- `ForEachItem` creates one subflow execution per batch of items. - -That design difference affects performance, restart behavior, and how you access outputs. - -## Decision guide - -Use `ForEach` when: - -- You already have a small list in memory, such as an input, a small JSON array, or a small fetched result. -- The work for each item is lightweight. 
-- You want to share outputs between sibling tasks inside the loop. -- You want a simple loop without introducing a subflow. - -Use `ForEachItem` when: - -- You need to process a large dataset or file. -- You want to split data into batches and scale processing through subflows. -- You need better isolation, troubleshooting, and restart behavior for individual batches. -- The data already lives in Kestra internal storage, or can be written there first. - -:::alert{type="warning"} -`ForEach` can generate many task runs in a single execution. For large fan-out or nested loops, prefer `ForEachItem` or a `Subflow`-based design to avoid oversized execution contexts and slower orchestration. -::: - -:::alert{type="info"} -`ForEachItem` expects `items` to be a Kestra internal storage URI, for example `{{ outputs.extract.uri }}` or a `FILE` input. If your source data is a regular JSON array, Excel file, Parquet file, or another non line-oriented format, convert it first. -::: - -## `Subflow` vs `ForEachItem` - -`Subflow` and `ForEachItem` both create child executions, but they solve different orchestration problems. - -Use `Subflow` when: - -- You want to trigger one child flow once. -- You already know the exact inputs to pass to that child flow. -- You want execution isolation without batching or iteration. -- You are decomposing a large workflow into smaller reusable modules. - -Use `ForEachItem` when: - -- You want to start many child flow executions from one dataset or file. -- You need batching by `rows`, `partitions`, or `bytes`. -- You want to process file-backed items incrementally at scale. -- You want Kestra to merge outputs from multiple child executions. - -Rule of thumb: - -- `Subflow` is one child execution for one unit of work. -- `ForEachItem` is many child executions for many units of work. - -For example, if you need to process one uploaded file in a dedicated child flow, use `Subflow`. 
If you need to split that file into many batches and process each batch in its own child flow execution, use `ForEachItem`. - -## Understand the main difference - -`ForEach` iterates over a list of values and exposes: - -- `{{ taskrun.value }}` for the current value -- `{{ taskrun.iteration }}` for the zero-based loop index - -`ForEachItem` iterates over batches of file-backed items and exposes: - -- `{{ taskrun.items }}` for the current batch file URI -- `{{ taskrun.iteration }}` for the zero-based batch index - -In practice: - -- `ForEach` is best when the iteration value itself is the thing you want to work with. -- `ForEachItem` is best when each iteration should receive a file or batch and hand it off to a subflow. - -## Best practices for `ForEach` - -- Keep the `values` list small to moderate in size. -- Use `concurrencyLimit` deliberately rather than leaving fan-out unbounded. -- If each iteration needs multiple tasks in parallel, put a `Parallel` task inside the loop instead of expecting child tasks to run concurrently by default. -- If iterating over JSON objects, remember that `taskrun.value` is a JSON string. Use `fromJson(taskrun.value)` to access properties. -- When referencing outputs from sibling tasks inside the same loop iteration, use `outputs.task_id[taskrun.value]`. 
- -### Example: use sibling outputs correctly inside `ForEach` - -```yaml -id: foreach_outputs -namespace: company.team - -tasks: - - id: enrich_regions - type: io.kestra.plugin.core.flow.ForEach - values: ["north", "south", "west"] - concurrencyLimit: 2 - tasks: - - id: metadata - type: io.kestra.plugin.core.output.OutputValues - values: - region: "{{ taskrun.value }}" - bucket: "landing-{{ taskrun.value }}" - - - id: build_message - type: io.kestra.plugin.core.debug.Return - format: "Load {{ outputs.metadata[taskrun.value].values.region }} into {{ outputs.metadata[taskrun.value].values.bucket }}" - - - id: log_one_result - type: io.kestra.plugin.core.log.Log - message: "{{ outputs.build_message['north'].value }}" -``` - -Why this pattern works: - -- Inside the loop, `outputs.metadata[taskrun.value]` reads the output from the current iteration. -- Outside the loop, `outputs.build_message['north'].value` reads the output for one specific loop value. - -### Example: iterate over JSON objects safely - -```yaml -id: foreach_json -namespace: company.team - -tasks: - - id: process_users - type: io.kestra.plugin.core.flow.ForEach - values: - - {"id": 101, "email": "a@example.com"} - - {"id": 102, "email": "b@example.com"} - tasks: - - id: log_user - type: io.kestra.plugin.core.log.Log - message: "User {{ fromJson(taskrun.value).id }} -> {{ fromJson(taskrun.value).email }}" -``` - -## Best practices for `ForEachItem` - -- Store the dataset in internal storage first and pass its URI to `items`. -- If your source file is CSV, JSON, Excel, or another external format, convert it to ION before passing it to `ForEachItem`. -- Batch by `rows`, `partitions`, or `bytes` based on how the downstream subflow processes data. -- Design the subflow so it can be rerun independently for one batch. -- Prefer passing `taskrun.items` to a `FILE` input in the subflow. -- If the parent flow must depend on child results, keep `wait: true`. 
-- If a child failure should fail the parent task, keep `transmitFailed: true`. - -### Example: process a file in batches with `ForEachItem` - -This pattern is recommended when each batch should run in its own execution. - -```yaml -id: parent_foreachitem -namespace: company.team - -tasks: - - id: download_orders_csv - type: io.kestra.plugin.core.http.Download - uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - - - id: orders_to_ion - type: io.kestra.plugin.serdes.csv.CsvToIon - from: "{{ outputs.download_orders_csv.uri }}" - - - id: process_batches - type: io.kestra.plugin.core.flow.ForEachItem - items: "{{ outputs.orders_to_ion.uri }}" - batch: - rows: 2 - namespace: company.team - flowId: process_order_batch - wait: true - transmitFailed: true - inputs: - orders_file: "{{ taskrun.items }}" - - - id: log_merged_outputs_uri - type: io.kestra.plugin.core.log.Log - message: "{{ outputs.process_batches_merge.subflowOutputs }}" - - - id: preview_merged_outputs - type: io.kestra.plugin.core.log.Log - message: "{{ read(outputs.process_batches_merge.subflowOutputs) }}" -``` - -And the subflow: - -```yaml -id: process_order_batch -namespace: company.team - -inputs: - - id: orders_file - type: FILE - -tasks: - - id: inspect_batch - type: io.kestra.plugin.core.log.Log - message: "{{ read(inputs.orders_file) }}" - -outputs: - - id: batch_summary - type: STRING - value: "{{ 'Processed batch content: ' ~ read(inputs.orders_file) }}" -``` - -Here, `orders_file` is a batch file generated from the ION output of `CsvToIon`. Each subflow execution receives one batch file through `{{ taskrun.items }}`. - -## Use `ForEachItem` outputs correctly - -`ForEachItem` is best consumed through its internal helper task outputs: - -- `{{ outputs.task_id_split.splits }}` contains the file listing generated batch URIs. -- `{{ outputs.task_id_merge.subflowOutputs }}` contains a file with the merged outputs from the child subflows. 
- -If your `ForEachItem` task id is `process_batches`, those become: - -- `{{ outputs.process_batches_split.splits }}` -- `{{ outputs.process_batches_merge.subflowOutputs }}` - -This is different from `ForEach`, where you typically access outputs by loop value, such as `outputs.inner['north'].value`. - -### Example: consume merged subflow outputs - -If the subflow defines typed flow outputs, `ForEachItem` merges them into a file exposed by the internal merge task. In the example above, each child execution returns a `batch_summary` string, and the merge task gathers those subflow outputs into a single file. - -```yaml -id: parent_read_merged_outputs -namespace: company.team - -tasks: - - id: download_orders_csv - type: io.kestra.plugin.core.http.Download - uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv - - - id: orders_to_ion - type: io.kestra.plugin.serdes.csv.CsvToIon - from: "{{ outputs.download_orders_csv.uri }}" - - - id: process_batches - type: io.kestra.plugin.core.flow.ForEachItem - items: "{{ outputs.orders_to_ion.uri }}" - batch: - rows: 2 - namespace: company.team - flowId: process_order_batch - wait: true - transmitFailed: true - inputs: - orders_file: "{{ taskrun.items }}" - - - id: log_merged_outputs_uri - type: io.kestra.plugin.core.log.Log - message: "{{ outputs.process_batches_merge.subflowOutputs }}" - - - id: preview_merged_outputs - type: io.kestra.plugin.core.log.Log - message: "{{ read(outputs.process_batches_merge.subflowOutputs) }}" -``` - -Use `{{ outputs.process_batches_merge.subflowOutputs }}` when a downstream task needs the collected outputs from all child subflows. -If you want to inspect the merged file content directly, use `read(outputs.process_batches_merge.subflowOutputs)`. - -## Common mistakes to avoid - -- Do not use `ForEach` for very large datasets just because the input started as a JSON array. 
-- Do not pass a non-storage path or raw inline content to `ForEachItem.items`; it must be a Kestra internal storage URI. -- Do not assume sibling task outputs in `ForEach` use the plain `outputs.task_id.value` syntax; inside the loop, use `outputs.task_id[taskrun.value]`. -- Do not expect `ForEach` child tasks to run in parallel unless you either set loop concurrency or add a `Parallel` task inside the loop. -- Do not forget that `taskrun.iteration` starts at `0` for both `ForEach` and `ForEachItem`. - -## Recommended rule of thumb - -Use `ForEach` for orchestration over a relatively small list of values. - -Use `ForEachItem` for data processing over file-backed items or batches, especially when you need scale, restartability, or subflow isolation. - -For API details, see the [ForEach plugin documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreach), the [ForEachItem plugin documentation](/plugins/core/flow/io.kestra.plugin.core.flow.foreachitem), and the [Outputs documentation](../../05.workflow-components/06.outputs/index.md). diff --git a/src/contents/docs/14.best-practices/11.loop/index.md b/src/contents/docs/14.best-practices/11.loop/index.md new file mode 100644 index 00000000000..f31be09ec3c --- /dev/null +++ b/src/contents/docs/14.best-practices/11.loop/index.md @@ -0,0 +1,230 @@ +--- +title: "Loop Task Best Practices in Kestra" +h1: "Best Practices for the Loop Task" +sidebarTitle: Loop +icon: /src/contents/docs/icons/best-practices.svg +description: Best practices for using the Loop task in Kestra — output collection, concurrency, error handling, large-file processing, and subflow isolation patterns. +--- + +Use `Loop` for all iteration needs in Kestra. + +## Choose the right iteration pattern + +`Loop` runs child tasks for each item in a list, map, file, or URI list. Every iteration is an isolated sub-execution. + +Use a plain `Loop` when: + +- You need to run the same tasks for each item in a list or dataset. 
+- Each iteration should process one value, one object, or one file chunk. +- You want parallel iteration with controlled concurrency. +- You want per-iteration failure handling without stopping the entire loop. + +Use `Loop` + `Subflow` when: + +- You need full execution isolation per batch — own retries, own logs, own failure state. +- Each batch should be independently restartable. + +## Access the iteration value + +Inside a Loop, use `item.value` for the current value and `item.index` for the zero-based position. These are available in every child task, including those nested inside `If`, `Parallel`, or other flowable tasks — no parent traversal needed. + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["north", "south", "west"] + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "region={{ item.value }} index={{ item.index }}" +``` + +When iterating over JSON objects, `item.value` is a JSON string. Use `fromJson(item.value).field` to access properties — `item.value.field` does not work. + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: + - {"id": 101, "email": "a@example.com"} + - {"id": 102, "email": "b@example.com"} + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "User {{ fromJson(item.value).id }} -> {{ fromJson(item.value).email }}" +``` + +## Expose outputs explicitly + +Task outputs inside a loop are not visible outside it by default. Declare an `outputs:` block on the Loop task to surface values. 
Choose `fetchType` based on data volume: + +- `AUTO`: default, switches automatically based on whether `values` is a URI +- `FETCH`: collects all iteration results inline (suitable for small iteration counts) +- `STORE`: writes results to internal storage and exposes a URI (preferred for large iteration counts) + +After the loop: +- `outputs.{loopTaskId}.outputs` is a list of per-iteration results +- `outputs.{loopTaskId}.outputs[n].outputs.{outputId}` accesses a specific iteration by index +- `loopOutputs(outputs.{loopTaskId}.outputs, '{outputId}')` extracts one field across all iterations as a flat list + +### Example: collect outputs and read them downstream + +```yaml +id: loop_outputs +namespace: company.team + +tasks: + - id: enrich_regions + type: io.kestra.plugin.core.flow.Loop + values: ["north", "south", "west"] + concurrencyLimit: 2 + fetchType: AUTO + outputs: + - id: bucket + type: STRING + value: "landing-{{ item.value }}" + - id: message + type: STRING + value: "{{ outputs.build_message.value }}" + tasks: + - id: build_message + type: io.kestra.plugin.core.debug.Return + format: "Load {{ item.value }} into landing-{{ item.value }}" + + - id: log_first + type: io.kestra.plugin.core.log.Log + message: "{{ outputs.enrich_regions.outputs[0].outputs.message }}" + + - id: log_all + type: io.kestra.plugin.core.log.Log + message: "{{ loopOutputs(outputs.enrich_regions.outputs, 'message') }}" +``` + +Inside the loop, sibling task outputs are accessed with plain `outputs.task_id.attribute` syntax; each iteration runs in its own isolated context, so there is no ambiguity. 
+ +## Use `concurrencyLimit` deliberately + +- `1` (default): sequential execution +- A positive integer: bounded parallelism; prefer this for heavy workloads +- `0`: unlimited; all iterations run simultaneously; avoid for large datasets without understanding resource implications + +## Process large files with Split and Loop + +For file-backed datasets, use `Split` to break the file into chunk URIs, then loop over the URI list. 
Each `item.value` is one chunk URI. + +Passing `values: "{{ outputs.split.uris }}"` where `outputs.split.uris` is a **list** is different from passing a single file URI string. A list iterates over elements; a single URI string iterates line-by-line through that file. + +```yaml +tasks: + - id: split + type: io.kestra.plugin.core.storage.Split + from: "{{ inputs.file }}" + rows: 100 + + - id: per_chunk + type: io.kestra.plugin.core.flow.Loop + values: "{{ outputs.split.uris }}" + concurrencyLimit: 4 + fetchType: FETCH + outputs: + - id: result_uri + type: STRING + value: "{{ outputs.process.value }}" + tasks: + - id: process + type: io.kestra.plugin.core.debug.Return + format: "processed chunk {{ item.index }}: {{ item.value }}" + + - id: summary + type: io.kestra.plugin.core.log.Log + message: "{{ loopOutputs(outputs.per_chunk.outputs, 'result_uri') }}" +``` + +## Handle per-iteration failures + +Set `transmitFailed: false` to continue the loop when individual iterations fail. Use `errors:` to run tasks per failed iteration, and `finally:` for a one-time cleanup block after all iterations finish. `errors:` requires `transmitFailed: false`; `finally:` always runs regardless. + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["ok", "boom", "ok"] + transmitFailed: false + tasks: + - id: maybe_fail + type: io.kestra.plugin.scripts.shell.Commands + commands: + - | + if [ "{{ item.value }}" = "boom" ]; then exit 1; fi + echo "ok {{ item.value }}" + errors: + - id: handle_error + type: io.kestra.plugin.core.log.Log + message: "Iteration {{ item.index }} ({{ item.value }}) failed" + finally: + - id: cleanup + type: io.kestra.plugin.core.log.Log + message: "Loop finished (with or without failures)" +``` + +## Use Loop + Subflow for isolated per-batch execution + +When each batch needs its own execution — independent retries, logs, and failure ownership — pair `Loop` with `Subflow`. 
The parent splits the data and fans out; the child flow receives one chunk URI per invocation and returns its result as a flow-level output. + +```yaml +# Parent flow +tasks: + - id: split + type: io.kestra.plugin.core.storage.Split + from: "{{ inputs.file }}" + rows: 100 + + - id: per_batch + type: io.kestra.plugin.core.flow.Loop + values: "{{ outputs.split.uris }}" + concurrencyLimit: 4 + fetchType: FETCH + outputs: + - id: result_uri + type: STRING + value: "{{ outputs.run_child.outputs.uri }}" + tasks: + - id: run_child + type: io.kestra.plugin.core.flow.Subflow + namespace: company.team + flowId: process_batch + wait: true + transmitFailed: true + inputs: + batch_uri: "{{ item.value }}" + + - id: concat + type: io.kestra.plugin.core.storage.Concat + files: "{{ loopOutputs(outputs.per_batch.outputs, 'result_uri') }}" + extension: .ion +``` + +## Compose Loop with supporting tasks + +`Loop` iterates. These tasks handle the rest — batching, transforming, stitching, and reducing. Each does one thing well; combine them around `Loop` to build larger pipelines. + +| Task | Role | When to reach for it | +|---|---|---| +| `io.kestra.plugin.core.storage.Split` | Batching | Split a single file into chunk URIs by `rows`, `bytes`, `partitions`, or `separator`. Feeds `Loop.values` for map-reduce. | +| `io.kestra.plugin.core.storage.Concat` | Stitching | Concatenate per-iteration output files into one before a reduce step. | +| `io.kestra.plugin.transform.Aggregate` | Reduce | Group records by one or more keys with `count()`, `sum()`, `max()`, and more. The reduce side of map-reduce. | +| `io.kestra.plugin.transform.Filter` | Predicate | Keep only rows where a boolean expression holds. | +| `io.kestra.plugin.transform.Map` | Project | Per-record rename, drop, or compute fields — SQL `SELECT`-style. | +| `io.kestra.plugin.transform.Unnest` | Explode | Flatten an array field into one row per element, carrying sibling fields through. 
| +| `io.kestra.plugin.core.flow.Subflow` | Isolate | Spawn a separate execution per iteration — own retries, own logs, own failure state. The `ForEachItem` replacement. | +| `io.kestra.plugin.core.flow.Parallel` | Fan-out | Run independent task groups concurrently inside a single iteration. | + +## Common mistakes to avoid + +- Do not use `taskrun.value` or `taskrun.iteration` — use `item.value` and `item.index`. +- Do not access `item.value.field` directly on object values — use `fromJson(item.value).field`. +- Do not expect loop outputs to be visible downstream without declaring an `outputs:` block. +- Do not use `outputs.task_id[item.value]` inside a loop — sibling outputs are accessed with plain `outputs.task_id.attribute`. +- Do not set `concurrencyLimit: 0` on very large datasets without considering memory and worker capacity. + +For more details, see the [Loop task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.loop) and the [Flowable Tasks reference](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md#loop). diff --git a/src/contents/docs/14.best-practices/11.purging-data/index.md b/src/contents/docs/14.best-practices/11.purging-data/index.md index 1c04b0a037d..3242273c2aa 100644 --- a/src/contents/docs/14.best-practices/11.purging-data/index.md +++ b/src/contents/docs/14.best-practices/11.purging-data/index.md @@ -33,8 +33,8 @@ Use this rule of thumb: | If you want to remove... 
| Prefer | Why | | --- | --- | --- | -| Old execution records | [`PurgeExecutions`](/plugins/core/execution/io.kestra.plugin.core.execution.purgeexecutions) | It permanently deletes execution metadata and related execution data | -| Old execution and trigger logs | [`PurgeLogs`](/plugins/core/log/io.kestra.plugin.core.log.purgelogs) | It is designed for bulk log cleanup | +| Old execution records | [`PurgeExecutions`](/plugins/core/tasks/io.kestra.plugin.core.execution.purgeexecutions) | It permanently deletes execution metadata and related execution data | +| Old execution logs, trigger logs, or both | [`PurgeLogs`](/plugins/core/log/io.kestra.plugin.core.log.purgelogs) | Use `purgeExecutionLogs` and `purgeNonExecutionLogs` to target each type independently, or leave both `true` (default) to purge all logs | | Expired runtime state in the KV Store | [`PurgeKV`](/plugins/core/kv/io.kestra.plugin.core.kv.purgekv) or automatic KV expiration purge | It removes stale KV entries without treating them as static configuration | | Old Namespace file versions | [`PurgeFiles`](/plugins/core/namespace/io.kestra.plugin.core.namespace.purgefiles) | It applies retention rules to Namespace files and their versions | | Old asset records, usages, or lineage data | [`PurgeAssets`](../../10.administrator-guide/purge/index.md#purge-assets-and-lineage-retention) | It applies retention to asset-related records without touching executions or logs | @@ -66,6 +66,8 @@ This is usually the right choice when: Best practice: - set separate retention periods for executions and logs if your teams use them differently +- use `purgeExecutionLogs: false` to retain execution logs for failed workflow debugging while still purging trigger logs, or `purgeNonExecutionLogs: false` to do the reverse +- set `batchSize` on `PurgeLogs` when purging large volumes of logs to limit the number of rows deleted per transaction - avoid deleting recent data that is still useful for troubleshooting failed workflows - 
run purge flows on a schedule instead of waiting for storage pressure diff --git a/src/contents/docs/15.how-to-guides/access-local-files/index.md b/src/contents/docs/15.how-to-guides/access-local-files/index.md index 0a433998ac4..acae1157d81 100644 --- a/src/contents/docs/15.how-to-guides/access-local-files/index.md +++ b/src/contents/docs/15.how-to-guides/access-local-files/index.md @@ -6,7 +6,7 @@ stage: Getting Started topics: - Scripting - Integrations -description: Access and process files stored on your local machine within Kestra workflows using bind mounts and the Process task runner. +description: Access files stored on your local machine within Kestra workflows using bind mounts, and batch-upload files to the local filesystem using the local.Uploads task. --- Access locally stored files on your machine inside Kestra workflows. @@ -62,3 +62,99 @@ tasks: commands: - cat /files/myfile.txt ``` + +## Batch-uploading files with `local.Uploads` + +[`io.kestra.plugin.fs.local.Uploads`](/plugins/plugin-fs/local/io.kestra.plugin.fs.local.uploads) writes multiple Kestra internal storage files to a directory on the local filesystem in a single task. It mirrors the `Uploads` task available on the FTP, FTPS, SFTP, and SMB backends. + +### Configure allowed paths + +Both [`local.Upload`](/plugins/plugin-fs/local/io.kestra.plugin.fs.local.upload) (single file) and `local.Uploads` (batch) require the destination directory to be listed in the plugin's `allowed-paths` configuration. Add the following to your `kestra.yml`: + +```yaml +kestra: + plugins: + configurations: + - type: io.kestra.plugin.fs.local.Uploads + values: + allowed-paths: + - /data/uploads + - type: io.kestra.plugin.fs.local.Upload + values: + allowed-paths: + - /data/uploads +``` + +Without this, any write to `/data/uploads` is rejected with a `SecurityException` even if the path is bind-mounted into the container. + +### Upload a list of files + +Pass a list of Kestra internal storage URIs to `from`. 
Each file is written to the `to` directory using its original filename. + +The flow below runs a data ingestion job that produces run logs and SQL migration scripts, then archives the logs to a local directory: + +```yaml +id: archive_pipeline_logs +namespace: company.team + +tasks: + - id: run_pipeline + type: io.kestra.plugin.scripts.shell.Commands + taskRunner: + type: io.kestra.plugin.core.runner.Process + outputFiles: + - "*.log" + - "*.sql" + commands: + - echo "ingested 1024 rows" > ingest.log + - echo "0 errors" > errors.log + - echo "ALTER TABLE orders ADD COLUMN status TEXT;" > schema.sql + - echo "INSERT INTO orders VALUES (1, 'pending');" > seed.sql + + - id: upload_logs + type: io.kestra.plugin.fs.local.Uploads + from: + - "{{ outputs.run_pipeline.outputFiles['ingest.log'] }}" + - "{{ outputs.run_pipeline.outputFiles['errors.log'] }}" + to: /data/uploads/logs +``` + +### Upload with custom destination filenames + +To rename files at the destination, pass a map of `destinationFilename: sourceURI` pairs instead of a list. This is useful for versioning — for example, tagging migration scripts with a version prefix before archiving them. + +In the flow above, replace the `upload_logs` task with: + +```yaml + - id: upload_migrations + type: io.kestra.plugin.fs.local.Uploads + from: + v1_schema.sql: "{{ outputs.run_pipeline.outputFiles['schema.sql'] }}" + v1_seed.sql: "{{ outputs.run_pipeline.outputFiles['seed.sql'] }}" + to: /data/uploads/migrations +``` + +### Filter by regular expression + +Use `regExp` to upload only files whose internal storage URI matches a pattern. Files that do not match are skipped. + +When a task produces a mixed set of outputs, `regExp` lets you route file types to separate destinations without splitting the upstream task. 
In the flow above, replace the `upload_logs` task with: + +```yaml + - id: upload_sql_only + type: io.kestra.plugin.fs.local.Uploads + from: + - "{{ outputs.run_pipeline.outputFiles['ingest.log'] }}" + - "{{ outputs.run_pipeline.outputFiles['errors.log'] }}" + - "{{ outputs.run_pipeline.outputFiles['schema.sql'] }}" + - "{{ outputs.run_pipeline.outputFiles['seed.sql'] }}" + regExp: ".*\\.sql$" + to: /data/uploads/migrations +``` + +### Additional properties + +| Property | Default | Description | +|---|---|---| +| `maxFiles` | `25` | Upper bound on how many files are written. Excess files are dropped with a warning. | +| `overwrite` | `true` | When `false`, the task fails if a destination file already exists. | diff --git a/src/contents/docs/15.how-to-guides/alerting/index.md b/src/contents/docs/15.how-to-guides/alerting/index.md index 735f2b894e5..741dbb44b1a 100644 --- a/src/contents/docs/15.how-to-guides/alerting/index.md +++ b/src/contents/docs/15.how-to-guides/alerting/index.md @@ -79,7 +79,7 @@ errors: ## Flow trigger -Subflows cut down on duplication, but you still need the `errors` block in every flow. For a fully centralized approach, use a **Flow trigger** that reacts to execution status. Trigger conditions let you target specific states, such as `FAILED` or `WARNING`, and you can define separate triggers per status if needed. +Subflows cut down on duplication, but you still need the `errors` block in every flow. For a fully centralized approach, use a **Flow trigger** that reacts to execution status. The `when` expression lets you target specific states, such as `FAILED` or `WARNING`, and you can define separate triggers per status if needed. 
```yaml id: failure_alert_slack @@ -95,11 +95,8 @@ tasks: triggers: - id: on_failure type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionStatus - in: - - FAILED - - WARNING + dependsOn: + - states: [FAILED, WARNING] ``` diff --git a/src/contents/docs/15.how-to-guides/golang/index.md b/src/contents/docs/15.how-to-guides/golang/index.md index 9bbe06d0a6c..8a10512f723 100644 --- a/src/contents/docs/15.how-to-guides/golang/index.md +++ b/src/contents/docs/15.how-to-guides/golang/index.md @@ -215,3 +215,60 @@ tasks: Once this has executed, both the metrics can be viewed under **Metrics**. ![metrics](./metrics.png) + +## Automate Go with triggers + +You can also use Go code as polling logic by using `ScriptTrigger` or `CommandsTrigger`. These trigger types run Go code on an interval and start a flow execution only when the `exitCondition` matches. + +Use `ScriptTrigger` for inline Go code: + +```yaml +id: go_script_trigger +namespace: company.team + +triggers: + - id: script_failure + type: io.kestra.plugin.scripts.go.ScriptTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + script: | + package main + + func main() { + panic("boom") + } + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})" +``` + +Use `CommandsTrigger` when you want to run Go commands instead: + +```yaml +id: commands_trigger +namespace: company.team + +triggers: + - id: commands_failure + type: io.kestra.plugin.scripts.go.CommandsTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + containerImage: golang + commands: + - go run missing.go + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})" +``` + +These trigger types support: + +- `interval` to control how often the script or commands run +- `exitCondition` to match an 
exit code such as `exit 1`, or a regex or substring matched against emitted vars and failure logs +- `edge` to emit only on a transition from not matching to matching diff --git a/src/contents/docs/15.how-to-guides/javascript/index.md b/src/contents/docs/15.how-to-guides/javascript/index.md index 8e1737cf8cc..c5abdea588a 100644 --- a/src/contents/docs/15.how-to-guides/javascript/index.md +++ b/src/contents/docs/15.how-to-guides/javascript/index.md @@ -240,6 +240,57 @@ Kestra.timer('duration', end - start); Once this has executed, `duration` will be viewable under **Metrics**. ![metrics](./metrics.png) +## Automate JavaScript with triggers + +You can also use JavaScript itself as polling logic by using `ScriptTrigger` or `CommandsTrigger`. These trigger types run Node.js code on an interval and start a flow execution only when the `exitCondition` matches. + +Use `ScriptTrigger` for inline Node.js code: + +```yaml +id: node_script_trigger +namespace: company.team + +triggers: + - id: script_failure + type: io.kestra.plugin.scripts.node.ScriptTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + script: | + throw new Error("boom"); + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }}" +``` + +Use `CommandsTrigger` when you want to run Node.js commands instead: + +```yaml +id: node_commands_trigger +namespace: company.team + +triggers: + - id: on_fail + type: io.kestra.plugin.scripts.node.CommandsTrigger + interval: PT5S + exitCondition: "exit 1" + commands: + - node -e "throw new Error('boom')" + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }}" +``` + +These trigger types support: + +- `interval` to control how often the script or commands run +- `exitCondition` to match an exit code such as `exit 1`, or a regex or substring matched against emitted vars and failure logs +- `edge` to emit only on a transition from not 
matching to matching + ## Execute GraalVM Task Kestra also supports GraalVM integration, allowing you to execute JavaScript code directly on the JVM, with the potential for performance improvements. There are currently two tasks: diff --git a/src/contents/docs/15.how-to-guides/loop/index.md b/src/contents/docs/15.how-to-guides/loop/index.md index 48a15d28366..5269a683fab 100644 --- a/src/contents/docs/15.how-to-guides/loop/index.md +++ b/src/contents/docs/15.how-to-guides/loop/index.md @@ -1,16 +1,16 @@ --- title: Loop Over a List of Values -h1: Iterate Over Lists with the ForEach Task +h1: Iterate Over Lists with the Loop Task icon: /src/contents/docs/icons/tutorial.svg stage: Intermediate topics: - Kestra Workflow Components -description: Learn how to iterate over lists of values in Kestra workflows using the ForEach task to execute tasks for each item efficiently. +description: Learn how to iterate over a list of values in Kestra workflows using the Loop task, access iteration context, collect outputs, and run iterations in parallel. --- How to iterate over a list of values in your flow. -In this guide, you will learn how to iterate over a list of values using the `ForEach` task. This task enables you to loop through a list of values and execute specific tasks for each value in the list. This approach is useful for scenarios where multiple similar tasks need to be run for different inputs. +In this guide, you will learn how to use the `Loop` task to iterate over a list of values and run tasks for each item. Each iteration runs as an isolated sub-execution with access to the current value via `item.value` and the zero-based index via `item.index`. ## Prerequisites @@ -19,70 +19,109 @@ Before you begin: - Deploy [Kestra](../../02.installation/index.mdx) in your preferred development environment. 
- Ensure you have a [basic understanding of how to run Kestra flows.](../../03.tutorial/index.mdx) -## Loop over nested lists of values +## Basic iteration -This example demonstrates how to use `ForEach` to loop over a list of strings and then loop through a nested list for each string. +The simplest use of `Loop` iterates over a static list and runs child tasks for each item. The example below makes an API call for each author in the list. -You can access the current iteration value using the variable `{{ taskrun.value }}` or `{{ parent.taskrun.value }}` if you are in a nested child task. Additionally, you can access the batch or iteration number with `{{ taskrun.iteration }}`. +```yaml +id: loop_basic +namespace: company.team + +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["pynchon", "dostoyevsky", "hedayat"] + tasks: + - id: api + type: io.kestra.plugin.core.http.Request + uri: "https://openlibrary.org/search.json?author={{ item.value }}&sort=new" +``` + +Inside each iteration: +- `{{ item.value }}` — the current value from the list +- `{{ item.index }}` — the zero-based position (0, 1, 2, …) -To see the flow in action, define the `each_nested` flow as shown below: +After execution, the Gantt view shows a separate task group for each author. + +When `values` contains objects, each `item.value` is a JSON string. Use `fromJson(item.value).field` to access fields — `item.value.field` does not work. + +## Nested loops + +To iterate over multiple dimensions, nest `Loop` tasks. The inner loop accesses the outer loop's value with `{{ item.parent.value }}`. For three or more levels, `{{ item.parents[1].value }}` is the grandparent — `item.parents[0]` is the same as `item.parent`. 
```yaml -id: each_nested +id: loop_nested namespace: company.team tasks: - - id: 1_each - type: io.kestra.plugin.core.flow.ForEach - values: '["s1", "s2", "s3"]' + - id: outer + type: io.kestra.plugin.core.flow.Loop + values: ["bucket1", "bucket2"] tasks: - - id: 1-1_return - type: io.kestra.plugin.core.debug.Return - format: "{{task.id}} > {{taskrun.value}} > {{taskrun.startDate}}" - - id: 1-2_each - type: io.kestra.plugin.core.flow.ForEach - values: '["a a", "b b"]' + - id: inner + type: io.kestra.plugin.core.flow.Loop + values: [2025, 2026] tasks: - - id: 1-2-1_return - type: io.kestra.plugin.core.debug.Return - format: "{{task.id}} > {{taskrun.value}} > {{taskrun.startDate}}" - - id: 1-2-2_return - type: io.kestra.plugin.core.debug.Return - format: "{{task.id}} > {{ outputs['1-2-1_return'].s1[taskrun.value].value }} >> get {{ outputs['1-2-1_return']['s1'][taskrun.value].value }} > {{taskrun.startDate}}" - - id: 1-3_return - type: io.kestra.plugin.core.debug.Return - format: "{{task.id}} > {{ outputs['1-1_return'][taskrun.value].value }} > {{taskrun.startDate}}" - - id: 2_return - type: io.kestra.plugin.core.debug.Return - format: "{{task.id}} > {{outputs['1-2-1_return'].s1['a a'].value}}" + - id: log + type: io.kestra.plugin.core.log.Log + message: "bucket={{ item.parent.value }} year={{ item.value }}" ``` -Save and execute the `each_nested` flow. - -The above flow, when executed, iterates over a nested list of values, logging messages at each level of iteration to track the processing of both the outer and inner list items. - -Within the flow: +## Collect outputs across iterations -- `1_each`: Uses the `ForEach` task to iterate over the list `["s1", "s2", "s3"]`. For each value, it runs the nested tasks defined within. +By default, outputs produced inside a loop are not visible to tasks that run after it. Declare an `outputs:` block on the Loop task to surface values explicitly. After the loop, `outputs.loop.outputs` is a list of per-iteration results. 
Use `loopOutputs()` to extract one field across all iterations as a flat list. - - `1-1_return`: Logs the task ID, the current list value, and the task run start time. +```yaml +id: loop_outputs +namespace: company.team - - `1-2_each`: Iterates over a second list `["a a", "b b"]` and runs a set of tasks for each value in this nested list. +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["alpha", "beta", "gamma"] + fetchType: AUTO + outputs: + - id: label + type: STRING + value: "{{ outputs.process.value }}" + tasks: + - id: process + type: io.kestra.plugin.core.debug.Return + format: "processed {{ item.value }}" - - `1-2-1_return`: Logs the task ID, the nested list value, and the start time of the task run. + - id: read_outputs + type: io.kestra.plugin.core.log.Log + message: "All results: {{ loopOutputs(outputs.loop.outputs, 'label') }}" +``` - - `1-2-2_return`: Logs a custom output from `1-2-1_return`, which shows how to access outputs from previous iterations within the nested loop. +## Run iterations in parallel - - `1-3_return`: Logs the output from `1-1_return` after the inner loop is completed and displays the corresponding value processed in the outer loop. +Set `concurrencyLimit` to `0` to run all iterations simultaneously, or to a positive integer to cap how many run at once. Combine with an inner `Parallel` task to also parallelise work within each iteration. -- `2_return`: Fetches the output from the nested loop (`1-2-1_return` for the value `a a`) and logs it. 
+```yaml +id: loop_parallel +namespace: company.team +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] + concurrencyLimit: 0 + tasks: + - id: parallel + type: io.kestra.plugin.core.flow.Parallel + tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Processing {{ item.value }}" + - id: shell + type: io.kestra.plugin.scripts.shell.Commands + commands: + - "echo done {{ item.value }}" +``` ## Next steps -Now that you've seen how to loop over a list of values using `ForEach`, you can apply this technique to any scenario where multiple iterations of similar tasks are needed. You can further extend this flow by: -- Adding more complex nested loops. -- Using dynamic input values instead of hardcoded lists. -- Logging or processing additional data from each iteration. - -For more advanced use cases, refer to Kestra’s official [ForEach](/plugins/core/flow/io.kestra.plugin.core.flow.foreach) task documentation and the [Best Practices for ForEach and ForEachItem](../../14.best-practices/11.foreach-and-foreachitem/index.md) guide, which covers how to access sibling task outputs inside and outside the loop, when to use `ForEachItem` instead, and common mistakes to avoid. +- For the full Loop property reference, see the [Loop task documentation](/plugins/core/flow/io.kestra.plugin.core.flow.loop). +- For output collection patterns, error handling, and map-reduce examples, see the [Flowable Tasks](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md#loop) reference. +- For Loop best practices, see the [Loop best practices guide](../../14.best-practices/11.loop/index.md). 
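The Basic iteration section of the Loop guide notes that when `values` contains objects, each `item.value` is a JSON string and fields must be read with `fromJson(item.value).field`. A minimal sketch of that note follows. The flow id `loop_objects` and the `name` and `limit` fields are hypothetical and chosen only for illustration; the `Loop` and `Log` task types and the `fromJson(item.value)` form are taken from the guide itself.

```yaml
id: loop_objects            # hypothetical flow id
namespace: company.team

tasks:
  - id: loop
    type: io.kestra.plugin.core.flow.Loop
    # Each entry is an object, so item.value arrives as a JSON string.
    values:
      - {"name": "pynchon", "limit": 5}
      - {"name": "hedayat", "limit": 10}
    tasks:
      - id: log
        type: io.kestra.plugin.core.log.Log
        # Parse the JSON string first; item.value.name would not resolve.
        message: "author={{ fromJson(item.value).name }} limit={{ fromJson(item.value).limit }} index={{ item.index }}"
```

Parsing inside each expression keeps the example short. If many fields are needed, it may be cleaner to pass scalar values and look the details up in the child tasks.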
diff --git a/src/contents/docs/15.how-to-guides/multiplecondition-listener/index.md b/src/contents/docs/15.how-to-guides/multiplecondition-listener/index.md index 71355361851..ee77f991340 100644 --- a/src/contents/docs/15.how-to-guides/multiplecondition-listener/index.md +++ b/src/contents/docs/15.how-to-guides/multiplecondition-listener/index.md @@ -8,35 +8,20 @@ topics: - Kestra Workflow Components --- -How to set up a Flow to only trigger when multiple conditions are met. +How to set up a flow that only triggers when multiple upstream flows have all succeeded. -In this tutorial, we’ll explore how to set up a flow in Kestra that only triggers when multiple conditions are met. Specifically, we will create a flow that only executes if two other flows, `multiplecondition-flow-a` and `multiplecondition-flow-b`, have executed successfully within the last 24 hours. +In this guide, we’ll create a flow that only executes if two other flows, `multiplecondition_flow_a` and `multiplecondition_flow_b`, have each completed successfully within the last 24 hours. This pattern uses the `dependsOn` property on the Flow trigger. -## Why Use Multiple Condition Listeners? +## When to use this pattern -The `MultipleCondition` listener allows you to build more complex workflows that depend on the success of several flows. For example, if you have two dependent tasks or processes that need to succeed before triggering another process, this listener ensures that the next workflow is only executed when both conditions are met within a specific time window. +Use multiple upstream dependencies when a downstream process should only run after several independent upstream flows all succeed. For example, if you have separate ingestion flows for different data sources and want to run a transformation only after all sources have completed, `dependsOn` with a time window is the right tool. 
-## Activation Process Overview +## How it works -The listener will trigger under the following conditions: - -1. Both `multiplecondition-flow-a` and `multiplecondition-flow-b` must have successful executions. -2. The listener checks if both flows succeeded within the last 24 hours. -3. If the conditions are met, the flow is activated, and the conditions reset. -4. Future executions will only re-trigger the flow if both flows succeed again within another 24-hour window. - -## How the Process Works - -1. Time Window (P1D or 24 hours): - - - The `MultipleCondition` listener checks if both flows (`multiplecondition-flow-a` and `multiplecondition-flow-b`) have been executed successfully within the past 24 hours. - -2. Resetting Conditions: - - - Once the listener triggers, the conditions reset, meaning that even if one of the flows succeeds again, the listener won't trigger until both flows succeed within a new 24-hour period. - -3. Flow Dependency: - - This is particularly useful when you have flows that depend on each other or when the successful execution of multiple workflows is a prerequisite for a downstream task. +1. Both `multiplecondition_flow_a` and `multiplecondition_flow_b` must complete successfully. +2. Both must complete within the same 24-hour window (`window.every: P1D`). +3. Once both conditions are satisfied, the listener flow triggers. +4. The window resets each day, so both flows must succeed again within the next window to re-trigger the listener. ## First Flow: `multiplecondition_flow_a` @@ -85,7 +70,7 @@ id: multiplecondition_listener namespace: company.team description: | - This flow will start only if `multiplecondition_flow_a` and `multiplecondition_flow_b` are successful during the last 24h. + This flow starts only if `multiplecondition_flow_a` and `multiplecondition_flow_b` both succeed within the same 24-hour window. 
tasks: - id: only_listener @@ -95,44 +80,24 @@ tasks: triggers: - id: multiple_listen_flow type: io.kestra.plugin.core.trigger.Flow - conditions: - - type: io.kestra.plugin.core.condition.ExecutionStatus - in: - - SUCCESS - - id: multiple - type: io.kestra.plugin.core.condition.MultipleCondition - window: P1D - windowAdvance: P0D - conditions: - flow_a: - type: io.kestra.plugin.core.condition.ExecutionFlow - namespace: company.team - flowId: multiplecondition_flow_a - flow_b: - type: io.kestra.plugin.core.condition.ExecutionFlow - namespace: company.team - flowId: multiplecondition_flow_b + dependsOn: + - flowId: multiplecondition_flow_a + namespace: company.team + states: [SUCCESS] + - flowId: multiplecondition_flow_b + namespace: company.team + states: [SUCCESS] + window: + every: P1D ``` -## Explanation of the Flow +## Explanation of the flow -1. Tasks Section: +1. **Tasks** — `only_listener` outputs a static value when the trigger fires. Replace this with whatever downstream logic you need. +2. **`dependsOn`** — declares two upstream flow dependencies. Both entries must be satisfied before the trigger fires. `states: [SUCCESS]` means only successful executions count. - - The task `only_listener` outputs a static value (`children`) when the trigger conditions are met. This part can be customized to perform more complex tasks after the conditions are satisfied. - -2. Triggers Section: - - - - The `multiple_listen_flow` trigger listens for both `multiplecondition_flow_a` and `multiplecondition_flow_b`. - - Execution Status Condition: Ensures that only successful executions (status `SUCCESS`) are considered. - - MultipleCondition: This condition checks that both `flow_a` and `flow_b` have successfully completed within the last 24 hours (`P1D`). - -3. Window: - - - - The `window: P1D` ensures that the listener checks for executions within the past 24 hours. - - The `windowAdvance: P0D` parameter ensures that the time window starts immediately, without any delay. +3. 
**`window.every: P1D`** — defines a 24-hour evaluation window. Kestra accumulates upstream executions within this window and fires the trigger once all `dependsOn` entries are satisfied within the same window period. ## Expected Output @@ -150,6 +115,4 @@ When both multiplecondition_flow_a and multiplecondition_flow_b succeed within 2 ## Conclusion -In this tutorial, we’ve demonstrated how to set up a `MultipleCondition` listener that checks for the success of multiple flows within a specified time window. This is a powerful feature for managing complex workflows that depend on the successful execution of multiple tasks. - -By using this listener, you can ensure that downstream processes are only triggered when all necessary upstream conditions are met. +This guide demonstrated how to use `dependsOn` with a time window to trigger a flow only when multiple upstream flows all succeed within the same period. Use this pattern whenever a downstream process must wait on several independent upstream flows before running. diff --git a/src/contents/docs/15.how-to-guides/python/index.md b/src/contents/docs/15.how-to-guides/python/index.md index 961246a030c..9e2b4608858 100644 --- a/src/contents/docs/15.how-to-guides/python/index.md +++ b/src/contents/docs/15.how-to-guides/python/index.md @@ -337,12 +337,13 @@ flow.execute('example', 'python_scripts', {'greeting': 'hello from Python'}) Read more about it on the [execution page](../../05.workflow-components/03.execution/index.md). -## Automate Python with Triggers +## Automate Python with triggers You can combine your Python code with a trigger to automatically execute your code. There's a few key ways you can automate it: - Run on a schedule - Run when a webhook is called - Run when a file is available in a data lake or storage bucket +- Run Python code on a polling interval and emit only when a condition matches
@@ -436,6 +437,61 @@ triggers: maxKeys: 1 ``` +### Run Python as a polling trigger + +If you want the polling logic itself to be written in Python, you can use `ScriptTrigger` or `CommandsTrigger`. These triggers run Python code on an interval and start a flow execution only when the `exitCondition` matches the result. + +Use `ScriptTrigger` for inline Python code: + +```yaml +id: python_script_trigger +namespace: company.team + +triggers: + - id: script_failure + type: io.kestra.plugin.scripts.python.ScriptTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + script: | + raise Exception("boom") + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})" +``` + +Use `CommandsTrigger` when you want to run Python commands instead: + +```yaml +id: python_commands_trigger +namespace: company.team + +triggers: + - id: on_fail + type: io.kestra.plugin.scripts.python.CommandsTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + containerImage: python:3.13-slim + commands: + - python3 -c "raise Exception('boom')" + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})" +``` + +These triggers support: + +- `interval` to control how often the Python code runs +- `exitCondition` to match an exit code such as `exit 1`, or a regex or substring matched against emitted vars and failure logs +- `edge` to emit only when the condition changes from not matching to matching + +Use these trigger types when you want Python itself to decide whether a polling condition has been met, rather than relying on a separate external-system trigger. 
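As a sketch of Python deciding whether the polling condition is met, the trigger below exits with code 1 only when a check passes, instead of always raising. The flow id, trigger id, the 90% threshold, and the one-minute interval are assumptions made for illustration. It also assumes that an explicit `sys.exit(1)` is matched by `exitCondition: "exit 1"` in the same way as the uncaught exception in the examples above, since both end the process with exit code 1.

```yaml
id: python_threshold_trigger   # hypothetical flow id
namespace: company.team

triggers:
  - id: disk_usage_high
    type: io.kestra.plugin.scripts.python.ScriptTrigger
    interval: PT1M
    exitCondition: "exit 1"
    edge: true                 # fire once when usage crosses the threshold, not on every poll
    script: |
      import shutil
      import sys

      # Hypothetical condition: the root filesystem is more than 90% full.
      usage = shutil.disk_usage("/")
      if usage.used / usage.total > 0.9:
          sys.exit(1)          # exit code 1 is what the exitCondition above matches

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})"
```

With `edge: true`, the flow should start only on the poll where the check first succeeds. It would start again only after the usage drops below the threshold and then rises above it once more.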
+ ## Execute GraalVM Task diff --git a/src/contents/docs/15.how-to-guides/secops-with-kestra/index.md b/src/contents/docs/15.how-to-guides/secops-with-kestra/index.md index ae2225f2af3..ea20c73962a 100644 --- a/src/contents/docs/15.how-to-guides/secops-with-kestra/index.md +++ b/src/contents/docs/15.how-to-guides/secops-with-kestra/index.md @@ -223,13 +223,11 @@ triggers: - id: postVMCreation type: io.kestra.plugin.core.trigger.Flow inputs: - ipAddress: "{{ trigger.outputs.externalIPAddress }}" - preconditions: - id: vmCreationSuccess - flows: - - namespace: company.ops.it - flowId: createVMRevamped - states: [ SUCCESS, WARNING ] + ipAddress: "{{ trigger.outputs.createVMRevamped.externalIPAddress }}" + dependsOn: + - namespace: company.ops.it + flowId: createVMRevamped + states: [SUCCESS, WARNING] ``` ## Step 7: Review the Topology diff --git a/src/contents/docs/15.how-to-guides/shell/index.md b/src/contents/docs/15.how-to-guides/shell/index.md index 4c1c70a12cd..4203fd985e0 100644 --- a/src/contents/docs/15.how-to-guides/shell/index.md +++ b/src/contents/docs/15.how-to-guides/shell/index.md @@ -172,3 +172,57 @@ tasks: Once this has executed, both the metrics can be viewed under **Metrics**. ![metrics](./metrics.png) + +## Automate Shell with triggers + +You can also use shell code as polling logic by using `ScriptTrigger` or `CommandsTrigger`. These trigger types run shell code on an interval and start a flow execution only when the `exitCondition` matches. 
+ +Use `ScriptTrigger` for inline shell code: + +```yaml +id: script_trigger +namespace: company.team + +triggers: + - id: script_failure + type: io.kestra.plugin.scripts.shell.ScriptTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + containerImage: ubuntu + script: | + cat /path/that/does/not/exist + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})" +``` + +Use `CommandsTrigger` when you want to run shell commands instead: + +```yaml +id: commands_trigger +namespace: company.team + +triggers: + - id: commands_failure + type: io.kestra.plugin.scripts.shell.CommandsTrigger + interval: PT10S + exitCondition: "exit 1" + edge: true + containerImage: ubuntu + commands: + - cat /path/that/does/not/exist + +tasks: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Triggered with exitCode={{ trigger.exitCode }} (condition={{ trigger.condition }})" +``` + +These trigger types support: + +- `interval` to control how often the script or commands run +- `exitCondition` to match an exit code such as `exit 1`, or a regex or substring matched against emitted vars and failure logs +- `edge` to emit only on a transition from not matching to matching diff --git a/src/contents/docs/16.scripts/07.input-output-files/index.md b/src/contents/docs/16.scripts/07.input-output-files/index.md index 17afb29d6a3..7279425058c 100644 --- a/src/contents/docs/16.scripts/07.input-output-files/index.md +++ b/src/contents/docs/16.scripts/07.input-output-files/index.md @@ -148,3 +148,31 @@ tasks: Note how the `outputFiles` property is used to specify the list of files to be persisted in Kestra's internal storage. The `outputFiles` property supports [glob patterns](https://en.wikipedia.org/wiki/Glob_(programming)). The subsequent task can access the output file by leveraging the syntax `{{outputs.yourTaskId.outputFiles['yourFileName.fileExtension']}}`. 
+ +### Referencing output file paths inside the script + +For local runners (Process and Docker), writing files by plain name works because Kestra sets the process working directory automatically. For remote task runners (Kubernetes, AWS Batch, Azure Batch, etc.), the working directory is an execution-specific absolute path. Rather than constructing it manually with `{{ workingDir }}/filename`, you can use `{{ outputFiles["filename"] }}` to get the resolved absolute path by name: + +```yaml +id: output_file_remote +namespace: company.team + +tasks: + - id: shell + type: io.kestra.plugin.scripts.shell.Commands + taskRunner: + type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes + config: + masterUrl: https://my-cluster:6443 + caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" + outputFiles: + - out.txt + commands: + - echo "Hello from Kubernetes" > {{ outputFiles["out.txt"] }} +``` + +The `{{ outputFiles["filename"] }}` expression resolves to the absolute path of the named file in the task's working directory — the same value as `{{ workingDir }}/filename`. The same form is used in JDBC tasks (e.g., `COPY ... TO '{{ outputFiles["out.csv"] }}'`), so the pattern is consistent across task types. + +:::alert{type="info"} +Only named files are available as Pebble expressions. Glob patterns such as `*.csv` are collected post-run but cannot be referenced as `{{ outputFiles["*.csv"] }}` inside the script. Declare named files for any output you need to reference by path, and use globs only for bulk collection. +::: diff --git a/src/contents/docs/16.scripts/index.mdx b/src/contents/docs/16.scripts/index.mdx index a39b4778f67..7efd1286628 100644 --- a/src/contents/docs/16.scripts/index.mdx +++ b/src/contents/docs/16.scripts/index.mdx @@ -32,4 +32,4 @@ If you use the [Enterprise Edition](/docs/enterprise), you can also run your scr The following pages dive into details of each task runner, supported programming languages, and how to manage dependencies. 
- \ No newline at end of file + diff --git a/src/contents/docs/ai-tools/ai-agents/index.md b/src/contents/docs/ai-tools/ai-agents/index.md index 9694808602a..046bae864de 100644 --- a/src/contents/docs/ai-tools/ai-agents/index.md +++ b/src/contents/docs/ai-tools/ai-agents/index.md @@ -118,8 +118,118 @@ Following `multilingual_agent` is the `english_brevity` task, which only needs a ![AI Agent Abbreviated Summary](./ai-agent-brevity.png) -These outputs can then be passed on as notifications or system messages to external tools or subflows within Kestra. Other useful outputs include `tokenUsage` to compare different providers for the same tasks. For more examples and details about properties, outputs, and definitions, refer to the AI [Agent plugin documentation](/plugins/plugin-ai/agent). +These outputs can then be passed on as notifications or system messages to external tools or subflows within Kestra. Other useful outputs include `tokenUsage` to compare different providers for the same tasks. At runtime, Kestra also emits counter metrics — `ai.agent.tool.calls`, `ai.provider.calls`, and `ai.embedding.store.calls` — tagged by class name, which you can scrape with Prometheus or export via OpenTelemetry to monitor AI task usage. For more examples and details about properties, outputs, and definitions, refer to the AI [Agent plugin documentation](/plugins/plugin-ai/agent). ### Plugin defaults Each task using the AI Agent requires the `provider` property. To avoid repetition and simplify the flow building experience, first consider using [Kestra's AI Copilot](../ai-copilot/index.md), next consider using [Plugin Defaults](../../05.workflow-components/09.plugin-defaults/index.md) to ensure consistency and remove repetition. 
Additionally, for your provider API key, secure it either through the [Key-Value Store](../../06.concepts/05.kv-store/index.md) or as a [Secret](../../06.concepts/04.secret/index.md) if using [Kestra Enterprise Edition](../../07.enterprise/01.overview/01.enterprise-edition/index.md). + +## Agent tools + +The AI Agent can be extended with **tools** — capabilities the LLM can choose to invoke at runtime to complete its task. Tools are listed under the `tools` property of an `AIAgent` task. + +### Skills + +The [**Skill**](/plugins/plugin-ai/tool/skill) tool lets you attach structured instructions to an agent that it can activate on demand. Rather than including all instructions in the system message, skills let you define discrete, reusable knowledge blocks — each with a name, a description the LLM uses to decide when to activate it, and the actual instruction content. + +This is useful when an agent has multiple possible modes of operation, such as translating text, reviewing code, or formatting data, where you want the LLM to select and apply the right instructions based on context rather than always receiving all instructions at once. + +Each skill requires: +- `name` — a unique identifier for the skill +- `description` — explains to the LLM when to activate the skill +- `content` or `contentUri` — the instruction content, either inline or loaded from Kestra internal storage + +#### Inline skill content + +The simplest way to define a skill is with inline `content`: + +```yaml +id: agent_with_skills +namespace: company.ai + +tasks: + - id: agent + type: io.kestra.plugin.ai.agent.AIAgent + prompt: Translate the following text to French - "Hello, how are you today?" 
+ provider: + type: io.kestra.plugin.ai.provider.GoogleGemini + modelName: gemini-2.5-flash + apiKey: "{{ secret('GEMINI_API_KEY') }}" + tools: + - type: io.kestra.plugin.ai.tool.Skill + skills: + - name: translation_expert + description: Expert translator for multiple languages + content: | + You are an expert translator. When translating text: + 1. Preserve the original meaning and tone + 2. Use natural phrasing in the target language + 3. Keep proper nouns unchanged +``` + +#### Loading skill content from storage + +For longer or reusable instructions, store the skill content as a file in Kestra internal storage and reference it with `contentUri`. This is especially useful when skill content is generated or updated by an earlier task in the same flow: + +```yaml +id: agent_with_skill_from_storage +namespace: company.ai + +tasks: + - id: write_instructions + type: io.kestra.plugin.core.storage.Write + content: | + You are a senior code reviewer. When reviewing code: + 1. Check for security vulnerabilities + 2. Ensure proper error handling + 3. Verify naming conventions are followed + 4. Flag any code duplication + + - id: agent + type: io.kestra.plugin.ai.agent.AIAgent + prompt: Review this Python function - "def add(a, b): return a + b" + provider: + type: io.kestra.plugin.ai.provider.GoogleGemini + modelName: gemini-2.5-flash + apiKey: "{{ secret('GEMINI_API_KEY') }}" + tools: + - type: io.kestra.plugin.ai.tool.Skill + skills: + - name: code_review_expert + description: Expert code reviewer with strict guidelines + contentUri: "{{ outputs.write_instructions.uri }}" +``` + +A single `Skill` tool can define multiple skills. Each skill must have a unique name. `content` and `contentUri` are mutually exclusive — exactly one must be set per skill. For more details on all available properties, refer to the [Skill plugin documentation](/plugins/plugin-ai/tool/skill). 
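As a sketch of the multi-skill case described above, a single `Skill` tool could declare two skills and leave the choice to the LLM. The flow ID, skill names, and instruction text here are illustrative assumptions; the property names follow the examples earlier in this section.

```yaml
id: agent_with_multiple_skills  # hypothetical flow ID
namespace: company.ai

tasks:
  - id: agent
    type: io.kestra.plugin.ai.agent.AIAgent
    prompt: Review this Python function - "def add(a, b): return a + b"
    provider:
      type: io.kestra.plugin.ai.provider.GoogleGemini
      modelName: gemini-2.5-flash
      apiKey: "{{ secret('GEMINI_API_KEY') }}"
    tools:
      - type: io.kestra.plugin.ai.tool.Skill
        skills:
          # Each skill sets exactly one of `content` or `contentUri`.
          - name: translation_expert  # names must be unique within the tool
            description: Expert translator for multiple languages
            content: |
              You are an expert translator. Preserve the original meaning and tone.
          - name: code_review_expert
            description: Expert code reviewer with strict guidelines
            content: |
              You are a senior code reviewer. Check for security vulnerabilities
              and proper error handling.
```

With a prompt like this one, the agent would be expected to activate `code_review_expert` based on its description, though the selection is ultimately up to the model.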
+ +### Kestra-native tools + +- [**KestraFlow**](/plugins/plugin-ai/tool/kestraflow) — triggers a Kestra flow as a tool, either with a predefined namespace and flow ID or dynamically based on the agent's prompt. +- [**KestraTask**](/plugins/plugin-ai/tool/kestratask) — exposes one or more Kestra runnable tasks as tools, letting the agent supply values for properties left unset. + +### Web search + +- [**TavilyWebSearch**](/plugins/plugin-ai/tool/tavilywebsearch) — gives the agent access to live web results via the Tavily search API. +- [**GoogleCustomWebSearch**](/plugins/plugin-ai/tool/googlecustomwebsearch) — gives the agent access to live web results via a Google Custom Search Engine. + +### Code execution + +- [**CodeExecution**](/plugins/plugin-ai/tool/codeexecution) — lets the agent write and run JavaScript snippets in a Judge0 sandbox (via RapidAPI). + +### Nested agents + +- [**AIAgent**](/plugins/plugin-ai/tool/aiagent) — wraps another AI agent as a callable tool so a parent agent can delegate sub-tasks to a specialized child agent. +- [**A2AClient**](/plugins/plugin-ai/tool/a2aclient) — forwards prompts to a remote AI agent over the Agent-to-Agent (A2A) protocol and returns its response. + +### MCP clients + +Kestra supports MCP in two directions. These clients cover the **Kestra-as-client** direction: your flow calls tools on an *external* MCP server. For the opposite direction — exposing your flows *as* MCP tools for external AI agents to call — see [MCP Server](../mcp-server/index.md) and the [McpToolTrigger](../../05.workflow-components/07.triggers/06.mcp-tool-trigger/index.md). + +Connect the agent to any [Model Context Protocol (MCP)](https://modelcontextprotocol.io) server to expose its tools: + +- [**DockerMcpClient**](/plugins/plugin-ai/tool/dockermcpclient) — runs an MCP server inside a Docker container. +- [**SseMcpClient**](/plugins/plugin-ai/tool/ssemcpclient) — connects to a remote MCP server over Server-Sent Events (SSE). 
+- [**StdioMcpClient**](/plugins/plugin-ai/tool/stdiomcpclient) — spawns a local MCP server process and communicates over stdio. +- [**StreamableHttpMcpClient**](/plugins/plugin-ai/tool/streamablehttpmcpclient) — connects to an MCP server over HTTP streaming. + +The [Kestra Python MCP server](https://github.com/kestra-io/mcp-server-python) is an example of an external MCP server you can connect to from a Kestra AI Agent task using one of the clients above. diff --git a/src/contents/docs/ai-tools/index.mdx b/src/contents/docs/ai-tools/index.mdx index b7d008e8c5a..5b1761d08a3 100644 --- a/src/contents/docs/ai-tools/index.mdx +++ b/src/contents/docs/ai-tools/index.mdx @@ -12,7 +12,7 @@ Create, refine, and orchestrate workflows using natural language or autonomous d ## Learn how Kestra AI tools accelerate orchestration -Kestra provides two AI-powered features — **AI Copilot** and **AI Agents** — that extend how workflows can be created and executed. Additionally, **Agent Skills** let you bring Kestra expertise to external AI coding agents. +Kestra provides AI-powered features — **AI Copilot**, **AI Agents**, and **MCP Server** — that extend how workflows can be created and executed. Additionally, **Agent Skills** let you bring Kestra expertise to external AI coding agents. ## AI Copilot @@ -22,6 +22,10 @@ AI Copilot allows users to generate and refine flow definitions from natural lan AI Agents provide autonomous orchestration capabilities. An AI Agent task uses a large language model (LLM), optional memory, and configured tools such as web search, task execution, or flow calling. The agent can dynamically decide which actions to take, loop until conditions are satisfied, and adapt based on new information. Unlike static flows that follow a fixed sequence, agents operate adaptively while remaining observable and fully defined as code. +## MCP Server + +The Kestra MCP Server exposes flows as tools for MCP-compatible AI agents. 
Add an `McpToolTrigger` to any flow and it is automatically registered as a named tool on a Kestra MCP server. AI agents such as Claude Desktop, Claude Code, and Cursor can then discover and invoke your flows directly, with flow inputs and outputs mapped to a JSON schema. + ## Agent Skills Agent Skills are structured knowledge files that teach external AI coding agents — such as Claude Code, Cursor, and Windsurf — how to generate Kestra flows and operate Kestra environments using `kestractl`. Unlike AI Copilot (which works inside the Kestra UI) or AI Agents (which run inside flows), Agent Skills bring Kestra expertise directly to your editor or terminal. @@ -32,8 +36,9 @@ Together, these approaches offer complementary ways to work with AI: - **AI Copilot**: speeds up flow creation and modification by translating natural language instructions into YAML. - **AI Agents**: enable adaptive orchestration patterns where task sequences are not predetermined but are chosen dynamically at runtime. +- **MCP Server**: exposes flows as callable tools for external AI agents, making Kestra orchestration available to any MCP-compatible client. - **Agent Skills**: give external AI coding agents structured knowledge to generate valid Kestra flows and operate environments from your development tools. -AI Copilot and AI Agents are built into Kestra, while Agent Skills extend Kestra expertise to the external tools you already use. +AI Copilot, AI Agents, and MCP Server are built into Kestra, while Agent Skills extend Kestra expertise to the external tools you already use. 
\ No newline at end of file diff --git a/src/contents/docs/ai-tools/mcp-server/index.md b/src/contents/docs/ai-tools/mcp-server/index.md new file mode 100644 index 00000000000..6775061de41 --- /dev/null +++ b/src/contents/docs/ai-tools/mcp-server/index.md @@ -0,0 +1,94 @@ +--- +title: MCP Server in Kestra – Expose Flows as AI Tools +h1: Configure Kestra MCP Servers and Connect AI Agents +description: Configure Kestra MCP servers to expose flows as tools for AI agents. Learn how to create servers, set authentication, and connect Claude Desktop, Claude Code, and Cursor. +sidebarTitle: MCP Server +icon: /src/contents/docs/icons/ai.svg +version: "2.0.0" +editions: ["OSS", "EE"] +--- + +A Kestra MCP server exposes flows as named tools over HTTP for AI agents to discover and call. + +A Kestra MCP server is a tenant-scoped entity that uses the [Model Context Protocol](https://modelcontextprotocol.io). Any flow with an [`McpToolTrigger`](../../05.workflow-components/07.triggers/06.mcp-tool-trigger/index.md) is automatically registered as a named tool on its target server. AI agents discover the tool list at connection time, so adding or removing triggers takes effect without restarting clients. + +## Two directions: Kestra as server vs. Kestra as client + +Kestra supports MCP in both directions: + +| Direction | How | When to use | +|---|---|---| +| **Kestra as MCP server** | `McpToolTrigger` + MCP server entity | AI agents (Claude, Cursor) call your flows as tools | +| **Kestra as MCP client** | MCP client tasks (`SseMcpClient`, `StreamableHttpMcpClient`, `StdioMcpClient`, `DockerMcpClient`) | Your flows call external MCP servers as part of an AI Agent task | + +This page covers Kestra as an MCP server. For using external MCP servers from within flows, see [AI Agents](../ai-agents/index.md). + +## Default server + +A `default` MCP server is automatically provisioned for every tenant on startup. You can use it immediately — no setup needed. 
The `McpToolTrigger`'s `mcpServer` property defaults to `"default"`, so a minimal trigger requires no explicit server reference. + +## Managing MCP servers + +Navigate to **Tenant → MCP Servers** in the left sidebar to view, create, edit, and manage MCP servers. + +Each server has the following fields: + +| Field | Description | +|---|---| +| `name` | Display name for the server. | +| `description` | Optional description shown in the UI. | +| `systemPrompt` | Instructions prepended to every AI agent session connected to this server. Use this to guide agent behavior — for example, to restrict which tools to call or define the agent's persona. | +| `serverType` | `PRIVATE` (default) or `PUBLIC`. A private server requires authentication; a public server accepts unauthenticated connections. | +| `authType` | `BASIC` (username/password, available in OSS and EE) or `API_TOKEN` (EE only). | + +### Authentication types + +| Auth type | Available in | Notes | +|---|---|---| +| `BASIC` | OSS, EE | Username and password required on connect. | +| `API_TOKEN` | EE only | API token required on connect. Rejected on OSS. | +| OAuth2 | EE only | Required for browser-based MCP clients such as Claude web. Not yet available. | + +Keep servers private unless you have a specific reason to expose them publicly. A public server allows any MCP client to call any flow registered on it without authentication. 
+ +## Connecting an AI agent client + +Open a server in the UI and click the **Connect** tab, which shows: + +- The SSE endpoint URL for the server +- Ready-to-paste configuration snippets for: + - **Claude Desktop** — JSON block to add to `claude_desktop_config.json` + - **Claude Code** — `claude mcp add` command to run in your terminal + - **Cursor** — server URL to paste into Cursor Settings → MCP → Add new MCP server + - **Codex** — connection configuration + +## Viewing registered tools + +The **Tool Flows** tab on each server lists all flows that have an `McpToolTrigger` pointing at that server. Use this to audit which flows are exposed and to navigate directly to a flow's trigger configuration. + +## RBAC (Enterprise) + +In the Enterprise Edition, `MCP_SERVER` is a first-class RBAC resource. See [RBAC](../../07.enterprise/03.auth/rbac/index.md#mcp-server-permissions) for the default role assignments. + +Access to a private server is also flow-scoped: a user can connect to a private MCP server only if they have `FLOW.EXECUTE` permission on at least one namespace that has a flow with an `McpToolTrigger` pointing at that server. + +## MCP server cache configuration + +By default, each webserver node caches MCP server configuration in memory and hot-reloads it when a server is created, updated, or deleted. Two optional properties control the cache behavior: + +| Property | Default | Description | +|---|---|---| +| `kestra.mcp.server-cache-config.maximum-size` | `500` | Maximum number of MCP server entries to cache. | +| `kestra.mcp.server-cache-config.expire-after-access` | `PT5M` | How long a cache entry remains valid after last access. | + +Example configuration: + +```yaml +kestra: + mcp: + server-cache-config: + maximum-size: 200 + expire-after-access: PT10M +``` + +Adjust these only if you have a large number of MCP servers or tight memory constraints. The defaults are sufficient for most deployments.
diff --git a/src/contents/docs/configuration/02.runtime-and-storage/index.md b/src/contents/docs/configuration/02.runtime-and-storage/index.md index 40339874ce5..ef42315d5f6 100644 --- a/src/contents/docs/configuration/02.runtime-and-storage/index.md +++ b/src/contents/docs/configuration/02.runtime-and-storage/index.md @@ -24,6 +24,19 @@ Queues and repositories must stay compatible: - JDBC queue with H2, MySQL, or PostgreSQL repository - Kafka queue with Elasticsearch repository in Enterprise Edition +## Allocated CPU cores + +Kestra sizes several internal thread pools based on the number of CPU cores available to the process. By default, it uses the number of CPU cores reported by the runtime environment. + +If you want Kestra to size those pools using a different value, set `kestra.allocated-cpu-cores`: + +```yaml +kestra: + allocated-cpu-cores: 2 +``` + +This is useful when you want to limit how aggressively Kestra allocates worker, scheduler, and queue-related threads without changing container limits or host-level CPU settings. + ## Database and datasources Start here if you are choosing the persistence layer for a new Kestra instance or moving from a local setup to a durable environment. In most teams, this is the first configuration page they revisit after initial installation. @@ -212,6 +225,7 @@ Common options include: - `local` for local testing - `s3` +- `s3files` - `gcs` - `azure` - `minio` @@ -301,6 +315,44 @@ kestra: sts-endpoint-override: "" ``` +#### S3 Files compatibility mode + +If your bucket has [AWS S3 Files](https://aws.amazon.com/blogs/aws/launching-s3-files-making-s3-buckets-accessible-as-file-systems/) enabled, set `s3-files-compatible: true`. S3 Files requires bucket versioning, so Kestra enables versioning on the bucket during startup. Kestra will also hard-delete all object versions and delete markers instead of writing a single delete marker. 
+ +```yaml +kestra: + storage: + type: s3 + s3: + region: "" + bucket: "" + s3-files-compatible: true +``` + +:::alert{type="warning"} +Do not enable this flag on a plain S3 bucket that is not using S3 Files. Once versioning is enabled on a bucket it cannot be fully disabled, only suspended. +::: + +### S3 Files + +Use `s3files` when Kestra runs on a host where an [S3 Files](https://aws.amazon.com/blogs/aws/launching-s3-files-making-s3-buckets-accessible-as-file-systems/) NFS filesystem is already mounted locally. This backend reads and writes directly through the local filesystem — no S3 SDK or AWS credentials are required. + +Mount the NFS filesystem on every host that runs a Kestra component before configuring this backend. All components must share the same mount path. + +```yaml +kestra: + storage: + type: s3files + s3files: + mount-path: "/mnt/s3files" +``` + +`mount-path` must point to a directory that exists and is readable and writable by the Kestra process. Kestra will not create the directory on startup. + +Object metadata is stored in `.meta` sidecar files alongside each object on the filesystem. Custom S3 object metadata is not exposed through this backend. + +If you prefer to keep the S3 SDK path (for example, because not every host has the NFS mount), use the standard `s3` backend with `s3-files-compatible: true` instead. + ### MinIO MinIO is a good self-hosted choice when you want object storage behavior without depending on a cloud provider: @@ -493,6 +545,8 @@ kestra: Use `html-head` sparingly for environment banners, extra CSS, or internal scripts that must load with the app shell. +### Local file access + To allow universal file access from host-mounted paths, both mount the directory and add it to the allowlist: ```yaml @@ -505,6 +559,22 @@ kestra: Without the allowlist, file-access URIs pointing at local host paths will be rejected even if the path is mounted into the container. 
+The `io.kestra.plugin.fs.local.Upload` and `io.kestra.plugin.fs.local.Uploads` tasks enforce their own `allowed-paths` check, independent of `kestra.local-files.allowed-paths`. Configure permitted directories under `plugins.configurations`: + +```yaml +kestra: + plugins: + configurations: + - type: io.kestra.plugin.fs.local.Uploads + values: + allowed-paths: + - /data/uploads + - type: io.kestra.plugin.fs.local.Upload + values: + allowed-paths: + - /data/uploads +``` + ## When to use this page - Need logs, telemetry, metrics, endpoints, CORS, or SSL: [Observability and Networking](../03.observability-and-networking/index.md) diff --git a/src/contents/docs/configuration/04.plugins-and-execution/index.md b/src/contents/docs/configuration/04.plugins-and-execution/index.md index 822a9dd383e..28e935413f9 100644 --- a/src/contents/docs/configuration/04.plugins-and-execution/index.md +++ b/src/contents/docs/configuration/04.plugins-and-execution/index.md @@ -93,6 +93,17 @@ kestra: Plugin defaults are evaluated by the Executor and propagated to other components, so every server should use the same `kestra.plugins.defaults`. ::: +`kestra.plugins.defaults` is the canonical global configuration key. The older `kestra.tasks.defaults` key is still recognized for compatibility, but it is deprecated and should be replaced. + +Precedence works as follows: + +- global plugin defaults provide the base values +- flow-level `pluginDefaults` override global defaults +- task properties override non-forced defaults +- `forced: true` prevents the task from overriding that property + +Use non-forced defaults for convenience and consistency, and use forced defaults when the platform must enforce a value such as a specific task runner. 
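The precedence rules above can be sketched with one non-forced and one forced global default. This is a minimal illustration, assuming the `type`/`forced`/`values` entry shape for `kestra.plugins.defaults`; the plugin types and values are chosen only as examples.

```yaml
kestra:
  plugins:
    defaults:
      # Non-forced: a flow-level pluginDefaults entry or the task itself can override `level`.
      - type: io.kestra.plugin.core.log.Log
        values:
          level: ERROR
      # Forced: tasks of this type cannot override `taskRunner`.
      - type: io.kestra.plugin.scripts.shell.Commands
        forced: true
        values:
          taskRunner:
            type: io.kestra.plugin.scripts.runner.docker.Docker
```

Under these assumptions, a `Log` task that sets its own `level` keeps that value, while a `Commands` task that sets a different `taskRunner` still runs with the Docker runner.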
+ Enable or preconfigure plugin features globally: ```yaml diff --git a/src/contents/docs/configuration/05.security-and-secrets/index.md b/src/contents/docs/configuration/05.security-and-secrets/index.md index 3fe0dddce6d..1c57321d1a8 100644 --- a/src/contents/docs/configuration/05.security-and-secrets/index.md +++ b/src/contents/docs/configuration/05.security-and-secrets/index.md @@ -305,15 +305,22 @@ kestra: expire-after: P30D ``` -For username/password auth, enforce password complexity explicitly: +For username/password auth, configure password complexity explicitly: ```yaml kestra: security: basic-auth: - password-regexp: "" + password-min-length: 8 + password-require-special: true + password-min-digits: 1 + password-min-lower-case: 1 + password-min-upper-case: 1 + password-allowed-special-characters: "!@#$%^&*" ``` +These rules apply anywhere Kestra asks a user to set or reset a password, including the initial setup flow, invitation acceptance, and user management screens. + ### Delete configuration files after startup If the runtime reads secrets from configuration files, delete them after startup so tasks cannot read them later from disk: @@ -385,6 +392,24 @@ kestra: Keep the external process manager timeout longer than Kestra's own termination grace period. Otherwise Kubernetes, Docker, or systemd can kill the process before graceful shutdown finishes. ::: +## Regex timeout + +Kestra protects worker threads from ReDoS (catastrophic backtracking) by enforcing a timeout on all regex operations. This applies to [Pebble expression filters](../../expressions/index.mdx) (`regexMatch`, `regexReplace`, `regexExtract`, `replace` with `regexp=true`) and to `validator` patterns on `STRING` and `SECRET` inputs. When a pattern exceeds the limit, the task fails immediately with a timeout error rather than hanging indefinitely. + +The default timeout is **10 seconds**. 
To change it, set `kestra.regex.timeout` in your configuration: + +```yaml +kestra: + regex: + timeout: 30s +``` + +The value is a duration string: either an [ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations) such as `PT30S`, or a shorthand such as `5s` or `1m`. + +:::alert{type="info"} +The timeout is set once at startup and cannot be changed at runtime without restarting the server. +::: + ## Related docs - Secrets manager concepts: [External Secrets Manager](../../07.enterprise/02.governance/secrets-manager/index.md) diff --git a/src/contents/docs/configuration/06.enterprise-and-advanced/index.md b/src/contents/docs/configuration/06.enterprise-and-advanced/index.md index 730342a4668..ae2293a6fc4 100644 --- a/src/contents/docs/configuration/06.enterprise-and-advanced/index.md +++ b/src/contents/docs/configuration/06.enterprise-and-advanced/index.md @@ -17,6 +17,7 @@ This area includes: - Enterprise license configuration - Enterprise Java security +- gRPC TLS/mTLS for worker ↔ controller communication - UI sidebar customization - historical multi-tenancy and default tenant settings - custom links in the UI @@ -88,6 +89,149 @@ kestra: The old multi-tenancy and default-tenant configuration was removed in `0.23.0`; keep it only in mind for migration work. +## gRPC TLS/mTLS (EE only) + +Use this section when running Kestra in a distributed topology where the Worker Controller and Workers communicate over gRPC and you need to encrypt that channel. By default, gRPC traffic is plaintext. Enabling TLS here encrypts the controller ↔ worker channel; enabling mTLS additionally requires workers to present a certificate the controller trusts. + +This feature is active on any component with server type `CONTROLLER`, `WORKER`, or `STANDALONE`. + +### One-way TLS + +The controller presents a certificate; workers verify it against a truststore.
Configure the controller (server side) with a keystore and the workers (client side) with a matching truststore: + +**Controller:** + +```yaml +kestra: + grpc: + tls: + enabled: true + key-store: + path: /etc/kestra/tls/controller-keystore.p12 + type: PKCS12 + password: "" +``` + +**Worker:** + +```yaml +kestra: + grpc: + tls: + enabled: true + trust-store: + path: /etc/kestra/tls/ca-truststore.p12 + type: PKCS12 + password: "" +``` + +If no truststore is provided on the worker side, the JVM default trust store is used. This is appropriate when the controller certificate is signed by a well-known CA. + +### Mutual TLS (mTLS) + +Set `client-auth: REQUIRE` on the controller to enforce that workers present a certificate. Both sides need a keystore and a truststore: + +**Controller:** + +```yaml +kestra: + grpc: + tls: + enabled: true + client-auth: REQUIRE + key-store: + path: /etc/kestra/tls/controller-keystore.p12 + type: PKCS12 + password: "" + trust-store: + path: /etc/kestra/tls/ca-truststore.p12 + type: PKCS12 + password: "" +``` + +**Worker:** + +```yaml +kestra: + grpc: + tls: + enabled: true + key-store: + path: /etc/kestra/tls/worker-keystore.p12 + type: PKCS12 + password: "" + trust-store: + path: /etc/kestra/tls/ca-truststore.p12 + type: PKCS12 + password: "" +``` + +`client-auth` also accepts `OPTIONAL`, which requests a client certificate but does not require one. + +### Authority override for static discovery + +When using static discovery, the gRPC channel authority is the synthetic value `controllers` rather than a real hostname. If the controller certificate's Subject Alternative Names (SANs) do not include `controllers`, TLS verification will fail. 
Set `authority-override` on the worker to a hostname that is present in the certificate's SANs: + +```yaml +kestra: + grpc: + tls: + enabled: true + authority-override: kestra-controller + trust-store: + path: /etc/kestra/tls/ca-truststore.p12 + type: PKCS12 + password: "" +``` + +This is not needed with DNS-based discovery, where the authority is derived from the actual hostname. + +### JKS keystores + +PKCS12 is the recommended format. For JKS keystores, set `type: JKS`. JKS also supports a separate key password (used when the private key entry password differs from the store password): + +```yaml +kestra: + grpc: + tls: + enabled: true + key-store: + path: /etc/kestra/tls/keystore.jks + type: JKS + password: "" + key-password: "" +``` + +### Development: skip certificate verification + +:::alert{type="warning"} +`insecure-trust-all-certificates: true` disables CA verification entirely. Use only in local development or CI environments where certificates are self-signed and not managed. Never enable this in production. 
+::: + +```yaml +kestra: + grpc: + tls: + enabled: true + insecure-trust-all-certificates: true +``` + +### Configuration reference + +| Property | Default | Description | +| --- | --- | --- | +| `kestra.grpc.tls.enabled` | `false` | Enable TLS for gRPC communication | +| `kestra.grpc.tls.key-store.path` | — | Path to keystore file | +| `kestra.grpc.tls.key-store.type` | `PKCS12` | Keystore format (`PKCS12` or `JKS`) | +| `kestra.grpc.tls.key-store.password` | — | Keystore password | +| `kestra.grpc.tls.key-store.key-password` | — | Private key entry password (JKS only) | +| `kestra.grpc.tls.trust-store.path` | — | Path to truststore file | +| `kestra.grpc.tls.trust-store.type` | `PKCS12` | Truststore format | +| `kestra.grpc.tls.trust-store.password` | — | Truststore password | +| `kestra.grpc.tls.client-auth` | `NONE` | Client auth mode: `NONE`, `OPTIONAL`, or `REQUIRE` | +| `kestra.grpc.tls.insecure-trust-all-certificates` | `false` | Skip CA verification (development only) | +| `kestra.grpc.tls.authority-override` | — | Override TLS authority for static discovery | + ## Elasticsearch, Kafka, and indexing This section is really about one architectural choice: running Kestra on the Kafka plus Elasticsearch stack instead of the simpler JDBC-backed setup. If you are on PostgreSQL or MySQL only, much of this page will not apply. @@ -268,6 +412,25 @@ kestra: If indexing falls behind, tune indexer batch settings before changing flow definitions. Those settings control how aggressively Kafka-backed events are flushed into Elasticsearch. +## MCP server cache + +Each webserver node caches MCP server configuration in memory and hot-reloads it when a server is created, updated, or deleted. Two properties control this cache: + +| Property | Default | Description | +|---|---|---| +| `kestra.mcp.server-cache-config.maximum-size` | `500` | Maximum number of MCP server entries held in the cache. 
| +| `kestra.mcp.server-cache-config.expire-after-access` | `PT5M` | Duration after which a cache entry expires if not accessed. | + +```yaml +kestra: + mcp: + server-cache-config: + maximum-size: 200 + expire-after-access: PT10M +``` + +Tune these only if you have a large number of MCP servers or tight memory constraints. The defaults are sufficient for most deployments. + ## AI and isolated environments These are the most optional settings on the page. They matter only if you are enabling Copilot integrations or operating Kestra in restricted network environments. diff --git a/src/contents/docs/expressions/01.context/index.mdx b/src/contents/docs/expressions/01.context/index.mdx index 50fb4726207..f6b737b8f15 100644 --- a/src/contents/docs/expressions/01.context/index.mdx +++ b/src/contents/docs/expressions/01.context/index.mdx @@ -25,6 +25,7 @@ The execution context usually includes: - `namespace` in Enterprise Edition when namespace variables are configured - `envs` for environment variables - `globals` for global configuration values +- `item` inside a [Loop](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md#loop) task iteration :::alert{type="info"} To inspect the full runtime context, use `{{ printContext() }}` in the Debug Expression console. 
@@ -54,11 +55,10 @@ The Debug Expression console is available in the Kestra UI under **Executions | `{{ taskrun.startDate }}` | Start date of the current task run | | `{{ taskrun.attemptsCount }}` | Retry and restart attempt count | | `{{ taskrun.parentId }}` | Parent task run identifier for nested tasks | -| `{{ taskrun.value }}` | Current loop or flowable value | -| `{{ parent.taskrun.value }}` | Value of the nearest parent task run | | `{{ parent.outputs }}` | Outputs of the nearest parent task run | | `{{ parents }}` | List of parent task runs | | `{{ labels }}` | Execution labels accessible by key | +| `{{ trace.parent }}` | W3C `traceparent` header for the current execution; only populated when [OpenTelemetry tracing is enabled](../../10.administrator-guide/open-telemetry/index.md#traces) | Example: @@ -93,6 +93,36 @@ When the execution is started by a `Flow` trigger: | `{{ trigger.flowId }}` | ID of the triggering flow | | `{{ trigger.flowRevision }}` | Revision of the triggering flow | +## Loop iteration context + +Inside a [Loop](../../05.workflow-components/01.tasks/00.flowable-tasks/index.md#loop) task, each iteration runs as an isolated sub-execution. The `item` variable is available to all tasks within that sub-execution. 
+ +| Expression | Description | +|---|---| +| `{{ item.index }}` | Zero-based index of the current iteration | +| `{{ item.value }}` | Value of the current iteration | +| `{{ item.key }}` | Map key of the current iteration; only set when `values` is a map | +| `{{ item.parent.index }}` | Index of the nearest enclosing loop (nested loops only) | +| `{{ item.parent.value }}` | Value of the nearest enclosing loop (nested loops only) | +| `{{ item.parents[n].value }}` | Value of the nth ancestor loop, counting from innermost (`[0]` = immediate parent) | + +Because `item` is bound to the loop execution rather than individual task runs, flowable tasks nested inside a `Loop` (such as `If` or `Parallel`) can access `item` directly without any `parent.` prefix. + +```yaml +tasks: + - id: loop + type: io.kestra.plugin.core.flow.Loop + values: ["value 1", "value 2", "value 3"] + tasks: + - id: check + type: io.kestra.plugin.core.flow.If + condition: '{{ item.value == "value 2" }}' + then: + - id: log + type: io.kestra.plugin.core.log.Log + message: "Matched at index {{ item.index }}: {{ item.value }}" +``` + ## Environment and global variables Kestra provides access to environment variables prefixed with `ENV_` by default, unless configured otherwise in the [runtime and storage configuration](/docs/configuration/runtime-and-storage). diff --git a/src/contents/docs/expressions/03.filters/03.strings/index.mdx b/src/contents/docs/expressions/03.filters/03.strings/index.mdx index c9112ce1e0c..fa4e67279a1 100644 --- a/src/contents/docs/expressions/03.filters/03.strings/index.mdx +++ b/src/contents/docs/expressions/03.filters/03.strings/index.mdx @@ -120,6 +120,31 @@ Escapes special characters in a string. The `type` argument controls which style {# output: Can\'t be here #} ``` +## Regex filters + +Three filters cover the most common regex operations: + +- `regexMatch(regex)` — returns `true` if the input contains a substring matching the pattern, `false` otherwise. 
+- `regexReplace(regex, replacement)` — replaces all non-overlapping matches. Use `$1`, `$2`, … to reference capture groups in the replacement. +- `regexExtract(regex, group)` — returns the first match or a specific capture group. `group` defaults to `0` (the whole match); returns `null` if there is no match. + +```twig +{{ "hello world" | regexMatch("w[a-z]+") }} +{# output: true #} +{{ "2024-01-15" | regexReplace("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1") }} +{# output: 15/01/2024 #} +{{ "order-12345-done" | regexExtract("\\d+") }} +{# output: 12345 #} +{{ "2024-01-15" | regexExtract("(\\d{4})-(\\d{2})-(\\d{2})", 1) }} +{# output: 2024 #} +``` + +:::alert{type="warning"} +Regex filter operations are subject to a **10-second timeout** to prevent ReDoS (catastrophic backtracking). If a pattern takes longer than the limit, the task fails with a timeout error. + +Patterns with nested quantifiers such as `(a+)+` applied to large inputs are most likely to trigger this. Use anchored, non-ambiguous patterns to avoid it. The timeout can be adjusted with [`kestra.regex.timeout`](../../../configuration/05.security-and-secrets/index.md#regex-timeout) in your Kestra configuration. 
+::: + ## Worked string filter example This flow builds a sanitized filename and a display-safe summary from a raw input title: diff --git a/src/contents/docs/expressions/03.filters/index.mdx b/src/contents/docs/expressions/03.filters/index.mdx index 45174dffbc2..d673208f47b 100644 --- a/src/contents/docs/expressions/03.filters/index.mdx +++ b/src/contents/docs/expressions/03.filters/index.mdx @@ -14,7 +14,7 @@ Use filters when you need to transform a value with the pipe syntax: `{{ value | - [JSON and structured data](./01.json/index.mdx) — `toJson`, `toIon`, `jq` - [Numbers and collections](./02.collections/index.mdx) — `abs`, `number`, `first`, `last`, `sort`, `chunk`, `distinct`, and more -- [Strings](./03.strings/index.mdx) — `lower`, `upper`, `replace`, `slugify`, `base64encode`, and more +- [Strings](./03.strings/index.mdx) — `lower`, `upper`, `replace`, `slugify`, `base64encode`, `regexMatch`, `regexReplace`, `regexExtract`, and more - [Dates](./04.dates/index.mdx) — `date`, `dateAdd`, `timestamp`, `timestampMilli`, and precision variants - [YAML](./05.yaml/index.mdx) — `yaml`, `indent`, `nindent` diff --git a/src/contents/docs/expressions/04.functions/04.workflow/index.mdx b/src/contents/docs/expressions/04.functions/04.workflow/index.mdx index 777bb45cf5a..fd1f40bbc78 100644 --- a/src/contents/docs/expressions/04.functions/04.workflow/index.mdx +++ b/src/contents/docs/expressions/04.functions/04.workflow/index.mdx @@ -1,13 +1,37 @@ --- title: "Workflow Helper Functions in Kestra Expressions" h1: "Workflow Helper Functions" -description: Reference for Kestra's workflow and execution helper functions — errorLogs(), currentEachOutput(), tasksWithState(), iterationOutput(), parentOutput(), and appLink(). +description: Reference for Kestra's workflow and execution helper functions — loopOutputs(), errorLogs(), currentEachOutput(), tasksWithState(), iterationOutput(), and appLink(). 
sidebarTitle: Workflow Functions icon: /src/contents/docs/icons/expression.svg --- This group is more situational, but it becomes valuable in complex flows where you need to inspect sibling results, build links back into Kestra, or summarize failures. +## `loopOutputs()` + +Extracts a named output from every iteration of a [Loop](../../../05.workflow-components/01.tasks/00.flowable-tasks/index.md#loop) task and returns the values as an ordered list. Use it after a `Loop` task to collect one output field from all iterations without manually traversing the outputs list. + +```twig +{{ loopOutputs(outputs.myLoop.outputs, 'result') }} +``` + +`outputs` must be the loop's `.outputs` list. `name` is the declared output ID to extract. The function returns one value per iteration in order, with `null` for iterations where the key is missing. + +For a `Loop` over `["a", "b", "c"]` with a declared output `result`: + +```twig +{{ loopOutputs(outputs.loop.outputs, 'result') }} +{# → ["processed a", "processed b", "processed c"] #} +``` + +To access a single iteration directly, use list index notation: + +```twig +{{ outputs.loop.outputs[0].outputs.result }} {# first iteration's output #} +{{ outputs.loop.outputs[0].item.value }} {# first iteration's value #} +``` + ## `errorLogs()` Prints all error logs from the current execution: @@ -45,15 +69,6 @@ Retrieves the output of a specific iteration from a previous task. Both argument {{ iterationOutput(outputs.myTask, 2).value }} ``` -## `parentOutput()` - -Retrieves the output of a parent task. 
The optional `index` argument specifies which ancestor to target; omitting it returns the direct parent's output: - -```twig -{{ parentOutput() }} -{{ parentOutput(1) }} -``` - ## `appLink()` Enterprise Edition's `appLink()` builds links back to Kestra Apps: diff --git a/src/contents/docs/expressions/04.functions/06.dates/index.mdx b/src/contents/docs/expressions/04.functions/06.dates/index.mdx index 96c3b19f477..cc19500c8a2 100644 --- a/src/contents/docs/expressions/04.functions/06.dates/index.mdx +++ b/src/contents/docs/expressions/04.functions/06.dates/index.mdx @@ -33,6 +33,17 @@ Returns `true` if the date is the Nth occurrence of the given weekday in its mon {{ isDayWeekInMonth(trigger.date, 'MONDAY', 'FIRST') }} ``` +## `isLastWorkingDay()` + +Returns `true` if the date is the last working day of its month. Working days default to Monday–Friday. An optional second argument overrides which days count as working days using a comma-separated list of uppercase day names: + +```twig +{{ isLastWorkingDay(trigger.date) }} +{{ isLastWorkingDay(trigger.date, 'MONDAY,TUESDAY,WEDNESDAY,THURSDAY') }} +``` + +The `date` argument accepts any ISO 8601 date or datetime string. Combine with `isPublicHoliday()` if you also need to exclude public holidays. 
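To make the `isLastWorkingDay()` rule concrete, here is a minimal Python sketch of the logic described above. It is an illustration under stated assumptions, not Kestra's implementation: the names `is_last_working_day`, `DAY_NAMES`, and `DEFAULT_WORKING_DAYS` are hypothetical, the input is assumed to be an already-parsed `date`, and public holidays are ignored.

```python
import calendar
from datetime import date

# Hypothetical helper: weekday names indexed by date.weekday() (Monday == 0).
DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]

# Assumed default, mirroring the Monday-Friday default described above.
DEFAULT_WORKING_DAYS = "MONDAY,TUESDAY,WEDNESDAY,THURSDAY,FRIDAY"


def is_last_working_day(day: date, working_days: str = DEFAULT_WORKING_DAYS) -> bool:
    """Return True if `day` is the last working day of its month."""
    # Parse the comma-separated list of uppercase day names.
    allowed = {name.strip().upper() for name in working_days.split(",")}
    # Number of days in the month of `day`.
    last_day = calendar.monthrange(day.year, day.month)[1]
    # Walk backwards from the end of the month to the first allowed weekday.
    for day_of_month in range(last_day, 0, -1):
        candidate = date(day.year, day.month, day_of_month)
        if DAY_NAMES[candidate.weekday()] in allowed:
            return candidate == day
    # No allowed weekday occurs in this month (for example, an empty list).
    return False


if __name__ == "__main__":
    # March 2024 ends on a Sunday, so the last Monday-Friday day is the 29th.
    print(is_last_working_day(date(2024, 3, 29)))
    # With Friday excluded, the last working day moves back to Thursday the 28th.
    print(is_last_working_day(date(2024, 3, 28), "MONDAY,TUESDAY,WEDNESDAY,THURSDAY"))
```

The sketch searches backwards from the end of the month, so at most a handful of dates are examined. Excluding public holidays would need an extra check per candidate date, which is why the text above suggests combining the function with `isPublicHoliday()`.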
+ ## `dayOfWeek()` Returns the uppercase day name such as `MONDAY`: diff --git a/src/contents/docs/expressions/04.functions/index.mdx b/src/contents/docs/expressions/04.functions/index.mdx index 825844ff74b..6f6a8272a27 100644 --- a/src/contents/docs/expressions/04.functions/index.mdx +++ b/src/contents/docs/expressions/04.functions/index.mdx @@ -17,7 +17,7 @@ Functions are best thought of as helpers that either fetch something, compute so - [Rendering and debugging](./01.rendering/index.mdx) — `render()`, `renderOnce()`, `printContext()`, template inheritance helpers - [Data access](./02.data-access/index.mdx) — `secret()`, `credential()`, `read()`, `fileURI()`, `kv()`, `encrypt()`, `decrypt()` - [Data parsing](./03.parsing/index.mdx) — `fromJson()`, `fromIon()`, `yaml()` -- [Workflow helpers](./04.workflow/index.mdx) — `errorLogs()`, `currentEachOutput()`, `tasksWithState()`, `iterationOutput()`, `parentOutput()`, `appLink()` +- [Workflow helpers](./04.workflow/index.mdx) — `errorLogs()`, `currentEachOutput()`, `tasksWithState()`, `iterationOutput()`, `appLink()` - [Utilities](./05.utilities/index.mdx) — `now()`, `uuid()`, `randomInt()`, `http()`, `fileSize()`, `fileExists()`, and more - [Date and calendar](./06.dates/index.mdx) — `isWeekend()`, `isPublicHoliday()`, `dayOfWeek()`, `monthOfYear()`, and more diff --git a/src/contents/docs/task-runners/04.types/03.kubernetes-task-runner/index.md b/src/contents/docs/task-runners/04.types/03.kubernetes-task-runner/index.md index fbef02271f7..743075cdfd6 100644 --- a/src/contents/docs/task-runners/04.types/03.kubernetes-task-runner/index.md +++ b/src/contents/docs/task-runners/04.types/03.kubernetes-task-runner/index.md @@ -291,6 +291,29 @@ taskRunner: caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" ``` +## Connection and concurrency settings + +At high concurrency, each task opens multiple WebSocket connections against the API server — one for the pod watch, one for the log stream, and one or two for file upload and 
sidecar signaling. On clusters that enforce API rate limits (such as GKE), this can cause transient failures and slow API server responses, compounding timeout issues. + +Three properties on the `config:` block let you cap concurrent connections and tune reconnect backoff: + +| Property | Default | Description | +|---|---|---| +| `maxConcurrentRequests` | `64` | Maximum total concurrent HTTP requests per client. | +| `maxConcurrentRequestsPerHost` | `5` | Maximum concurrent HTTP requests to the API server host. | +| `watchReconnectInterval` | `PT1S` | Backoff between watch reconnects. Increase to prevent reconnect storms under API pressure. | + +```yaml +taskRunner: + type: io.kestra.plugin.ee.kubernetes.runner.Kubernetes + config: + masterUrl: https://docker-for-desktop:6443 + caCertData: "{{ secret('K8S_CA_CERT_DATA') }}" + maxConcurrentRequests: 32 + maxConcurrentRequestsPerHost: 3 + watchReconnectInterval: PT5S +``` + ## Pod and container customization The Kubernetes task runner exposes several properties for customizing the pod spec beyond standard options like `resources` and `namespace`. These are advanced properties intended for cases such as security hardening, shared volumes, custom sidecars, or node scheduling constraints. diff --git a/src/contents/docs/task-runners/04.types/04.aws-batch-task-runner/index.md b/src/contents/docs/task-runners/04.types/04.aws-batch-task-runner/index.md index 433c56937ad..8c854808ae6 100644 --- a/src/contents/docs/task-runners/04.types/04.aws-batch-task-runner/index.md +++ b/src/contents/docs/task-runners/04.types/04.aws-batch-task-runner/index.md @@ -30,6 +30,8 @@ To support `inputFiles`, `namespaceFiles`, and `outputFiles`, the task runner cr 2. The _main_ container that fetches input files into the `{{ workingDir }}` directory and runs the task. 3. An _after_-container that fetches output files using `outputFiles` to make them available from the Kestra UI for download and preview. 
+The before- and after-containers use the `amazon/aws-cli` image. If your environment restricts which images can be pulled (ECR pull-through cache, VPC egress policy, or image allowlist), ensure this image is accessible. + **EKS:** Uses [EKS job definitions](https://docs.aws.amazon.com/batch/latest/userguide/jobs-eks.html) with a Kubernetes pod. Sidecar containers run as pod containers using the same S3-based file transfer pattern. The main container command is wrapped in `/bin/sh -c`, so the container image must include `/bin/sh`. Since the working directory of the container isn’t known in advance, you must define the working and output directories explicitly. For example, use `cat {{ workingDir }}/myFile.txt` instead of `cat myFile.txt`. diff --git a/src/contents/docs/task-runners/04.types/07.google-cloudrun-task-runner/index.md b/src/contents/docs/task-runners/04.types/07.google-cloudrun-task-runner/index.md index 77d205309c7..126d677ea04 100644 --- a/src/contents/docs/task-runners/04.types/07.google-cloudrun-task-runner/index.md +++ b/src/contents/docs/task-runners/04.types/07.google-cloudrun-task-runner/index.md @@ -119,7 +119,7 @@ Three properties control how long the runner waits and how often it checks job s | Property | Default | Description | |---|---|---| -| `waitUntilCompletion` | `PT1H` | Maximum wall-clock time before the job is timed out. The task's own `timeout` takes precedence when set. | +| `waitUntilCompletion` | `PT1H` | Maps to the GCP **Task timeout** field visible in the GCP console under Task capacity. Controls both the GCP-enforced task timeout and the Kestra polling timeout — the Cloud Run task is forcibly terminated by GCP when this duration elapses. The Kestra task-level `timeout` property takes precedence when set. GCP maximum is 168 hours (`PT168H`). | | `completionCheckInterval` | `PT5S` | How often to poll the Cloud Run API for job status. Lower values reduce latency for short jobs; higher values reduce API calls for long ones. 
| | `waitForLogInterval` | `PT5S` | Extra time to stream late log entries after job completion. | diff --git a/src/contents/docs/use-cases/05.approval-processes/index.md b/src/contents/docs/use-cases/05.approval-processes/index.md index 2915c68c138..e3671895ce5 100644 --- a/src/contents/docs/use-cases/05.approval-processes/index.md +++ b/src/contents/docs/use-cases/05.approval-processes/index.md @@ -314,13 +314,10 @@ tasks: triggers: - id: flow type: io.kestra.plugin.core.trigger.Flow - preconditions: - id: flow1 - flows: - - flowId: pause_demo - namespace: demo - states: - - SUCCESS + dependsOn: + - flowId: pause_demo + namespace: demo + states: [SUCCESS] ``` Why this is robust: diff --git a/src/contents/docs/version-control-cicd/04.git/index.md b/src/contents/docs/version-control-cicd/04.git/index.md index 8bc75779dd7..6b14f0c5ab1 100644 --- a/src/contents/docs/version-control-cicd/04.git/index.md +++ b/src/contents/docs/version-control-cicd/04.git/index.md @@ -21,6 +21,8 @@ There are multiple ways to combine Kestra with Git: - [SyncNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.syncnamespacefiles) syncs namespace files the same way. - [PushFlows](/plugins/plugin-git/io.kestra.plugin.git.pushflows) commits and pushes flow edits from the UI to Git, useful when you rely on the built-in editor but still want version history. - [PushNamespaceFiles](/plugins/plugin-git/io.kestra.plugin.git.pushnamespacefiles) does the same for namespace files. +- [SyncBlueprints](/plugins/plugin-ee-git/io.kestra.plugin.ee.git.SyncBlueprints) syncs Custom Blueprints from Git to Kestra (Enterprise Edition). +- [PushBlueprints](/plugins/plugin-ee-git/io.kestra.plugin.ee.git.PushBlueprints) commits and pushes Custom Blueprints from Kestra to Git (Enterprise Edition). - [Clone](https://kestra.io/plugins/plugin-git/io.kestra.plugin.git.clone) clones a repository directly into a flow so scripts are available at runtime. 
- [TenantSync](/plugins/plugin-git/io.kestra.plugin.git.tenantsync) synchronizes all namespaces in a tenant, including flows, files, apps, tests, and dashboards. - [NamespaceSync](/plugins/plugin-git/io.kestra.plugin.git.namespacesync) keeps a single namespace in sync with a Git repo. @@ -188,6 +190,15 @@ The [Git Clone](/plugins/plugin-git/io.kestra.plugin.git.clone) pattern clones a - Infrastructure deployments via [Terraform CLI](/plugins/plugin-terraform/cli/io.kestra.plugin.terraform.cli.terraformcli), [OpenTofu CLI](/plugins/plugin-opentofu/cli/io.kestra.plugin.opentofu.cli.opentofucli), [Terragrunt CLI](/plugins/plugin-terragrunt/cli/io.kestra.plugin.terragrunt.cli.terragruntcli), or [Ansible CLI](/plugins/plugin-ansible/cli/io.kestra.plugin.ansible.cli.ansiblecli) - Docker builds via the [Docker Build task](/plugins/plugin-docker/io.kestra.plugin.docker.build) +## Git SyncBlueprints and PushBlueprints + +[SyncBlueprints](/plugins/plugin-ee-git/io.kestra.plugin.ee.git.SyncBlueprints) and [PushBlueprints](/plugins/plugin-ee-git/io.kestra.plugin.ee.git.PushBlueprints) bring the same GitOps patterns to Custom Blueprints (Enterprise Edition). + +- **`SyncBlueprints`** – pulls blueprints from a Git directory into Kestra. Use this when a central team owns an approved blueprint library and needs to deploy it across environments. Set `delete: true` to treat Git as the sole source of truth and remove any blueprints not present in the repository. +- **`PushBlueprints`** – exports blueprints from Kestra to Git. Use this when teams author blueprints in the UI and want version history or a pull request review workflow. + +See [Custom Blueprints](../07.enterprise/02.governance/custom-blueprints/index.md#version-control-for-custom-blueprints) for full examples and the blueprint YAML file format. 
+ ## Git TenantSync and NamespaceSync Both [Git TenantSync](/plugins/plugin-git/io.kestra.plugin.git.tenantsync) and [Git NamespaceSync](/plugins/plugin-git/io.kestra.plugin.git.namespacesync) give you full control over synchronizing Kestra objects with your Git repository. diff --git a/src/contents/redirects/docs.yml b/src/contents/redirects/docs.yml index edb7cdd57d1..708e52bda08 100644 --- a/src/contents/redirects/docs.yml +++ b/src/contents/redirects/docs.yml @@ -22,6 +22,8 @@ to: "/docs/api-reference" - regexp: "/docs/best-practices/pebble-templating-with-namespace-files/?$" to: "/docs/best-practices/expressions-with-namespace-files" +- regexp: "^/docs/best-practices/foreach-and-foreachitem(/.*)?$" + to: "/docs/best-practices/loop" - regexp: "/docs/concepts/editor(/.*)?$" to: "/docs/ui/flows" - regexp: "/docs/concepts/expression/basic-usage"