This document explains the current capabilities of the TQL workflow runtime and how to take advantage of the latest improvements.
- `needs` = step IDs (not dataset names). The engine now maps those identifiers back to the dataset emitted by each step, so query and output runners automatically pull in the correct data.
- Use the optional `from` field on query steps when you want to target a specific named dataset. Otherwise, every dependency listed in `needs` is loaded into the query's EAV store.
- Map-mode HTTP sources resolve their `mapFrom` reference through the same lookup, so you can safely use either the step id or the dataset name when iterating.
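
The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TQL's actual implementation: the `Step` class, `resolve_dataset`, and `datasets_for_query` are hypothetical names, and `from_` stands in for the YAML `from` field.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """Hypothetical, simplified view of a workflow step."""
    id: str
    out: Optional[str] = None            # dataset name emitted by the step
    needs: List[str] = field(default_factory=list)
    from_: Optional[str] = None          # stands in for the YAML `from` field


def resolve_dataset(ref: str, steps: Dict[str, Step]) -> str:
    """Map a step id (or an existing dataset name) to a dataset name."""
    if ref in steps and steps[ref].out is not None:
        return steps[ref].out
    if any(s.out == ref for s in steps.values()):
        return ref
    raise KeyError(f"no dataset found for reference {ref!r}")


def datasets_for_query(step: Step, steps: Dict[str, Step]) -> List[str]:
    """Datasets a query step loads: only `from` if set, else every dependency."""
    if step.from_ is not None:
        return [step.from_]
    return [resolve_dataset(dep, steps) for dep in step.needs]


steps = {
    s.id: s
    for s in [
        Step(id="fetch_users", out="users"),
        Step(id="fetch_posts", out="posts"),
        Step(id="join", needs=["fetch_users", "fetch_posts"], out="joined"),
        Step(id="only_posts", needs=["fetch_users", "fetch_posts"],
             from_="posts", out="p"),
    ]
}

print(datasets_for_query(steps["join"], steps))
print(datasets_for_query(steps["only_posts"], steps))
print(resolve_dataset("fetch_users", steps), resolve_dataset("users", steps))
```

The last line mirrors the `mapFrom` behaviour: both the step id and the dataset name resolve to the same dataset.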

Referencing a dataset name in `needs` instead of a step id:

    steps:
      - id: fetch_posts
        type: source
        source:
          kind: http
          url: 'https://api.example.com/posts'
          mode: batch
        out: posts
      - id: filter_posts
        type: query
        needs: [posts] # ❌ Wrong! Should be [fetch_posts]
        eqls: 'FIND item AS ?p RETURN ?p'
        out: filtered_posts

    Error: Unknown dependency "posts" in step "filter_posts". Did you mean step id "fetch_posts"?
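
A check that produces a message of this shape might look like the following. It is a minimal sketch, assuming steps are plain dictionaries with `id`, `needs`, and `out` keys; `check_needs` is a hypothetical helper, not part of TQL.

```python
def check_needs(steps: list) -> list:
    """Return one error message per `needs` entry that is not a step id."""
    step_ids = {s["id"] for s in steps}
    # Reverse lookup: dataset name -> id of the step that emits it.
    producer = {s["out"]: s["id"] for s in steps if "out" in s}
    errors = []
    for step in steps:
        for dep in step.get("needs", []):
            if dep in step_ids:
                continue
            msg = f'Unknown dependency "{dep}" in step "{step["id"]}".'
            if dep in producer:
                # The user most likely wrote a dataset name; suggest its step.
                msg += f' Did you mean step id "{producer[dep]}"?'
            errors.append(msg)
    return errors


workflow = [
    {"id": "fetch_posts", "type": "source", "out": "posts"},
    {"id": "filter_posts", "type": "query", "needs": ["posts"],
     "out": "filtered_posts"},
]

for error in check_needs(workflow):
    print(error)
```

Changing `needs` to `["fetch_posts"]` makes the check return an empty list.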

A source step that omits `out`, so downstream steps have no dataset to read:

    steps:
      - id: fetch_data
        type: source
        source:
          kind: http
          url: 'https://api.example.com/data'
          mode: batch
        # ❌ Missing out: field
      - id: process_data
        type: query
        needs: [fetch_data] # ❌ fetch_data doesn't produce output
        eqls: 'FIND item AS ?x RETURN ?x'
        out: processed_data

A map-mode source that omits `mapFrom`:

    steps:
      - id: fetch_users
        type: source
        source:
          kind: http
          url: 'https://api.example.com/users'
          mode: batch
        out: users
      - id: fetch_posts
        type: source
        source:
          kind: http
          url: 'https://api.example.com/posts/{{row.id}}'
          mode: map
          # ❌ Missing mapFrom: users
        out: posts

- Output steps now read from their declared dependencies instead of the most recent dataset in memory. When multiple dependencies are present, the workflow uses the last one in the list, so list the step that produces the desired dataset last.
- The runner surfaces friendly log lines showing which dataset and row count were written.
- Cache keys include the normalized step specification plus a hash of the datasets that have been materialised so far. Because dependency resolution is deterministic, cache hits remain stable across runs.
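
To illustrate why such keys stay stable, here is a small sketch of a cache key built from a normalized step spec plus the datasets materialised so far. The use of sorted-key JSON and SHA-256 is an assumption made for the example; the source does not say which normalization or hash function TQL uses, and `cache_key` is a hypothetical name.

```python
import hashlib
import json


def cache_key(step_spec: dict, datasets: dict) -> str:
    """Hash the normalized spec together with the datasets materialised so far."""
    # sort_keys gives a canonical form, so key order in the YAML is irrelevant.
    payload = json.dumps(
        {"step": step_spec, "datasets": datasets},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


spec_a = {"id": "filter_posts", "type": "query", "needs": ["fetch_posts"]}
# Same spec, different key order.
spec_b = {"needs": ["fetch_posts"], "type": "query", "id": "filter_posts"}
data = {"posts": [{"id": 1, "title": "hello"}]}

key_a = cache_key(spec_a, data)
key_b = cache_key(spec_b, data)
key_c = cache_key(spec_a, {"posts": [{"id": 2, "title": "changed"}]})

print(key_a == key_b)  # identical inputs give the same key
print(key_a == key_c)  # changed upstream data gives a different key
```

Because the inputs are serialized deterministically, the same workflow and the same upstream data always yield the same key, while any change to either produces a new one.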

Note: You can use `tql wf` as a shortcut for `tql workflow`.

    tql workflow run examples/workflows/webfonts-serifs.yml --dry --limit 10 --log pretty

Options:

- `--dry`: Dry run mode (limit data processing)
- `--watch`: Watch file for changes and re-run
- `--limit <number>`: Limit rows per step (default: 50)
- `--var <key=value...>`: Set template variables (can be used multiple times)
- `--cache <mode>`: Cache mode: read|write|off (default: write)
- `--log <format>`: Log format: pretty|json (default: pretty)
- `--no-color`: Disable colored output
- `--out <dir>`: Output directory (default: ./out)

    # Show execution plan
    tql workflow plan examples/workflows/simple-demo.yml

    # Generate Graphviz DOT format
    tql workflow plan examples/workflows/simple-demo.yml --dot

    # Generate Mermaid format
    tql workflow plan examples/workflows/simple-demo.yml --mermaid

    # Generate JSON format for tooling
    tql workflow plan examples/workflows/simple-demo.yml --json

Options:

- `--var <key=value...>`: Set template variables (can be used multiple times)
- `--dot`: Output as Graphviz DOT format
- `--mermaid`: Output as Mermaid format
- `--json`: Output as JSON format

    # Pass variables for URL interpolation
    tql workflow run workflow.yml --var API_KEY=abc123 --var LIMIT=100

    # Use in workflow YAML
    steps:
      - id: fetch_data
        type: source
        source:
          kind: http
          url: "https://api.example.com/data?key={{var.API_KEY}}&limit={{var.LIMIT}}"
          mode: batch
        out: data

- Cache not working: Check `--cache write` on first run, then `--cache read` on subsequent runs
- Cache key correlation: Use `--log json` to see cache keys, or look for `[abc12345]` in pretty logs
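
As a rough picture of how `--var key=value` pairs could end up in a `{{var.NAME}}` placeholder, consider the sketch below. It assumes simple textual substitution; `parse_vars` and `render` are hypothetical helpers, and TQL's real template engine may behave differently (for example around escaping or missing variables).

```python
import re

# Matches {{var.NAME}} only; other placeholders such as {{row.id}} are left alone.
PLACEHOLDER = re.compile(r"\{\{\s*var\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def parse_vars(pairs: list) -> dict:
    """Turn ["API_KEY=abc123", "LIMIT=100"] into a dict, splitting on the first '='."""
    result = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        result[key] = value
    return result


def render(template: str, variables: dict) -> str:
    """Replace each {{var.NAME}} placeholder; unknown names raise KeyError."""
    return PLACEHOLDER.sub(lambda m: variables[m.group(1)], template)


variables = parse_vars(["API_KEY=abc123", "LIMIT=100"])
url = render(
    "https://api.example.com/data?key={{var.API_KEY}}&limit={{var.LIMIT}}",
    variables,
)
print(url)  # https://api.example.com/data?key=abc123&limit=100
```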

- Large responses: Default 10MB limit. Set the `TQL_HTTP_MAX_BYTES=50MB` environment variable to increase it
- Non-JSON responses: Check the content-type header and the first 256 bytes of the response body in the error message
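
For a sense of what a value like `50MB` means in bytes, the sketch below parses such a size string. It assumes binary units (1 MB = 1024 × 1024 bytes); the source does not state whether TQL uses binary or decimal units, and `parse_bytes` and `max_response_bytes` are hypothetical helpers.

```python
import re

# Assumption: binary multiples. A decimal interpretation would use powers of 1000.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_bytes(text: str) -> int:
    """Parse strings such as "512", "10MB" or "1gb" into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?B)?\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"cannot parse size {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS[(unit or "B").upper()]


def max_response_bytes(env: dict) -> int:
    """Read the limit from an environment mapping, falling back to 10MB."""
    return parse_bytes(env.get("TQL_HTTP_MAX_BYTES", "10MB"))


print(max_response_bytes({}))
print(max_response_bytes({"TQL_HTTP_MAX_BYTES": "50MB"}))
```

Passing `os.environ` as the mapping would read the real environment variable.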

- Row limits: `--limit N` caps all steps to N rows
- Map mode caps: First 20 rows of the `mapFrom` dataset are processed in dry mode
- Memory usage: Large datasets may be truncated; use `--limit` to control

- "Unknown dependency": Use step IDs in `needs`, not dataset names
- "Circular dependency": Shows minimal cycle like `a → b → a`
- "Missing mapFrom": Map mode sources require `mapFrom: dataset_name`
- "Duplicate output": Each step's `out` field must be unique
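
The "Circular dependency" message can be understood through a standard depth-first search over the `needs` graph. The following is a generic sketch, not TQL's code: `find_cycle` is a hypothetical function, and it returns the first cycle the search encounters, which is not guaranteed to be the shortest one in larger graphs.

```python
def find_cycle(needs: dict) -> list:
    """Return one dependency cycle as a list of step ids, or [] if none exists.

    `needs` maps each step id to the list of step ids it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in needs}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for dep in needs.get(node, []):
            if color.get(dep, WHITE) == GREY:
                # Back edge: the cycle is the part of the path from `dep` onwards.
                return stack[stack.index(dep):] + [dep]
            if dep in needs and color[dep] == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return []

    for node in needs:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return []


cyclic = {"a": ["b"], "b": ["a"], "c": ["a"]}
acyclic = {"a": [], "b": ["a"], "c": ["a", "b"]}

print(" → ".join(find_cycle(cyclic)))  # a → b → a
print(find_cycle(acyclic))             # []
```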

- Opt-in analytics: Set `TQL_TELEMETRY=true` to help improve TQL
- No PII collected: Only command types, durations, and success/failure
- Disable: Set `TQL_TELEMETRY=false` or unset the environment variable