Name	Name	Last commit message	Last commit date
parent directory ..
README.md	README.md
bootstrap.yaml	bootstrap.yaml
demo.gif	demo.gif
engrams.yaml	engrams.yaml
prompts.yaml	prompts.yaml
secrets.yaml.example	secrets.yaml.example
story.yaml	story.yaml
storyrun.yaml	storyrun.yaml

Map-Reduce Summarizer

Batch pipeline that fetches and summarizes multiple articles in parallel using the map-reduce pattern.

What it demonstrates

Map-reduce pattern — fans out work across dynamic input arrays
Dynamic parallelism — concurrency is configured, not hard-coded; the adapter spawns one sub-story per item
Sub-stories — article-processor is a self-contained Story invoked per article
Storage refs for large payloads — article bodies may be offloaded to storage transparently
ConfigMap-backed prompts — system prompts stored in a ConfigMap, not inline in Engram YAML

Architecture

graph TD
    A[StoryRun: map-reduce-pipeline] --> B[fan-out]
    B -->|article 1| C1[article-processor]
    B -->|article 2| C2[article-processor]
    B -->|article N| CN[article-processor]
    C1 --> D[aggregate]
    C2 --> D
    CN --> D
    D --> E[digest output]

    subgraph article-processor
        F[fetch] --> G[summarize]
    end

Resources

story.yaml contains two Story resources:

Story	Role
`article-processor`	Sub-story invoked per article (fetch + summarize)
`map-reduce-pipeline`	Main pipeline: fan-out across articles, then aggregate

Prerequisites

BubuStack installed on your cluster
Shared storage enabled for BubuStack payload offloading (for example the SeaweedFS/S3 quickstart with the bubu-default bucket)
EngramTemplates: http-request, openai-chat, map-reduce-adapter
OpenAI API key

Quick start

# 1. Create the namespace
kubectl apply -f bootstrap.yaml

# 2. Make sure shared storage is enabled in the cluster before deploying this example.
#    If you used the Bobrapet quickstart, SeaweedFS/S3 is already installed.

# 3. Create the secrets (copy and edit first)
cp secrets.yaml.example secrets.yaml
# Edit secrets.yaml with your OpenAI API key
kubectl apply -f secrets.yaml

# 4. Deploy the ConfigMap (system prompts)
kubectl apply -f prompts.yaml

# 5. Deploy the Engrams
kubectl apply -f engrams.yaml

# 6. Deploy the Stories (2 Story resources in one file)
kubectl apply -f story.yaml

# 7. Trigger a run
kubectl apply -f storyrun.yaml

Verify

# Watch the StoryRun progress
kubectl get storyruns -n map-reduce-summarizer -w

# Check the top-level StepRun phases
kubectl get stepruns -n map-reduce-summarizer

# Inspect the map stage result summary (succeeded/failed item counts)
kubectl get steprun map-reduce-summarizer-run-fan-out -n map-reduce-summarizer \
  -o json | jq '.status.output.stats'

# View the final digest output
kubectl get storyrun map-reduce-summarizer-run -n map-reduce-summarizer \
  -o jsonpath='{.status.output}' | jq .

Cleanup

kubectl delete namespace map-reduce-summarizer

Under the Hood

The fan-out step uses the map-reduce-adapter Engram. When the StepRun starts, the adapter reads inputs.articles (an array) and submits one durable StoryTrigger per array element. The controller resolves each accepted trigger into a child article-processor StoryRun.
The adapter manages concurrency (3 at a time), watches child StoryRun phases, and collects results. inlineResultsLimit: 10 means successful child outputs are written directly into fan-out.status.output.results instead of forcing another storage lookup hop.
Each child StoryRun is a full StoryRun with its own StepRuns (fetch -> summarize). The controller treats them identically to top-level runs.
When all children complete (or fail with allowFailures), the adapter writes the collected results array to its StepRun output. Large article bodies from the child stories may already be represented as $bubuStorageRef pointers, backed by the shared S3-compatible storage backend. The parent StoryRun controller then creates the aggregate StepRun and resolves those refs during template evaluation when needed.
failFast: false means the adapter waits for ALL children even if some fail early. allowFailures: true means the parent pipeline can still continue, so check fan-out.status.output.stats if the aggregate digest looks empty.

CRDs involved: Story (x2: article-processor + map-reduce-pipeline), StoryTrigger (per map item), StoryRun (parent + N children), StepRun, Engram, EngramTemplate (http-request, openai-chat, map-reduce-adapter), ConfigMap (prompts)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

README.md

Map-Reduce Summarizer

What it demonstrates

Architecture

Resources

Prerequisites

Quick start

Verify

Cleanup

Under the Hood

Uh oh!

FilesExpand file tree

map-reduce-summarizer

Directory actions

More options

Directory actions

More options

Latest commit

History

map-reduce-summarizer

Folders and files

parent directory

README.md

Map-Reduce Summarizer

What it demonstrates

Architecture

Resources

Prerequisites

Quick start

Verify

Cleanup

Under the Hood