Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Multi-Model Consensus

Run the same question through multiple AI models from different providers in parallel and synthesize the best answer.

Multi-Model Consensus Demo

What it demonstrates

  • Cross-provider consensus — OpenAI and Anthropic models analyze the same question in parallel
  • Parallel execution — four analyst steps run concurrently with no dependencies
  • Mixed EngramTemplates — OpenAI analysts use openai-chat, Claude uses http-request calling the Anthropic Messages API directly
  • Fan-in pattern — a judge step waits for all analysts before synthesizing
  • Output aggregation — the Story output section collects individual and synthesized results into storyrun.status.output
  • Automatic payload offloading — AI responses that exceed the default inline threshold (4 KB) are automatically offloaded to S3; the controller resolves storage refs transparently when evaluating templates

Architecture

graph LR
    Q[Question] --> A1[analysis-gpt4o<br/>GPT-4o, t=0.3]
    Q --> A2[analysis-gpt4o-mini<br/>GPT-4o-mini, t=0.3]
    Q --> A3[analysis-creative<br/>GPT-4o-mini, t=0.9]
    Q --> A4[analysis-claude<br/>Claude Sonnet]
    A1 --> J[consensus<br/>GPT-4o, t=0.2]
    A2 --> J
    A3 --> J
    A4 --> J
    J --> C[Consensus Answer]
Loading

Prerequisites

  • BubuStack installed on your cluster
  • Shared storage enabled for BubuStack payload offloading (for example the SeaweedFS/S3 quickstart with the bubu-default bucket)
  • EngramTemplates: openai-chat, http-request
  • OpenAI API key with access to gpt-4o and gpt-4o-mini
  • Anthropic API key with access to Claude Sonnet

Quick start

# 1. Create the namespace
kubectl apply -f bootstrap.yaml

# 2. Make sure shared storage is enabled in the cluster before deploying this example.
#    If you used the Bobrapet quickstart, SeaweedFS/S3 is already installed.

# 3. Create the secrets (copy and edit first)
cp secrets.yaml.example secrets.yaml
# Edit secrets.yaml with your OpenAI and Anthropic API keys
kubectl apply -f secrets.yaml

# 4. Deploy the Engrams
kubectl apply -f engrams.yaml

# 5. Deploy the Story
kubectl apply -f story.yaml

# 6. Trigger a run
kubectl apply -f storyrun.yaml

Verify

# Watch the StoryRun progress
kubectl get storyruns -n multi-model-consensus -w

# Check individual StepRun phases (4 analysts + 1 judge)
kubectl get stepruns -n multi-model-consensus

# View the consensus output
kubectl get storyrun multi-model-consensus-run -n multi-model-consensus \
  -o jsonpath='{.status.output}' | jq .

Cleanup

kubectl delete namespace multi-model-consensus

Under the Hood

  1. The Story defines 5 steps. Four analysts have no needs — the controller creates all 4 StepRuns immediately.

  2. Three OpenAI analysts use the openai-chat EngramTemplate with different config overrides (model, temperature). The Claude analyst uses the http-request EngramTemplate configured with customHeader auth to pass the x-api-key header to the Anthropic Messages API. The Engram encapsulates the URL, method, headers, and auth — the step only provides the request body.

  3. The consensus step has needs: [analysis-gpt4o, analysis-gpt4o-mini, analysis-creative, analysis-claude]. The StoryRun controller watches all 4 StepRuns; only when all reach Succeeded does it create the consensus StepRun.

  4. Template expressions differ by provider. OpenAI steps return output.text directly. The Claude step returns the raw Anthropic Messages API response, so the template uses (index steps["analysis-claude"].output.body.content 0).text to extract the first content block.

  5. AI model responses typically exceed the default 4 KB inline threshold. When this happens, the controller automatically offloads step outputs to S3 and stores a $bubuStorageRef pointer. When the consensus step's template references these outputs, the controller resolves the storage refs transparently based on the templating.offloaded-data-policy setting (inject spawns a materialize pod, controller resolves in-process). See the materialize-demo example for explicit offloading control.

CRDs involved: Story, StoryRun, StepRun, Engram, EngramTemplate (openai-chat, http-request)