This is the shortest end-to-end walkthrough for a new user who wants to see what llm-batch-pipeline does in practice.
It covers two real workflows:
- OpenAI Batch API with `gpt-4o-mini`
- A 3-way sharded Ollama setup at `http://nanu:11435`, `http://nanu:11436`, and `http://nanu:11437` (note: "nanu" is my hostname for the Ollama server; replace it with your own)

These instructions were tested against the live services on 2026-04-09.

You will:

- create a batch job with the built-in `spam_detection` plugin
- add two sample `.eml` files
- render a batch JSONL file
- submit it to a backend
- validate the model output against a Pydantic schema
- evaluate the predictions against ground truth

You will need:

- Python 3.13+
- `uv`
- dependencies installed: `uv sync`
- for OpenAI: a `.env` file in the repo root with `OPENAI_API_KEY=...`

The CLI auto-loads `.env` from the repository root.

Before using any backend, please verify the install:

```bash
uv run llm-batch-pipeline list   # list the plugins
uv sync --group dev
uv run pytest -q                 # quick self-test; if this fails, please submit a bug report
```

First, the OpenAI workflow. Create a batch job:

```bash
uv run llm-batch-pipeline init getting_started_openai --plugin spam_detection --model gpt-4o-mini
```

This creates a directory like `batches/batch_001_getting_started_openai`. Use that path in the commands below as `<openai-batch-dir>`.
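
Judging from the paths used later in this guide, the layout looks roughly like this (a sketch; `init` may create additional files):

```
<openai-batch-dir>/
├── prompt.txt        # copied in the next step
├── schema.py         # copied in the next step
├── input/            # .eml files to classify
├── evaluation/       # category-map.json
├── job/              # rendered batch-*.jsonl shards
├── output/           # raw backend output and submission metadata
└── results/          # validated.json
```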

Copy the example prompt and schema into the batch directory:

```bash
cp src/llm_batch_pipeline/examples/spam_detection/prompt.txt <openai-batch-dir>/prompt.txt
cp src/llm_batch_pipeline/examples/spam_detection/schema.py <openai-batch-dir>/schema.py
```

If you feel adventurous, you can modify prompt.txt, but note that it must stay consistent with schema.py: schema.py is a Pydantic class that validates the answers the LLM sends back, so every field mentioned in schema.py must appear in prompt.txt and vice versa. It really helps to test the prompt and the schema on single files first.
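
To give a feel for what schema.py contains: a spam-detection schema along the following lines would match the `classification` and `confidence` fields that the evaluate step reads later in this guide. This is an illustrative sketch, not the shipped file:

```python
# Illustrative sketch only; the real schema.py ships with the
# spam_detection plugin. The field names here mirror the --label-field
# and --confidence-field options used by `evaluate` below.
from typing import Literal

from pydantic import BaseModel, Field


class SpamVerdict(BaseModel):
    classification: Literal["ham", "spam"]     # predicted label
    confidence: float = Field(ge=0.0, le=1.0)  # model-reported confidence
```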

Add two sample emails:

```bash
cat > <openai-batch-dir>/input/ham__team_sync.eml <<'EOF'
From: alice@example.com
To: bob@example.com
Subject: Team sync tomorrow
Date: Mon, 1 Jan 2024 10:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Hi Bob,

Can we meet tomorrow at 3pm to review the release checklist and assign the last two action items?

Thanks,
Alice
EOF
```

```bash
cat > <openai-batch-dir>/input/spam__million_prize.eml <<'EOF'
From: prizes@claim-now.biz
To: victim@example.com
Subject: URGENT!! Claim your 1000000 dollar prize now
Date: Mon, 1 Jan 2024 11:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Congratulations!

You have been selected to receive a 1000000 dollar cash prize. Click http://claim-prize-now.example.com immediately and send your bank details today to avoid losing your winnings.
EOF
```

Create the category map used by the evaluator:

```bash
cat > <openai-batch-dir>/evaluation/category-map.json <<'EOF'
{
"ham": "ham",
"spam": "spam"
}
EOF
```

The `ham__...` and `spam__...` filename prefixes are how the evaluator infers ground truth; this file maps those prefix categories onto the model's output labels.
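
To make that concrete, here is a sketch of the inference, assuming the evaluator splits the filename on the double underscore (the real logic lives inside the package and may differ):

```python
# Illustrative only: prefix-based ground-truth lookup.
import json
from pathlib import Path


def ground_truth_for(eml_path: Path, category_map_path: Path) -> str:
    prefix = eml_path.name.split("__", 1)[0]  # "ham__team_sync.eml" -> "ham"
    category_map = json.loads(category_map_path.read_text())
    return category_map[prefix]  # e.g. "ham" -> "ham"
```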

Render the batch:

```bash
uv run llm-batch-pipeline render --batch-dir <openai-batch-dir> --plugin spam_detection
```

This writes the request payload to `<openai-batch-dir>/job/batch-00001.jsonl`. The output should look something like this:

```text
- discover ok: Discovered 2 files
discover: completed — 2 files
filter_1: completed — kept 2/2
- filter_1 ok: filter_1: kept 2/2
transform: completed — 2 files
- transform ok: transform: transformed 2 files
filter_2: completed — kept 2/2
render: completed — 1 shard(s), 2 requests
Rendered 1 shard(s) to (...)/llm_batch_pipeliner/<openai-batch-dir>/job
- render ok: Rendered 2 requests into 1 shard(s)
```

You can now look at the file. It is essentially a JSONL file suitable for submission to OpenAI's Batch API:

```bash
jq -C . <openai-batch-dir>/job/batch-00001.jsonl
```

It is interesting to see how the prompt as well as the rendered schema.py end up as a list of JSON structures for OpenAI.
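
Each line follows OpenAI's Batch API request format. As a rough sketch of the shape to expect (the exact body, including how prompt.txt and schema.py are embedded, is plugin-specific):

```python
# Rough shape of one rendered request line; illustrative, not the exact
# payload the spam_detection plugin emits.
example_request = {
    "custom_id": "ham__team_sync.eml",  # assumed: an id derived from the input file
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "<rendered prompt.txt>"},
            {"role": "user", "content": "<contents of the .eml file>"},
        ],
    },
}
```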

Next it's time to submit:

```bash
uv run llm-batch-pipeline submit --batch-dir <openai-batch-dir> --backend openai
```

Notes:

- this command waits for the batch to complete by default
- in the live test for this guide, a 2-request batch took about 45 minutes to finish
- batch metadata is saved to `<openai-batch-dir>/output/submission.json`
If you do not want to keep the terminal open:

```bash
# submit without waiting for completion
uv run llm-batch-pipeline submit --batch-dir <openai-batch-dir> --backend openai --no-wait

# later, re-attach to a previously submitted batch
uv run llm-batch-pipeline submit --batch-dir <openai-batch-dir> --backend openai --resume-batch-id <batch-id>
```

You can also go to platform.openai.com, log in, and check the Batches tab there.

Processing the batch file can take up to 24h, and you cannot request a completion window shorter than 24h. After the batch has completed, the pipeline downloads the resulting output, which you can then validate:

```bash
uv run llm-batch-pipeline validate --batch-dir <openai-batch-dir>
```

This reads `<openai-batch-dir>/output/output.jsonl` and writes validated rows to `<openai-batch-dir>/results/validated.json`.
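
Conceptually, the validate step does something like the following. This is a sketch, assuming OpenAI's Batch output format and the hypothetical SpamVerdict schema from above, not the actual implementation:

```python
# Conceptual sketch of `validate`; the real logic lives in llm-batch-pipeline.
import json
from pathlib import Path

batch_dir = Path("batches/batch_001_getting_started_openai")  # i.e. <openai-batch-dir>

validated = []
for line in (batch_dir / "output" / "output.jsonl").read_text().splitlines():
    row = json.loads(line)
    # OpenAI Batch output nests each chat completion under response.body
    content = row["response"]["body"]["choices"][0]["message"]["content"]
    verdict = SpamVerdict.model_validate_json(content)  # raises if the model broke the schema
    validated.append({"custom_id": row["custom_id"], **verdict.model_dump()})

(batch_dir / "results" / "validated.json").write_text(json.dumps(validated, indent=2))
```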

Finally, evaluate the predictions against ground truth:

```bash
uv run llm-batch-pipeline evaluate \
--batch-dir <openai-batch-dir> \
--label-field classification \
--confidence-field confidence \
--positive-class spam
```

This prints accuracy, macro F1, per-class metrics, and the confusion matrix to the terminal.
In the tested run, the OpenAI batch classified both sample emails correctly.
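
If you want to sanity-check those numbers by hand, the equivalent scikit-learn calls look like this. This is an independent cross-check, not what the tool runs internally:

```python
# Cross-check the printed metrics with scikit-learn. y_true comes from
# the filename prefixes, y_pred from the `classification` field in
# results/validated.json (both correct in the tested run).
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

y_true = ["ham", "spam"]
y_pred = ["ham", "spam"]

print("accuracy:", accuracy_score(y_true, y_pred))             # 1.0
print("macro F1:", f1_score(y_true, y_pred, average="macro"))  # 1.0
print(confusion_matrix(y_true, y_pred, labels=["ham", "spam"]))
```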

Now the Ollama workflow. Create a second batch job:

```bash
uv run llm-batch-pipeline init getting_started_ollama --plugin spam_detection --model gemma4:latest
```

This creates a directory like `batches/batch_002_getting_started_ollama`. Use that path in the commands below as `<ollama-batch-dir>`.

Copy the same example prompt and schema:

```bash
cp src/llm_batch_pipeline/examples/spam_detection/prompt.txt <ollama-batch-dir>/prompt.txt
cp src/llm_batch_pipeline/examples/spam_detection/schema.py <ollama-batch-dir>/schema.py
```

Add the same two sample emails:

```bash
cat > <ollama-batch-dir>/input/ham__team_sync.eml <<'EOF'
From: alice@example.com
To: bob@example.com
Subject: Team sync tomorrow
Date: Mon, 1 Jan 2024 10:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Hi Bob,

Can we meet tomorrow at 3pm to review the release checklist and assign the last two action items?

Thanks,
Alice
EOF

cat > <ollama-batch-dir>/input/spam__million_prize.eml <<'EOF'
From: prizes@claim-now.biz
To: victim@example.com
Subject: URGENT!! Claim your 1000000 dollar prize now
Date: Mon, 1 Jan 2024 11:00:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Congratulations!

You have been selected to receive a 1000000 dollar cash prize. Click http://claim-prize-now.example.com immediately and send your bank details today to avoid losing your winnings.
EOF
```

And the same category map:

```bash
cat > <ollama-batch-dir>/evaluation/category-map.json <<'EOF'
{
  "ham": "ham",
  "spam": "spam"
}
EOF
```

Render the batch:

```bash
uv run llm-batch-pipeline render --batch-dir <ollama-batch-dir> --plugin spam_detection
```

Then submit it, sharded across the three Ollama endpoints:

```bash
uv run llm-batch-pipeline submit \
--batch-dir <ollama-batch-dir> \
--backend ollama \
--model gemma4:latest \
--base-url http://nanu:11435 \
--base-url http://nanu:11436 \
--base-url http://nanu:11437 \
--num-shards 3 \
--num-parallel-jobs 1
```

Notes:

- these exact three URLs were verified for this guide
- `http://11436` is not a valid endpoint; use `http://nanu:11436`
- in the live test for this guide, the full 2-request Ollama submission finished in about 6 seconds
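
To picture what the sharding flags do, here is a plausible round-robin assignment of requests to endpoints. This is illustrative only; the tool's actual scheduler may distribute work differently:

```python
# Illustrative only: round-robin distribution across the three endpoints.
from itertools import cycle

base_urls = ["http://nanu:11435", "http://nanu:11436", "http://nanu:11437"]
requests = ["ham__team_sync.eml", "spam__million_prize.eml"]

# with --num-parallel-jobs 1, assignments would be processed one at a time
for request, url in zip(requests, cycle(base_urls)):
    print(f"{request} -> {url}/api/chat")  # Ollama's chat endpoint
```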

Validate and evaluate exactly as in the OpenAI workflow:

```bash
uv run llm-batch-pipeline validate --batch-dir <ollama-batch-dir>

uv run llm-batch-pipeline evaluate \
--batch-dir <ollama-batch-dir> \
--label-field classification \
--confidence-field confidence \
--positive-class spam
```

In the tested run, the Ollama batch also classified both sample emails correctly.

After render:

- `<batch-dir>/job/batch-00001.jsonl`

After submit:

- `<batch-dir>/output/output.jsonl`
- `<batch-dir>/output/summary.json`

After validate:

- `<batch-dir>/results/validated.json`

After evaluate:

- metrics printed to stdout
If you already trust your prompt, schema, and backend settings, you can collapse the whole pipeline into one command:

```bash
uv run llm-batch-pipeline run --batch-dir <batch-dir> --plugin spam_detection --auto-approve ...
```

For a first pass, the staged workflow above is easier to debug, because you can inspect the rendered JSONL, the raw model output, the validated JSON, and the evaluation output separately.