
fix: set per_device_train_batch_size to match dataset size #240

Merged
abrichr merged 2 commits into main from fix/trl-empty-dataset-batch-size
Mar 29, 2026

Conversation

Member

@abrichr abrichr commented Mar 29, 2026

Summary

TRL defaults to per_device_train_batch_size=8, but with only 1-3 tasks the dataset is too small to fill a single batch. TRL then computes 0 training steps and exits with "There seems not to be a single sample in your epoch_iterator".

Fix: set batch_size=n_tasks when building the default GRPOConfig. When the user provides their own trl_config, warn if batch_size > dataset size.
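
The failure mode and the proposed fix can be sketched in a few lines. This is a simplified model, not TRL's code: it assumes only full batches count toward a training step, and `SketchConfig`, `steps_per_epoch`, `default_config`, and `check_user_config` are hypothetical names introduced purely for illustration.

```python
import warnings
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for a trainer config (not TRL's GRPOConfig)."""

    per_device_train_batch_size: int = 8


def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    # Assumption: only full batches produce a step, so a dataset
    # smaller than the batch size yields zero steps.
    return n_samples // batch_size


def default_config(n_tasks: int) -> SketchConfig:
    # The fix as described: size the batch to the number of tasks.
    # max(1, ...) guards against an empty task list.
    return SketchConfig(per_device_train_batch_size=max(1, n_tasks))


def check_user_config(config: SketchConfig, n_tasks: int) -> bool:
    """Warn and return True if the batch is larger than the dataset."""
    if config.per_device_train_batch_size > n_tasks:
        warnings.warn(
            f"per_device_train_batch_size={config.per_device_train_batch_size} "
            f"exceeds dataset size {n_tasks}; training may run 0 steps."
        )
        return True
    return False
```

Under this model, 3 tasks with a batch size of 8 give `steps_per_epoch(3, 8) == 0`, while the task-sized default gives exactly one step per epoch. A user-supplied config is left unchanged and only triggers a warning.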

Test plan

  • 27 TRL tests pass
  • Client re-test with single task

🤖 Generated with Claude Code

abrichr and others added 2 commits March 29, 2026 14:18
TRL's default per_device_train_batch_size=8, but with 1-3 tasks the
dataset is too small to form a single batch. TRL computes 0 steps and
exits with "There seems not to be a single sample in your epoch_iterator".

Fix: set batch_size=n_tasks when building default GRPOConfig. When the
user provides their own trl_config, warn if batch_size > dataset size.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

batch_size=n_tasks could OOM on GPU with many tasks. batch_size=1 is
safer and matches the standalone trainer behavior (one task per step,
rotating through tasks via epochs). Each step still does num_generations
rollouts, so learning signal is preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
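
The batch_size=1 behavior described in the second commit can be illustrated with a small sketch. It assumes tasks are visited in a fixed order each epoch (a real trainer may shuffle), and `task_schedule` and `rollouts_per_epoch` are hypothetical helpers, not functions from TRL or this repository. The `num_generations` value used in the comments is illustrative only.

```python
def task_schedule(n_tasks: int, n_epochs: int) -> list[int]:
    # With batch_size=1, each step processes one task, and every
    # epoch rotates through all tasks once.
    return [task for _ in range(n_epochs) for task in range(n_tasks)]


def rollouts_per_epoch(n_tasks: int, num_generations: int) -> int:
    # Each step still samples num_generations rollouts for its task,
    # so an epoch sees the same number of rollouts regardless of how
    # the tasks are grouped into batches.
    return n_tasks * num_generations
```

For example, 3 tasks over 2 epochs give 6 steps, and with an illustrative `num_generations` of 8 each epoch still produces 24 rollouts, while only one task's rollouts need to fit in memory per step.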
@abrichr abrichr merged commit 048796c into main Mar 29, 2026
1 check passed