Possible train/val/test leakage in ConstitutionOptimizer due to row-level splitting instead of group-aware splitting #1

@RobinsonBeato

Motivation

Hi, thanks for open-sourcing this project.

While reading ConstitutionOptimizer.get_train_test_val_data(), I noticed that the reward dataset appears to be split at the individual-row level after random.shuffle(reward_dataset), without grouping by root_id, root_prompt, parent_id, or parent_prompt.

From the current implementation, each reward example is built from exploration rows that retain strong structural dependencies:

  • shared root_prompt / root_id
  • shared parent_prompt / parent_id
  • multiple ACE mutations derived from the same parent prompt

Potential issue

This seems important because a row-level split can place highly correlated examples from the same exploration subtree in both the train split and the val/test splits used to evaluate the surrogate classifier.

In that case, the reported validation/test loss may overestimate generalization, since the model is not being evaluated on truly independent prompt families.

More concretely:

  • exploration_data.csv is loaded
  • labeled rows are converted into reward_dataset
  • the dataset is shuffled
  • then split 80/10/10 at the example level
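To make the concern concrete, here is a minimal sketch of the pattern described in the steps above, together with a small check for groups that end up in more than one split. It is reconstructed from the description in this issue and is not copied from the repository: `row_level_split`, `groups_spanning_splits`, and the toy rows are hypothetical, and the exact cut points in the real code may differ.

```python
import random
from collections import defaultdict


def row_level_split(reward_dataset, seed=0):
    """Shuffle individual examples, then cut 80/10/10 at the example level."""
    rows = list(reward_dataset)
    random.Random(seed).shuffle(rows)
    n_train = int(0.8 * len(rows))
    n_val = int(0.1 * len(rows))
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


def groups_spanning_splits(splits, key):
    """Return the set of group ids that occur in more than one split."""
    seen = defaultdict(set)
    for split_index, split in enumerate(splits):
        for row in split:
            seen[row[key]].add(split_index)
    return {group for group, where in seen.items() if len(where) > 1}


# Hypothetical toy data: 20 root prompts, each with 10 mutations.
toy = [
    {"root_id": r, "text": f"root{r}-mutation{m}"}
    for r in range(20)
    for m in range(10)
]

train, val, test = row_level_split(toy)
leaked = groups_spanning_splits([train, val, test], key="root_id")
print(f"{len(leaked)} of 20 root_id groups appear in more than one split")
```

Running the same check on the real `reward_dataset`, keyed by `root_id` or `parent_id`, would show how much overlap the current split actually produces.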

Because the split ignores the tree structure, multiple mutations from the same prompt subtree can appear across splits. The evaluation samples are then not independent of the training data, which could lead to optimistic estimates of surrogate performance.

This might be particularly relevant if the goal is to assess generalization to unseen prompts rather than interpolation within a prompt family.

Question

I may be missing intended behavior, but I did not see a group-aware split.

Would it make sense to switch to a grouped split, for example:

  • grouping by root_id to test generalization to unseen root prompts / trees, or
  • grouping by parent_id / parent_prompt to ensure sibling mutations of the same prompt do not leak across splits?
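The difference between the two grouping keys can be illustrated with a small standard-library sketch. It assumes each row carries `root_id` and `parent_id` fields and that every parent belongs to exactly one root; `group_split`, `ids_in`, and the toy tree are hypothetical names used only for illustration. Note that the 80/10/10 fractions here apply to groups, not to rows.

```python
import random
from collections import defaultdict


def group_split(rows, key, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole groups (all rows sharing row[key]) to train/val/test."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[key]].append(row)

    group_ids = sorted(by_group)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(group_ids)

    n_train = int(fractions[0] * len(group_ids))
    n_val = int(fractions[1] * len(group_ids))
    buckets = (
        group_ids[:n_train],
        group_ids[n_train:n_train + n_val],
        group_ids[n_train + n_val:],
    )
    return [[row for g in bucket for row in by_group[g]] for bucket in buckets]


def ids_in(split, key):
    """Collect the distinct values of row[key] present in a split."""
    return {row[key] for row in split}


# Hypothetical toy tree: 10 roots, 4 parents per root, 5 mutations per parent.
toy = [
    {"root_id": r, "parent_id": f"{r}-{p}", "text": f"r{r}p{p}m{m}"}
    for r in range(10)
    for p in range(4)
    for m in range(5)
]

by_root = group_split(toy, key="root_id")
by_parent = group_split(toy, key="parent_id")

# Grouping by root_id: no root (and hence no parent) is shared between splits.
print(ids_in(by_root[0], "root_id") & ids_in(by_root[2], "root_id"))
# Grouping by parent_id: siblings stay together, but a root can still span splits.
print(ids_in(by_parent[0], "root_id") & ids_in(by_parent[2], "root_id"))
```

Under these assumptions, grouping by `root_id` is the stronger guarantee, while grouping by `parent_id` only prevents sibling mutations from leaking.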

Suggestion

A possible approach could be to use a group-aware split (e.g., GroupShuffleSplit / GroupKFold) where the grouping key is:

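  • root_id (for stronger generalization evaluation: no prompt tree is shared between splits), or
  • parent_id (a finer-grained option that keeps sibling mutations together, although examples from the same root tree can still land in different splits)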
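As a rough sketch of what this could look like with scikit-learn, the example below applies `GroupShuffleSplit` twice to get an approximately 80/10/10 split over groups. The helper name `grouped_train_val_test` and the toy `root_ids` array are assumptions for illustration; in the real code the group labels would come from the `root_id` or `parent_id` column of the exploration data. `GroupShuffleSplit` interprets `test_size` as a proportion of groups, so the row-level proportions depend on group sizes.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit


def grouped_train_val_test(groups, seed=0):
    """Return index arrays for an approximately 80/10/10 split over groups."""
    groups = np.asarray(groups)
    indices = np.arange(len(groups))

    # Stage 1: hold out about 20% of the groups.
    outer = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=seed)
    train_idx, holdout_idx = next(outer.split(indices, groups=groups))

    # Stage 2: divide the held-out groups evenly between validation and test.
    inner = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=seed)
    val_pos, test_pos = next(
        inner.split(holdout_idx, groups=groups[holdout_idx])
    )
    return train_idx, holdout_idx[val_pos], holdout_idx[test_pos]


# Hypothetical group labels: 20 roots with 10 examples each.
root_ids = np.repeat(np.arange(20), 10)
train_idx, val_idx, test_idx = grouped_train_val_test(root_ids)

print(len(train_idx), len(val_idx), len(test_idx))
```

The returned index arrays can then be used to select rows from `reward_dataset`, so the rest of the training code would not need to change.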

I think this would make the constitution-surrogate evaluation more robust and easier to interpret.

Closing

Happy to put together a PR if this direction makes sense.
