Many examples within Eval Protocol, particularly those involving local evaluations (e.g., examples/math_example/local_eval.py or CLI-driven runs like eval-protocol run ...) and TRL integrations (e.g., examples/math_example/trl_grpo_integration.py), leverage Hydra for flexible and powerful configuration management. This guide explains how Hydra is used and how you can interact with it.
Hydra allows for:
- Structured Configuration: Define configurations in YAML files, promoting clarity and organization.
- Composition: Combine and override configuration files easily.
- Command-Line Overrides: Change any configuration parameter directly from the command line without modifying code.
- Dynamic Output Directories: Automatically create organized, timestamped output directories for each run.
An example using Hydra will typically have a conf/ subdirectory:
example_directory/
├── main.py
├── README.md
└── conf/
├── config.yaml # Main configuration for the script (e.g., run_math_eval.yaml, trl_grpo_config.yaml)
├── dataset/ # Dataset specific configurations
│ ├── base_dataset.yaml
│ └── specific_dataset_prompts.yaml
└── ... # Other partial configs (e.g., model configs, training params)
- Main Config File: This is the entry point for Hydra (e.g.,
run_math_eval.yamlforeval-protocol run, ortrl_grpo_config.yamlfor a TRL script). It often includes defaults and references other configuration files or groups. - Dataset Configs: As detailed in the Dataset Configuration Guide, these YAML files define how data is loaded and processed.
- Other Configs: You might find separate YAML files for model parameters, training arguments, or other components, which are then composed into the main config.
-
Activate Your Virtual Environment:
source .venv/bin/activate -
Navigate to the Repository Root: Most Hydra-based scripts are designed to be run from the root of the
eval-protocolrepository. -
Execute the Script:
- For
eval-protocol run(CLI-driven evaluation): Theeval-protocol runcommand itself is integrated with Hydra. You specify the path to the configuration directory and the name of the main configuration file.# Example from math_example python -m eval_protocol.cli run --config-path examples/math_example/conf --config-name run_math_eval.yaml - For Python scripts using Hydra directly (e.g., TRL integration):
Hydra automatically discovers the
# Example from math_example .venv/bin/python examples/math_example/trl_grpo_integration.pyconf/directory relative to the script if structured correctly, or uses the--config-pathand--config-namearguments if the script is designed to accept them likeeval-protocol run.
- For
This is one of Hydra's most powerful features. You can override any parameter defined in the YAML configuration files directly from the command line using a key=value syntax.
-
Simple Override:
# Override number of samples for eval-protocol run python -m eval_protocol.cli run --config-path examples/math_example/conf --config-name run_math_eval.yaml evaluation_params.limit_samples=10 # Override model name for a TRL script .venv/bin/python examples/math_example/trl_grpo_integration.py model_name=mistralai/Mistral-7B-Instruct-v0.2
-
Nested Parameter Override: Use dot notation to access nested parameters.
# Override learning rate within GRPO training arguments for a TRL script .venv/bin/python examples/math_example/trl_grpo_integration.py grpo.learning_rate=5e-5 grpo.num_train_epochs=3 -
Changing Dataset: If your main config allows choosing different dataset configurations (often via a
defaultslist or a parameter likedataset_config_name):# Assuming run_math_eval.yaml can switch dataset configs python -m eval_protocol.cli run --config-path examples/math_example/conf --config-name run_math_eval.yaml dataset=my_custom_dataset_prompts(The exact way to switch datasets depends on how the
main_config.yamlis structured, often using config groups.) -
Multiple Overrides:
python -m eval_protocol.cli run --config-path examples/math_example/conf --config-name run_math_eval.yaml \ evaluation_params.limit_samples=5 \ generation.model_name="accounts/fireworks/models/llama-v3p1-8b-instruct" \ reward.params.tolerance=0.01
Hydra automatically manages output directories for each run.
- Default Location: By default, outputs (logs, results, saved models, etc.) are saved to a timestamped directory structure like
outputs/YYYY-MM-DD/HH-MM-SS/(relative to where the command is run, typically the repository root). - Configuration: The base output path can often be configured within the YAML files (e.g., via
hydra.run.diror a custom output directory parameter in the main config). - Contents: Inside the run-specific output directory, you'll typically find:
- A
.hydra/subdirectory containing the complete configuration used for that run (including overrides), which is excellent for reproducibility. - Log files.
- Result files (e.g.,
*.jsonlfiles with evaluation scores or generated outputs).
- A
- Look for a
conf/directory within an example to understand its Hydra setup. - Use command-line overrides for quick experiments without changing YAML files.
- Check the
.hydra/directory in your outputs to see the exact configuration used for any given run.
For more advanced Hydra features, such as multi-run for hyperparameter sweeps or custom plugins, refer to the official Hydra documentation.