This document lists every environment variable recognized by the MetaClaw benchmark framework, grouped by who sets them and where they are consumed.
These are set by the user (or by sourcing `scripts/_env_arg_example.sh`)
before running any script or CLI command.
| Variable | Required | Description |
|---|---|---|
| `BENCHMARK_BASE_URL` | yes | Base URL of the OpenAI-compatible API endpoint, e.g. `https://api.openai.com/v1` |
| `BENCHMARK_API_KEY` | yes | API key for the above endpoint |
| `BENCHMARK_MODEL` | yes | Model ID as expected by the API server, e.g. `gpt-4o` |
Where consumed:
- `data/*/openclaw_cfg/openclaw.json`: injected as `"baseUrl"`, `"apiKey"`, model `"id"`, and the `"primary"` key via `${BENCHMARK_BASE_URL}`, `${BENCHMARK_API_KEY}`, `${BENCHMARK_MODEL}` placeholders. `infer_cmd.py` rewrites these placeholders into the per-test work copy of `openclaw.json` at runtime.
- `scripts/config/*.yaml`: injected as `llm.api_base`, `llm.api_key`, `llm.model_id` when the script calls `write_proxy_config()`.

`BENCHMARK_MODEL` must exactly match the model ID your API server expects. It is embedded in the agent config as `"primary": "metaclaw-bench/${BENCHMARK_MODEL}"`.
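The placeholder rewriting described above can be pictured with a minimal sketch. This is not the code in `infer_cmd.py`; the function name `expand_placeholders` and the choice to leave unknown placeholders untouched are assumptions made for illustration.

```python
import re

# Matches ${NAME} where NAME looks like an environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(text, env):
    """Replace each ${NAME} in text with env[NAME].

    Hypothetical helper: names missing from env are left as-is so that
    placeholders handled elsewhere survive this pass.
    """
    def _sub(match):
        return env.get(match.group(1), match.group(0))

    return _PLACEHOLDER.sub(_sub, text)


template = '{"baseUrl": "${BENCHMARK_BASE_URL}", "primary": "metaclaw-bench/${BENCHMARK_MODEL}"}'
env = {"BENCHMARK_BASE_URL": "https://api.openai.com/v1", "BENCHMARK_MODEL": "gpt-4o"}
print(expand_placeholders(template, env))
```

Plain text substitution like this does not JSON-escape values, so a value containing a double quote or backslash would produce invalid JSON in the work copy.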
| Variable | Required | Default | Description |
|---|---|---|---|
| `METACLAW_ROOT` | no | auto-detected | Absolute path to the MetaClaw project root (the directory that contains `benchmark/`) |
Where consumed:
- `src/utils.py:get_project_root()`: all relative path resolution in the CLI goes through this function. If unset, it falls back to `Path(__file__).parent.parent.parent` (three levels above `src/utils.py`), which works correctly when the package is installed in-place from `benchmark/`.
- `src/infer/infer_cmd.py`: passed as `METACLAW_ROOT` in the environment of every openclaw gateway and agent subprocess so that `${METACLAW_ROOT}` placeholders in `openclaw.json` are resolved correctly.
- `data/*/openclaw_cfg/openclaw.json`: referenced in `"agentDir"` and the plugin `"logDir"` config as `${METACLAW_ROOT}/...`.
- All `scripts/*.py`: `_METACLAW_ROOT = Path(os.environ.get("METACLAW_ROOT") or _SCRIPT_DIR.parent.parent)` to derive default paths for logs, inputs, and outputs.

Recommendation: always set this to an absolute path to avoid working-directory-dependent failures.
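The override-then-fallback behaviour can be sketched as follows. The helper name `resolve_root` and its `levels_up` parameter are hypothetical; the real functions hard-code how many parent directories to climb.

```python
import os
from pathlib import Path


def resolve_root(anchor_file, levels_up, env=None):
    """Return METACLAW_ROOT if set, else climb levels_up parents from anchor_file.

    Hypothetical helper mirroring the documented fallback; env defaults to
    the real process environment.
    """
    env = os.environ if env is None else env
    override = env.get("METACLAW_ROOT")
    if override:
        return Path(override)
    path = Path(anchor_file).resolve()
    for _ in range(levels_up):
        path = path.parent
    return path


# With the variable set, the anchor file is ignored entirely.
print(resolve_root("/proj/benchmark/src/utils.py", 3, {"METACLAW_ROOT": "/opt/metaclaw"}))
```

An empty `METACLAW_ROOT` is treated the same as an unset one here, matching the `os.environ.get(...) or ...` pattern quoted above.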
| Variable | Required | Default | Description |
|---|---|---|---|
| `METACLAW_BENCH_BIN` | no | `metaclaw-bench` | Explicit path to the `metaclaw-bench` binary |
| `METACLAW_BIN` | no | `metaclaw` | Explicit path to the `metaclaw` binary (used by `proxy_run.py`) |
Where consumed:
- `METACLAW_BENCH_BIN`: `cfg.BENCH_BIN` in all `scripts/*.py` (except `proxy_run.py`). Defaults to `"metaclaw-bench"`, which is resolved via `PATH`.
- `METACLAW_BIN`: `cfg.METACLAW_BIN` in `scripts/proxy_run.py`. Defaults to `"metaclaw"`, resolved via `PATH`.

Set these when the binaries are not on `PATH` or when you need to pin a specific installation.
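A quick pre-flight check can confirm that the chosen binary is actually resolvable before a long run starts. The sketch below is an illustration only; `resolve_binary` is a hypothetical helper, not part of the framework.

```python
import os
import shutil


def resolve_binary(env_var, default, env=None):
    """Return (configured name, resolved path or None).

    Hypothetical helper: picks the override from env_var if present,
    otherwise the default name, then looks it up the way a shell would.
    """
    env = os.environ if env is None else env
    candidate = env.get(env_var) or default
    return candidate, shutil.which(candidate)


name, path = resolve_binary("METACLAW_BENCH_BIN", "metaclaw-bench")
if path is None:
    print(f"{name} was not found; set METACLAW_BENCH_BIN or fix PATH")
else:
    print(f"using {path}")
```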
| Variable | Required | Default | Description |
|---|---|---|---|
| `METACLAW_API_KEY_SCRIPT` | no | unset | Absolute path to a shell script that exports credentials. The script is sourced before the benchmark runs. |
Where consumed:
- `cfg.API_KEY_SCRIPT` in all `scripts/*.py`. When set, the script calls `load_env_from_shell(cfg.API_KEY_SCRIPT)` to load the environment before constructing any subprocesses. If unset, the current shell environment is used as-is.

Useful when credentials are managed by a secrets manager or a separate auth helper.
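One common way to load variables exported by a shell script is to source it in a child shell and dump the resulting environment. The sketch below shows that general technique; it is not the framework's `load_env_from_shell()`, and it assumes a POSIX `sh` is available and that the script is compatible with it. The function name `load_env_from_script` and the `DEMO_TOKEN` variable are hypothetical.

```python
import json
import os
import subprocess
import sys
import tempfile


def load_env_from_script(script_path):
    """Source script_path in a child sh and return the resulting environment.

    Hypothetical helper: the child re-executes this Python interpreter to
    serialise its environment as JSON, which avoids parsing `env` output.
    """
    dump = "import json, os, sys; json.dump(dict(os.environ), sys.stdout)"
    result = subprocess.run(
        ["sh", "-c", '. "$1" >/dev/null && exec "$2" -c "$3"',
         "sh", str(script_path), sys.executable, dump],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Demo with a throwaway credentials script.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "creds.sh")
    with open(script, "w") as fh:
        fh.write("export DEMO_TOKEN=abc123\n")
    loaded = load_env_from_script(script)

print(loaded["DEMO_TOKEN"])
```

The returned dictionary can be merged into `os.environ` or passed as `env=` to later subprocess calls.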
| Variable | Required | Default | Description |
|---|---|---|---|
| `METACLAW_BENCH_INPUT` | no | per-script default | Absolute path to the `all_tests*.json` file used as benchmark input. Overrides the built-in default in every runner script simultaneously. Useful for switching between the full dataset and a smaller subset (e.g. `metaclaw-bench-small`) without modifying any script. |
Default values by script:
- `baseline_run.py`: `<METACLAW_ROOT>/benchmark/data/metaclaw-bench/all_tests.json`
- all other scripts: `<METACLAW_ROOT>/benchmark/data/metaclaw-bench/all_tests_metaclaw.json`

Where consumed:
- `cfg.BENCH_INPUT` in all `scripts/*.py`: resolved at import time via `os.environ.get("METACLAW_BENCH_INPUT", <default>)`.
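The per-script defaults and the single override combine roughly as in this sketch. The function `bench_input` and the lookup table are hypothetical; only the file names and the `os.environ.get` pattern come from the description above.

```python
import os
from pathlib import Path

# Per-script default file names as documented; everything else falls back.
DEFAULT_INPUT_NAMES = {"baseline_run.py": "all_tests.json"}
FALLBACK_NAME = "all_tests_metaclaw.json"


def bench_input(script_name, root, env=None):
    """Return the benchmark input path for script_name (hypothetical helper)."""
    env = os.environ if env is None else env
    name = DEFAULT_INPUT_NAMES.get(script_name, FALLBACK_NAME)
    default = Path(root) / "benchmark" / "data" / "metaclaw-bench" / name
    return env.get("METACLAW_BENCH_INPUT", str(default))


print(bench_input("baseline_run.py", "/opt/metaclaw", {}))
print(bench_input("rl_run.py", "/opt/metaclaw", {"METACLAW_BENCH_INPUT": "/data/subset.json"}))
```

Because the value is read at import time, changing `METACLAW_BENCH_INPUT` after a script has started has no effect on that run.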
| Variable | Required | Default | Description |
|---|---|---|---|
| `METACLAW_SKILLS_DIR` | required for skills scripts | `<METACLAW_ROOT>/memory_data/skills` | Path to the directory containing pre-built skills |
Where consumed:
- `cfg.ORIGINAL_SKILL_DIR` in `scripts/proxy_passthrough_run.py`, `skills_only_run.py`, `rl_run.py`, `skills_memory_run.py`, `madmax_memory_run.py`.
- Each script copies this directory to a temporary location before starting the proxy so that every run begins from the same initial skill state.
- The temporary copy path is then written into `skills.dir` inside the proxy YAML config via `write_proxy_config()`.
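The copy-before-run step can be illustrated with the standard library. This is a sketch under assumptions: `snapshot_skills`, the temporary directory prefix, and the demo file name are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot_skills(original_dir):
    """Copy original_dir into a fresh temporary directory and return the copy.

    Hypothetical helper: the caller would write the returned path into the
    proxy config, leaving the original directory untouched by the run.
    """
    tmp_root = Path(tempfile.mkdtemp(prefix="skills_"))
    copy_path = tmp_root / "skills"
    shutil.copytree(original_dir, copy_path)
    return copy_path


# Demo: changes made to the copy do not leak back into the source directory.
src = Path(tempfile.mkdtemp()) / "skills"
src.mkdir()
(src / "demo_skill.md").write_text("v1")
copy = snapshot_skills(src)
(copy / "demo_skill.md").write_text("changed during run")
print((src / "demo_skill.md").read_text())
```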
Required only for scripts that use RL training (`rl_only_run.py`, `rl_run.py`,
`rl_only_memory_run.py`, `madmax_memory_run.py`).
| Variable | Description |
|---|---|
| `TINKER_KEY` | API key for the Tinker RL fine-tuning service |
| `TINKER_MODEL` | Model ID used by the RL fine-tuning step |
| `PRM_MODEL` | Process Reward Model ID used for RL scoring |
Where consumed:
- `scripts/config/rl.yaml`, `rl-only.yaml`, `rl-only-memory.yaml`, `madmax.yaml`: injected as `rl.model`, `rl.tinker_api_key`, `rl.prm_model`, `rl.prm_api_key` via `${VAR}` placeholders expanded in `write_proxy_config()`.
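Since these variables only matter for the RL scripts, a small pre-flight check before launching one can catch a missing value early. The helper `missing_vars` below is a hypothetical convenience, not something the framework is known to provide.

```python
import os

# The three RL-related variables listed in the table above.
RL_VARS = ("TINKER_KEY", "TINKER_MODEL", "PRM_MODEL")


def missing_vars(names, env=None):
    """Return the names that are unset or empty in env (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_vars(RL_VARS)
if missing:
    print("missing RL variables: " + ", ".join(missing))
```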
These are set by the framework at runtime and are not intended to be configured by the user directly. They are documented here for reference when debugging or extending the benchmark.
Set by runner scripts (`scripts/*.py`) after calling `find_free_port()`.
Injected into the `metaclaw-bench` subprocess environment so that
`${METACLAW_PROXY_PORT}` placeholders in `openclaw.json` resolve to the
dynamically chosen port.

Consumed in `src/infer/infer_cmd.py`:
- Line ~143: `proxy_port = os.environ.get("METACLAW_PROXY_PORT", "30000")` replaces `${METACLAW_PROXY_PORT}` in the work copy of `openclaw.json`.
- Line ~1129: read again in `_trigger_train_step()` to pass `--port` to `metaclaw train-step`.
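A typical way to pick a free port and hand it to a child process is shown below. The framework's own `find_free_port()` may differ; this version, and the helper `env_with_proxy_port`, are assumptions for illustration.

```python
import os
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0 (sketch)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def env_with_proxy_port(base_env=None):
    """Return a copy of the environment with METACLAW_PROXY_PORT set.

    Hypothetical helper: the copy would be passed as env= when spawning
    the metaclaw-bench subprocess.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["METACLAW_PROXY_PORT"] = str(find_free_port())
    return env


print(env_with_proxy_port({})["METACLAW_PROXY_PORT"])
```

The socket is closed before the port is used, so another process could in principle claim it in between; for a local benchmark run this window is usually acceptable.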
A placeholder in `data/*/openclaw_cfg/openclaw.json`:

`"workspace": "${BENCHMARK_WORKSPACE_DIR}"`

This placeholder is never expanded via the environment. Instead,
`infer_cmd.py:_patch_agent_workspace()` directly rewrites the agent's
`"workspace"` field in the per-test work copy of `openclaw.json` with the
actual workspace path for that test. The placeholder just marks the field as
needing per-test injection.
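The difference from environment expansion is that the field is rewritten structurally after the JSON is parsed. The sketch below illustrates that idea without assuming where the agent entry lives in `openclaw.json`: it walks the whole document. The function `patch_workspace` and the sample config shape are hypothetical.

```python
import json

PLACEHOLDER = "${BENCHMARK_WORKSPACE_DIR}"


def patch_workspace(node, workspace_path):
    """Return a copy of node with placeholder "workspace" values replaced.

    Hypothetical helper: only keys named "workspace" whose value is exactly
    the placeholder are rewritten; everything else is copied unchanged.
    """
    if isinstance(node, dict):
        patched = {}
        for key, value in node.items():
            if key == "workspace" and value == PLACEHOLDER:
                patched[key] = workspace_path
            else:
                patched[key] = patch_workspace(value, workspace_path)
        return patched
    if isinstance(node, list):
        return [patch_workspace(item, workspace_path) for item in node]
    return node


# Made-up config shape, used only to exercise the helper.
config = json.loads('{"agents": [{"id": "demo", "workspace": "${BENCHMARK_WORKSPACE_DIR}"}]}')
print(json.dumps(patch_workspace(config, "/tmp/run-1/workspace")))
```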
Set by `src/infer/infer_cmd.py` when spawning `openclaw gateway run` and
`openclaw agent` subprocesses. Each test gets a fully isolated work copy of
the openclaw state and a dedicated gateway process on a free port.
| Variable | Set in | Passed to |
|---|---|---|
| `OPENCLAW_CONFIG_PATH` | `_start_work_gateway()`, `_run_openclaw_agent()` | `openclaw gateway run`, `openclaw agent` |
| `OPENCLAW_STATE_DIR` | `_start_work_gateway()`, `_run_openclaw_agent()` | `openclaw gateway run`, `openclaw agent` |
| `OPENCLAW_GATEWAY_PORT` | `_start_work_gateway()`, `_run_openclaw_agent()` | `openclaw gateway run`, `openclaw agent` |
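Per-test isolation of this kind usually comes down to building a separate environment dictionary for each subprocess. The sketch below shows the pattern; `openclaw_env` and the example paths are hypothetical, and only the three variable names come from the table above.

```python
import os


def openclaw_env(config_path, state_dir, gateway_port, base_env=None):
    """Return a per-test environment for an openclaw subprocess (sketch).

    The parent environment is copied, never mutated, so concurrent tests
    cannot see each other's paths or ports.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "OPENCLAW_CONFIG_PATH": str(config_path),
        "OPENCLAW_STATE_DIR": str(state_dir),
        "OPENCLAW_GATEWAY_PORT": str(gateway_port),
    })
    return env


env = openclaw_env("/tmp/test-7/openclaw.json", "/tmp/test-7", 41873, base_env={})
print(sorted(env))
```

The result would be passed as `env=` to `subprocess.Popen` or `subprocess.run` when starting the gateway or agent.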
Set by runner scripts (`scripts/*.py`) before launching `proxy_run.py` as a
subprocess:

`proxy_env["METACLAW_CONFIG_FILE"] = config_path`

`proxy_run.py` reads this variable and passes `--config <path>` to
`metaclaw start`, pointing it at the dynamically generated (port-patched) YAML
config for that run.
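On the receiving side, the hand-off amounts to reading the variable and building the `metaclaw start` command line. The sketch below is an assumption about how that could look; `proxy_start_command` is a hypothetical helper, and raising an error when the variable is missing is a choice made here, not documented behaviour of `proxy_run.py`.

```python
import os


def proxy_start_command(env=None):
    """Build the argv for `metaclaw start --config <path>` (sketch)."""
    env = os.environ if env is None else env
    config_path = env.get("METACLAW_CONFIG_FILE")
    if not config_path:
        # Assumed behaviour: fail loudly rather than start with no config.
        raise RuntimeError("METACLAW_CONFIG_FILE is not set")
    metaclaw_bin = env.get("METACLAW_BIN") or "metaclaw"
    return [metaclaw_bin, "start", "--config", config_path]


print(proxy_start_command({"METACLAW_CONFIG_FILE": "/tmp/run-1/proxy.yaml"}))
```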