Summary — verified end-to-end
In evals-mode hyperparameter sweeps, RFRandomSearch / RFGridSearch config_set handling has two surprising behaviors that compound:
- The user-facing key api_config (used in all RAG tutorial examples) is silently renamed to "pipeline" in the output of get_runs(). Code that introspects run["api_config"] after get_runs() returns will raise KeyError or fall back to None.
- Arbitrary unrecognized keys (e.g. "prompt_id": List([1, 2, 3])) are accepted by get_runs() and propagated into the sampled run dict, but no downstream framework code reads them. The keys never reach preprocess_fn's batch, never appear in MLflow params, are stripped from the DB JSON config, and are dropped on clone. Net effect: the user thinks they're sweeping an axis, but the framework treats every sampled value identically.
There is no warning, no debug log, no error on either path.
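For illustration, the kind of warning we expected can be approximated on the user side. This is a minimal sketch, not rapidfireai code: RECOGNIZED_KEYS is a hypothetical allow-list (only api_config and batch_size are taken from this report; the real set of consumed keys is an assumption), and warn_on_unrecognized_keys is a made-up helper name.

```python
import warnings

# Hypothetical allow-list: only these two keys appear in this report as
# recognized. The real set of keys the framework consumes is an assumption.
RECOGNIZED_KEYS = frozenset({"api_config", "batch_size"})


def warn_on_unrecognized_keys(config_set, recognized=RECOGNIZED_KEYS):
    """Warn once per config_set key that is not in the allow-list.

    Returns the sorted list of unrecognized keys so callers can also fail hard.
    """
    unknown = sorted(key for key in config_set if key not in recognized)
    for key in unknown:
        warnings.warn(
            f"config_set key {key!r} is not recognized and may be a no-op sweep axis",
            UserWarning,
            stacklevel=2,
        )
    return unknown
```

Calling it on the config_set before constructing RFRandomSearch would have flagged "my_prompt_id" immediately; it only needs the dict's keys, so the List(...) values can be passed through untouched.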
Verified end-to-end via a working notebook that uses rapidfireai's normal public API (Experiment, RFAPIModelConfig, RFRandomSearch) with real OpenAI gpt-4o-mini calls. No source-poking, no monkey-patching. Notebook: https://gist.github.com/kamran-rapidfireAI/13189883f3d1c9aa99012d8f92b68bd4
End-to-end repro — all three claims confirmed
Claim 1: api_config is silently renamed to pipeline
from rapidfireai.automl import List, RFRandomSearch
# cfg_a and cfg_b are assumed to be RFAPIModelConfig instances built earlier in the notebook.
config_set = {
    "api_config": List([cfg_a, cfg_b]),
    "batch_size": 4,
    "my_prompt_id": List(["v1", "v2", "v3"]),  # custom key, never consumed
}
runs = RFRandomSearch(config_set, num_runs=3).get_runs(seed=42)
print(list(runs[0].keys()))
# => ['pipeline', 'batch_size', 'my_prompt_id']
# Note: 'api_config' is gone (renamed to 'pipeline'); 'my_prompt_id' survives.
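The rename can be surfaced mechanically by diffing the declared keys against the keys of one sampled run. This is a small sketch on plain dicts; diff_config_keys is a hypothetical helper, and the two dicts below only copy the key names from the repro above (their values are placeholders).

```python
def diff_config_keys(config_set, run):
    """Compare declared sweep keys with the keys of one sampled run dict."""
    declared = set(config_set)
    sampled = set(run)
    return {
        "missing_from_run": sorted(declared - sampled),
        "added_in_run": sorted(sampled - declared),
    }


# Key names copied from the repro; values are placeholders.
declared_keys = {"api_config": None, "batch_size": None, "my_prompt_id": None}
sampled_keys = {"pipeline": None, "batch_size": None, "my_prompt_id": None}

report = diff_config_keys(declared_keys, sampled_keys)
print(report)
# => {'missing_from_run': ['api_config'], 'added_in_run': ['pipeline']}
```

Note that the diff says nothing about my_prompt_id: the key survives into the run dict, which is exactly why the no-op axis is hard to spot from get_runs() output alone.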
Claim 2: my_prompt_id is sampled but never reaches preprocess_fn
preprocess_fn was instrumented to write its received batch.keys() to a file. Captured output:
preprocess_fn received batch keys: ['query', 'query_id']
my_prompt_id in batch? False
The sampled value of my_prompt_id for this run was 'v3', but preprocess_fn never sees it.
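The instrumentation was along these lines. This is a simplified sketch rather than the notebook's exact code: log_batch_keys is a hypothetical decorator name, and it assumes only that the batch is a dict-like object passed as the first positional argument of preprocess_fn (any further arguments are passed through unchanged).

```python
import functools
import json


def log_batch_keys(log_path):
    """Decorator factory: append the batch's keys to log_path on every call."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(batch, *args, **kwargs):
            # One JSON list of sorted key names per call, so the file is easy to grep.
            with open(log_path, "a", encoding="utf-8") as fh:
                fh.write(json.dumps(sorted(batch.keys())) + "\n")
            return fn(batch, *args, **kwargs)

        return wrapper

    return decorator
```

Wrapping the function as preprocess_fn = log_batch_keys("preprocess_keys.log")(preprocess_fn) before handing it to the experiment is enough; writing to a file rather than printing avoids depending on where worker stdout ends up.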
Claim 3: my_prompt_id is NOT logged to MLflow
SQL query against ~/rapidfireai/db/rapidfire_mlflow.db for the run:
key | value
----------------+-------------------------
model | bug245_gpt4omini_exp
rag_k | 2
rag_search_type | similarity
No my_prompt_id row. A reviewer cannot tell from MLflow that the axis was even declared.
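The same check can be scripted with Python's sqlite3 module. This sketch assumes the standard MLflow SQLite layout, i.e. a params table with key, value and run_uuid columns; fetch_mlflow_params and axis_was_logged are hypothetical helper names, and the run UUID has to be looked up separately (e.g. in the MLflow UI).

```python
import sqlite3


def fetch_mlflow_params(conn, run_uuid):
    """Return {key: value} for one run from the `params` table (assumed schema)."""
    rows = conn.execute(
        "SELECT key, value FROM params WHERE run_uuid = ? ORDER BY key",
        (run_uuid,),
    ).fetchall()
    return dict(rows)


def axis_was_logged(conn, run_uuid, axis_key):
    """True if the sweep axis appears among the run's logged params."""
    return axis_key in fetch_mlflow_params(conn, run_uuid)
```

To use it, open the database with sqlite3.connect() on the expanded ~/rapidfireai/db/rapidfire_mlflow.db path and call axis_was_logged(conn, run_uuid, "my_prompt_id"); for the run above it returns False.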
Why it matters
We hit this trying to sweep prompt templates via "prompt_id": List([...]). The MLflow UI showed runs with identical params (because prompt_id was never logged) and metrics differing only at the noise floor (because preprocess_fn always got the same default template). It took several hours to discover that the axis was a no-op.
The silent rename of the documented key api_config → "pipeline" in get_runs() output is a foot-gun. Code that does for run in get_runs(): cfg = run["api_config"] (which is exactly what the example notebooks' key naming suggests) raises KeyError, or silently gets None if it uses run.get("api_config") instead.
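Until the naming is settled, user code can read the model config under either name. This is a workaround sketch only: get_model_config is a hypothetical helper, and it assumes the value stored under "pipeline" is the same object that was declared under "api_config".

```python
def get_model_config(run):
    """Return the model config from a sampled run dict under either key name."""
    for key in ("pipeline", "api_config"):
        if key in run:
            return run[key]
    # Fail loudly with the keys that are present instead of returning None.
    raise KeyError(
        "run has neither 'pipeline' nor 'api_config'; keys: " + repr(sorted(run))
    )
```

Raising with the list of available keys is deliberate: the point of this report is that a quiet None hides the problem.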
Environment
- rapidfireai: main (HEAD 91d94de); same behavior on 0.15.2 PyPI
- Python 3.12, Linux
- Setup used in the verification notebook: experiment_name bug245_repro (auto-suffixed to _2), model gpt-4o-mini, 3 questions, 1 shard