feat: add YAML support for config files#3075
Conversation
Greptile SummaryThis PR adds YAML parsing as a fallback when a config value cannot be parsed as JSON, using the already-vendored Key findings:
Confidence Score: 4/5Safe to merge after fixing the misleading error message; the core YAML parsing logic is correct. One P1 issue exists: valid YAML that is not a dict is reported as 'not valid JSON or YAML', which is incorrect and will confuse users. The two remaining issues are P2 style. Score is 4 to signal the P1 should be addressed first. metaflow/user_configs/config_options.py — specifically the error-message path in the YAML fallback block (lines 385–399). Important Files Changed
Reviews (1): Last reviewed commit: "feat: add YAML support for config files" | Re-trigger Greptile |
| except json.JSONDecodeError: | ||
| try: | ||
| from metaflow._vendor import yaml | ||
| read_value = yaml.safe_load(val) | ||
|
|
||
| if not isinstance(read_value, dict): | ||
| raise ValueError("YAML did not produce a dictionary") | ||
|
|
||
| except Exception as e: | ||
| msgs.append( | ||
| "configuration value for '%s' is not valid JSON or YAML: %s" | ||
| % (name, e) | ||
| ) | ||
| continue |
There was a problem hiding this comment.
Misleading error message when YAML is structurally valid but not a mapping
The broad except Exception as e: block on line 393 catches both genuine YAML parse errors and the ValueError("YAML did not produce a dictionary") raised two lines above it. This means that when a user supplies valid YAML that isn't a mapping (e.g. a list - item1\n- item2, a scalar 42, or a bare string hello), they'll receive:
configuration value for 'my_config' is not valid JSON or YAML: YAML did not produce a dictionary
This message is factually wrong — the input is valid YAML — and will confuse users who are trying to diagnose why their config is rejected.
A clean fix is to narrow the outer catch to only YAML scanner/parser errors and emit a distinct message for the non-dict case:
except json.JSONDecodeError:
from metaflow._vendor import yaml
from metaflow._vendor.yaml import YAMLError
try:
read_value = yaml.safe_load(val)
except YAMLError as e:
msgs.append(
"configuration value for '%s' is not valid JSON or YAML: %s"
% (name, e)
)
continue
if not isinstance(read_value, dict):
msgs.append(
"configuration value for '%s' is valid YAML but must be a "
"mapping (got %s)" % (name, type(read_value).__name__)
)
continueThis also surfaces a secondary risk: if the import itself ever fails, the current code swallows the ImportError and reports it as a JSON/YAML parse failure, making that failure very hard to debug.
| continue | ||
| except json.JSONDecodeError: | ||
| try: | ||
| from metaflow._vendor import yaml |
There was a problem hiding this comment.
Import should be at module level, consistent with codebase convention
The from metaflow._vendor import yaml is placed inside a per-call exception handler. Every failed JSON parse will trigger a new import statement in the hot path. More importantly, every other file in this codebase that uses the vendored YAML module imports it at the top of the file (e.g. metaflow/plugins/parsers.py line 1, metaflow/includefile.py line 13). Moving this import to the top of config_options.py alongside the existing from metaflow._vendor import click would be consistent and cleaner.
| % (name, e) | ||
| ) | ||
| continue | ||
| # TODO: Support YAML |
There was a problem hiding this comment.
This PR adds YAML support for config files.
Currently, Metaflow only supports JSON. This change adds a fallback to YAML parsing using yaml.safe_load.
Tested locally using a sample flow with YAML config.