Commit 296def8

narrow exception in count_input_cfg_levels to FileNotFoundError
The broad `except Exception` silently swallowed all errors when _resolve_if_path could not load a YAML file, masking real problems (permission errors, malformed YAML, etc.).

The only legitimate failure case is FileNotFoundError: when a nested input_cfg YAML contains OmegaConf interpolations, raw yaml.load returns them as literal strings, and those strings name paths that do not exist on disk. parse_and_combine_datasets resolves these later via OmegaConf.create(). All other errors should propagate immediately since they would also crash parse_and_combine_datasets.

Also documents file-path input_cfg depth counting in the RST docs.

Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com>
1 parent c8e2146 commit 296def8

2 files changed

Lines changed: 35 additions & 5 deletions

docs/source/audio/configs.rst

Lines changed: 22 additions & 0 deletions
@@ -221,6 +221,28 @@ The maximum nesting depth is calculated as the maximum depth of ``input_cfg`` ke
 input_cfg: # Level 2 (same as above)
 - type: lhotse_shar

+When ``input_cfg`` is overridden via CLI to a YAML file path (e.g.
+``model.train_ds.input_cfg=train_all.yaml``), the depth calculation loads the
+referenced file and traverses its contents to count nested ``input_cfg`` keys.
+This also works with multi-level file references:
+
+.. code-block:: yaml
+
+    # train_all.yaml (referenced via input_cfg=train_all.yaml)
+    - type: group
+      weight: 100
+      input_cfg: ${oc.env:MANIFEST_ROOT}/train_en.yaml # resolved at runtime
+    - type: group
+      weight: 200
+      input_cfg: ${oc.env:MANIFEST_ROOT}/train_de.yaml
+
+.. note::
+
+    Paths containing OmegaConf interpolations (e.g. ``${oc.env:MANIFEST_ROOT}``)
+    cannot be resolved during depth counting -- they are resolved later at runtime
+    by ``OmegaConf.create()``. Such paths are treated as a single additional
+    nesting level.
+
 **Example: Balancing Multiple Task Groups**

 .. code-block:: yaml
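
The depth rule described in the docs change above can be sketched roughly as follows. This is an illustrative sketch, not the NeMo implementation: `count_levels`, `fake_load`, and the in-memory `files` mapping are hypothetical names introduced here, and the loader is assumed to raise FileNotFoundError for any path it cannot find (as an unresolved `${oc.env:...}` path would on disk).

```python
from typing import Any, Callable


def count_levels(config: Any, load: Callable[[str], Any]) -> int:
    """Return the maximum nesting depth of ``input_cfg`` keys in *config*.

    *load* is a hypothetical loader: path string -> parsed content.
    It must raise FileNotFoundError for paths that do not exist.
    """
    if isinstance(config, list):
        return max((count_levels(item, load) for item in config), default=0)
    if not isinstance(config, dict):
        return 0
    depth = 0
    for key, value in config.items():
        if key == "input_cfg":
            if isinstance(value, str):
                try:
                    value = load(value)
                except FileNotFoundError:
                    # Unresolved path: count this key as one level, do not descend.
                    depth = max(depth, 1)
                    continue
            depth = max(depth, 1 + count_levels(value, load))
        else:
            depth = max(depth, count_levels(value, load))
    return depth


# Stand-in for files on disk (assumption: only train_all.yaml is loadable).
files = {
    "train_all.yaml": [
        {"type": "group", "weight": 100, "input_cfg": "${oc.env:MANIFEST_ROOT}/train_en.yaml"},
        {"type": "group", "weight": 200, "input_cfg": "${oc.env:MANIFEST_ROOT}/train_de.yaml"},
    ]
}


def fake_load(path: str) -> Any:
    """Look up *path* in ``files``; mimic a missing file otherwise."""
    try:
        return files[path]
    except KeyError:
        raise FileNotFoundError(path)


config = {"input_cfg": "train_all.yaml"}
depth = count_levels(config, fake_load)
```

Under these assumptions the top-level `input_cfg` contributes one level and each unresolved nested path contributes one more, so the example yields a depth of 2.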

nemo/collections/common/data/lhotse/cutset.py

Lines changed: 13 additions & 5 deletions
@@ -478,9 +478,10 @@ def count_input_cfg_levels(config: Union[DictConfig, dict]) -> int:

     String/Path values for ``input_cfg`` are treated as file references (mirroring
     :func:`parse_and_combine_datasets`) and loaded so that nested ``input_cfg``
-    keys inside those files are counted. If a file cannot be loaded (e.g. the
-    path contains unresolved OmegaConf interpolations), it is conservatively
-    counted as one additional level.
+    keys inside those files are counted. If the file is not found (e.g. the
+    path contains unresolved OmegaConf interpolations such as
+    ``${oc.env:MANIFEST_ROOT}``), it is conservatively counted as one additional
+    level. All other I/O or parsing errors propagate immediately.

     Args:
         config: Configuration dictionary that may contain nested 'input_cfg' keys.
@@ -502,13 +503,20 @@ def count_input_cfg_levels(config: Union[DictConfig, dict]) -> int:
     _cache: dict[str, object] = {}

     def _resolve_if_path(val):
-        """If *val* is a string/Path, try to load the YAML it points to."""
+        """If *val* is a string/Path, load the YAML file it points to.
+
+        Raises on I/O or parse errors except ``FileNotFoundError``, which is
+        expected when the path contains OmegaConf interpolations (e.g.
+        ``${oc.env:MANIFEST_ROOT}/file.yaml``) that raw ``yaml.load`` returns
+        as literal strings. ``parse_and_combine_datasets`` resolves them at
+        runtime via ``OmegaConf.create()``.
+        """
         if isinstance(val, (str, Path)):
             key = str(val)
             if key not in _cache:
                 try:
                     _cache[key] = load_yaml(key)
-                except Exception:
+                except FileNotFoundError:
                     logging.debug("count_input_cfg_levels: could not load %r, treating as leaf", key)
                     _cache[key] = val
             return _cache[key]
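
The effect of narrowing the handler can be seen in a small standalone sketch. It is not the code from this commit: `load_or_leaf` is a hypothetical helper, and `json` from the standard library stands in for the YAML loader so the example runs without extra dependencies. The point is only that a missing file falls back to the raw path, while a malformed file raises instead of being silently treated as a leaf.

```python
import json
import os
import tempfile


def load_or_leaf(path: str):
    """Load a JSON file; fall back to the raw path only if the file is missing."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # Expected for unresolved interpolation paths; keep the literal string.
        return path


with tempfile.TemporaryDirectory() as tmp:
    # A file that exists but cannot be parsed.
    bad = os.path.join(tmp, "malformed.json")
    with open(bad, "w", encoding="utf-8") as f:
        f.write("{not valid")

    # A literal, unresolved interpolation path that does not exist on disk.
    unresolved = os.path.join(tmp, "${oc.env:MANIFEST_ROOT}", "train_en.yaml")
    missing = load_or_leaf(unresolved)

    try:
        load_or_leaf(bad)
        parse_error_propagated = False
    except json.JSONDecodeError:
        parse_error_propagated = True
```

With a broad `except Exception`, the malformed file would also have been returned as a plain string and the parse error would have surfaced only later, further from its cause.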
