Docs/fit dataset error #3086

Open
jakubchlapek wants to merge 5 commits into unit8co:master from jakubchlapek:docs/fit-dataset-error

Conversation

Collaborator

@jakubchlapek jakubchlapek commented Apr 24, 2026

Checklist before merging this PR:

  • Mentioned all issues that this PR fixes or addresses.
  • Summarized the updates of this PR under Summary.
  • Added an entry under Unreleased in the Changelog.

Fixes #3006.

Summary

Document, and log a warning, that encoders configured via add_encoders are ignored by TorchForecastingModel.fit_from_dataset() and predict_from_dataset(). Also note that TFT categorical_embedding_sizes is keyed by column index when using that path.
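
For readers who currently key the embedding sizes by column name, the index keying could be sketched roughly as below. This is only an illustration: `embedding_sizes_by_index` is a hypothetical helper, not part of Darts, and it assumes that the index in question is the position of the column in an ordered list of column names. The column names and sizes are made up.

```python
def embedding_sizes_by_index(sizes_by_name, column_order):
    """Re-key a name-keyed embedding-size mapping by column position.

    Hypothetical helper, not part of Darts. Assumes the index expected on the
    dataset path is the position of the column in `column_order`.
    """
    missing = [name for name in sizes_by_name if name not in column_order]
    if missing:
        raise KeyError(f"columns not found in column_order: {missing}")
    return {column_order.index(name): size for name, size in sizes_by_name.items()}


# Illustrative column names and sizes only.
columns = ["region", "store_type"]
by_index = embedding_sizes_by_index({"store_type": (64, 8)}, columns)
```

Here `by_index` holds the same size tuple, keyed by the position of `store_type` in `columns` instead of by its name.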

codecov Bot commented Apr 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.22%. Comparing base (9d3c27a) to head (ee55187).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3086      +/-   ##
==========================================
- Coverage   96.29%   96.22%   -0.07%     
==========================================
  Files         160      160              
  Lines       17227    17231       +4     
==========================================
- Hits        16588    16581       -7     
- Misses        639      650      +11     


Collaborator

@dennisbader dennisbader left a comment

Thanks for the PR @jakubchlapek 🚀

I added some suggestions, mainly regarding the warning, which would now also be raised whenever a user calls fit() or predict() directly.

Comment thread darts/models/forecasting/tft_model.py Outdated
``{"some_column": (64, 8)}``.
Note that ``TorchForecastingModels`` only support numeric data. Consider transforming/encoding your data
with `darts.dataprocessing.transformers.static_covariates_transformer.StaticCovariatesTransformer`.
When training via ``TorchForecastingModel.fit_from_dataset()``, categorical embeddings are resolved by

Suggested change
- When training via ``TorchForecastingModel.fit_from_dataset()``, categorical embeddings are resolved by
+ When training via ``fit_from_dataset()``, categorical embeddings are resolved by

This function can be called several times to do some extra training. If ``epochs`` is specified, the model
will be trained for some (extra) ``epochs`` epochs.

Encoders configured via ``add_encoders`` at model creation are not applied here; ``train_dataset``

This could be wrapped in a "note" section (similar to how we have it in other places) to make it more obvious.
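
A rough sketch of what such a note section might look like in a docstring is shown below. The function is an illustrative stub rather than Darts code, and the use of the reStructuredText `.. note::` directive is an assumption about the convention meant by "other places".

```python
def fit_from_dataset_stub(train_dataset):
    """Train the model on a prepared dataset (illustrative stub, not Darts code).

    .. note::
        Encoders configured via ``add_encoders`` at model creation are not
        applied here.

    Parameters
    ----------
    train_dataset
        The prepared training dataset.
    """
    return train_dataset
```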

self
Fitted model.
"""
if self.encoders is not None and self.encoders.encoding_available:

fit_from_dataset() is also called when calling fit() directly, so this warning will always show. We want it to show only if the user called fit_from_dataset() directly.
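
One possible way to scope the warning to direct calls is sketched below, as a toy model rather than the actual Darts implementation. `ToyModel`, `has_encoders`, `_fit_from_dataset` and `count_warnings` are hypothetical names: the public entry point warns and then delegates to a private helper, while fit() calls the private helper and so never triggers the warning.

```python
import warnings


class ToyModel:
    """Hypothetical stand-in for a torch forecasting model; not the Darts class."""

    def __init__(self, has_encoders):
        self.has_encoders = has_encoders
        self.n_samples = None

    def fit(self, series):
        # fit() builds the dataset itself and goes straight to the private
        # helper, so the public dataset entry point is never reached.
        dataset = list(series)
        return self._fit_from_dataset(dataset)

    def fit_from_dataset(self, dataset):
        # Only a direct user call lands here, so the warning is scoped to it.
        if self.has_encoders:
            warnings.warn(
                "encoders are ignored in fit_from_dataset()", UserWarning
            )
        return self._fit_from_dataset(dataset)

    def _fit_from_dataset(self, dataset):
        # Shared training logic (reduced to a sample count for illustration).
        self.n_samples = len(dataset)
        return self


def count_warnings(call):
    """Run `call` and return how many warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        call()
    return len(caught)


via_fit = count_warnings(lambda: ToyModel(True).fit([1, 2, 3]))
via_dataset = count_warnings(lambda: ToyModel(True).fit_from_dataset([1, 2, 3]))
```

The same split would apply to predict() and predict_from_dataset(). An alternative would be a private keyword flag on the dataset method, but a separate private helper keeps the public signature unchanged.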

``trainer``. For more information on PyTorch Lightning Trainers check out `this link
<https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html>`__.

Encoders configured via ``add_encoders`` at model creation are not applied here; ``dataset`` must already

This could also be a note section.


self._verify_inference_dataset_type(dataset)

if self.encoders is not None and self.encoders.encoding_available:

Same comment here: this would also be triggered when calling predict() directly.



Development

Successfully merging this pull request may close these issues.

[Feature] Improve handling & documentation of encoders and categorical static covariates in .fit_from_dataset()

2 participants