Generate timestamps for time series predictions without test data #1536
Agent-Logs-Url: https://github.com/microsoft/FLAML/sessions/e8fc3b11-897f-4326-8db6-abc2c8f19dd5 Co-authored-by: thinkall <3197038+thinkall@users.noreply.github.com>
Pull request overview
Updates FLAML’s time series utilities so predictions can be “prettified” (returned as a timestamped DataFrame) even when no test_data is provided, aligning with common forecasting workflows and addressing #1506.
Changes:
- TimeSeriesDataset.prettify_prediction() now auto-generates future timestamps when test_data is empty/absent, including for DataFrame, Series, and ndarray predictions.
- create_forward_frame() now uses pandas.tseries.frequencies.to_offset() to correctly advance anchored frequencies (e.g., weekly anchors, quarter ends) and adds clearer validation errors.
- Adds targeted tests covering timestamp generation without test data and anchored-frequency forward-frame generation.
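The anchored-frequency point can be illustrated with a minimal sketch. It is not the code from this PR: `forward_timestamps` is a hypothetical helper, and the only library calls assumed are `pandas.tseries.frequencies.to_offset()`, `pd.Timestamp`, and `pd.date_range()`. Adding the offset object to the last known timestamp moves to the next anchor point (for example the next Sunday for `W-SUN`) rather than shifting by a fixed number of days.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset


def forward_timestamps(last_timestamp, freq, steps):
    """Hypothetical helper: return `steps` timestamps strictly after `last_timestamp`.

    Illustrates the offset-based approach only; FLAML's create_forward_frame()
    may differ in signature and validation details.
    """
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    offset = to_offset(freq)  # e.g. "W-SUN" -> a Week offset anchored on Sunday
    if offset is None:
        raise ValueError("freq must be a valid pandas frequency")
    # Adding the offset advances to the next anchor point after last_timestamp.
    start = pd.Timestamp(last_timestamp) + offset
    return pd.date_range(start=start, periods=steps, freq=offset)


# 2024-01-03 is a Wednesday, so the first forecast step is the following Sunday.
weekly = forward_timestamps("2024-01-03", "W-SUN", 3)
# Month-start anchor: the step after 2024-01-01 is 2024-02-01.
monthly = forward_timestamps("2024-01-01", "MS", 2)
```

Using the offset object for both the first step and the `date_range` frequency keeps every generated timestamp on the same anchor.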
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| flaml/automl/time_series/ts_data.py | Generates forward timestamps in prettify_prediction() and improves create_forward_frame() frequency handling via pandas offsets. |
| test/automl/test_ts_data.py | Adds regression tests for timestamp auto-creation and anchored frequency behavior. |
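As a rough illustration of the prettify_prediction() behaviour described in the table, the sketch below attaches generated timestamps to predictions given as a DataFrame, Series, or ndarray. Everything in it is an assumption made for illustration: `attach_future_timestamps`, the `time_col` and `target_col` parameters, and the default column names are hypothetical and are not taken from FLAML's source.

```python
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset


def attach_future_timestamps(y_pred, last_timestamp, freq,
                             time_col="ds", target_col="y"):
    """Hypothetical stand-in for the no-test-data path; not FLAML's actual code."""
    if isinstance(y_pred, pd.DataFrame):
        # Copy with a fresh positional index so the input is not mutated.
        values = y_pred.reset_index(drop=True)
    elif isinstance(y_pred, pd.Series):
        name = y_pred.name if y_pred.name is not None else target_col
        values = y_pred.reset_index(drop=True).to_frame(name=name)
    else:
        arr = np.asarray(y_pred)
        if arr.ndim != 1:
            raise ValueError("expected a 1-D array of predictions")
        values = pd.DataFrame({target_col: arr})

    offset = to_offset(freq)
    if offset is None:
        raise ValueError("freq must be a valid pandas frequency")
    # First forecast timestamp is one offset step after the last known one.
    start = pd.Timestamp(last_timestamp) + offset
    times = pd.date_range(start=start, periods=len(values), freq=offset)
    values.insert(0, time_col, times)  # time column first, predictions after
    return values


daily = attach_future_timestamps(np.array([1.0, 2.0, 3.0]), "2024-01-31", "D")
weekly = attach_future_timestamps(pd.Series([5, 6], name="sales"), "2024-01-07", "W-SUN")
```

In both calls the prediction values are kept unchanged and only the time column is added, which matches the behaviour the PR description states for all three input types.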
@copilot sync with main branch
Fetched full history and the latest
Reviewed PR #1508’s diff, review threads, and comments against this PR at
The first OSS CI run on this PR uncovered 4 real test failures:
test/automl/test_ts_coverage.py::test_prettify_no_test_ndarray_raises
test/automl/test_ts_coverage.py::test_prettify_no_test_series_raises
OSS PR #1536 ("Generate timestamps for time series predictions
without test data") changed prettify_prediction to auto-generate
timestamps via create_forward_frame instead of raising
ValueError / NotImplementedError when test_data is None.
Update the two internal tests to assert the new graceful behaviour
(a DataFrame with the time column populated) instead of expecting
an exception that no longer fires.
test/nlp/test_hf_utils_coverage.py::test_summarization_with_y_true
test/nlp/test_hf_utils_coverage.py::test_summarization_without_y_true
Both fail with 'Resource punkt_tab not found' because nltk data
is not pre-downloaded on the public GitHub Actions runners.
The internal Azure DevOps pipeline (.pipelines/build.yml) explicitly
ignores test/nlp for the same reason -- mirror that behaviour by
adding --ignore=test/nlp to both notspark and spark CI invocations.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why are these changes needed?
TimeSeriesDataset.prettify_prediction() previously raised when no test data was available. This change generates future timestamps from the dataset frequency and last known timestamp, and keeps prediction values for DataFrame, Series, and ndarray inputs. It also makes create_forward_frame() use pandas offsets for anchored frequencies.
Related issue number
Closes #1506
Checks
python -m ruff check flaml/automl/time_series/ts_data.py test/automl/test_ts_data.py instead.