Following pandas 3.0, make Day cftime offset non-Tick-like#10650
Conversation
2f3d6c0 to
6b7f4bd
Compare
6b7f4bd to
ea43798
Compare
| def test_resample(freqs, closed, label, offset) -> None: | ||
| def test_resample_with_tick_resample_freq(freqs, closed, label, offset) -> None: | ||
| initial_freq, resample_freq = freqs | ||
| if ( |
There was a problem hiding this comment.
Since our minimum version of pandas is greater than or equal to 2.2, I took the opportunity to remove this old test-skipping logic.
| ) | ||
| @pytest.mark.parametrize("closed", [None, "left", "right"]) | ||
| @pytest.mark.parametrize("label", [None, "left", "right"]) | ||
| def test_resample_with_non_tick_resample_freq(freqs, closed, label) -> None: |
There was a problem hiding this comment.
Since the offset and origin options are not relevant when resampling to a non-Tick frequency, I split these tests off into their function to avoid warnings and avoid unnecessary duplication of tests.
| @pytest.mark.parametrize( | ||
| "freq,units", | ||
| [ | ||
| ("D", "D"), |
There was a problem hiding this comment.
Now that Day is non-Tick-like, multiplying by a float is no longer valid in pandas, so shift with a float value and Day offset will raise an error:
>>> import pandas as pd
>>> times = pd.date_range("2000", periods=5)
>>> times.shift(0.5, "D")
Traceback (most recent call last):
File "<python-input-2>", line 1, in <module>
times.shift(0.5, "D")
~~~~~~~~~~~^^^^^^^^^^
File "/Users/spencer/software/pandas/pandas/core/indexes/datetimelike.py", line 514, in shift
start = self[0] + periods * self.freq
~~~~~~~~^~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'float' and 'pandas._libs.tslibs.offsets.Day'
A similar error is raised now for attempting to shift a CFTimeIndex in this way. This is tested in b87bb35.
| da_cftime = da(cftime_index) | ||
| with pytest.raises(ValueError, match="offset must be"): | ||
| da_cftime.resample(time="2D", offset=offset) # type: ignore[arg-type] | ||
| da_cftime.resample(time="2h", offset=offset) # type: ignore[arg-type] |
There was a problem hiding this comment.
Switched to "2h", since "2D" is no longer a Tick-like frequency.
There was a problem hiding this comment.
this seems like a major break. I've certainly used "D" for resampling to a daily frequency; and almost certainly used "5D" or "7D". I think we should do a deprecation cycle.
There was a problem hiding this comment.
Resampling to any multiple of "D" is still supported. It is just that using the offset and/or origin arguments is no longer supported in that context, which is why I changed the particular test here. For what it is worth, pandas is not providing a deprecation warning for this change, though in the next version they will warn (as I emulate here) when offset and/or origin are provided and they will have no effect.
In principle, I think the problems this aimed to solve in pandas are not relevant for cftime.datetime objects, which do not take into account timezones / daylight savings time transitions. With more work, we might be able maintain the old behavior for cftime.datetime objects going forward, though given the complexity of this code, I tentatively prefer to stick as close to the pandas behavior as possible. I'm open to more discussion, however.
There was a problem hiding this comment.
ah ok i didn't realize that the resampling syntax worked.
I think the problems this aimed to solve in pandas are not relevant for cftime.datetime objects, which do not take into account timezones / daylight savings time transitions. With more work, we might be able maintain the old behavior for cftime.datetime objects going forward, though given the complexity of this code, I tentatively prefer to stick as close to the pandas behavior as possible.
I saw that discussion, and agree with your approach. It seems the least burden in the long run, at least until someone complains ;)
There was a problem hiding this comment.
No worries, I appreciate you reviewing this. I will go ahead and merge.
| timedelta_result = da_cftime.resample(time="2h", offset=timedelta).mean() | ||
| string_result = da_cftime.resample(time="2h", offset=string).mean() |
There was a problem hiding this comment.
Switched to "2h", since "2D" is no longer a Tick-like frequency.
| (Second(), 3, Second(n=3)), | ||
| (Millisecond(), 3, Millisecond(n=3)), | ||
| (Microsecond(), 3, Microsecond(n=3)), | ||
| (Day(), 0.5, Hour(n=12)), |
There was a problem hiding this comment.
Now that Day is non-Tick-like, multiplying by a float is no longer valid in pandas:
>>> import pandas as pd
>>> 2.0 * pd.tseries.frequencies.Day(2)
Traceback (most recent call last):
File "<python-input-2>", line 1, in <module>
2.0 * pd.tseries.frequencies.Day(2)
~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'float' and 'pandas._libs.tslibs.offsets.Day'
A similar error is raised now for the cftime version of this offset. This is tested in b87bb35.
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
In pandas 3.0,
Daywill no longer be considered aTick-like frequency (pandas-dev/pandas#61985). This PR ports this change to the cftime version of this offset. The main implication in the cftime / xarray context is that it means theoffsetandoriginoptions inresamplewill no longer have an effect when resampling to a daily frequency. As in pandas-dev/pandas#62101, warnings will be emitted if non-default values are passed.whats-new.rst