Skip to content

GH-<id>: [Python] Assert assume_timezone nonexistent/ambiguous results in test_compute#50368

Draft
anxkhn wants to merge 1 commit into
apache:mainfrom
anxkhn:test/assume-timezone-missing-assert
Draft

GH-<id>: [Python] Assert assume_timezone nonexistent/ambiguous results in test_compute#50368
anxkhn wants to merge 1 commit into
apache:mainfrom
anxkhn:test/assume-timezone-missing-assert

Conversation

@anxkhn

@anxkhn anxkhn commented Jul 4, 2026

Copy link
Copy Markdown
### Rationale for this change

In `python/pyarrow/tests/test_compute.py`, `test_assume_timezone` builds an
`expected` array and compares it against the kernel `result` with `.equals()`, but
four of those comparisons drop the `assert`, so the boolean result is discarded and
the comparison verifies nothing. Because of this, the `assume_timezone` output
branches for DST-nonexistent times (`nonexistent="shift_forward"` /
`"shift_backward"`) and DST-ambiguous times (`ambiguous` earliest / latest) are
executed but never actually checked. A regression that produced wrong timestamps on
those branches would still let the test pass. The earlier assertions in the same
test already use the correct `assert result.equals(pa.array(expected))` form, which
shows the four bare comparisons were meant to be assertions too.

### What changes are included in this PR?

Prefix the four bare `.equals()` comparisons in `test_assume_timezone` with
`assert`, matching the form already used elsewhere in the same test. This is a
test-only change; no production or kernel code is touched.

### Are these changes tested?

Yes. The change is itself a test fix: with the asserts in place,
`pytest python/pyarrow/tests/test_compute.py -k assume_timezone` still passes on the
current kernel (confirming the four expected/result pairings are correct) and would
now fail if the `assume_timezone` DST-nonexistent or DST-ambiguous branch regressed.
The test is gated on `@pytest.mark.pandas` and `@pytest.mark.timezone_data`, so it
runs in the CI jobs that have pandas and timezone data.

### Are there any user-facing changes?

No.

This change was made with the help of an AI coding assistant; I reviewed the diff,
confirmed each expected/result pairing against the C++ kernel semantics and pandas'
`tz_localize` behavior, and I own the change.

Notes on the PR body:

  • The last paragraph is the AI-usage disclosure that Arrow's AI-generated-code
    guidance asks for. Keep it: the policy explicitly wants authors to "be upfront
    about AI usage." It leaks nothing about the local environment.
  • Do NOT tag or ping any maintainer in the PR (Arrow policy: AI-assisted PRs must
    not ping maintainers).
  • Arrow uses squash-merge and the merge bot adds Signed-off-by; do not commit
    with -s.

The diff (for reference)

diff --git a/python/pyarrow/tests/test_compute.py b/python/pyarrow/tests/test_compute.py
@@ def test_assume_timezone():
         timezone, nonexistent="shift_forward"))
     result = pc.assume_timezone(
         nonexistent_array, options=options_nonexistent_latest)
-    expected.equals(result)
+    assert expected.equals(result)

     expected = pa.array(nonexistent.tz_localize(
         timezone, nonexistent="shift_backward"))
     result = pc.assume_timezone(
         nonexistent_array, options=options_nonexistent_earliest)
-    expected.equals(result)
+    assert expected.equals(result)
@@
     expected = ambiguous.tz_localize(timezone, ambiguous=[True, True, True])
     result = pc.assume_timezone(
         ambiguous_array, options=options_ambiguous_earliest)
-    result.equals(pa.array(expected))
+    assert result.equals(pa.array(expected))

     expected = ambiguous.tz_localize(timezone, ambiguous=[False, False, False])
     result = pc.assume_timezone(
         ambiguous_array, options=options_ambiguous_latest)
-    result.equals(pa.array(expected))
+    assert result.equals(pa.array(expected))

Local commit message (already on the branch, Conventional Commits, leak-clean):

test(python): assert assume_timezone nonexistent/ambiguous results

Four comparisons in test_assume_timezone computed a result with .equals()
but dropped the assert, so the boolean was discarded and nothing was
verified. As a result the assume_timezone kernel's DST nonexistent
(shift_forward/shift_backward) and ambiguous (earliest/latest) output
branches were exercised but never checked, letting a regression pass.

Prefix the four bare comparisons with assert, matching the
assert result.equals(pa.array(expected)) form already used earlier in
the same test.

(The commit subject uses the test(python): Conventional-Commits form for local
history; the upstream PR title must be the GH-<id>: [Python] ... form above.
Arrow's squash-merge takes the PR title + body as the final commit message, so the
local subject is not what lands on main.)

Four comparisons in test_assume_timezone computed a result with .equals()
but dropped the assert, so the boolean was discarded and nothing was
verified. As a result the assume_timezone kernel's DST nonexistent
(shift_forward/shift_backward) and ambiguous (earliest/latest) output
branches were exercised but never checked, letting a regression pass.

Prefix the four bare comparisons with assert, matching the
assert result.equals(pa.array(expected)) form already used earlier in
the same test.
Copilot AI review requested due to automatic review settings July 4, 2026 23:19
@anxkhn anxkhn requested review from AlenkaF, raulcd and rok as code owners July 4, 2026 23:19
@github-actions

github-actions Bot commented Jul 4, 2026

Copy link
Copy Markdown

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes test_assume_timezone in python/pyarrow/tests/test_compute.py so that four previously non-asserted .equals() comparisons actually validate the assume_timezone results for DST-nonexistent and DST-ambiguous inputs.

Changes:

  • Add missing assert keywords for two nonexistent-time (shift_forward / shift_backward) result comparisons.
  • Add missing assert keywords for two ambiguous-time (earliest / latest) result comparisons.

@kou kou marked this pull request as draft July 5, 2026 02:52
@kou

kou commented Jul 5, 2026

Copy link
Copy Markdown
Member

Please read https://arrow.apache.org/docs/dev/developers/overview.html carefully and try again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants