Skip to content

SNOW-2381033: Deflake hybrid tests by using separate tables.#3830

Merged
sfc-gh-mvashishtha merged 4 commits into
mainfrom
mvashishtha/SNOW-2381033/deflake-hybrid-tests-depending-on-shared-table
Oct 2, 2025
Merged

SNOW-2381033: Deflake hybrid tests by using separate tables.#3830
sfc-gh-mvashishtha merged 4 commits into
mainfrom
mvashishtha/SNOW-2381033/deflake-hybrid-tests-depending-on-shared-table

Conversation

@sfc-gh-mvashishtha

@sfc-gh-mvashishtha sfc-gh-mvashishtha commented Oct 1, 2025

Copy link
Copy Markdown
Contributor

Prior to this commit, a few test cases depended on the module-scoped fixtureinit_transaction_tables, which would

  1. create or replace table REVENUE_TRANSACTIONS
  2. insert values into it.

The table was shared by different pytest processes, so there was a race condition where a test case might use a REVENUE_TRANSACTIONS table with 0 rows, or with double the number of rows.

In this commit, use a smaller, separate table for each test module. This change doesn't slow down pytest -n12 tests/integ/modin/hybrid/test_switch_operations.py on my laptop.

pytest -n12 tests/integ/modin/hybrid/test_switch_operations.py reliably failed on my machine prior to this commit, but now it passes.

Signed-off-by: sfc-gh-mvashishtha <mahesh.vashishtha@snowflake.com>
@sfc-gh-mvashishtha sfc-gh-mvashishtha added NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md NO-PANDAS-CHANGEDOC-UPDATES This PR does not update Snowpark pandas docs labels Oct 1, 2025
@sfc-gh-mvashishtha sfc-gh-mvashishtha marked this pull request as ready for review October 1, 2025 23:56
@sfc-gh-mvashishtha sfc-gh-mvashishtha requested a review from a team as a code owner October 1, 2025 23:56

@sfc-gh-joshi sfc-gh-joshi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do remember the initialization fixture taking a while, and I see you made the fixture generate fewer rows. As long as it's still above whatever switching thresholds we're testing it should be OK.

The "one table per test approach" is fine, but I think we ideally would find a way to initialize the table once during setup instead of re-creating it every time the fixture is initialized (maybe a single CTAS if not exists?), since none of the tests mutate it. I thought that's how module-scoped fixtures work, but I guess that's not the case.

Signed-off-by: sfc-gh-mvashishtha <mahesh.vashishtha@snowflake.com>
@sfc-gh-mvashishtha

Copy link
Copy Markdown
Contributor Author

The "one table per test approach" is fine, but I think we ideally would find a way to initialize the table once during setup instead of re-creating it every time the fixture is initialized

I agree.

I thought that's how module-scoped fixtures work, but I guess that's not the case.

They do work that way, but in this case each pytest process replaces/creates and inserts into the same table in the same namespace. We could alternatively create a (snowflake-session-scoped) temporary table once per pytest session, though that has the disadvantage that test cases could mutate the table. I've now updated this PR so it does that (except I had to use module scope since the session fixture has module-level scope).

@sfc-gh-mvashishtha

Copy link
Copy Markdown
Contributor Author

@sfc-gh-joshi never mind; the problem with the pytest-module-level fixture is that we could get the race condition again if different test files use the fixture. the session fixture is at the pytest module level, so I prefer the function-scoped version. It's not too slow, for now, either.

@sfc-gh-mvashishtha

Copy link
Copy Markdown
Contributor Author

we could get the race condition again if different test files use the fixture.

no; each module gets its own session so we should be ok. un-revert.

"""
CREATE OR REPLACE TABLE revenue_transactions (
f"""
CREATE OR REPLACE TEMP TABLE {module_scoped_test_table_name} (

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TEMP is not necessary now that we're using a different table name in each pytest module, but that's ok.

@sfc-gh-mvashishtha sfc-gh-mvashishtha merged commit d02aa5c into main Oct 2, 2025
30 checks passed
@sfc-gh-mvashishtha sfc-gh-mvashishtha deleted the mvashishtha/SNOW-2381033/deflake-hybrid-tests-depending-on-shared-table branch October 2, 2025 17:46
@github-actions github-actions Bot locked and limited conversation to collaborators Oct 2, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

NO-CHANGELOG-UPDATES This pull request does not need to update CHANGELOG.md NO-PANDAS-CHANGEDOC-UPDATES This PR does not update Snowpark pandas docs snowpark-pandas

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants