SNOW-2257191: Bugfix join bug due to dataframe alias by sfc-gh-aalam · Pull Request #3685 · snowflakedb/snowpark-python

sfc-gh-aalam · 2025-08-21T01:03:27Z

Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

Fixes SNOW-2257191
Fill out the following pre-review checklist:
- I am adding a new automated test(s) to verify correctness of my new code
  - If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
- I am adding new logging messages
- I am adding a new telemetry message
- I am adding new credentials
- I am adding a new dependency
- If this is a new feature/behavior, I'm adding the Local Testing parity changes.
- I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
- If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
Please describe how your code solves the related issue.

This PR fixes how `df_aliased_col_name_to_real_col_name` is propagated from child to parent: every dictionary inside the default dict is now copied by value instead of by reference.
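To illustrate the kind of bug described here, the following is a minimal sketch and not the actual Snowpark code. It assumes the mapping is a `defaultdict` whose values are plain dicts; the helper names `copy_by_reference` and `copy_by_value` and the sample aliases are hypothetical. A shallow copy of the outer mapping still shares the inner dicts, so a later update on one side leaks to the other.

```python
from collections import defaultdict
from typing import DefaultDict, Dict

# Assumed shape: alias -> {aliased column name -> real column name}.
AliasMap = DefaultDict[str, Dict[str, str]]


def copy_by_reference(child: AliasMap) -> AliasMap:
    """Shallow copy: the outer mapping is new, but inner dicts are shared."""
    return child.copy()


def copy_by_value(child: AliasMap) -> AliasMap:
    """Copy the outer mapping and every inner dict (hypothetical helper)."""
    parent: AliasMap = defaultdict(dict)
    for alias, columns in child.items():
        parent[alias] = dict(columns)
    return parent


child: AliasMap = defaultdict(dict)
child["L"]["ID"] = '"ID_1"'

shared = copy_by_reference(child)
independent = copy_by_value(child)

# A later update on the child side after the copies were taken.
child["L"]["NAME"] = '"NAME_1"'

print("NAME" in shared["L"])       # True: the inner dict is the same object
print("NAME" in independent["L"])  # False: the inner dict was copied
```

Copying each inner dict explicitly keeps the `defaultdict` behavior of the outer mapping while isolating the two plans from each other's updates.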

…sts for the integration module (#3715)

The new test parameter is called `--enable_modin_hybrid_mode` and applies only to the integ modin module. It is not used yet, but it allows hybrid mode to be enabled ad hoc. Eventually a new pre-commit test will enable hybrid mode just for the integration modin module. This change also disables the sql_counter when running under hybrid mode, because virtually no SQL queries are issued.

…das (#3717)

SNOW-2305345 - Eliminate duplicate casing parameter checks in snowpandas

While working on SHOW OBJECT usage to see if we can fetch row size quickly, I noticed that we issue `SHOW PARAMETERS LIKE 'QUOTED_IDENTIFIERS_IGNORE_CASE' IN SESSION` queries every time we fetch the session. This is done to issue a warning, but we really only need to do it once.
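The run-once idea can be sketched as follows. This is an illustration under stated assumptions, not the snowpandas implementation: the class `IgnoreCaseWarningCheck`, the `fetch_parameter` callback, and the warning text are all hypothetical, and a fake callback stands in for the real session query.

```python
import threading
import warnings
from typing import Callable, List


class IgnoreCaseWarningCheck:
    """Hypothetical helper that runs a parameter check at most once."""

    def __init__(self, fetch_parameter: Callable[[], bool]) -> None:
        self._fetch_parameter = fetch_parameter
        self._lock = threading.Lock()
        self._done = False

    def run_once(self) -> None:
        # Flip the flag under a lock so concurrent callers query only once.
        with self._lock:
            if self._done:
                return
            self._done = True
        if self._fetch_parameter():
            warnings.warn("QUOTED_IDENTIFIERS_IGNORE_CASE is enabled in this session")


calls: List[str] = []


def fake_fetch() -> bool:
    # Stand-in for the real query; records each call instead of hitting a server.
    calls.append("SHOW PARAMETERS LIKE 'QUOTED_IDENTIFIERS_IGNORE_CASE' IN SESSION")
    return False


check = IgnoreCaseWarningCheck(fake_fetch)
for _ in range(3):  # simulate fetching the session three times
    check.run_once()

print(len(calls))  # 1
```

However many times the session is fetched, the query behind the warning is issued a single time.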

…#3975)

While testing #3973, I noticed that aggregations on single-column frames/series were producing queries with JSON serialization and unnecessary UNPIVOT operations. The QC's `transpose_single_row` helper method is used in aggregations to skip a PIVOT operation used in the general transpose case, but for transposing a 1x1 frame, we don't even need to UNPIVOT and need only re-label the index, since we already know that the column's dtype will not change.

This PR adds a fast path for 1x1 `transpose_single_row` operations, which replaces JSON/UNPIVOT operations with simple projections. It produces some modest performance improvements for operations on a 2000x1 frame:
- `DataFrame.count`: 1.48s -> 1.31s (11.2% improvement)
- `DataFrame.describe`: 2.64s -> 2.36s (10.9% improvement)
- `DataFrame.nunique`: 1.25s -> 1.21s (3.4% improvement)

These improvements are likely to be more noticeable on frames produced from more complex queries.

This PR also adds explicit row count caching for the general transpose case. We currently cannot directly use the `transpose_single_row` path for the `transpose` API itself, since the helper function drops the column labels of the result.

sfc-gh-aalam added the NO-CHANGELOG-UPDATES label (this pull request does not need to update CHANGELOG.md) Aug 21, 2025

sfc-gh-kgadomski and others added 29 commits August 25, 2025 09:40

SNOW-1943633 - fix displaying timestamps for _show_string_spark, introduce `_spark_session_tz` param (#3659)

75d623d

SNOW-2275424: fix udtf_ingestion does not work in stored proc because function name is out of spec (#3691)

7f1bd1c

NO-SNOW: fix doc test (#3707)

52a3216

SNOW-1791191: time travel support v2 (#3674)

2a9dc3d

SNOW-2296406: Support ai_extract, ai_transcribe and `ai_parse_document` in functions (#3697)

86ced20

SNOW-2223084: update cloudpickle to v3.1.1 (#3584)

bf7928f

SNOW-2213898 - Update assertion code to convert hybrid Pandas df (#3704)

5cdb0b1

SNOW-2230553 - Reduce Telemetry Overhead when running w/ a Pandas Engine (#3610)

873c124

Co-authored-by: graphite-app[bot] <96075541+graphite-app[bot]@users.noreply.github.com>

SNOW-2292908: Skip flaky modin hybrid tests on Windows (#3698)

3e9337a

SNOW-2262972: Refactor function.py file (#3655)

3d0158f

SNOW-2213898 - Update assertion code to handle test_attrs assertions (#3706)

37a384c

This is working towards running most of our snowpandas tests with hybrid mode.

SNOW-2296598: Estimate row count only when Hybrid is enabled (#3708)

8627b9e

SNOW-641374: support directory (#3709)

dbd6805

NO-SNOW: Pin pytest-rerunfailures due to be release (#3720)

95c2bba

SNOW-2249728: Add support for scalar functions (batch 1) (#3714)

8d6db91

SNOW-2292338 AST support for time travel (#3716)

11cafe0

SNOW-2166197: add support for jdbc (#3591)

2af770b

SNOW-2268238: support copy files (#3712)

cb2fd64

NO-SNOW: fix no pandas test failure (#3727)

da411b8

AST on TimestampTimeZone type casting (#3729)

278a28c

NO SNOW: fix test error because of naming (#3723)

c5f1a7c

SNOW-2301201: use server cursor to fetch data (#3726)

29e796e

SNOW-2298578: Add support for scalar functions (Bitwise expression functions) (#3721)

42f4745

NO-SNOW: Fix tests (#3731)

b425f9d

NO SNOW: align behavior of session_init_statement in dbapi and jdbc (#3734)

0857605

NO-SNOW: Actually skip hybrid tests on Windows (#3737)

a1a4bb2

NO-SNOW: add missing doc update (#3738)

6f7b9d4

sfc-gh-stramer and others added 26 commits October 30, 2025 14:50

SNOW-2442548 Include VARIANT cast requirement for array_contains(...) (#3948)

b135afc

SNOW-2405395: Add ai_translate (#3969)

4057ed1

SNOW-2443666: Remove parameter ENABLE_ARRAYS_ZIP_FUNCTION (#3980)

f2c7ccb

NO-SNOW: fix type check before AST in sort() (#3982)

94b2ee3

Add support for scalar string and binary functions - part 1 (#3905)

521f4f2

SNOW-2504821: Add support for cumsum/cummin/cummax in faster pandas (#3978)

b8db0ba

SNOW-2391351: Avoid joins for drop_duplicates when keep!=False in faster pandas (#3964)

a8c954e

SNOW-2346239: update GH action to use key pair auth (#3981)

34f4674

SNOW-2437173 Enable asfreq for autoswitching on unsupported args (#3976)

3d0b177

SNOW-2455523: Add support for scalar string and binary functions - part 3 (#3947)

c75a502

SNOW-2500535: Add scalar support for numeric and conditional expression functions. (#3977)

eff0dd1

SNOW-2430625: encapsulate local ingestion and udtf ingestion, get rid of driver reference on top level (#3897)

3cee8b4

[SNOW-2437420] Add experiment to lineage versioned domains (#3934)

6d0d155

SNOW-2675533: extract local ingestion into Utils (#3987)

05f060d

SNOW-2644834: Add support for 11 groupby functions in faster pandas (#3985)

ba44a06

SNOW-2643972: Add support for groupby properties (groupby.groups/indices) in faster pandas (#3984)

3e70d50

SNOW-2540864: Fix df.rename not working for quoted and non quoted column names like '"ab"' and 'ab' (#3986)

d52176e

SNOW-2676991: Add support for to_snowflake in faster pandas (#3988)

8d3ad6d

SNOW-2676993: Add support for to_snowpark in faster pandas (#3989)

8efd426

SNOW-2679281: Add missing tests and changelog entries for dt.dayofweek/weekday/dayofyear/isocalendar (already supported in faster pandas) (#3992)

97b0906

SNOW-2679277: Add support for groupby.get_group/resample/rolling in faster pandas (#3991)

2108145

SNOW-2677419: Add support for resample functions in faster pandas (#3990)

70a7d67

SNOW-2268207: Support lateral join (#3971)

79af340

SNOW-2430412: Remove experimental warning of cte_optimization_enabled (#3993)

45929c9

merge

74f91f2

sfc-gh-aalam closed this Nov 12, 2025

sfc-gh-aalam force-pushed the aalam-SNOW-2257191-cte-join-bugfix branch from 50b5998 to 74f91f2 November 12, 2025 23:20

github-actions Bot locked and limited conversation to collaborators Nov 12, 2025

sfc-gh-aalam deleted the aalam-SNOW-2257191-cte-join-bugfix branch November 12, 2025 23:29

sfc-gh-aalam wants to merge 2829 commits into main from aalam-SNOW-2257191-cte-join-bugfix

20 participants
