SNOW-3718303: Correctly escape special characters in comment/collate/subfield/flatten SQL #4271

Merged
sfc-gh-aling merged 4 commits into main from SNOW-3718303-sql-literal-escaping on Jul 2, 2026

@sfc-gh-aling

Collaborator
  1. Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes SNOW-3718303, SNOW-3718304, SNOW-3718305, SNOW-3718306

Values passed to DataFrame comment (create_or_replace_view / dynamic table / save_as_table), Column.collate, Column.__getitem__ subfield keys, and DataFrame/Session.flatten PATH were concatenated into SQL string literals without escaping backslashes (and, for collate/subfield/flatten, without escaping single quotes), so values containing those characters produced invalid SQL. This PR adds a dedicated escape_quotes_and_backslashes helper that is applied only at those sinks; the shared escape_single_quotes is unchanged, so other callers are unaffected. Column.collate keeps its pre-quoted-spec handling, so legitimate specs behave identically. Unit and integration tests are added.

  1. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
      • If this test skips Local Testing mode, I'm requesting review from @snowflakedb/local-testing
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am adding new credentials
    • I am adding a new dependency
    • If this is a new feature/behavior, I'm adding the Local Testing parity changes.
    • I acknowledge that I have ensured my changes to be thread-safe. Follow the link for more information: Thread-safe Developer Guidelines
    • If adding any arguments to public Snowpark APIs or creating new public Snowpark APIs, I acknowledge that I have ensured my changes include AST support. Follow the link for more information: AST Support Guidelines
  2. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.
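
The escaping described in the PR summary above can be illustrated with a minimal sketch. The helper body is not shown in this conversation, so the function below is a hypothetical stand-in (note the `_sketch` suffix), and `comment_clause_sketch` is an invented sink used only for illustration. It assumes single quotes are escaped by doubling; the actual helper may use a different but equivalent escape.

```python
def escape_quotes_and_backslashes_sketch(value: str) -> str:
    """Hypothetical stand-in for the PR's helper; the real body is not shown here."""
    # Escape backslashes first, so the order stays safe even if the quote
    # step were later changed to a backslash-style escape.
    escaped = value.replace("\\", "\\\\")
    # Assumption: double each single quote so it cannot end the literal early.
    return escaped.replace("'", "''")


def comment_clause_sketch(comment: str) -> str:
    """Hypothetical sink: embed a caller-supplied comment in a SQL string literal."""
    return f"COMMENT = '{escape_quotes_and_backslashes_sketch(comment)}'"


if __name__ == "__main__":
    # A value containing both a single quote and a backslash.
    print(comment_clause_sketch("owner's path: C:\\temp"))
```

Without the escaping step, the apostrophe in the sample value would terminate the literal early and the backslash would be read as the start of an escape sequence, which is the class of invalid SQL the PR describes.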

…subfield/flatten SQL

Values passed to DataFrame comment (create_or_replace_view / dynamic table / save_as_table), Column.collate, Column.__getitem__ subfield keys, and DataFrame/Session.flatten PATH were concatenated into SQL string literals without escaping backslashes (and, for collate/subfield/flatten, single quotes), so values containing those characters produced invalid SQL. Adds a dedicated escape_quotes_and_backslashes helper applied only at those sinks; the shared escape_single_quotes is unchanged so other callers are unaffected. Column.collate keeps its pre-quoted-spec handling so legitimate specs behave identically. Adds unit and integ tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@codecov-commenter

codecov-commenter commented Jul 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.52%. Comparing base (e773089) to head (98645b0).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4271   +/-   ##
=======================================
  Coverage   95.52%   95.52%           
=======================================
  Files         171      171           
  Lines       44305    44358   +53     
  Branches     7565     7577   +12     
=======================================
+ Hits        42322    42375   +53     
  Misses       1221     1221           
  Partials      762      762           

sfc-gh-aling and others added 2 commits July 1, 2026 16:37
… quotes

Column.__getitem__ historically emitted the VARIANT/OBJECT subfield key
verbatim between single quotes, so callers were expected to double their
own single quotes (see test_column_suite.py::test_subfield). Unconditionally
running the key through escape_quotes_and_backslashes double-escaped those
already-doubled quotes, changing the resolved key and breaking existing
callers.

Add escape_subfield_key(): if every single quote in the key is already
doubled, preserve them as-is (escaping only backslashes so \t/\n stay
literal); otherwise fully escape. This keeps the historical contract
byte-identical, makes a raw apostrophe work too, and still neutralizes
any key that would otherwise break out of the literal.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
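
The rule in the commit message above (preserve keys whose single quotes are all already doubled, otherwise fully escape) can be sketched as follows. This is an illustration under stated assumptions, not the merged implementation: the name `escape_subfield_key_sketch` is hypothetical, the "already doubled" check is one possible way to express the rule, and quote escaping is assumed to be done by doubling.

```python
def escape_subfield_key_sketch(key: str) -> str:
    """Hypothetical sketch of the rule described for escape_subfield_key()."""
    # Backslashes are escaped in both branches, so a key containing a
    # backslash followed by "t" or "n" stays literal, not a tab or newline.
    escaped = key.replace("\\", "\\\\")

    # Assumption: a key counts as "already doubled" when removing every ''
    # pair leaves no single quote behind. Keys without quotes pass trivially.
    already_doubled = "'" not in key.replace("''", "")
    if already_doubled:
        # Historical contract: the caller doubled the quotes, keep them as-is.
        return escaped

    # A raw apostrophe is present: double every single quote.
    return escaped.replace("'", "''")


if __name__ == "__main__":
    for sample in ["name", "it''s", "it's", "a\\tb"]:
        print(repr(sample), "->", repr(escape_subfield_key_sketch(sample)))
```

Under this sketch, a caller-doubled key and the same key written with a raw apostrophe produce the same escaped text, which matches the commit's goal of keeping existing callers working while also accepting unescaped input.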
@sfc-gh-aling sfc-gh-aling marked this pull request as ready for review July 1, 2026 17:42
@sfc-gh-aling sfc-gh-aling requested review from a team as code owners July 1, 2026 17:42
@sfc-gh-aling sfc-gh-aling merged commit c9b9303 into main Jul 2, 2026
48 of 53 checks passed
@sfc-gh-aling sfc-gh-aling deleted the SNOW-3718303-sql-literal-escaping branch July 2, 2026 05:02
@github-actions github-actions Bot locked and limited conversation to collaborators Jul 2, 2026