SNOW-3718333: escape backslashes and single quotes in stage/file path SQL generation#4274

Merged
sfc-gh-aling merged 6 commits into
mainfrom
SNOW-3718333-escape-stage-path
Jul 2, 2026

@sfc-gh-aling

Collaborator

Summary

Stage and file paths passed to COPY INTO / PUT / GET were escaped for single quotes but not backslashes, so a path containing a backslash immediately followed by a single quote produced invalid SQL. normalize_path now escapes backslashes before single quotes so the path stays a single string literal.

Changes

  • _internal/utils.py: escape \ before ' in normalize_path.
  • Unit tests (tests/unit/test_internal_utils.py) and integ tests for DataFrame.write.csv and Snowpark-pandas DataFrame.to_csv.
  • A follow-up commit adjusts two of the new tests for platform / stage-storage behavior (Windows local-path normalization; a literal backslash is not preserved as a stage directory separator), so they assert the escaping guarantee without relying on a backslash round-trip.
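To illustrate why backslashes have to be escaped as well as single quotes, here is a minimal sketch. The helpers `escape_for_literal` and `escape_quotes_only` are hypothetical names used only for illustration; they are not the actual `normalize_path` code, and the sketch assumes a SQL dialect in which a backslash acts as an escape character inside single-quoted literals.

```python
BACKSLASH = "\\"      # a single backslash character
SINGLE_QUOTE = "'"


def escape_for_literal(path: str) -> str:
    """Hypothetical helper: escape backslashes first, then single quotes."""
    escaped = path.replace(BACKSLASH, BACKSLASH + BACKSLASH)
    return escaped.replace(SINGLE_QUOTE, BACKSLASH + SINGLE_QUOTE)


def escape_quotes_only(path: str) -> str:
    """Hypothetical helper: the quote-only escaping described in the summary."""
    return path.replace(SINGLE_QUOTE, BACKSLASH + SINGLE_QUOTE)


# A path containing a backslash immediately followed by a single quote.
path = "dir" + BACKSLASH + SINGLE_QUOTE + "x"

# Three backslashes precede the quote: an escaped backslash, then an escaped quote.
fixed = escape_for_literal(path)

# Two backslashes precede the quote: they pair up as one escaped backslash,
# leaving the quote unescaped, so it would close the literal early.
broken = escape_quotes_only(path)

print(fixed)
print(broken)
```

An odd number of backslashes directly before a quote means the quote itself is escaped; an even number means the quote terminates the literal.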

sfc-gh-aling and others added 2 commits July 1, 2026 20:40
… SQL generation

Stage and file paths passed to COPY INTO / PUT / GET were escaped for single
quotes but not backslashes, so a path containing a backslash followed by a
single quote produced invalid SQL. normalize_path now escapes backslashes
before single quotes so the path stays a single string literal. Adds unit tests
and integ tests covering Snowpark write.csv and Snowpark-pandas to_csv with
quote/backslash paths.

The escaping fix is correct; two newly-added tests encoded assumptions
that don't hold in CI:

- test_normalize_path_escapes_backslash_and_quote asserted backslashes
  round-trip for is_local=True, but on Windows local paths have
  backslashes normalized to '/' before escaping (pre-existing behavior).
  Mirror that transform in the expected value; the early-termination
  guarantee is still checked on every platform.

- test_writer_csv_stage_path_escapes_special_characters read back a
  backslash-containing stage path, but a literal backslash is not
  preserved as a directory separator by stage storage. Assert the writes
  succeed (valid SQL, path treated as literal data) instead of a
  read-back round-trip.
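
The first adjustment above could be sketched roughly as follows. This is not the test code from the PR: `expected_local_path` and `every_quote_is_escaped` are hypothetical helpers, and the Windows transform (backslashes replaced by '/' before escaping) is taken from the description above rather than from the library source.

```python
import sys

BACKSLASH = "\\"
SINGLE_QUOTE = "'"


def expected_local_path(raw: str, is_windows: bool) -> str:
    """Hypothetical test helper: build the expected escaped local path."""
    # Assumption from the commit message: on Windows, local-path backslashes
    # are normalized to '/' before any escaping happens.
    if is_windows:
        raw = raw.replace(BACKSLASH, "/")
    raw = raw.replace(BACKSLASH, BACKSLASH + BACKSLASH)
    return raw.replace(SINGLE_QUOTE, BACKSLASH + SINGLE_QUOTE)


def every_quote_is_escaped(escaped: str) -> bool:
    """True if each single quote is preceded by an odd number of backslashes."""
    for i, ch in enumerate(escaped):
        if ch != SINGLE_QUOTE:
            continue
        j = i
        while j > 0 and escaped[j - 1] == BACKSLASH:
            j -= 1
        if (i - j) % 2 == 0:
            return False
    return True


raw = "a" + BACKSLASH + SINGLE_QUOTE + "b"
expected = expected_local_path(raw, is_windows=sys.platform == "win32")

# The early-termination guarantee holds regardless of the platform branch taken.
assert every_quote_is_escaped(expected)
```

Keeping the platform-specific part in the expected value and the guarantee in a separate check lets the same test body run on every platform.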

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Use BACKSLASH/SINGLE_QUOTE constants for the escape replacements and
trim the comment. Behavior is unchanged; this only removes the Python
escape double-counting that made the original one-liner hard to read.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

# Snowflake string literal; the reverse order would let an escaped quote
# close the literal early and produce invalid SQL. Constants keep the
# replacements readable (no Python escape double-counting).
BACKSLASH = "\\"

@sfc-gh-aling sfc-gh-aling Jul 1, 2026

Collaborator Author

@sfc-gh-yuwang I'm responding to your question in this thread:

I have a dumb question here, looking at the old logic, it looks like ' would be written as \' because a single quote ' is replaced with \', which is different from what the comment described?

In Python, \ is the escape character, so \\ in source code represents a single \.
I have updated the code to make this clearer.
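
A short sketch of the double-counting in question. The constant names mirror the ones mentioned in the commit message; the rest is illustrative only and is not the code from the PR.

```python
BACKSLASH = "\\"       # two source characters, one actual backslash
SINGLE_QUOTE = "'"

# The source spelling and the runtime string differ in length.
assert len(BACKSLASH) == 1
assert len("\\\\") == 2                        # four source characters, two backslashes
assert "\\'" == BACKSLASH + SINGLE_QUOTE       # a backslash followed by a quote

# The same quote-escaping replacement, written inline and with named constants.
inline = "it's".replace("'", "\\'")
named = "it's".replace(SINGLE_QUOTE, BACKSLASH + SINGLE_QUOTE)
assert inline == named

print(named)  # it\'s
```

Both spellings produce the same string; the named constants only make it easier to count how many backslashes end up in the output.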

@codecov-commenter

codecov-commenter commented Jul 1, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.52%. Comparing base (c9b9303) to head (b3d7334).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4274   +/-   ##
=======================================
  Coverage   95.52%   95.52%           
=======================================
  Files         171      171           
  Lines       44358    44360    +2     
  Branches     7577     7577           
=======================================
+ Hits        42375    42377    +2     
  Misses       1221     1221           
  Partials      762      762           

sfc-gh-aling and others added 3 commits July 1, 2026 17:08
Stage storage does not preserve a literal backslash as a path-separator
character, so asserting the round-tripped name endswith "o'clock\dir/..."
always fails. Switch part (a) to a plain single-quote path that does
round-trip verbatim; keep part (b) asserting only that the write
succeeds (valid SQL), not the exact name on LIST.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@sfc-gh-aling sfc-gh-aling merged commit 809b627 into main Jul 2, 2026
28 of 31 checks passed
@sfc-gh-aling sfc-gh-aling deleted the SNOW-3718333-escape-stage-path branch July 2, 2026 06:49
@github-actions github-actions Bot locked and limited conversation to collaborators Jul 2, 2026
5 participants