SNOW-3066557: UnicodeDecodeError when executing SQL files with UTF-8 encoding on Japanese Windows #2759

@KazunoriMatsuzawa

Description

SnowCLI version

3.15.0

Python version

3.11 (embedded in PyApp)

Platform

Windows 10/11 (Japanese locale, CP932 default encoding)

What happened

Snowflake CLI fails to execute SQL files containing non-ASCII characters (Japanese comments) with UTF-8 encoding on Japanese Windows environment, despite setting PYTHONUTF8=1 and PYTHONIOENCODING=utf-8.

Console output

Actual Result:
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 88: illegal multibyte sequence

Error Traceback:
File "...\snowflake\cli\_plugins\sql\statement_reader.py", line 233, in files_reader
    stmts = split_statements(io.StringIO(f.read()), remove_comments)
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 88: illegal multibyte sequence

How to reproduce

Steps to Reproduce:

Create a SQL file containing Japanese comments.
Save the file as test.sql (UTF-8 encoding).
Execute test.sql with Snowflake CLI.

Actual Result and Error Traceback:
Same as shown under "Console output" above.
Expected Result:
SQL file should be read with UTF-8 encoding and executed successfully.

What I've Tried:

✅ Set PowerShell to UTF-8 (chcp 65001) - No effect
✅ Set environment variables PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 - No effect
✅ Modified PowerShell profile with UTF-8 settings - No effect
✅ Removed Japanese comments from SQL file - Workaround successful

Root Cause Analysis:

The issue occurs at statement_reader.py:233, where the file is read through SecurePath (the f.read() call in the traceback) without specifying an encoding, so Python falls back to the locale's default encoding. On Japanese Windows that default is CP932, not UTF-8.
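
The mismatch can be reproduced without Snowflake CLI. The following is a minimal sketch, assuming a made-up Japanese comment as sample data (it is not the actual test.sql from this report). It decodes the same UTF-8 bytes once as UTF-8 and once as CP932. The helper name try_decode is hypothetical.

```python
from typing import Optional


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid in `encoding`."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Hypothetical sample standing in for the SQL file; not the reporter's actual file.
SAMPLE_SQL = "-- テスト用のコメント\nSELECT 1;\n"
RAW_BYTES = SAMPLE_SQL.encode("utf-8")  # the bytes a UTF-8 editor writes to disk

as_utf8 = try_decode(RAW_BYTES, "utf-8")   # round-trips to the original text
as_cp932 = try_decode(RAW_BYTES, "cp932")  # None (decode error) or garbled text

if __name__ == "__main__":
    # Only ASCII is printed so the demo itself cannot fail on a CP932 console.
    print("utf-8 matches original:", as_utf8 == SAMPLE_SQL)
    print("cp932:", "decode error" if as_cp932 is None else "decoded, but not the original text")
```

Either CP932 outcome, an exception or garbled text, shows why the reader should not rely on the locale default when the file is known to be UTF-8.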

The environment variables PYTHONUTF8 and PYTHONIOENCODING don't affect PyApp-bundled Python executables, as they use their own embedded Python runtime.
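
Whether UTF-8 mode is active can be checked from inside an interpreter. This is a generic sketch for any Python 3.7+ runtime; how to run it inside the PyApp-bundled interpreter is not covered here, and describe_text_defaults is a hypothetical name.

```python
import locale
import sys


def describe_text_defaults() -> dict:
    """Collect the settings that decide how text files are decoded by default."""
    return {
        # True when PYTHONUTF8=1 (or -X utf8) was honoured by this interpreter.
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding used for text files opened without an explicit encoding.
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_text_defaults().items():
        print(f"{name}: {value}")
```

If utf8_mode is reported as False even though PYTHONUTF8=1 is set in the shell, the variable is not reaching the runtime, which would be consistent with the behaviour described above.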

Proposed Solution:

Explicitly specify UTF-8 encoding when opening SQL files in statement_reader.py, or alternatively add encoding parameter support to the SecurePath.read_text() method.
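
A minimal sketch of the first option, using pathlib.Path as a stand-in because the SecurePath API is not shown in this issue. The function name read_sql_text and the sample comment are hypothetical; this is not the CLI's actual code.

```python
import tempfile
from pathlib import Path


def read_sql_text(path: Path) -> str:
    """Read a SQL file as UTF-8 regardless of the platform's locale encoding."""
    # Passing encoding explicitly avoids the CP932 fallback on Japanese Windows.
    return path.read_text(encoding="utf-8")


def demo() -> bool:
    """Write a UTF-8 SQL file with a Japanese comment and read it back."""
    sample = "-- 日本語のコメント\nSELECT 1;\n"
    with tempfile.TemporaryDirectory() as tmp:
        sql_file = Path(tmp) / "test.sql"
        sql_file.write_bytes(sample.encode("utf-8"))
        return read_sql_text(sql_file) == sample


if __name__ == "__main__":
    print("round-trip ok:", demo())
```

With the encoding fixed at the call site, the result no longer depends on the console code page, PYTHONUTF8, or the bundled runtime's defaults.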

Related Documentation:

Python's open() encoding parameter
A similar issue was resolved for icacls command output (SnowflakeCLI_EncodingError.md in user documentation)

Workaround:
Remove non-ASCII characters from SQL files or use SnowSQL instead of Snowflake CLI.
