Feature | Use hardcoded LCID mappings when decoding strings#4212
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
benrr101
left a comment
There was a problem hiding this comment.
In general, I like the idea. The only thing I would like changed is to give more labels to the magic values. Right now it looks like a book of mysterious numbers with no real explanation. They ultimately correspond with a locale of some kind, yeah? So it'd be nice to be able to see which values correspond to which locale.
|
Thanks @benrr101. Yes - it maps LCID (defined here) to code page IDs (except for LCIDs 0x827 and 0x2409, which don't appear in the specs or are marked as reserved, but still have an entry in the LCID table.) The corresponding mapping here references the MS-LCID spec, which I can do. Many of the code pages are named Do you just want the link to MS-LCID, or something more structural? Edit - I've just added the LCID descriptions, and I think it looks slightly clearer now. What do you think? |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR updates collation/LCID-to-codepage resolution in TdsParser to use hardcoded mappings (aligned with other SQL Server drivers) instead of relying on CultureInfo.GetCultureInfo, helping move toward compatibility with globalization invariant mode.
Changes:
- Replaced
TdsParser.GetCodePagelogic with a call toLocalesHelper.TryGetCodePage(...). - Introduced
LocalesHelpercontaining hardcoded LCID→codepage and sortId→codepage mapping tables. - Removed the legacy
TdsEnums.CODE_PAGE_FROM_SORT_IDtable.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsParser.cs | Switches codepage resolution to use the new hardcoded mapping helper. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/TdsEnums.cs | Removes the old sort-id→codepage mapping table. |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/LocalesHelper.cs | Adds hardcoded LCID/sortId mapping tables and lookup logic. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4212 +/- ##
==========================================
- Coverage 65.96% 64.10% -1.87%
==========================================
Files 275 272 -3
Lines 42993 65961 +22968
==========================================
+ Hits 28361 42285 +13924
- Misses 14632 23676 +9044
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
benrr101
left a comment
There was a problem hiding this comment.
Much better - I appreciate it! :D
Description
This builds on #4051 and #584 by changing the mapping from collations/sort IDs to code pages, replacing it with a static list of valid mappings rather than a call to
CultureInfo.GetCultureInfo.This is performance-neutral, but it also brings us closer to supporting globalization invariant mode.
I've simply lifted the full set of mappings from the JDBC driver (here) and confirmed that the existing collation transcoding tests continue to pass on the newest SQL 2025 CU and on a SQL Azure instance.
Issues
Tested by #4051.
Addresses a backlog item listed in #584.
Contributes to #3742.
Testing
Manual testing against an on-prem SQL2025 instance and against a SQL Azure instance passes.