Feature | Support SQL Graph column aliases in SqlBulkCopy#3677
Conversation
Aliases are: $to_id, $from_id, $node_id, $edge_id
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
There was a problem hiding this comment.
Pull Request Overview
This PR implements support for SQL Graph column aliases in SqlBulkCopy, allowing users to map data to SQL Server's pseudo-columns ($edge_id, $from_id, $to_id, $node_id) which are synthesized in SQL Graph tables but map to dynamically-named physical columns internally.
- Adds column alias resolution logic that maps virtual column names to their physical counterparts
- Implements backward compatibility by prioritizing physical columns when they have the same name as aliases
- Updates the initial query to retrieve column alias mappings from SQL Server metadata
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| SqlGraphTables.cs | Adds comprehensive test coverage for column alias functionality and edge cases |
| CopyAllFromReader.cs | Updates expected statistics values to account for additional database queries |
| SqlBulkCopyColumnMapping.cs | Introduces MappedDestinationColumn property to handle alias-to-physical column mapping |
| SqlBulkCopy.cs | Implements core column alias resolution logic and updates metadata query |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3677 +/- ##
==========================================
- Coverage 65.99% 65.92% -0.07%
==========================================
Files 276 271 -5
Lines 42951 66123 +23172
==========================================
+ Hits 28344 43593 +15249
- Misses 14607 22530 +7923
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
This pull request has been marked as stale due to inactivity for more than 30 days. If you would like to keep this pull request open, please provide an update or respond to any comments. Otherwise, it will be closed automatically in 7 days. |
|
This PR is not stale (although it should only be merged after #3719.) |
|
This pull request has been marked as stale due to inactivity for more than 30 days. If you would like to keep this pull request open, please provide an update or respond to any comments. Otherwise, it will be closed automatically in 7 days. |
|
This PR is not stale (although it should only be merged after #3719.) |
|
This pull request has been marked as stale due to inactivity for more than 30 days. If you would like to keep this pull request open, please provide an update or respond to any comments. Otherwise, it will be closed automatically in 7 days. |
|
This PR is not stale. |
|
This pull request has been marked as stale due to inactivity for more than 30 days. If you would like to keep this pull request open, please provide an update or respond to any comments. Otherwise, it will be closed automatically in 7 days. |
|
Thanks @paulmedynski. The approach needed to vary slightly as a result of considering/testing on SQL 2016 and Azure Synapse, but I've merged main and made my changes. |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
There was a problem hiding this comment.
@edwardneal can you take a look at the test failure for Mac for AdapterTest.FillShouldAllowRetryLogicProviderToBeInvoked and see if it's related to your changes? It appears to be failing consistently for last two runs.
paulmedynski
left a comment
There was a problem hiding this comment.
Just one question about one of the tests.
Head branch was pushed to by a user without write access
|
Thanks both. @apoorvdeshmukh I've looked, but I'm fairly confident it's unrelated. The test code (link) references SqlDataAdapter and is completely timing-based - in this case, it sets a one second timeout and tells the server to wait for two seconds. |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
Description
This fixes an older issue requesting support for using SQL Graph column aliases as destinations when mapping columns in
SqlBulkCopy.SQL Server synthesises four pseudo-columns in SQL Graph:
$edge_id,$from_id,$to_id,$node_id. These appear in node and edge tables. It's possible to query these columns, but internally they map to dynamically-named physical columns. Bulk copies only accept physical column names, so this PR introduces the concept of column aliases to bridge that gap when importing data in a similar way to SQL Server (running statements against the destination table manually.)Column aliases are designed to be additive: if a physical column named
$to_idalready exists in a SQL Graph table, the physical column will always be used as the destination. For backwards compatibility purposes, if another source field has already been mapped to a column alias' physical column name, other source fields which might be mapped to the column alias aren't re-mapped;SqlBulkCopywill assume that the user has already done the work needed to get the physical column names and won't trample it.The end result should be as defined in
SqlGraphTables.WriteToServer_CopyToAliasedColumnName_Succeeds: mapping from a field in a DataTable/etc. to$to_idor$from_idwill "just work".The only difference between the way that SQL Server handles these column aliases and the way we handle them is in an edge case: a user creates a SQL Graph table with a column which has an identical name to the column alias. In that specific case, a user can access the column alias by querying
$to_idand the physical column by querying[$to_id]. On the other hand,SqlBulkCopywill handle this case by always mapping the data to the physical column. This difference is so that people presently mapping fields to$to_idaren't forced to change their code so that it maps to[$to_id]. I don't think this is a common use case though - I'd struggle to imagine a situation where anyone needs to do this.Issues
Fixes #123.
Testing
All existing tests pass. I've added two more which check the use cases I can think of: