Is your feature request related to a problem? Please describe.
SqlClient supports CLR-defined user-defined types via runtime type resolution using assembly-qualified names supplied by SQL Server. These assembly-qualified names are defined in MS-TDS, section 2.2.5.5.2.
This assembly-qualified name is resolved to a concrete Type via Type.GetType, and values of that type are then serialized and deserialized using one of two methods. The method is selected based upon an attribute attached to the type definition.
Method 1: custom serialization
This requires the type to implement IBinarySerialize. When reading an object from SQL Server, the type's default constructor is invoked, and SqlClient calls IBinarySerialize.Read. When passing it as a parameter, SqlClient calls IBinarySerialize.Write on the object.
SqlClient implements this using the BinarySerializeSerializer class.
Method 2: native serialization
When this is used, SqlClient follows the MS-SSCLRT protocol specification, adhering to the constraints in section 2.3.1.
The serializer for this is NormalizedSerializer. This uses a BinaryOrderedUdtNormalizer instance to get each instance field, sort them by Marshal.OffsetOf and write the field values.
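As a rough illustration of the field-ordering step, instance fields can be discovered by reflection and sorted by their marshalled offset. This is only a sketch, not SqlClient's actual BinaryOrderedUdtNormalizer code; the SamplePoint struct, the FieldOrderSketch class and the GetFieldsInLayoutOrder method are names made up for this example.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.InteropServices;

// Sample struct for the sketch (hypothetical); structs use sequential layout by default.
public struct SamplePoint
{
    public int X;
    public int Y;
}

public static class FieldOrderSketch
{
    // Hypothetical helper: returns every instance field, ordered by Marshal.OffsetOf.
    public static FieldInfo[] GetFieldsInLayoutOrder(Type type)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        return type.GetFields(flags)
            .OrderBy(field => Marshal.OffsetOf(type, field.Name).ToInt64())
            .ToArray();
    }
}
```

Both the GetFields call and Marshal.OffsetOf rely on field metadata being present at runtime, which is why this path is sensitive to fields being trimmed.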
Problem
The problem is that neither of these methods is compatible with IL trimming. Although the types have been annotated, the underlying tension is protocol-level: there's no way for the IL trimmer to know what types it should preserve because this type information comes from the database at runtime.
This tension materialises at a few key locations:
- If the CLR UDT is never statically referenced then the type may be removed during trimming, preventing SqlClient from resolving it from SQL Server's type metadata via Type.GetType.
- If a field on the object is not statically observed as used, it could be trimmed away completely. Native serialization will silently serialize an instance of an incomplete type.
- Similarly, if the parameterless constructor on an otherwise-used CLR UDT is never statically visible, it could be trimmed away. The type metadata may exist, but when SqlClient calls Activator.CreateInstance, the constructor won't be found and the type can't be instantiated.
Of these, the issue with certain fields being trimmed is the most serious. A trimmed type or constructor will throw an exception, but silently serialising an incomplete type would result in corrupt data - a situation which SQL Server doesn't verify.
The fact that the UDT's assembly-qualified name originates from SQL Server metadata also means that there's no call site which we can annotate to preserve these types - and even if one did exist, the annotations would only apply to the single type specified. This means they wouldn't cover more complex scenarios where a UDT can consist of multiple nested structs.
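The type and constructor failure points above sit on the same resolution path. A minimal sketch of that path follows, assuming a hypothetical helper named CreateUdtInstance; SqlClient's real code is structured differently, and only the Type.GetType and Activator.CreateInstance calls are taken from the description above.

```csharp
using System;

public static class UdtResolutionSketch
{
    // Hypothetical helper mirroring the Type.GetType + Activator.CreateInstance steps.
    // Returns null when the type cannot be resolved.
    public static object CreateUdtInstance(string assemblyQualifiedName)
    {
        // In a trimmed application this yields null if the type was removed.
        Type type = Type.GetType(assemblyQualifiedName, throwOnError: false);
        if (type == null)
        {
            return null;
        }

        // Throws MissingMethodException if a class has no parameterless
        // constructor available (for example, because it was trimmed away).
        return Activator.CreateInstance(type);
    }
}
```

Neither call takes a compile-time type argument here, so the trimmer has nothing to analyse; the name arrives as a plain string at runtime.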
Describe the solution you'd like
I'm proposing a set of APIs which clients can call to use CLR-defined UDTs safely:
using System.Diagnostics.CodeAnalysis;

namespace Microsoft.Data.SqlClient;

public sealed class SqlUserDefinedTypeRegistration
{
    private const DynamicallyAccessedMemberTypes PreservedMembers =
        DynamicallyAccessedMemberTypes.PublicParameterlessConstructor | DynamicallyAccessedMemberTypes.Interfaces
        | DynamicallyAccessedMemberTypes.PublicFields | DynamicallyAccessedMemberTypes.NonPublicFields
        | DynamicallyAccessedMemberTypes.PublicMethods | DynamicallyAccessedMemberTypes.NonPublicMethods;

    private SqlUserDefinedTypeRegistration() { }

    // Declares preservation requirements for a type which is used as a SQL Server CLR UDT.
    public static SqlUserDefinedTypeRegistration Register<[DynamicallyAccessedMembers(PreservedMembers)] T>();

    // Declares an additional CLR type dependency of the registered CLR UDT.
    public SqlUserDefinedTypeRegistration WithDependency<[DynamicallyAccessedMembers(PreservedMembers)] T>();
}
This API surface wouldn't replace the existing CLR UDT resolution surface. Its only purpose is to declare CLR UDT preservation requirements in a form which can be observed and respected by the IL linker at build/publish time. It doesn't do anything at runtime, and would be safe to run from a .NET Framework (or a non-trimmed .NET Core) application.
Client applications would be expected to invoke these methods on a code path which the IL linker sees as reachable (such as Main() or a module initializer) in order to preserve the UDTs they want to read:
// Return value discarded - Address doesn't need any additional field types.
SqlUserDefinedTypeRegistration.Register<Address>();
// Use the returned SqlUserDefinedTypeRegistration to record the types of nested fields.
SqlUserDefinedTypeRegistration.Register<LineIntersection>()
    .WithDependency<Line>()
    .WithDependency<Point>();
[SqlUserDefinedType(Format.UserDefined, IsByteOrdered = false, MaxByteSize = 500)]
public class Address : IBinarySerialize
{
    public string Line1;
    public string Line2;

    // Bodies omitted for brevity.
    void IBinarySerialize.Read(BinaryReader r) { }
    void IBinarySerialize.Write(BinaryWriter w) { }
}

public struct Point
{
    public int X;
    public int Y;
}

public struct Line
{
    public Point Start;
    public Point End;
}

[SqlUserDefinedType(Format.Native)]
public struct LineIntersection
{
    public Line FirstLine;
    public Line SecondLine;
}
Clients call Register to register the "top-level" UDT - the one directly registered within SQL Server via CREATE TYPE. If that top-level UDT has fields which are also struct types, those types are registered as dependencies via calls to WithDependency. The second step is necessary because DynamicallyAccessedMembers only applies preservation requirements to the type we attach it to. Preserving the fields of Line doesn't automatically preserve the constructor, fields and methods of Point - and the linker only sees compile-time annotations so any runtime discovery method (such as recursively discovering fields) is too late.
It's worth noting that this lack of recursion means that unusually complex UDTs may need multiple WithDependency calls - one for the type of a field on the UDT, another for a field within that type. The example demonstrates this principle with LineIntersection, Line and Point.
We would preserve the following member types:
- Parameterless constructor, since this is necessary to support Activator.CreateInstance;
- Interfaces, since this determines which (de)serialization method to use;
- Public and non-public fields, because these are the fields which native serialization operates on;
- Public and non-public methods, to preserve IBinarySerialize.Read and IBinarySerialize.Write (whether they're implemented implicitly or explicitly).
This is slightly coarser than we'd strictly need - but DynamicallyAccessedMembers doesn't provide a way to only preserve specific interface implementations. We don't force the trimmer to preserve properties here: IBinarySerialize-based implementations should reference them anyway, and native serialization only operates over fields.
Finally, although we preserve IBinarySerialize.Read / .Write, client applications will be responsible for making sure that those two methods are only performing operations which are compatible with trimming.
At the point of implementation, I would expect that Register and WithDependency would flow typeof(T) into a private method with an identically-annotated Type parameter. This would enable the linker to observe the preservation requirements.
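A minimal sketch of how that flow might look is below. It is an assumption about one possible implementation rather than a committed design: the Preserve method name, the shared s_instance field and the SamplePoint struct are hypothetical, and whether an empty method body is sufficient for the trimmer would need confirming.

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

// Sample UDT-like struct used only to exercise the sketch (hypothetical).
public struct SamplePoint
{
    public int X;
    public int Y;
}

public sealed class SqlUserDefinedTypeRegistration
{
    private const DynamicallyAccessedMemberTypes PreservedMembers =
        DynamicallyAccessedMemberTypes.PublicParameterlessConstructor | DynamicallyAccessedMemberTypes.Interfaces
        | DynamicallyAccessedMemberTypes.PublicFields | DynamicallyAccessedMemberTypes.NonPublicFields
        | DynamicallyAccessedMemberTypes.PublicMethods | DynamicallyAccessedMemberTypes.NonPublicMethods;

    // Assumption: the registration carries no state, so one shared instance is enough.
    private static readonly SqlUserDefinedTypeRegistration s_instance = new SqlUserDefinedTypeRegistration();

    private SqlUserDefinedTypeRegistration() { }

    public static SqlUserDefinedTypeRegistration Register<[DynamicallyAccessedMembers(PreservedMembers)] T>()
    {
        Preserve(typeof(T));
        return s_instance;
    }

    public SqlUserDefinedTypeRegistration WithDependency<[DynamicallyAccessedMembers(PreservedMembers)] T>()
    {
        Preserve(typeof(T));
        return this;
    }

    // Hypothetical private sink: the annotation on the parameter matches the one
    // on T, so the requirement flows without analysis warnings. No runtime work.
    private static void Preserve([DynamicallyAccessedMembers(PreservedMembers)] Type type)
    {
    }
}
```

Because the annotations on T and on the Type parameter are identical, passing typeof(T) to Preserve should not raise a trim analysis warning, and the methods remain no-ops at runtime.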
Describe alternatives you've considered
Provide an assembly-scoped attribute
We're using SqlUserDefinedTypeRegistration to perform type declaration tasks which are nothing more than signals to the IL trimmer. This sort of declarative task would typically be better as an attribute on the assembly.
I've ruled this out because the IL trimmer won't perform data flow analysis into this attribute's constructor parameters at link time.
Drop WithDependency method
There's an argument to be made that we could simply call Register multiple times. This would preserve the types, but it leaves two semantic gaps open.
- When using native serialization, it's possible to nest structs. SqlClient only respects the serialization method specified on the top-level type - so a struct which implements IBinarySerialize could be serialized in two different ways depending on whether we serialize the struct directly, or as a member of a struct which uses native serialization. It's helpful to have a way to expose this semantic relationship in code.
- If a client registers each struct in the UDT independently then they're treating it as a set of types which should be preserved during UDT (de)serialization. They're essentially configuring an internal implementation detail of SqlClient. Having a Register / WithDependency pair means that clients are registering a specific UDT for preservation by describing its serialization graph, and allowing SqlClient to encapsulate its own implementation.
For these reasons, the API proposal provides a way to describe this relationship between types in code.
// Describe a type TopLevelStruct with a field of type MemberStructType.
SqlUserDefinedTypeRegistration.Register<TopLevelStruct>()
    .WithDependency<MemberStructType>();
// Describe TopLevelStruct and MemberStructType as independent UDTs.
SqlUserDefinedTypeRegistration.Register<TopLevelStruct>();
SqlUserDefinedTypeRegistration.Register<MemberStructType>();
Move API to SqlConnection
I don't want to prematurely dismiss this particular alternative, so this might need some discussion. The two methods I'm describing are linker-level preservation requirements; I felt that this was unrelated to SqlConnection's functionality, so chose to place the methods on a new type.
It's worth noting that SqlConnection already contains a RegisterColumnEncryptionKeyStoreProviders method and Register would have a somewhat similar purpose. The key difference is that RegisterColumnEncryptionKeyStoreProviders operates at runtime, while Register is a signal to the linker, and I found this a compelling enough reason to treat them differently.
Registry pattern
I considered exposing a SqlUserDefinedTypesRegistry object with Register<T> and WithDependency<T> methods, then providing a UserDefinedTypes property on SqlCommand. This would mean that we could also front-load the process of getting a list of fields and building serializers.
This is attractive because it aligns closely to SqlClient's existing process-wide mapping from Type to serializer, but the registry's semantics would need to be specific: it would contain every UDT expected from executing a particular command. If UserDefinedTypes is null, we'd expect a (potentially incomplete) fallback to the current process-wide Type/serializer mapping; if UserDefinedTypes is non-null and the assembly-qualified name doesn't exist in it, an exception would be thrown.
I decided against it because I think the design would be overkill. Our core requirement is to make sure that UDTs we care about (and their relevant metadata and associated types) are preserved by the IL trimmer by indicating that they may be referenced dynamically by SQL Server. I don't personally see the need to expand the API surface with a full registry pattern for that.
There were also additional considerations around whether we'd want to be able to unregister types, and what should happen if someone adds an entry to a registry while a command is executing (for example, between a successful call to SqlCommand.ExecuteReader and a call to SqlDataReader.GetFieldValue<T>). While both of these are solvable, they introduce complexity and make the idea of SqlUserDefinedTypesRegistry harder to justify.
Client-specified DynamicDependency attributes
Clients could technically achieve a similar outcome by using a [DynamicDependency] attribute. However, this tightly couples each client to the implementation details of SqlClient's UDT serialization and deserialization logic, and requires them to update their own annotations if our implementation changes. In an environment which uses IL trimming with DynamicDependency, the type of reflection which SqlClient performs becomes part of its public contract with client applications.
The proposed API encapsulates this reflection contract within SqlClient, pulling ownership of it back within the library.
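For comparison, a client-side equivalent might look roughly like the following. The Point and Line structs repeat the earlier example, the TrimmingHints class and its Preserve method are hypothetical, and the member-type flags simply mirror the proposal's PreservedMembers constant.

```csharp
using System.Diagnostics.CodeAnalysis;

public struct Point
{
    public int X;
    public int Y;
}

public struct Line
{
    public Point Start;
    public Point End;
}

public static class TrimmingHints
{
    // The client has to know which member kinds SqlClient reflects over.
    private const DynamicallyAccessedMemberTypes UdtMembers =
        DynamicallyAccessedMemberTypes.PublicParameterlessConstructor | DynamicallyAccessedMemberTypes.Interfaces
        | DynamicallyAccessedMemberTypes.PublicFields | DynamicallyAccessedMemberTypes.NonPublicFields
        | DynamicallyAccessedMemberTypes.PublicMethods | DynamicallyAccessedMemberTypes.NonPublicMethods;

    // Each attribute keeps the listed member kinds on one type for as long as
    // Preserve itself is kept, so it must be called from a reachable code path.
    [DynamicDependency(UdtMembers, typeof(Line))]
    [DynamicDependency(UdtMembers, typeof(Point))]
    public static void Preserve()
    {
    }
}
```

If SqlClient later started reflecting over, say, properties, every client carrying a copy of these flags would have to change it, which is the coupling described above.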
Additional context
Although I've described this as primarily enabling SqlClient to operate in IL-trimmed environments, it's also worth noting that Native AOT environments have very similar constraints - we can't assume that we'll be able to dynamically discover CLR UDTs at runtime. In both cases, this API proposal will cause the linker to preserve the type metadata, its members and its default constructor, enabling Type.GetType, Activator.CreateInstance and reflection-based field access to function.
This API would address one of the major blockers for UDT support in trimmed/Native AOT environments; it's thus part of #1947.