Releases: data-solution-automation-engine/data-solution-automation-metadata-schema
Data Solution Automation Schema v2.1.0
This release renames the project from Data Warehouse Automation to Data Solution Automation and ships the substantial schema improvements that have accumulated since v2.0.3.
The C# library is now published as DataSolutionAutomation on NuGet. The previous DataWarehouseAutomation package is deprecated and remains available for backward compatibility.
Project rename
- C# namespace: DataWarehouseAutomation.* → DataSolutionAutomation.*; DwaModel → DsaModel.
- NuGet package id: DataWarehouseAutomation → DataSolutionAutomation.
- JSON Schema files: interfaceDataWarehouseAutomationMetadataV*.json → interfaceDataSolutionAutomationMetadataV*.json.
- Solution and project folders renamed; RunDwhAutomation → RunDsaAutomation.
- GitHub repository renamed to data-solution-automation-metadata-schema (old URLs auto-redirect).
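For downstream projects, the namespace and package id renames listed above can be applied mechanically. The following is a minimal sketch, not part of the release: the function name rename_namespace is hypothetical, and it assumes the old names appear in source text as whole identifiers (as in using statements or a PackageReference Include attribute).

```python
import re

# Matches the old root namespace as a whole identifier, optionally followed
# by the old model sub-namespace (".DwaModel").
_OLD_NAMESPACE = re.compile(r"\bDataWarehouseAutomation(\.DwaModel)?\b")


def rename_namespace(source: str) -> str:
    """Rewrite old namespace / package id references to the v2.1.0 names.

    Hypothetical helper for illustration only.
    """
    def _replace(match):
        # Keep the model sub-namespace when present, using its new name.
        suffix = ".DsaModel" if match.group(1) else ""
        return "DataSolutionAutomation" + suffix

    return _OLD_NAMESPACE.sub(_replace, source)


if __name__ == "__main__":
    legacy = (
        "using DataWarehouseAutomation.DwaModel;\n"
        '<PackageReference Include="DataWarehouseAutomation" />'
    )
    print(rename_namespace(legacy))
```

Because the pattern requires a word boundary on both sides, file names such as interfaceDataWarehouseAutomationMetadataV2_0.json are left untouched; renaming schema files is a separate step.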
New features
- JSON Schema v2.1 (interfaceDataSolutionAutomationMetadataV2_1.json) published alongside v2.0 for backward compatibility.
- Business Key Definitions are now ordered. Components are an ordered list of references to data items (BusinessKeyComponent), with explicit OrdinalPosition. Backward-compatible reading of the legacy businessKeyComponentMappings form is retained.
- Relationships and cardinality: relatedDataObjects is replaced by a richer relationships collection of typed Relationship objects with Cardinality (object form with fromRange/toRange) and an optional RelatedDataObjectId for ID-only references.
- Template references: new TemplateMapping type plumbed through DataObjectMappingList, DataObjectMapping, DataObject, and DataConnection. This lets metadata travel with the templates that generate code from it.
- Inlined queries: queryCode/queryLanguage now sit directly on DataObject and DataItem, retiring the separate DataObjectQuery/DataItemQuery types.
- DataItemMappingRef: lightweight identifier-based references to data item mappings for use inside Relationship (avoids re-embedding the full mapping).
- New property additions:
  - DataClassification: group (e.g. "Solution Layer", "Logical", "Physical") and scope (where the classification applies).
  - DataItem: isNullable.
  - DataItemMapping: name and notes.
  - DataObjectMappingList: id, notes, templateMappings.
  - BusinessKeyDefinition: ordinalPosition.
- New {{hasClassification}} Handlebars helper for evaluating whether a classification list contains a given value.
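To illustrate the ordered business key components and the legacy fallback described above, here is a minimal sketch of how a metadata consumer might read them. The JSON shape is an assumption for illustration: the property names businessKeyComponents, businessKeyComponentMappings and ordinalPosition come from these notes, but the camelCase spelling of ordinalPosition on a component and everything else about the component objects is hypothetical, as is the function name.

```python
from typing import Any, Dict, List


def ordered_key_components(definition: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return business key components in their intended order.

    Assumed shape (not taken from the published schema): each component is a
    JSON object that may carry an integer "ordinalPosition".
    """
    components = definition.get("businessKeyComponents")
    if components:
        # Sort by explicit position. sorted() is stable, so components
        # without a position keep their document order at the end.
        return sorted(
            components, key=lambda c: c.get("ordinalPosition", float("inf"))
        )
    # Legacy form: no explicit ordering, so document order is all we have.
    return list(definition.get("businessKeyComponentMappings", []))


if __name__ == "__main__":
    definition = {
        "businessKeyComponents": [
            {"ordinalPosition": 2, "name": "OrderLine"},
            {"ordinalPosition": 1, "name": "OrderId"},
        ]
    }
    print([c["name"] for c in ordered_key_components(definition)])
```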
Breaking changes
- C# namespace changed (DataWarehouseAutomation → DataSolutionAutomation); downstream code must update using statements and <PackageReference>.
- DataObjectQuery and DataItemQuery types removed; set queryCode directly on the base DataObject/DataItem instead.
- IDataObject, IDataItem, IMetadata interfaces removed; they existed only to support the polymorphic Query types.
- relatedDataObjects is now relationships on DataObjectMapping.
- businessKeyComponentMappings is retained for backward-compatible reading but new metadata should use businessKeyComponents.
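Before upgrading, it can help to locate where existing metadata files still use the property names affected by the changes above. The sketch below walks an already-parsed JSON document and reports each legacy property with its suggested replacement. It is an illustration only: the function name and the path notation are hypothetical, and it only detects property names; it does not convert content to the new Relationship or BusinessKeyComponent shapes.

```python
from typing import Any, Iterator, Tuple

# Legacy property name -> v2.1 replacement, as listed in these release notes.
LEGACY_PROPERTIES = {
    "relatedDataObjects": "relationships",
    "businessKeyComponentMappings": "businessKeyComponents",
}


def find_legacy_properties(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Yield (path, replacement) for every legacy property found under node."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}"
            if key in LEGACY_PROPERTIES:
                yield child_path, LEGACY_PROPERTIES[key]
            # Keep descending: legacy names can be nested inside each other.
            yield from find_legacy_properties(value, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_legacy_properties(item, f"{path}[{index}]")


if __name__ == "__main__":
    # Hypothetical metadata fragment, for demonstration only.
    metadata = {"dataObjectMappings": [{"name": "Example", "relatedDataObjects": []}]}
    for found_path, replacement in find_legacy_properties(metadata):
        print(f"{found_path}: use '{replacement}' instead")
```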
Documentation
- New docs site built with Astro and Starlight, replacing the previous DocFX setup. Builds on Ubuntu via GitHub Actions, deploys to GitHub Pages: https://data-solution-automation-engine.github.io/data-solution-automation-metadata-schema/.
- Schema reference pages are auto-generated from the C# class library (the canonical source of truth) at build time.
- Overview, getting-started, FAQ, and Handlebars helper documentation migrated and refreshed.
Tooling
- License metadata (LGPL-3.0-or-later) is now declared in the NuGet package.
- The package README is now sourced from the repository root README.
- Validation samples cleaned up and re-validated against the v2.1 schema.
Additional context: blog post on v2.1.0 schema improvements.
Data Warehouse Automation Schema v2.0.3
Available on NuGet: https://www.nuget.org/packages/DataWarehouseAutomation. The release version now matches the NuGet package version (v2.0.3).
New features
- Upgraded the code to .NET 8.0.
- A new HasClassification Handlebars extension / helper has been added to allow true/false checks on whether a certain classification is present in the selected objects. The corresponding examples have been updated to showcase this.
- The TargetDataItem can now be any IDataItem, meaning it can be either logic or a column, the same as the source data items. The earlier restriction had no real justification; removing it makes it possible to map logic into logic.
- The TargetDataObject can now be any IDataObject, meaning it can be either logic or a column, the same as the source data objects.
- Related Data Objects are now also of type IDataObject, so they are no longer limited to files or tables.
- Data Object Mappings can now have a list of data items irrespective of the data item mappings. This allows logic / transformation components to be defined in a reusable manner.
- A Data Object Query can now have a list of IDataItems as well. Same as above, this allows columns to be specified for transformation objects.
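The true/false check performed by the HasClassification helper can be pictured with a small sketch. This is not the helper's implementation; it only mirrors the idea in Python. It assumes each entry in a classifications list is an object with a "classification" property and that matching is exact and case-sensitive; both are assumptions, and the function name is hypothetical.

```python
from typing import Any, Dict, List, Optional


def has_classification(
    classifications: Optional[List[Dict[str, Any]]], expected: str
) -> bool:
    """Return True when any entry carries the expected classification value.

    Assumed shape: [{"classification": "SomeValue"}, ...]. A missing or
    empty list simply yields False.
    """
    return any(
        entry.get("classification") == expected
        for entry in (classifications or [])
    )


if __name__ == "__main__":
    # Hypothetical classification values, for demonstration only.
    sample = [{"classification": "Source"}, {"classification": "Persistent"}]
    print(has_classification(sample, "Source"))
    print(has_classification(sample, "Presentation"))
```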
Breaking changes
- N/A
Bug fixes
- Tidying up of directory structures, moving object model into dedicated space (DwaModel)
Data Warehouse Automation Schema v2.0
New features
- Documentation page, which is rebuilt on any commit into 'main' and published to GitHub Pages: https://data-solution-automation-engine.github.io/data-warehouse-automation-metadata-schema/.
- Introduction of various interfaces in the object model, to enforce standards and make it easier to work with different types of objects. Conceptually this replaces the dynamic types.
- Addition of various Handlebars helpers.
- All samples / templates updated and refreshed.
A core change is that the 'Data Query' concept has been split out into a query at Data Item level (DataItemQuery) and one at Data Object level (DataObjectQuery). Both types share the same interface, and have various shared properties.
However, they also have their own properties that are only relevant for their 'level'. For example, a connection applies to the Data Object Query, but not to the Data Item Query. Similarly, an ordinal position applies to the Data Item Query, but not to the Data Object Query.
Breaking changes
As part of introducing the interfaces, various minor inconsistencies have been corrected, which will have an impact on the templates used. The templates and samples have been corrected, but any projects using this latest version have to be mindful that some property names have changed.
Data Object Mapping
- 'name' has been made mandatory
- 'mappingClassifications' has been renamed to 'classifications'
- 'businessKeys' has been renamed to 'businessKeyDefinitions'
Data Classification
- The Classification object/class has been renamed to 'DataClassification'
Extension
- 'description' has been renamed to 'notes'
Data Object
- 'dataObjectName' has been renamed to 'name'
- 'dataObjectConnection' has been renamed to 'dataConnection'
- 'dataObjectClassifications' has been renamed to 'classifications'
Data Query
- Split into 'DataItemQuery' and 'DataObjectQuery'
- 'dataQueryName' has been renamed to 'name'
- 'dataQueryCode' has been renamed to 'queryCode'
- 'dataQueryLanguage' has been renamed to 'queryLanguage'
Data Connection
- 'connectionstring' has been renamed to 'name'
Business Key Definition
- 'businessKeyClassification' has been renamed to 'classifications'
Data Item
- 'dataItemClassification' has been renamed to 'classifications'
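The property renames listed above can be applied to existing metadata in one recursive pass. The sketch below is a hypothetical migration helper, not something shipped with the release. It covers only renames whose old name is unambiguous wherever it appears. The 'description' → 'notes' (Extension) and 'connectionstring' → 'name' (Data Connection) renames are deliberately left out, because they depend on the enclosing object and are better reviewed by hand.

```python
from typing import Any

# Old property name -> new property name, taken from the list above.
V2_RENAMES = {
    "mappingClassifications": "classifications",
    "businessKeys": "businessKeyDefinitions",
    "dataObjectName": "name",
    "dataObjectConnection": "dataConnection",
    "dataObjectClassifications": "classifications",
    "dataQueryName": "name",
    "dataQueryCode": "queryCode",
    "dataQueryLanguage": "queryLanguage",
    "businessKeyClassification": "classifications",
    "dataItemClassification": "classifications",
}


def upgrade_property_names(node: Any) -> Any:
    """Return a copy of parsed JSON with pre-v2.0 property names renamed."""
    if isinstance(node, dict):
        return {
            V2_RENAMES.get(key, key): upgrade_property_names(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [upgrade_property_names(item) for item in node]
    return node  # Scalars are returned unchanged.


if __name__ == "__main__":
    # Hypothetical pre-v2.0 fragment, for demonstration only.
    old = {"dataObjectName": "Customer", "dataObjectClassifications": []}
    print(upgrade_property_names(old))
```

The split of Data Query into DataItemQuery and DataObjectQuery is a structural change and is not handled by a key rename.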
Bug fixes
N/A
Data Warehouse Automation Interface v1.3.4
New features
- Published as NuGet package https://www.nuget.org/packages/DataWarehouseAutomation/1.3.4.
- Additional code generation helper functions (#exists, #targetDataItemExists).
- Various new samples for complex transformations.
- Targeting .NET 7.
- The validation feature has been updated and moved into the class library (it was previously a separate executable).
Breaking changes
N/A
Bug fixes
N/A
Data Warehouse Automation Interface v1.3.1
- Added nullable string id fields, changing from integer to string wherever applicable. The idea is that each segment can be uniquely identified, if the developer chooses to do so. Because the id can be a string, a GUID, or a number, it is stored as a string type. This means that any existing integer id fields must be re-generated as string values.
- Moved away from Newtonsoft.Json (Json.NET) in favor of System.Text.Json where possible, for future-proofing. The exception is the validator (testing project), which still uses Json.NET.
- Updated RunDwhAutomation command line with the latest versions of the libraries.
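Re-generating integer ids as strings, as required by the first item above, can be done in a single pass over existing metadata. The following is a minimal sketch under the assumption that the fields in question are literally named "id"; the function name is hypothetical and the actual field names should be checked against the schema.

```python
from typing import Any


def stringify_ids(node: Any) -> Any:
    """Return a copy of parsed JSON where integer "id" values become strings."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_int_id = (
                key == "id"
                and isinstance(value, int)
                and not isinstance(value, bool)
            )
            result[key] = str(value) if is_int_id else stringify_ids(value)
        return result
    if isinstance(node, list):
        return [stringify_ids(item) for item in node]
    return node  # Other scalars, including existing string ids, are unchanged.


if __name__ == "__main__":
    # Hypothetical fragment, for demonstration only.
    print(stringify_ids({"id": 42, "items": [{"id": 7}, {"id": "abc"}]}))
```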
Data Warehouse Automation Interface v1.3
The major change in this release is the addition of the dataObject property at dataItem level. This means that for any attribute or column, you can specify which (parent) data object it belongs to. This is especially useful in dataItemMappings, where columns from multiple objects are sometimes mapped to a single target.
Having the dataObject information available will allow you to construct fully-qualified names for generated output code.
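As an illustration of constructing a qualified name from this information, the sketch below combines the parent data object name with the data item name. The shape is assumed for the example: a data item object with a "name" and an optional "dataObject" that itself has a "name". The function name and the dot separator are hypothetical choices, and quoting or schema prefixes are left out.

```python
from typing import Any, Dict


def qualified_name(data_item: Dict[str, Any], separator: str = ".") -> str:
    """Build 'ParentObject.ItemName' when the parent data object is known.

    Falls back to the bare item name when no dataObject is specified.
    """
    item_name = data_item["name"]
    parent = data_item.get("dataObject") or {}
    parent_name = parent.get("name")
    if parent_name:
        return f"{parent_name}{separator}{item_name}"
    return item_name


if __name__ == "__main__":
    # Hypothetical data items, for demonstration only.
    print(qualified_name({"name": "CustomerId", "dataObject": {"name": "Customer"}}))
    print(qualified_name({"name": "CustomerId"}))
```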
Data Warehouse Automation Interface v1.2.2
Another convenience release to address some minor bugs and update libraries. No new functionality added.
The Handlebars libraries have been updated, and because of this some of the helper functions needed a change. This causes some smaller changes in the templates.
- Custom block helpers (e.g. stringcompare, stringdiff) need a hash (#) for the name, so this is now {{#stringcompare}} instead of the former {{stringcompare}}. This is required by the new Handlebars update and makes it consistent with built-in helpers.
- {{#if}} blocks need the name in double quotes. So {{#if filterCriterion}} must become {{#if "filterCriterion"}}. The reason for this is that deserialisation using JObject (as opposed to loading straight into the DataWarehouseAutomation class) requires double quotes around the object names, otherwise these will not be found.
- Testing of all custom Handlebars functions.
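Existing templates can be updated for the quoting change with a simple text substitution. The sketch below is a hypothetical helper, not part of the release, and assumes the intended target form is {{#if "name"}} with the name wrapped in a matching pair of double quotes. It only handles #if blocks whose argument is a single bare identifier (optionally dotted); anything already quoted or more complex is left as-is for manual review.

```python
import re

# An #if opening tag whose single argument is a bare (optionally dotted) name.
_BARE_IF = re.compile(r"\{\{#if\s+([A-Za-z_][\w.]*)\s*\}\}")


def quote_if_arguments(template: str) -> str:
    """Wrap bare #if arguments in double quotes; leave other tags untouched."""
    return _BARE_IF.sub(r'{{#if "\1"}}', template)


if __name__ == "__main__":
    old_template = "{{#if filterCriterion}}WHERE {{filterCriterion}}{{/if}}"
    print(quote_if_arguments(old_template))
```

Adding the hash to custom block helpers (for example {{#stringcompare}}) is not covered here, because it also involves checking the matching closing tags.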
Data Warehouse Automation Interface v1.2.1
Very minor updates including additional samples and library updates.
Also added some missing extensions to the templating engine (Handlebars).
Data Warehouse Automation Interface v1.2
The most notable change in this update is a switch to a different deserialisation mode for the examples and the command line utility. Using JObjects does not require the JSON files to exactly match the intended class structure in order to load / deserialise them. In fact, this way any JSON file can be deserialised and matched to whatever pattern.
Other changes:
- Renamed dataObjectClassification to dataObjectClassifications to be consistent with the list/collection nature of the property and with other similar approaches throughout the class library.
- Some lists have been updated to List to allow for multiple types/classes to be added to the list.
- Free form sample added (e.g. different structure than class library).
- Added Command Line Interface / utility so code generation can be run from command line.
Data Warehouse Automation Interface v1.2
Changes compared to v1.1
- Added extension object / class to facilitate adding any kind of context or information to any of the main objects. This means any developer can add his or her labels to the objects and use these in the patterns. As per #10.
- Improved support for complex transformations, by allowing source Data Objects and Data Items to be defined as lists/arrays. This means multiple sources can be mapped to a target at both Data Object and Data Item level (#9).
- Supported OneOf features defined in Json for Data Object source and Data Item source. This was already defined in the schema, but not properly supported in the class library. Now both the DataQuery and / or DataObject/Item can be used interchangeably. This provides additional support for complex transformations. As per #11.
- Updated all examples and all regression test files.
- Included some new examples, such as how to build more complex logic and how to use extensions.
- Improved coding standards, all lists are now plural as opposed to being called Lists. So for instance DataObjectMappingList is now DataObjectMappings (#6). This also means other lists are renamed such as SourceDataObject => SourceDataObjects and SourceDataItem => SourceDataItems.
- Data Item Mappings are now optional for a Data Object Mapping (#7).