All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- `datacontract test` now logs the Data Contract CLI version and whether it ran as a local CLI or through the FastAPI server (including the request URL) as part of the test result logs
- Schema checks now resolve each property by its `physicalName` when set (falling back to `name`), matching the existing table-level resolution and the SQL/BigQuery exporters. Previously a property whose logical `name` differed from its physical column (e.g. `name: brand` with `physicalName: BRAND`) failed the presence and type checks even though the physical column existed (#1246)
- new `datacontract dbt sync` command: generate dbt tests from an ODCS contract, then run `dbt test` for them, and optionally publish the results to Entropy Data (#1222, #1235)
- `redshift` server type for `datacontract test` (requires `pip install datacontract-cli[redshift]`). (#1236)
- SQL type converter: emit canonical `decimal`/`numeric` per dialect (Postgres → `numeric`, MySQL → `decimal`) so `test`'s column-type check matches `information_schema` (#1237)
- `impala` extra (`pip install datacontract-cli[impala]`), which pulls in `soda-core-impala`. Impala engine support landed in #965 but the install extra was never added; users had to install `soda-core-impala` manually. Also included in `[all]`.
- breaking: drop the `dbt` extra and the `dbt-core` dependency. `import dbt` now reads `manifest.json` directly with no third-party dependency, and works without installing any extra. Minimum supported manifest schema version is v9 (dbt 1.5+). Users who installed `datacontract-cli[dbt]` should switch to plain `datacontract-cli`.
- breaking: the `protobuf` extra now requires the `protoc` compiler installed on the system. Replaces the bundled `grpcio-tools` (~50 MB of platform-specific protoc binaries) with the lighter `protobuf` runtime (`>=3.20,<7.0`). `import protobuf` raises a clear error with platform-specific install hints if `protoc` is not on `PATH`. Install with `brew install protobuf` (macOS), `sudo apt install protobuf-compiler` (Debian/Ubuntu), etc.; see README.
- README install table: add missing `csv`, `excel`, and `oracle` extras. The matching `[project.optional-dependencies]` entries already existed but were undocumented.
- quality: support `{object}` and `${object}` placeholder in SQL quality queries as the ODCS-spec name for the current schema object (alias for `{model}` / `{table}`) (#676)
- `changelog` command help text now advertises `(url or path)` for V1/V2 arguments, clarifying that HTTP/HTTPS URLs are accepted (#1162)
- breaking: `test` command now exits non-zero when a server is specified but soda-core fails to connect or authenticate (#1181)
- correct swapped `check_type` labels `model_qualty_sql` and `field_quality_sql` (#1187)
- `import spark` now emits a native Spark SQL physicalType (e.g. `string`) instead of Python repr (e.g. `StringType()`). Contracts imported using Spark in v0.11.0–v0.12.1 did not perform type checks and must be re-imported. (#1048)
- Re-add `setuptools` as a base dependency. soda-core's `env_helper.py` imports `from distutils.util import strtobool`; `distutils` was removed from stdlib in Python 3.12 and stripped from python-build-standalone 3.11 builds. `setuptools` provides the `distutils` shim. Previously pulled in transitively via `grpcio-tools`; now required explicitly. Reverts #1199; see soda-core#2091.
- SLA freshness checks now quote column identifiers with special characters (#1202)
- update field / model quotation for Impala, dataframe, and Kafka (#1202)
- make `--schema` a deprecated alias for `--json-schema` (will be removed in v0.13.0)
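The `physicalName` fallback described in the schema-check entry above can be illustrated with a minimal sketch. The helper name `resolve_column_name` and the plain-dict property shape are assumptions made for illustration; they are not the CLI's internal API.

```python
def resolve_column_name(prop: dict) -> str:
    """Return the column name to look up in the data source.

    Hypothetical helper: prefers `physicalName` when it is set and
    non-empty, otherwise falls back to the logical `name`.
    """
    physical = prop.get("physicalName")
    return physical if physical else prop["name"]


# Properties as they might appear in a contract (illustrative data only).
properties = [
    {"name": "brand", "physicalName": "BRAND"},
    {"name": "price"},
]

# Columns assumed to exist in the physical table.
actual_columns = {"BRAND", "price"}

# A presence check keyed on the resolved name finds both columns.
missing = [p["name"] for p in properties if resolve_column_name(p) not in actual_columns]
```

Keyed on the logical `name` alone, the same presence check would report `brand` as missing, which is the failure mode the entry describes.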
This release introduces several changes to improve the usability of datacontract-cli for AI Agents.
- Breaking: Several changes in the CLI syntax (#1157):
Fix in v0.12.1: re-added `--schema` as alias for the new `--json-schema` (will be removed in v0.13.0)
| Command | Old option | New option |
|---|---|---|
| `lint`, `test`, `ci`, `publish`, `catalog` | `--schema <PATH>` (will work until v0.13.0) | `--json-schema <PATH>` |
| `export`, `import` | `--format <FORMAT> <OPTIONS>` | `<FORMAT> <OPTIONS>` (drop `--format`) |
| Export options: | | |
| `export --format dbt` | `--format dbt` | `dbt-models` (format renamed) |
| `export --format great-expectations` | `--sql-server-type <TYPE>` | `--dialect <TYPE>` |
| `export --format rdf` | `--rdf-base <URI>` | `--base <URI>` |
| `export --format sql` | `--sql-server-type <TYPE>` | `--dialect <TYPE>` |
| `export --format sql-query` | `--sql-server-type <TYPE>` | `--dialect <TYPE>` |
| Import options: | | |
| `import --format bigquery` | `--bigquery-[project\|dataset\|table] <NAME>` | `--[project\|dataset\|table] <NAME>` |
| `import --format dbt` | `--dbt-model <NAME>` | `--model <NAME>` |
| `import --format glue` | `--source <NAME>`, `--glue-table <NAME>` | `--database <NAME>`, `--table <NAME>` |
| `import --format iceberg` | `--iceberg-table <NAME>` | `--table <NAME>` |
| `import --format unity` | `--unity-table-full-name <NAME>` | `--table <NAME>` |
| `import --format spark` | `--source <NAMES>` | `--tables <NAMES>` |
| `import` | `--template` | dropped (was a no-op) |
The `--schema` option (referring to the ODCS JSON schema) was renamed to `--json-schema` to avoid confusion with `--schema-name`, which refers to the schema within the data contract to test.
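For scripts that still build pre-0.12 `export`/`import` argument lists, the renames above can be applied mechanically. The following is a rough sketch only: `migrate_args` and `RENAMED_OPTIONS` are hypothetical names, the mapping covers just the unambiguous renames, and `--source` is left out on purpose because its replacement depends on the import format (`--database` for glue, `--tables` for spark).

```python
# Unambiguous option renames for export/import (subset of the renames listed above).
RENAMED_OPTIONS = {
    "--sql-server-type": "--dialect",
    "--rdf-base": "--base",
    "--bigquery-project": "--project",
    "--bigquery-dataset": "--dataset",
    "--bigquery-table": "--table",
    "--dbt-model": "--model",
    "--glue-table": "--table",
    "--iceberg-table": "--table",
    "--unity-table-full-name": "--table",
}


def migrate_args(args: list[str]) -> list[str]:
    """Rewrite an old-style export/import argument list (hypothetical helper).

    Assumes args[0] is the subcommand. Other commands are returned unchanged.
    """
    command = args[0] if args else ""
    if command not in ("export", "import"):
        return list(args)

    migrated: list[str] = []
    tokens = iter(args)
    for token in tokens:
        if token == "--format":
            fmt = next(tokens, None)
            if fmt is None:
                migrated.append(token)  # malformed input: keep as-is
                break
            # The dbt export format was renamed to dbt-models.
            if command == "export" and fmt == "dbt":
                fmt = "dbt-models"
            migrated.append(fmt)  # the format is now given without --format
        else:
            migrated.append(RENAMED_OPTIONS.get(token, token))
    return migrated


new_args = migrate_args(["export", "--format", "sql", "--sql-server-type", "snowflake"])
```

The helper only rewrites tokens in place; whether the resulting order matches what the new CLI expects for a given command should be checked against its `--help` output.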
- Error messages for uncaught exceptions are shortened now. Pass `--debug` (or set `DATACONTRACT_CLI_DEBUG=1`) to see the full traceback. (#1175)
- Add example calls to `--help` outputs (#1176)
- Add explicit errors when required env vars for soda connections are missing (#1177)
- Validate some of the CLI options against their allowed values instead of accepting any string (#1178)
- Added `--checks` option to `test` command to selectively run check categories: `schema`, `quality`, `servicelevel` (#678)
- Added `--schema-name` option to `test` command to test a specific schema instead of all schemas (#1079, #1085 @kelsoufi-sanofi)
- Move `precision`/`scale` for `number` types from `logicalTypeOptions` to `customProperties` (#1145, #1160 @davidb-tada)
- Emit placeholder server values in SQL importer so generated contracts pass lint (#1146, #1152 @Ai-chan-0411)
- Fix Protobuf export for arrays of objects and improve message/enum naming to UpperCamelCase (#1012 @Schokuroff)
- Exit with code 1 when `--server` name is not found (#1153, #1161 @Ai-chan-0411)
Thanks to @kelsoufi-sanofi for the new `--schema-name` option on `test`, and to @Schokuroff, @Ai-chan-0411, and @davidb-tada for their contributions.
- Added `ci` command for CI/CD-optimized test runs: multi-file support, GitHub Actions annotations and step summary, Azure DevOps annotations, `--fail-on` flag, `--json` output (#1114)
- Added `changelog` command and API endpoint (#1118 @davidb-tada)
- Added opt-in `--all-errors` mode for `datacontract lint` to report all JSON Schema validation errors, with matching `all_errors` support in the Python library and API (#1125 @jmbenedetto)
- Added `--schema-name` option to custom model export (#978 @AntoineGiraud)
- Avro importer now raises an error for union fields with multiple non-null types, which are not supported by ODCS (#1124)
- Fix SQL export generating multiple PRIMARY KEY constraints for composite keys (#1026,#1092 @barry0451 @dwestheide)
- Preserve parametrized physicalTypes for SQL export (#1086,#1093 @barry0451 @alexander-griesbeck)
- Fix incorrect SQL type mappings: SQL Server `double`/`jsonb`, MySQL bare `varchar`, missing Trino types (#1110)
- Fix markdown export breaking table structure when extra field values contain pipe characters (#832, #1117 @barry0451 @grepwood)
- Fix dbt import using incorrect physicalType instead of actual materialization type (#1136)
- Remove unnecessary numpy dependency from databricks and kafka extras (#1135 @kayhendriksen)
Special thanks to @davidb-tada for the outstanding contribution of the new `changelog` command and API endpoint! Also thanks to @barry0451 for multiple quality fixes across the SQL exporter and markdown export, and to @AntoineGiraud and @jmbenedetto for their feature contributions.
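The pipe-character fix for the markdown export mentioned above relates to a general property of Markdown tables: an unescaped `|` inside a cell starts a new column. A minimal sketch of the usual remedy, escaping pipes as `\|`, follows. The function names are hypothetical and this is not taken from the exporter's actual code.

```python
def escape_cell(value: str) -> str:
    """Escape pipe characters so a value stays within one table cell."""
    return value.replace("|", "\\|")


def table_row(cells: list[str]) -> str:
    """Render one Markdown table row from already-stringified cells."""
    return "| " + " | ".join(escape_cell(cell) for cell in cells) + " |"


# An extra field value containing a pipe no longer splits the row.
row = table_row(["pattern", "a|b"])
```

Here `row` contains three unescaped pipes (two columns), with the pipe inside `a|b` preserved as `a\|b`.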
- Escape single quotes in string values for SodaCL checks (#1090)
- Escape BigQuery field and model names with backticks for SodaCL checks (#736)
- Escape Databricks model names with backticks for SodaCL checks
- Fixed catalog export SpecView not having a tags property for the index.html template (#1059)
- Fix SQL importer type mappings: binary types, datetime/time, uuid now map to correct ODCS logicalType and format (#790)
- Added support for MySQL for data contract tests (#1101)
- Support additional PyArrow types in Parquet importer (#1091)
- Populate `logicalTypeOptions.format` for SQL import from binary and uuid types (#790)
- Snowflake DDL import with tags, descriptions, and template variable handling (#790)
- Fix parser error for CSV / Parquet table names containing special characters (#1066)
- Fix BigQuery export failing with "Unsupported type" for parameterized physicalType like `NUMERIC(18, 4)` (#1083)
- Added JSON output format for test results (`--output-format json`)
- Added Azure AD / Entra ID authentication support for SQL Server and Microsoft Fabric
- Fix BigQuery import for repeated fields (#1017)
- Make Markdown export compatible with XHTML by replacing `<br>` with `<br />` (#1030)
- Add ADC/WIF and impersonation support for BigQuery (#1064)
- Fix Snowflake quoted identifiers by enabling double-quote quoting (#1053)
- Fix retention duration crash for numeric ODCS values (#1051)
- Fix physicalType bypass for precision and scale conversion (#1043)
- Fix mkdir TOCTOU race causing silent JUnit write failure (#1050)
- Fix validation failure for field names with special chars on Databricks (#1049)
- Add Azure support for field name quoting in schema checks (#1025)
- Made `duckdb` an optional dependency. Install with `pip install datacontract-cli[duckdb]` for local/S3/GCS/Azure file testing.
- Removed unused `fastparquet` and `numpy` core dependencies.
- Include searchable tags in catalog index.html
- Fixed example(s) field mapping for Data Contract Specification importer (#992).
- Spark exporter now supports decimal precision/scale via `customProperties` or parsing from `physicalType` (e.g., `decimal(10,2)`) (#996)
- Fix catalog/HTML export failing on ODCS contracts with no schema or no properties (#971)
- Fix `datacontract init` to generate ODCS format instead of deprecated Data Contract Specification (#984)
- Fix ODCS lint failing on optional relationship `type` field by updating open-data-contract-standard to v3.1.2 (#971)
- Restrict DuckDB dependency to < 1.4.0 (#972)
- Fixed schema evolution support for optional fields in CSV and Parquet formats. Optional fields marked with `required: false` are no longer incorrectly treated as required during validation, enabling proper schema evolution where optional fields can be added to contracts without breaking validation of historical data files (#977)
- Fixed decimals in pydantic model export. Fields marked with `type: decimal` will be mapped to `decimal.Decimal` instead of `float`.
- Fix BigQuery test failure for fields with FLOAT or BOOLEAN types by mapping them to equivalent types (BOOL and FLOAT64)
- Add Impala engine support for Soda scans via ODCS `impala` server type.
- Restrict DuckDB dependency to < 1.4.0 (#972)
This is a major release with breaking changes: We switched the internal data model from Data Contract Specification to Open Data Contract Standard (ODCS).
Not all features that were available are supported in this version, as some features are not supported by the Open Data Contract Standard, such as:
- Internal definitions using `$ref` (you can refer to external definitions via `authoritativeDefinition`)
- Lineage (no real workaround, use customProperties or transformation object if needed)
- Support for different physical types (no real workaround, use customProperties if needed)
- Support for enums (use quality metric `invalidValues`)
- Support for properties with type map and defining `keys` and `values` (use logical type map)
- Support for `scale` and `precision` (define them in `physicalType`)
The reason for this change is that the Data Contract Specification is deprecated, and we now focus on the best possible support for the Open Data Contract Standard. We try to make this transition as seamless as possible. If you face issues, please open an issue on GitHub.
We continue to support reading Data Contract Specification data contracts during v0.11.x releases until end of 2026. To migrate existing data contracts to Open Data Contract Standard, follow these instructions: https://datacontract-specification.com/#migration
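Since the migration notes above move `scale` and `precision` into `physicalType` (for example `decimal(10,2)`), a consumer may need to read them back out of that string. The following is a minimal sketch of one way to do that. The function name, the accepted type names (`decimal`, `numeric`), and the default scale of 0 are assumptions, not the CLI's actual parsing rules.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "decimal(10,2)", "NUMERIC(18, 4)" or "decimal(5)".
_DECIMAL_PATTERN = re.compile(
    r"^\s*(?:decimal|numeric)\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)\s*$",
    re.IGNORECASE,
)


def parse_precision_scale(physical_type: str) -> Optional[Tuple[int, int]]:
    """Extract (precision, scale) from a parameterized decimal type.

    Returns None when the type is not a parameterized decimal/numeric.
    Assumption: a missing scale is treated as 0.
    """
    match = _DECIMAL_PATTERN.match(physical_type)
    if match is None:
        return None
    precision = int(match.group(1))
    scale = int(match.group(2)) if match.group(2) is not None else 0
    return precision, scale


precision_and_scale = parse_precision_scale("decimal(10,2)")
```

Types without parameters, such as `string` or a bare `decimal`, yield `None`, so callers can fall back to other metadata such as `customProperties`.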
- ODCS v3.1.0 is now the default format for all imports.
- Renamed `--model` option to `--schema-name` in the `export` command to align with ODCS terminology.
- Renamed exporter files from `*_converter.py` to `*_exporter.py` for consistency (internal change).
- If an ODCS slaProperty "freshness" is defined with a reference to the element (column), the CLI will now test freshness of the data.
- If an ODCS slaProperty "retention" is defined with a reference to the element (column), the CLI will now test retention of the data.
- Support for custom Soda quality checks in ODCS using `type: custom` and `engine: soda` with raw SodaCL implementation.
- Oracle: Fix `service_name` attribute access to use ODCS field name `serviceName`
- The `breaking`, `changelog`, and `diff` commands are now deleted (#925).
- The `terraform` export format has been removed.
- Great Expectations export: Update to Great Expectations 1.x format (#919)
  - Changed `expectation_suite_name` to `name` in suite output
  - Changed `expectation_type` to `type` in expectations
  - Removed `data_asset_type` field from suite output
  - Breaking: Users with custom quality definitions using `expectation_type` must update to use `type`
- Changed
- test: Log server name and type in output (#963)
- api: CORS is now enabled for all origins
- quality: Support `{schema}` and `${schema}` placeholder in SQL quality checks to reference the server's database schema (#957)
- SQL Server: Support `DATACONTRACT_SQLSERVER_DRIVER` environment variable to specify the ODBC driver (#959)
- Excel: Add Oracle server type support for Excel export/import (#960)
- Excel: Add local/CSV server type support for Excel export/import (#961)
- Excel Export: Complete server types (glue, kafka, postgres, s3, snowflake, sqlserver, custom)
- Protobuf import: Fix transitive imports across subdirectories (#943)
- Protobuf export now works without error (#951)
- lint: YAML date values (e.g., `2022-01-15`) are now kept as strings instead of being converted to datetime objects, fixing ODCS schema validation
- export: field annotation now matches to number/numeric/decimal types
- Excel: Server port is now correctly parsed as integer instead of string for all server types
- Excel: Remove invalid `table` and `view` fields from custom server import
- Fixed DuckDB DDL generation to use `JSON` type instead of invalid empty `STRUCT()` for objects without defined properties (#940)
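The `{schema}` / `${schema}` placeholder support noted in this release accepts two spellings of the same placeholder. The following rough sketch shows how such a substitution could work with a single regular expression; `render_query` is a hypothetical function and the behavior for unknown placeholders (left untouched) is an assumption, not documented CLI behavior.

```python
import re

# Matches both "{name}" and "${name}".
_PLACEHOLDER = re.compile(r"\$?\{(\w+)\}")


def render_query(query: str, values: dict) -> str:
    """Replace {key} and ${key} placeholders with values (hypothetical helper).

    Placeholders without a matching key are left as written.
    """
    def substitute(match):
        key = match.group(1)
        return values.get(key, match.group(0))

    return _PLACEHOLDER.sub(substitute, query)


rendered = render_query(
    "SELECT COUNT(*) FROM ${schema}.{model}",
    {"schema": "analytics", "model": "orders"},
)
```

With the values shown, `rendered` is `SELECT COUNT(*) FROM analytics.orders`.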
- The `breaking`, `changelog`, and `diff` commands are now deprecated and will be removed in a future version (#925)
- Support for ODCS v3.1.0
- Oracle DB: Client Directory for Connection Mode 'Thick' can now be specified in the `DATACONTRACT_ORACLE_CLIENT_DIR` environment variable (#949)
- Import composite primary keys from open data contract spec
- Support for Oracle Database (>= 19C)
- Athena: Now correctly uses the (optional) AWS session token specified in the `DATACONTRACT_S3_SESSION_TOKEN` environment variable when testing contracts (#934)
- import: Support for nested arrays in odcs v3 importer
- lint: ODCS schema is now checked before converting
- `--debug` flag for all commands
- export: Excel exporter now exports critical data element
- Support for Data Contract Specification v1.2.1 (Data Quality Metrics)
- Support for decimal testing in spark and databricks (#902)
- Support for BigQuery Flexible Schema in Data Contract Checks (#909)
- `DataContract().import_from_source()` as an instance method is now deprecated. Use `DataContract.import_from_source()` as a class method instead.
- Export to DQX: Correct DQX format for global-level quality check of data contract export. (#877)
- Import the table tags from an open data contract spec v3 (#895)
- dbt export: Enhanced model-level primaryKey support with automatic test generation for single and multiple column primary keys (#898)
- ODCS: field discarded when no logicalType defined (#891)
- Removed specific linters, as the linters did not support ODCS (#913)
- Export to DQX: `datacontract export --format dqx` (#846)
- API `/test` endpoint now supports `publish_url` parameter to publish test results to a URL. (#853)
- The Spark importer and exporter now also export the description of columns via the additional metadata of StructFields (#868)
- Improved regex for extracting Azure storage account names from URLs with containerName@storageAccountName format (#848)
- JSON Schema Check: Add globbing support for local JSON files
- Fixed server section rendering for markdown exporter
- `datacontract test` now supports HTTP APIs.
- `datacontract test` now supports Athena.
- Avro Importer: Optional and required enum types are now supported (#804)
- Export to Excel: Convert ODCS YAML to Excel https://github.com/datacontract/open-data-contract-standard-excel-template (#742)
- Extra properties in Markdown export. (#842)
- Import from Excel: Support the new quality sheet
- JUnit Test Report: Fixed incorrect syntax on handling warning test report. (#833)
- Added support for Variant with Spark exporter, data_contract.test(), and import as source unity catalog (#792)
- Excel Import should return ODCS YAML (#829)
- Excel Import: Missing server section when the server included a schema property (#823)
- Use
 instead of for tab in Markdown export.
- Support for Data Contract Specification v1.2.0
- `datacontract import --format json`: Import from JSON files
- `datacontract api [OPTIONS]`: Added option to pass extra arguments for `uvicorn.run()`
- `pytest tests\test_api.py`: Fixed an issue where special characters were not read correctly from file.
- `datacontract export --format mermaid`: Fixed an issue where the `mermaid` export did not handle references correctly
- Much better ODCS support
  - Import anything to ODCS via the `import --spec odcs` flag
  - Export to HTML with an ODCS native template via `export --format html`
  - Export to Mermaid with an ODCS native mapping via `export --format mermaid`
- The databricks `unity` importer now supports more than a single table. You can use `--unity-table-full-name` multiple times to import multiple tables. And it will automatically add a server with the catalog and schema name.
- `datacontract catalog [OPTIONS]`: Added version to contract cards in `index.html` of the catalog (enabled search by version)
- The type mapping of the `unity` importer now uses the native databricks types instead of relying on spark types. This allows for better type mapping and more accurate data contracts.
- `datacontract export --format mermaid`: Export to Mermaid (#767, #725)
- `datacontract export --format html`: Adding the mermaid figure to the html export
- `datacontract export --format odcs`: Export physical type to ODCS if the physical type is configured in config object
- `datacontract import --format spark`: Added support for spark importer table level comments (#761)
- `datacontract import` respects `--owner` and `--id` flags (#753)
- `datacontract export --format sodacl`: Fix resolving server when using `--server` flag (#768)
- `datacontract export --format dbt`: Fixed DBT export behaviour of constraints to default to data tests when no model type is specified in the datacontract model
- Databricks: Add support for Variant type (#758)
- `datacontract export --format odcs`: Export physical type if the physical type is configured in config object (#757)
- `datacontract export --format sql`: Include datacontract descriptions in the Snowflake sql export (#756)
- Extracted the DataContractSpecification and the OpenDataContractSpecification in separate pip modules and use them in the CLI.
- `datacontract import --format excel`: Import from Excel template https://github.com/datacontract/open-data-contract-standard-excel-template (#742)
- `datacontract test` with DuckDB: Deep nesting of json objects in duckdb (#681)
- `datacontract import --format csv` produces more descriptive output. Replaced using clevercsv with duckdb for loading and sniffing csv file.
- Updated dependencies
- Fix to handle logicalType format wrt avro mentioned in issue (#687)
- Fix field type from TIME to DATETIME in BigQuery converter and schema (#728)
- Fix encoding issues. (#712)
- ODCS: Fix required in export and added item and fields format (#724)
- Deprecated QualityLinter is now removed
- `datacontract test --output-format junit --output TEST-datacontract.xml`: Export CLI test results to a file, in a standard format (e.g. JUnit) to improve CI/CD experience (#650)
- Added import for `ProtoBuf`: code for proto to datacontract (#696)
- `dbt` & `dbt-sources` export formats now support the optional `--server` flag to adapt the DBT column `data_type` to specific SQL dialects
- Duckdb Connections are now configurable, when used as Python library (#666)
- export to avro format add map type
- Changed Docker base image to python:3.11-bullseye
- Relax fastparquet dependency
- Unicode Encode Error when exporting data contract YAML to HTML (#652)
- Fix multiline descriptions in the DBT export functionality
- Incorrectly parsing $ref values in definitions (#664)
- Better error message when the server configuration is missing in a data contract (#670)
- Improved default values in ODCS generator to avoid breaking schema validation (#671)
- Updated ODCS v3 generator to drop the "is" prefix from fields like `isNullable` and `isUnique` (#669)
- Fix issue when testing databricks server with ODCS format
- avro export fix float format
- `datacontract test` now also executes tests for service levels freshness and retention (#407)
- `datacontract import --format sql` is now using SqlGlot as importer.
- `datacontract import --format sql --dialect <dialect>`: Dialect can now be defined when importing SQL.
- Schema type checks fail on nested fields for Databricks spark (#618)
- Export to Avro add namespace on field as optional configuration (#631)
- `datacontract test --examples`: This option was removed as it was not very popular and top-level examples section is deprecated in the Data Contract Specification v1.1.0 (#628)
- Support for `odcs_v2` (#645)
- `datacontract export --format custom`: Export to custom format with Jinja
- `datacontract api` now can be protected with an API key
- `datacontract serve` renamed to `datacontract api`
- Fix Error: 'dict object' has no attribute 'model_extra' when trying to use type: string with enum values inside an array (#619)
- datacontract serve: now has a route for testing data contracts
- datacontract serve: now has an OpenAPI documentation on root
- FastAPI endpoint is now moved to extra "web"
- API Keys for Data Mesh Manager are now also applied for on-premise installations
- datacontract import --format csv
- publish command also supports publishing ODCS format
- Option to separate physical table name for a model via config option (#270)
- JSON Schemas are now bundled with the application (#598)
- datacontract export --format html: The model title is now shown if it is different to the model name (#585)
- datacontract export --format html: Custom model attributes are now shown (#585)
- datacontract export --format html: The composite primary key is now shown. (#591)
- datacontract export --format html: now examples are rendered in the model and definition (#497)
- datacontract export --format sql: Create arrays and struct for Databricks (#467)
- datacontract lint: Linter 'Field references existing field' too many values to unpack (expected 2) (#586)
- datacontract test (Azure): Error querying delta tables from azure storage. (#458)
- datacontract export --format data-caterer: Use `fields` instead of `schema`
- datacontract export --format data-caterer: Use `options` instead of `generator.options`
- datacontract export --format data-caterer: Capture array type length option and inner data type
- Fixed schemas/datacontract-1.1.0.init.yaml not included in build and `--template` not resolving file
- Fixed an issue when resolving project's dependencies when all extras are installed.
- Definitions referenced by nested fields are not validated correctly (#595)
- Replaced deprecated `primary` field with `primaryKey` in exporters, importers, examples, and Jinja templates for backward compatibility. Fixes #518.
- Cannot execute test on column of type record(bigquery) #597
- added export format markdown: `datacontract export --format markdown` (#545)
- When importing in dbt format, add the dbt unique information as a datacontract unique field (#558)
- When importing in dbt format, add the dbt primary key information as a datacontract primaryKey field (#562)
- When exporting in dbt format, add the datacontract references field as a dbt relationships test (#569)
- When importing in dbt format, add the dbt relationships test field as a reference in the data contract (#570)
- Add serve command on README (#592)
- Primary and example fields have been deprecated in Data Contract Specification v1.1.0 (#561)
- Define primaryKey and examples for model to follow the changes in datacontract-specification v1.1.0 (#559)
- SQL Server: cannot escape reserved word on model (#557)
- Export dbt-staging-sql error on multi models contracts (#587)
- OpenTelemetry publisher, as it was hardly used
- Support for exporting a Data Contract to an Iceberg schema definition.
- When importing in dbt format, add the dbt `not_null` information as a datacontract `required` field (#547)
- Type conversion when importing contracts into dbt and exporting contracts from dbt (#534)
- Ensure 'name' is the first column when exporting in dbt format, considering column attributes (#541)
- Rename dbt's `tests` to `data_tests` (#548)
- Modify the arguments to narrow down the import target with `--dbt-model` (#532)
- SodaCL: Prevent `KeyError: 'fail'` from happening when testing with SodaCL
- fix: populate database and schema values for bigquery in exported dbt sources (#543)
- Fixing the options for importing and exporting to standard output (#544)
- Fixing the data quality name for model-level and field-level quality tests
- Support for model import from parquet file metadata.
- Great Expectation export: add optional args (#496)
  - `suite_name` the name of the expectation suite to export
  - `engine` used to run checks
  - `sql_server_type` to define the type of SQL Server to use when engine is `sql`
- Changelog support for `Info` and `Terms` blocks.
- `datacontract import` now has `--output` option for saving Data Contract to file
- Enhance JSON file validation (local and S3) to return the first error for each JSON object; the max number of total errors can be configured via the environment variable `DATACONTRACT_MAX_ERRORS`. Furthermore, the primaryKey will be additionally added to the error message.
- fixes issue where records with no fields create an invalid bq schema.
- Changelog support for custom extension keys in `Models` and `Fields` blocks.
- `datacontract catalog --files '*.yaml'` now checks also any subfolders for such files.
- Optimize test output table on console if tests fail
- raise valid exception in DataContractSpecification.from_file if file does not exist
- Fix importing JSON Schemas containing deeply nested objects without `required` array
- SodaCL: Only add data quality tests for executable queries
Data Contract CLI now supports the Open Data Contract Standard (ODCS) v3.0.0.
- `datacontract test` now also supports ODCS v3 data contract format
- `datacontract export --format odcs_v3`: Export to Open Data Contract Standard v3.0.0 (#460)
- `datacontract test` now also supports ODCS v3 and Data Contract SQL quality checks on field and model level
- Support for import from Iceberg table definitions.
- Support for decimal logical type on avro export.
- Support for custom Trino types
- `datacontract import --format odcs`: Now supports ODCS v3.0.0 files (#474)
- `datacontract export --format odcs`: Now creates v3.0.0 Open Data Contract Standard files (alias to odcs_v3). Old versions are still available as format `odcs_v2`. (#460)
- fix timestamp serialization from parquet -> duckdb (#472)
- `datacontract export --format data-caterer`: Export to Data Caterer YAML
- `datacontract export --format jsonschema` handle optional and nullable fields (#409)
- `datacontract import --format unity` handle nested and complex fields (#420)
- `datacontract import --format spark` handle field descriptions (#420)
- `datacontract export --format bigquery` handle bigqueryType (#422)
- use correct float type with bigquery (#417)
- Support DATACONTRACT_MANAGER_API_KEY
- Some minor bug fixes
- Support for import of DBML Models (#379)
- `datacontract export --format sqlalchemy`: Export to SQLAlchemy ORM models (#399)
- Support of varchar max length in Glue import (#351)
- `datacontract publish` now also accepts the `DATACONTRACT_MANAGER_API_KEY` as an environment variable
- Support required fields for Avro schema export (#390)
- Support data type map in Spark import and export (#408)
- Support of enum on export to avro
- Support of enum title on avro import
- Deltalake is now using DuckDB's native deltalake support (#258). Extra deltalake removed.
- When dumping to YAML (import) the alias name is used instead of the pythonic name. (#373)
- Fix an issue where the datacontract cli fails if installed without any extras (#400)
- Fix an issue where Glue database without a location creates invalid data contract (#351)
- Fix bigint -> long data type mapping (#351)
- Fix an issue where column description for Glue partition key column is ignored (#351)
- Corrected name of table parameter for bigquery import (#377)
- Fix a failure to connect to S3 Server (#384)
- Fix a model bug mismatching with the specification (`definitions.fields`) (#375)
- Fix array type management in Spark import (#408)
- Support data type map in Glue import. (#340)
- Basic html export for new `keys` and `values` fields
- Support for recognition of 1 to 1 relationships when exporting to DBML
- Added support for arrays in JSON schema import (#305)
- Aligned JSON schema import and export of required properties
- Change dbt importer to be more robust and customizable
- Fix required field handling in JSON schema import
- Fix an issue where the quality and definition `$ref` are not always resolved
- Fix an issue where the JSON schema validation fails for a field with type `string` and format `uuid`
- Fix an issue where common DBML renderers may not be able to parse parts of an exported file
- Add support for dbt manifest file (#104)
- Fix import of pyspark for type-checking when pyspark isn't required as a module (#312)
- Adds support for referencing fields within a definition (#322)
- Add `map` and `enum` type for Avro schema import (#311)
- Fix import of pyspark for type-checking when pyspark isn't required as a module (#312)
- `datacontract import --format spark`: Import from Spark tables (#326)
- Fix an issue where specifying `glue_table` as parameter did not filter the tables and instead returned all tables from `source` database (#333)
- Add support for Trino (#278)
- Spark export: add Spark StructType exporter (#277)
- add `--schema` option for the `catalog` and `export` command to provide the schema also locally
- Integrate support into the pre-commit workflow. For further details, please refer to the information provided here.
- Improved HTML export, supporting links, tags, and more
- Add support for AWS SESSION_TOKEN (#309)
- Added array management on HTML export (#299)
- Fix `datacontract import --format jsonschema` when description is missing (#300)
- Fix `datacontract test` with case-sensitive Postgres table names (#310)
- `datacontract serve`: start a local web server to provide a REST-API for the commands
- Provide server for sql export for the appropriate schema (#153)
- Add struct and array management to Glue export (#271)
- Introduced optional dependencies/extras for significantly faster installation times. (#213)
- Added delta-lake as an additional optional dependency
- support `GOOGLE_APPLICATION_CREDENTIALS` as variable for connecting to bigquery in `datacontract test`
- better support bigqueries `type` attribute, don't assume all imported models are tables
- added initial implementation of an importer from unity catalog (not all data types supported, yet)
- added the importer factory. This refactoring aims to make it easier to create new importers and, consequently, to support the growth and maintainability of the project. (#273)
- `datacontract export --format avro`: fixed array structure (#243)
- Test data contract against dataframes / temporary views (#175)
- AVRO export: Logical Types should be nested (#233)
- Fixed Docker build by removing msodbcsql18 dependency (temporary workaround)
- Added support for `sqlserver` (#196)
- `datacontract export --format dbml`: Export to Database Markup Language (DBML) (#135)
- `datacontract export --format avro`: Now supports config map on field level for logicalTypes and default values (see Custom Avro Properties)
- `datacontract import --format avro`: Now supports importing logicalType and default definition on avro files (see Custom Avro Properties)
- Support `config.bigqueryType` for testing BigQuery types
- Added support for selecting specific tables in an AWS Glue `import` through the `glue-table` parameter (#122)
- Fixed jsonschema export for models with empty object-typed fields (#218)
- Fixed testing BigQuery tables with BOOL fields
- `datacontract catalog`: Show search bar also on mobile
- `datacontract catalog`: Search
- `datacontract publish`: Publish the data contract to the Data Mesh Manager
- `datacontract import --format bigquery`: Import from BigQuery format (#110)
- `datacontract export --format bigquery`: Export to BigQuery format (#111)
- `datacontract export --format avro`: Now supports Avro logical types to better model date types. `date`, `timestamp`/`timestamp-tz` and `timestamp-ntz` are now mapped to the appropriate logical types. (#141)
- `datacontract import --format jsonschema`: Import from JSON schema (#91)
- `datacontract export --format jsonschema`: Improved export by exporting more additional information
- `datacontract export --format html`: Added support for Service Levels, Definitions, Examples and nested Fields
- `datacontract export --format go`: Export to go types format
- datacontract catalog: Add index.html to manifest
- Added import glue (#166)
- Added test support for `azure` (#146)
- Added support for `delta` tables on S3 (#24)
- Added new command `datacontract catalog` that generates a data contract catalog with an `index.html` file.
- Added field format information to HTML export
- RDF Export: Fix error if owner is not a URI/URN
- Fixed docker columns
- Added timestamp when an HTML export was created
- Fixed export format html
- Added export format html (#15)
- Added descriptions as comments to `datacontract export --format sql` for Databricks dialects
- Added import of arrays in Avro import
- Added export format great-expectations: `datacontract export --format great-expectations`
- Added gRPC support to OpenTelemetry integration for publishing test results
- Added AVRO import support for namespace (#121)
- Added handling for optional fields in avro import (#112)
- Added Databricks SQL dialect for `datacontract export --format sql`
- Use `sql_type_converter` to build checks.
- Fixed AVRO import when doc is missing (#121)
- Added option to publish test results to OpenTelemetry: `datacontract test --publish-to-opentelemetry`
- Added export format protobuf: `datacontract export --format protobuf`
- Added export format terraform: `datacontract export --format terraform` (limitation: only works for AWS S3 right now)
- Added export format sql: `datacontract export --format sql`
- Added export format sql-query: `datacontract export --format sql-query`
- Added export format avro-idl: `datacontract export --format avro-idl` generates an Avro IDL file containing records for each model.
- Added new command changelog: `datacontract changelog datacontract1.yaml datacontract2.yaml` will now generate a changelog based on the changes in the data contract. This will be useful for keeping track of changes in the data contract over time.
- Added extensive linting on data contracts. `datacontract lint` will now check for a variety of possible errors in the data contract, such as missing descriptions, incorrect references to models or fields, nonsensical constraints, and more.
- Added importer for avro schemas. `datacontract import --format avro` will now import avro schemas into a data contract.
- Fixed a bug where the export to YAML always escaped the unicode characters.
- test kafka for avro messages
- added export format avro: `datacontract export --format avro`
This is a huge step forward: we now support testing Kafka messages. We start with JSON and Avro messages; Protobuf will follow.
- test kafka for JSON messages
- added import format sql: `datacontract import --format sql` (#51)
- added export format dbt-sources: `datacontract export --format dbt-sources`
- added export format dbt-staging-sql: `datacontract export --format dbt-staging-sql`
- added export format rdf: `datacontract export --format rdf` (#52)
- added command `datacontract breaking` to detect breaking changes between two data contracts.
- export to dbt models (#37).
- export to ODCS (#49).
- test - show a test summary table.
- lint - Support local schema (#46).
- Support for Postgres
- Support for Databricks
- Support for BigQuery data connection
- Support for multiple models with S3
- Fix Docker images. Disable builds for linux/amd64.
- Publish to Docker Hub
This is a breaking change (we are still on a 0.x.x version). The project migrated from Golang to Python. The Golang version can be found at cli-go
- `test`: Support to directly run tests and connect to data sources defined in the servers section.
- `test`: Generate schema tests from the model definition.
- `test --publish URL`: Publish test results to a server URL.
- `export` now exports the data contract to the formats jsonschema and sodacl.
- The `--file` option was removed in favor of a direct argument: use `datacontract test datacontract.yaml` instead of `datacontract test --file datacontract.yaml`.
- `model` is now part of `export`
- `quality` is now part of `export`
- Temporarily removed: `diff` needs to be migrated to Python.
- Temporarily removed: `breaking` needs to be migrated to Python.
- Temporarily removed: `inline` needs to be migrated to Python.
- Support local json schema in lint command.
- Update to specification 0.9.2.
- Fix format flag bug in model (print) command.
- Log to STDOUT.
- Rename `model` command parameter, `type` -> `format`.
- Remove `schema` command.
- Fix documentation.
- Security update of x/sys.
- Adapt Data Contract Specification in version 0.9.2.
- Use `models` section for `diff`/`breaking`.
- Add `model` command.
- Let `inline` print to STDOUT instead of overwriting datacontract file.
- Let `quality` write input from STDIN if present.
- Basic implementation of `test` command for Soda Core.
- Change package structure to allow usage as library.
- Fix field parsing for dbt models, affects stability of `diff`/`breaking`.
- Fix comparing order of contracts in `diff`/`breaking`.
- Handle non-existent schema specification when using `diff`/`breaking`.
- Resolve local and remote resources such as schema specifications when using "$ref: ..." notation.
- Implement `schema` command: prints your schema.
- Implement `quality` command: prints your quality definitions.
- Implement the `inline` command: resolves all references using the "$ref: ..." notation and writes them to your data contract.
- Allow remote and local location for all data contract inputs (`--file`, `--with`).
- Add `diff` command for dbt schema specification.
- Add `breaking` command for dbt schema specification.
- Suggest a fix during `init` when the file already exists.
- Rename `validate` command to `lint`.
- Remove `check-compatibility` command.
- Improve usage documentation.
- Initial release.