
Commit ce120c5

martinv13 and cre-os authored
Add DatabaseDialect and implement identifiers truncation (#54)
* Add a DatabaseDialect class hierarchy (base, postgresql, mssql, mysql, duckdb), one subclass per backend, to abstract database-specific logic and avoid conditionals scattered through the code
* Implement truncation of database identifiers, because long names were a frequent problem, especially with PostgreSQL. Each dialect sets its own limit
* Remove Python 3.9 support (EOL) and add Python 3.14 to the test matrix
* Bump version

---------

Co-authored-by: cre-os <opensource@cre.fr>
1 parent c28e6a4 commit ce120c5

36 files changed

Lines changed: 792 additions & 364 deletions

.github/workflows/python-package.yml

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
+        python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
 
     steps:
     - uses: actions/checkout@v4

docs/configuring.md

Lines changed: 7 additions & 7 deletions
@@ -79,16 +79,16 @@ table, or a child table.
 ### Data types
 
 By default, the data type defined in the database table for each column is based on a mapping between the data type
-indicated in the XSD and a corresponding `sqlalchemy` type implemented in the following three functions:
+indicated in the XSD and a corresponding `sqlalchemy` type implemented in the following three methods:
 
-??? info "Default: `types_mapping_default`"
-    ::: xml2db.table.column.types_mapping_default
+??? info "Default: `DatabaseDialect.column_type`"
+    ::: xml2db.dialect.base.DatabaseDialect.column_type
 
-??? info "MySQL: `types_mapping_mysql`"
-    ::: xml2db.table.column.types_mapping_mysql
+??? info "MySQL: `MySQLDialect.column_type`"
+    ::: xml2db.dialect.mysql.MySQLDialect.column_type
 
-??? info "MSSQL: `types_mapping_mssql`"
-    ::: xml2db.table.column.types_mapping_mssql
+??? info "MSSQL: `MSSQLDialect.column_type`"
+    ::: xml2db.dialect.mssql.MSSQLDialect.column_type
 
 You may override this mapping by specifying a column type for any field in the model config. Custom column types are
 defined as `sqlalchemy` types and will be passed to the `sqlalchemy.Column` constructor as is.

pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -4,13 +4,13 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "xml2db"
-version = "0.12.6"
+version = "0.13.0"
 authors = [
     { name="Commission de régulation de l'énergie", email="opensource@cre.fr" },
 ]
 description = "Import complex XML files to a relational database"
 readme = "README.md"
-requires-python = ">=3.9"
+requires-python = ">=3.10"
 classifiers = [
     "Programming Language :: Python :: 3",
     "License :: OSI Approved :: MIT License",

src/xml2db/dialect/__init__.py

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+"""Backend-specific dialect classes for xml2db.
+
+This package centralises all database-backend-specific behaviour that was
+previously scattered across the codebase as ``if db_type == "..."``
+conditionals. Each supported backend has a dedicated subclass of
+:class:`~xml2db.dialect.base.DatabaseDialect`. Unknown backends fall back to
+the base class, which provides safe, generic defaults.
+
+Usage::
+
+    from xml2db.dialect import get_dialect
+
+    dialect = get_dialect("postgresql")
+    physical_name = dialect.db_identifier("some_very_long_xsd_derived_name")
+
+The registry is a plain dict so that third-party code (or tests) can register
+custom dialects without subclassing anything in xml2db::
+
+    from xml2db.dialect import DIALECT_REGISTRY
+    from mypackage import OracleDialect
+
+    DIALECT_REGISTRY["oracle"] = OracleDialect
+"""
+
+from .base import DatabaseDialect
+from .duckdb import DuckDBDialect
+from .mssql import MSSQLDialect
+from .mysql import MySQLDialect
+from .postgresql import PostgreSQLDialect
+
+__all__ = [
+    "DatabaseDialect",
+    "DuckDBDialect",
+    "MSSQLDialect",
+    "MySQLDialect",
+    "PostgreSQLDialect",
+    "DIALECT_REGISTRY",
+    "get_dialect",
+]
+
+# Maps the SQLAlchemy dialect name (as returned by engine.dialect.name) to
+# the corresponding DatabaseDialect subclass.
+DIALECT_REGISTRY: dict[str, type[DatabaseDialect]] = {
+    "postgresql": PostgreSQLDialect,
+    "mssql": MSSQLDialect,
+    "mysql": MySQLDialect,
+    "mariadb": MySQLDialect,  # SQLAlchemy reports MariaDB as "mariadb"
+    "duckdb": DuckDBDialect,
+}
+
+
+def get_dialect(db_type: str | None) -> DatabaseDialect:
+    """Return a :class:`DatabaseDialect` instance for the given backend name.
+
+    Args:
+        db_type: The SQLAlchemy dialect name, e.g. ``"postgresql"``,
+            ``"mssql"``, ``"mysql"``, ``"duckdb"``. ``None`` or any
+            unrecognised string falls back to the base
+            :class:`DatabaseDialect`, which uses safe generic defaults.
+
+    Returns:
+        An instantiated :class:`DatabaseDialect` (or subclass) ready for use.
+    """
+    cls = DIALECT_REGISTRY.get(db_type, DatabaseDialect)
+    return cls()
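
The registry lookup and per-dialect identifier limit described above can be illustrated with a standalone sketch. This is not the xml2db implementation: the body of `db_identifier` is not shown in this file, so the class names (`SketchDialect`, `SketchPostgreSQLDialect`), the `max_identifier_length` attribute, and the prefix-plus-hash truncation scheme are all assumptions made for illustration. Only the fallback-to-base-class lookup mirrors `get_dialect` as it appears in the diff.

```python
from __future__ import annotations

import hashlib


class SketchDialect:
    """Hypothetical base dialect: generic defaults, no truncation."""

    # Assumed attribute name; None means "no limit enforced".
    max_identifier_length: int | None = None

    def db_identifier(self, name: str) -> str:
        limit = self.max_identifier_length
        if limit is None or len(name) <= limit:
            return name
        # Assumed scheme: keep a readable prefix and append a short hash of
        # the full name, so long names sharing a prefix stay distinct.
        # len() counts characters, so this assumes ASCII identifiers.
        digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
        return f"{name[: limit - 9]}_{digest}"


class SketchPostgreSQLDialect(SketchDialect):
    # PostgreSQL truncates identifiers to 63 bytes by default.
    max_identifier_length = 63


# Hypothetical registry, keyed by SQLAlchemy dialect name.
SKETCH_REGISTRY: dict[str, type[SketchDialect]] = {
    "postgresql": SketchPostgreSQLDialect,
}


def get_sketch_dialect(db_type: str | None) -> SketchDialect:
    # dict.get accepts None as a key, so None falls back to the base class
    # exactly like any unrecognised backend name.
    return SKETCH_REGISTRY.get(db_type, SketchDialect)()


pg = get_sketch_dialect("postgresql")
print(len(pg.db_identifier("x" * 100)))
```

Appending a hash rather than simply cutting the name is one way to keep two long, similar names from colliding after truncation; whether xml2db does this is not visible in this diff.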
