Commit 03db2c6

Feat: Add Trino Hive Connector Support (#1579)
* feat: add trino hive connector support
* fix mypy after sqlglot upgrade
* rollback trino specific changes
1 parent d298ab8 commit 03db2c6

29 files changed

Lines changed: 892 additions & 154 deletions

docs/guides/configuration.md

Lines changed: 1 addition & 0 deletions
@@ -419,6 +419,7 @@ These pages describe the connection configuration options for each execution eng
 * [Redshift](../integrations/engines/redshift.md)
 * [Snowflake](../integrations/engines/snowflake.md)
 * [Spark](../integrations/engines/spark.md)
+* [Trino](../integrations/engines/trino.md)

 #### State connection

docs/guides/connections.md

Lines changed: 1 addition & 0 deletions
@@ -87,3 +87,4 @@ default_gateway: local_db
 * [Redshift](../integrations/engines/redshift.md)
 * [Snowflake](../integrations/engines/snowflake.md)
 * [Spark](../integrations/engines/spark.md)
+* [Trino](../integrations/engines/trino.md)

docs/integrations/engines/trino.md

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
+# Trino
+
+## Local/Built-in Scheduler
+**Engine Adapter Type**: `trino`
+
+## Installation
+```
+pip install "sqlmesh[trino]"
+```
+
+If you are using OAuth for authentication, we recommend installing the keyring token cache:
+```
+pip install "trino[external-authentication-token-cache]"
+```
+
+### Trino Connector Support
+
+The Trino engine adapter has been tested against the [Hive Connector](https://trino.io/docs/current/connector/hive.html).
+Please let us know on [Slack](https://tobikodata.com/slack) if you want to use another connector or have already tried one.
+
+### Hive Connector Configuration
+
+Recommended Hive catalog properties configuration (`<catalog_name>.properties`):
+
+```properties linenums="1"
+hive.metastore-cache-ttl=0s
+hive.metastore-refresh-interval=5s
+hive.metastore-timeout=10s
+hive.allow-drop-table=true
+hive.allow-add-column=true
+hive.allow-drop-column=true
+hive.allow-rename-column=true
+hive.allow-rename-table=true
+```
+
+### Connection options
+
+| Option | Description | Type | Required |
+|----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:------:|:--------:|
+| `type` | Engine type name - must be `trino` | string | Y |
+| `user` | The username (of the account) to log in to your cluster. When connecting to Starburst Galaxy clusters, you must include the role of the user as a suffix to the username. | string | Y |
+| `host` | The hostname of your cluster. Don't include the `http://` or `https://` prefix. | string | Y |
+| `catalog` | The name of a catalog in your cluster. | string | Y |
+| `http_scheme` | The HTTP scheme to use when connecting to your cluster. Defaults to `https`; `http` is allowed only with no-auth or basic auth. | string | N |
+| `port` | The port to connect to your cluster. Defaults to `443` for the `https` scheme and `80` for `http`. | int | N |
+| `roles` | Mapping of catalog name to a role | dict | N |
+| `http_headers` | Additional HTTP headers to send with each request. | dict | N |
+| `session_properties` | Trino session properties. Run `SHOW SESSION` to see all options. | dict | N |
+| `retries` | Number of retries to attempt when a request fails. Default: `3` | int | N |
+| `timezone` | Timezone to use for the connection. Default: client-side local timezone | string | N |
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  user: [user]
+  host: [host]
+  catalog: [catalog]
+```
+
+### Authentication
+
+=== "No Auth"
+| Option | Description | Type | Required |
+|------------|------------------------------------------|:------:|:--------:|
+| `method` | `no-auth` (Default) | string | N |
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  user: [user]
+  host: [host]
+  catalog: [catalog]
+  # Most likely you will want http for scheme when not using auth
+  http_scheme: http
+```
+
+
+=== "Basic Auth"
+
+| Option | Description | Type | Required |
+|------------|------------------------------------------|:------:|:--------:|
+| `method` | `basic` | string | Y |
+| `password` | The password to use when authenticating. | string | Y |
+
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  method: basic
+  user: [user]
+  password: [password]
+  host: [host]
+  catalog: [catalog]
+```
+
+* [Trino Documentation on Basic Authentication](https://trino.io/docs/current/security/password-file.html)
+* [Python Client Basic Authentication](https://github.com/trinodb/trino-python-client#basic-authentication)
+
+=== "LDAP"
+
+| Option | Description | Type | Required |
+|----------------------|-------------------------------------------------------------------------|:------:|:--------:|
+| `method` | `ldap` | string | Y |
+| `password` | The password to use when authenticating. | string | Y |
+| `impersonation_user` | Override the provided username. This lets you impersonate another user. | string | N |
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  method: ldap
+  user: [user]
+  password: [password]
+  host: [host]
+  catalog: [catalog]
+```
+
+* [Trino Documentation on LDAP Authentication](https://trino.io/docs/current/security/ldap.html)
+* [Python Client LDAP Authentication](https://github.com/trinodb/trino-python-client#basic-authentication)
+
+=== "Kerberos"
+
+| Option | Description | Type | Required |
+|----------------------------------|-----------------------------------------------------------------------------------|:------:|:--------:|
+| `method` | `kerberos` | string | Y |
+| `keytab` | Path to keytab. Ex: `/tmp/trino.keytab` | string | Y |
+| `krb5_config` | Path to config. Ex: `/tmp/krb5.conf` | string | Y |
+| `principal` | Principal. Ex: `user@company.com` | string | Y |
+| `service_name` | Service name (default is `trino`) | string | N |
+| `hostname_override` | Kerberos hostname for a host whose DNS name doesn't match | string | N |
+| `mutual_authentication` | Boolean flag for mutual authentication. Default: `false` | bool | N |
+| `force_preemptive` | Boolean flag to preemptively initiate the Kerberos GSS exchange. Default: `false` | bool | N |
+| `sanitize_mutual_error_response` | Boolean flag to strip content and headers from error responses. Default: `true` | bool | N |
+| `delegate` | Boolean flag for credential delegation (`GSS_C_DELEG_FLAG`). Default: `false` | bool | N |
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  method: kerberos
+  user: user
+  keytab: /tmp/trino.keytab
+  krb5_config: /tmp/krb5.conf
+  principal: trino@company.com
+  host: trino.company.com
+  catalog: datalake
+```
+
+* [Trino Documentation on Kerberos Authentication](https://trino.io/docs/current/security/kerberos.html)
+* [Python Client Kerberos Authentication](https://github.com/trinodb/trino-python-client#kerberos-authentication)
+
+=== "JWT"
+
+| Option | Description | Type | Required |
+|-------------|-----------------|:------:|:--------:|
+| `method` | `jwt` | string | Y |
+| `jwt_token` | The JWT string. | string | Y |
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  method: jwt
+  user: [user]
+  jwt_token: [jwt_token]
+  host: [host]
+  catalog: [catalog]
+```
+
+* [Trino Documentation on JWT Authentication](https://trino.io/docs/current/security/jwt.html)
+* [Python Client JWT Authentication](https://github.com/trinodb/trino-python-client#jwt-authentication)
+
+=== "Certificate"
+
+| Option | Description | Type | Required |
+|----------------------|---------------------------------------------------|:------:|:--------:|
+| `method` | `certificate` | string | Y |
+| `cert` | The full path to a certificate file | string | Y |
+| `client_certificate` | Path to client certificate. Ex: `/tmp/client.crt` | string | Y |
+| `client_private_key` | Path to client private key. Ex: `/tmp/client.key` | string | Y |
+
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  method: certificate
+  user: [user]
+  password: [password]
+  host: [host]
+  catalog: [catalog]
+  cert: [path/to/cert_file]
+  client_certificate: [path/to/client/cert]
+  client_private_key: [path/to/client/key]
+```
+
+=== "OAuth"
+
+| Option | Description | Type | Required |
+|----------------------|---------------------------------------------------|:------:|:--------:|
+| `method` | `oauth` | string | Y |
+
+```yaml linenums="1"
+connector_name:
+  type: trino
+  method: oauth
+  host: trino.company.com
+  catalog: datalake
+```
+
+* [Trino Documentation on OAuth Authentication](https://trino.io/docs/current/security/oauth2.html)
+* [Python Client OAuth Authentication](https://github.com/trinodb/trino-python-client#oauth2-authentication)
+

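Reviewer note: the option tables in `trino.md` can be read as a checklist. The following is a minimal sketch, not part of this commit or of SQLMesh, that encodes the required options per authentication method and the default-port rule from the "Connection options" table. The names `BASE_REQUIRED`, `METHOD_REQUIRED`, `missing_options`, and `default_port` are hypothetical, and the method name `oauth` is an assumption taken from the YAML example.

```python
from typing import Any, Dict, List

# Options the "Connection options" table marks as required for every method.
BASE_REQUIRED: List[str] = ["type", "user", "host", "catalog"]

# Extra required options per authentication method, copied from the
# per-method tables. "oauth" follows the YAML example (assumption).
METHOD_REQUIRED: Dict[str, List[str]] = {
    "no-auth": [],
    "basic": ["password"],
    "ldap": ["password"],
    "kerberos": ["keytab", "krb5_config", "principal"],
    "jwt": ["jwt_token"],
    "certificate": ["cert", "client_certificate", "client_private_key"],
    "oauth": [],
}


def missing_options(config: Dict[str, Any]) -> List[str]:
    """Return required option names that are absent from `config`."""
    method = config.get("method", "no-auth")  # the table lists no-auth as the default
    if method not in METHOD_REQUIRED:
        raise ValueError(f"unknown authentication method: {method!r}")
    required = BASE_REQUIRED + METHOD_REQUIRED[method]
    return [name for name in required if name not in config]


def default_port(config: Dict[str, Any]) -> int:
    """Apply the documented default: 443 for https, 80 for http."""
    scheme = config.get("http_scheme", "https")
    return config.get("port", 443 if scheme == "https" else 80)


if __name__ == "__main__":
    # Hypothetical JWT config that supplies a password instead of a token.
    jwt_config = {
        "type": "trino",
        "method": "jwt",
        "user": "alice",
        "password": "secret",
        "host": "trino.company.com",
        "catalog": "datalake",
    }
    print(missing_options(jwt_config))
    print(default_port({"http_scheme": "http"}))
```

A check like this only compares key names against the tables; it does not contact a cluster or validate values.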
docs/integrations/overview.md

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@ SQLMesh supports the following execution engines for running SQLMesh projects:
 * [Redshift](./engines/redshift.md)
 * [Snowflake](./engines/snowflake.md)
 * [Spark](./engines/spark.md)
+* [Trino](./engines/trino.md)

docs/reference/configuration.md

Lines changed: 1 addition & 0 deletions
@@ -113,6 +113,7 @@ These pages describe the connection configuration options for each execution eng
 * [Redshift](../integrations/engines/redshift.md)
 * [Snowflake](../integrations/engines/snowflake.md)
 * [Spark](../integrations/engines/spark.md)
+* [Trino](../integrations/engines/trino.md)

 ### Scheduler

examples/sushi/models/waiter_as_customer_by_day.sql

Lines changed: 2 additions & 2 deletions
@@ -17,10 +17,10 @@ JINJA_QUERY_BEGIN;
 {% set x = 1 %}

 SELECT
-  w.ds as ds,
   w.waiter_id as waiter_id,
   wn.name as waiter_name,
-  {{ alias(identity(x), 'flag') }}
+  {{ alias(identity(x), 'flag') }},
+  w.ds as ds
 FROM sushi.waiters AS w
 JOIN sushi.customers as c ON w.waiter_id = c.customer_id
 JOIN sushi.waiter_names as wn ON w.waiter_id = wn.id;

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ nav:
       - integrations/engines/redshift.md
       - integrations/engines/snowflake.md
       - integrations/engines/spark.md
+      - integrations/engines/trino.md
   - Resources:
     - comparisons.md
     - development.md

setup.cfg

Lines changed: 3 additions & 0 deletions
@@ -93,6 +93,9 @@ ignore_missing_imports = True
 [mypy-boto3.*]
 ignore_missing_imports = True

+[mypy-trino.*]
+ignore_missing_imports = True
+
 [autoflake]
 in-place = True
 expand-star-imports = True

setup.py

Lines changed: 4 additions & 1 deletion
@@ -46,7 +46,7 @@
         "requests",
         "rich[jupyter]",
         "ruamel.yaml",
-        "sqlglot~=18.13.0",
+        "sqlglot~=18.14.0",
     ],
     extras_require={
         "bigquery": [
@@ -130,6 +130,9 @@
             "snowflake-connector-python[pandas,secure-local-storage]",
             "pyarrow>=10.0.1,<10.1.0",
         ],
+        "trino": [
+            "trino",
+        ],
         "web": [
             "fastapi==0.100.0",
             "watchfiles>=0.19.0",

sqlmesh/core/audit/builtin.py

Lines changed: 6 additions & 3 deletions
@@ -57,9 +57,12 @@
 number_of_rows_audit = ModelAudit(
     name="number_of_rows",
     query="""
-SELECT 1
-FROM @this_model
-LIMIT @threshold + 1
+SELECT COUNT(*)
+FROM (
+  SELECT 1
+  FROM @this_model
+  LIMIT @threshold + 1
+)
 HAVING COUNT(*) <= @threshold
     """,
 )

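Reviewer note: the rewritten `number_of_rows` audit counts at most `@threshold + 1` rows in a subquery and returns a row only when that count is at or below the threshold. The sketch below illustrates that behaviour with Python's built-in `sqlite3`, assuming that a non-empty audit result means the audit failed. It is not the query from the commit: the `HAVING` clause is swapped for an equivalent outer `WHERE` so the example also runs on older SQLite versions. The table name `this_model` and the helper `rows_returned_by_audit` are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def rows_returned_by_audit(row_count: int, threshold: int) -> List[Tuple[int]]:
    """Run a number_of_rows-style check against a throwaway in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE this_model (id INTEGER)")
    conn.executemany(
        "INSERT INTO this_model VALUES (?)", [(i,) for i in range(row_count)]
    )
    # The inner LIMIT caps the scan at threshold + 1 rows; the outer filter
    # keeps the count only when it does not exceed the threshold.
    query = """
        SELECT cnt
        FROM (
          SELECT COUNT(*) AS cnt
          FROM (SELECT 1 FROM this_model LIMIT ?)
        )
        WHERE cnt <= ?
    """
    rows = conn.execute(query, (threshold + 1, threshold)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(rows_returned_by_audit(row_count=3, threshold=5))   # too few rows
    print(rows_returned_by_audit(row_count=10, threshold=5))  # enough rows
```

With 3 rows and a threshold of 5 the query returns one row holding the count, while with 10 rows the capped count is 6 and nothing is returned.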