docs/en/docs/lakehouse/multi-catalog/dlf.md
75 additions & 3 deletions
@@ -1,6 +1,6 @@
1
1
---
2
2
{
3
-
"title": "Aliyun DLF",
3
+
"title": "Alibaba Cloud DLF",
4
4
"language": "en"
5
5
}
6
6
---
@@ -25,7 +25,79 @@ under the License.
25
25
-->
26
26
27
27
28
-
# Aliyun DLF
28
+
# Alibaba Cloud DLF
29
+
30
+
Data Lake Formation (DLF) is the unified metadata management service of Alibaba Cloud. It is compatible with the Hive Metastore protocol.
31
+
32
+
> [What is DLF](https://www.alibabacloud.com/product/datalake-formation)
33
+
34
+
Doris can access DLF the same way as it accesses Hive Metastore.
35
+
36
+
## Connect to DLF
37
+
38
+
1. Create `hive-site.xml`
39
+
40
+
Create the `hive-site.xml` file, and put it in the `fe/conf` directory.
41
+
42
+
```
<?xml version="1.0"?>
<configuration>
    <!--Set to use dlf client-->
    <property>
        <name>hive.metastore.type</name>
        <value>dlf</value>
    </property>
    <property>
        <name>dlf.catalog.endpoint</name>
        <value>dlf-vpc.cn-beijing.aliyuncs.com</value>
    </property>
    <property>
        <name>dlf.catalog.region</name>
        <value>cn-beijing</value>
    </property>
    <property>
        <name>dlf.catalog.proxyMode</name>
        <value>DLF_ONLY</value>
    </property>
    <property>
        <name>dlf.catalog.uid</name>
        <value>20000000000000000</value>
    </property>
    <property>
        <name>dlf.catalog.accessKeyId</name>
        <value>XXXXXXXXXXXXXXX</value>
    </property>
    <property>
        <name>dlf.catalog.accessKeySecret</name>
        <value>XXXXXXXXXXXXXXXXX</value>
    </property>
</configuration>
```
76
+
77
+
* `dlf.catalog.endpoint`: DLF endpoint. See [Regions and Endpoints of DLF](https://www.alibabacloud.com/help/en/data-lake-formation/latest/regions-and-endpoints).
* `dlf.catalog.region`: DLF region. See [Regions and Endpoints of DLF](https://www.alibabacloud.com/help/en/data-lake-formation/latest/regions-and-endpoints).
* `dlf.catalog.uid`: Alibaba Cloud account ID. You can find the "Account ID" in the upper right corner of the Alibaba Cloud console.
* `dlf.catalog.accessKeyId`: AccessKey ID, which you can create and manage on the [Alibaba Cloud console](https://ram.console.aliyun.com/manage/ak).
* `dlf.catalog.accessKeySecret`: AccessKey secret, which you can create and manage on the [Alibaba Cloud console](https://ram.console.aliyun.com/manage/ak).
82
+
83
+
Other configuration items are fixed and require no modifications.
84
+
85
+
2. Restart FE, and create a Catalog with the `CREATE CATALOG` statement.
86
+
87
+
Doris will read and parse `fe/conf/hive-site.xml`.
88
+
89
+
```sql
CREATE CATALOG hive_with_dlf PROPERTIES (
    "type"="hms",
    "hive.metastore.uris"="thrift://127.0.0.1:9083"
);
```
95
+
96
+
`type` must always be `hms`. The value of `hive.metastore.uris` is not actually used, so it can be arbitrary, but it must still follow the format of a Hive Metastore Thrift URI.
97
+
98
+
After the above steps, you can access metadata in DLF the same way as you access Hive Metastore.
99
+
100
+
Doris supports accessing Hive/Iceberg/Hudi metadata in DLF.
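
As a rough sketch of what access looks like once the Catalog exists, the statements below browse and query the `hive_with_dlf` Catalog created above. The database name `dlf_db` and table name `dlf_tbl` are hypothetical placeholders, not objects defined by this document.

```sql
-- List all Catalogs; hive_with_dlf should be listed here.
SHOW CATALOGS;

-- Switch the session to the DLF-backed Catalog.
SWITCH hive_with_dlf;

-- Browse DLF metadata the same way as Hive Metastore metadata.
SHOW DATABASES;
USE dlf_db;      -- hypothetical database name
SHOW TABLES;

-- Query a table by its fully qualified name (catalog.database.table).
SELECT * FROM hive_with_dlf.dlf_db.dlf_tbl LIMIT 10;  -- hypothetical table name
```

Using the fully qualified name avoids depending on the session's current Catalog and database.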
docs/en/docs/lakehouse/multi-catalog/hive.md
146 additions & 1 deletion
@@ -26,4 +26,149 @@ under the License.
26
26
27
27
# Hive
28
28
29
-
TODO: translate
29
+
Once Doris is connected to Hive Metastore, or to a metadata service compatible with Hive Metastore, it can access the databases and tables in Hive and query them.
30
+
31
+
Besides Hive, many other systems, such as Iceberg and Hudi, use Hive Metastore to keep their metadata. Thus, Doris can also access these systems via Hive Catalog.
32
+
33
+
## Usage
34
+
35
+
When connecting to Hive, Doris:
36
+
37
+
1. Supports Hive version 1/2/3;
38
+
2. Supports both Managed Table and External Table;
39
+
3. Can identify metadata of Hive, Iceberg, and Hudi stored in Hive Metastore;
40
+
4. Supports Hive tables with data stored in JuiceFS, which can be used the same way as normal Hive tables (put `juicefs-hadoop-x.x.x.jar` in `fe/lib/` and `apache_hdfs_broker/lib/`).
In Doris 1.2.1 and newer, you can create a Resource that contains all these parameters, and reuse the Resource when creating new Catalogs. Here is an example:
# 2. Create Catalog and use an existing Resource. The keys and values given here override the corresponding entries in the Resource.
CREATE CATALOG hive WITH RESOURCE hms_resource PROPERTIES(
    'key'='value'
);
135
+
```
136
+
137
+
You can also put the `hive-site.xml` file in the `conf` directories of FE and BE so that Doris reads its settings automatically. When the same setting is defined in more than one place, the following rules apply:
138
+
139
+
140
+
* Information in Resource overrides that in `hive-site.xml`.
141
+
* Information in `CREATE CATALOG PROPERTIES` overrides that in Resource.
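
To illustrate the precedence rules, here is a minimal sketch. It assumes a Resource named `hms_resource` already exists and defines `hive.metastore.uris`. The Catalog name `hive_override` and the Metastore address are placeholders chosen for illustration.

```sql
-- Assumes hms_resource already exists and sets 'hive.metastore.uris'.
-- The value given here takes precedence over the one in the Resource,
-- which in turn takes precedence over the one in hive-site.xml.
CREATE CATALOG hive_override WITH RESOURCE hms_resource PROPERTIES (
    'hive.metastore.uris' = 'thrift://127.0.0.1:9083'  -- placeholder address
);
```

Settings not repeated in `PROPERTIES` still come from the Resource, so only the values that differ need to be listed.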
142
+
143
+
### Hive Versions
144
+
145
+
Doris can access Hive Metastore in all Hive versions. By default, Doris uses the interface compatible with Hive 2.3 to access Hive Metastore. You can specify a certain Hive version when creating Catalogs, for example:
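
A minimal sketch of such a statement, assuming the version is passed through the `hive.version` Catalog property. The Metastore address and the version value are placeholders.

```sql
CREATE CATALOG hive PROPERTIES (
    'type' = 'hms',
    'hive.metastore.uris' = 'thrift://127.0.0.1:9083',  -- placeholder address
    'hive.version' = '1.1.0'                            -- placeholder version of the target Hive deployment
);
```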
docs/en/docs/lakehouse/multi-catalog/hudi.md
25 additions & 1 deletion
@@ -27,4 +27,28 @@ under the License.
27
27
28
28
# Hudi
29
29
30
-
TODO: translate
30
+
## Usage
31
+
32
+
1. Currently, Doris supports Snapshot Query on Copy-on-Write Hudi tables and Read Optimized Query on Merge-on-Read tables. In the future, it will support Snapshot Query on Merge-on-Read tables and Incremental Query.
33
+
2. Currently, Doris can access Hudi only through Hive Metastore Catalogs. The usage is basically the same as that of Hive Catalogs. More types of Catalogs will be supported in future versions.
34
+
35
+
## Create Catalog
36
+
37
+
Same as creating Hive Catalogs. A simple example is provided here. See [Hive](./hive) for more information.
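
As a sketch only, and assuming a Hudi Catalog takes the same properties as a Hive Catalog as stated above, the statement could look like this. The Catalog name `hudi` and the Metastore address are placeholders.

```sql
CREATE CATALOG hudi PROPERTIES (
    'type' = 'hms',                                    -- Hudi metadata is read through Hive Metastore
    'hive.metastore.uris' = 'thrift://127.0.0.1:9083'  -- placeholder address
);
```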
Same as that in Hive Catalogs. See the relevant section in [Hive](./hive).
58
+
59
+
## Time Travel
60
+
61
+
<version since="dev">
62
+
63
+
Doris supports reading a specified Snapshot of an Iceberg table.
64
+
65
+
</version>
66
+
67
+
Each write operation to an Iceberg table will generate a new Snapshot.
68
+
69
+
By default, a read request will only read the latest Snapshot.
70
+
71
+
You can read historical versions of a table with the `FOR TIME AS OF` or `FOR VERSION AS OF` clause, which select a Snapshot by the time at which it was generated or by its Snapshot ID, respectively. For example:
72
+
73
+
`SELECT * FROM iceberg_tbl FOR TIME AS OF "2022-10-07 17:20:37";`
74
+
75
+
`SELECT * FROM iceberg_tbl FOR VERSION AS OF 868895038966572;`
76
+
77
+
You can use the [iceberg_meta](https://doris.apache.org/docs/dev/sql-manual/sql-functions/table-functions/iceberg_meta/) table function to view the Snapshot details of the specified table.
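
A minimal sketch of such a call, assuming the function takes a fully qualified table name and a `query_type` of `snapshots`. The catalog, database, and table names are hypothetical.

```sql
SELECT * FROM iceberg_meta(
    "table" = "iceberg_ctl.iceberg_db.iceberg_tbl",  -- hypothetical catalog.database.table
    "query_type" = "snapshots"
);
```

A Snapshot ID returned by this query can then be used in a `FOR VERSION AS OF` clause.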