File based driver properties
Note: This feature is available in Enterprise and AWS editions only.
- General properties
- CSV driver properties
- XLSX driver properties
- JSON driver properties
- XML driver properties
- Parquet driver properties
File drivers allow you to interact with various file formats, like CSV, Excel, JSON, XML, and Parquet, as if they were databases. Each driver comes with customizable properties to control how data is read and processed.

General properties

| Property | Description | Default value |
|---|---|---|
| `defaultSchema` | Default schema name. | `default` |
| `firstRow` | First row to read. Row numbering starts at 1. | `1` |
| `rowCount` | Maximum number of rows to read. -1 means no limit. | `-1` |
| `scanSubfolders` | Scan subfolders for data files. | `true` |
| `subfolderNameSeparator` | Defines the separator used in subfolder names within schema names. | `__` |
| `internalDbBatchSize` | Internal database batch size. | `1000` |
| `internalDbTransactionSize` | A number of batches in a single transaction. | `10` |
| `useInternalDb` | Use internal database for complex queries. | `true` |
| `partitioningStrategy` | Defines how files are grouped into tables. Available values: `none`, `folder`, and `pattern`. | `none` |
| `partitioningRegex` | Regular expression used with `pattern` to extract the logical table name from file names. | `^(.+)[_-].*$` |

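
To make the interaction of `firstRow` and `rowCount` concrete, here is a minimal Python sketch of the row window these two properties describe. It only illustrates the documented semantics (1-based `firstRow`, `-1` meaning no limit) and is not the driver's implementation. The helper name `select_rows` is hypothetical, and whether a header row counts toward the numbering is not stated in this page, so the sketch ignores headers.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def select_rows(rows: Iterable[T], first_row: int = 1, row_count: int = -1) -> Iterator[T]:
    """Return the rows selected by firstRow / rowCount (hypothetical helper)."""
    if first_row < 1:
        raise ValueError("first_row is 1-based and must be >= 1")
    start = first_row - 1  # convert the 1-based row number to a 0-based offset
    # -1 means "no limit"; otherwise stop after row_count rows.
    stop = None if row_count == -1 else start + row_count
    return islice(rows, start, stop)


# Read two rows, starting at the second one.
print(list(select_rows(["r1", "r2", "r3", "r4"], first_row=2, row_count=2)))
```

With the defaults (`firstRow` = 1, `rowCount` = -1), the window covers every row.
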
Use partitioningStrategy to group multiple files of the same format into one logical table.
A logical table is a table shown by the driver in CloudBeaver. It can represent one file or a group of files, depending on the partitioning strategy. The source files aren’t physically merged. The driver reads matching files together and displays them as one table.
Available values:

- `none`: Each file is a separate table.
- `folder`: Files in the same folder are grouped into one table.
- `pattern`: Files are grouped into separate logical tables by the table name extracted from file names.

Use partitioningRegex with pattern to extract the logical table name. The first capture group defines the table
name. For each extracted name, the driver creates a separate logical table and reads matching files into that table.
Files with different extracted names are shown as separate tables.
Example: with the default regex `^(.+)[_-].*$`:

| Files | Logical table |
|---|---|
| `data_01.csv`, `data_02.csv` | `data` |
| `events-2023.json`, `events-2024.json` | `events` |

Note: Files with different formats aren’t combined into the same table. For example, `data_01.csv` and `data_02.parquet` aren’t grouped together, even if the regex returns the same logical table name.
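
The grouping rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the driver's code: the function `group_files` is hypothetical, the regex is assumed to be applied to the file name including its extension, and a file that does not match the regex is assumed to stay a table of its own. The file format is part of the grouping key, so files of different formats never share a table.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

DEFAULT_REGEX = r"^(.+)[_-].*$"  # default partitioningRegex from the table above


def group_files(paths, strategy="none", regex=DEFAULT_REGEX):
    """Map (format, logical table name) to the files read into that table."""
    pattern = re.compile(regex)
    tables = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        fmt = path.suffix.lstrip(".").lower()
        if strategy == "folder":
            name = path.parent.name  # all files in one folder share a table
        elif strategy == "pattern":
            match = pattern.match(path.name)
            # Assumption: a non-matching file keeps its own name.
            name = match.group(1) if match else path.stem
        else:  # "none": each file is a separate table
            name = path.stem
        tables[(fmt, name)].append(raw)
    return dict(tables)


files = ["data_01.csv", "data_02.csv", "events-2023.json",
         "events-2024.json", "data_02.parquet"]
for key, members in group_files(files, strategy="pattern").items():
    print(key, members)
```

Because the first capture group is greedy, a name such as `my_data_01.csv` yields `my_data` with the default regex: the group extends to the last `_` or `-` in the file name.
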

CSV driver properties

| Property | Description | Default value |
|---|---|---|
| `separator` | The delimiter to use for separating entries. | `,` |
| `escapeChar` | The character to use for escaping a separator or quote. | `\` |
| `quoteChar` | The character to use for quoted elements. | `"` |
| `strictQuotes` | Sets if characters outside the quotes are ignored. | `false` |
| `ignoreLeadingWhitespace` | If true, parser should ignore white space before a quote in a field. | `true` |
| `ignoreQuotations` | If true, treat quotations like any other character. | `false` |
| `nullFieldIndicator` | Which field content will be returned as null. | `NEITHER` |
| `trimWhitespaces` | If true, parser should trim whitespaces from the beginning and end of the field. | `true` |
| `header` | If true, the first row is treated as a header. | `true` |
| `sampleRows` | Number of rows to extract metadata from. | `5` |
| `wildcard` | Wildcard for file names. | `**.{csv,tcv,txt}` |

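
As a rough analogy only, the sketch below shows how a few of these properties affect parsing, using Python's standard `csv` module rather than the driver's own parser. The mapping of `separator`, `quoteChar`, `escapeChar`, and `ignoreLeadingWhitespace` to `csv.reader` arguments is an assumption made for illustration, and the helper `read_csv` is hypothetical. Edge-case behavior of the driver may differ.

```python
import csv
import io


def read_csv(text, separator=",", quote_char='"', escape_char="\\",
             ignore_leading_whitespace=True, header=True):
    """Parse CSV text; return (column names or None, data rows)."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=separator,                         # separator
        quotechar=quote_char,                        # quoteChar
        escapechar=escape_char,                      # escapeChar
        skipinitialspace=ignore_leading_whitespace,  # ignoreLeadingWhitespace (approximation)
    )
    rows = list(reader)
    if header and rows:
        # header=true: the first row names the columns.
        return rows[0], rows[1:]
    return None, rows


columns, data = read_csv('id;name\n1; "Smith; John"\n2;plain\n', separator=";")
print(columns)
print(data)
```

Here the quoted field keeps its embedded `;` because the quote character takes precedence over the separator.
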

XLSX driver properties

| Property | Description | Default value |
|---|---|---|
| `header` | Indicates if the file has a header. | `true` |
| `sampleRows` | Number of rows to extract metadata from. | `5` |
| `wildcard` | Wildcard for file names. | `**.{xlsx}` |
| `schemaNameMode` | Defines how the schema name is formed:<br>- `RELATIVE_DIR_PATH`: Uses the concatenated relative path of the directory, and the table name is `$file_name + _ + $sheet_name`.<br>- `RELATIVE_FILE_PATH`: Uses the relative file path to generate the schema name. Each sheet in the file becomes a table. | `RELATIVE_DIR_PATH` |

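
The two `schemaNameMode` values can be read as two ways of splitting a path and sheet into a schema name and a table name. The Python sketch below is one interpretation of the descriptions above, not the driver's code. It assumes that path segments are joined with the `subfolderNameSeparator` default (`__`), that `$file_name` excludes the extension, and that `RELATIVE_FILE_PATH` appends the file name to the directory segments. None of these details are confirmed by this page, and the helper `xlsx_names` is hypothetical.

```python
from pathlib import PurePosixPath


def xlsx_names(relative_path, sheet, mode="RELATIVE_DIR_PATH", separator="__"):
    """Return (schema name, table name) for one sheet of an XLSX file."""
    path = PurePosixPath(relative_path)
    dir_parts = path.parent.parts  # directory segments relative to the root folder
    if mode == "RELATIVE_DIR_PATH":
        # Schema comes from the directory; the table combines file and sheet.
        schema = separator.join(dir_parts)
        table = f"{path.stem}_{sheet}"
    elif mode == "RELATIVE_FILE_PATH":
        # Schema comes from the file path (assumed without extension);
        # each sheet is its own table.
        schema = separator.join(dir_parts + (path.stem,))
        table = sheet
    else:
        raise ValueError(f"unknown schemaNameMode: {mode}")
    return schema, table


print(xlsx_names("reports/2024/sales.xlsx", "Q1"))
print(xlsx_names("reports/2024/sales.xlsx", "Q1", mode="RELATIVE_FILE_PATH"))
```
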

JSON driver properties

| Property | Description | Default value |
|---|---|---|
| `sampleRows` | Number of rows to extract metadata from. | `5` |
| `wildcard` | Wildcard for file names. | `**.{json}` |


XML driver properties

| Property | Description | Default value |
|---|---|---|
| `sampleRows` | Number of rows to extract metadata from. | `5` |
| `wildcard` | Wildcard for file names. | `**.{xml}` |


Parquet driver properties

| Property | Description | Default value |
|---|---|---|
| `tmpFolder` | Temporary folder for storing downloaded Parquet files from Cloud Storages. | |
| `wildcard` | Wildcard for file names. | `**.{parquet}` |