Commit ea99f85

docs: update pipeline index config docs (#2451)
1 parent 31e5a34 commit ea99f85

2 files changed

Lines changed: 101 additions & 6 deletions

docs/reference/pipeline/pipeline-config.md

Lines changed: 51 additions & 3 deletions
@@ -1071,7 +1071,28 @@ GreptimeDB supports the following four types of index for fields:
 - `timestamp`: Specifies a column as a timestamp index column.
 - `inverted`: Specifies a column to use the inverted index type.
 - `fulltext`: Specifies a column to use the fulltext index type. The column must be of string type.
-- `skipping`: Specifies a column to use the skipping index type. The column must be of string type.
+- `skipping`: Specifies a column to use the skipping index type.
+
+The `index` field supports both a shorthand string form and a detailed object form:
+
+```yaml
+index: fulltext
+
+index:
+  type: fulltext
+  options:
+    analyzer: Chinese
+    case_sensitive: true
+    backend: bloom
+    granularity: 2048
+    false_positive_rate: 0.02
+```
+
+The shorthand string form remains supported and uses the default index options.
+In the object form, `type` is required and `options` is optional.
+Only `fulltext` and `skipping` support `options`.
+Option names and validation rules are the same as the corresponding SQL index options described in [Data Index](/user-guide/manage-data/data-index.md).
+Each option value must be a scalar YAML value.
 
 When `index` field is not provided, GreptimeDB doesn't create index on the column.
 
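The rules this hunk adds for the `index` field (shorthand versus object form, `type` required, `options` limited to `fulltext` and `skipping`, scalar option values) can be sketched in Python. This is an illustration only, not GreptimeDB's implementation: `normalize_index` and its error messages are hypothetical, and option names are not checked here because those rules are defined by the SQL index options.

```python
from typing import Any, Optional

# Index types listed in the docs; per the new text, only two accept `options`.
INDEX_TYPES = {"timestamp", "inverted", "fulltext", "skipping"}
TYPES_WITH_OPTIONS = {"fulltext", "skipping"}


def normalize_index(spec: Any) -> Optional[dict]:
    """Hypothetical helper: reduce either `index` form to {"type", "options"}."""
    if spec is None:
        return None  # no `index` field, so no index is created
    if isinstance(spec, str):
        spec = {"type": spec}  # shorthand form: default options
    if not isinstance(spec, dict) or "type" not in spec:
        raise ValueError("object form requires `type`")
    index_type = spec["type"]
    if not isinstance(index_type, str) or index_type not in INDEX_TYPES:
        raise ValueError(f"unknown index type: {index_type!r}")
    options = spec.get("options") or {}
    if not isinstance(options, dict):
        raise ValueError("`options` must be a mapping")
    if options and index_type not in TYPES_WITH_OPTIONS:
        raise ValueError(f"`{index_type}` does not support `options`")
    for name, value in options.items():
        # Scalars only: nested mappings and sequences are rejected.
        if isinstance(value, (dict, list)):
            raise ValueError(f"option `{name}` must be a scalar")
    return {"type": index_type, "options": dict(options)}
```

Under these assumptions, `normalize_index("fulltext")` and `normalize_index({"type": "fulltext"})` produce the same result, which mirrors the statement that the shorthand form uses the default options.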

@@ -1087,11 +1108,38 @@ Specify which field uses the inverted index. Refer to the [Transform Example](#t
 
 #### The Fulltext Index
 
-Specify which field will be used for full-text search using `index: fulltext`. This index greatly improves the performance of [log search](/user-guide/logs/fulltext-search.md). Refer to the [Transform Example](#transform-example) below for syntax.
+Specify which field will be used for full-text search using `index: fulltext`. This index greatly improves the performance of [log search](/user-guide/logs/fulltext-search.md).
+Use the detailed form when you need to set fulltext index options:
+
+```yaml
+- field: message
+  type: string
+  index:
+    type: fulltext
+    options:
+      analyzer: Chinese
+      case_sensitive: true
+      backend: bloom
+      granularity: 2048
+      false_positive_rate: 0.02
+```
 
 #### The Skipping Index
 
-Specify which field uses the skipping index. This index speeds up the query on high cardinality fields but consumes far less storage for building index files. Refer to the [Transform Example](#transform-example) below for syntax.
+Specify which field uses the skipping index. This index speeds up the query on high cardinality fields but consumes far less storage for building index files.
+Unlike `fulltext`, `skipping` is not limited to string columns.
+Use the detailed form when you need to set skipping index options:
+
+```yaml
+- field: trace_id
+  type: int64
+  index:
+    type: skipping
+    options:
+      granularity: 1024
+      false_positive_rate: 0.05
+      type: BLOOM
+```
 
 ### The `tag` field
 

i18n/zh/docusaurus-plugin-content-docs/current/reference/pipeline/pipeline-config.md

Lines changed: 50 additions & 3 deletions
@@ -1087,8 +1087,28 @@ GreptimeDB supports the following four index types for fields:
 - `timestamp`: Specifies that a column is the time index column
 - `inverted`: Specifies that a column uses the inverted index type
 - `fulltext`: Specifies that a column uses the fulltext index type; the column must be of string type
-- `skipping`: Specifies that a column uses the skipping index type; the column must be of string type
+- `skipping`: Specifies that a column uses the skipping index type
 
+The `index` field supports both a string shorthand and an object form:
+
+```yaml
+index: fulltext
+
+index:
+  type: fulltext
+  options:
+    analyzer: Chinese
+    case_sensitive: true
+    backend: bloom
+    granularity: 2048
+    false_positive_rate: 0.02
+```
+
+The string shorthand remains compatible and uses the default configuration of that index type.
+When using the object form, `type` is required and `options` is optional.
+Only `fulltext` and `skipping` support `options`.
+The names and validation rules of `options` are consistent with the corresponding SQL index options in [Data Index](/user-guide/manage-data/data-index.md).
+Each option value must be a YAML scalar value.
 
 When the `index` field is not provided, GreptimeDB does not create an index on the field.
 

@@ -1104,11 +1124,38 @@ GreptimeDB supports the following four index types for fields:
 
 #### Fulltext Index
 
-Use `index: fulltext` to specify which column gets a fulltext index. This index greatly improves the performance of [log search](/user-guide/logs/fulltext-search.md). For the syntax, refer to the [Transform Example](#transform-示例) below.
+Use `index: fulltext` to specify which column gets a fulltext index. This index greatly improves the performance of [log search](/user-guide/logs/fulltext-search.md).
+To set fulltext index options, use the object form:
+
+```yaml
+- field: message
+  type: string
+  index:
+    type: fulltext
+    options:
+      analyzer: Chinese
+      case_sensitive: true
+      backend: bloom
+      granularity: 2048
+      false_positive_rate: 0.02
+```
 
 #### Skipping Index
 
-Use `index: skipping` to specify which column gets a skipping index. This index speeds up queries on high-cardinality columns with index files that need only a small amount of storage. For the syntax, refer to the [Transform Example](#transform-示例) below.
+Use `index: skipping` to specify which column gets a skipping index. This index speeds up queries on high-cardinality columns with index files that need only a small amount of storage.
+Unlike `fulltext`, `skipping` is not limited to string columns.
+To set skipping index options, use the object form:
+
+```yaml
+- field: trace_id
+  type: int64
+  index:
+    type: skipping
+    options:
+      granularity: 1024
+      false_positive_rate: 0.05
+      type: BLOOM
+```
 
 ### The `tag` field
 
