Commit 5123e2a

docs(retrieval): note pg_search configurable tokenizer in BM25 backends table (#1790)
* docs(retrieval): note pg_search configurable tokenizer in BM25 backends table
1 parent 7e0afff commit 5123e2a

2 files changed

Lines changed: 2 additions & 2 deletions

hindsight-docs/docs/developer/retrieval.md

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ No single search method handles all these well. Hindsight solves this with **TEM
 | `vchord` | `vchord_bm25` extension | No |
 | `pg_textsearch` | Timescale `pg_textsearch` extension | No |
 | `pgroonga` | PGroonga (Groonga) full-text extension, `TokenBigram` polyglot tokenizer | No |
-| `pg_search` | ParadeDB `pg_search` extension | Yes |
+| `pg_search` | ParadeDB `pg_search` extension, configurable tokenizer (e.g. `jieba`, `chinese_compatible`, `ngram`) via `HINDSIGHT_API_TEXT_SEARCH_EXTENSION_PG_SEARCH_TOKENIZER` | Yes |
 
 If you need true BM25 ranking on a horizontally scaled Postgres (Citus) cluster,
 `pg_search` is the only option. See the [`pg_search` docker-compose example](https://github.com/vectorize-io/hindsight/tree/main/docker/docker-compose/pg_search).

skills/hindsight-docs/references/developer/retrieval.md

Lines changed: 1 addition & 1 deletion
@@ -72,7 +72,7 @@ No single search method handles all these well. Hindsight solves this with **TEM
 | `vchord` | `vchord_bm25` extension | No |
 | `pg_textsearch` | Timescale `pg_textsearch` extension | No |
 | `pgroonga` | PGroonga (Groonga) full-text extension, `TokenBigram` polyglot tokenizer | No |
-| `pg_search` | ParadeDB `pg_search` extension | Yes |
+| `pg_search` | ParadeDB `pg_search` extension, configurable tokenizer (e.g. `jieba`, `chinese_compatible`, `ngram`) via `HINDSIGHT_API_TEXT_SEARCH_EXTENSION_PG_SEARCH_TOKENIZER` | Yes |
 
 If you need true BM25 ranking on a horizontally scaled Postgres (Citus) cluster,
 `pg_search` is the only option. See the [`pg_search` docker-compose example](https://github.com/vectorize-io/hindsight/tree/main/docker/docker-compose/pg_search).
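
The changed `pg_search` row names an environment variable for the tokenizer. As a rough illustration only, a deployment or pre-flight script could read that variable as sketched below. The helper names `read_pg_search_tokenizer` and `is_documented_example` are hypothetical, and the set of names is limited to the three examples in the table row, which the "e.g." marks as incomplete. This is not Hindsight's own parsing logic, and the diff does not say what value applies when the variable is unset, so the sketch returns `None` in that case.

```python
import os

# Environment variable name as it appears in the changed table row.
TOKENIZER_ENV_VAR = "HINDSIGHT_API_TEXT_SEARCH_EXTENSION_PG_SEARCH_TOKENIZER"

# Only the tokenizers the table row lists as examples ("e.g."); this set is
# illustrative and should not be read as the full list of supported values.
EXAMPLE_TOKENIZERS = frozenset({"jieba", "chinese_compatible", "ngram"})


def read_pg_search_tokenizer(environ=None):
    """Hypothetical helper: return the configured tokenizer name, or None if unset.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if environ is None else environ
    value = env.get(TOKENIZER_ENV_VAR, "").strip()
    return value or None


def is_documented_example(tokenizer):
    """Hypothetical helper: True if the name is one of the three listed examples."""
    return tokenizer in EXAMPLE_TOKENIZERS


if __name__ == "__main__":
    name = read_pg_search_tokenizer({TOKENIZER_ENV_VAR: "jieba"})
    print(name, is_documented_example(name))
```

A name outside the example set is not necessarily invalid, so a script built on this sketch should warn rather than fail when `is_documented_example` returns `False`.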
