Commit 30790be
zxqfd555 authored and Manul from Pathway committed
weaviate output connector (#10464)
GitOrigin-RevId: 3c8b608d43c5c2a512f645f3642ca34a92a7f5fc
1 parent c1c98f3 commit 30790be

13 files changed

Lines changed: 1176 additions & 3 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
 ## [Unreleased]
 
 ### Added
+- `pw.io.weaviate.write` writes a Pathway table to a [Weaviate](https://weaviate.io/) collection, keeping it in sync with the table: additions and updates upsert objects (keyed by a UUID derived from the required `primary_key`), and deletions remove them. An optional `vector` column is stored as the object's embedding; the remaining columns become object properties. The target collection must already exist. The connector writes in parallel across workers (`pathway spawn -n`), with `batch_size` and `concurrency` to tune throughput, and supports `api_key`/`headers` authentication for self-hosted and Weaviate Cloud deployments.
 - `pw.io.pinecone.write` writes a Pathway table to a [Pinecone](https://www.pinecone.io/) index for use as a vector store in RAG pipelines. It keeps the index in sync with the current state of the table: a row is upserted under its record id and a row removed from the table is deleted from the index. By default the id is the table's internal row key (unique, and written in parallel across workers); pass a `primary_key` column to use your own ids instead, in which case the values must uniquely identify rows (a collision raises an error) and writing runs on a single worker. The `vector` column holds the dense embedding (`list[float]` or a 1-D `numpy.ndarray`); the remaining columns — or just those listed in `metadata_columns` — are stored as record metadata. Targets both Pinecone cloud and the local Pinecone emulator via the `host` parameter; the index must already exist with a matching dimension.
 
 ### Changed

Cargo.lock

Lines changed: 7 additions & 0 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -120,7 +120,7 @@ tokio-util = { version = "0.7", features = ["compat"] }
 tonic = { version = "0.13.1", features = ["tls-native-roots"] }
 typetag = "0.2.21"
 usearch = "2.15.3"
-uuid = { version = "1.17.0", features = ["v4"] }
+uuid = { version = "1.17.0", features = ["v4", "v5"] }
 xxhash-rust = { version = "0.8.15", features = ["xxh3"] }
 
 [features]
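
The changelog entry says each Weaviate object is keyed by a UUID derived from `primary_key`, and the `Cargo.toml` change enables the `uuid` crate's `v5` (name-based, SHA-1) feature, presumably for that derivation. The connector's Rust code is not part of this excerpt, so the following is only a minimal Python sketch of the general idea: a name-based UUID maps the same key to the same id every time, which is what makes repeated writes behave as upserts. The namespace and the helper name `object_id_for` are assumptions chosen for illustration, not the connector's actual values.

```python
import uuid

# Hypothetical namespace: the namespace the connector really uses
# is not shown in this commit excerpt.
EXAMPLE_NAMESPACE = uuid.NAMESPACE_OID


def object_id_for(primary_key: str) -> uuid.UUID:
    """Derive a deterministic (version 5) UUID from a primary key value."""
    return uuid.uuid5(EXAMPLE_NAMESPACE, primary_key)


# The same key always yields the same id, so an update to a row
# targets the object created by the original insert.
first = object_id_for("doc-42")
second = object_id_for("doc-42")

# A different key yields a different id.
other = object_id_for("doc-43")
```

Because the id depends only on the key, no lookup table between primary keys and object ids is needed, and any worker can compute the id for a row independently.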

docs/2.developers/4.user-guide/20.connect/30.connectors-overview.md

Lines changed: 1 addition & 0 deletions
@@ -101,6 +101,7 @@ Before going into more details about the different connectors and how they work,
 <span class="block"><a href="/developers/user-guide/connect/connectors/switching-to-redpanda">Redpanda</a></span>
 <span class="block"><a href="/developers/user-guide/connect/connectors/slack_send_alerts">Slack</a></span>
 <span class="block"><a href="/developers/api-docs/pathway-io/sqlite">SQLite</a></span>
+<span class="block"><a href="/developers/api-docs/pathway-io/weaviate">Weaviate</a></span>
 </td>
 <td class="text-center !align-middle">
 <span class="block"><a href="/developers/user-guide/connect/connectors/csv_connectors">CSV</a></span>

integration_tests/db_connectors/conftest.py

Lines changed: 8 additions & 0 deletions
@@ -20,6 +20,7 @@
     PostgresContext,
     PostgresWithTlsContext,
     QuestDBContext,
+    WeaviateContext,
     clickhouse_concurrency_slot,
     elasticsearch_concurrency_slot,
     mongodb_concurrency_slot,
@@ -279,3 +280,10 @@ def milvus(tmp_path):
     ctx = MilvusContext(str(tmp_path / "milvus.db"))
     yield ctx
     ctx.close()
+
+
+@pytest.fixture
+def weaviate():
+    ctx = WeaviateContext()
+    yield ctx
+    ctx.cleanup()