Commit d7f20d7

BE-622: Move semantic search embedding generation into the graph
Add a hash-graph-embeddings crate exposing a provider-agnostic EmbeddingGenerator trait and an OpenAI-backed client (reqwest + reqwest-retry). The graph now resolves a search request's semanticString to an embedding itself, configured via HASH_GRAPH_OPENAI_API_KEY and wired in like the Temporal client, removing the Temporal roundtrip from the Node SDK search path.

The /entities/search and /entity-types/search endpoints accept embedding xor semanticString; the Node SDK searchEntities/searchEntityTypes forward semanticString directly and no longer take a temporalClient. Provider failures are classified (auth/rate-limit/outage) and mapped to appropriate HTTP statuses rather than a blanket 500, a caller-supplied embedding is validated against Embedding::DIM, and startup logs whether semantic search is enabled.

Regenerate the OpenAPI spec, document the sync:turborepo task in AGENTS.md, and wire HASH_GRAPH_OPENAI_API_KEY into the compose graph service.
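
The request-handling rules described above (embedding xor semanticString, dimension validation, classified provider failures) can be sketched roughly as follows. This is a minimal, std-only illustration and not the code from this commit: the names `SearchQuery`, `RequestError`, `ProviderFailure`, `resolve_query` and `status_for` are hypothetical, `DIM` is a placeholder for `Embedding::DIM`, and the concrete HTTP status codes are assumptions, since the commit message only says failures map to "appropriate" statuses.

```rust
/// Placeholder width for illustration only; the real value is `Embedding::DIM`.
const DIM: usize = 4;

/// The two mutually exclusive ways a search request can supply its query.
#[derive(Debug, PartialEq)]
enum SearchQuery {
    /// Caller already computed the vector.
    Embedding(Vec<f32>),
    /// The graph must turn this text into a vector itself.
    SemanticString(String),
}

/// Hypothetical request-validation errors (all would be client errors).
#[derive(Debug, PartialEq)]
enum RequestError {
    BothProvided,
    NeitherProvided,
    WrongDimension { expected: usize, actual: usize },
}

/// Failure classes named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProviderFailure {
    Auth,
    RateLimit,
    Outage,
}

/// Enforce "embedding xor semanticString" and check the vector length.
fn resolve_query(
    embedding: Option<Vec<f32>>,
    semantic_string: Option<String>,
) -> Result<SearchQuery, RequestError> {
    match (embedding, semantic_string) {
        (Some(_), Some(_)) => Err(RequestError::BothProvided),
        (None, None) => Err(RequestError::NeitherProvided),
        (Some(vector), None) if vector.len() != DIM => Err(RequestError::WrongDimension {
            expected: DIM,
            actual: vector.len(),
        }),
        (Some(vector), None) => Ok(SearchQuery::Embedding(vector)),
        (None, Some(text)) => Ok(SearchQuery::SemanticString(text)),
    }
}

/// Assumed mapping from provider failure class to HTTP status code.
fn status_for(failure: ProviderFailure) -> u16 {
    match failure {
        // Assumption: a rejected API key is a server-side misconfiguration.
        ProviderFailure::Auth => 502,
        // Assumption: pass the upstream rate limit through to the caller.
        ProviderFailure::RateLimit => 429,
        // Assumption: an upstream outage means the service is unavailable.
        ProviderFailure::Outage => 503,
    }
}
```

Under these assumptions, `resolve_query(None, Some("cats".to_owned()))` yields the `SemanticString` variant, which the graph would then pass to its embedding generator, while a request carrying both fields or neither is rejected before any provider call is made.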
1 parent a162311 commit d7f20d7

27 files changed

Lines changed: 1339 additions & 135 deletions

.clippy.toml

Lines changed: 1 addition & 1 deletion
@@ -3,5 +3,5 @@ allow-indexing-slicing-in-tests = true
 allow-print-in-tests = true
 allow-renamed-params-for = ["core::fmt::Debug", "core::fmt::Display", "core::fmt::LowerHex", "core::fmt::UpperHex", "core::fmt::Pointer", "futures_sink::Sink", "serde::de::Visitor", ".."]
 avoid-breaking-exported-api = false
-doc-valid-idents = ["BlockProtocol", "HaRPC", "HashQL", "OpenAPI", "PostgreSQL", "OAuth2", ".."]
+doc-valid-idents = ["BlockProtocol", "HaRPC", "HashQL", "OpenAI", "OpenAPI", "PostgreSQL", "OAuth2", ".."]
 suppress-restriction-lint-in-const = true

AGENTS.md

Lines changed: 10 additions & 0 deletions
@@ -81,6 +81,16 @@ cargo clippy --all-features --package <package-name>
 
 For Rust packages, you can add features as needed with `--all-features`, specific features like `--features=foo,bar`, or use `cargo-hack` with `--feature-powerset` for comprehensive feature testing.
 
+### Monorepo wiring for Rust crates
+
+Each Rust crate has a `package.json` whose **identity and workspace-dependency wiring** — its `@rust/<name>` name, version, and the `dependencies` mirroring its `Cargo.toml` — is generated from `Cargo.toml`. After **adding, removing, or renaming a Rust crate**, or changing its `Cargo.toml` dependencies, re-sync that wiring:
+
+```bash
+mise run sync:turborepo # sync package.json identity + deps from Cargo.toml metadata
+```
+
+`sync:turborepo` only manages that generated wiring — the `scripts` section is hand-maintained and is used by CI and Turborepo (e.g. `test:unit`, `lint:clippy`, `doc:dependency-diagram`), so add or edit scripts by hand. The task wraps the `repo-chores` CLI; the equivalent direct invocation is `cargo run --package hash-repo-chores --bin repo-chores-cli -- sync-turborepo`. A related task, `mise run fix:package-json`, sorts `package.json` keys consistently.
+
 ## Documentation Maintenance
 
 ### Petrinaut user-facing docs

Cargo.lock

Lines changed: 79 additions & 24 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,7 @@ members = [
 "libs/@local/effect-dns/hickory",
 "libs/@local/graph/api",
 "libs/@local/graph/authorization",
+"libs/@local/graph/embeddings",
 "libs/@local/graph/migrations",
 "libs/@local/graph/migrations-macros",
 "libs/@local/graph/postgres-store",
@@ -75,6 +76,7 @@ hash-codec.path = "libs/@local/codec"
 hash-codegen.path = "libs/@local/codegen"
 hash-graph-api.path = "libs/@local/graph/api"
 hash-graph-authorization.path = "libs/@local/graph/authorization"
+hash-graph-embeddings.path = "libs/@local/graph/embeddings"
 hash-graph-migrations.path = "libs/@local/graph/migrations"
 hash-graph-migrations-macros.path = "libs/@local/graph/migrations-macros"
 hash-graph-postgres-store.path = "libs/@local/graph/postgres-store"
@@ -228,6 +230,7 @@ refinery = { version = "0.8.16", default-features = fa
 regex = { version = "1.11.2", default-features = false, features = ["perf", "unicode"] }
 reqwest = { version = "0.13.0", default-features = false, features = ["json", "rustls"] }
 reqwest-middleware = { version = "0.5.0", default-features = false }
+reqwest-retry = { version = "0.9.1", default-features = false }
 reqwest-tracing = { version = "0.7.0", default-features = false }
 roaring = { version = "0.11.2", default-features = false }
 rpds = { version = "1.1.2", default-features = false }

apps/hash-graph/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ harpc-server = { workspace = true }
 hash-codec = { workspace = true }
 hash-graph-api = { workspace = true, features = ["clap"] }
 hash-graph-authorization = { workspace = true }
+hash-graph-embeddings = { workspace = true }
 hash-graph-postgres-store = { workspace = true, features = ["clap"] }
 hash-graph-store = { workspace = true }
 hash-graph-type-fetcher = { workspace = true }

apps/hash-graph/package.json

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@
 "@rust/hash-codec": "workspace:*",
 "@rust/hash-graph-api": "workspace:*",
 "@rust/hash-graph-authorization": "workspace:*",
+"@rust/hash-graph-embeddings": "workspace:*",
 "@rust/hash-graph-postgres-store": "workspace:*",
 "@rust/hash-graph-store": "workspace:*",
 "@rust/hash-graph-type-fetcher": "workspace:*",
