Feature parity: extensions needed to replace magic-indexer for certified-app
We're evaluating moving certified-app from magic-indexer onto upstream hyperindex (filing here because issues are disabled on hyperindex-v2, and v2 forks from this repo via GainForest/hyperindex). The three repos share the same architecture (Go AT Protocol AppView, Tap/Jetstream ingestion, lexicon-driven GraphQL schema), but magic-indexer carries a set of consumer-facing extensions that certified-app depends on and that upstream hyperindex / hyperindex-v2 do not. This issue catalogs them so they can be upstreamed.
Updated to cover the full consumer surface certified-app actually calls — most notably the unified follower feed, the endorsement-graph query, and the profile/identity denormalisation cluster (the avatar/identity epic), which the earlier draft only touched via the award issuer field.
What works out-of-the-box on hyperindex-v2
Once the app.certified.* and org.hypercerts.* lexicons are registered, the generic schema surface already handles:
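totalCount on every connection root (used for the /welcome network-counts strip)
where: { type: { eq } }, where: { subject: { eq } }, where: { did: { in } } (used for project / follower / badge-definition lookups)
first / after / edges{cursor,node} / pageInfo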
So appCertifiedActorProfile, appCertifiedActorOrganization, orgHypercertsCollection, appCertifiedGraphFollow, and appCertifiedBadgeDefinition are essentially free.
What's missing
1. Unified follower activity feed (followerEvents)
certified-app's home feed (src/app/api/indexer/route.ts, op FollowerEvents) is not a single-collection query — it's a unified, time-ordered feed across the viewer's follow set, spanning 8 record kinds (cert / collection / evaluation / measurement / hyperboard / record-update / endorsement award / the folded project.created_with_cert).
This is not derivable from any single lexicon's auto-generated connection.
Recommended implementation:
A followerEvents root that UNION ALLs the per-kind record tables for did IN (authors), ordered by the chosen sort key, with kinds narrowing the set of unioned tables.
Two sort modes — feed-lw (last-writer sort_at) and created_at — encoded in the cursor (v1:feed-lw:<sort_at> / v1:created_at:<created_at>); reject a cursor minted under a different sort mode with INVALID_CURSOR (otherwise paging is silently inconsistent).
Project+cert fold: a cert.create + collection.create pair from the same author within a 60s window, linked via the collection's items[], collapses into one synthetic project.created_with_cert event — so the feed doesn't render a cert and its wrapper collection as two adjacent rows.
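The cursor rule above can be sketched in a few lines of Go. This is a minimal illustration, not magic-indexer's code: the names SortMode, EncodeCursor, DecodeCursor and ErrInvalidCursor are hypothetical, and only the v1:<mode>:<value> layout and the INVALID_CURSOR code come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SortMode is one of the two sort modes named in the cursor format.
type SortMode string

const (
	SortFeedLW    SortMode = "feed-lw"    // last-writer sort_at
	SortCreatedAt SortMode = "created_at" // record created_at
)

// ErrInvalidCursor stands in for the INVALID_CURSOR error code (hypothetical Go name).
var ErrInvalidCursor = errors.New("INVALID_CURSOR")

// EncodeCursor builds "v1:<mode>:<value>".
func EncodeCursor(mode SortMode, value string) string {
	return fmt.Sprintf("v1:%s:%s", mode, value)
}

// DecodeCursor returns the sort-key value carried by the cursor. It rejects
// malformed cursors and cursors minted under a different sort mode, so a page
// request can never silently mix orderings.
func DecodeCursor(cursor string, want SortMode) (string, error) {
	// SplitN keeps any ':' inside the timestamp in the third part.
	parts := strings.SplitN(cursor, ":", 3)
	if len(parts) != 3 || parts[0] != "v1" || parts[2] == "" {
		return "", ErrInvalidCursor
	}
	if SortMode(parts[1]) != want {
		return "", fmt.Errorf("%w: cursor minted for %q, query uses %q",
			ErrInvalidCursor, parts[1], want)
	}
	return parts[2], nil
}
```

Carrying the sort mode inside the cursor lets the resolver detect a mismatch without any server-side session state.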
2. Activity-feed connection args on orgHypercertsClaimActivity
The certified-app feed (src/app/api/indexer/route.ts, op Activities) calls this query against magic-indexer:
labels / excludeLabels / authors / search do not exist on hyperindex-v2's auto-generated connection roots. We also use where: { contributor: { eq: $did } } for the per-user "contributed" feed (op ContributedActivities), which is not auto-derived from the lexicon since contributor is a sub-field on a union variant.
Recommended implementation:
Add a ConnectionArgRegistry keyed by lexicon NSID. Each entry provides the GraphQL arg → SQL clause mapping.
For org.hypercerts.claim.activity, register:
labels: [String!] → JOIN against the labels table (see item 5) with IN (...) and EXISTS (...)
excludeLabels: [String!] → NOT EXISTS (... IN (...))
authors: [String!] → did IN (...) (already implicit in v2's where.did.in, but elevate as a first-class connection arg so the API matches the upstream Hyperindex precedent)
search: String → to_tsquery against a GIN index across title, shortDescription, description, and the workScope.scope JSON path (see item 6)
where.contributor → recursive descent into the contributor array, with the same did/identity tri-form handling magic-indexer documents in its contributor filter (DID values only; non-DID handles silently skipped; MaxArrayContributorScan cap)
These arg registrations should live in internal/graphql/query/connection.go, alongside the existing where/sortBy plumbing.
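One possible shape for the registry described above, as a hedged sketch. Everything except the name ConnectionArgRegistry is an assumption made for illustration: the ArgBuilder signature, the Arg type, the table alias r, and the labels table columns (uri, val) are not taken from either codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// ArgBuilder turns one GraphQL argument's values into a SQL fragment plus
// bind parameters. pos is the first free positional placeholder ($1, $2, ...).
type ArgBuilder func(values []string, pos int) (clause string, params []any)

// ConnectionArgRegistry maps lexicon NSID -> GraphQL arg name -> builder.
type ConnectionArgRegistry map[string]map[string]ArgBuilder

// Arg is one argument supplied on a connection query (hypothetical type).
type Arg struct {
	Name   string
	Values []string
}

func (r ConnectionArgRegistry) Register(nsid, arg string, b ArgBuilder) {
	if r[nsid] == nil {
		r[nsid] = map[string]ArgBuilder{}
	}
	r[nsid][arg] = b
}

func placeholders(n, start int) string {
	ps := make([]string, n)
	for i := range ps {
		ps[i] = fmt.Sprintf("$%d", start+i)
	}
	return strings.Join(ps, ", ")
}

func toParams(vs []string) []any {
	out := make([]any, len(vs))
	for i, v := range vs {
		out[i] = v
	}
	return out
}

// labelsClause builds EXISTS / NOT EXISTS against an assumed labels(uri, val) table.
func labelsClause(negate bool) ArgBuilder {
	return func(vals []string, pos int) (string, []any) {
		prefix := "EXISTS"
		if negate {
			prefix = "NOT EXISTS"
		}
		return fmt.Sprintf(
			"%s (SELECT 1 FROM labels l WHERE l.uri = r.uri AND l.val IN (%s))",
			prefix, placeholders(len(vals), pos)), toParams(vals)
	}
}

func authorsClause(vals []string, pos int) (string, []any) {
	return fmt.Sprintf("r.did IN (%s)", placeholders(len(vals), pos)), toParams(vals)
}

// Where ANDs together the clauses for the supplied args, in the order given,
// and rejects any arg that is not registered for the lexicon.
func (r ConnectionArgRegistry) Where(nsid string, args []Arg) (string, []any, error) {
	var clauses []string
	var params []any
	for _, a := range args {
		b, ok := r[nsid][a.Name]
		if !ok {
			return "", nil, fmt.Errorf("unknown connection arg %q for %s", a.Name, nsid)
		}
		if len(a.Values) == 0 {
			continue // an empty list adds no filter
		}
		c, p := b(a.Values, len(params)+1)
		clauses = append(clauses, c)
		params = append(params, p...)
	}
	return strings.Join(clauses, " AND "), params, nil
}
```

Values only ever travel as bind parameters, so user-supplied labels and DIDs never reach the SQL text itself.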
3. Profile & actor denormalisation
This was the single largest body of work on magic-indexer's side (the avatar / identity epic) and has no v2 equivalent. It is several distinct pieces:
(a) Denormalised fields on appCertifiedBadgeAward. The endorsement detail card (src/hooks/use-received-endorsements.ts, op ReceivedEndorsements) needs the issuer's actor profile and the recipient's response inlined on each award node:
Without this we re-introduce a /api/resolve-did fan-out per award row on first paint, which is exactly what magic-indexer issue #96 closed.
Port internal/graphql/schema/derived_fields.go from magic-indexer (it has no equivalent in v2).
Register two derived fields for app.certified.badge.award:
issuer — joins the award's did to app.certified.actor.profile (with app.bsky.actor.profile ingestion as a fallback once enabled). Expose bannerCid alongside avatarCid.
response — joins the award URI to the latest app.certified.badge.response record by the award's subject, ordered by sort_at DESC NULLS LAST (matches magic-indexer issue #26 so reset → accept resolves correctly).
Sort order on the appCertifiedBadgeAward connection itself should default to sort_at DESC NULLS LAST so resets land before originals.
(b) Top-level actorProfile(did): Issuer query. certified-app's /api/resolve-did resolves a DID to { handle, displayName, description, avatarCid, bannerCid, pds } in one call instead of a getRecord fan-out against the PDS. Same Issuer shape as the derived field in (a).
(c) bsky-profile denormalisation onto the actor table. On every app.bsky.actor.profile/self write, populate display_name / description / avatar_cid / banner_cid on the actor row, so (a) and (b) resolve from one indexed row without a per-request PDS read. Requires registering app.bsky.actor.profile for ingestion.
(d) did-scoped bsky-profile backfill worker. Actors first indexed from an app.certified.* record may never emit a firehose app.bsky.actor.profile event, so their denormalised profile stays empty. A worker pulls each such actor's app.bsky.actor.profile from their PDS and stamps a bsky_profile_indexed_at marker so it runs at most once per actor. Gate it behind an env flag (e.g. BSKY_PROFILE_BACKFILL_ENABLED).
(e) Last-writer guard. A bsky_profile_record_at column plus an ordering predicate on the denorm upsert (... WHERE COALESCE(existing.bsky_profile_record_at,'-infinity') <= EXCLUDED.bsky_profile_record_at) so an out-of-order firehose or backfill write cannot overwrite a fresher avatar with a stale (or empty) one. This was the root cause of an avatar-data-loss bug observed on the consumer side, where a later empty/older write nulled a freshly-set avatar.
(f) Identity-event handling. Parse #identity firehose events: Actors.UpdateHandle + handle validation, a per-DID serialised identityDispatcher, and a DID-document cache (~5-min TTL) invalidated on identity events, so handle changes propagate without staleness or races.
(g) Blob.ref shape. Return the bare CID string (not { "$link": cid }) for avatar/banner blob refs, with CBOR tag-42 decoded on the ingestion path, so the consumer can build blob URLs directly from (did, cid).
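The last-writer guard in (e) is easiest to reason about as a pure function. The sketch below mirrors the SQL predicate in memory; the ActorProfile struct and ApplyProfileWrite function are hypothetical names, and a zero RecordAt is used here to play the role of a NULL bsky_profile_record_at.

```go
package main

import "time"

// ActorProfile holds the denormalised columns relevant to the guard.
// Field names are illustrative, not the real schema.
type ActorProfile struct {
	AvatarCID string
	RecordAt  time.Time // bsky_profile_record_at; zero value = never written
}

// ApplyProfileWrite applies an incoming profile write only when it is at least
// as new as the stored row, mirroring
//   COALESCE(existing.bsky_profile_record_at, '-infinity') <= EXCLUDED.bsky_profile_record_at
// It reports whether the write was applied.
func ApplyProfileWrite(existing *ActorProfile, avatarCID string, recordAt time.Time) bool {
	// A zero RecordAt sorts before any real timestamp, which gives the same
	// effect as COALESCE(..., '-infinity') for a row that was never written.
	if existing.RecordAt.After(recordAt) {
		return false // stale firehose or backfill write: keep the fresher row
	}
	existing.AvatarCID = avatarCID
	existing.RecordAt = recordAt
	return true
}
```

With this predicate, an older write carrying an empty avatar is dropped instead of nulling the fresher value, which is the failure mode described in (e).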
4. Endorsement graph (endorsementClosure)
certified-app (src/app/api/indexer/route.ts, op EndorsementClosure) requests the transitive endorsement closure around a viewer:
A recursive walk over app.certified.badge.award edges out to N degrees (1st / 2nd / 3rd), returning each reached DID with its degree and provenance (via, the DID that pulled it into the closure). Not expressible through generic connection args.
Recommended implementation: a dedicated recursive-CTE-backed resolver over the award edge set, with a hard degree cap and per-level fan-out limit to avoid pathological traversals.
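To make the degree cap and fan-out limit concrete, here is an in-memory breadth-first version of the walk. It is a sketch of the traversal logic only, not the recursive CTE: the Reached type, the function signature, and the edge direction (issuer DID to recipient DID) are assumptions, while degree and via follow the description above.

```go
package main

// Reached is one DID in the closure.
type Reached struct {
	DID    string
	Degree int    // 1 = reached directly from the viewer
	Via    string // DID that pulled this one into the closure
}

// EndorsementClosure walks award edges breadth-first from viewer.
// edges maps a DID to the DIDs it points at (direction is an assumption).
// maxDegree is the hard degree cap; maxFanOut limits how many new DIDs
// may be added at each level.
func EndorsementClosure(edges map[string][]string, viewer string, maxDegree, maxFanOut int) []Reached {
	seen := map[string]bool{viewer: true}
	frontier := []string{viewer}
	var out []Reached
	for degree := 1; degree <= maxDegree && len(frontier) > 0; degree++ {
		var next []string
		for _, from := range frontier {
			for _, to := range edges[from] {
				if len(next) >= maxFanOut {
					break // per-level fan-out limit reached
				}
				if seen[to] {
					continue // keep the first (shortest) path only
				}
				seen[to] = true
				out = append(out, Reached{DID: to, Degree: degree, Via: from})
				next = append(next, to)
			}
		}
		frontier = next
	}
	return out
}
```

Because each DID is recorded the first time it is seen, it keeps its smallest degree, and cycles in the award graph terminate naturally.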
5. Inline labels on every record
Every record-bearing connection in magic-indexer exposes a flat labels: [String!] array — the current set of active labeller values for that record. This replaces a separate label query.
hyperindex-v2 only exposes externalLabels (different shape: {src, uri, cid, val, cts}, scoped to ATProto labelers).
Recommended implementation:
Add a labels resolver to the connection-node builder when the lexicon has any registered labeller.
Batch-load via DataLoader keyed by record URI; resolve from the existing labels table.
Keep externalLabels as the structured form (useful when the consumer needs the labeller DID), but materialize labels as the flat array consumers like certified-app expect.
6. Full-text search arg
where.title.contains (per-field, 3-char min) is not enough — the certified-app feed search box queries across title, shortDescription, description, and the JSON workScope.scope field as a single AND-ed phrase.
Recommended implementation:
Generate a single tsvector column per record type (built at ingestion time from the configured set of text fields), GIN-indexed.
Expose search: String as a connection arg on each record connection where the lexicon has registered a search profile.
Implicit-AND between whitespace-separated terms (matches magic-indexer behavior).
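A small sketch of the implicit-AND step, assuming the search string is turned into a to_tsquery expression in Go before it is bound as a query parameter. BuildTSQuery is a hypothetical helper, and stripping every non-alphanumeric character is a deliberately blunt sanitisation choice for the example (it also removes hyphens inside words).

```go
package main

import (
	"strings"
	"unicode"
)

// BuildTSQuery converts free-text input into a to_tsquery expression in which
// every whitespace-separated term must match (implicit AND).
func BuildTSQuery(input string) string {
	var terms []string
	for _, word := range strings.Fields(input) {
		// Keep letters and digits only, so tsquery operators typed by the
		// user (& | ! : ( ) ') cannot change the structure of the query.
		clean := strings.Map(func(r rune) rune {
			if unicode.IsLetter(r) || unicode.IsDigit(r) {
				return r
			}
			return -1 // drop the character
		}, word)
		if clean != "" {
			terms = append(terms, clean)
		}
	}
	return strings.Join(terms, " & ")
}
```

An empty result means there is nothing to search for, and the caller should skip the search clause altogether. If the backing store is PostgreSQL, plainto_tsquery applies the same AND semantics server-side and may be the simpler option.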
7. Authenticated notifications subsystem
certified-app's notification dropdown (src/app/api/notifications/route.ts) talks to a separate /notifications/graphql endpoint on magic-indexer, authenticated by AT Protocol service-auth JWT (the acting DID is taken from the JWT iss, not a variable). Operations: notifications, unreadNotificationCount, updateNotificationsSeen.
This is the largest missing piece — it's an authenticated, write-capable GraphQL endpoint with a service-auth JWT verification layer and persistent "seen at" state per actor.
Recommended implementation:
Add internal/notifications/ subsystem with: ingestion side that materializes notification rows from interaction records (follows, badge awards, replies, etc.), and read side that queries them per recipient DID.
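Service-auth verification middleware: validate the JWT iss (acting DID), aud (this indexer's DID), lxm (= com.hypergoat.notification.query for reads / com.hypergoat.notification.markSeen for the mutation), exp ~60s window, jti replay cache.
notifications(first, after) connection, unreadNotificationCount { count more }, updateNotificationsSeen(seenAt: String) mutation.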
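The claim checks listed for the middleware can be sketched as follows. This covers only the claims, after the token has been parsed; signature verification against the issuer's DID document is out of scope here and still required. The Claims struct, ReplayCache, ValidateClaims and the 60-second maxTokenTTL constant are all assumptions made for the example.

```go
package main

import (
	"errors"
	"strings"
	"sync"
)

// Claims holds the already-decoded JWT claims. Exp is Unix seconds.
type Claims struct {
	Iss, Aud, Lxm, Jti string
	Exp                int64
}

// maxTokenTTL is an assumed upper bound matching the "~60s window".
const maxTokenTTL int64 = 60

// ReplayCache remembers jti values until the token they belong to expires.
type ReplayCache struct {
	mu   sync.Mutex
	seen map[string]int64 // jti -> exp (Unix seconds)
}

func NewReplayCache() *ReplayCache {
	return &ReplayCache{seen: map[string]int64{}}
}

// MarkOnce records jti and reports false if it was already used.
func (c *ReplayCache) MarkOnce(jti string, exp, now int64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	for k, e := range c.seen { // evict entries whose token has expired
		if e <= now {
			delete(c.seen, k)
		}
	}
	if _, dup := c.seen[jti]; dup {
		return false
	}
	c.seen[jti] = exp
	return true
}

// ValidateClaims returns the acting DID (iss) when every claim check passes.
func ValidateClaims(c Claims, selfDID, wantLxm string, now int64, cache *ReplayCache) (string, error) {
	switch {
	case !strings.HasPrefix(c.Iss, "did:"):
		return "", errors.New("iss is not a DID")
	case c.Aud != selfDID:
		return "", errors.New("aud does not match this indexer")
	case c.Lxm != wantLxm:
		return "", errors.New("lxm does not match the operation")
	case c.Exp <= now:
		return "", errors.New("token expired")
	case c.Exp > now+maxTokenTTL:
		return "", errors.New("exp too far in the future")
	case c.Jti == "" || !cache.MarkOnce(c.Jti, c.Exp, now):
		return "", errors.New("jti missing or replayed")
	}
	return c.Iss, nil
}
```

Checking jti last means a token that fails an earlier check does not consume a replay-cache slot.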
Smaller items & filter ergonomics
isOrganization connection arg on appCertifiedActorProfile — certified-app's People/Orgs discriminator (op OrganizationDids and friends). It's a computed flag (whether the DID has an app.certified.actor.organization record), not a raw lexicon field, so it needs a registered filter rather than falling out of the generic where.
where: { uri: { in: [String!] } } on every connection — the hydrate-by-URI pattern (ops ActivitiesByUris, HydrateFeedPage) fetches a page of mixed records in one round-trip instead of N× getRecord. v2's generic where likely already admits this; flagging it so it isn't dropped, and so the synthetic uri column is PK-indexed (a btree hit, not a jsonExtract scan).
Promoted subjectUri: { eq } on orgHypercertsContext{Attachment,Evaluation,Measurement} and itemUri: { eq } on orgHypercertsCollection (records pointing at a given subject) — sub-fields on union variants, so not auto-derived.
appCertifiedTempGraphEndorsement — legacy temp endorsement records (pre-badge-migration). certified-app still reads from this for the compat window; can drop once unused.
MaxArrayContributorScan cost cap — magic-indexer skips contributor lookups on records with too many contributors. v2 should adopt this or similar to avoid pathological queries.
(ops / protection, not consumed by the app) per-IP /graphql rate limiter with a trusted-caller bypass header — magic-indexer added this after exposing /graphql publicly. Off by default (opt-in via env), bypass keyed by a shared secret header so first-party proxies aren't throttled.
Strategic ask
If hyperindex-v2 is the org's strategic direction, the path of least pain is to upstream the read-side items here — the feed (1, 2, 4), the profile/identity denormalisation (3), inline labels (5) and full-text search (6) — and either fold notifications (item 7) into v2 or carve it out as a small companion service. That clears the way for certified-app — and any other consumer — to migrate off the magic-indexer fork.