This repository was archived by the owner on Jun 2, 2026. It is now read-only.

Feature parity: extensions needed to replace magic-indexer for certified-app #43


@holkexyz


We're evaluating moving certified-app from magic-indexer onto upstream hyperindex (filing here because issues are disabled on hyperindex-v2, and v2 forks from this repo via GainForest/hyperindex). The three repos share the same architecture (Go AT Protocol AppView, Tap/Jetstream ingestion, lexicon-driven GraphQL schema), but magic-indexer carries a set of consumer-facing extensions that certified-app depends on and that upstream hyperindex / hyperindex-v2 do not provide. This issue catalogs them so they can be upstreamed.

Updated to cover the full consumer surface certified-app actually calls — most notably the unified follower feed, the endorsement-graph query, and the profile/identity denormalisation cluster (the avatar/identity epic), which the earlier draft only touched via the award issuer field.

What works out-of-the-box on hyperindex-v2

Once the app.certified.* and org.hypercerts.* lexicons are registered, the generic schema surface already handles:

  • totalCount on every connection root (used for /welcome network-counts strip)
  • where: { type: { eq } }, where: { subject: { eq } }, where: { did: { in } } (used for project / follower / badge-definition lookups)
  • first / after / edges{cursor,node} / pageInfo

So appCertifiedActorProfile, appCertifiedActorOrganization, orgHypercertsCollection, appCertifiedGraphFollow, and appCertifiedBadgeDefinition are essentially free.

What's missing

1. Unified follower activity feed (followerEvents)

certified-app's home feed (src/app/api/indexer/route.ts, op FollowerEvents) is not a single-collection query — it's a unified, time-ordered feed across the viewer's follow set, spanning 8 record kinds (cert / collection / evaluation / measurement / hyperboard / record-update / endorsement award / the folded project.created_with_cert).

query FollowerEvents(
  $authors: [String!]!, $kinds: [String!],
  $sortBy: String, $after: String, $first: Int!
) {
  followerEvents(
    authors: $authors, kinds: $kinds,
    sortBy: $sortBy, after: $after, first: $first
  ) {
    edges { cursor node { kind uri cid did createdAt sortAt ... } }
    pageInfo { ... }
  }
}

This is not derivable from any single lexicon's auto-generated connection.

Recommended implementation:

  • A followerEvents root that UNION ALLs the per-kind record tables for did IN (authors), ordered by the chosen sort key, with kinds narrowing the set of unioned tables.
  • Two sort modes — feed-lw (last-writer sort_at) and created_at — encoded in the cursor (v1:feed-lw:<sort_at> / v1:created_at:<created_at>); reject a cursor minted under a different sort mode with INVALID_CURSOR (otherwise paging is silently inconsistent).
  • Project+cert fold: a cert.create + collection.create pair from the same author within a 60s window, linked via the collection's items[], collapses into one synthetic project.created_with_cert event — so the feed doesn't render a cert and its wrapper collection as two adjacent rows.
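
A minimal sketch of the cursor rule above, assuming the cursor is passed as the plain `v1:<mode>:<key>` string (a real implementation might base64-wrap it). `EncodeCursor`, `DecodeCursor` and `ErrInvalidCursor` are hypothetical names, not existing magic-indexer identifiers:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrInvalidCursor would be surfaced to the client as INVALID_CURSOR.
var ErrInvalidCursor = errors.New("INVALID_CURSOR")

// The two sort modes named in this issue.
const (
	SortFeedLW    = "feed-lw"
	SortCreatedAt = "created_at"
)

// EncodeCursor renders "v1:<mode>:<key>", where key is the sort_at or
// created_at value of the last row on the page.
func EncodeCursor(mode, key string) string {
	return "v1:" + mode + ":" + key
}

// DecodeCursor returns the sort key, or ErrInvalidCursor when the cursor is
// malformed or was minted under a different sort mode than the request uses.
func DecodeCursor(cursor, wantMode string) (string, error) {
	// SplitN with n=3 keeps any colons inside the timestamp intact.
	parts := strings.SplitN(cursor, ":", 3)
	if len(parts) != 3 || parts[0] != "v1" || parts[2] == "" {
		return "", ErrInvalidCursor
	}
	if parts[1] != SortFeedLW && parts[1] != SortCreatedAt {
		return "", ErrInvalidCursor
	}
	if parts[1] != wantMode {
		return "", ErrInvalidCursor
	}
	return parts[2], nil
}

func main() {
	c := EncodeCursor(SortFeedLW, "2025-01-02T03:04:05Z")
	key, err := DecodeCursor(c, SortFeedLW)
	fmt.Println(key, err)
	_, err = DecodeCursor(c, SortCreatedAt)
	fmt.Println(err)
}
```

Rejecting the mode mismatch up front keeps the keyset predicate and the ORDER BY on the same column, which is what makes paging consistent.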

2. Activity-feed connection args on orgHypercertsClaimActivity

The certified-app feed (src/app/api/indexer/route.ts, op Activities) calls this query against magic-indexer:

query Activities(
  $first: Int!, $after: String,
  $labels: [String!],
  $excludeLabels: [String!],
  $authors: [String!],
  $search: String
) {
  orgHypercertsClaimActivity(
    first: $first, after: $after,
    labels: $labels, excludeLabels: $excludeLabels,
    authors: $authors, search: $search
  ) { totalCount edges{cursor node{...}} pageInfo{...} }
}

labels / excludeLabels / authors / search do not exist on hyperindex-v2's auto-generated connection roots. We also use where: { contributor: { eq: $did } } for the per-user "contributed" feed (op ContributedActivities), which is not auto-derived from the lexicon since contributor is a sub-field on a union variant.

Recommended implementation:

  • Add a ConnectionArgRegistry keyed by lexicon NSID. Each entry provides the GraphQL arg → SQL clause mapping.
  • For org.hypercerts.claim.activity, register:
    • labels: [String!] → JOIN against the labels table (see item 5) with IN (...) and EXISTS (...)
    • excludeLabels: [String!] → NOT EXISTS (... IN (...))
    • authors: [String!] → did IN (...) (already implicit in v2's where.did.in, but elevate as a first-class connection arg so the API matches the upstream Hyperindex precedent)
    • search: String → to_tsquery against a GIN index across title, shortDescription, description, and the workScope.scope JSON path (see item 6)
    • where.contributor → recursive descent into the contributor array, with the same did/identity tri-form handling magic-indexer documents in its contributor filter (DID values only; non-DID handles silently skipped; MaxArrayContributorScan cap)

These arg registrations should live in internal/graphql/query/connection.go, alongside the existing where/sortBy plumbing.
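
As a rough sketch of what such a registry could look like: the `ClauseBuilder` signature, the `Where` helper, and the `record` / `label` table and column names below are all assumptions for illustration, not the existing v2 or magic-indexer API.

```go
package main

import (
	"fmt"
	"strings"
)

// ClauseBuilder turns one GraphQL argument value into a SQL fragment plus its
// bind parameters. next is the index of the first free $n placeholder.
type ClauseBuilder func(values []string, next int) (sql string, params []any)

// ConnectionArgRegistry maps lexicon NSID -> GraphQL arg name -> builder.
type ConnectionArgRegistry map[string]map[string]ClauseBuilder

// Register adds a builder for one arg on one lexicon.
func (r ConnectionArgRegistry) Register(nsid, arg string, b ClauseBuilder) {
	if r[nsid] == nil {
		r[nsid] = map[string]ClauseBuilder{}
	}
	r[nsid][arg] = b
}

// inClause returns a builder that substitutes "$n, $n+1, ..." for "{list}".
func inClause(template string) ClauseBuilder {
	return func(values []string, next int) (string, []any) {
		ps := make([]string, len(values))
		params := make([]any, len(values))
		for i, v := range values {
			ps[i] = fmt.Sprintf("$%d", next+i)
			params[i] = v
		}
		return strings.Replace(template, "{list}", strings.Join(ps, ", "), 1), params
	}
}

// Where ANDs the clauses for the supplied args in the given order, skipping
// args that are unregistered or empty (an empty IN () is invalid SQL).
func (r ConnectionArgRegistry) Where(nsid string, order []string, args map[string][]string) (string, []any) {
	var clauses []string
	var params []any
	for _, name := range order {
		b, ok := r[nsid][name]
		if !ok || len(args[name]) == 0 {
			continue
		}
		sql, p := b(args[name], len(params)+1)
		clauses = append(clauses, sql)
		params = append(params, p...)
	}
	return strings.Join(clauses, " AND "), params
}

// NewRegistry registers three of the args listed for the activity lexicon.
func NewRegistry() ConnectionArgRegistry {
	r := ConnectionArgRegistry{}
	const nsid = "org.hypercerts.claim.activity"
	// Hypothetical schema: record(did, uri), label(uri, val).
	r.Register(nsid, "authors", inClause("record.did IN ({list})"))
	r.Register(nsid, "labels", inClause("EXISTS (SELECT 1 FROM label l WHERE l.uri = record.uri AND l.val IN ({list}))"))
	r.Register(nsid, "excludeLabels", inClause("NOT EXISTS (SELECT 1 FROM label l WHERE l.uri = record.uri AND l.val IN ({list}))"))
	return r
}

func main() {
	sql, params := NewRegistry().Where("org.hypercerts.claim.activity",
		[]string{"authors", "labels", "excludeLabels"},
		map[string][]string{"authors": {"did:plc:a"}, "excludeLabels": {"spam"}})
	fmt.Println(sql)
	fmt.Println(params)
}
```

Keeping the values in bind parameters rather than interpolating them means the registry never builds SQL from user input directly.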

3. Profile & actor denormalisation

This was the single largest body of work on magic-indexer's side (the avatar / identity epic) and has no v2 equivalent. It comprises several distinct pieces:

(a) Denormalised fields on appCertifiedBadgeAward. The endorsement detail card (src/hooks/use-received-endorsements.ts, op ReceivedEndorsements) needs the issuer's actor profile and the recipient's response inlined on each award node:

appCertifiedBadgeAward(...) {
  edges { node {
    uri cid did createdAt note badge
    issuer { did handle displayName description avatarCid bannerCid pds }
    response { state weight createdAt }
  }}
}

Without this we re-introduce a /api/resolve-did fan-out per award row on first paint, which is exactly what magic-indexer issue #96 closed.

  • Port internal/graphql/schema/derived_fields.go from magic-indexer (it has no equivalent in v2).
  • Register two derived fields for app.certified.badge.award:
    • issuer: joins the award's did to app.certified.actor.profile (with app.bsky.actor.profile ingestion as a fallback once enabled). Expose bannerCid alongside avatarCid.
    • response: joins the award URI to the latest app.certified.badge.response record by the award's subject, ordered by sort_at DESC NULLS LAST (matches magic-indexer issue #26, so reset → accept resolves correctly)
  • Sort order on the appCertifiedBadgeAward connection itself should default to sort_at DESC NULLS LAST so resets land before originals.

(b) Top-level actorProfile(did): Issuer query. certified-app's /api/resolve-did resolves a DID to { handle, displayName, description, avatarCid, bannerCid, pds } in one call instead of a getRecord fan-out against the PDS. Same Issuer shape as the derived field in (a).

(c) bsky-profile denormalisation onto the actor table. On every app.bsky.actor.profile/self write, populate display_name / description / avatar_cid / banner_cid on the actor row, so (a) and (b) resolve from one indexed row without a per-request PDS read. Requires registering app.bsky.actor.profile for ingestion.

(d) did-scoped bsky-profile backfill worker. Actors first indexed from an app.certified.* record may never emit a firehose app.bsky.actor.profile event, so their denormalised profile stays empty. A worker pulls each such actor's app.bsky.actor.profile from their PDS and stamps a bsky_profile_indexed_at marker so it runs at most once per actor. Gate it behind an env flag (e.g. BSKY_PROFILE_BACKFILL_ENABLED).

(e) Last-writer guard. A bsky_profile_record_at column plus an ordering predicate on the denorm upsert (... WHERE COALESCE(existing.bsky_profile_record_at,'-infinity') <= EXCLUDED.bsky_profile_record_at) so an out-of-order firehose or backfill write cannot overwrite a fresher avatar with a stale (or empty) one. This was the root cause of an avatar-data-loss bug observed on the consumer side, where a later empty/older write nulled a freshly-set avatar.

(f) Identity-event handling. Parse #identity firehose events: Actors.UpdateHandle + handle validation, a per-DID serialised identityDispatcher, and a DID-document cache (~5-min TTL) invalidated on identity events, so handle changes propagate without staleness or races.

(g) Blob.ref shape. Return the bare CID string (not { "$link": cid }) for avatar/banner blob refs, with CBOR tag-42 decoded on the ingestion path, so the consumer can build blob URLs directly from (did, cid).

4. Endorsement graph (endorsementClosure)

certified-app (src/app/api/indexer/route.ts, op EndorsementClosure) requests the transitive endorsement closure around a viewer:

query EndorsementClosure($viewer: String!, $degree: Int!) {
  endorsementClosure(viewer: $viewer, degree: $degree) { did degree via }
}

A recursive walk over app.certified.badge.award edges out to N degrees (1st / 2nd / 3rd), returning each reached DID with its degree and provenance (via, the DID that pulled it into the closure). Not expressible through generic connection args.

Recommended implementation: a dedicated recursive-CTE-backed resolver over the award edge set, with a hard degree cap and per-level fan-out limit to avoid pathological traversals.
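
The traversal semantics can be sketched in memory as a breadth-first walk, which is what the recursive CTE would compute level by level. The edge direction (issuer to recipient), the `ClosureNode` type, and both cap parameters are assumptions for illustration:

```go
package main

import "fmt"

// ClosureNode mirrors the { did degree via } shape the query returns.
type ClosureNode struct {
	DID    string
	Degree int
	Via    string // the DID that pulled this node into the closure
}

// EndorsementClosure walks award edges breadth-first from viewer.
// edges maps a DID to the DIDs it has awarded (direction is an assumption).
// degree is clamped to maxDegree, and at most maxPerLevel new DIDs are
// admitted at each level.
func EndorsementClosure(edges map[string][]string, viewer string, degree, maxDegree, maxPerLevel int) []ClosureNode {
	if degree > maxDegree {
		degree = maxDegree
	}
	seen := map[string]bool{viewer: true}
	frontier := []string{viewer}
	var out []ClosureNode
	for d := 1; d <= degree && len(frontier) > 0; d++ {
		var next []string
	level:
		for _, from := range frontier {
			for _, to := range edges[from] {
				if seen[to] {
					continue // already reached at an equal or lower degree
				}
				if len(next) >= maxPerLevel {
					break level // fan-out cap for this level
				}
				seen[to] = true
				out = append(out, ClosureNode{DID: to, Degree: d, Via: from})
				next = append(next, to)
			}
		}
		frontier = next
	}
	return out
}

func main() {
	edges := map[string][]string{
		"did:plc:a": {"did:plc:b", "did:plc:c"},
		"did:plc:b": {"did:plc:d", "did:plc:a"},
	}
	for _, n := range EndorsementClosure(edges, "did:plc:a", 2, 3, 100) {
		fmt.Println(n.DID, n.Degree, n.Via)
	}
}
```

The `seen` set plays the role of the cycle guard a recursive CTE would need, since endorsement edges can point back at the viewer.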

5. Inline labels on every record

Every record-bearing connection in magic-indexer exposes a flat labels: [String!] array — the current set of active labeller values for that record. This replaces a separate label query.

hyperindex-v2 only exposes externalLabels (different shape: {src, uri, cid, val, cts}, scoped to ATProto labelers).

Recommended implementation:

  • Add a labels resolver to the connection-node builder when the lexicon has any registered labeller.
  • Batch-load via DataLoader keyed by record URI; resolve from the existing labels table.
  • Keep externalLabels as the structured form (useful when the consumer needs the labeller DID), but materialize labels as the flat array consumers like certified-app expect.
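
A sketch of the batch function such a loader would call once per page of nodes. `LabelRow`, `BatchLabels` and the `fetch` callback are hypothetical; the request-coalescing part of a DataLoader is not shown:

```go
package main

import "fmt"

// LabelRow is one (record URI, label value) pair from the labels table.
type LabelRow struct {
	URI string
	Val string
}

// BatchLabels resolves labels for many record URIs with a single fetch call.
// Every requested URI gets an entry, so records without labels resolve to an
// empty array rather than null.
func BatchLabels(uris []string, fetch func(uris []string) ([]LabelRow, error)) (map[string][]string, error) {
	out := make(map[string][]string, len(uris))
	unique := make([]string, 0, len(uris))
	for _, u := range uris {
		if _, ok := out[u]; !ok {
			out[u] = []string{}
			unique = append(unique, u)
		}
	}
	rows, err := fetch(unique)
	if err != nil {
		return nil, err
	}
	for _, r := range rows {
		if _, ok := out[r.URI]; ok { // ignore rows for URIs we did not ask for
			out[r.URI] = append(out[r.URI], r.Val)
		}
	}
	return out, nil
}

func main() {
	// Stand-in for a query like: SELECT uri, val FROM label WHERE uri = ANY($1)
	fetch := func(uris []string) ([]LabelRow, error) {
		return []LabelRow{{URI: "at://did:plc:a/x/1", Val: "verified"}}, nil
	}
	labels, err := BatchLabels([]string{"at://did:plc:a/x/1", "at://did:plc:a/x/2"}, fetch)
	fmt.Println(labels, err)
}
```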

6. Full-text search arg

where.title.contains (per-field, 3-char min) is not enough — the certified-app feed search box queries across title, shortDescription, description, and the JSON workScope.scope field as a single AND-ed phrase.

Recommended implementation:

  • Generate a single tsvector column per record type (built at ingestion time from the configured set of text fields), GIN-indexed.
  • Expose search: String as a connection arg on each record connection where the lexicon has registered a search profile.
  • Implicit-AND between whitespace-separated terms (matches magic-indexer behavior).
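
For the implicit-AND rule, one possible query builder is sketched below. `BuildTSQuery` is a hypothetical helper that treats every non-alphanumeric rune as a separator, so user input cannot inject tsquery operators; PostgreSQL's built-in `plainto_tsquery` gives similar AND semantics if hand-building the string is not needed.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// BuildTSQuery converts free-text input into a to_tsquery-compatible string
// in which every term must match. Punctuation (including the tsquery
// operators & | ! : * and parentheses) is treated as whitespace.
func BuildTSQuery(input string) string {
	terms := strings.FieldsFunc(input, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	return strings.Join(terms, " & ")
}

func main() {
	fmt.Println(BuildTSQuery("  solar   panels "))
	fmt.Println(BuildTSQuery("reforestation | (drop)"))
}
```

An empty result should be treated as "no search filter" by the caller, since `to_tsquery('')` matches nothing useful.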

7. Authenticated notifications subsystem

certified-app's notification dropdown (src/app/api/notifications/route.ts) talks to a separate /notifications/graphql endpoint on magic-indexer, authenticated by AT Protocol service-auth JWT (the acting DID is taken from the JWT iss, not a variable). Operations: notifications, unreadNotificationCount, updateNotificationsSeen.

This is the largest missing piece — it's an authenticated, write-capable GraphQL endpoint with a service-auth JWT verification layer and persistent "seen at" state per actor.

Recommended implementation:

  • Add internal/notifications/ subsystem with: ingestion side that materializes notification rows from interaction records (follows, badge awards, replies, etc.), and read side that queries them per recipient DID.
  • Service-auth verification middleware: validate the JWT iss (acting DID), aud (this indexer's DID), lxm (= com.hypergoat.notification.query for reads / com.hypergoat.notification.markSeen for the mutation), exp ~60s window, jti replay cache.
  • Schema: notifications(first, after) connection, unreadNotificationCount { count more }, updateNotificationsSeen(seenAt: String) mutation.

Smaller items & filter ergonomics

  • isOrganization connection arg on appCertifiedActorProfile — certified-app's People/Orgs discriminator (op OrganizationDids and friends). It's a computed flag (whether the DID has an app.certified.actor.organization record), not a raw lexicon field, so it needs a registered filter rather than falling out of the generic where.
  • where: { uri: { in: [String!] } } on every connection — the hydrate-by-URI pattern (ops ActivitiesByUris, HydrateFeedPage) fetches a page of mixed records in one round-trip instead of N× getRecord. v2's generic where likely already admits this; flagging it so it isn't dropped, and so the synthetic uri column is PK-indexed (a btree hit, not a jsonExtract scan).
  • Promoted subjectUri: { eq } on orgHypercertsContext{Attachment,Evaluation,Measurement} and itemUri: { eq } on orgHypercertsCollection (records pointing at a given subject) — sub-fields on union variants, so not auto-derived.
  • appCertifiedTempGraphEndorsement — legacy temp endorsement records (pre-badge-migration). certified-app still reads from this for the compat window; can drop once unused.
  • MaxArrayContributorScan cost cap — magic-indexer skips contributor lookups on records with too many contributors. v2 should adopt this or similar to avoid pathological queries.
  • (ops / protection, not consumed by the app) per-IP /graphql rate limiter with a trusted-caller bypass header — magic-indexer added this after exposing /graphql publicly. Off by default (opt-in via env), bypass keyed by a shared secret header so first-party proxies aren't throttled.

Strategic ask

If hyperindex-v2 is the org's strategic direction, the path of least pain is to upstream the read-side items here: the feed and endorsement-graph queries (1, 2, 4), the profile/identity denormalisation (3), inline labels (5) and full-text search (6). Notifications (item 7) can either be folded into v2 or carved out as a small companion service. That clears the way for certified-app, and any other consumer, to migrate off the magic-indexer fork.
