Adds subtree (descendant) matching to taxonomy where filters#1648
Adds subtree (descendant) matching to taxonomy where filters#1648MA2153 wants to merge 13 commits into
where filters#1648Conversation
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
🦋 Changeset detectedLatest commit: 34174b4 The changes in this PR will be included in the next version bump. This PR includes changesets to release 16 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Scope checkThis PR changes 567 lines across 12 files. Large PRs are harder to review and more likely to be closed without review. If this scope is intentional, no action needed. A maintainer will review it. If not, please consider splitting this into smaller PRs. See CONTRIBUTING.md for contribution guidelines. |
@emdash-cms/admin
@emdash-cms/auth
@emdash-cms/auth-atproto
@emdash-cms/blocks
@emdash-cms/cloudflare
@emdash-cms/contentful-to-portable-text
emdash
create-emdash
@emdash-cms/gutenberg-to-portable-text
@emdash-cms/plugin-cli
@emdash-cms/plugin-types
@emdash-cms/registry-client
@emdash-cms/registry-lexicons
@emdash-cms/sandbox-workerd
@emdash-cms/x402
@emdash-cms/plugin-ai-moderation
@emdash-cms/plugin-atproto
@emdash-cms/plugin-audit-log
@emdash-cms/plugin-color
@emdash-cms/plugin-embeds
@emdash-cms/plugin-field-kit
@emdash-cms/plugin-forms
@emdash-cms/plugin-webhook-notifier
commit: |
There was a problem hiding this comment.
Approach
This is the right change for the right problem. Exact-slug taxonomy filters cannot express "this term or anything under it", and the only workaround (enumerating every descendant slug) blows past D1's bind-parameter limit on deep trees. Adding a first-class { subtree } operator and resolving descendants in SQL with a recursive CTE fits EmDash's architecture cleanly: it reuses the existing translation_group-aware model from migration 045, keeps the parameter count constant, and doesn't disturb the default exact-slug behavior.
What I checked
- SQL safety: The recursive CTEs use Kysely's
sqltagged template for values andsql.ref()for identifiers; the dynamic table/collection names are validated viagetTableName/getTaxonomyNames. No raw interpolation of user slugs. - Locale / i18n correctness: The subtree walk uses
translation_group/parent_id, so matches are locale-agnostic in the same waycontent_taxonomies.taxonomy_idis. The loader's outerlocalefilter still scopes the returned entries. - Authorization: The admin terms route still checks
taxonomies:read; the new?rollupquery param is just passed through. - Cache invalidation:
getTaxonomyTerms({ rollup: true })is still wrapped in the existingcachedQuery/requestCachedlayers with a cache key that includesr1, so term mutations viainvalidateTermCache()bust it correctly. - Tests: Dialect-parametric tests cover single root, multiple roots, the >999-descendant overflow guard, mixing exact + subtree filters, empty-root short-circuit, cross-locale group matching, keyset pagination, and distinct-entry rollup counts. A changeset is present.
Headline conclusion
The code is clean and I don't see any blocking bugs. I have one small suggestion: the public where docstrings in loader.ts and query.ts list exact/array/range examples but don't mention the new subtree operator, so developers won't discover it from autocomplete/docs.
Process note: Per AGENTS.md, new features require a maintainer-approved Discussion. This PR links to Discussion #1647, which is currently in Ideas and awaiting approval. The implementation looks ready, but merge should wait for that approval.
Findings
-
[suggestion]
packages/core/src/loader.ts:643The public
wheredocstring lists exact, byline, field, and range examples but omits the newsubtreeoperator. Add a usage example so callers discover the feature through API docs/autocomplete.* @example { published_at: { gte: '2024-01-01', lt: '2025-01-01' } } - date range * @example { category: { subtree: 'news' } } - match a term and all descendants -
[suggestion]
packages/core/src/query.ts:124The public
wheredocstring lists exact/array/byline/field/range examples but does not mention the newsubtreeoperator added by this PR. Add an example so the public API surface is documented.* @example { published_at: { gte: '2024-01-01', lt: '2025-01-01' } } - Date range * @example { category: { subtree: 'news' } } - Match a term and all its descendants
Postgres COUNT() returns bigint as a string, so getTaxonomyTermCounts returned "1" instead of 1 under the pg driver, failing the rollup test's exact-count assertion. Coerce with Number(), matching countEntriesForSubtrees. Also document the new `subtree` where-operator in the loader/query docstrings (review suggestion). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Overlapping PRsThis PR modifies files that are also changed by other open PRs:
This may cause merge conflicts or duplicated work. A maintainer will coordinate. |
d3cce3d to
6690313
Compare
What does this PR do?
Adds a
subtreeoperator to collectionwheretaxonomy filters so selecting a parent term matches that term and all of its descendants, resolved in SQL:Today the taxonomy
wherefilter matches by exact slug only, so "this term or anything filed under it" (selecting a parent category in a faceted browse UI) is inexpressible. The only workaround — enumerating the subtree client-side and passing every descendant slug — expands to one bound parameter per slug and overflows D1's 100-bind-parameter cap (D1_ERROR: too many SQL variables) on deep hierarchies. It also can't be chunked without breaking keyset pagination.This resolves the subtree server-side from a single root slug via a recursive CTE over
taxonomies.parent_id, so the bound-parameter count is independent of subtree size. After #1646 bothparent_idandcontent_taxonomies.taxonomy_idlive in translation_group space, so the walk is locale-correct and matchestaxonomy_iddirectly.Also adds an opt-in
rollupoption togetTaxonomyTerms(and the admin terms endpoint via?rollup=1) returning distinct-entry subtree counts, so a facet badge equals what selecting that facet returns. Default behavior (exact-slug filter, exact-term counts) is unchanged.Related: Discussion #1647 (opened in Ideas; awaiting maintainer approval — opening the PR early to share the implementation, happy to adjust the operator surface, e.g.
{ subtree }vs{ descendantsOf }, per the discussion).Type of change
Checklist
pnpm typecheckpassespnpm lintpasses (0 diagnostics)pnpm testpasses (targeted: the new subtree filter + subtree count suites, 10 tests + the schema-coercion unit tests)pnpm formathas been runmessages.pochanges included (n/a for i18n)emdash: minor)AI-generated code disclosure
Screenshots / test output
Dialect-parity tests run under
describeEachDialect(SQLite locally; Postgres in CI viaPG_CONNECTION_STRING). Highlights:loader-taxonomy-subtree-filter.test.ts— single-root and multi-root match, >999-descendant overflow guard (would exceed SQLite's bind limit if descendants were enumerated rather than walked in-SQL), mixed exact + subtree across taxonomies, empty-roots short-circuit, cross-locale (match by translation_group), keyset pagination.taxonomy-subtree-counts.test.ts— distinct-entry rollup honesty (an entry tagged at both a parent and its child counts once),getTaxonomyTerms({ rollup }), andhandleTermList({ rollup }).🤖 Generated with Claude Code