Add media usage index foundation #1670
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | emdash-demo-cache | aeb745a | Jun 30 2026, 05:59 PM |
🦋 Changeset detected
Latest commit: aeb745a
The changes in this PR will be included in the next version bump. This PR includes changesets to release 16 packages.
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | emdash-playground | aeb745a | Jun 30 2026, 05:59 PM |
Deploying with

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ❌ Deployment failed | emdash-demo-do | aeb745a | Jun 30 2026, 05:58 PM |
Scope check
This PR changes 2,417 lines across 19 files. Large PRs are harder to review and more likely to be closed without review. If this scope is intentional, no action is needed and a maintainer will review it. If not, please consider splitting this into smaller PRs. See CONTRIBUTING.md for contribution guidelines.
@emdash-cms/admin
@emdash-cms/auth
@emdash-cms/auth-atproto
@emdash-cms/blocks
@emdash-cms/cloudflare
@emdash-cms/contentful-to-portable-text
emdash
create-emdash
@emdash-cms/gutenberg-to-portable-text
@emdash-cms/plugin-cli
@emdash-cms/plugin-types
@emdash-cms/registry-client
@emdash-cms/registry-lexicons
@emdash-cms/sandbox-workerd
@emdash-cms/x402
@emdash-cms/plugin-ai-moderation
@emdash-cms/plugin-atproto
@emdash-cms/plugin-audit-log
@emdash-cms/plugin-color
@emdash-cms/plugin-embeds
@emdash-cms/plugin-field-kit
@emdash-cms/plugin-forms
@emdash-cms/plugin-webhook-notifier
commit: |
Pull request overview
Adds the internal “media usage index” foundation to track where media is referenced across content entries (including provider-aware assets), enabling future features like usage views and safer delete workflows.
Changes:
- Introduces `_emdash_media_usage_sources` and `_emdash_media_usage` tables with a backfilling migration (`046_media_usage_index`).
- Adds extraction + indexing helpers and a repository for replacing/querying current usage generations.
- Wires media-usage maintenance into key write paths (content lifecycle handlers, schema changes, plugin writes, seed writes, and revision/draft flows), with comprehensive unit/integration coverage.
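The extraction step in the list above can be illustrated with a minimal sketch. It is not the PR's `usage-extractor.ts`: the `FieldDef` and `MediaUsageRef` types, the `{ id, provider? }` media value shape, the `asset` key on Portable Text image blocks, and the field path format are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration only; the PR's real types may differ.
type MediaUsageRef =
  | { kind: "local"; mediaId: string; fieldPath: string }
  | { kind: "provider"; provider: string; providerAssetId: string; fieldPath: string };

type FieldDef =
  | { slug: string; type: "image" | "file" | "portableText" | "text" }
  | { slug: string; type: "repeater"; subFields: FieldDef[] };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Assumed media value shape: { id: string, provider?: string }.
// A missing provider (or "local") is treated as local media.
function toRef(value: unknown, fieldPath: string): MediaUsageRef | null {
  if (!isRecord(value)) return null;
  const id = value["id"];
  const provider = value["provider"];
  if (typeof id !== "string" || id === "") return null;
  if (typeof provider === "string" && provider !== "local") {
    return { kind: "provider", provider, providerAssetId: id, fieldPath };
  }
  return { kind: "local", mediaId: id, fieldPath };
}

// Walk the declared fields rather than arbitrary JSON, so only values in
// media-capable positions are treated as references.
function extractMediaUsage(
  fields: FieldDef[],
  data: Record<string, unknown>,
  prefix = "",
): MediaUsageRef[] {
  const refs: MediaUsageRef[] = [];
  for (const field of fields) {
    const path = prefix + field.slug;
    const value = data[field.slug];
    if (field.type === "image" || field.type === "file") {
      const ref = toRef(value, path);
      if (ref) refs.push(ref);
    } else if (field.type === "repeater" && Array.isArray(value)) {
      for (let i = 0; i < value.length; i++) {
        const row: unknown = value[i];
        if (isRecord(row)) {
          refs.push(...extractMediaUsage(field.subFields, row, `${path}[${i}].`));
        }
      }
    } else if (field.type === "portableText" && Array.isArray(value)) {
      for (let i = 0; i < value.length; i++) {
        const block: unknown = value[i];
        // Assumed image block shape: { _type: "image", asset: { id, provider? } }.
        if (isRecord(block) && block["_type"] === "image") {
          const ref = toRef(block["asset"], `${path}[${i}]`);
          if (ref) refs.push(ref);
        }
      }
    }
  }
  return refs;
}
```

Driving the walk from field definitions means an unrelated object that merely happens to carry an `id` key is never indexed. Whether the real extractor works this way is not stated in the PR text.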
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| packages/core/tests/unit/media/usage-extractor.test.ts | Unit coverage for media reference extraction across supported field shapes. |
| packages/core/tests/integration/database/migrations.test.ts | Registers migration 046 in the integration migration test list. |
| packages/core/tests/integration/database/media-usage-repository.test.ts | Integration tests for MediaUsageRepository replace/query semantics and D1 bind-limit batching. |
| packages/core/tests/integration/database/media-usage-index.test.ts | Integration tests for schema/index creation, down/up behavior, and migration backfill. |
| packages/core/tests/integration/content/media-usage-index.test.ts | End-to-end integration coverage across content lifecycle operations and indexing behavior. |
| packages/core/src/seed/apply.ts | Ensures seed create/update writes also maintain media usage sources for the correct live/draft state. |
| packages/core/src/schema/registry.ts | Hooks schema mutations (field create/update/delete, collection delete) to reindex or clear usage as needed. |
| packages/core/src/plugins/context.ts | Updates plugin write/delete paths to refresh usage metadata (including deleted_at) for affected content. |
| packages/core/src/media/usage-index.ts | New indexing orchestration: computes indexed fields and replaces content/collection usage sources. |
| packages/core/src/media/usage-extractor.ts | New extractor that walks field data and portable text to produce normalized usage references. |
| packages/core/src/media/mime.ts | Adds MediaKind classification helper derived from MIME types. |
| packages/core/src/emdash-runtime.ts | Ensures draft revision save/restore paths also update media usage for draft state. |
| packages/core/src/database/types.ts | Adds DB typings for the new media usage tables. |
| packages/core/src/database/repositories/media-usage.ts | New repository implementing generation-based replace + current-usage queries. |
| packages/core/src/database/migrations/runner.ts | Registers migration 046 in the static migration provider. |
| packages/core/src/database/migrations/046_media_usage_index.ts | Creates usage tables + indexes and backfills existing content usage. |
| packages/core/src/api/handlers/revision.ts | Refreshes media usage when restoring a revision via the API handler. |
| packages/core/src/api/handlers/content.ts | Integrates media usage maintenance across content CRUD/publish/schedule/revision transitions and i18n syncing. |
| .changeset/media-usage-index.md | Minor release notes for introducing internal media usage tracking foundation. |
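The `mime.ts` row above mentions a MediaKind classification helper derived from MIME types. A minimal sketch of that idea follows; the member names of `MediaKind` and the function name `mediaKindFromMime` are assumptions, not the PR's actual exports.

```typescript
// Hypothetical kind set; the PR's MediaKind may use different members.
type MediaKind = "image" | "video" | "audio" | "document" | "other";

function mediaKindFromMime(mimeType: string): MediaKind {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const essence = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  if (essence.startsWith("image/")) return "image";
  if (essence.startsWith("video/")) return "video";
  if (essence.startsWith("audio/")) return "audio";
  // Treating PDF and text/* as documents is an illustrative choice.
  if (essence === "application/pdf" || essence.startsWith("text/")) return "document";
  return "other";
}
```

Classifying on the top-level type keeps the helper independent of any particular storage provider, which matches the provider-aware goal stated in the PR description.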
    await replaceContentMediaUsage(
      trx,
      revision.collection,
      updated,
      updated.status === "published" ? "live" : "draft",
    );

    const tableName = `ec_${collection}`;
    const result = await sql<Record<string, unknown>>`
      SELECT * FROM ${sql.ref(tableName)}
    `.execute(db);
    const revisionRepo = new RevisionRepository(db);

    for (const row of result.rows) {
      const item = rowToContentItem(collection, row);
Overlapping PRs
This PR modifies files that are also changed by other open PRs. This may cause merge conflicts or duplicated work. A maintainer will coordinate.
What does this PR do?
Adds the internal media usage index foundation for content entries.
This is the first implementation layer for Media Library maturity work discussed in #1655. It gives EmDash a durable, provider-aware record of where media assets are referenced by content, without exposing that data through a public API or UI yet.
The new index tracks usage for content fields that can hold media references, including top-level image/file fields, repeater image subfields, Portable Text image blocks, and structured provider media references. Local media is indexed by `media_id`, while non-local structured references are indexed by provider and provider asset ID so future integrations like Cloudflare Images, Mux, or Cloudinary do not require redesigning the storage model.

The migration adds two internal tables:

- `_emdash_media_usage_sources`
- `_emdash_media_usage`

The source/generation model makes usage replacement safe for D1-style environments by avoiding a visible delete-then-insert gap when reindexing a content item.
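One way to read the source/generation model is sketched below as an in-memory analogue. This is an interpretation, not the PR's `MediaUsageRepository`: the class name, the `sourceKey` format, and the three-step ordering (write new rows, flip the source's current generation, then clean up) are assumptions chosen to show how a reader that filters on the current generation never observes an empty gap.

```typescript
// In-memory stand-ins for the two tables; names and columns are hypothetical.
interface UsageRow {
  sourceKey: string;
  generation: number;
  mediaId: string;
}

class InMemoryUsageIndex {
  // Analogue of the sources table: sourceKey -> current generation.
  private sources = new Map<string, number>();
  // Analogue of the usage table.
  private rows: UsageRow[] = [];

  replace(sourceKey: string, mediaIds: string[]): void {
    const next = (this.sources.get(sourceKey) ?? 0) + 1;
    // 1. Write the new rows under a generation that no reader selects yet.
    for (const mediaId of mediaIds) {
      this.rows.push({ sourceKey, generation: next, mediaId });
    }
    // 2. Flip the pointer; from here on readers see only the new rows.
    this.sources.set(sourceKey, next);
    // 3. Remove rows from older generations; they are already invisible.
    this.rows = this.rows.filter(
      (r) => r.sourceKey !== sourceKey || r.generation === next,
    );
  }

  currentUsage(sourceKey: string): string[] {
    const generation = this.sources.get(sourceKey);
    if (generation === undefined) return [];
    return this.rows
      .filter((r) => r.sourceKey === sourceKey && r.generation === generation)
      .map((r) => r.mediaId);
  }
}
```

If any step fails partway, readers still see either the complete old set or the complete new set, which is the property a plain delete-then-insert cannot offer without a transaction spanning both statements.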
The index is maintained across the content lifecycle:
This PR intentionally does not add:
Those are follow-up features built on top of this foundation.
Discussion: #1655
Closes: n/a
Type of change
Checklist
- `pnpm typecheck` passes
- `pnpm lint` passes
- `pnpm test` passes (or targeted tests for my change)
- `pnpm format` has been run
- `messages.po` changes except in translation PRs (a workflow extracts catalogs on merge to `main`).

AI-generated code disclosure
Screenshots / test output
No screenshots. This PR does not add UI.
Verified locally:
    pnpm lint
    pnpm typecheck
    pnpm format
    pnpm --filter emdash exec vitest run \
      tests/unit/media/usage-extractor.test.ts \
      tests/integration/database/media-usage-index.test.ts \
      tests/integration/database/media-usage-repository.test.ts \
      tests/integration/content/media-usage-index.test.ts \
      tests/integration/database/migrations.test.ts \
      tests/integration/seed-live-revisions.test.ts \
      tests/unit/seed/apply.test.ts \
      tests/unit/seed/media.test.ts