| 1 | +# Cloudinary → Sanity Asset Migration (Sanity-First) |
| 2 | + |
| 3 | +A production-grade Node.js tool that migrates Cloudinary assets to Sanity using |
| 4 | +a **Sanity-first** approach: it starts by scanning your Sanity documents to |
| 5 | +discover which Cloudinary assets are actually referenced, then migrates only |
| 6 | +those assets and rewrites all references. |
| 7 | + |
| 8 | +## Why Sanity-First? |
| 9 | + |
| 10 | +The previous approach enumerated **all** Cloudinary assets and uploaded them |
| 11 | +blindly. That was wasteful and incomplete: |
| 12 | + |
| 13 | +- Many Cloudinary assets may not be referenced by any Sanity document, so |
| 14 | +  uploading everything wasted time and storage |
| 15 | +- It couldn't handle the Sanity Cloudinary plugin's `cloudinary.asset` type |
| 16 | + |
| 17 | +The new approach: |
| 18 | + |
| 19 | +1. **Discovers** what's actually used in Sanity |
| 20 | +2. **Extracts** a deduplicated list of Cloudinary URLs |
| 21 | +3. **Migrates** only what's needed |
| 22 | +4. **Updates** all references in-place |
| 23 | +5. **Reports** a full summary |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## Prerequisites |
| 28 | + |
| 29 | +| Requirement | Why | |
| 30 | +|---|---| |
| 31 | +| **Node.js ≥ 18** | Native `fetch` support & ES-module compatibility | |
| 32 | +| **Sanity project** | Project ID, dataset name, and a **write-enabled** API token | |
| 33 | + |
| 34 | +> **Note:** Cloudinary API credentials are no longer required! The script |
| 35 | +> downloads assets directly from their public URLs. You only need Cloudinary |
| 36 | +> credentials if your assets are private/restricted. |
| 37 | + |
| 38 | +## Quick Start |
| 39 | + |
| 40 | +```bash |
| 41 | +# 1. Install dependencies |
| 42 | +cd migration |
| 43 | +npm install |
| 44 | + |
| 45 | +# 2. Create your .env from the template |
| 46 | +cp env-example.txt .env |
| 47 | +# Then fill in your real credentials |
| 48 | + |
| 49 | +# 3. Preview the full migration with a dry run first |
| 50 | +npm run migrate:dry-run |
| 51 | + |
| 52 | +# 4. Run for real |
| 53 | +npm run migrate |
| 54 | +``` |
| 55 | + |
| 56 | +## Environment Variables |
| 57 | + |
| 58 | +Copy `env-example.txt` to `.env` and fill in: |
| 59 | + |
| 60 | +| Variable | Required | Description | |
| 61 | +|---|---|---| |
| 62 | +| `SANITY_PROJECT_ID` | ✅ | Sanity project ID | |
| 63 | +| `SANITY_DATASET` | ✅ | Sanity dataset (e.g. `production`) | |
| 64 | +| `SANITY_TOKEN` | ✅ | Sanity API token with **write** access | |
| 65 | +| `CLOUDINARY_CLOUD_NAME` | | Cloudinary cloud name (default: `ajonp`) | |
| 66 | +| `CONCURRENCY` | | Max parallel uploads (default: `5`) | |
| 67 | +| `DRY_RUN` | | Set to `true` to preview without writing | |
| 68 | + |
| 69 | +## CLI Flags |
| 70 | + |
| 71 | +```bash |
| 72 | +node migrate.mjs # Full migration, all phases |
| 73 | +node migrate.mjs --dry-run # Preview mode — no writes |
| 74 | +node migrate.mjs --phase=1 # Run only Phase 1 |
| 75 | +node migrate.mjs --phase=1,2 # Run Phases 1 & 2 |
| 76 | +node migrate.mjs --phase=3,4 # Run Phases 3 & 4 (uses cached data) |
| 77 | +node migrate.mjs --concurrency=10 # Override parallel upload limit |
| 78 | +``` |
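
The flags above could be parsed roughly as follows. This is a minimal sketch, not the code in `migrate.mjs`: the function name `parseArgs`, the shape of the returned object, and the rule that CLI flags take precedence over `.env` values are all assumptions made for illustration.

```javascript
// Hypothetical flag parser; the real migrate.mjs may parse arguments differently.
function parseArgs(argv, env = {}) {
  const options = {
    dryRun: env.DRY_RUN === 'true',            // DRY_RUN from .env
    phases: [1, 2, 3, 4, 5],                   // default: all phases
    concurrency: Number(env.CONCURRENCY) || 5, // documented default of 5
  };
  for (const arg of argv) {
    if (arg === '--dry-run') {
      options.dryRun = true;
    } else if (arg.startsWith('--phase=')) {
      // "--phase=1,2" becomes [1, 2]; anything outside 1..5 is dropped
      options.phases = arg
        .slice('--phase='.length)
        .split(',')
        .map(Number)
        .filter((n) => Number.isInteger(n) && n >= 1 && n <= 5);
    } else if (arg.startsWith('--concurrency=')) {
      const value = Number(arg.slice('--concurrency='.length));
      if (Number.isInteger(value) && value > 0) options.concurrency = value;
    }
  }
  return options;
}

// Example: same shape as `node migrate.mjs --phase=3,4 --concurrency=10`
console.log(parseArgs(['--phase=3,4', '--concurrency=10'], { DRY_RUN: 'true' }));
```

In a real entry point the call would likely be `parseArgs(process.argv.slice(2), process.env)`.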
| 79 | + |
| 80 | +## What Each Phase Does |
| 81 | + |
| 82 | +### Phase 1 — Discover Cloudinary References in Sanity |
| 83 | + |
| 84 | +Scans **all** Sanity documents (excluding built-in asset types) to find any |
| 85 | +that reference Cloudinary. Handles two types of references: |
| 86 | + |
| 87 | +#### `cloudinary.asset` objects (Sanity Cloudinary Plugin) |
| 88 | + |
| 89 | +The [sanity-plugin-cloudinary](https://github.com/sanity-io/sanity-plugin-cloudinary) |
| 90 | +stores assets as objects with `_type: "cloudinary.asset"` containing fields like |
| 91 | +`public_id`, `secure_url`, `resource_type`, `format`, etc. |
| 92 | + |
| 93 | +#### Plain URL strings |
| 94 | + |
| 95 | +Any string field containing: |
| 96 | +- `res.cloudinary.com/ajonp` (standard Cloudinary URL) |
| 97 | +- `media.codingcat.dev` (custom CNAME domain) |
| 98 | + |
| 99 | +This includes both standalone URL fields and URLs embedded in text/markdown content. |
| 100 | + |
| 101 | +**Output:** `discovered-references.json` — list of documents with their Cloudinary references. |
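
As an illustration of the two reference types, the sketch below walks a single document recursively and records each match. The helper name `findCloudinaryRefs`, the record shape, and the dotted path format are hypothetical; `CLOUDINARY_PATTERNS` borrows the name mentioned under Troubleshooting, but its exact shape in the script is an assumption.

```javascript
// Host patterns from the list above. The array shape is assumed, not taken from the script.
const CLOUDINARY_PATTERNS = ['res.cloudinary.com/ajonp', 'media.codingcat.dev'];

// Hypothetical helper: recursively collect Cloudinary references from one document.
function findCloudinaryRefs(value, path = [], refs = []) {
  if (typeof value === 'string') {
    if (CLOUDINARY_PATTERNS.some((pattern) => value.includes(pattern))) {
      // A string that is exactly one URL counts as a URL field; anything else is embedded text.
      const kind = /^https?:\/\/\S+$/.test(value) ? 'url' : 'embedded';
      refs.push({ kind, path: path.join('.'), value });
    }
  } else if (Array.isArray(value)) {
    value.forEach((item, index) => findCloudinaryRefs(item, [...path, index], refs));
  } else if (value && typeof value === 'object') {
    if (value._type === 'cloudinary.asset') {
      refs.push({ kind: 'cloudinary.asset', path: path.join('.'), value: value.secure_url });
      return refs; // do not descend, or the object's own URL fields would be counted twice
    }
    for (const [key, child] of Object.entries(value)) {
      findCloudinaryRefs(child, [...path, key], refs);
    }
  }
  return refs;
}

// Example document with one reference of each kind (all values are made up).
const doc = {
  _id: 'doc-abc',
  cover: {
    _type: 'cloudinary.asset',
    public_id: 'folder/photo',
    secure_url: 'https://res.cloudinary.com/ajonp/image/upload/v123/folder/photo.jpg',
  },
  ogImage: 'https://media.codingcat.dev/image/upload/v1/og.png',
  body: 'Intro ![alt](https://res.cloudinary.com/ajonp/image/upload/v2/inline.png) outro',
};
console.log(findCloudinaryRefs(doc));
```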
| 102 | + |
| 103 | +### Phase 2 — Extract Unique Cloudinary URLs |
| 104 | + |
| 105 | +Deduplicates all discovered references into a unique list of Cloudinary asset |
| 106 | +URLs that need to be migrated. Tracks which documents reference each URL. |
| 107 | + |
| 108 | +**Output:** `unique-cloudinary-urls.json` — deduplicated URL list with metadata: |
| 109 | +```json |
| 110 | +{ |
| 111 | + "cloudinaryUrl": "https://res.cloudinary.com/ajonp/image/upload/v123/folder/photo.jpg", |
| 112 | + "cloudinaryPublicId": "folder/photo", |
| 113 | + "resourceType": "image", |
| 114 | + "sourceDocIds": ["doc-abc", "doc-def"] |
| 115 | +} |
| 116 | +``` |
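
A minimal sketch of the deduplication step, assuming the discovered references have been flattened into `{ docId, url }` pairs. The function names and that input shape are hypothetical. The URL parser only covers plain delivery URLs of the form `.../<resource_type>/upload/v<version>/<public_id>.<ext>`; URLs with transformation segments but no version segment would need extra handling.

```javascript
// Hypothetical parser for standard Cloudinary delivery URLs.
function parseCloudinaryUrl(url) {
  const segments = new URL(url).pathname.split('/').filter(Boolean);
  const uploadIndex = segments.indexOf('upload');
  if (uploadIndex < 1) return null; // not a recognisable delivery URL
  const resourceType = segments[uploadIndex - 1];
  let rest = segments.slice(uploadIndex + 1);
  // Drop everything up to and including the version segment (e.g. "v123"), if present.
  const versionIndex = rest.findIndex((segment) => /^v\d+$/.test(segment));
  if (versionIndex !== -1) rest = rest.slice(versionIndex + 1);
  // The public ID is the remaining path without its file extension.
  const cloudinaryPublicId = rest.join('/').replace(/\.[a-z0-9]+$/i, '');
  return { cloudinaryPublicId, resourceType };
}

// Hypothetical dedup: one entry per URL, tracking every document that references it.
function extractUniqueUrls(references) {
  const byUrl = new Map();
  for (const { docId, url } of references) {
    let entry = byUrl.get(url);
    if (!entry) {
      const parsed = parseCloudinaryUrl(url);
      if (!parsed) continue; // skip URLs this sketch cannot parse
      entry = { cloudinaryUrl: url, ...parsed, sourceDocIds: [] };
      byUrl.set(url, entry);
    }
    if (!entry.sourceDocIds.includes(docId)) entry.sourceDocIds.push(docId);
  }
  return [...byUrl.values()];
}

const photo = 'https://res.cloudinary.com/ajonp/image/upload/v123/folder/photo.jpg';
console.log(
  JSON.stringify(
    extractUniqueUrls([
      { docId: 'doc-abc', url: photo },
      { docId: 'doc-def', url: photo },
    ]),
    null,
    2
  )
);
```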
| 117 | + |
| 118 | +### Phase 3 — Download & Upload Assets |
| 119 | + |
| 120 | +Downloads each unique Cloudinary asset and uploads it to Sanity's asset pipeline. |
| 121 | + |
| 122 | +**Output:** `asset-mapping.json` — mapping between Cloudinary and Sanity: |
| 123 | +```json |
| 124 | +{ |
| 125 | + "cloudinaryUrl": "https://res.cloudinary.com/ajonp/image/upload/v123/folder/photo.jpg", |
| 126 | + "cloudinaryPublicId": "folder/photo", |
| 127 | + "sanityAssetId": "image-abc123-1920x1080-jpg", |
| 128 | + "sanityUrl": "https://cdn.sanity.io/images/{projectId}/{dataset}/abc123-1920x1080.jpg", |
| 129 | + "sourceDocIds": ["doc-abc", "doc-def"] |
| 130 | +} |
| 131 | +``` |
| 132 | + |
| 133 | +- **Resume support**: assets already in the mapping are skipped automatically. |
| 134 | +- **Retries**: failed downloads and uploads are retried up to 3× with exponential back-off. |
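
The resume and retry behaviour can be sketched as below. The helper names, the 500 ms base delay, and the reading of "up to 3×" as one initial attempt plus three retries are assumptions. The actual script may count attempts and space delays differently.

```javascript
// Hypothetical resume filter: keep only assets that are not yet in the mapping.
function pendingAssets(uniqueUrls, mapping) {
  const done = new Set(mapping.map((entry) => entry.cloudinaryUrl));
  return uniqueUrls.filter((asset) => !done.has(asset.cloudinaryUrl));
}

// Hypothetical retry wrapper with exponential back-off (delay doubles after each failure).
async function withRetry(task, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // out of retries
      const delayMs = baseDelayMs * 2 ** attempt; // 500, 1000, 2000 with the defaults
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Example: a task that fails twice before succeeding (short delays for the demo).
let calls = 0;
withRetry(
  async () => {
    calls += 1;
    if (calls < 3) throw new Error(`attempt ${calls} failed`);
    return `succeeded on attempt ${calls}`;
  },
  { baseDelayMs: 10 }
).then(console.log);
```

In the migration itself, the wrapped task would be the download-then-upload step for a single asset.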
| 135 | + |
| 136 | +### Phase 4 — Update References |
| 137 | + |
| 138 | +Patches Sanity documents to replace Cloudinary references with Sanity references: |
| 139 | + |
| 140 | +| Reference Type | Action | |
| 141 | +|---|---| |
| 142 | +| `cloudinary.asset` object | Replaced with `{ _type: "image", asset: { _type: "reference", _ref: "..." } }` | |
| 143 | +| Full URL string | Replaced with Sanity CDN URL | |
| 144 | +| Embedded URL in text | URL swapped inline within the text | |
| 145 | + |
| 146 | +All patches are applied inside **transactions** for atomicity (one transaction per document). |
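
The three replacement rules in the table can be expressed as one recursive rewrite over a document's values. This is a sketch under assumptions: `rewriteValue` is a hypothetical name, the mapping is assumed to be keyed by Cloudinary URL with the fields shown in `asset-mapping.json`, and `cloudinary.asset` objects are assumed to be looked up by their `secure_url`.

```javascript
// Hypothetical rewrite: returns a copy of `value` with Cloudinary references replaced.
// `mapping` is a Map from Cloudinary URL to { sanityAssetId, sanityUrl }.
function rewriteValue(value, mapping) {
  if (typeof value === 'string') {
    // Covers both full-URL fields and URLs embedded in longer text.
    let result = value;
    for (const [cloudinaryUrl, { sanityUrl }] of mapping) {
      result = result.split(cloudinaryUrl).join(sanityUrl);
    }
    return result;
  }
  if (Array.isArray(value)) return value.map((item) => rewriteValue(item, mapping));
  if (value && typeof value === 'object') {
    if (value._type === 'cloudinary.asset') {
      const entry = mapping.get(value.secure_url);
      if (!entry) return value; // leave unmapped assets untouched
      const replacement = {
        _type: 'image',
        asset: { _type: 'reference', _ref: entry.sanityAssetId },
      };
      if (value._key) replacement._key = value._key; // keep array item keys stable
      return replacement;
    }
    return Object.fromEntries(
      Object.entries(value).map(([key, child]) => [key, rewriteValue(child, mapping)])
    );
  }
  return value;
}

// Example with made-up IDs.
const cloudinaryUrl = 'https://res.cloudinary.com/ajonp/image/upload/v123/folder/photo.jpg';
const mapping = new Map([
  [cloudinaryUrl, {
    sanityAssetId: 'image-abc123-1920x1080-jpg',
    sanityUrl: 'https://cdn.sanity.io/images/proj/production/abc123-1920x1080.jpg',
  }],
]);
const before = {
  cover: { _type: 'cloudinary.asset', secure_url: cloudinaryUrl },
  body: `See ![photo](${cloudinaryUrl}) for details.`,
};
console.log(JSON.stringify(rewriteValue(before, mapping), null, 2));
```

Running the rewrite a second time changes nothing, because no Cloudinary URLs remain. With `@sanity/client`, the changed fields for one document could then be committed in a single transaction, for example `client.transaction().patch(docId, (p) => p.set(fields)).commit()`; how the script actually builds its patches is not shown here.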
| 147 | + |
| 148 | +### Phase 5 — Report |
| 149 | + |
| 150 | +Prints a summary to the console and writes a detailed report: |
| 151 | + |
| 152 | +``` |
| 153 | +══════════════════════════════════════════════════════════ |
| 154 | + MIGRATION SUMMARY |
| 155 | +══════════════════════════════════════════════════════════ |
| 156 | + Documents with refs: 42 |
| 157 | + Total references found: 128 |
| 158 | + cloudinary.asset objects: 35 |
| 159 | + URL string fields: 61 |
| 160 | + Embedded URLs in text: 32 |
| 161 | + Unique Cloudinary URLs: 87 |
| 162 | + Assets uploaded to Sanity: 87 |
| 163 | + Document fields updated: 128 |
| 164 | + Errors: 0 |
| 165 | +══════════════════════════════════════════════════════════ |
| 166 | +``` |
| 167 | + |
| 168 | +**Output:** `migration-report.json` |
| 169 | + |
| 170 | +## Generated Files |
| 171 | + |
| 172 | +| File | Phase | Description | |
| 173 | +|---|---|---| |
| 174 | +| `discovered-references.json` | 1 | Documents with Cloudinary references | |
| 175 | +| `unique-cloudinary-urls.json` | 2 | Deduplicated Cloudinary URLs to migrate | |
| 176 | +| `asset-mapping.json` | 3 | Cloudinary → Sanity asset mapping | |
| 177 | +| `migration-report.json` | 5 | Full migration report | |
| 178 | + |
| 179 | +## Resuming an Interrupted Migration |
| 180 | + |
| 181 | +The script is fully resumable: |
| 182 | + |
| 183 | +1. **Phase 1** is skipped if `discovered-references.json` exists. |
| 184 | +2. **Phase 2** is skipped if `unique-cloudinary-urls.json` exists. |
| 185 | +3. **Phase 3** skips any asset already present in `asset-mapping.json`. |
| 186 | +4. **Phases 4–5** are idempotent — re-running them is safe. |
| 187 | + |
| 188 | +To start completely fresh, delete the generated JSON files: |
| 189 | + |
| 190 | +```bash |
| 191 | +rm -f discovered-references.json unique-cloudinary-urls.json asset-mapping.json migration-report.json |
| 192 | +``` |
| 193 | + |
| 194 | +## Troubleshooting |
| 195 | + |
| 196 | +| Problem | Fix | |
| 197 | +|---|---| |
| 198 | +| `401 Unauthorized` from Sanity | Check `SANITY_TOKEN` has write permissions | |
| 199 | +| Download fails for private assets | Add Cloudinary credentials to `.env` and modify the download logic | |
| 200 | +| Script hangs | Check your network connection; the script logs progress for every asset, so the last log line shows where it stopped | |
| 201 | +| Partial migration | Just re-run — resume picks up where it left off | |
| 202 | +| `cloudinary.asset` not detected | Ensure the field has `_type: "cloudinary.asset"` in the document | |
| 203 | +| Custom CNAME not detected | Add your domain to `CLOUDINARY_PATTERNS` in the script | |