Skip to content

Commit ac6dd52

Browse files
authored
fix: image-load retry policy + typed errors (#343)
* fix: type image-load errors and retry patiently with backoff Users report that images uploaded from desktop sometimes don't appear on web until the page is refreshed multiple times. Two web-side causes: 1. `checkImage` for AppFlowy URLs falls back to `validateImageLoad` (an unauthenticated `<img>` load) when the authenticated fetch fails. AppFlowy storage *requires* a Bearer token, so this fallback is guaranteed to 401/403 and the browser caches that failure under the URL. The polling loop then burns its retry attempts on a request that physically cannot succeed. 2. `Img.tsx` polls only 5× over 30 seconds with a single error state. If the upload pipeline takes longer than that (which it often does for first-render right after desktop uploads), polling exhausts and the user sees a permanent "Image not found" until they refresh. Changes: - `image.ts`: drop the no-auth fallback for AppFlowy URLs; return a typed `CheckImageErrorKind` (`no-auth`, `auth-rejected`, `forbidden`, `not-ready`, `not-found`, `server-error`, `network`, `format`) so the caller can pick a backoff per category. Add `cache: 'no-store'` on the fetch as defense-in-depth against stale CDN/browser entries. - `Img.tsx`: replace 5×6s polling with a typed-error-driven exponential backoff (~5 min wall-clock budget, ±20% jitter), three-state UI (loading → pending after 10 s → failed after budget), and a visibilitychange/focus listener that re-checks when the user returns to the tab. Skip retries entirely when the URL is empty (YJS sync hasn't delivered it yet). Manual retry button when finally failed. Together with the server-side cache-control fix (no-store on 4xx, 425 on metadata/S3 race), this turns "refresh five times" into "wait two seconds, image appears" for the common slow-upload case. * fix: honor server Cache-Control on first fetch, bypass only on retry Previously, `checkAppFlowyImage` set `cache: 'no-store'` on every fetch as defense-in-depth against stale negative cache entries. But that disables the browser HTTP cache for *all* responses — including the 200s that the server now tags `Cache-Control: public, immutable, max-age=31536000`. Result: every component mount re-downloaded the image even though the server told the browser it could keep the bytes for a year. Switch to a per-attempt cache policy: - First attempt: `cache: 'default'` — honor server headers. Successful fetches enter the HTTP cache; the next mount of the same Img is free. Failures (no-store from the server) still skip the cache, so no poisoning risk. - Retry attempts: `cache: 'reload'` — force a server round-trip to defeat any stale cached response from a misbehaving proxy. Unlike `no-store`, `reload` still lets the *new* response enter the cache, so once we finally succeed we stop paying network for re-renders. Adds a `CheckImageOptions { retry?: boolean }` parameter to `checkImage` and threads it from the polling loop in Img.tsx (`{ retry: attempt > 0 }`). * fix: address review issues from React best-practices pass Six fixes to Img.tsx + image.ts, in order of severity: 1. Image was hidden forever after successful load. `phase === 'loading'` left `visibility: hidden` on the <img>, and the <img>'s own onLoad handler also wrote 'loading'. Split into two state machines: `phase` tracks the fetch lifecycle (loading|pending|failed), `isImageReady` tracks the <img>'s decode lifecycle (set by native onLoad). Visibility keys on `isImageReady` only. 2. Inline `onLoad={() => ...}` props from parent re-mounted on every render churned `runCheck`'s identity, which made the URL-effect cancel the in-flight retry chain and start fresh — so every parent re-render re-fetched the image. Now `onLoad` is read through `onLoadRef` and `runCheck` has a stable identity (deps: [cancelAllInflight, scheduleTimer]). 3. The visibilitychange/focus listener effect re-subscribed every time phase or url changed. Now attached once; reads via `phaseRef`, `urlRef`. 4. The auth fetch in `image.ts::checkAppFlowyImage` ignored cancellation — the polling loop checked refs between awaits, but the fetch itself ran to completion. Added `signal?: AbortSignal` to `CheckImageOptions` and plumbed an `AbortController` per `runCheck` invocation. AbortError is classified separately so it doesn't log as a network error. 5. Untracked `setTimeout`s (pendingTimer, retry-delay) now go through a `timersRef` Set. On cancellation `cancelAllInflight()` aborts every controller and clears every pending timer in one call. Retry-delay `sleep` resolves early when the abort signal fires. 6. Free perf wins on the rendered <img>: - `loading="lazy"` — defer offscreen images until they near viewport - `decoding="async"` — let the browser decode off-thread - Added `onError` handler that flips to 'failed' on decode failures (rare but previously silent). Skipped from the review: - Better `alt` text (would require parent prop plumbing; pre-existing). - Extracting state machine into a `useImageWithRetry` hook (worthwhile but a structural change; tracked for a follow-up). * refactor: extract image-load state machine + add accessibility Two follow-ups to the React best-practices review (PR comments #7, #8): **#8 — Extract `useImageWithRetry` hook.** The fetch state machine (backoff schedules, abort controllers, timer tracking, ref-stable listener attachment) was ~200 lines of imperative async logic awkwardly inlined in a presentation component. Moved to `src/components/editor/components/blocks/image/useImageWithRetry.ts`, which returns `{ src, phase, isImageReady, lastError, retry, onImageLoaded, onImageError }`. `Img.tsx` is now ~100 lines of pure rendering. The hook is unit-testable in isolation with a fetch mock, and the JSX no longer hides correctness concerns behind a wall of useCallback/useEffect plumbing. **#7 — Accessibility.** Pre-existing issue: every image rendered with `alt={''}`, which tells assistive tech "this is decorative" — wrong for content images in a document editor. `Img` now accepts an `alt` prop. When omitted (current callers), it derives a label from the URL filename — strips extension, percent-encoding, AppFlowy upload prefix, collapses separators. Falls back to "Embedded image" if nothing sensible can be extracted. Future work: thread an explicit `alt` from ImageBlockData once the data model gains a caption field. Also tightened the failure/loading overlays with `role="status"`, `aria-live="polite"`, `aria-label`, and `aria-hidden` on the decorative error icon — screen readers now announce loading state transitions instead of going silent. Behavior unchanged; lint + type-check clean. Image E2E tests not re-run in this commit (local docker stack down — port 8000 not listening); they passed against the previous commit on this branch and the refactor preserves all observable behavior. * polish: address second-pass review feedback Five small fixes from the React best-practices re-review: 1. Pending-timer false flash (#1): the loading→pending promotion timer could fire AFTER fetch success but BEFORE <img> decode, briefly showing "Waiting for upload to finish…" when the upload was in fact finished. Track a local `settled` flag for each run; the timer short-circuits once settled. 2. Rename `previousBlobUrlRef` → `currentBlobUrlRef` (#2). It holds the current rendered blob URL, not a previous one — name now matches meaning. Pure cosmetics, no behavior change. 3. Smarter alt-text fallback (#3): AppFlowy storage URLs end in opaque hashes (file_id) that produce nonsense alts like "bGTk 4nxVz" — no better than empty alt for screen-reader users. Short-circuit those URLs to a generic "Image" via `isAppFlowyFileStorageUrl`. Also reject non-word-looking segments from external URLs (heuristic: needs a vowel and 3+ chars). 4. Collapse three ref-mirror useEffects into one `useLatest` helper (#4), matching Vercel's `advanced-use-latest` pattern. One effect per ref instead of three; reusable if more refs are needed later. 5. Simplify hook signature (#5): `useImageWithRetry(url, onReady?)` instead of `useImageWithRetry(url, { onReady })`. Drops the per-render options-object allocation and the indirection inside the hook. Lint, type-check, and 5 image E2E tests pass. * test: stabilize duplicate row inline database edit
1 parent 74bd3d0 commit ac6dd52

6 files changed

Lines changed: 660 additions & 217 deletions

File tree

playwright/e2e/database/duplicate-row-inline-db.spec.ts

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -129,12 +129,20 @@ test.describe('Duplicate row with inline database', () => {
129129
const dupGridBlock = dbBlocks(dupEditor).first();
130130
await expect(dupGridBlock).toBeVisible({ timeout: 30000 });
131131

132-
// Edit the duplicated row's inline grid cell
133-
await editCell(page, dupGridBlock, 'modified in duplicate');
134-
await page.waitForTimeout(1000);
132+
// Ensure the inline grid has finished hydrating before we try to edit:
133+
// at least one data row must be present, and its primary cell must not be
134+
// showing the loading spinner. Without this gate, the edit can target a
135+
// not-yet-mounted TextCell and never commit to the duplicated inline DB.
136+
await expect(dupGridBlock.locator('[data-testid^="grid-cell-"]').first()).toBeVisible({
137+
timeout: 30000,
138+
});
139+
await expect(dupGridBlock.locator('[data-testid^="primary-cell-loading-"]')).toHaveCount(0, {
140+
timeout: 30000,
141+
});
135142

136-
// Verify the edit took effect
137-
expect(await cellText(dupGridBlock)).toBe('modified in duplicate');
143+
// editCell asserts the cell text becomes the typed value before returning,
144+
// so a separate re-check would be redundant.
145+
await editCell(page, dupGridBlock, 'modified in duplicate');
138146

139147
// Navigate back to the grid
140148
await page.goBack();

playwright/support/duplicate-test-helpers.ts

Lines changed: 72 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -440,11 +440,78 @@ export async function insertLinkedGridViaSlash(
440440
export async function editFirstGridCell(page: Page, gridBlock: Locator, text: string): Promise<void> {
441441
const firstCell = gridBlock.locator('[data-testid^="grid-cell-"]').first();
442442
await expect(firstCell).toBeVisible({ timeout: 15000 });
443-
await firstCell.click({ force: true });
444-
await page.waitForTimeout(300);
445-
await page.keyboard.press(process.platform === 'darwin' ? 'Meta+A' : 'Control+A');
446-
await page.keyboard.type(text);
447-
await page.keyboard.press('Enter');
443+
444+
// The grid-row-cell wrapper owns the activation click handler. Clicking the
445+
// inner data-testid div relies on bubbling, which can be eaten on slow CI if
446+
// the row is still hydrating, so assert the editor is actually mounted.
447+
448+
// Wait for the primary cell's row data to finish hydrating. PrimaryCell.tsx
449+
// renders a CircularProgress with testid `primary-cell-loading-<rowId>`
450+
// until the row's collab content is loaded; clicking before then has no
451+
// effect because the editable TextCell isn't mounted yet.
452+
await expect(firstCell.locator('[data-testid^="primary-cell-loading-"]')).toHaveCount(0, {
453+
timeout: 30000,
454+
});
455+
456+
// The activation click handler lives on the .grid-row-cell wrapper (see
457+
// GridVirtualColumn.tsx). Resolve it directly via the DOM rather than an
458+
// ancestor xpath. Playwright's `ancestor::` returns the outermost match
459+
// even with `.first()`, which is the document root, not the wrapper.
460+
const editingTextarea = firstCell.locator('textarea');
461+
462+
let entered = false;
463+
464+
for (let attempt = 0; attempt < 4; attempt++) {
465+
const handle = await firstCell.elementHandle();
466+
const wrapper = handle
467+
? await handle.evaluateHandle((el) => (el as HTMLElement).closest('.grid-row-cell') ?? el)
468+
: null;
469+
const wrapperElement = wrapper?.asElement();
470+
471+
if (wrapperElement) {
472+
await wrapperElement.click({ force: true });
473+
} else {
474+
await firstCell.click({ force: true });
475+
}
476+
477+
if (await editingTextarea.isVisible().catch(() => false)) {
478+
entered = true;
479+
break;
480+
}
481+
482+
try {
483+
await expect(editingTextarea).toBeVisible({ timeout: 3000 });
484+
entered = true;
485+
break;
486+
} catch {
487+
// Some browsers / cell types only enter edit mode on Enter after focus,
488+
// not on raw click. Try that next.
489+
await page.keyboard.press('Enter').catch(() => undefined);
490+
491+
if (await editingTextarea.isVisible({ timeout: 1500 }).catch(() => false)) {
492+
entered = true;
493+
break;
494+
}
495+
496+
await page.waitForTimeout(500);
497+
}
498+
}
499+
500+
if (!entered) {
501+
throw new Error(
502+
'First grid cell did not enter edit mode after click. Is the grid readOnly, ' +
503+
'still loading, or covered by another element?'
504+
);
505+
}
506+
507+
await editingTextarea.focus();
508+
await editingTextarea.fill(text);
509+
await expect(editingTextarea).toHaveValue(text, { timeout: 5000 });
510+
await editingTextarea.press('Enter');
511+
// After Enter, the textarea unmounts and the cell re-renders with the new
512+
// value. Wait for the textarea to be gone so the next innerText() reads the
513+
// committed display text, not stale empty editing markup.
514+
await expect(editingTextarea).toHaveCount(0, { timeout: 10000 });
448515
await expect
449516
.poll(async () => firstGridCellText(gridBlock), {
450517
timeout: 15000,

src/components/editor/components/blocks/image/ImageRender.tsx

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,9 @@ function ImageRender({
7171
width={rendered ? (newWidth ?? '100%') : 0}
7272
imgRef={imgRef}
7373
url={url}
74+
// `ImageBlockData` has no caption field today, so let `Img` derive
75+
// an alt from the URL filename. This avoids screen readers reading
76+
// the raw blob URL or skipping the image entirely.
7477
onLoad={() => {
7578
setRendered(true);
7679
}}

0 commit comments

Comments
 (0)