Skip to content

Add @handle.tld mention support: profile links and Bluesky notifications#165

Open
jeherve wants to merge 25 commits into
trunkfrom
add/content-mentions
Open

Add @handle.tld mention support: profile links and Bluesky notifications#165
jeherve wants to merge 25 commits into
trunkfrom
add/content-mentions

Conversation

@jeherve

@jeherve jeherve commented Jun 25, 2026

Copy link
Copy Markdown
Member

Fixes #142

Proposed changes:

Adds Bluesky @handle.tld mention support so authors can notify accounts and link to their profiles. This works in two places: a display-side linkifier (includes/class-mention.php) wraps bare handles in the_content as profile links routed through the appview_url() helper (so they honor the atmosphere_appview_host filter), and a publishing-side carry-over pulls resolvable body mentions into long-form and teaser-thread Bluesky post text so mentioned accounts are notified even when the mention is buried deep in the body. It also replaces the unreliable did:web:<handle> resolution fallback in Facet::resolve_mention() with the real handle-resolution chain (Resolver::handle_to_did()), memoized per request to bound lookups.

Other information:

  • Have you written new tests for your changes, if applicable?

Testing instructions:

  • Publish a post whose body mentions a real Bluesky account using @handle.tld (e.g. @bsky.app).
  • On the front end, confirm the handle renders as a link to the account's profile on your site's appview host.
  • On Bluesky, confirm the published post (including long-form and teaser-thread variants) carries the mention so the account is notified, and that the mention resolves to the correct DID rather than a dead did:web profile.
  • Run composer lint and npm run env-test — both pass (786 tests).

Changelog entry

A changelog entry is already committed at .github/changelog/add-content-mentions (Significance: minor, Type: added).

jeherve added 12 commits June 24, 2026 19:44
Mention facets only tried the DNS TXT (_atproto.<handle>) resolution
method, which is empty for every *.bsky.social handle, then fabricated
did:web:<handle> on the miss. That produced a mention pointing at a
non-existent profile (e.g. did:web:pevohr.bsky.social instead of the
account's real did:plc), which broke the rendered mention link and
suppressed Bluesky's standard.site publication preview card on the
companion post.

Delegate resolution to Resolver::handle_to_did(), which already does
DNS TXT plus the HTTPS .well-known/atproto-did fallback that bsky.social
handles actually use. When a handle resolves to neither, leave the
@handle as plain text instead of minting an unroutable did:web.
Remove the deprecated imagedestroy() call from the new post transformer
test so composer lint passes, and move the dropped-mention debug log
above the early return in carry_body_mentions() so a total drop (no
handles fit) is still recorded.
Route the display-side @handle.tld linkifier through appview_url() instead
of a hardcoded bsky.app profile URL, so mention links honour the
atmosphere_appview_host / atmosphere_appview_url filters that self-hosted
appviews rely on. The default output is unchanged (bsky.app).
Copilot AI review requested due to automatic review settings June 25, 2026 16:19
@jeherve jeherve self-assigned this Jun 25, 2026
@jeherve jeherve requested a review from a team June 25, 2026 16:19
@github-actions github-actions Bot added [Feature] Transformer AT Protocol record transformers [Tests] Includes Tests PR includes test changes labels Jun 25, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds end-to-end Bluesky @handle.tld mention support across both display rendering (auto-linking handles to appview profiles) and publishing (ensuring mentions buried in long-form bodies are carried into Bluesky post text so they generate proper #mention facets and notifications). It also switches mention DID resolution to the full AT Protocol handle-resolution chain and memoizes resolutions per request to bound repeated lookups.

Changes:

  • Introduces a display-side mention linkifier (Atmosphere\Mention) hooked into the_content, with a suppression guard for transformer/plain-text rendering paths.
  • Extends Bluesky post composition to carry resolvable body mentions into long-form/link-card and teaser-thread outputs while staying within the 300-grapheme limit.
  • Updates facet mention resolution to use Resolver::handle_to_did() with request-scoped memoization, and adds PHPUnit coverage for mention resolution and carry-over behavior.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/phpunit/tests/transformer/class-test-post.php Adds long-form/teaser-thread tests verifying mention carry-over and facet shape; includes HTTP stubbing helper.
tests/phpunit/tests/transformer/class-test-facet.php Adds tests for resolved DID mentions, unresolvable mention behavior, and resolve_handles() output; includes HTTP stubbing helper.
tests/phpunit/tests/transformer/class-test-document.php Verifies document HTML content keeps mention links (linkifier active on document render path).
tests/phpunit/tests/class-test-mention.php New unit tests for the display-side mention linkifier, host filter behavior, and suppression guard scoping.
readme.txt Updates end-user documentation to describe @handle.tld mention linking + notification behavior.
includes/transformer/class-post.php Implements body-mention carry-over into composed Bluesky post text and teaser-thread CTA; guards Bluesky text generation against linkified HTML.
includes/transformer/class-facet.php Adds shared mention regex constant, request-scoped resolution cache, resolve_handles(), and switches mention DID resolution to Resolver chain.
includes/transformer/class-base.php Suppresses mention auto-linking in the shared plain-text render path used for Bluesky text composition.
includes/class-mention.php New mention linkifier that wraps bare handles in profile links via appview_url(), with protected-tag skipping and a suppression guard.
includes/class-atmosphere.php Registers the mention linkifier on init so it applies to front-end and document-parser rendering.
.github/changelog/add-content-mentions Adds a minor “added” changelog entry describing mention support.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread includes/class-mention.php
Comment thread includes/transformer/class-facet.php Outdated
Comment thread tests/phpunit/tests/transformer/class-test-post.php
Comment thread tests/phpunit/tests/transformer/class-test-facet.php
Comment thread tests/phpunit/tests/transformer/class-test-facet.php Outdated
…st cleanup

- Skip email/ActivityPub domain halves in Facet::MENTION_PATTERN so they
  no longer drive handle resolution or mint stray mention facets.
- Unwind the linkifier tag stack to the most recent same-name tag so
  nested protected tags keep inner text protected.
- Remove the pre_http_request stub in both Facet/Post test tear_downs and
  prefix the bare add_filter() call.
- Revert the imagedestroy restoration (it reintroduces a PHPCS deprecation).

@pfefferle pfefferle left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things that I checked:

  • The @-mention detection misinterprets domain like WebFinger handles as profile handles: @notiz.blog@notiz.blog
  • There is no lookup if accounts really exist, so even an example mention like @example.com will be linked.

Not (yet) checked: Does the implementation also takes care of the recent implemented "Custom Bluesky text"!?

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Comment thread tests/phpunit/tests/transformer/class-test-post.php
jeherve added 3 commits June 25, 2026 19:00
# Conflicts:
#	includes/transformer/class-post.php
Pin the two halves of the interaction:
- a mention buried only in the post body is not injected into a
  custom-text record (and is never even resolved), and
- a mention typed directly into the custom text still produces a
  #mention facet so the account is notified.
Address review feedback on the mention detection:
- Add a trailing boundary so '@notiz.blog@notiz.blog' is no longer
  mistaken for a standalone Bluesky handle (its leading half was being
  linked and faceted). A '.' is kept out of the boundary so a handle
  ending a sentence still matches.
- Make Facet::MENTION_PATTERN the single source of truth and reuse it in
  the display linkifier, so the publish path and the front-end can no
  longer drift apart.
- Add an 'atmosphere_link_mention' filter so a site can veto linking a
  handle (e.g. gate on a cached existence check), since the display
  linkifier intentionally does no per-render network lookup.
@jeherve

jeherve commented Jun 25, 2026

Copy link
Copy Markdown
Member Author

Thanks @pfefferle — addressed all three. Two new commits: 4aab0d2 (tests for the custom-text interaction) and a3a2dd1 (the WebFinger fix + shared pattern + filter).

1. @notiz.blog@notiz.blog misread as a profile handle — fixed

The detection only had a leading boundary, so it caught the domain half of @user@domain.tld but not a WebFinger handle whose user half is itself domain-shaped (@notiz.blog@notiz.blog) — the leading @notiz.blog was linked and faceted.

Added a trailing (?![\w@]) boundary so a handle immediately followed by @ (or another word char) is rejected. . is deliberately kept out of the trailing class so a handle ending a sentence (@bsky.app.) still matches. While here, I made Facet::MENTION_PATTERN the single source of truth and pointed the display linkifier at it too, so the publish path and the front-end can't drift apart again (the bug existed precisely because they were two separate regexes). Regression tests added on both sides.

2. No existence check — @example.com still gets linked

Worth splitting this into the two paths, because they already behave differently:

  • Publishing / notifications: existence is checked. Facet::resolve_mention() runs the real handle-resolution chain and only mints a #mention facet when the handle resolves to a DID. So @example.com never produces a Bluesky mention or notification — it ships as plain text. (Pinned by test_unresolvable_mention_produces_no_facet.)
  • Display linkifier: intentionally does no network lookup. A per-render resolution would turn every front-end page view into an outbound DNS/HTTP request to each mentioned domain — a real perf and availability/SSRF-amplification concern (a post mentioning @attacker.example would make every visitor hit that host). So by default we link by syntax and let the appview render an unknown handle gracefully, same as the ActivityPub plugin's display-side linkifier.

To still give a seam for stricter sites, I added an atmosphere_link_mention filter — return false for a handle to leave it as plain text (e.g. gate on a cached existence check or an allowlist) without imposing the lookup cost on everyone by default.

If you'd prefer the plugin itself to gate display links on existence, the non-network way to do it would be to persist the publish-time resolved handle→DID set to post meta and only link handles present there — happy to do that as a follow-up if you think it's worth the coupling.

3. Does it handle the Custom Bluesky text?

Yes — custom text takes precedence, and the interaction is now pinned by tests (commit 4aab0d2):

  • When custom text is set, the record text is exactly what the author wrote. The body-mention carry-over is intentionally not applied — injecting a handle the author chose to omit would break the "post exactly what I typed" contract. A body-buried mention isn't even resolved on that path (test_custom_text_does_not_carry_body_only_mention, with an HTTP tripwire).
  • A handle the author types into the custom text still works: facet extraction runs on the final record text regardless of which branch composed it, so it produces a #mention facet and notifies (test_custom_text_mention_still_produces_facet).

composer lint and the full PHPUnit suite pass.

@pfefferle

Copy link
Copy Markdown
Member

Review — @handle.tld mention support

Synthesized from three parallel lensed passes (security · correctness · WP-standards/i18n/BC) plus a manual read. Net: solid, well-defended feature; one blocking correctness bug and a consistency fix, then easy to merge.

Correctness — blocking — includes/transformer/class-post.php (~L751)

The } elseif ( '' !== $permalink && $text === $permalink ) branch sets $suffix = $permalink instead of $sep . $permalink. When a handle is carried in, the mention glues to the URL (@handle.tldhttps://…); MENTION_PATTERN's trailing (?![\w@]) then over-extends into a bogus handle, so no #mention facet is emitted — a silent notification loss. Reachable via an empty-title + empty-excerpt post (array_filter at L667 collapses $parts to [$permalink]). Fix: mirror the sibling branch ($suffix = $sep . $permalink) and add a test for that post shape.

Consistency — should fix — class-post.php (~L851)

carry_mentions_into_teaser() uses break on the first overflowing handle (dropping all shorter ones after it); carry_body_mentions() correctly uses continue. The teaser path also lacks the debug_log drop-accounting the link-card path has. Make it continue + log to match.

Robustness (not XSS) — includes/class-mention.php (L29)

PROTECTED_TAGS omits script, noscript, svg, iframe, title, so a @handle inside a <script>/<svg> body gets linkified into <a> and corrupts the element. Not an XSS (output is esc_url/esc_html'd; KSES blocks these tags from non-unfiltered_html users), but worth the one-line array extension.

Minor / hygiene

  • mb_strlen vs grapheme_length in both carry methods (class-post.php ~L769/775/799/803/846/851) — safe (never exceeds the 300 cap) but over-truncates prose with emoji and is inconsistent with the rest of the file.
  • Missing @since unreleased on Facet::MENTION_PATTERN, Facet::resolve_handles(), and class-mention.php's header/class + the atmosphere_link_mention filter (the release script needs it).
  • Facet::is_valid_handle() duplicates Resolver::is_valid_handle() and has drifted (no reserved-TLD check); Resolver re-validates so SSRF stays closed, but prefer calling the canonical copy.
  • $resolution_cache caches transient-failure misses, so one network blip can deflate #mention facets across a WP-CLI/cron bulk run — consider caching only successes.
  • Tag regex skips hyphenated custom elements; mock_handle_resolution() duplicated across 3 test files; silent 1 MiB the_content guard; no test for the truncate-link carry path.

Verified solid

Output escaping; SSRF gate (wp_safe_remote_get + reserved-TLD rejection + is_valid_handle); no ReDoS (bounded quantifiers + 1 MiB cap); single-source MENTION_PATTERN with correct email/WebFinger boundaries; without_links() try/finally guard; projection + redacted guards keep DNS off the display and editor-preview paths; correct PREG_OFFSET_CAPTURE byte offsets; hit+miss memoization; reflection-based cache reset in tests; the did:web→real-resolution regression test pinned to a real handle.

Risk: MEDIUM (display linkifier runs on the_content site-wide, but read-only and well-guarded). Blocking item is just the separator bug — fix that + the teaser break/continue and this is good to go.

- Fix the bare-permalink branch in carry_body_mentions() to keep the
  `\n\n` separator before the URL. Without it a carried handle glued
  onto the permalink, over-extending MENTION_PATTERN and silently
  dropping the #mention facet. Adds a regression test for the
  empty-title + empty-excerpt post shape.
- carry_mentions_into_teaser() now uses continue instead of break (a
  long handle no longer drops shorter ones after it) and logs dropped
  handles, matching the body-mention path.
- Extend Mention::PROTECTED_TAGS with script/noscript/svg/iframe/title
  and match hyphenated custom elements on the tag stack.
- Count graphemes (not code points) in both carry methods for
  consistency with the rest of the file.
- Consolidate handle validation onto Resolver::is_valid_handle(),
  removing the drifted Facet copy that lacked the reserved-TLD check.
- Cache only successful handle resolutions so a transient lookup
  failure can't deflate #mention facets across a bulk run.
- Add @SInCE unreleased to the new mention API surface.
@jeherve

jeherve commented Jun 29, 2026

Copy link
Copy Markdown
Member Author

Thanks for the thorough review. Addressed in 8382ebb:

Blocking — separator bug: The bare-permalink branch in carry_body_mentions() now keeps the \n\n separator ($suffix = $sep . $permalink). Added test_bare_permalink_carries_body_mention_with_separator and confirmed it reproduces the bug — reintroducing the old code fails with the exact @alice.bsky.socialhttp://… symptom, the fix makes it pass. (Hitting that branch needs an empty title and an excerpt that sanitises to empty, so the test uses an HTML-comment body plus a the_content filter that pushes the rendered body onto the link-card path.)

Should fix — teaser consistency: carry_mentions_into_teaser() now uses continue instead of break and emits the same debug_log drop-accounting as the body-mention path.

Robustness: PROTECTED_TAGS extended with script, noscript, svg, iframe, title (with a parametrized test), and the tag regex now matches hyphenated custom elements ([a-z][a-z0-9-]*) so they track correctly on the stack.

Minor / hygiene:

  • Both carry methods now count graphemes instead of mb_strlen code points.
  • @since unreleased added to Facet::MENTION_PATTERN, Facet::resolve_handles(), the Mention class, and the atmosphere_link_mention filter.
  • Facet now calls the canonical Resolver::is_valid_handle(); the drifted local copy (missing the reserved-TLD check) is removed.
  • The resolution cache stores successes only, so a transient lookup failure can't deflate #mention facets across a WP-CLI/cron bulk run.

Deliberately left (happy to follow up if you'd like):

  • mock_handle_resolution() duplicated across 3 test files — pure test hygiene; left out to keep this diff focused on passing files.
  • The silent 1 MiB the_content guard — it runs on every front-end render and mirrors ActivityPub's own silent guard, so logging there risks noise.
  • A dedicated test for the short-form truncate-link carry path.

composer lint and the full PHPUnit suite (806 tests) are green.

@pfefferle

Copy link
Copy Markdown
Member

Code review — typed multi-lens pass

Reviewed with three parallel lenses (correctness/fatals, security/escaping, perf-patterns-i18n-compat), each reading whole files. Blocking findings were confirmed against the code and origin/trunk. phpcs passes clean.

🔴 Blocking

1. Uncapped full-body handle resolution — includes/transformer/class-facet.php:395-419 + includes/transformer/class-post.php:721-727
body_mentions() runs Facet::resolve_handles() over the entire untruncated post body (render_post_content_plain()), with no cap on distinct handles. Every unresolvable handle now costs a DNS TXT lookup plus a wp_safe_remote_get( 'https://<handle>/.well-known/atproto-did', timeout 10 ) (class-resolver.php:45-49). An Author-role user can publish a body with thousands of @fake-N.example.com tokens → tens of thousands of seconds of blocking outbound I/O in one atmosphere_publish_post cron run, aimed at attacker-chosen public hosts. Backfill (500 posts/chunk) multiplies it. Every other Facet::extract() caller has an implicit ~300-grapheme bound; this one doesn't. Suggest a hard cap on distinct handles resolved per call (precedent: META_ORPHAN_RECORDS caps at 10).

2. Egress widening onto the lower-trust comment path — includes/transformer/class-facet.php:487-516
On trunk, resolve_mention() did DNS-only then fabricated did:web:<handle> (no HTTP). It now calls Resolver::handle_to_did(), which makes a real HTTPS request. Comment::transform() calls Facet::extract() on commenter-supplied content (class-comment.php:100), so an approved comment containing @target.example.com makes the server issue outbound HTTPS to an arbitrary public host. Bounded by the 300-grapheme comment cap and wp_safe_remote_get (blocks internal hosts) — so public-host egress amplification, not internal SSRF. The docblock at class-facet.php:475-481 names "skip mention resolution on the commenter path" as the fix but doesn't implement it. Gate the HTTP fallback off the comment path, or document as accepted risk.

3. Preview/publish grapheme-count divergence — includes/transformer/class-post.php:721-727, 225-247
body_mentions() returns array() in projection mode (good — no DNS in preview), but the carry-over mutates the composed text (inserts a whole @handle line), unlike facet extraction which only annotates. So project() measures text without the carried line and reports e.g. "120/300" to the pre-publish panel while the real record is longer — and near the cap the real prose can be silently truncated shorter than previewed. This violates the documented invariant ("the grapheme count the preview reports is identical either way", class-post.php:225-229).

4. Self-closed protected tag suppresses linkification — includes/class-mention.php:93-113
The tokenizer detects closing tags only by a leading / after <, not a trailing />. A self-closed protected tag (<iframe … />, <svg/>) is pushed onto $tag_stack and never popped, so every mention after it in the render is silently left as plain text — no error, no log. Verified against real wp_html_split() output.

🟡 Non-blocking

  • No cross-request cache (class-facet.php:67). $resolution_cache is per-request static only; publishing is a fresh process per post, so backfill/republish re-resolves the same handle every time. docs/php-coding-standards.md prescribes a transient layer here; Reaction_Sync::resolve_author() is the precedent (HOUR_IN_SECONDS).
  • Undocumented behavior change. Dropping the did:web fallback means unresolvable handles now produce no #mention facet (was a dead-link facet). Legitimate fix, but the changelog only mentions the new feature — worth a line so "notifications stopped" reports trace back here.
  • resolve_mention() swallows WP_Error silently (class-facet.php:502-505) — route through debug_log() to distinguish "doesn't exist" from "transiently unreachable".
  • Display links skip Resolver::is_valid_handle() — the linkifier links reserved TLDs (server.local) the publish path would reject; cosmetic dead-link drift from the "single source of truth" claim.
  • Minor: consecutive // comment paragraphs at class-base.php:200-203 and class-facet.php:507-510 (repo convention is /* */ blocks); missing @since on new Mention members and $resolution_cache; carry_body_mentions()/carry_mentions_into_teaser() duplicate a greedy-fit loop; build_text() $available <= 0 path bypasses carry-over without a drop log.

✅ Positives

Escaping is solid (esc_url/esc_html at every call site); the display linkifier does zero network I/O by design; projection correctly skips DNS; MENTION_PATTERN correctly rejects webfinger/email collisions; without_links() uses try/finally restoring prior state; the did:web→real-resolution swap is a real bug fix; success-only caching is well-reasoned. phpcs clean.


Risk: HIGH — changes the core publish path for every post (full-body scan + live handle resolution) and adds network egress to both the publish and comment-sync paths.

Findings #1#4 are concrete; #1 and #2 (egress) are the most important. Happy to draft fixes.

@kraftbj kraftbj left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two additional items from a review pass, both inline below. Neither is covered by the existing threads (verified against the prior passes).

Comment thread includes/transformer/class-post.php Outdated
Comment thread includes/transformer/class-post.php Outdated
jeherve added 3 commits July 2, 2026 12:00
Address PR review (pfefferle): the body @mention scan resolved an
unbounded number of distinct handles, and the comment-sync path
resolved commenter-supplied mentions. Each resolution is a DNS +
HTTPS lookup to the mentioned host, so both were egress-amplification
surfaces.

- Cap distinct handles resolved per body scan at MAX_RESOLVED_HANDLES
  (20); handles past the cap are left unresolved and logged once.
- Add a $with_mentions gate to Facet::extract(); the comment path
  passes false so an approved comment can no longer steer the server's
  outbound requests at an arbitrary host. Link/hashtag facets, which
  never touch the network, are unaffected.
- Add Facet::handles_in(), a resolution-free companion used for
  boundary-safe mention membership tests.
Address PR review (pfefferle): the linkifier's tokenizer detected
closing tags only by a leading slash, so a self-closed protected tag
(`<svg/>`, `<iframe … />`) was pushed onto the tag stack and never
popped — silently leaving every mention after it in the render as
plain text.

Share the tokenizer between the linkifier and a new strip_protected()
helper via a private walk() method, and skip pushing self-closed tags.
strip_protected() returns HTML with protected-tag text removed, so the
publish-side body scan can honour the same protected-tag rules the
display linkifier does.
Address PR review (kraftbj, pfefferle):

- Match carried mentions on token boundaries. The "already visible"
  test used mb_stripos(), so a body handle that is a string prefix of a
  visible one (@alice.com inside @alice.company) was treated as present
  and never carried — the account was silently not notified. Compare
  handle sets via Facet::handles_in() instead. (kraftbj)

- Honour protected tags when scanning the body for mentions. The scan
  ran on fully tag-stripped text, so a @handle inside a <code> sample
  or an existing <a> link was carried into the Bluesky post and minted
  a #mention facet even though the display linkifier leaves it as plain
  text. Scan the rendered HTML through Mention::strip_protected() so
  publish and display agree. (kraftbj)

- Size the carry-over in the pre-publish projection. The carry-over
  lengthens the composed record, but projection skipped it entirely and
  under-reported the grapheme count. Projection now reserves space from
  the syntactic body handles (no DNS, an upper bound) so the reported
  count matches the record the publisher writes. (pfefferle)

Split render_post_content_html() out of render_post_content_plain() so
the body scan can work from HTML without re-running the_content.
@jeherve

jeherve commented Jul 2, 2026

Copy link
Copy Markdown
Member Author

Thanks @pfefferle — all four blocking findings from the HIGH-risk pass are addressed across three commits, each with regression coverage. composer lint and the full PHPUnit suite (833 tests) pass.

  1. Uncapped full-body handle resolution → a5530fd
    Facet::resolve_handles() now stops after MAX_RESOLVED_HANDLES (20) distinct handles and logs once when it truncates, so an author can't fan a body of @fake-N.example.com tokens into tens of thousands of DNS/HTTPS lookups. 20 is generous headroom over any post that can fit in 300 graphemes while keeping egress bounded. Test: test_resolve_handles_caps_distinct_lookups (30 handles → 20 lookups).

  2. Egress widening onto the comment path → a5530fd
    Facet::extract() gained a $with_mentions flag; Comment::transform() passes false, so commenter-supplied @handle content no longer triggers any DNS/HTTPS resolution (link/hashtag facets still emit — they never touch the network). The resolve_mention() docblock that named this fix now points at the implemented gate. Tests: test_comment_mention_is_not_resolved (tripwire on pre_http_request) and test_extract_without_mentions_skips_resolution.

  3. Preview/publish grapheme divergence → 71e4983
    Projection now sizes the carry-over instead of skipping it: in projecting mode body_mentions() returns the syntactic body handles (via Facet::handles_in(), no DNS) as an upper bound, so the reported grapheme count is never shorter than the record the publisher writes. Test: test_projection_accounts_for_carried_body_mention asserts the projected count equals the published record length.

  4. Self-closed protected tag suppresses linkification → 2106156
    The tokenizer no longer pushes self-closed tags (<svg/>, <iframe … />) onto the tag stack, so they can't leave a phantom protected tag open for the rest of the render. Test: test_self_closed_protected_tag_does_not_swallow_later_mentions.

On the non-blocking items: the resolve_mention() docblock is updated to reflect the new comment-path gate and the resolve cap; success-only caching stays as-is (deliberate, and now documented alongside the cap). The cross-request transient cache and the "unresolvable handle now yields no facet" changelog note I'd prefer to keep as separate follow-ups rather than fold into these correctness commits — happy to open an issue for them if you'd like.

pfefferle added 2 commits July 2, 2026 12:06
Normalize indentation in `class-test-document.php` by replacing stray space-indented lines with tabs around the method closing brace and following docblock. This is a formatting-only change to keep test code consistent with project coding standards.

@pfefferle pfefferle left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran a deep review pass over the mention handling; 10 inline notes below, roughly ordered by severity.

The common thread in several of them (the protected-tag gap, per-chunk linkification, tag-strip gluing, the unquoted-attribute self-close check, and the 1 MB guard) is publish/display parity: the publish path and the front-end linkifier tokenize the content differently, so they disagree about what is a mention. A structural fix — both sides consuming one shared tokenization — would close that class at once, rather than patching case by case.

Three others are deliberate-looking behavior changes vs trunk (dot-mentions no longer matching, comment mention facets removed, the dropped did:web: fallback) that probably each deserve an explicit decision rather than a silent change.

Comment thread includes/transformer/class-post.php
Comment thread includes/class-mention.php Outdated
Comment thread includes/transformer/class-post.php Outdated
Comment thread includes/transformer/class-post.php Outdated
Comment thread includes/transformer/class-facet.php Outdated
Comment thread includes/class-mention.php Outdated
Comment thread includes/transformer/class-facet.php Outdated
Comment thread includes/transformer/class-comment.php Outdated
Comment thread includes/transformer/class-facet.php
Comment thread includes/class-mention.php Outdated
jeherve added 2 commits July 3, 2026 16:03
The publish path and the front-end linkifier tokenized rendered content
differently, so they disagreed about which `@handle.tld` counts as a
mention. That gap let a handle the site displays as plain text still mint
a Bluesky `#mention` facet (and notify the account), and vice versa.

Route both sides through a single `Mention` walk:

- `Mention::classify_handles()` replaces `strip_protected()` and returns
  the handles the linkifier would link (`linkable`) plus the ones that
  live only inside a protected `<code>`/`<pre>`/`<a>` region
  (`protected`). The post path resolves exactly the linkable set and
  passes `protected − linkable` to `Facet::extract()` as a deny-set, so a
  code-buried handle never mints a facet on the short-form, truncate-link,
  teaser, or link-card paths — not just the carry-over path.
- The walk now threads the last plain-text character across inline tags,
  so `<b>bob</b>@example.com` reads as the email it renders as instead of
  being linkified as a bogus profile link.
- Self-closing detection ignores quoted/unquoted attribute values, so
  `<a href=https://example.com/>` is no longer mistaken for a self-closed
  tag that leaves a nested `<a>` around an inner mention.
- Both sides bail identically on >1 MB content, so a code-buried mention
  in a huge post is neither linked nor notified.

Also harden resolution and restore a lost behaviour:

- `Facet::extract()`'s mention pass caps distinct handle resolution at
  `MAX_RESOLVED_HANDLES` and skips re-resolving a within-call miss, so a
  record stuffed with unresolvable tokens can't fan out into an unbounded
  run of blocking DNS/HTTPS lookups per teaser entry.
- The shared mention pattern no longer excludes a leading `.`, so the
  Twitter-style dot-mention idiom (`.@alice.bsky.social`) resolves and
  links again.
- `carry_body_mentions()` fits the carried line to a fixpoint: a handle
  visible in the prose but truncated away when the prose shrinks to make
  room is rescued into the carried line rather than silently dropped.
- An unresolvable handle still mints no fabricated `did:web:` facet by
  default, but a new `atmosphere_mention_didweb_fallback` filter lets a
  did:web-hosting site restore the pre-1.0 fallback.
Comment-path mention resolution stays off by default (an approved comment
should not be able to steer the server's outbound DNS/HTTPS at an
arbitrary host), but removing it outright left no way for a site that
trusts its moderated commenters to keep that capability. Gate it on a new
`atmosphere_resolve_comment_mentions` filter instead of a hard `false`.
@jeherve

jeherve commented Jul 3, 2026

Copy link
Copy Markdown
Member Author

Addressed all 10 notes across 29654c0 (parity + hardening) and 2b41fb8 (comment-mention filter); replies inline under each.

Took the structural route you suggested: the protected-tag gap, per-chunk linkification, tag-strip gluing, the unquoted-attribute self-close, and the 1 MB guard are all now driven by a single Mention tokenizer — Mention::classify_handles() produces the linkable/protected split that both the display linkifier and Facet::extract() consume, so publish and display can no longer disagree about what is a mention.

The three behaviour-vs-trunk notes each got an explicit decision: dot-mentions restored (regex boundary), comment mentions kept off but now opt-in via atmosphere_resolve_comment_mentions, and the did:web fallback kept off (a guess links to a dead profile for the *.bsky.social majority) but restorable via atmosphere_mention_didweb_fallback.

composer lint clean; full suite green (888 tests).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Feature] Transformer AT Protocol record transformers [Tests] Includes Tests PR includes test changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Auto-link @mentions in post content and trigger Bluesky notifications

4 participants