
feat: syndicate blog posts to ATProto via standard.site lexicons #404

Merged: coreyja merged 13 commits into main from implement/BLOG-b31d0c4d3b164931 on Jun 5, 2026
@byte-the-bot commented May 29, 2026

Review Brief

What changed and why

This PR ships full blog posts to the ATProto PDS as site.standard.publication + site.standard.document records and enhances the Bluesky POSSE syndication so each bsky post carries associatedRefs pointing at those records for richer link cards. Two new CLI commands (publish-standard-site init <key> and sync --key <key>) are added and wired into the GH Actions deploy workflow so syndication happens automatically after each Fly Deploy.

Architecture decisions and trade-offs

  • Deterministic rkey from slug, not a TID, so putRecord is idempotent: re-syncing a post overwrites the same rkey rather than creating duplicates. Newsletter slugs (weekly/20230713/index.md → weekly-20230713) are handled by joining parent path segments with -.
  • publications.toml is the source of truth for publication config and caches at_uri/at_cid after init. toml::to_string_pretty strips comments on round-trip, so the file must remain comment-free.
  • include_str! bakes publications.toml into the binary at compile time for the blog page <link> tags. A parse error in the TOML surfaces as a .expect() panic on first HTTP request rather than a build failure. Missing at_uri does not panic — the link tag simply isn't emitted.
  • <link> tags lag one deploy: blog content is include_dir!'d at compile time, so the tags only appear after the next Fly Deploy following the frontmatter sync commit.
  • putRecord always bumps updatedAt even if post content is unchanged (accepted tradeoff for simplicity in the BskyOnly recovery path).
  • Publication strongRef staleness: re-running init --force changes the publication's CID; all existing document records on the PDS embed the old CID. Recovery: re-run sync --key blog to re-putRecord all documents with the fresh CID.
  • Auto-bootstrap on first deploy: init is idempotent — it short-circuits when at_uri+at_cid are already cached in publications.toml. The workflow runs init blog on every deploy; the first run creates the publication record and commits the populated config, every subsequent run is a no-op. No manual pre-merge step required.
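
The deterministic-rkey decision above can be illustrated with a minimal sketch. The function name `rkey_from_post_path` and the assumption that paths are given relative to the blog root are hypothetical; the PR only states that parent path segments are joined with `-`, so the real implementation may differ.

```rust
use std::path::{Component, Path};

/// Hypothetical helper (not the PR's actual code): derive a deterministic
/// record key from a post path relative to the blog root, such as
/// `weekly/20230713/index.md`.
fn rkey_from_post_path(relative_path: &str) -> Option<String> {
    // The post is identified by its parent directories, not by `index.md`.
    let parent = Path::new(relative_path).parent()?;

    // Keep only normal path segments (skips `.`, `..`, and root markers).
    let segments: Vec<String> = parent
        .components()
        .filter_map(|component| match component {
            Component::Normal(name) => Some(name.to_string_lossy().into_owned()),
            _ => None,
        })
        .collect();

    if segments.is_empty() {
        // A bare `index.md` has no slug to derive a key from.
        return None;
    }

    // Join the parent segments with `-`, as described in the bullet above.
    Some(segments.join("-"))
}
```

Because the key depends only on the path, calling it twice for the same post yields the same rkey, which is what lets a repeated putRecord overwrite the record instead of duplicating it.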

Risk assessment

  • Blast radius: Blog page rendering (new <link> tags in <head>); Bluesky publish GHA workflow (new init+sync steps that can fail the run); BlogFrontMatter serialization (two new optional fields); ExternalEmbed struct (new associated_refs field — both literal constructions updated).
  • Confidence level: Medium — the site.standard.* lexicon carries an explicit instability notice; new required fields in a future schema version would break putRecord.
  • Rollback safety: Clean. New BlogFrontMatter fields are optional/defaulted, OpenGraph.head_links defaults to empty, and the GHA sync step failure is surfaced independently from the note publish step. Reverting bluesky.yml and removing publish-standard-site from the binary restores prior behavior completely.

Summary

Implements BLOG-b31d0c4d3b164931: publish full blog posts as site.standard.publication + site.standard.document records on the PDS, and enhance the existing Bluesky POSSE syndication with associatedRefs for richer link cards.

  • New CLI: publish-standard-site init <key> [--force] and publish-standard-site sync [--all|--key <k>]
  • init is idempotent: no-op when at_uri+at_cid already cached. Use --force to re-upload config changes (title/description/cover).
  • New repo-root config: publications.toml (cached at_uri + at_cid after first init)
  • Extends BlogFrontMatter with optional atproto_uri and publication (defaults to "blog")
  • BlueskyClient gains create_publication, put_publication, put_document, upload_blob, create_blog_post; ExternalEmbed gets associatedRefs
  • Shared YAML frontmatter helpers extracted to server/src/commands/frontmatter.rs; note publisher now uses them
  • Blog post pages emit <link rel="site.standard.document"> / <link rel="site.standard.publication"> head tags for verification
  • GH Actions bluesky.yml workflow runs init blog (auto-bootstrap, idempotent) followed by sync --key blog after the existing note publish step
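
The idempotent init behavior listed above can be sketched as a small decision function. The struct below is a hypothetical stand-in for one `publications.toml` entry (only the two cached fields are modeled), and `should_skip_init` is an illustrative name rather than a function from the PR.

```rust
/// Hypothetical mirror of the cached fields in one `publications.toml` entry.
#[derive(Debug, Clone, Default)]
struct PublicationConfig {
    at_uri: Option<String>,
    at_cid: Option<String>,
}

/// Returns true when `init` can short-circuit without touching the network:
/// both identifiers are already cached and `--force` was not passed.
fn should_skip_init(config: &PublicationConfig, force: bool) -> bool {
    !force && config.at_uri.is_some() && config.at_cid.is_some()
}
```

Under this sketch, the first deploy sees an empty cache and proceeds to create the record, while later deploys skip unless `--force` is given.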

Test plan

  • cargo fmt --check clean
  • cargo clippy --all-targets --all-features --workspace --tests -- -D warnings clean
  • New unit tests covering record serialization, frontmatter helpers, standard_site classification/rkey logic, and head-link rendering
  • Property test all_publishable_blog_posts_fit_within_bsky_post_limit covers all blog/**/index.md recursively
  • Property test all_blog_posts_have_unique_rkeys ensures no two posts collide on the PDS
  • After merge: first deploy creates the publication record + commits populated publications.toml; subsequent deploys log "already initialized" and skip
  • After merge: visually verify the enhanced link card on bsky for a fresh blog post

Closes BLOG-b31d0c4d3b164931.

Publish full blog posts as `site.standard.publication` +
`site.standard.document` records on the PDS, and enhance the
existing Bluesky POSSE syndication with `associatedRefs` for richer
link cards (BLOG-b31d0c4d3b164931).

- Add `publish-standard-site init <key>` / `sync [--all|--key]`
  CLI commands backed by a new `publications.toml` config.
- Extend `BlogFrontMatter` with optional `atproto_uri` and a
  `publication` field (defaults to "blog"). These act as the
  idempotency keys for sync.
- Extend `BlueskyClient` with `create_publication`,
  `put_publication`, `put_document`, `upload_blob`, and
  `create_blog_post`. `ExternalEmbed` now carries
  `associatedRefs` (skipped when empty).
- Factor shared YAML frontmatter helpers into a new
  `commands::frontmatter` module; rewire the note publisher to use
  it.
- Emit `<link rel="site.standard.document">` /
  `<link rel="site.standard.publication">` tags on rendered blog
  posts for verification, sourced from `publications.toml` baked
  in at compile time.
- Extend the bluesky GH Actions workflow to run `sync --key blog`
  after the note publish step and commit any frontmatter updates.

The standard.site publication record still needs to be
bootstrapped against prod creds (`init blog`) before the workflow
will succeed; until then the sync step errors out with an
actionable message.

Make `publish-standard-site init` an idempotent no-op when the
publication already has `at_uri` + `at_cid` cached in
`publications.toml`, and wire it into the bluesky workflow as a
pre-sync step. The first deploy after merge creates the publication
record on the PDS and commits the populated config; every subsequent
run short-circuits without touching the network.

Removes the manual `init blog` bootstrap step from the merge
checklist — the workflow now handles it.

A new `--force` flag preserves the old behavior (re-`putRecord`
the publication) for when config fields like `title`/`description`
need to be pushed to the PDS.
byte-the-bot and others added 2 commits June 4, 2026 07:46
- standard_site init now fetches the publication's branded OG card as a
  PNG through imgproxy (`/og/publication/{key}.svg` → imgproxy 1200×630
  PNG → upload as blob) instead of reading a static `opengraph.png`
  from disk. New `/og/publication/{key}` SVG route on the server owns
  the rendering; the CLI just wraps it in the same imgproxy URL format
  used by per-post cards. `publications.toml` loses the `cover_image`
  field; `PublicationConfig` no longer exposes it.

- Sequencing: the bluesky workflow `workflow_run`-chains off Fly
  Deploy, so on merge the new SVG route is live before init runs.

- Bump review-app memory 512MB → 2048MB and add the env vars currently
  missing from review-app secrets (boot-time required by
  `AppState::from_env`): `ANTHROPIC_API_KEY`, `LINEAR_CLIENT_ID`,
  `LINEAR_CLIENT_SECRET`, `LINEAR_WEBHOOK_SECRET`. Without these the
  app crashes during `*Config::from_env()` before serving any request.

- Wire `IMGPROXY_URL` secret into the bluesky workflow init step so
  init can rasterize the SVG card.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Boot crashes were the env-var gap (LINEAR_*, ANTHROPIC_API_KEY), not
OOM. Keep memory at 512MB; revisit only if we see actual OOM signal.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The `2026-05-29` cutoff prevented historical posts from syndicating to
standard.site. Drop it so the first post-merge run picks up the full
back-catalog; subsequent runs stay cheap via the existing `atproto_uri`
idempotency check in `classify_blog_post`.

Per-post `published_at` is read from frontmatter `date`, so historical
posts keep their original publication date on the PDS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`post_get` was calling `.unwrap()` on `fetch_thread`, which panics the
entire page when Bluesky is down or rate-limits us. Mirror the notes
handler: log the warning, render the page without the comments block.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
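
A minimal sketch of the fallback described in the commit above, assuming a simplified `Thread` type and a string error. The real `post_get` handler, its types, and its logging facility are not shown in this PR page, so everything here is illustrative.

```rust
/// Hypothetical stand-in for the fetched Bluesky thread.
#[derive(Debug)]
struct Thread {
    replies: Vec<String>,
}

/// Render a post page, degrading gracefully when the thread fetch failed
/// instead of panicking via `.unwrap()`.
fn render_post(body: &str, thread: Result<Thread, String>) -> String {
    // On failure, log a warning and continue without the comments block.
    let comments = match thread {
        Ok(thread) => Some(thread),
        Err(err) => {
            eprintln!("warning: failed to fetch Bluesky thread: {err}");
            None
        }
    };

    let mut html = format!("<article>{body}</article>");
    if let Some(thread) = comments {
        html.push_str("<section class=\"comments\">");
        for reply in &thread.replies {
            html.push_str(&format!("<p>{reply}</p>"));
        }
        html.push_str("</section>");
    }
    html
}
```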
Reintroduce the date cutoff (`2026-05-29`), but use it solely to gate
Bluesky post creation — every post still gets a `site.standard.document`
record on first deploy. Historical posts (before cutoff) syndicate to
standard.site only, recent posts continue to get both. Avoids spamming
the Bluesky feed with backdated posts on backfill.

Idempotency:
- Historical Both (no atproto_uri, no bsky_url): write atproto_uri only
- Historical BskyOnly (atproto_uri set, bsky_url unset): terminal state,
  skip without calling put_document
- Recent posts: unchanged — Both writes both, BskyOnly writes bsky_url

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
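
The idempotency states in the commit above can be summarized as one decision table. This is a sketch, not the PR's `classify_blog_post`: the names `SyncAction`, `PostState`, and `plan_sync` are hypothetical, dates are assumed to be ISO `YYYY-MM-DD` strings, and the handling of a post that has a `bsky_url` but no `atproto_uri` is an assumption the commit message does not cover.

```rust
/// What a sync run should do for one post (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
enum SyncAction {
    /// Write the document record and create the Bluesky post.
    Both,
    /// Write the document record only and store `atproto_uri`.
    DocumentOnly,
    /// Document already exists; create only the Bluesky post.
    BskyOnly,
    /// Nothing left to do.
    Skip,
}

struct PostState<'a> {
    /// Frontmatter `date`, assumed to be ISO `YYYY-MM-DD`.
    date: &'a str,
    atproto_uri: Option<&'a str>,
    bsky_url: Option<&'a str>,
}

/// Cutoff from the commit message; posts before it are "historical".
const BSKY_CUTOFF: &str = "2026-05-29";

fn plan_sync(post: &PostState) -> SyncAction {
    // ISO dates of equal length compare correctly as plain strings.
    let recent = post.date >= BSKY_CUTOFF;
    match (post.atproto_uri.is_some(), post.bsky_url.is_some(), recent) {
        (false, false, true) => SyncAction::Both,
        (false, false, false) => SyncAction::DocumentOnly,
        (true, false, true) => SyncAction::BskyOnly,
        // Historical post with a document record: terminal state.
        (true, false, false) => SyncAction::Skip,
        (true, true, _) => SyncAction::Skip,
        // Assumption: already on Bluesky but missing a document record.
        (false, true, _) => SyncAction::DocumentOnly,
    }
}
```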
Drives both the publication-level OG card text (currently looks sparse
with just "coreyja") and the standard.site publication record's title
field. Matching the displayed name to the actual domain reads better
on the card and avoids name/URL drift in standard.site clients.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The publication card was looking sparse — big logo, small tag pill, one
short title line, then a lot of empty space. Re-purpose the
per-post-card date slot to render the publication description from
`publications.toml` instead. For "blog", that surfaces
"Personal blog: Rust, side projects, Battlesnake, AI agents." under
the title.

Description is truncated to 75 chars (word-boundary + ellipsis) so
future longer descriptions don't overflow the card width at Quicksand
28px.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
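
The word-boundary truncation described in the commit above might look roughly like the sketch below. The function name is hypothetical, and counting the ellipsis toward the 75-character limit is an assumption; the commit message does not say which side of the limit it falls on.

```rust
/// Truncate `text` to at most `max_chars` characters, cutting at a word
/// boundary and appending an ellipsis. Assumption: the ellipsis counts
/// toward the limit.
fn truncate_at_word_boundary(text: &str, max_chars: usize) -> String {
    let chars: Vec<char> = text.chars().collect();
    if chars.len() <= max_chars {
        return text.to_string();
    }

    // Reserve one character for the ellipsis.
    let budget = max_chars.saturating_sub(1);
    let mut end = budget;

    // If the cut would land mid-word, walk back to the previous whitespace.
    // With no whitespace at all, fall back to a hard cut at the budget.
    if !chars[budget].is_whitespace() {
        if let Some(pos) = chars[..budget].iter().rposition(|c| c.is_whitespace()) {
            end = pos;
        }
    }

    let head: String = chars[..end].iter().collect();
    format!("{}…", head.trim_end())
}
```

The current "blog" description is shorter than 75 characters, so under this sketch it passes through unchanged.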
Shift the tag pill (and everything below it) down 25-30px so the logo
has more breathing room above the tag. Affects per-post cards too —
the title and date moved with it to preserve their relative spacing
to the tag pill. Date baseline now sits at y=595 (35px from the
canvas bottom of 630), still comfortably inside the safe area.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a `subtitle` frontmatter field to blog posts. When set, it renders
in the bottom-left slot of the OG card (where the date used to be) and
the date moves to a new bottom-right slot. When unset, the layout
matches the previous behavior: date in the bottom-left, no bottom-right
text. Publication cards continue to use the bottom-left slot for the
publication description; their bottom-right slot is always stripped.

Subtitle uses the same 75-char word-boundary truncation as the
publication description so longer taglines don't overflow.

Wires subtitle in for blog post (`og_post_svg`) and weekly newsletter
(`og_weekly_svg`) cards; notes and podcast cards pass None for now
(their frontmatter types don't carry a subtitle yet).

Adds a subtitle to the "Look Ma' no AI" post as the first example.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
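
The slot rule for per-post cards in the commit above reduces to a small function. This is an illustrative sketch only: `bottom_slots` is a hypothetical name, and the real card renderers (`og_post_svg`, `og_weekly_svg`) are not shown here.

```rust
/// Returns `(bottom_left, bottom_right)` text for a per-post OG card.
/// With a subtitle, the subtitle takes the bottom-left slot and the date
/// moves to the bottom-right; without one, the date stays bottom-left.
fn bottom_slots(subtitle: Option<&str>, date: &str) -> (String, Option<String>) {
    match subtitle {
        Some(subtitle) => (subtitle.to_string(), Some(date.to_string())),
        None => (date.to_string(), None),
    }
}
```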
…ests

Caught by workspace-wide clippy on CI (`cargo clippy --all-targets`);
I was only running clippy on the server crate locally and missed the
two literals in posts/src/blog.rs's own tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tag pill bottom was at y=404, title baseline at y=455 — only ~5px
between pill bottom and the visual top of the 64px title letters.
Shift title down 10px (title line 1 → y=465, line 2 → y=535). Bottom
slot at y=595 stays put; title-to-bottom gap tightens by 10px which
still leaves comfortable room.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coreyja merged commit a6cc5c3 into main on Jun 5, 2026
6 checks passed
@coreyja deleted the implement/BLOG-b31d0c4d3b164931 branch on June 5, 2026 12:37
coreyja pushed a commit that referenced this pull request Jun 6, 2026
## Summary

The `Publish Newsletter` workflow has been failing on every push to
`main` since 2026-06-05 13:42 UTC. The failure is **not** in the publish
step — that succeeds and creates the Buttondown draft. The failure is
the trailing `Commit buttondown_id updates` step: the default
`GITHUB_TOKEN` isn't a bypass actor on the main-branch ruleset, so `git
push` is rejected.

Net effect: every push to main yesterday (every `Sync syndication state`
commit) re-published ep3 as a fresh draft, failed to commit the id back,
and the next push did it again. Ep 3 (`blog/weekly/20260407/index.md`)
accumulated ~50 duplicate Buttondown drafts.

## Fix

Mirror the GitHub-App-token pattern already used by
`.github/workflows/bluesky.yml`:

- Mint a token via `actions/create-github-app-token` (the App is the
configured ruleset bypass actor).
- Hand it to `actions/checkout` so subsequent git ops use it.
- Check out `ref: main` rather than the trigger SHA so a queued run sees
the freshest frontmatter.
- `git pull --rebase origin main` before push to absorb any commits
landed during the run.
- Add `concurrency: newsletter-publish` (`cancel-in-progress: false`) so
two pushes can't race the same unpublished file through publish and
double-create the draft.

## Tests

Two regression tests on `commands::buttondown::parse_frontmatter`
covering the new `subtitle` / `atproto_uri` / `atproto_pub_cid` /
`publication` fields from PRs #404-#408 — confirming serde is not the
regression. All 25 buttondown tests pass; clippy clean (`-D warnings`).

## Followup not in this PR

~50 duplicate ep3 Buttondown drafts to clean up manually. Once this PR
merges, the next push triggers the workflow, publishes ep3 one more time
(51st), and the commit-back finally lands the `buttondown_id` so
subsequent runs short-circuit.

## Test plan

- [ ] Merge PR
- [ ] Next push to main turns workflow green
- [ ] `blog/weekly/20260407/index.md` ends up with `buttondown_id:` via
auto-commit
- [ ] Manually clean up duplicate Buttondown drafts (keep one canonical,
~50 dupes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>