Rescope pdf-research → pdf-viewer: expose annotation, form-filling, signing #1
Merged
bryan-anthropic merged 4 commits into bryan-anthropic:add-pdf-research-plugin on Mar 26, 2026
The underlying @modelcontextprotocol/server-pdf gained an `interact` tool with 10 annotation types (highlight, underline, strikethrough, note, rectangle, circle, line, freetext, stamp, image), form filling, auto-highlight-text, page extraction, and annotated-PDF download. This reshapes the plugin around interactive workflows:

- Rename plugin: pdf-research → pdf
- Rename skill: pdf-reading → pdf (unified, covers all capabilities)
- Commands: drop /summarize, rename /read → /open, add /annotate, /fill-form, /sign
- SKILL.md: document interact tool, all annotation types, coordinate system, interactive workflows (AI-driven annotate, form filling, visual signing)
- Positioning: explicitly steer AWAY from viewer for pure ingestion (use native Read instead); the viewer's value is user interactivity
- /sign disclaimer: visual signature image, not certified digital sig

Out of scope (documented): summarization, cert signing, PDF generation.
Accuracy:

- list_pdfs: 'remote origins' → 'local directories' (server has no remote allowlist, any HTTPS works)
- get_screenshot: 'PNG' → 'image' (server returns JPEG)
- Supported sources: clarify only arXiv auto-converts /abs/→PDF; others need direct PDF URLs

Conventions (matching legal/, productivity/ patterns):

- Add argument-hint frontmatter to all 4 commands
- Add CONNECTORS.md callout to all commands

Positioning (form filling):

- Emphasize this is VISUAL form filling (vs programmatic tools)
- Document the unnamed-field workflow: screenshot → match bounding boxes to visual labels → fill by name. Many real PDFs have fields named 'Text1', 'Field_7' with labels only on the rendered page.
- User gets live feedback and can edit directly in viewer
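The unnamed-field workflow above (screenshot → match bounding boxes to visual labels → fill by name) can be sketched roughly as follows. This is a minimal illustration, not the server's implementation: the `Box` type, the `match_labels_to_fields` helper, and all coordinates and values are hypothetical, and it assumes the field boxes and the label boxes read off the screenshot share one coordinate space.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box (hypothetical shape, for illustration only)."""
    x: float
    y: float
    width: float
    height: float

    def center(self) -> tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


def match_labels_to_fields(labels: dict[str, Box], fields: dict[str, Box]) -> dict[str, str]:
    """Map each visible label to the field whose box center is nearest to it."""
    mapping = {}
    for label_text, label_box in labels.items():
        lx, ly = label_box.center()

        def distance(field_name: str) -> float:
            fx, fy = fields[field_name].center()
            return hypot(fx - lx, fy - ly)

        mapping[label_text] = min(fields, key=distance)
    return mapping


# Hypothetical data: cryptic field names with their boxes, plus the labels
# that are only visible on the rendered page.
fields = {
    "Text1": Box(150, 100, 200, 20),
    "Field_7": Box(150, 140, 200, 20),
}
labels = {
    "Full name": Box(40, 100, 100, 20),
    "Date of birth": Box(40, 140, 100, 20),
}

label_to_field = match_labels_to_fields(labels, fields)

# Values the user supplied, keyed by the label they can actually see.
values_by_label = {"Full name": "Ada Lovelace", "Date of birth": "1815-12-10"}

# Payload keyed by the real field names, ready for a fill-by-name step.
fill_payload = {label_to_field[label]: value for label, value in values_by_label.items()}
print(fill_payload)
```

Nearest-center matching is the simplest heuristic that works for a single-column form; a denser layout would likely need a directional rule (for example, preferring the field to the right of or below the label).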
Plugin name: pdf → pdf-viewer

- 'Viewer' makes the interactive/visual modality self-evident
- Naturally explains why NOT to use for summarization
- Matches enterprise-search precedent (capability-named, not job-function)

Skill name: pdf → view-pdf

- Verb-noun matches repo convention (review-contract, triage-nda)

Trigger description (SKILL.md frontmatter):

- Hybrid style: situation framing + verb list for keyword matching
- Includes 'open, show, view' so bare 'open this PDF' triggers the skill; SKILL body then disambiguates viewer vs Read
- Keeps 'Not for summarization (use Read)' guard

Marketplace/plugin.json description:

- Marketing-oriented (benefits-first) to match other plugins' style
- 'View, annotate, sign... Mark up contracts, fill forms with visual feedback, stamp approvals, place signatures — download annotated copy'

Updated all /pdf:* cross-references → /pdf-viewer:*
Without -y, npx prompts 'Ok to proceed? (y)' on first install of a package, which hangs the stdio MCP connection (no TTY to answer). Keeping version unpinned (auto-latest) — currently pulls 1.3.1 which has the interact/annotation capabilities. Tested end-to-end via 'claude --plugin-dir pdf-viewer/'.
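As a rough illustration of the fix described above, the sketch below builds an MCP server entry that launches the package through `npx -y` with no version pin. The `mcpServers` layout, the `pdf` server key, and the `--stdio` argument are assumptions made for illustration; only the `-y` flag and the unpinned package name come from the text above.

```python
import json


def build_mcp_config(package: str) -> dict:
    """Build a stdio MCP server entry that never waits on an install prompt."""
    return {
        "mcpServers": {
            "pdf": {  # hypothetical server key
                "command": "npx",
                # -y auto-confirms the first-install prompt, so the stdio
                # connection does not hang waiting for a TTY answer.
                # "--stdio" is an assumed server argument.
                "args": ["-y", package, "--stdio"],
            }
        }
    }


# No "@<version>" suffix, so npx resolves the latest published release.
config = build_mcp_config("@modelcontextprotocol/server-pdf")
print(json.dumps(config, indent=2))
```

Leaving the version unpinned trades reproducibility for automatically picking up new server capabilities; pinning would be a one-line change to the package string.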
Summary
Proposes amendments to anthropics#72 based on the `interact` tool capabilities merged in modelcontextprotocol/ext-apps#506. The underlying `@modelcontextprotocol/server-pdf` now supports 10 annotation types (including image for signatures), form filling, auto-highlight, page extraction, and annotated-PDF download, none of which the original `pdf-research` scoping exposed.

Changes
Identity
- Plugin: `pdf-research` → `pdf-viewer`; the name self-documents the interactive/visual modality
- Skill: `pdf-reading` → `view-pdf`; verb-noun to match repo convention (`review-contract`, `triage-nda`)

Commands
- `/read` → `/pdf-viewer:open` (renamed)
- `/summarize` (dropped)
- `/pdf-viewer:annotate` (new)
- `/pdf-viewer:fill-form` (new)
- `/pdf-viewer:sign` (new)

Positioning
Every file now steers Claude away from the viewer for pure ingestion ('summarize this PDF' → use native Read) and toward it for interactive workflows. The viewer's value is showing the user the document and collaborating on markup.
SKILL.md (147 lines)
Form-filling differentiator
Unlike programmatic form tools, this gives live visual feedback and handles PDFs with cryptic/unnamed fields (`Text1`, `f1_2`) where the label is printed on the rendered page rather than in field metadata: screenshot → match bounding boxes → fill by name.

Related