Rescope pdf-research → pdf-viewer: expose annotation, form-filling, signing #1

Merged
bryan-anthropic merged 4 commits into bryan-anthropic:add-pdf-research-plugin from anthropics:pdf-plugin-interact-scope
Mar 26, 2026

Conversation

ochafik commented Mar 25, 2026

Summary

Proposes amendments to anthropics#72 based on the interact tool capabilities merged in modelcontextprotocol/ext-apps#506. The underlying @modelcontextprotocol/server-pdf now supports 10 annotation types (including image for signatures), form filling, auto-highlight, page extraction, and annotated-PDF download — none of which the original pdf-research scoping exposed.

Changes

Identity

  • Plugin: pdf-research → pdf-viewer (the name self-documents the interactive/visual modality)
  • Skill: pdf-reading → view-pdf (verb-noun to match repo convention: review-contract, triage-nda)

Commands

| Old | New | Notes |
| --- | --- | --- |
| /read | /pdf-viewer:open | Entry point, offers next steps by doc type |
| /summarize | (dropped) | Use native Read; the viewer is for interactivity, not ingestion |
| (none) | /pdf-viewer:annotate | AI-driven: propose markup → review in viewer → iterate |
| (none) | /pdf-viewer:fill-form | Visual form filling; handles cryptic field names via screenshot |
| (none) | /pdf-viewer:sign | Image-based signatures/initials (not certified, disclaimer included) |

Positioning

Every file now steers Claude away from the viewer for pure ingestion ('summarize this PDF' → use native Read) and toward it for interactive workflows. The viewer's value is showing the user the document and collaborating on markup.

SKILL.md (147 lines)

  • Trigger: hybrid situation+verb style — fires on 'open/show/view PDF' and annotate/fill/sign/stamp verbs
  • Documents all 10 annotation types, coordinate system, interact batching
  • Three workflow recipes: collaborative annotation, visual form filling, signing
  • Out-of-scope: summarization, certified signing, PDF generation
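
To make the batching idea concrete, here is a minimal sketch that validates a list of annotations against the ten type names given in the commit message and bundles them into a single request. The payload shape (`action`, `add_annotations`, `page`) is hypothetical and is not taken from the server's actual `interact` schema.

```python
# The ten annotation type names as listed in the commit message.
ANNOTATION_TYPES = frozenset({
    "highlight", "underline", "strikethrough", "note", "rectangle",
    "circle", "line", "freetext", "stamp", "image",
})


def build_batch(annotations: list[dict]) -> dict:
    """Validate annotation types, then bundle them into one batch payload.

    The returned dict layout is an assumption made for illustration only.
    """
    unknown = sorted({a["type"] for a in annotations} - ANNOTATION_TYPES)
    if unknown:
        raise ValueError(f"unsupported annotation type(s): {', '.join(unknown)}")
    return {"action": "add_annotations", "annotations": annotations}


# Hypothetical usage: two annotations sent in one call instead of two.
batch = build_batch([
    {"type": "highlight", "page": 1},
    {"type": "stamp", "page": 3},
])
```

Validating before sending keeps a typo in one annotation from failing an entire batch on the server side.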

Form-filling differentiator

Unlike programmatic form tools, this gives live visual feedback and handles PDFs with cryptic/unnamed fields (Text1, f1_2) where the label is printed on the rendered page rather than in field metadata — screenshot → match bounding boxes → fill by name.
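
The matching step can be sketched roughly as follows, assuming field bounding boxes are available from form metadata and label positions have been read off a screenshot. The `Box` type, the coordinate layout, and the nearest-center heuristic are all illustrative assumptions, not the plugin's actual logic.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in page coordinates (hypothetical: x, y is the top-left corner)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def center(self) -> tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


def match_fields_to_labels(fields: dict[str, Box], labels: dict[str, Box]) -> dict[str, str]:
    """Map each printed label to the field name whose box center is nearest."""
    matches = {}
    for label_text, label_box in labels.items():
        lx, ly = label_box.center
        matches[label_text] = min(
            fields,
            key=lambda name: hypot(fields[name].center[0] - lx, fields[name].center[1] - ly),
        )
    return matches


# Hypothetical data: cryptic field names with labels that only exist on the rendered page.
fields = {"Text1": Box(150, 100, 200, 20), "f1_2": Box(150, 140, 200, 20)}
labels = {"Full name": Box(40, 100, 100, 20), "Date of birth": Box(40, 140, 100, 20)}
values_by_label = {"Full name": "Ada Lovelace", "Date of birth": "1815-12-10"}

field_for_label = match_fields_to_labels(fields, labels)
fill_by_name = {field_for_label[label]: value for label, value in values_by_label.items()}
```

Nearest-center matching is greedy, so two labels can resolve to the same field on dense forms; the live visual feedback in the viewer is what lets the user catch and correct such mismatches.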

Related

  • ext-apps#565 removes the stale duplicate plugin dir from the server example

ochafik added 4 commits March 24, 2026 15:12
The underlying @modelcontextprotocol/server-pdf gained an `interact`
tool with 10 annotation types (highlight, underline, strikethrough,
note, rectangle, circle, line, freetext, stamp, image), form filling,
auto-highlight-text, page extraction, and annotated-PDF download.

This reshapes the plugin around interactive workflows:

- Rename plugin: pdf-research → pdf
- Rename skill: pdf-reading → pdf (unified, covers all capabilities)
- Commands: drop /summarize, rename /read → /open, add /annotate,
  /fill-form, /sign
- SKILL.md: document interact tool, all annotation types, coordinate
  system, interactive workflows (AI-driven annotate, form filling,
  visual signing)
- Positioning: explicitly steer AWAY from viewer for pure ingestion
  (use native Read instead) — the viewer's value is user interactivity
- /sign disclaimer: visual signature image, not certified digital sig

Out of scope (documented): summarization, cert signing, PDF generation.

Accuracy:
- list_pdfs: 'remote origins' → 'local directories' (server has no
  remote allowlist, any HTTPS works)
- get_screenshot: 'PNG' → 'image' (server returns JPEG)
- Supported sources: clarify only arXiv auto-converts /abs/→PDF;
  others need direct PDF URLs
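
As an illustration of the source rule above, a client-side helper could mirror the arXiv rewrite like this. It is a sketch of the described behavior, not the server's implementation, and `normalize_pdf_url` is a hypothetical name.

```python
from urllib.parse import urlparse, urlunparse


def normalize_pdf_url(url: str) -> str:
    """Rewrite an arXiv abstract URL to its PDF URL; return any other URL unchanged."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host in {"arxiv.org", "www.arxiv.org"} and parts.path.startswith("/abs/"):
        pdf_path = "/pdf/" + parts.path[len("/abs/"):]
        return urlunparse(parts._replace(path=pdf_path))
    # Non-arXiv sources are not converted, so they must already point at a PDF.
    return url
```

For example, `normalize_pdf_url("https://arxiv.org/abs/1706.03762")` yields the `/pdf/` form, while a URL on any other host passes through untouched.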

Conventions (matching legal/, productivity/ patterns):
- Add argument-hint frontmatter to all 4 commands
- Add CONNECTORS.md callout to all commands

Positioning — form filling:
- Emphasize this is VISUAL form filling (vs programmatic tools)
- Document the unnamed-field workflow: screenshot → match bounding
  boxes to visual labels → fill by name. Many real PDFs have fields
  named 'Text1', 'Field_7' with labels only on the rendered page.
- User gets live feedback and can edit directly in viewer

Plugin name: pdf → pdf-viewer
- 'Viewer' makes the interactive/visual modality self-evident
- Naturally explains why NOT to use for summarization
- Matches enterprise-search precedent (capability-named, not job-function)

Skill name: pdf → view-pdf
- Verb-noun matches repo convention (review-contract, triage-nda)

Trigger description (SKILL.md frontmatter):
- Hybrid style: situation framing + verb list for keyword matching
- Includes 'open, show, view' so bare 'open this PDF' triggers the
  skill — SKILL body then disambiguates viewer vs Read
- Keeps 'Not for summarization (use Read)' guard

Marketplace/plugin.json description:
- Marketing-oriented (benefits-first) to match other plugins' style
- 'View, annotate, sign... Mark up contracts, fill forms with visual
  feedback, stamp approvals, place signatures — download annotated copy'

Updated all /pdf:* cross-references → /pdf-viewer:*

Without -y, npx prompts 'Ok to proceed? (y)' on first install of a
package, which hangs the stdio MCP connection (no TTY to answer).

Keeping version unpinned (auto-latest) — currently pulls 1.3.1 which
has the interact/annotation capabilities. Tested end-to-end via
'claude --plugin-dir pdf-viewer/'.
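
A server entry along these lines would carry the flag. This sketch builds the JSON in Python so it can be inspected; the `pdf` server key and the `--stdio` argument are assumptions rather than details confirmed by this PR.

```python
import json

# Hypothetical MCP server entry; the "pdf" key and "--stdio" flag are assumptions.
mcp_config = {
    "mcpServers": {
        "pdf": {
            "command": "npx",
            # -y auto-confirms the first-install prompt so the stdio connection does not hang.
            # No version suffix on the package name, so npx resolves the latest release.
            "args": ["-y", "@modelcontextprotocol/server-pdf", "--stdio"],
        }
    }
}

print(json.dumps(mcp_config, indent=2))
```

Leaving the package unpinned trades reproducibility for automatic upgrades; appending a version to the package name would pin it instead.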
bryan-anthropic merged commit f0c53a1 into bryan-anthropic:add-pdf-research-plugin on Mar 26, 2026