feat(slides,drive): add createFromJson, drive primitives, and theme system#348
n0012 wants to merge 3 commits into
Code Review
This pull request introduces new Google Slides tools for creating presentations, performing batch updates, and generating slides from JSON blueprints. The review feedback highlights the need for consistency by using the registerTool wrapper to respect feature flags and suggests enhancing input schemas to support structured objects. Additionally, the feedback addresses a potential crash in the createFromJson service method and recommends adjusting the slide insertion logic to append new slides to the end of a presentation.
      slidesService.getSlideThumbnail,
    );

    server.registerTool(
The slides.create tool is being registered using server.registerTool directly, which bypasses the registerTool wrapper defined on line 154. This wrapper is responsible for checking if the tool is enabled via feature flags (WORKSPACE_FEATURE_OVERRIDES). Using the wrapper ensures consistency and allows users to disable these tools if needed.
Suggested change:
-   server.registerTool(
+   registerTool(
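The wrapper idea in this comment can be illustrated with a minimal sketch. The `ToolServer` interface, the `makeRegisterTool` helper, and the `isEnabled` callback are hypothetical names used only for illustration; the actual wrapper and the `WORKSPACE_FEATURE_OVERRIDES` lookup in the repository may look different.

```typescript
// Hypothetical shapes: the real server type and flag lookup live in the repo.
type ToolHandler = (args: unknown) => Promise<unknown>;

interface ToolServer {
  registerTool(name: string, config: object, handler: ToolHandler): void;
}

// Assumed helper: returns a registerTool wrapper that skips disabled tools.
function makeRegisterTool(
  server: ToolServer,
  isEnabled: (name: string) => boolean,
) {
  return (name: string, config: object, handler: ToolHandler): void => {
    // A disabled tool is never registered, so clients cannot see or call it.
    if (!isEnabled(name)) return;
    server.registerTool(name, config, handler);
  };
}
```

Routing every registration through one wrapper keeps the feature-flag check in a single place, which is the consistency the comment asks for.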
    );

    server.registerTool(
      'slides.batchUpdate',

    });

    server.registerTool(
      'slides.createFromJson',

      requests: z
        .string()
        .describe(
          'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.',
        ),
The requests field is restricted to a string, but the SlidesService.batchUpdate implementation (line 329) and most MCP clients support passing structured arrays directly. Allowing both a JSON string and an array of objects provides a better experience for AI agents.
Suggested change:
-     requests: z
-       .string()
-       .describe(
-         'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.',
-       ),
+     requests: z
+       .union([z.string(), z.array(z.any())])
+       .describe(
+         'An array of Slides API request objects or a JSON string of that array (e.g., [{"createSlide":{}}, {"createShape":{...}}]).',
+       ),
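If the schema accepts both forms, the handler needs to normalize them before calling the Slides API. A minimal sketch of that step, assuming a hypothetical `normalizeRequests` helper (not part of the PR) and treating each request as an opaque object:

```typescript
type SlidesRequest = Record<string, unknown>;

// Hypothetical helper: accepts a JSON string or an already-parsed array.
function normalizeRequests(input: string | unknown[]): SlidesRequest[] {
  const parsed: unknown =
    typeof input === 'string' ? JSON.parse(input) : input;

  if (!Array.isArray(parsed)) {
    throw new Error('requests must be an array of Slides API request objects');
  }
  for (const item of parsed) {
    // Each entry must be a plain object such as {"createSlide": {...}}.
    if (typeof item !== 'object' || item === null || Array.isArray(item)) {
      throw new Error('each request must be an object');
    }
  }
  return parsed as SlidesRequest[];
}
```

Validating after parsing means a malformed string and a malformed array fail with the same error, regardless of which form the client sent.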
      slideJson: z
        .string()
        .describe(
          'JSON string of the slide blueprint. Use {"slides":[{"elements":[...]},...]} for multiple slides or {"elements":[...]} for one slide. Will be parsed server-side.',
        ),
The slideElementSchema defined on lines 493-591 is currently unused. It should be applied to the slideJson input schema to provide the AI agent with structured validation and clear documentation of the expected blueprint format. Additionally, allowing both objects and strings makes the tool more robust.
      slideJson: z
        .union([
          z.object({
            slides: z.array(z.object({ elements: z.array(slideElementSchema) })),
          }),
          z.object({
            elements: z.array(slideElementSchema),
          }),
          z.string(),
        ])
        .describe(
          'The slide blueprint. Use {"slides":[{"elements":[...]}]} for multiple slides or {"elements":[...]} for one slide. Can be a JSON string or object.',
        ),

      const slideDefs = (slideJson as any).slides
        ? (slideJson as any).slides
        : [{ elements: (slideJson as any).elements || [] }];
If the slides format is used but an individual slide object is missing the elements property (e.g., { "slides": [{}] }), slideDefs[i].elements will be undefined. This will cause a crash in buildSlideRequests when it attempts to spread or iterate over elements (line 421).
Suggested change:
-     const slideDefs = (slideJson as any).slides
-       ? (slideJson as any).slides
-       : [{ elements: (slideJson as any).elements || [] }];
+     const slideDefs = (slideJson as any).slides
+       ? (slideJson as any).slides.map((s: any) => ({ ...s, elements: s.elements || [] }))
+       : [{ elements: (slideJson as any).elements || [] }];
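The same defaulting can be written without `as any`. This is a sketch only: the `Blueprint` and `SlideDef` types and the `toSlideDefs` name are assumptions for illustration, and the real blueprint type in the PR may carry more fields.

```typescript
interface SlideDef {
  elements: unknown[];
  speaker_notes?: string;
}

// Assumed blueprint shape: either {slides: [...]} or a single {elements: [...]}.
interface Blueprint {
  slides?: Array<Partial<SlideDef>>;
  elements?: unknown[];
}

function toSlideDefs(blueprint: Blueprint): SlideDef[] {
  if (Array.isArray(blueprint.slides)) {
    // Default a missing or malformed elements field so later iteration is safe.
    return blueprint.slides.map((s) => ({
      ...s,
      elements: Array.isArray(s.elements) ? s.elements : [],
    }));
  }
  return [{ elements: Array.isArray(blueprint.elements) ? blueprint.elements : [] }];
}
```

With this shape, `{ "slides": [{}] }` yields one slide with an empty element list instead of an `undefined` that crashes downstream.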
      requests.push({
        createSlide: {
          objectId: slideId,
          insertionIndex: i + 1,
Hardcoding insertionIndex: i + 1 causes new slides to always be inserted at the beginning of the presentation (after the first slide). For an 'add slides' tool, the expected behavior is usually to append slides to the end. Omitting insertionIndex entirely will cause the Slides API to append the new slides to the end of the presentation.
Suggested change:
-         insertionIndex: i + 1,
+         slideLayoutReference: { predefinedLayout: 'BLANK' },
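One way to keep both behaviors is to make the index optional and leave it out of the request unless the caller asks for a position. The `buildCreateSlide` helper below is a hypothetical sketch, not the PR's code; the request fields mirror the `createSlide` request shown in the diff.

```typescript
interface CreateSlideRequest {
  createSlide: {
    objectId: string;
    insertionIndex?: number;
    slideLayoutReference: { predefinedLayout: string };
  };
}

// Hypothetical helper: omits insertionIndex unless a position is requested.
function buildCreateSlide(
  slideId: string,
  insertionIndex?: number,
): CreateSlideRequest {
  return {
    createSlide: {
      objectId: slideId,
      // When the field is absent, the Slides API creates the slide at the end.
      ...(insertionIndex !== undefined ? { insertionIndex } : {}),
      slideLayoutReference: { predefinedLayout: 'BLANK' },
    },
  };
}
```

Calling `buildCreateSlide('slide_1')` produces a request with no `insertionIndex` key at all, while `buildCreateSlide('slide_1', 0)` still allows inserting at the front when that is what the caller wants.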
Test with the new tool that can batch-create slides: slides.createFromJson
Bug fix bundled in this PR:
02b3502 to abf1ecd
abf1ecd to 7379d66
@allenhutchison — would appreciate a review when you get a chance! This adds |
… and theme system

## slides.createFromJson

Agent-friendly blueprint-to-slides tool. Agents describe slides as JSON; the server translates to Slides API batchUpdate in one round trip.

- Color alias system: named colors (blue, red, green, yellow, text, text_muted, primary, primary_text, background, surface, secondary) → Google brand RGB values. Agents never need to specify RGB directly.
- Theme system: 12 named themes (google, exec, pitch, technical, workshop, dark, demo, hcls, customer, simple, google-dark, google-minimal) drive font, accent color, and footer guidance.
- Speaker notes: include "speaker_notes" in each slide object → written automatically. Tool description warns when notes are missing and prompts a second pass.
- Layer ordering: shapes render before images before text, then by layer value. Background shapes reliably appear behind text without manual sequencing.
- Auto-deletes default blank slide "p" created by Google on new presentations.
- Sanitizes template placeholder URLs from LLM output (replaces with info icon).
- Addresses review feedback: uses server.registerTool, registered in feature-config, slide insertion appends to end by default.

## slides.insertImageSlide

Inserts a local image as a full-bleed slide. Handles the full lifecycle: upload to Drive → OAuth-embedded URL (file stays private) → createImage via batchUpdate → delete Drive file. No manual Drive sharing required. Optional label chip rendered in top-right corner.

## drive.uploadFile

Uploads a local file to Drive. Returns fileId and an OAuth-embedded imageUrl suitable for use in slides.createFromJson image elements. File stays private: the access token is embedded in the URL so the Slides API can fetch without public sharing.

## slides.create / slides.batchUpdate / slides.get* / slides.updateSpeakerNotes

- slides.create: create a blank presentation
- slides.batchUpdate: raw Slides API request passthrough
- slides.getText / getMetadata / getImages / getSlideThumbnail: read tools
- slides.getSpeakerNotes / updateSpeakerNotes: read and write speaker notes

## feature-config.ts

- drive.uploadFile added to drive write group
- slides read group: getSpeakerNotes added
- slides write group: create, batchUpdate, createFromJson, updateSpeakerNotes, insertImageSlide all registered (defaultEnabled: false, requires opt-in)
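The layer-ordering rule from the commit message (shapes before images before text, then by layer value) can be sketched as a sort comparator. The `SlideElement` type, the `TYPE_RANK` table, and the choice to treat a missing `layer` as 0 are assumptions made for this illustration, not details confirmed by the PR.

```typescript
type ElementType = 'shape' | 'image' | 'text';

interface SlideElement {
  type: ElementType;
  layer?: number; // assumption: a missing layer is treated as 0
}

// Lower rank renders first, so it ends up behind later elements.
const TYPE_RANK: Record<ElementType, number> = { shape: 0, image: 1, text: 2 };

function orderForRendering<T extends SlideElement>(elements: T[]): T[] {
  // Copy before sorting so the caller's blueprint is not mutated.
  return [...elements].sort(
    (a, b) =>
      TYPE_RANK[a.type] - TYPE_RANK[b.type] ||
      (a.layer ?? 0) - (b.layer ?? 0),
  );
}
```

Because the type rank is compared first, a background shape with a high `layer` value still renders before any text element, which is what keeps backgrounds behind text without manual sequencing.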
7379d66 to 119b16a
drive.uploadFile previously granted anyone:reader on every upload as a convenience for the Slides API workflow. This violates least-privilege: files become public-link-readable by default, and there is no symmetric "close the share" primitive.

Three changes:

- drive.uploadFile now uploads PRIVATE. No share is granted; response drops the imageUrl field (file is not fetchable without further action).
- drive.addPublicAccess (new): grants anyone:reader on an existing file and returns the public imageUrl. Explicit opt-in. Returns the permission ID for symmetric revocation. Surfaces the Workspace publishOutNotPermitted policy clearly so callers can fall back to GCS staging or another host.
- drive.removePublicAccess (new): revokes every anyone:* permission on a file. Idempotent (returns empty list if none exist). File stays in Drive; only the public link is closed.

Callers that used the old uploadFile-grants-share behavior should now call uploadFile + addPublicAccess together, and pair every addPublicAccess with removePublicAccess when the share is no longer needed.
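The idempotent revocation described for drive.removePublicAccess can be sketched as follows. The `PermissionsClient` interface is a hypothetical stand-in for whatever Drive client the server uses, and `selectPublicPermissionIds` is an illustrative helper name; only the `type: 'anyone'` permission field is taken from the Drive permission model.

```typescript
interface Permission {
  id: string;
  type: string; // e.g. 'user', 'group', 'domain', 'anyone'
  role: string; // e.g. 'reader', 'writer'
}

// Hypothetical minimal client; the real tool would call the Drive API.
interface PermissionsClient {
  list(fileId: string): Promise<Permission[]>;
  remove(fileId: string, permissionId: string): Promise<void>;
}

// Pure selection step: every anyone:* permission, whatever its role.
function selectPublicPermissionIds(permissions: Permission[]): string[] {
  return permissions.filter((p) => p.type === 'anyone').map((p) => p.id);
}

async function removePublicAccess(
  client: PermissionsClient,
  fileId: string,
): Promise<string[]> {
  const ids = selectPublicPermissionIds(await client.list(fileId));
  for (const id of ids) {
    await client.remove(fileId, id);
  }
  // An empty list means nothing was shared, so repeated calls are harmless.
  return ids;
}
```

Returning the revoked IDs, rather than throwing when none exist, is what makes the call safe to repeat in cleanup paths.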
The bundled-lifecycle tool is redundant with the new Drive primitives. The same outcome (local image → full-bleed slide) is now composable from four small explicit calls:

- drive.uploadFile (private upload)
- drive.addPublicAccess (explicit share)
- slides.createFromJson (image element)
- drive.removePublicAccess (close the share)

In addition, the old insertImageSlide implementation embedded an OAuth access token in the Drive download URL, but the Slides API rejects authenticated URLs for createImage (publicly accessible URL required, per https://developers.google.com/workspace/slides/api/guides/add-image). So the tool was effectively non-functional anyway. Removing reduces surface area, eliminates a broken primitive, and keeps the public API consistent: every share is explicit, every share is reversible.
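The four-call composition above might look like this from a client's point of view. This is a sketch under stated assumptions: `callTool` is a hypothetical client function, and the argument names (`filePath`, `fileId`, `presentationId`, `slideJson`) are illustrative guesses; only the tool names, the `id` and `imageUrl` result fields, and the 720×405 canvas come from the PR text.

```typescript
// Hypothetical client function that invokes a tool by name.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

async function addImageSlide(
  callTool: CallTool,
  presentationId: string,
  localPath: string,
): Promise<void> {
  // 1. Private upload: nothing is shared yet.
  const uploaded = await callTool('drive.uploadFile', { filePath: localPath });
  const fileId = String(uploaded.id);

  // 2. Explicit share, so the Slides API can fetch the image.
  const shared = await callTool('drive.addPublicAccess', { fileId });
  try {
    // 3. Full-bleed image element on the 720x405 point canvas.
    const blueprint = {
      elements: [
        {
          type: 'image',
          url: String(shared.imageUrl),
          position: { x: 0, y: 0, w: 720, h: 405 },
        },
      ],
    };
    await callTool('slides.createFromJson', {
      presentationId,
      slideJson: JSON.stringify(blueprint),
    });
  } finally {
    // 4. Close the share even if slide creation failed.
    await callTool('drive.removePublicAccess', { fileId });
  }
}
```

The `try`/`finally` pairing reflects the commit's guidance that every addPublicAccess should be matched by a removePublicAccess, including on error paths.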
What this adds
Five new tools for building Google Slides presentations programmatically, plus three new Drive primitives for safe-by-default image staging.
slides.createFromJson: blueprint-to-slides in one call

Callers describe a deck as a JSON blueprint; the server translates it into a Slides API batchUpdate. No knowledge of raw API shape required.

Color aliases (named colors, never RGB): text, primary, primary_text, blue, red/yellow/green, surface, text_muted

Theme system: 12 named themes (google, exec, pitch, technical, workshop, dark, demo, hcls, customer, simple, google-dark, google-minimal) drive font family, accent color, and layout guidance in the tool description.

Speaker notes: include "speaker_notes" in each slide object and they're written automatically. Tool warns when notes are missing and requests a second pass.

Layer ordering: elements render shapes → images → text, then by layer value. Background shapes reliably appear behind text without manual sequencing.

Blueprint format:

    {
      "slides": [
        { "speaker_notes": "...", "elements": [...] },
        { "speaker_notes": "...", "elements": [...] }
      ]
    }

Element schema: type (text|shape|image), position ({x,y,w,h} in points on 720×405 canvas), layer (z-index), content, url, and a style object with: size, bold, color, bg_color, no_border, align, vertical_align.

Drive primitives: split for safe-by-default uploads

The Slides API's createImage endpoint requires a publicly accessible URL (per Google's docs); OAuth tokens in URLs are not honored. To support image-heavy workflows without making the upload tool itself dangerous, the share lifecycle is split across three tools:

- drive.uploadFile: uploads a local file to Drive. File is PRIVATE by default (no share granted). Returns id, name, webViewLink.
- drive.addPublicAccess: explicit opt-in; grants anyone:reader on an existing file, returns the public imageUrl and the permission ID. Surfaces Workspace publishOutNotPermitted clearly so callers can fall back to another hosting path (GCS signed URLs, etc.).
- drive.removePublicAccess: revokes every anyone:* permission on a file. Idempotent. File stays in Drive; only the public link is closed.

Typical use for embedding a local image in a slide:
Supporting tools
- slides.create: create a blank presentation, returns {presentationId, url}
- slides.batchUpdate: raw Slides API request array passthrough
- slides.getText / getMetadata / getImages / getSlideThumbnail: read tools
- slides.getSpeakerNotes / updateSpeakerNotes: read and write speaker notes per slide

Design notes

- … server.registerTool and gated through feature-config.ts.
- slides.write and drive.write groups carry the new tools with correct scope requirements.

Validation

- … npm run build)
- … publishOutNotPermitted blocks addPublicAccess, the error is clearly surfaced so callers can route images through an alternative public host.