
feat(slides,drive): add createFromJson, drive primitives, and theme system #348

Open
n0012 wants to merge 3 commits into gemini-cli-extensions:main from n0012:feat/slides-create-from-json

Conversation


@n0012 n0012 commented Apr 25, 2026

What this adds

Five new tools for building Google Slides presentations programmatically, plus three new Drive primitives for safe-by-default image staging.


slides.createFromJson — blueprint-to-slides in one call

Callers describe a deck as a JSON blueprint; the server translates it into a Slides API batchUpdate. No knowledge of raw API shape required.

Color aliases — named colors, never RGB:

| Alias | Value | Use |
| --- | --- | --- |
| text | #202124 near-black | body text |
| primary | #101828 dark | dark backgrounds |
| primary_text | #FFFFFF white | text on dark |
| blue | #1A73E8 Google Blue | accents, labels |
| red/yellow/green | Google brand colors | brand bar |
| surface | #F1F3F4 light gray | card backgrounds |
| text_muted | #757575 gray | secondary text |
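
As a rough illustration of how these aliases could map onto the Slides API, whose `RgbColor` message takes `red`, `green`, and `blue` as floats between 0 and 1, the sketch below resolves an alias to that shape. `COLOR_ALIASES` and `resolveColor` are hypothetical names, not the PR's implementation, and only the aliases with hex values listed above are included.

```typescript
// Hypothetical alias map built from the values listed above; the PR's real map may differ.
const COLOR_ALIASES: Record<string, string> = {
  text: "#202124",
  primary: "#101828",
  primary_text: "#FFFFFF",
  blue: "#1A73E8",
  surface: "#F1F3F4",
  text_muted: "#757575",
};

// Shape of the Slides API RgbColor message: each channel is a float in [0, 1].
interface RgbColor {
  red: number;
  green: number;
  blue: number;
}

// Resolve an alias to an RgbColor, rejecting anything that is not a known alias.
function resolveColor(alias: string): RgbColor {
  const hex = COLOR_ALIASES[alias];
  if (hex === undefined) {
    throw new Error(`Unknown color alias: ${alias}`);
  }
  // Convert one two-digit hex channel (starting at `start`) to a 0..1 float.
  const channel = (start: number): number =>
    parseInt(hex.slice(start, start + 2), 16) / 255;
  return { red: channel(1), green: channel(3), blue: channel(5) };
}
```

Rejecting unknown input, including raw hex strings, is one way to keep callers on named colors only; whether the PR enforces this is not stated.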

Theme system — 12 named themes (google, exec, pitch, technical, workshop, dark, demo, hcls, customer, simple, google-dark, google-minimal) drive font family, accent color, and layout guidance in the tool description.

Speaker notes — include "speaker_notes" in each slide object and they're written automatically. Tool warns when notes are missing and requests a second pass.

Layer ordering — elements render shapes → images → text, then by layer value. Background shapes reliably appear behind text without manual sequencing.
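
The ordering rule described above can be sketched as a two-key sort: first by element type, then by `layer`. This is a minimal illustration, not the PR's code; `sortForRendering` and `TYPE_RANK` are hypothetical names, and treating a missing `layer` as 0 is an assumption.

```typescript
type ElementType = "shape" | "image" | "text";

interface LayeredElement {
  type: ElementType;
  layer?: number; // z-index within a type group; assumed to default to 0
}

// Render rank per type: shapes first (back), then images, then text (front).
const TYPE_RANK: Record<ElementType, number> = { shape: 0, image: 1, text: 2 };

// Return a new array in render order without mutating the input.
function sortForRendering<T extends LayeredElement>(elements: T[]): T[] {
  return [...elements].sort(
    (a, b) =>
      TYPE_RANK[a.type] - TYPE_RANK[b.type] ||
      (a.layer ?? 0) - (b.layer ?? 0),
  );
}
```

Because the type rank is compared first, a shape with a high `layer` value still renders behind any text element.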

Blueprint format:

{ "slides": [
    { "speaker_notes": "...", "elements": [...] },
    { "speaker_notes": "...", "elements": [...] }
  ]
}

Element schema: type (text | shape | image), position ({x,y,w,h} in points on 720×405 canvas), layer (z-index), content, url, and a style object with: size, bold, color, bg_color, no_border, align, vertical_align.
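
For readers who prefer types, the blueprint and element schema could be written roughly as follows. The field names come from the description above; which fields are optional, the example values, and the helper `slidesMissingNotes` are assumptions made for illustration.

```typescript
// Types inferred from the PR description; which fields are optional is an assumption.
interface Position { x: number; y: number; w: number; h: number } // points on a 720x405 canvas

interface ElementStyle {
  size?: number;
  bold?: boolean;
  color?: string;    // color alias such as "text"
  bg_color?: string; // color alias such as "surface"
  no_border?: boolean;
  align?: string;
  vertical_align?: string;
}

interface SlideElement {
  type: "text" | "shape" | "image";
  position: Position;
  layer?: number;
  content?: string;
  url?: string;
  style?: ElementStyle;
}

interface SlideDef { speaker_notes?: string; elements: SlideElement[] }
interface Blueprint { slides: SlideDef[] }

const blueprint: Blueprint = {
  slides: [
    {
      speaker_notes: "Open with the headline result.",
      elements: [
        { type: "shape", position: { x: 0, y: 0, w: 720, h: 405 }, layer: 0,
          style: { bg_color: "primary", no_border: true } },
        { type: "text", position: { x: 40, y: 160, w: 640, h: 80 }, layer: 1,
          content: "Quarterly review",
          style: { size: 36, bold: true, color: "primary_text" } },
      ],
    },
    { elements: [] }, // no speaker_notes, so this slide would be flagged
  ],
};

// Zero-based indexes of slides whose speaker_notes are missing or blank
// (treating blank notes as missing is an assumption).
function slidesMissingNotes(bp: Blueprint): number[] {
  return bp.slides
    .map((s, i) => (s.speaker_notes?.trim() ? -1 : i))
    .filter((i) => i >= 0);
}
```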


Drive primitives — split for safe-by-default uploads

The Slides API's createImage endpoint requires a publicly accessible URL (per Google's docs) — OAuth tokens in URLs are not honored. To support image-heavy workflows without making the upload tool itself dangerous, the share lifecycle is split across three tools:

  • drive.uploadFile — uploads a local file to Drive. File is PRIVATE by default (no share granted). Returns id, name, webViewLink.
  • drive.addPublicAccess — explicit opt-in: grants anyone:reader on an existing file, returns the public imageUrl and the permission ID. Surfaces Workspace publishOutNotPermitted clearly so callers can fall back to another hosting path (GCS signed URLs, etc.).
  • drive.removePublicAccess — revokes every anyone:* permission on a file. Idempotent. File stays in Drive — only the public link is closed.
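
To illustrate why revocation can be idempotent, here is a minimal sketch of selecting the permissions to delete. In the Drive v3 API each permission resource carries an `id`, a `type` (`user`, `group`, `domain`, or `anyone`), and a `role`; the helper name `publicPermissionIds` is hypothetical and the PR's actual code is not shown in this thread.

```typescript
// Minimal subset of a Drive v3 permission resource.
interface DrivePermission {
  id: string;
  type: "user" | "group" | "domain" | "anyone";
  role: string;
}

// Pick the IDs of every anyone:* permission, regardless of role.
// An empty result means there is nothing to revoke, so a repeat call is a no-op.
function publicPermissionIds(permissions: DrivePermission[]): string[] {
  return permissions.filter((p) => p.type === "anyone").map((p) => p.id);
}
```

A caller would then delete each returned ID; owner and user permissions are never touched, so the file itself stays in Drive.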

Typical use for embedding a local image in a slide:

1. id       = drive.uploadFile(localPath)            # private
2. imageUrl = drive.addPublicAccess(id)              # explicit opt-in
3. slides.createFromJson(..., url: imageUrl, ...)
4. drive.removePublicAccess(id)                       # close the window
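
The four steps above could be wrapped so that the share is closed even when slide creation fails. This is a sketch under stated assumptions: `CallTool` is a hypothetical stand-in for whatever MCP client invokes the tools, and the argument names `localPath`, `fileId`, and `presentationId` are guesses; only the tool names, `slideJson`, and the returned `id` and `imageUrl` fields come from this PR.

```typescript
// Hypothetical tool-invocation function; a real MCP client will look different.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Record<string, unknown>>;

async function embedLocalImage(
  callTool: CallTool,
  localPath: string,
  presentationId: string,
): Promise<void> {
  // 1. Upload privately; no share exists yet.
  const uploaded = await callTool("drive.uploadFile", { localPath });
  const fileId = String(uploaded.id);

  // 2. Explicit opt-in to a public link.
  const shared = await callTool("drive.addPublicAccess", { fileId });
  try {
    // 3. Reference the public URL from an image element (full canvas, 720x405).
    await callTool("slides.createFromJson", {
      presentationId,
      slideJson: JSON.stringify({
        elements: [
          { type: "image", url: shared.imageUrl,
            position: { x: 0, y: 0, w: 720, h: 405 } },
        ],
      }),
    });
  } finally {
    // 4. Always close the window, even if step 3 threw.
    await callTool("drive.removePublicAccess", { fileId });
  }
}
```

Placing the revoke in `finally` keeps the "every share is reversible" property from depending on the happy path.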

Supporting tools

  • slides.create — create a blank presentation, returns {presentationId, url}
  • slides.batchUpdate — raw Slides API request array passthrough
  • slides.getText / getMetadata / getImages / getSlideThumbnail — read tools
  • slides.getSpeakerNotes / updateSpeakerNotes — read and write speaker notes per slide

Design notes

  • All new tools registered via server.registerTool and gated through feature-config.ts. slides.write and drive.write groups carry the new tools with correct scope requirements.
  • Color alias system replaces raw RGB so output stays consistent with the active theme.
  • Speaker notes written inline from blueprint (no second pass required unless omitted).
  • Drive upload split into three primitives so the safe default is private; sharing is always explicit and reversible.

Validation

  • TypeScript build passes (npm run build)
  • Full round-trip exercised end-to-end on both personal and corporate Google accounts: presentation creation, multi-slide blueprints with shapes/text/images, color aliases, speaker notes, image upload + share + revoke.
  • On corporate Workspace domains where publishOutNotPermitted blocks addPublicAccess, the error is clearly surfaced so callers can route images through an alternative public host.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces new Google Slides tools for creating presentations, performing batch updates, and generating slides from JSON blueprints. The review feedback highlights the need for consistency by using the registerTool wrapper to respect feature flags and suggests enhancing input schemas to support structured objects. Additionally, the feedback addresses a potential crash in the createFromJson service method and recommends adjusting the slide insertion logic to append new slides to the end of a presentation.

slidesService.getSlideThumbnail,
);

server.registerTool(

high

The slides.create tool is being registered using server.registerTool directly, which bypasses the registerTool wrapper defined on line 154. This wrapper is responsible for checking if the tool is enabled via feature flags (WORKSPACE_FEATURE_OVERRIDES). Using the wrapper ensures consistency and allows users to disable these tools if needed.

Suggested change
- server.registerTool(
+ registerTool(

);

server.registerTool(
'slides.batchUpdate',

high

Similar to slides.create, this tool should be registered using the registerTool wrapper to respect feature flags.

  registerTool(

});

server.registerTool(
'slides.createFromJson',

high

This tool should also use the registerTool wrapper to ensure it can be managed via feature flags.

  registerTool(

Comment on lines +482 to +486
requests: z
  .string()
  .describe(
    'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.',
  ),

medium

The requests field is restricted to a string, but the SlidesService.batchUpdate implementation (line 329) and most MCP clients support passing structured arrays directly. Allowing both a JSON string and an array of objects provides a better experience for AI agents.

Suggested change
- requests: z
-   .string()
-   .describe(
-     'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.',
-   ),
+ requests: z
+   .union([z.string(), z.array(z.any())])
+   .describe(
+     'An array of Slides API request objects or a JSON string of that array (e.g., [{"createSlide":{}}, {"createShape":{...}}]).',
+   ),
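
If the schema accepts both forms, the handler needs to normalize them before calling the API. A minimal sketch of that step, assuming a hypothetical helper named `normalizeRequests` (the PR's actual handling is not shown in this thread):

```typescript
// Accept either a JSON string or an already-structured array and return an array.
function normalizeRequests(input: string | unknown[]): unknown[] {
  const parsed: unknown = typeof input === "string" ? JSON.parse(input) : input;
  if (!Array.isArray(parsed)) {
    throw new Error("requests must be an array of Slides API request objects");
  }
  return parsed;
}
```

Validating after parsing means a JSON string that decodes to an object or a scalar is rejected with a clear message instead of failing later inside the API call.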

Comment on lines +602 to +606
slideJson: z
  .string()
  .describe(
    'JSON string of the slide blueprint. Use {"slides":[{"elements":[...]},...]} for multiple slides or {"elements":[...]} for one slide. Will be parsed server-side.',
  ),

medium

The slideElementSchema defined on lines 493-591 is currently unused. It should be applied to the slideJson input schema to provide the AI agent with structured validation and clear documentation of the expected blueprint format. Additionally, allowing both objects and strings makes the tool more robust.

        slideJson: z
          .union([
            z.object({
              slides: z.array(z.object({ elements: z.array(slideElementSchema) })),
            }),
            z.object({
              elements: z.array(slideElementSchema),
            }),
            z.string(),
          ])
          .describe(
            'The slide blueprint. Use {"slides":[{"elements":[...]}]} for multiple slides or {"elements":[...]} for one slide. Can be a JSON string or object.',
          ),

Comment on lines +679 to +681
const slideDefs = (slideJson as any).slides
  ? (slideJson as any).slides
  : [{ elements: (slideJson as any).elements || [] }];

medium

If the slides format is used but an individual slide object is missing the elements property (e.g., { "slides": [{}] }), slideDefs[i].elements will be undefined. This will cause a crash in buildSlideRequests when it attempts to spread or iterate over elements (line 421).

Suggested change
- const slideDefs = (slideJson as any).slides
-   ? (slideJson as any).slides
-   : [{ elements: (slideJson as any).elements || [] }];
+ const slideDefs = (slideJson as any).slides
+   ? (slideJson as any).slides.map((s: any) => ({ ...s, elements: s.elements || [] }))
+   : [{ elements: (slideJson as any).elements || [] }];
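
The same defaulting can be expressed without `as any` casts. The sketch below is one possible typed variant, not the PR's code; `SlideInput`, `BlueprintInput`, and `normalizeSlideDefs` are hypothetical names.

```typescript
interface SlideInput { speaker_notes?: string; elements?: unknown[] }

// Either the multi-slide form ({ slides: [...] }) or the single-slide form ({ elements: [...] }).
interface BlueprintInput { slides?: SlideInput[]; elements?: unknown[] }

type NormalizedSlide = SlideInput & { elements: unknown[] };

// Guarantee that every slide definition has an elements array, possibly empty.
function normalizeSlideDefs(input: BlueprintInput): NormalizedSlide[] {
  const slides: SlideInput[] = input.slides ?? [{ elements: input.elements ?? [] }];
  return slides.map((s) => ({ ...s, elements: s.elements ?? [] }));
}
```

With this shape, a blueprint such as `{ "slides": [{}] }` yields one slide with an empty element list rather than an undefined one.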

requests.push({
  createSlide: {
    objectId: slideId,
    insertionIndex: i + 1,

medium

Hardcoding insertionIndex: i + 1 causes new slides to always be inserted at the beginning of the presentation (after the first slide). For an 'add slides' tool, the expected behavior is usually to append slides to the end. Omitting insertionIndex entirely will cause the Slides API to append the new slides to the end of the presentation.

Suggested change
- insertionIndex: i + 1,
+ slideLayoutReference: { predefinedLayout: 'BLANK' },
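
One way to support both behaviors is to make the index optional when building the request. In the Slides API, `insertionIndex` on `createSlide` is optional, and a slide created without it is placed at the end. The builder below is a hedged sketch; `buildCreateSlide` is a hypothetical helper, not part of the PR.

```typescript
interface CreateSlideRequest {
  createSlide: {
    objectId: string;
    insertionIndex?: number;
    slideLayoutReference: { predefinedLayout: "BLANK" };
  };
}

// Build a createSlide request; leaving insertionIndex out lets the API append.
function buildCreateSlide(objectId: string, insertionIndex?: number): CreateSlideRequest {
  return {
    createSlide: {
      objectId,
      // Only include the key when the caller asked for a specific position.
      ...(insertionIndex === undefined ? {} : { insertionIndex }),
      slideLayoutReference: { predefinedLayout: "BLANK" },
    },
  };
}
```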


n0012 commented Apr 25, 2026

To test, install the fork with: gemini extensions install https://github.com/n0012/workspace

The new tool for batch-creating slides is slides.createFromJson.


n0012 commented Apr 29, 2026

Bug fix bundled in this PR: docs.getText field mask error

While testing this PR I hit a systematic breakage in docs_getText (and writeText/replaceText) affecting all Google Docs — including single-tab docs with no suggestions. Root cause and fix:

Root cause: docs.documents.get with includeTabsContent: true and a wildcard field mask (tabs, tabs.documentTab.body, etc.) causes the API to validate the mask as including suggestion/comment sub-fields (suggestedInsertionIds, suggestedParagraphStyleChanges, etc.). The API then rejects with:

Field mask cannot retrieve comment-specific fields when include_comments is false.

A previous attempt added suggestionsViewMode: 'PREVIEW_WITHOUT_SUGGESTIONS' to suppress suggestion data, but this doesn't help — the field mask is validated before the view-mode filter is applied.

Fix (commit 02b3502): Replaced all three documents.get calls in DocsService with explicit field masks (DOCS_READ_FIELDS, DOCS_END_INDEX_FIELDS) that enumerate only the fields _readStructuralElement actually reads — no suggestion or comment sub-fields anywhere in the tree. Handles up to 3 levels of tab nesting (Google Docs maximum) and one level of table nesting.
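
To make the idea of an explicit, nested field mask concrete, here is an illustrative builder. It is not the PR's `DOCS_READ_FIELDS` constant, which is not shown in this thread; the leaf fields chosen, `BODY_MASK`, and `buildTabsMask` are assumptions, and table fields are left out for brevity. It uses the parenthesized sub-selection syntax of the Google API `fields` parameter.

```typescript
// Illustrative leaf fields only: paragraph text runs, no suggestion or comment sub-fields.
const BODY_MASK = "body(content(paragraph(elements(textRun(content)))))";

// Build the mask for one tab level plus `depth - 1` levels of child tabs.
function buildTabsMask(depth: number): string {
  const own = `tabProperties(tabId,title),documentTab(${BODY_MASK})`;
  return depth <= 1 ? own : `${own},childTabs(${buildTabsMask(depth - 1)})`;
}

// Three levels of nesting, matching the limit mentioned above.
const fields = `tabs(${buildTabsMask(3)})`;
```

Generating the mask recursively keeps the three levels consistent with each other, which is easy to get wrong when the string is written by hand.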

@n0012 n0012 force-pushed the feat/slides-create-from-json branch from 02b3502 to abf1ecd Compare May 9, 2026 19:10
@n0012 n0012 changed the title feat(slides): add slides.createFromJson — agent-friendly blueprint-to-slides tool feat(slides,drive): add createFromJson, insertImageSlide, uploadFile, and theme system May 9, 2026
@n0012 n0012 force-pushed the feat/slides-create-from-json branch from abf1ecd to 7379d66 Compare May 9, 2026 19:17

n0012 commented May 9, 2026

@allenhutchison — would appreciate a review when you get a chance! This adds slides.createFromJson, slides.insertImageSlide, and drive.uploadFile — tools we've been using in production with Claude Code + Gemini CLI for building AI-generated slide decks. CLA is green. Happy to address any feedback.

… and theme system

## slides.createFromJson
Agent-friendly blueprint-to-slides tool. Agents describe slides as JSON;
the server translates to Slides API batchUpdate in one round trip.

- Color alias system: named colors (blue, red, green, yellow, text, text_muted,
  primary, primary_text, background, surface, secondary) → Google brand RGB values.
  Agents never need to specify RGB directly.
- Theme system: 12 named themes (google, exec, pitch, technical, workshop, dark,
  demo, hcls, customer, simple, google-dark, google-minimal) drive font, accent
  color, and footer guidance.
- Speaker notes: include "speaker_notes" in each slide object → written automatically.
  Tool description warns when notes are missing and prompts a second pass.
- Layer ordering: shapes render before images before text, then by layer value.
  Background shapes reliably appear behind text without manual sequencing.
- Auto-deletes default blank slide "p" created by Google on new presentations.
- Sanitizes template placeholder URLs from LLM output (replaces with info icon).
- Addresses review feedback: uses server.registerTool, registered in feature-config,
  slide insertion appends to end by default.

## slides.insertImageSlide
Inserts a local image as a full-bleed slide. Handles the full lifecycle:
upload to Drive → OAuth-embedded URL (file stays private) → createImage via
batchUpdate → delete Drive file. No manual Drive sharing required.
Optional label chip rendered in top-right corner.

## drive.uploadFile
Uploads a local file to Drive. Returns fileId and an OAuth-embedded imageUrl
suitable for use in slides.createFromJson image elements. File stays private —
access token embedded in URL so Slides API can fetch without public sharing.

## slides.create / slides.batchUpdate / slides.get* / slides.updateSpeakerNotes
- slides.create: create a blank presentation
- slides.batchUpdate: raw Slides API request passthrough
- slides.getText / getMetadata / getImages / getSlideThumbnail: read tools
- slides.getSpeakerNotes / updateSpeakerNotes: read and write speaker notes

## feature-config.ts
- drive.uploadFile added to drive write group
- slides read group: getSpeakerNotes added
- slides write group: create, batchUpdate, createFromJson, updateSpeakerNotes,
  insertImageSlide all registered (defaultEnabled: false, requires opt-in)
@n0012 n0012 force-pushed the feat/slides-create-from-json branch from 7379d66 to 119b16a Compare May 10, 2026 03:19
n0012 added 2 commits May 19, 2026 05:47
drive.uploadFile previously granted anyone:reader on every upload as a
convenience for the Slides API workflow. This violates least-privilege:
files become public-link-readable by default, and there is no symmetric
"close the share" primitive.

Three changes:

- drive.uploadFile now uploads PRIVATE. No share is granted; response
  drops the imageUrl field (file is not fetchable without further action).

- drive.addPublicAccess (new): grants anyone:reader on an existing file
  and returns the public imageUrl. Explicit opt-in. Returns the
  permission ID for symmetric revocation. Surfaces the Workspace
  publishOutNotPermitted policy clearly so callers can fall back to
  GCS staging or another host.

- drive.removePublicAccess (new): revokes every anyone:* permission on
  a file. Idempotent (returns empty list if none exist). File stays in
  Drive — only the public link is closed.

Callers that used the old uploadFile-grants-share behavior should now
call uploadFile + addPublicAccess together, and pair every
addPublicAccess with removePublicAccess when the share is no longer
needed.

The bundled-lifecycle tool (slides.insertImageSlide) is redundant with the new Drive primitives.
The same outcome — local image → full-bleed slide — is now composable
from four small explicit calls:

  drive.uploadFile      (private upload)
  drive.addPublicAccess (explicit share)
  slides.createFromJson (image element)
  drive.removePublicAccess (close the share)

In addition, the old insertImageSlide implementation embedded an OAuth
access token in the Drive download URL, but the Slides API rejects
authenticated URLs for createImage (publicly accessible URL required,
per https://developers.google.com/workspace/slides/api/guides/add-image).
So the tool was effectively non-functional anyway.

Removing reduces surface area, eliminates a broken primitive, and keeps
the public API consistent: every share is explicit, every share is
reversible.
@n0012 n0012 changed the title feat(slides,drive): add createFromJson, insertImageSlide, uploadFile, and theme system feat(slides,drive): add createFromJson, drive primitives, and theme system May 19, 2026