Commit 7379d66

feat(slides,drive): add createFromJson, insertImageSlide, uploadFile, and theme system
## slides.createFromJson

Agent-friendly blueprint-to-slides tool. Agents describe slides as JSON; the server translates the blueprint into a Slides API batchUpdate in one round trip.

- Color alias system: named colors (blue, red, green, yellow, text, text_muted, primary, primary_text, background, surface, secondary) → Google brand RGB values. Agents never need to specify RGB directly.
- Theme system: 12 named themes (google, exec, pitch, technical, workshop, dark, demo, hcls, customer, simple, google-dark, google-minimal) drive font, accent color, and footer guidance.
- Speaker notes: include "speaker_notes" in each slide object and the notes are written automatically. The tool description warns when notes are missing and prompts a second pass.
- Layer ordering: shapes render before images before text, then by layer value. Background shapes reliably appear behind text without manual sequencing.
- Auto-deletes the default blank slide "p" that Google creates on new presentations.
- Sanitizes template placeholder URLs from LLM output (replaces them with an info icon).
- Addresses review feedback: uses server.registerTool, registered in feature-config, slide insertion appends to end by default.

## slides.insertImageSlide

Inserts a local image as a full-bleed slide. Handles the full lifecycle: upload to Drive → OAuth-embedded URL (file stays private) → createImage via batchUpdate → delete Drive file. No manual Drive sharing required. Optional label chip rendered in the top-right corner.

## drive.uploadFile

Uploads a local file to Drive. Returns fileId and an OAuth-embedded imageUrl suitable for use in slides.createFromJson image elements. The file stays private: the access token is embedded in the URL so the Slides API can fetch it without public sharing.
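The color alias and layer ordering rules described above can be sketched as follows. This is a minimal illustration, not the commit's implementation (which is not part of the diff shown here): `COLOR_ALIASES`, `resolveColor`, `sortForRendering`, and the `BlueprintElement` type are hypothetical names, and only aliases whose hex values appear in the tool description are included.

```typescript
// Hypothetical sketch: names and types below are assumptions for illustration.
type Rgb = { red: number; green: number; blue: number };
type ElementType = 'shape' | 'image' | 'text';

interface BlueprintElement {
  type: ElementType;
  layer?: number;
}

// Hex values as listed in the slides.createFromJson tool description.
const COLOR_ALIASES: Record<string, string> = {
  blue: '#4285F4',
  red: '#EA4335',
  yellow: '#FBBC05',
  green: '#34A853',
  primary: '#202124',
  secondary: '#1A73E8',
  text: '#1F1F1F',
  text_muted: '#444746',
};

// Convert "#RRGGBB" into the 0-1 RGB object form used by the Slides API.
function hexToRgb(hex: string): Rgb {
  const n = parseInt(hex.slice(1), 16);
  return {
    red: ((n >> 16) & 0xff) / 255,
    green: ((n >> 8) & 0xff) / 255,
    blue: (n & 0xff) / 255,
  };
}

// Accept either an alias string or an explicit RGB object.
function resolveColor(color: string | Rgb): Rgb {
  if (typeof color !== 'string') return color;
  const hex = COLOR_ALIASES[color];
  if (hex === undefined) throw new Error(`Unknown color alias: ${color}`);
  return hexToRgb(hex);
}

// Shapes first, then images, then text; ties are broken by ascending layer.
const TYPE_RANK: Record<ElementType, number> = { shape: 0, image: 1, text: 2 };

function sortForRendering(elements: BlueprintElement[]): BlueprintElement[] {
  return [...elements].sort(
    (a, b) =>
      TYPE_RANK[a.type] - TYPE_RANK[b.type] || (a.layer ?? 0) - (b.layer ?? 0),
  );
}
```

Sorting on element type before layer is what lets a background shape stay behind text even when the agent forgets to set `layer`.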
## slides.create / slides.batchUpdate / slides.get* / slides.updateSpeakerNotes

- slides.create: create a blank presentation
- slides.batchUpdate: raw Slides API request passthrough
- slides.getText / getMetadata / getImages / getSlideThumbnail: read tools
- slides.getSpeakerNotes / updateSpeakerNotes: read and write speaker notes

## feature-config.ts

- drive.uploadFile added to drive write group
- slides read group: getSpeakerNotes added
- slides write group: create, batchUpdate, createFromJson, updateSpeakerNotes, insertImageSlide all registered (defaultEnabled: false, requires opt-in)
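Because slides.batchUpdate is a raw passthrough whose `requests` argument is a JSON string, a caller has to serialize the Slides API request objects itself. A minimal sketch of building that argument, assuming made-up object IDs and a placeholder presentation ID; the request shapes (`createSlide`, `createShape`, `insertText`) follow the public Slides API:

```typescript
// Object IDs and the presentation ID are placeholders chosen for illustration.
const slideId = 'demo_slide_1';
const titleId = 'demo_title_1';

// Helper for Slides API dimensions expressed in points.
const pt = (magnitude: number) => ({ magnitude, unit: 'PT' });

const requests = [
  // 1. Add a blank slide.
  {
    createSlide: {
      objectId: slideId,
      slideLayoutReference: { predefinedLayout: 'BLANK' },
    },
  },
  // 2. Add a text box near the top-left of the 720x405 pt canvas.
  {
    createShape: {
      objectId: titleId,
      shapeType: 'TEXT_BOX',
      elementProperties: {
        pageObjectId: slideId,
        size: { width: pt(684), height: pt(40) },
        transform: { scaleX: 1, scaleY: 1, translateX: 18, translateY: 18, unit: 'PT' },
      },
    },
  },
  // 3. Put text into the box.
  { insertText: { objectId: titleId, insertionIndex: 0, text: 'Quarterly review' } },
];

// Arguments in the shape the slides.batchUpdate inputSchema describes.
const toolArgs = {
  presentationId: 'PRESENTATION_ID',
  requests: JSON.stringify(requests),
};
```

The requests are applied in array order, so the slide must be created before any element that names it as `pageObjectId`.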
1 parent c3fe282 commit 7379d66

4 files changed

Lines changed: 1225 additions & 2 deletions

workspace-server/src/features/feature-config.ts

Lines changed: 9 additions & 1 deletion
@@ -99,6 +99,7 @@ export const FEATURE_GROUPS: readonly FeatureGroup[] = [
       'drive.moveFile',
       'drive.trashFile',
       'drive.renameFile',
+      'drive.uploadFile',
     ],
     defaultEnabled: true,
   },
@@ -203,14 +204,21 @@ export const FEATURE_GROUPS: readonly FeatureGroup[] = [
       'slides.getMetadata',
       'slides.getImages',
       'slides.getSlideThumbnail',
+      'slides.getSpeakerNotes',
     ],
     defaultEnabled: true,
   },
   {
     service: 'slides',
     group: 'write',
     scopes: scopes('presentations'),
-    tools: [],
+    tools: [
+      'slides.create',
+      'slides.batchUpdate',
+      'slides.createFromJson',
+      'slides.updateSpeakerNotes',
+      'slides.insertImageSlide',
+    ],
     defaultEnabled: false,
   },
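The opt-in behavior implied by `defaultEnabled: false` can be sketched like this. How the server actually resolves enabled groups is not shown in this diff, so `enabledTools`, the `"service.group"` opt-in key format, and the trimmed `FeatureGroupSketch` type (which omits `scopes`) are all assumptions for illustration.

```typescript
// Hypothetical, trimmed-down version of a feature group entry.
interface FeatureGroupSketch {
  service: string;
  group: 'read' | 'write';
  tools: string[];
  defaultEnabled: boolean;
}

const GROUPS: readonly FeatureGroupSketch[] = [
  {
    service: 'slides',
    group: 'read',
    tools: ['slides.getSpeakerNotes'],
    defaultEnabled: true,
  },
  {
    service: 'slides',
    group: 'write',
    tools: ['slides.create', 'slides.createFromJson'],
    defaultEnabled: false,
  },
];

// Return the tools that should be registered: every group that is enabled by
// default, plus any group the user opted into by its "service.group" key.
function enabledTools(
  groups: readonly FeatureGroupSketch[],
  optIn: readonly string[],
): string[] {
  const result: string[] = [];
  for (const g of groups) {
    const key = `${g.service}.${g.group}`;
    if (g.defaultEnabled || optIn.indexOf(key) !== -1) {
      result.push(...g.tools);
    }
  }
  return result;
}
```

With no opt-in, only the read tool is returned; passing `['slides.write']` adds the write tools as well.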

workspace-server/src/index.ts

Lines changed: 232 additions & 0 deletions
@@ -476,6 +476,212 @@ async function main() {
     slidesService.getSlideThumbnail,
   );
 
+  // Speaker notes tools — approach adapted from PR #235
+  // https://github.com/gemini-cli-extensions/workspace/pull/235 by @stefanoamorelli
+  server.registerTool(
+    'slides.getSpeakerNotes',
+    {
+      description:
+        'Retrieves speaker notes for every slide in a presentation. Returns an array of {slideIndex, slideObjectId, speakerNotesObjectId, notes} — one entry per slide. Use slideObjectId with slides.updateSpeakerNotes to write notes back.',
+      inputSchema: {
+        presentationId: z
+          .string()
+          .describe('The ID or URL of the presentation.'),
+      },
+    },
+    slidesService.getSpeakerNotes,
+  );
+
+  server.registerTool(
+    'slides.updateSpeakerNotes',
+    {
+      description:
+        'Writes speaker notes for a specific slide. Replaces any existing notes. Get slideObjectId from slides.getSpeakerNotes or slides.getMetadata.',
+      inputSchema: {
+        presentationId: z
+          .string()
+          .describe('The ID or URL of the presentation.'),
+        slideObjectId: z
+          .string()
+          .describe('The object ID of the slide to update (from getSpeakerNotes or getMetadata).'),
+        notes: z
+          .string()
+          .describe('The speaker notes text. Pass an empty string to clear existing notes.'),
+      },
+    },
+    slidesService.updateSpeakerNotes,
+  );
+
+  server.registerTool(
+    'slides.create',
+    {
+      description:
+        'Creates a new blank Google Slides presentation. Returns the presentation ID and URL.',
+      inputSchema: {
+        title: z.string().describe('The title for the new presentation.'),
+      },
+    },
+    slidesService.create,
+  );
+
+  server.registerTool(
+    'slides.batchUpdate',
+    {
+      description:
+        'Executes a batch of updates (create, modify, delete) on a Google Slides presentation. Takes an array of raw Slides API request objects.',
+      inputSchema: {
+        presentationId: z
+          .string()
+          .describe('The ID or URL of the presentation to modify.'),
+        requests: z
+          .string()
+          .describe(
+            'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.',
+          ),
+      },
+    },
+    slidesService.batchUpdate,
+  );
+
+  // Shared element schema for createFromJson
+  const slideElementSchema = z.object({
+    type: z.enum(['text', 'shape', 'image']).describe('Element type.'),
+    content: z
+      .string()
+      .optional()
+      .describe('Text content (for text elements).'),
+    shape_type: z
+      .string()
+      .optional()
+      .describe(
+        'Shape type (e.g., RECTANGLE, RIGHT_ARROW, TEXT_BOX). Default: RECTANGLE.',
+      ),
+    url: z.string().optional().describe('Image URL (for image elements).'),
+    layer: z
+      .number()
+      .optional()
+      .describe(
+        'Z-index layer for rendering order. Lower layers render first.',
+      ),
+    position: z
+      .object({
+        x: z.number().describe('X position in points.'),
+        y: z.number().describe('Y position in points.'),
+        w: z.number().describe('Width in points.'),
+        h: z.number().describe('Height in points.'),
+      })
+      .describe('Position and size on a 720x405 point grid.'),
+    style: z
+      .object({
+        size: z.number().optional().describe('Font size in points.'),
+        bold: z.boolean().optional().describe('Bold text.'),
+        italic: z.boolean().optional().describe('Italic text.'),
+        align: z
+          .enum(['START', 'CENTER', 'END'])
+          .optional()
+          .describe('Horizontal text alignment.'),
+        vertical_align: z
+          .enum(['TOP', 'MIDDLE', 'BOTTOM'])
+          .optional()
+          .describe('Vertical content alignment.'),
+        color: z
+          .object({
+            red: z.number(),
+            green: z.number(),
+            blue: z.number(),
+          })
+          .optional()
+          .describe('Text color (RGB 0-1).'),
+        bg_color: z
+          .object({
+            red: z.number(),
+            green: z.number(),
+            blue: z.number(),
+          })
+          .optional()
+          .describe('Shape background color (RGB 0-1).'),
+        border_color: z
+          .object({
+            red: z.number(),
+            green: z.number(),
+            blue: z.number(),
+          })
+          .optional()
+          .describe('Shape border color (RGB 0-1).'),
+        border_weight: z
+          .number()
+          .optional()
+          .describe('Border weight in points.'),
+        no_border: z
+          .boolean()
+          .optional()
+          .describe('Remove border from shape.'),
+        font_family: z
+          .string()
+          .optional()
+          .describe('Font family name (e.g. "Arial", "Roboto"). Defaults to "Arial".'),
+        underline: z.boolean().optional().describe('Underline text.'),
+        strikethrough: z.boolean().optional().describe('Strikethrough text.'),
+        indent: z
+          .number()
+          .optional()
+          .describe('Left indent of paragraph text in points (e.g. 18 for one level of bullet indentation).'),
+        bold_phrases: z
+          .array(z.string())
+          .optional()
+          .describe('Phrases within content to bold.'),
+        bold_until: z
+          .number()
+          .optional()
+          .describe('Bold text from start to this character index.'),
+        links: z
+          .array(
+            z.object({
+              text: z.string().describe('Link text to find in content.'),
+              url: z.string().describe('URL to link to.'),
+            }),
+          )
+          .optional()
+          .describe('Hyperlinks to apply to matching text.'),
+      })
+      .optional()
+      .describe('Styling options for the element.'),
+  });
+
+  server.registerTool(
+    'slides.createFromJson',
+    {
+      description:
+        'Creates one or more slides in a presentation from a JSON blueprint. Supports optional per-slide speaker_notes that are written automatically.\n\nFORMATS: {"slides":[{"elements":[...],"speaker_notes":"..."},...]} for multiple slides, or {"elements":[...]} for a single slide.\n\nCANVAS: 720×405 pt (16:9). Origin is top-left.\n\nELEMENT TYPES: type ("text"|"shape"|"image"), position ({x,y,w,h} in points), optional content, shape_type (e.g. "RECTANGLE","TEXT_BOX"), url (images), layer (z-index).\n\nCOLOR ALIASES — IMPORTANT: Use color aliases ("primary", "surface", "text", "blue", "red", etc.) instead of hardcoded RGB values. Aliases resolve to the Google brand palette automatically: near-black headers, Google Sans font, four brand accent colors. font_family:"theme" gives you Google Sans. Hardcoding RGB bypasses the palette entirely.\n\nSPEAKER NOTES (REQUIRED): Include "speaker_notes" in each slide object of the blueprint for automatic writing. If you omit them, the response will include action_required asking you to call slides.updateSpeakerNotes for each slideId. Either approach works — inline is simpler, but a second pass lets you focus on layout first and notes second. Write ~45 seconds of spoken content per slide (4-6 sentences): opening line, key points, transition to next slide. A deck without speaker notes is incomplete.\n\nDESIGN INTENT: Let the content drive the layout. A single strong idea may need only a title and whitespace. A comparison needs two columns. Avoid defaulting to the same structure every slide — vary density, emphasis, and composition to match what each slide is communicating.\n\nCONSISTENCY: Use the same theme, ~18pt margin rhythm, and font size hierarchy throughout. Consistency in the system lets individual slides be visually distinct without feeling disconnected.\n\nLESS IS MORE: Color is for emphasis, not decoration. Most slides should be mostly white/background with dark text. Use colored elements sparingly — a thin accent line, a highlighted key metric, a section label. Not every slide needs a colored header bar. Whitespace IS the design.\n\nTECHNICAL NOTES:\n- Layers: lower values render first (backgrounds=0, boxes=1, text=2+). Missing layers cause text to be hidden behind shapes.\n- Font sizes: titles ~20-24pt bold, subheadings ~12-14pt, body ~10-12pt.\n- Text boxes clip silently — size h generously.\n\nSTYLE PROPERTIES: size, bold, italic, underline, strikethrough, align (START|CENTER|END), vertical_align (TOP|MIDDLE|BOTTOM), indent, color, bg_color, border_color, border_weight, no_border, font_family ("theme" to inherit theme font), bold_phrases, bold_until, links ([{text,url}]).\n\nCOLOR ALIASES: "primary" (#202124 near-black), "primary_text" (white), "secondary" (#1A73E8 Blue 600), "text" (#1F1F1F), "text_muted" (#444746), "surface" (Blue 50), "surface_alt" (Green 50), "background" (white). Brand colors: "blue" (#4285F4), "red" (#EA4335), "yellow" (#FBBC05), "green" (#34A853). OR use RGB 0-1 objects for one-off colors. Image URLs with unresolved placeholders are replaced with a fallback icon.',
+      inputSchema: {
+        presentationId: z
+          .string()
+          .describe('The ID or URL of the presentation to add slides to.'),
+        slideJson: z
+          .string()
+          .describe(
+            'JSON string of the slide blueprint. Use {"slides":[{"elements":[...],"speaker_notes":"..."},...]} for multiple slides or {"elements":[...]} for one slide. REQUIRED: every slide object MUST include "speaker_notes" — a string with a full talk track (what the presenter should say, not just what the slide shows). The server writes notes automatically. Omitting speaker_notes produces an unprofessional deck.',
+          ),
+      },
+    },
+    slidesService.createFromJson,
+  );
+
+  registerTool(
+    'slides.insertImageSlide',
+    {
+      description:
+        'Inserts a local image file as a new full-bleed slide into an existing presentation. Handles Drive upload and image embedding internally — no separate upload step needed. Use for inserting concept sketches or visual slides at a specific position in the deck.',
+      inputSchema: {
+        presentationId: z.string().describe('The ID or URL of the presentation.'),
+        localImagePath: z.string().describe('Absolute path to the local image file to insert as a slide.'),
+        insertionIndex: z.number().optional().describe('Zero-based index where the slide should be inserted. Omit to append at end.'),
+        label: z.string().optional().describe('Optional text label to overlay on the slide (e.g. "CONCEPT SKETCH").'),
+      },
+    },
+    slidesService.insertImageSlide,
+  );
+
   // Sheets tools
   registerTool(
     'sheets.getText',
@@ -630,6 +836,32 @@ async function main() {
     driveService.renameFile,
   );
 
+  registerTool(
+    'drive.uploadFile',
+    {
+      description:
+        'Uploads a local file to Google Drive (file stays private). Returns an OAuth-authenticated imageUrl that the Slides API can fetch directly — use this URL in slides.createFromJson image elements. Also returns the file ID and webViewLink.',
+      inputSchema: {
+        localPath: z
+          .string()
+          .describe('Absolute path to the local file to upload.'),
+        name: z
+          .string()
+          .optional()
+          .describe('Name for the file in Drive. Defaults to the local filename.'),
+        mimeType: z
+          .string()
+          .optional()
+          .describe('MIME type of the file (e.g. "image/png"). Defaults to application/octet-stream.'),
+        parentId: z
+          .string()
+          .optional()
+          .describe('Drive folder ID to upload into. Defaults to root.'),
+      },
+    },
+    driveService.uploadFile,
+  );
+
   registerTool(
     'calendar.list',
     {
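To make the slides.createFromJson input concrete, here is a minimal sketch of a one-slide blueprint serialized into the `slideJson` argument. The `SlideElement` and `Blueprint` types are hypothetical mirrors of the fields named in the tool description; color values use alias strings on the assumption that the server resolves them as the description says (the `slideElementSchema` in the diff types colors as RGB objects only). The presentation ID is a placeholder.

```typescript
// Hypothetical types mirroring the fields named in the tool description.
type ColorValue = string | { red: number; green: number; blue: number };

interface SlideElement {
  type: 'text' | 'shape' | 'image';
  content?: string;
  shape_type?: string;
  url?: string;
  layer?: number;
  position: { x: number; y: number; w: number; h: number };
  style?: {
    size?: number;
    bold?: boolean;
    color?: ColorValue;
    bg_color?: ColorValue;
    no_border?: boolean;
  };
}

interface Blueprint {
  slides: { elements: SlideElement[]; speaker_notes?: string }[];
}

const CANVAS = { w: 720, h: 405 }; // points, 16:9, origin top-left

// True when the element's box lies fully inside the canvas.
function fitsCanvas(el: SlideElement): boolean {
  const { x, y, w, h } = el.position;
  return x >= 0 && y >= 0 && x + w <= CANVAS.w && y + h <= CANVAS.h;
}

const blueprint: Blueprint = {
  slides: [
    {
      elements: [
        // Thin accent line on layer 0 so it sits behind everything else.
        {
          type: 'shape',
          shape_type: 'RECTANGLE',
          layer: 0,
          position: { x: 18, y: 18, w: 684, h: 3 },
          style: { bg_color: 'blue', no_border: true },
        },
        // Title text on a higher layer, sized generously to avoid clipping.
        {
          type: 'text',
          content: 'Q3 results at a glance',
          layer: 2,
          position: { x: 18, y: 30, w: 684, h: 48 },
          style: { size: 24, bold: true, color: 'primary' },
        },
      ],
      speaker_notes:
        'Open with the headline number, then walk through the three drivers before moving to the regional breakdown.',
    },
  ],
};

// Arguments in the shape the slides.createFromJson inputSchema describes.
const createArgs = {
  presentationId: 'PRESENTATION_ID',
  slideJson: JSON.stringify(blueprint),
};
```

Checking `fitsCanvas` before sending is a cheap client-side guard, since elements placed outside the 720×405 pt grid would not be visible on the slide.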

workspace-server/src/services/DriveService.ts

Lines changed: 72 additions & 0 deletions
@@ -639,4 +639,76 @@ export class DriveService {
       };
     }
   };
+
+  public uploadFile = async ({
+    localPath,
+    name,
+    mimeType,
+    parentId,
+  }: {
+    localPath: string;
+    name?: string;
+    mimeType?: string;
+    parentId?: string;
+  }) => {
+    logToFile(`Uploading file from ${localPath}`);
+    try {
+      const auth = await this.authManager.getAuthenticatedClient();
+      const drive = await this.getDriveClient();
+
+      const absolutePath = path.isAbsolute(localPath)
+        ? localPath
+        : path.join(PROJECT_ROOT, localPath);
+
+      if (!fs.existsSync(absolutePath)) {
+        return {
+          content: [{ type: 'text' as const, text: JSON.stringify({ error: `File not found: ${absolutePath}` }) }],
+        };
+      }
+
+      const fileName = name ?? path.basename(absolutePath);
+      const fileMime = mimeType ?? 'application/octet-stream';
+
+      const fileMetadata: drive_v3.Schema$File = { name: fileName };
+      if (parentId) fileMetadata.parents = [parentId];
+
+      const file = await drive.files.create({
+        requestBody: fileMetadata,
+        media: { mimeType: fileMime, body: fs.createReadStream(absolutePath) },
+        fields: 'id, name, webViewLink',
+        supportsAllDrives: true,
+      });
+
+      const fileId = file.data.id!;
+
+      // Ensure we have a fresh access token, then embed it in the URL so the
+      // Slides API can fetch the image without the file needing to be public.
+      const tokenResponse = await auth.getAccessToken();
+      const accessToken = tokenResponse.token;
+      const imageUrl = `https://www.googleapis.com/drive/v3/files/${fileId}?alt=media&access_token=${accessToken}`;
+
+      logToFile(`Uploaded ${fileName} (${fileId})`);
+
+      return {
+        content: [
+          {
+            type: 'text' as const,
+            text: JSON.stringify({
+              id: fileId,
+              name: file.data.name,
+              imageUrl, // use this in slides.createFromJson {"type":"image","url":imageUrl}
+              webViewLink: file.data.webViewLink,
+            }),
+          },
+        ],
+      };
+    } catch (error) {
+      const errorMessage =
+        error instanceof Error ? error.message : String(error);
+      logToFile(`Error during drive.uploadFile: ${errorMessage}`);
+      return {
+        content: [{ type: 'text' as const, text: JSON.stringify({ error: errorMessage }) }],
+      };
+    }
+  };
 }
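The OAuth-embedded URL that uploadFile returns is built by interpolating the file ID and access token into a Drive `alt=media` download link. A minimal sketch of the same construction as a standalone helper, with two additions that are assumptions rather than part of the commit: the token is checked for absence before use, and both values are percent-encoded. `buildDriveMediaUrl` is a hypothetical name.

```typescript
// Hypothetical helper; not part of the commit. Mirrors the URL shape used in
// DriveService.uploadFile, with a guard for a missing token and URL encoding.
function buildDriveMediaUrl(
  fileId: string,
  accessToken: string | null | undefined,
): string {
  if (!accessToken) {
    // Without a token the Slides API could not fetch a private file.
    throw new Error('No access token available for Drive media URL');
  }
  const id = encodeURIComponent(fileId);
  const token = encodeURIComponent(accessToken);
  return `https://www.googleapis.com/drive/v3/files/${id}?alt=media&access_token=${token}`;
}

// Example with made-up values.
const exampleUrl = buildDriveMediaUrl('abc123', 'ya29.a+b/c');
```

Failing early on a missing token avoids handing the Slides API a URL ending in `access_token=undefined`, which would otherwise surface later as an opaque image fetch failure.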
