Commit 2017acd

Merge pull request #744 from kvz/add-transloadit-media-processing
Add transloadit-media-processing skill
2 parents a279021 + ef1cdcd commit 2017acd

2 files changed: +195 -0 lines changed

docs/README.skills.md

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ Skills differ from other primitives by supporting bundled assets (scripts, code
| [snowflake-semanticview](../skills/snowflake-semanticview/SKILL.md) | Create, alter, and validate Snowflake semantic views using Snowflake CLI (snow). Use when asked to build or troubleshoot semantic views/semantic layer definitions with CREATE/ALTER SEMANTIC VIEW, to validate semantic-view DDL against Snowflake via CLI, or to guide Snowflake CLI installation and connection setup. | None |
| [sponsor-finder](../skills/sponsor-finder/SKILL.md) | Find which of a GitHub repository's dependencies are sponsorable via GitHub Sponsors. Uses deps.dev API for dependency resolution across npm, PyPI, Cargo, Go, RubyGems, Maven, and NuGet. Checks npm funding metadata, FUNDING.yml files, and web search. Verifies every link. Shows direct and transitive dependencies with OSSF Scorecard health data. Invoke with /sponsor followed by a GitHub owner/repo (e.g. "/sponsor expressjs/express"). | None |
| [terraform-azurerm-set-diff-analyzer](../skills/terraform-azurerm-set-diff-analyzer/SKILL.md) | Analyze Terraform plan JSON output for AzureRM Provider to distinguish between false-positive diffs (order-only changes in Set-type attributes) and actual resource changes. Use when reviewing terraform plan output for Azure resources like Application Gateway, Load Balancer, Firewall, Front Door, NSG, and other resources with Set-type attributes that cause spurious diffs due to internal ordering changes. | `references/azurerm_set_attributes.json`<br />`references/azurerm_set_attributes.md`<br />`scripts/.gitignore`<br />`scripts/README.md`<br />`scripts/analyze_plan.py` |
+| [transloadit-media-processing](../skills/transloadit-media-processing/SKILL.md) | Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale. | None |
| [vscode-ext-commands](../skills/vscode-ext-commands/SKILL.md) | Guidelines for contributing commands in VS Code extensions. Indicates naming convention, visibility, localization and other relevant attributes, following VS Code extension development guidelines, libraries and good practices | None |
| [vscode-ext-localization](../skills/vscode-ext-localization/SKILL.md) | Guidelines for proper localization of VS Code extensions, following VS Code extension development guidelines, libraries and good practices | None |
| [web-design-reviewer](../skills/web-design-reviewer/SKILL.md) | This skill enables visual inspection of websites running locally or remotely to identify and fix design issues. Triggers on requests like "review website design", "check the UI", "fix the layout", "find design problems". Detects issues with responsive design, accessibility, visual consistency, and layout breakage, then performs fixes at the source code level. | `references/framework-fixes.md`<br />`references/visual-checklist.md` |

skills/transloadit-media-processing/SKILL.md

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
---
name: transloadit-media-processing
description: 'Process media files (video, audio, images, documents) using Transloadit. Use when asked to encode video to HLS/MP4, generate thumbnails, resize or watermark images, extract audio, concatenate clips, add subtitles, OCR documents, or run any media processing pipeline. Covers 86+ processing robots for file transformation at scale.'
license: MIT
compatibility: Requires a free Transloadit account (https://transloadit.com/signup). Uses the @transloadit/mcp-server MCP server or the @transloadit/node CLI.
---

# Transloadit Media Processing

Process, transform, and encode media files using Transloadit's cloud infrastructure.
Supports video, audio, images, and documents with 86+ specialized processing robots.

## When to Use This Skill

Use this skill when you need to:

- Encode video to HLS, MP4, WebM, or other formats
- Generate thumbnails or animated GIFs from video
- Resize, crop, watermark, or optimize images
- Convert between image formats (JPEG, PNG, WebP, AVIF, HEIF)
- Extract or transcode audio (MP3, AAC, FLAC, WAV)
- Concatenate video or audio clips
- Add subtitles or overlay text on video
- OCR documents (PDF, scanned images)
- Run speech-to-text or text-to-speech
- Apply AI-based content moderation or object detection
- Build multi-step media pipelines that chain operations together

## Setup

### Option A: MCP Server (recommended for Copilot)

Add the Transloadit MCP server to your IDE config. This gives the agent direct access
to Transloadit tools (`create_template`, `create_assembly`, `list_assembly_notifications`, etc.).

**VS Code / GitHub Copilot** (`.vscode/mcp.json` or user settings):

```json
{
  "servers": {
    "transloadit": {
      "command": "npx",
      "args": ["-y", "@transloadit/mcp-server", "stdio"],
      "env": {
        "TRANSLOADIT_KEY": "YOUR_AUTH_KEY",
        "TRANSLOADIT_SECRET": "YOUR_AUTH_SECRET"
      }
    }
  }
}
```

Get your API credentials at https://transloadit.com/c/-/api-credentials

### Option B: CLI

If you prefer running commands directly:

```bash
npx -y @transloadit/node assemblies create \
  --steps '{"encoded": {"robot": "/video/encode", "use": ":original", "preset": "hls-1080p"}}' \
  --wait \
  --input ./my-video.mp4
```
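
When the steps object is generated by a script, quoting the `--steps` argument by hand is an easy place to slip. The following is a minimal Python sketch, assuming the CLI accepts exactly the flags shown in the command above (`--steps`, `--wait`, `--input`). The helper name `build_assembly_command` is hypothetical, and the function only builds the argument list; it does not run anything.

```python
import json


def build_assembly_command(steps, input_path, wait=True):
    """Build the argument list for the `assemblies create` call shown above.

    Hypothetical helper: it uses only the flags that appear in the
    example command (--steps, --wait, --input).
    """
    cmd = [
        "npx", "-y", "@transloadit/node", "assemblies", "create",
        "--steps", json.dumps(steps),  # serialize the steps object as one argument
    ]
    if wait:
        cmd.append("--wait")
    cmd += ["--input", input_path]
    return cmd


steps = {
    "encoded": {
        "robot": "/video/encode",
        "use": ":original",
        "preset": "hls-1080p",
    }
}
command = build_assembly_command(steps, "./my-video.mp4")
print(command)
```

Handing the list to `subprocess.run(command, check=True)` would sidestep shell quoting, because each element reaches the process as a separate argument.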

## Core Workflows

### Encode Video to HLS (Adaptive Streaming)

```json
{
  "steps": {
    "encoded": {
      "robot": "/video/encode",
      "use": ":original",
      "preset": "hls-1080p"
    }
  }
}
```

### Generate Thumbnails from Video

```json
{
  "steps": {
    "thumbnails": {
      "robot": "/video/thumbs",
      "use": ":original",
      "count": 8,
      "width": 320,
      "height": 240
    }
  }
}
```

### Resize and Watermark Images

```json
{
  "steps": {
    "resized": {
      "robot": "/image/resize",
      "use": ":original",
      "width": 1200,
      "height": 800,
      "resize_strategy": "fit"
    },
    "watermarked": {
      "robot": "/image/resize",
      "use": "resized",
      "watermark_url": "https://example.com/logo.png",
      "watermark_position": "bottom-right",
      "watermark_size": "15%"
    }
  }
}
```

### OCR a Document

```json
{
  "steps": {
    "recognized": {
      "robot": "/document/ocr",
      "use": ":original",
      "provider": "aws",
      "format": "text"
    }
  }
}
```

### Concatenate Audio Clips

```json
{
  "steps": {
    "imported": {
      "robot": "/http/import",
      "url": ["https://example.com/clip1.mp3", "https://example.com/clip2.mp3"]
    },
    "concatenated": {
      "robot": "/audio/concat",
      "use": "imported",
      "preset": "mp3"
    }
  }
}
```

## Multi-Step Pipelines

Steps can be chained using the `"use"` field. Each step references a previous step's output:

```json
{
  "steps": {
    "resized": {
      "robot": "/image/resize",
      "use": ":original",
      "width": 1920
    },
    "optimized": {
      "robot": "/image/optimize",
      "use": "resized"
    },
    "exported": {
      "robot": "/s3/store",
      "use": "optimized",
      "bucket": "my-bucket",
      "path": "processed/${file.name}"
    }
  }
}
```
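
A misspelled step name in `"use"` is an easy mistake in a chained pipeline. As a rough local check before submitting, the sketch below walks a steps object and reports `use` values that point at no defined step. It assumes `use` is either a single step name or a list of step names; any other shape is reported, not interpreted. The function name `check_use_references` is hypothetical and is not part of any Transloadit tool.

```python
ORIGINAL = ":original"


def check_use_references(steps):
    """Return a list of problems found in the `use` fields of a steps object.

    Assumption: `use` is a step name or a list of step names.
    """
    problems = []
    for name, config in steps.items():
        use = config.get("use")
        if use is None:
            continue  # e.g. an import step that takes no input from another step
        targets = [use] if isinstance(use, str) else use
        if not isinstance(targets, list):
            problems.append(f"{name}: unsupported 'use' value {use!r}")
            continue
        for target in targets:
            if not isinstance(target, str):
                problems.append(f"{name}: unsupported 'use' entry {target!r}")
            elif target == ORIGINAL:
                continue  # the uploaded input file is always available
            elif target == name:
                problems.append(f"{name}: step uses itself")
            elif target not in steps:
                problems.append(f"{name}: unknown step {target!r}")
    return problems


pipeline = {
    "resized": {"robot": "/image/resize", "use": ":original", "width": 1920},
    "optimized": {"robot": "/image/optimize", "use": "resized"},
    "exported": {
        "robot": "/s3/store",
        "use": "optimized",
        "bucket": "my-bucket",
        "path": "processed/${file.name}",
    },
}
print(check_use_references(pipeline))  # []

broken = {"optimized": {"robot": "/image/optimize", "use": "resised"}}
print(check_use_references(broken))  # ["optimized: unknown step 'resised'"]
```

The check is purely structural: it says nothing about whether a robot accepts the file type produced by the step it consumes.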

## Key Concepts

- **Assembly**: A single processing job. Created via `create_assembly` (MCP) or `assemblies create` (CLI).
- **Template**: A reusable set of steps stored on Transloadit. Created via `create_template` (MCP) or `templates create` (CLI).
- **Robot**: A processing unit (e.g., `/video/encode`, `/image/resize`). See full list at https://transloadit.com/docs/transcoding/
- **Steps**: JSON object defining the pipeline. Each key is a step name, each value configures a robot.
- **`:original`**: Refers to the uploaded input file.

## Tips

- Use `--wait` with the CLI to block until processing completes.
- Use `preset` values (e.g., `"hls-1080p"`, `"mp3"`, `"webp"`) for common format targets instead of specifying every parameter.
- Chain `"use": "step_name"` to build multi-step pipelines without intermediate downloads.
- For batch processing, use `/http/import` to pull files from URLs; other import robots (e.g., `/s3/import`) cover sources such as S3, GCS, Azure, FTP, and Dropbox.
- Templates can include `${variables}` for dynamic values passed at assembly creation time.
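
Before saving a steps object as a template, it can help to list which `${...}` placeholders it contains. The sketch below is only a local inspection aid and assumes placeholders follow the `${name}` form used in this file. The substitution itself is performed by Transloadit, and the helper name `find_placeholders` is hypothetical.

```python
import re

# Matches ${name} and captures the text between the braces.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def find_placeholders(value):
    """Collect the names inside ${...} placeholders anywhere in a steps object."""
    if isinstance(value, str):
        return set(PLACEHOLDER.findall(value))
    if isinstance(value, dict):
        value = list(value.values())  # only values are scanned, not keys
    if isinstance(value, list):
        found = set()
        for item in value:
            found |= find_placeholders(item)
        return found
    return set()  # numbers, booleans, None


steps = {
    "exported": {
        "robot": "/s3/store",
        "use": "optimized",
        "bucket": "my-bucket",
        "path": "processed/${file.name}",
    }
}
print(find_placeholders(steps))  # {'file.name'}
```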