* feat: implicit flash endpoint resolution via sentinel headers
* refactor: drop flash.toml support in favor of env vars
* feat: rename flash run to flash dev, require explicit context for remote calls
* feat: rename flash run to flash dev, require explicit context for remote calls
* refactor: clean up CLI output formatting
* refactor: establish consistent color palette across CLI
* refactor: flatten deploy output, remove nesting
* refactor: use tree chars for deploy endpoint listing
* fix: handle hyphenated directory names in flash dev codegen
* refactor: route worker logs through print() instead of logging
* fix: improve worker log filtering and add color to runtime output
* fix: indent user stdout under request, print before completion line
* fix: worker log filters now handle timezone offsets and JSON-wrapped messages
* fix: drop Rich Status spinner for pull progress
* fix: duplicate logs on subsequent requests to warm workers
* feat: redesign dev console lifecycle output
* fix: strip 'live-' prefix from endpoint names in dev console output
* feat: redesign flash dev startup and shutdown output
* fix: detect duplicate endpoint names across files in manifest builder
* fix: clean up flash dev startup route table
* feat: redesign flash deploy output
* fix: standardize spinner styles and add completion lines
* feat: add upload progress bar to flash deploy
* feat: redesign flash app and env command output
* feat: add column headers to app and env list/get output
* feat: simplify app list and env list output
* feat: redesign undeploy command output
* feat: G1a log format for flash dev runtime
* fix: align name columns in dev console output
* fix: use resource_name not name on WorkerInfo
* fix: set_name_width in generated server.py not parent process
* fix: catch remote execution errors in dev server route handlers
* Update pyproject.toml
* style: run ruff format
* fix: lint errors (F541 f-string, F401 unused import)
* fix: unused variable lint errors
* fix: update tests to match new CLI output format
* fix: set FLASH_IS_LIVE_PROVISIONING in integration tests
* fix: set .name on mock resources in LB and live serverless tests
* fix: set FLASH_IS_LIVE_PROVISIONING in concurrency integration tests
* fix: pad empty sentinel input to prevent runpod dropping input field
* style: format
* fix: remove unused os import
* fix: update handler generator tests for empty input acceptance
* fix: skip sentinel for client-mode endpoints, update empty input tests
* fix: keep sentinel for client endpoints, set live provisioning in image-mode test
* fix: live provisioning only in flash dev, guard fallback path
* fix: use Live resource classes for all non-deploy contexts
* fix: catch sentinel timeout with clear error message, 30s default
* fix: sentinel timeout 90s
* fix: update _is_live_provisioning tests for new default behavior
* fix: address PR review feedback (docstrings, response validation, timeout config, LB error handling)
* docs: rename flash run to flash dev, update execution model, add sentinel env vars
* fix: class sentinel uses plain kwargs instead of cloudpickle round-trip, skip self in arg mapping
* fix: class sentinel maps positional args via method_ref, keep cloudpickle
* fix: always pop method key from class handler input, update test assertion
README.md: 53 additions & 44 deletions
@@ -1,55 +1,46 @@
 # Flash
 
-Flash is a Python SDK for developing cloud-native AI apps where you define everything—hardware, remote functions, and dependencies—using local code.
+Flash is a Python SDK for developing cloud-native AI apps where you define everything -- hardware, remote functions, and dependencies -- using local code.
-Write `@Endpoint` decorated Python functions on your local machine. Run them, and Flash automatically handles GPU/CPU provisioning and worker scaling on [Runpod Serverless](https://docs.runpod.io/serverless/overview).
+Write `@Endpoint` decorated Python functions on your local machine. Deploy them with `flash deploy`, then call them by running the same script. Flash handles GPU/CPU provisioning and worker scaling on [RunPod Serverless](https://docs.runpod.io/serverless/overview).
 
 ## Setup
 
 ### Install Flash
 
-Install Flash using `pip` or `uv`:
-
 ```bash
-# Install with pip
 pip install runpod-flash
-
-# Or uv
+# or
 uv add runpod-flash
 ```
 
-Flash requires [Python 3.10+](https://www.python.org/downloads/), and is currently available for macOS and Linux. Windows support is in development.
+Flash requires [Python 3.10+](https://www.python.org/downloads/) on macOS or Linux. Windows support is in development.
 
 ### Authentication
 
-Before you can use Flash, you need to authenticate with your Runpod account:
-
 ```bash
 flash login
 ```
 
-This saves your API key securely and allows you to use the Flash CLI and run `@Endpoint` functions.
+This saves your API key and allows you to use the Flash CLI and call `@Endpoint` functions.
 
 ### Coding agent integration (optional)
 
-Install the Flash skill package for AI coding agents like Claude Code, Cline, and Cursor:
-
 ```bash
 npx skills add runpod/skills
 ```
@@ -71,18 +62,12 @@ from runpod_flash import Endpoint, GpuType
First run takes 30-60 seconds (provisioning). Subsequent runs take 2-3 seconds.
+## How it works
+
+Flash has two modes: **deploy** and **dev**.
+
+### Deploy and run (`flash deploy` + `python script.py`)
+
+Deploy packages your code and provisions endpoints on RunPod. After deploying, run your script directly and Flash routes calls to your deployed endpoints via implicit resolution:
+Flash resolves endpoints by matching the app name (defaults to the current directory name) and environment (defaults to `production`). Configure with env vars or `.env`:
+
+```bash
+FLASH_APP=my-project # defaults to current directory name
+FLASH_ENV=staging # defaults to "production"
+```
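The defaulting rules described in the added README text can be sketched in Python. This is only an illustration under stated assumptions, not Flash's actual resolution code: the helper name `resolve_target` is hypothetical, `.env` loading is left out, and treating an empty variable as unset is a guess. Only the `FLASH_APP` and `FLASH_ENV` names and their documented defaults come from the diff.

```python
from pathlib import Path
from typing import Mapping, Tuple


def resolve_target(environ: Mapping[str, str], cwd: Path) -> Tuple[str, str]:
    """Return (app_name, environment) using the documented defaults.

    Hypothetical helper for illustration; Flash's real logic may differ.
    """
    # FLASH_APP falls back to the current directory name.
    app_name = environ.get("FLASH_APP") or cwd.name
    # FLASH_ENV falls back to "production".
    env_name = environ.get("FLASH_ENV") or "production"
    return app_name, env_name


# No variables set: both documented defaults apply.
defaults = resolve_target({}, Path("/work/my-project"))

# Both variables set: explicit values take precedence.
overridden = resolve_target(
    {"FLASH_APP": "billing", "FLASH_ENV": "staging"},
    Path("/work/my-project"),
)
```

Passing the environment mapping and working directory as arguments, rather than reading `os.environ` and `Path.cwd()` inside the function, keeps the sketch easy to test; a real caller would pass `os.environ` and `Path.cwd()`.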
+
+### Dev mode (`flash dev`)
+
+For local development and testing, `flash dev` starts a hybrid dev server that runs your FastAPI app locally while provisioning live ephemeral workers on RunPod:
+
+```bash
+flash dev # starts local server + provisions workers
+flash dev --port 3000 # custom port
+flash dev --auto-provision # provision all endpoints at startup
+```
 
 ## What Flash does
 
-- **Remote execution**: `@Endpoint` functions run on Runpod Serverless GPUs/CPUs
-- **Auto-scaling**: Workers scale from 0 to N based on demand
-- **Dependency management**: Packages install automatically on remote workers
-- **Two patterns**: Queue-based (`@Endpoint`) for batch work, load-balanced (`Endpoint()` + routes) for REST APIs
+- **Remote execution**: `@Endpoint` functions run on RunPod Serverless GPUs/CPUs
+- **Implicit endpoint resolution**: `python script.py` routes to deployed endpoints automatically
+- **Auto-scaling**: workers scale from 0 to N based on demand
+- **Dependency management**: packages install automatically on remote workers
+- **Two patterns**: queue-based (`@Endpoint`) for batch work, load-balanced (`Endpoint()` + routes) for REST APIs
 - **Concurrency control**: `max_concurrency` lets each worker process multiple jobs simultaneously
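To make the concurrency bullet concrete, here is a generic sketch of what a per-worker concurrency cap means, using a plain `asyncio.Semaphore`. It does not use or describe Flash's internals: `run_jobs`, the doubling "work", and the sleep are all hypothetical stand-ins, and only the parameter name `max_concurrency` is borrowed from the README.

```python
import asyncio
from typing import Iterable, List, Tuple


async def run_jobs(jobs: Iterable[int], max_concurrency: int) -> Tuple[List[int], int]:
    """Run all jobs, never more than max_concurrency at once.

    Returns the results in input order and the peak number of jobs in flight.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def run_one(job: int) -> int:
        nonlocal in_flight, peak
        async with semaphore:  # waits while the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for real work
            in_flight -= 1
            return job * 2

    results = await asyncio.gather(*(run_one(job) for job in jobs))
    return list(results), peak


results, peak = asyncio.run(run_jobs(range(6), max_concurrency=2))
```

With six jobs and a cap of two, the jobs overlap in pairs instead of running one at a time, which is the general effect the bullet attributes to `max_concurrency`.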
 
 ## Documentation
@@ -126,47 +139,43 @@ Full documentation: **[docs.runpod.io/flash](https://docs.runpod.io/flash)**
 
 - [Quickstart](https://docs.runpod.io/flash/quickstart) - First GPU workload in 5 minutes
 - [Create endpoints](https://docs.runpod.io/flash/endpoint-functions) - Queue-based, load-balancing, and custom Docker endpoints
 - [Configuration](https://docs.runpod.io/flash/configuration/parameters) - All endpoint parameters
 
 ## Flash apps
 
-When you're ready to move beyond scripts and build a production-ready API, you can create a [Flash app](https://docs.runpod.io/flash/apps/overview) (a collection of interconnected endpoints with diverse hardware configurations) and deploy it to Runpod.
+When you're ready to move beyond scripts and build a production-ready API, you can create a [Flash app](https://docs.runpod.io/flash/apps/overview) (a collection of interconnected endpoints with diverse hardware configurations) and deploy it to RunPod.
 
 [Follow this tutorial to build your first Flash app](https://docs.runpod.io/flash/apps/build-app).
 
 ## Flash CLI
 
-The Flash CLI provides a set of commands for managing your Flash apps and endpoints.
-
 ```bash
 flash --help
 ```
 
 [Learn more about the Flash CLI](https://docs.runpod.io/flash/cli/overview).
 
-
 ## Examples
 
 Browse working examples: **[github.com/runpod/flash-examples](https://github.com/runpod/flash-examples)**
 
 ## Requirements
 
-- Python 3.12
+- Python 3.10-3.12
 - macOS or Linux (Windows support in development)
-- A [Runpod account](https://runpod.io/console) (email must be verified) with an API key
+- A [RunPod account](https://runpod.io/console) (email must be verified) with an API key
 
 ## Contributing
 
 We welcome contributions! See [RELEASE_SYSTEM.md](RELEASE_SYSTEM.md) for development workflow.