
Commit 665e698

Merge branch 'main' into brendanmckeag-patch-9
2 parents 3b190d5 + 94eb8b0 commit 665e698

29 files changed

Lines changed: 444 additions & 100 deletions

accounts-billing/cost-centers.mdx

Lines changed: 9 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -6,6 +6,15 @@ tag: "NEW"
66

77
Cost centers let you attach billing labels to your Runpod resources to track and manage spending across your organization. By grouping your compute resources into cost centers, you can attribute charges to specific teams, projects, or departments.
88

9+
<iframe
10+
className="w-full aspect-video rounded-xl"
11+
src="https://www.youtube.com/embed/0MEYF00Kno0"
12+
title="3 Minute Runpod: Allocate GPU spend to Cost Centers for reporting and invoicing"
13+
frameBorder="0"
14+
allow="fullscreen; accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
15+
allowFullScreen
16+
></iframe>
17+
918
## Why use cost centers
1019

1120
Cost centers help answer common questions about cloud GPU spending:

docs.json

Lines changed: 1 addition & 1 deletion
@@ -391,7 +391,7 @@
391391
"flash/cli/overview",
392392
"flash/cli/init",
393393
"flash/cli/login",
394-
"flash/cli/run",
394+
"flash/cli/dev",
395395
"flash/cli/build",
396396
"flash/cli/deploy",
397397
"flash/cli/env",

flash/apps/build-app.mdx

Lines changed: 6 additions & 6 deletions
@@ -14,7 +14,7 @@ If you haven't already, we recommend starting with the [Quickstart](/flash/quick
1414

1515
- You've [created a Runpod account](/get-started/manage-accounts).
1616
- You've [created a Runpod API key](/get-started/api-keys).
17-
- You've installed [Python 3.12](https://www.python.org/downloads/).
17+
- You've installed [Python 3.10, 3.11, 3.12, or 3.13](https://www.python.org/downloads/).
1818

1919
## Step 1: Initialize a new project
2020

@@ -80,10 +80,10 @@ uv pip install -r requirements.txt
8080

8181
## Step 4: Start the local API server
8282

83-
Use `flash run` to start the API server:
83+
Use `flash dev` to start the API server:
8484

8585
```bash
86-
uv run flash run
86+
uv run flash dev
8787
```
8888

8989
Open a new terminal tab or window and test your endpoints using cURL:
@@ -100,21 +100,21 @@ curl -X POST http://localhost:8888/lb_worker/process \
100100
-d '{"input_data": {"message": "Hello from Flash"}}'
101101
```
102102

103-
If you switch back to the terminal tab where you used `flash run`, you'll see the details of the job's progress.
103+
If you switch back to the terminal tab where you used `flash dev`, you'll see the details of the job's progress.
104104

105105
### Faster testing with auto-provisioning
106106

107107
For development with multiple endpoints, use `--auto-provision` to deploy all resources before testing:
108108

109109
```bash
110-
uv run flash run --auto-provision
110+
uv run flash dev --auto-provision
111111
```
112112

113113
This eliminates cold-start delays by provisioning all serverless endpoints upfront. Endpoints are cached and reused across server restarts, making subsequent runs faster. Resources are identified by name, so the same endpoint won't be re-deployed if the configuration hasn't changed.
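
The reuse rule described above (same name, unchanged configuration, no re-deploy) can be pictured with a small sketch. This is only an illustration under stated assumptions: `ProvisionCache`, `fingerprint`, and `ensure` are hypothetical names, and nothing here reflects how Flash actually stores or compares endpoint state.

```python
import hashlib
import json


class ProvisionCache:
    """Hypothetical cache keyed by endpoint name (not Flash's real implementation)."""

    def __init__(self):
        self._fingerprints = {}  # endpoint name -> configuration fingerprint
        self.deploy_count = 0    # number of simulated deploys

    @staticmethod
    def fingerprint(config: dict) -> str:
        # Sort keys so that two dicts with the same content hash identically.
        canonical = json.dumps(config, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def ensure(self, name: str, config: dict) -> bool:
        """Return True if a (simulated) deploy happened, False if the endpoint was reused."""
        fp = self.fingerprint(config)
        if self._fingerprints.get(name) == fp:
            return False  # same name, same configuration: reuse
        self._fingerprints[name] = fp
        self.deploy_count += 1
        return True
```

Calling `ensure` twice with the same name and an identical configuration triggers only one simulated deploy, while changing any configuration value triggers another.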
114114

115115
## Step 5: Open the API explorer
116116

117-
Besides starting the API server, `flash run` also starts an interactive API explorer. Point your web browser at [http://localhost:8888/docs](http://localhost:8888/docs) to explore the API.
117+
Besides starting the API server, `flash dev` also starts an interactive API explorer. Point your web browser at [http://localhost:8888/docs](http://localhost:8888/docs) to explore the API.
118118

119119
To run endpoint functions in the explorer:
120120

flash/apps/customize-app.mdx

Lines changed: 4 additions & 4 deletions
@@ -145,13 +145,13 @@ For details, see:
145145

146146
## Test your customizations
147147

148-
After customizing your app, test locally with `flash run`:
148+
After customizing your app, test locally with `flash dev`:
149149

150150
```bash
151-
flash run
151+
flash dev
152152

153153
# If using uv:
154-
uv run flash run
154+
uv run flash dev
155155
```
156156

157157
This starts a development server at http://localhost:8888 with:
@@ -169,7 +169,7 @@ Make sure to test:
169169

170170
<CardGroup cols={2}>
171171
<Card title="Test locally" href="/flash/apps/local-testing" icon="flask" horizontal>
172-
Use `flash run` for local development and testing.
172+
Use `flash dev` for local development and testing.
173173
</Card>
174174
<Card title="Deploy to Runpod" href="/flash/apps/deploy-apps" icon="rocket" horizontal>
175175
Deploy your application to production with `flash deploy`.

flash/apps/deploy-apps.mdx

Lines changed: 142 additions & 0 deletions
@@ -275,6 +275,7 @@ The `flash_manifest.json` file is the brain of your deployment. It tells each en
275275
- Which functions to execute.
276276
- What Docker image to use.
277277
- How to configure resources (GPUs, workers, scaling).
278+
- Environment variables for workers.
278279
- How to route HTTP requests (for load balancer endpoints).
279280

280281
```json
@@ -293,6 +294,10 @@ The `flash_manifest.json` file is the brain of your deployment. It tells each en
293294
"imageName": "runpod/flash:latest",
294295
"gpuIds": "AMPERE_16",
295296
"workersMax": 3,
297+
"env": {
298+
"HF_TOKEN": "your_token",
299+
"MODEL_ID": "gpt2"
300+
},
296301
"functions": [
297302
{"name": "gpu_hello", "module": "gpu_worker"}
298303
]
@@ -329,6 +334,143 @@ When one endpoint needs to call a function on another endpoint:
329334

330335
Each endpoint maintains its own connection to the state manager, querying for peer endpoint URLs as needed and caching results for 300 seconds to minimize API calls.
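
The 300-second caching behavior can be sketched as a simple time-to-live cache. This is a minimal illustration, assuming a hypothetical `lookup` callable that stands in for the state manager query. The class name `PeerUrlCache` and its parameters are not part of Flash.

```python
import time


class PeerUrlCache:
    """Hypothetical TTL cache for peer endpoint URLs (illustrative only)."""

    def __init__(self, lookup, ttl_seconds=300.0, clock=time.monotonic):
        self._lookup = lookup    # stand-in for a state manager query
        self._ttl = ttl_seconds  # 300 seconds, as described above
        self._clock = clock      # injectable clock, useful for testing
        self._entries = {}       # endpoint name -> (url, expires_at)

    def get(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: no lookup needed
        url = self._lookup(name)
        self._entries[name] = (url, now + self._ttl)
        return url
```

Repeated calls to `get` within the TTL window return the cached URL, and the lookup runs again only once the entry has expired.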
331336

337+
#### Calling another endpoint from your code
338+
339+
To call one endpoint from another, import the target endpoint function **inside** your function body. Flash automatically detects these imports and generates the necessary dispatch stubs.
340+
341+
For example, if you have a GPU worker for inference:
342+
343+
```python gpu_worker.py
344+
from runpod_flash import Endpoint, GpuType
345+
346+
@Endpoint(
347+
name="gpu-inference",
348+
gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
349+
dependencies=["torch"]
350+
)
351+
async def gpu_inference(payload: dict) -> dict:
352+
import torch
353+
# GPU inference logic
354+
return {"result": "processed"}
355+
```
356+
357+
You can call it from a CPU-based pipeline endpoint:
358+
359+
```python cpu_worker.py
360+
from runpod_flash import Endpoint
361+
362+
@Endpoint(name="pipeline", cpu="cpu5c-4-8")
363+
async def classify(text: str) -> dict:
364+
# Import the GPU endpoint inside the function body
365+
from gpu_worker import gpu_inference
366+
367+
# Flash routes this call to the gpu-inference endpoint
368+
result = await gpu_inference({"text": text})
369+
return {"classification": result}
370+
```
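
To make the "import inside the function body" requirement concrete, the following sketch uses Python's standard `ast` module to find imports that appear inside function bodies. It only illustrates the general idea of static import detection. The helper `imports_inside_functions` is hypothetical, and Flash's actual detection logic may work differently.

```python
import ast


def imports_inside_functions(source: str) -> dict:
    """Map each function name to the modules imported inside its body."""
    found = {}
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            modules = []
            for inner in ast.walk(node):
                if isinstance(inner, ast.ImportFrom) and inner.module:
                    modules.append(inner.module)
                elif isinstance(inner, ast.Import):
                    modules.extend(alias.name for alias in inner.names)
            if modules:
                found[node.name] = modules
    return found


# The source is only parsed, never executed, so runpod_flash need not be installed.
SOURCE = '''
from runpod_flash import Endpoint

@Endpoint(name="pipeline", cpu="cpu5c-4-8")
async def classify(text: str) -> dict:
    from gpu_worker import gpu_inference
    result = await gpu_inference({"text": text})
    return {"classification": result}
'''

print(imports_inside_functions(SOURCE))  # {'classify': ['gpu_worker']}
```

The module-level `runpod_flash` import is not reported, because only imports nested inside a function definition are collected.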
371+
372+
## Call deployed endpoints from scripts
373+
374+
After deploying your Flash app, you can call your `@Endpoint` functions directly from Python scripts. Flash automatically resolves the app context from your project structure, so in most cases you can run scripts without any additional configuration.
375+
376+
### How it works
377+
378+
When you run a script that calls an `@Endpoint` function, Flash:
379+
380+
1. Detects the app context from the project directory structure.
381+
2. Looks up the deployed endpoint by name within the resolved app and environment.
382+
3. Routes the request to that endpoint using Flash's sentinel service.
383+
4. Returns the result to your script.
384+
385+
This lets you reuse the same `@Endpoint` function definitions to interact with deployed endpoints without modifying your code.
386+
387+
### Example: calling within the same script
388+
389+
The simplest approach is to call the endpoint directly in the same file where it's defined:
390+
391+
```python
392+
# gpu_worker.py
393+
import asyncio
394+
from runpod_flash import Endpoint, GpuType
395+
396+
@Endpoint(
397+
name="inference",
398+
gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
399+
dependencies=["torch"]
400+
)
401+
async def run_inference(data: dict) -> dict:
402+
import torch
403+
# Inference logic
404+
return {"result": "processed"}
405+
406+
async def main():
407+
result = await run_inference({"input": "data"})
408+
print(result)
409+
410+
if __name__ == "__main__":
411+
asyncio.run(main())
412+
```
413+
414+
Run the script:
415+
416+
```bash
417+
python gpu_worker.py
418+
```
419+
420+
### Example: importing from another script
421+
422+
You can also import and call endpoints from a separate script:
423+
424+
```python
425+
# call_inference.py
426+
import asyncio
427+
from gpu_worker import run_inference
428+
429+
async def main():
430+
# Flash resolves the app context automatically
431+
result = await run_inference({"input": "data"})
432+
print(result)
433+
434+
if __name__ == "__main__":
435+
asyncio.run(main())
436+
```
437+
438+
Run the script:
439+
440+
```bash
441+
python call_inference.py
442+
```
443+
444+
### Override the resolved context
445+
446+
Flash resolves the app name from your project's directory structure. Use `FLASH_APP` and `FLASH_ENV` environment variables to override this automatic resolution when needed.
447+
448+
A common use case is when you move a script to a different directory. Since the resolved app name depends on the directory location, moving the script changes the resolved context. To continue targeting the original app, set `FLASH_APP` explicitly:
449+
450+
```bash
451+
FLASH_APP=my-app python call_inference.py
452+
```
453+
454+
You can also override the environment:
455+
456+
```bash
457+
FLASH_APP=my-app FLASH_ENV=production python call_inference.py
458+
```
459+
460+
### Error without context
461+
462+
If Flash cannot resolve the app context and you haven't set the environment variables, it raises an error:
463+
464+
```text
465+
RuntimeError: no flash context for endpoint 'inference'. either:
466+
- use 'flash dev' for local development
467+
- set FLASH_APP and FLASH_ENV to target a deployed environment
468+
```
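
The override and error behavior described in the two sections above can be summarized as a precedence rule: explicit environment variables win over values detected from the directory, and a missing context raises an error. The sketch below is an assumption-laden illustration. `resolve_context`, its parameters, and its error text are hypothetical and are not Flash's API or its exact message.

```python
import os


def resolve_context(detected_app=None, detected_env=None, environ=None):
    """Hypothetical helper: FLASH_APP / FLASH_ENV override directory-detected values."""
    environ = os.environ if environ is None else environ
    app = environ.get("FLASH_APP") or detected_app
    env = environ.get("FLASH_ENV") or detected_env
    if not app or not env:
        # Mirrors the spirit of the error shown above, not its exact wording.
        raise RuntimeError(
            "no flash context: set FLASH_APP and FLASH_ENV, "
            "or run from inside the project directory"
        )
    return app, env
```

For example, `resolve_context("dir-app", "dev", environ={"FLASH_APP": "my-app"})` returns `("my-app", "dev")`, while calling it with no detected values and an empty environment raises `RuntimeError`.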
469+
470+
### Automatic context in deployed workers
471+
472+
When Flash deploys your app, it automatically sets `FLASH_APP` and `FLASH_ENV` environment variables on each worker. This enables cross-endpoint communication within your deployed application without additional configuration.
473+
332474
## Troubleshooting
333475

334476
### No @Endpoint functions found

flash/apps/initialize-project.mdx

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@ import { LoadBalancingEndpointsTooltip, QueueBasedEndpointsTooltip } from "/snip
88

99
The `flash init` command creates a new Flash project with a complete project structure, including example <LoadBalancingEndpointsTooltip /> and <QueueBasedEndpointsTooltip />, and configuration files. This gives you a working starting point for building Flash applications.
1010

11-
Use `flash init` whenever you want to start a new Flash project, fully configured for you to run `flash run` and `flash deploy`.
11+
Use `flash init` whenever you want to start a new Flash project, fully configured for you to run `flash dev` and `flash deploy`.
1212

1313
## Create a new project
1414

@@ -105,13 +105,13 @@ Once your project is set up:
105105

106106
```bash
107107
# Start the development server
108-
flash run
108+
flash dev
109109

110110
# Open the API explorer
111111
# http://localhost:8888/docs
112112

113113
# If using uv:
114-
uv run flash run
114+
uv run flash dev
115115
```
116116

117117
Make changes to your worker files, and the server reloads automatically. When you're ready, deploy with:
@@ -126,6 +126,6 @@ uv run flash deploy
126126
## Next steps
127127

128128
- [Customize your app](/flash/apps/customize-app) to add endpoints and modify configurations.
129-
- [Test locally](/flash/apps/local-testing) with `flash run`.
129+
- [Test locally](/flash/apps/local-testing) with `flash dev`.
130130
- [Deploy to production](/flash/apps/deploy-apps) with `flash deploy`.
131131
- [View the flash init reference](/flash/cli/init) for all options.
