Commit e837b26

[Blog] Case study: how EA uses dstack to fast-track AI development (#2682)
1 parent 056eb71 commit e837b26

1 file changed: +87 -0 lines changed

docs/blog/posts/ea-gtc25.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
1+
---
2+
title: "Case study: how EA uses dstack to fast-track AI development"
3+
date: 2025-05-22
4+
description: "TBA"
5+
slug: ea-gtc25
6+
image: https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-2-background-min.png
7+
categories:
8+
- Case studies
9+
- NVIDIA
10+
---
11+
12+
# Case study: how EA uses dstack to fast-track AI development
13+
14+
At NVIDIA GTC 2025, Electronic Arts [shared :material-arrow-top-right-thin:{ .external }](https://www.nvidia.com/en-us/on-demand/session/gtc25-s73667/){:target="_blank"} how they’re scaling AI development and managing infrastructure across teams. They highlighted using tools like `dstack` to provision GPUs quickly, flexibly, and cost-efficiently. This case study summarizes key insights from their talk.
15+
16+
<img src="https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-1.png" width="630" />
17+
18+
EA has more than 100 AI projects running, and the number keeps growing. Many teams have AI needs (game developers, ML engineers, AI researchers, and platform teams), all supported by a central tech team. Some need full MLOps support; others have in-house expertise but need flexible tooling and infrastructure.
19+
20+
<!-- more -->
21+
22+
The central tech team ensures all teams have what they require, including tools, infrastructure, and expertise.
23+
24+
<!-- <img src="https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-1-1.png" width="630" style="border: 0.5px dotted black"/> -->
25+
26+
As EA’s AI efforts grew, they faced major challenges:
27+
28+
* **Tool fragmentation**: Teams used different tools and workflows, leading to duplicated effort and poor collaboration.
29+
* **High GPU costs**: Spinning up GPUs could take days or weeks. To avoid delays, teams often left machines running idle, increasing costs.
30+
* **Heavy engineering burden**: ML engineers spent time managing infrastructure—setting up clusters, configuring environments, and deploying models—instead of building AI.
31+
32+
The typical AI workflow at EA includes:
33+
34+
1. Development and training
35+
2. Model storage and distribution
36+
3. Serving and scaling
37+
38+
Each stage comes with its own scaling challenges, from inefficient GPU provisioning to fragmented tooling and complex project setups.
39+
40+
<img src="https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-2.png" width="630" style="border: 0.5px dotted black"/>
41+
42+
EA's centralized approach uses these core ML tools:
43+
44+
* `dstack` – for provisioning compute for AI workloads at scale, covering everything related to ML development and training
45+
* ML Artifactory – for managing artifacts at scale
46+
* AXS (Kubernetes+) – for scalable inference and production serving
47+
48+
EA uses `dstack` to streamline GPU provisioning and AI workflow orchestration. It's open source, cloud-agnostic, and automated, and it integrates seamlessly with teams' existing dev workflows.
49+
50+
<!-- In addition to the cloud-agnostic interface for ML teams, dstack eliminates the need for filing infra tickets and waiting days or weeks to get a GPU box or cluster, it just spins up what you need in minutes. -->
51+
52+
> *Because our teams are fragmented, we want them to be able to run on any environment of their choosing... It has to work with all of these. That means a centralized, unified interface to talk to all of them.*
53+
>
54+
> *— Wah Loon Keng, Sr. AI Engineer, Electronic Arts*
55+
56+
EA teams use `dstack` for three types of ML workloads:
57+
58+
* [Dev environments](../../docs/concepts/dev-environments.md): spinning up GPU boxes preconfigured with a Git repo and ready to use from a desktop IDE such as VS Code or Cursor
59+
* [Tasks](../../docs/concepts/tasks.md): seamless single-node or distributed training using open-source PyTorch libraries
60+
* [Services](../../docs/concepts/services.md): running model endpoints and Streamlit-style apps for quick internal demos and prototyping
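
For readers unfamiliar with `dstack`, the three workload types above map to three kinds of run configuration. The following is a minimal sketch, not EA's actual setup: the file names, run names, GPU sizes, `train.py`, `app.py`, and `requirements.txt` are all hypothetical, and each YAML document would live in its own file. Field names follow the public `dstack` docs as best understood here and should be checked against the version in use.

```yaml
# dev.dstack.yml (hypothetical file): an interactive GPU box opened from a desktop IDE
type: dev-environment
name: example-dev            # illustrative run name
python: "3.11"
ide: vscode
resources:
  gpu: 24GB                  # assumed GPU memory requirement
---
# train.dstack.yml (hypothetical file): a two-node PyTorch training task
type: task
name: example-train
python: "3.11"
nodes: 2                     # distributed runs assume a fleet with cluster networking
commands:
  - pip install -r requirements.txt   # assumes the repo ships a requirements.txt
  - |
    torchrun \
      --nproc-per-node=$DSTACK_GPUS_PER_NODE \
      --nnodes=$DSTACK_NODES_NUM \
      --node-rank=$DSTACK_NODE_RANK \
      --master-addr=$DSTACK_MASTER_NODE_IP \
      --master-port=12345 \
      train.py
resources:
  gpu: 24GB
---
# serve.dstack.yml (hypothetical file): a Streamlit app for an internal demo
type: service
name: example-demo
python: "3.11"
commands:
  - pip install streamlit
  - streamlit run app.py --server.port 8501   # assumes app.py exists in the repo
port: 8501
```

Each file would typically be submitted with `dstack apply -f <file>`, so the same command covers all three workload types.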
61+
62+
Introducing `dstack` had a significant impact on EA’s ML teams. Before, getting access to GPU infrastructure could take days or even weeks. With `dstack`, teams can now spin up what they need in minutes. This shift accelerated development by removing delays and freeing engineers to focus on building models.
63+
64+
> *With dstack, what used to take weeks, provisioning GPUs, setting up environments, now takes minutes. It changed how fast teams at EA can move.*
65+
>
66+
> *— Wah Loon Keng, Sr. AI Engineer, Electronic Arts*
67+
68+
Costs dropped by nearly a factor of three, largely because `dstack` automatically starts and stops resources across spot and on-demand instances.
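
As a rough illustration of where such savings can come from, a run configuration can state a spot preference and an idle timeout so that compute is released without manual cleanup. This is a hedged sketch rather than EA's configuration: the run name and the specific durations are assumptions, and the field names (`spot_policy`, `inactivity_duration`, `max_duration`) should be verified against the `dstack` version in use.

```yaml
type: dev-environment
name: example-cost-aware-dev   # illustrative run name
ide: vscode
spot_policy: auto              # try spot capacity first, otherwise use on-demand
inactivity_duration: 2h        # assumed threshold: stop the environment after 2h of inactivity
max_duration: 1d               # assumed hard cap on total run time
resources:
  gpu: 24GB                    # assumed GPU memory requirement
```

The point of the sketch is that the shutdown policy lives in the same version-controlled file as the resource request, so idle machines are reclaimed by default instead of by convention.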
69+
70+
<img src="https://dstack.ai/static-assets/static-assets/images/dstack-ea-slide-3.png" width="630" />
71+
72+
Workflows became standardized, reproducible, and easier to trace, thanks to version-controlled YAML configurations. Teams across different departments and cloud providers now follow the same setup and processes.
73+
74+
> `dstack` provisions compute on demand and automatically shuts it down when no longer needed. That alone saves you over three times in cost.
75+
>
76+
> — Wah Loon Keng, Sr. AI Engineer, Electronic Arts
77+
78+
<!-- EA’s experience highlights how critical standardized, open-source tooling is for scaling AI across teams. Instead of each group reinventing infrastructure and workflows, they’ve moved toward a common stack that supports fast iteration, reproducibility, and cost-efficient use of compute. -->
79+
80+
By adopting tools that are cloud-agnostic and developer-friendly, EA has reduced friction—from provisioning GPUs to deploying models—and enabled teams to spend more time on actual ML work.
81+
82+
*Huge thanks to Chris and Keng from EA’s central tech team for sharing these insights. For more details, including the recording and slides, check out the full talk on the [NVIDIA GTC website :material-arrow-top-right-thin:{ .external }](https://www.nvidia.com/en-us/on-demand/session/gtc25-s73667/){:target="_blank"}.*
83+
84+
!!! info "What's next?"
85+
    1. Check [dev environments](../../docs/concepts/dev-environments.md), [tasks](../../docs/concepts/tasks.md), [services](../../docs/concepts/services.md), and [fleets](../../docs/concepts/fleets.md)
86+
    2. Follow [Quickstart](../../docs/quickstart.md)
87+
    3. Browse [Examples](../../examples.md)
