In this workshop we will revise Module 5 of Machine Learning Zoomcamp.
In particular, we will introduce more modern tools:
- Scikit-Learn pipelines
- uv instead of Pipenv
- FastAPI instead of Flask
- Fly.io instead of AWS EBS
In this workshop, we will follow the same order as in the module:
- Saving and loading the model with pickle
- Turning the notebook into a train script
- Introduction to FastAPI (instead of Flask)
- Serving the model with FastAPI
- Input validation with Pydantic (new)
- Virtual environment management - uv (instead of Pipenv)
- Containerization - Docker
- Deployment with Fly.io
For the environment, you can use GitHub Codespaces.

Install the required libraries:

pip install jupyter scikit-learn pandas

Then download the starter notebook and save it as workshop-uv-fastapi.ipynb. We will base our work on it.

wget https://raw.githubusercontent.com/alexeygrigorev/workshops/main/mlzoomcamp-fastapi-uv/starter.ipynb -O workshop-uv-fastapi.ipynb

Open it in Jupyter.
Let's make a prediction for this datapoint:
datapoint = {
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'yes',
    'dependents': 'no',
    'phoneservice': 'no',
    'multiplelines': 'no_phone_service',
    'internetservice': 'dsl',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'no',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'no',
    'contract': 'month-to-month',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'tenure': 1,
    'monthlycharges': 29.85,
    'totalcharges': 29.85
}

We first transform it with the dictionary vectorizer:

X = dv.transform(datapoint)

And then get the prediction:

model.predict_proba(X)[0, 1]

Let's save this to pickle:
with open('model.bin', 'wb') as f_out:
    pickle.dump((dv, model), f_out)

This is how we load it:

with open('model.bin', 'rb') as f_in:
    (dv, model) = pickle.load(f_in)

It's not convenient to deal with two objects: dv and model.
Let's combine them into one:
from sklearn.pipeline import make_pipeline

pipeline = make_pipeline(
    DictVectorizer(),
    LogisticRegression(solver='liblinear')
)

pipeline.fit(train_dict, y_train)

Now predicting becomes simpler too:

pipeline.predict_proba(datapoint)[0, 1]

We can now turn this notebook into a training script:
jupyter nbconvert --to=script workshop-uv-fastapi.ipynb
mv workshop-uv-fastapi.py train.py

Let's edit it.
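The exact edits depend on your notebook, but as a hedged sketch, the refactored script can be organized into three functions. Here a tiny synthetic dataframe (only a few of the churn columns) stands in for the real loading and cleaning code:

```python
import pickle

import pandas as pd
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def load_data():
    # in the real script this reads and cleans the churn dataset;
    # a tiny synthetic frame stands in here for illustration
    return pd.DataFrame([
        {'contract': 'month-to-month', 'tenure': 1, 'monthlycharges': 29.85, 'churn': 1},
        {'contract': 'two_year', 'tenure': 60, 'monthlycharges': 56.95, 'churn': 0},
        {'contract': 'month-to-month', 'tenure': 5, 'monthlycharges': 70.35, 'churn': 1},
        {'contract': 'one_year', 'tenure': 24, 'monthlycharges': 42.30, 'churn': 0},
    ])


def train_model(df):
    # turn rows into dicts and fit the full pipeline on them
    y = df['churn'].values
    train_dict = df.drop(columns=['churn']).to_dict(orient='records')

    pipeline = make_pipeline(
        DictVectorizer(),
        LogisticRegression(solver='liblinear'),
    )
    pipeline.fit(train_dict, y)
    return pipeline


def save_model(pipeline, path):
    # pickling the whole pipeline keeps dv and model together in one file
    with open(path, 'wb') as f_out:
        pickle.dump(pipeline, f_out)
```

Only the function names and the final calls are meant to match the real train.py; the bodies come from your notebook.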
At the end of train.py, we should have code like this:

df = load_data()
pipeline = train_model(df)
save_model(pipeline, 'model.bin')
print('Model saved to model.bin')

Let's load the saved model. Create predict.py and load the model there:
import pickle

with open('model.bin', 'rb') as f_in:
    pipeline = pickle.load(f_in)

# apply the model

Now we will turn predict.py into a web service.
Let's install FastAPI and uvicorn for that:

pip install fastapi uvicorn

The simplest FastAPI app (created with ChatGPT by translating the old Flask code). Let's put it in ping.py:
from fastapi import FastAPI
import uvicorn

app = FastAPI(title="ping")

@app.get("/ping")
def ping():
    return "PONG"

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9696)

Run it:

python ping.py

The "proper" way of running it:

uvicorn ping:app --host 0.0.0.0 --port 9696 --reload

You can now open it in the browser at http://localhost:9696/ping
Or send a request with curl:

curl localhost:9696/ping

No differences with Flask so far. But we can also see the auto-generated interactive docs at http://localhost:9696/docs (not possible with Flask out of the box):
Let's now turn our script into a web application:

import pickle

from fastapi import FastAPI
import uvicorn

app = FastAPI(title="customer-churn-prediction")

with open('model.bin', 'rb') as f_in:
    pipeline = pickle.load(f_in)

def predict_single(customer):
    result = pipeline.predict_proba(customer)[0, 1]
    return float(result)

@app.post("/predict")
def predict(customer):
    prob = predict_single(customer)
    return {
        "churn_probability": prob,
        "churn": bool(prob >= 0.5)
    }

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9696)

Run it:

uvicorn predict:app --host 0.0.0.0 --port 9696 --reload

Right now it doesn't recognize the request body as JSON, so let's add type hints:
from typing import Dict, Any

@app.post("/predict")
def predict(customer: Dict[str, Any]):
    prob = predict_single(customer)
    return {
        "churn_probability": prob,
        "churn": bool(prob >= 0.5)
    }

Open the docs and send a request:
{
    "gender": "female",
    "seniorcitizen": 0,
    "partner": "yes",
    "dependents": "no",
    "phoneservice": "no",
    "multiplelines": "no_phone_service",
    "internetservice": "dsl",
    "onlinesecurity": "no",
    "onlinebackup": "yes",
    "deviceprotection": "no",
    "techsupport": "no",
    "streamingtv": "no",
    "streamingmovies": "no",
    "contract": "month-to-month",
    "paperlessbilling": "yes",
    "paymentmethod": "electronic_check",
    "tenure": 1,
    "monthlycharges": 29.85,
    "totalcharges": 29.85
}

We can also do it with curl:
curl -X 'POST' 'http://localhost:9696/predict' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "gender": "female",
    "seniorcitizen": 0,
    "partner": "yes",
    "dependents": "no",
    "phoneservice": "no",
    "multiplelines": "no_phone_service",
    "internetservice": "dsl",
    "onlinesecurity": "no",
    "onlinebackup": "yes",
    "deviceprotection": "no",
    "techsupport": "no",
    "streamingtv": "no",
    "streamingmovies": "no",
    "contract": "month-to-month",
    "paperlessbilling": "yes",
    "paymentmethod": "electronic_check",
    "tenure": 1,
    "monthlycharges": 29.85,
    "totalcharges": 29.85
  }'

We need to include the headers -- FastAPI is more strict about schemas and validation than Flask.

To do it from a script, we'll use the requests library. Install it:

pip install requests

Create test.py:
import requests

url = 'http://localhost:9696/predict'

customer = {
    'gender': 'female',
    'seniorcitizen': 0,
    'partner': 'yes',
    'dependents': 'no',
    'phoneservice': 'no',
    'multiplelines': 'no_phone_service',
    'internetservice': 'dsl',
    'onlinesecurity': 'no',
    'onlinebackup': 'yes',
    'deviceprotection': 'no',
    'techsupport': 'no',
    'streamingtv': 'no',
    'streamingmovies': 'no',
    'contract': 'month-to-month',
    'paperlessbilling': 'yes',
    'paymentmethod': 'electronic_check',
    'tenure': 1,
    'monthlycharges': 29.85,
    'totalcharges': 29.85
}

response = requests.post(url, json=customer)
predictions = response.json()

print(predictions)

if predictions['churn']:
    print('customer is likely to churn, send promo')
else:
    print('customer is not likely to churn')

Another feature of FastAPI that we didn't have in Flask is input and output validation.
To come up with this schema, I used ChatGPT. I gave it an example, and also the output of this piece of code:

for n in numerical:
    print(df[n].describe())
    print()

for c in categorical:
    print(df[c].value_counts())
    print()

The models (input and output):
from typing import Literal

from pydantic import BaseModel, Field

class Customer(BaseModel):
    gender: Literal["male", "female"]
    seniorcitizen: Literal[0, 1]
    partner: Literal["yes", "no"]
    dependents: Literal["yes", "no"]
    phoneservice: Literal["yes", "no"]
    multiplelines: Literal["no", "yes", "no_phone_service"]
    internetservice: Literal["dsl", "fiber_optic", "no"]
    onlinesecurity: Literal["no", "yes", "no_internet_service"]
    onlinebackup: Literal["no", "yes", "no_internet_service"]
    deviceprotection: Literal["no", "yes", "no_internet_service"]
    techsupport: Literal["no", "yes", "no_internet_service"]
    streamingtv: Literal["no", "yes", "no_internet_service"]
    streamingmovies: Literal["no", "yes", "no_internet_service"]
    contract: Literal["month-to-month", "one_year", "two_year"]
    paperlessbilling: Literal["yes", "no"]
    paymentmethod: Literal[
        "electronic_check",
        "mailed_check",
        "bank_transfer_(automatic)",
        "credit_card_(automatic)",
    ]
    tenure: int = Field(..., ge=0)
    monthlycharges: float = Field(..., ge=0.0)
    totalcharges: float = Field(..., ge=0.0)

class PredictResponse(BaseModel):
    churn_probability: float
    churn: bool

Now we can be more explicit with the input we expect and the output we generate:
@app.post("/predict")
def predict(customer: Customer) -> PredictResponse:
    prob = predict_single(customer.model_dump())
    return PredictResponse(
        churn_probability=prob,
        churn=prob >= 0.5
    )

Note: if you use customer.dict() instead of model_dump(), you can get the following warning:
PydanticDeprecatedSince20: The `dict` method is deprecated; use `model_dump` instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.11/migration/
We can now test how it behaves with incorrect input. Let's add a field "whatever": 31337 to our test.py and execute it.

When we run it, nothing happens: everything continues to work as before.
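This silent behavior is easy to reproduce directly with the Pydantic model, without going through the web service; a small sketch using a reduced version of the Customer model (only two of its fields, for brevity):

```python
from typing import Literal

from pydantic import BaseModel, Field


# reduced stand-in for the Customer model, with only two of its fields
class Customer(BaseModel):
    gender: Literal["male", "female"]
    tenure: int = Field(..., ge=0)


# by default, Pydantic V2 silently ignores unknown fields
c = Customer(gender="female", tenure=1, whatever=31337)
print(c.model_dump())  # the extra field is dropped, not rejected
```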
In order to make Pydantic raise an error, we need to add model_config:

from pydantic import ConfigDict

class Customer(BaseModel):
    model_config = ConfigDict(extra="forbid")

    ... # rest of the fields

Now we will get an error:

response: {'detail': [{'type': 'extra_forbidden', 'loc': ['body', 'whatever'], 'msg': 'Extra inputs are not permitted', 'input': 31337}]}

What if we send a value that is not defined by the model? For example,
{
    ...
    "streamingtv": "maybe"
    ...
}

In this case, it works as expected: it throws an error:

response: {'detail': [{'type': 'literal_error', 'loc': ['body', 'streamingtv'], 'msg': "Input should be 'no', 'yes' or 'no_internet_service'", 'input': 'maybe', 'ctx': {'expected': "'no', 'yes' or 'no_internet_service'"}}]}

It works now, but we can have version conflicts with other projects. So we need to isolate this project from the others.
We will not go into theoretical details about why you want to use virtual environments. Check module 5 for more information.

For that, we will use uv -- a tool for dependency and environment management.

Install it:

pip install uv

Initialize the project:

uv init

We don't need main.py, so we can remove it:

rm main.py

Notice that it created some files:
- .python-version
- pyproject.toml
We need Scikit-Learn, FastAPI, and uvicorn for this project. So let's add them:

uv add scikit-learn fastapi uvicorn

A few more things appeared:
- .venv with all the packages
- uv.lock with a more detailed description of the dependencies
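At this point, pyproject.toml will look roughly like this (the project name comes from the directory uv init was run in, and uv records version constraints next to each package; the exact values will differ):

```toml
[project]
name = "workshop-uv-fastapi"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
    "fastapi",        # uv pins a version constraint here
    "scikit-learn",
    "uvicorn",
]
```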
We also add requests as a development dependency -- we won't need it in production:

uv add --dev requests

If we want to run something in this virtual environment, simply prefix it with uv run:

uv run uvicorn predict:app --host 0.0.0.0 --port 9696 --reload
uv run python test.py

When you get a fresh copy of a project that already uses uv, you can install all the dependencies using the sync command:

uv sync

Let's use Docker for complete isolation. If you want to learn more about Docker, check module 5.
In this workshop, we will adjust the Dockerfile we created in the module.
First, we need to decide which Python version we need. You can check your version of Python using this command:

$ python -V
Python 3.13.5

So let's use the 3.13.5 image of Python:
# Use the official Python 3.13.5 slim version based on Debian Bookworm as the base image
FROM python:3.13.5-slim-bookworm
# Copy the 'uv' and 'uvx' executables from the latest uv image into /bin/ in this image
# 'uv' is a fast Python package installer and environment manager
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/
# Set the working directory inside the container to /code
# All subsequent commands will be run from here
WORKDIR /code
# Add the virtual environment's bin directory to the PATH so Python tools work globally
ENV PATH="/code/.venv/bin:$PATH"
# Copy the project configuration files into the container
# pyproject.toml → project metadata and dependencies
# uv.lock → locked dependency versions (for reproducibility)
# .python-version → Python version specification
COPY "pyproject.toml" "uv.lock" ".python-version" ./
# Install dependencies exactly as locked in uv.lock, without updating them
RUN uv sync --locked
# Copy application code and model data into the container
COPY "predict.py" "model.bin" ./
# Expose TCP port 9696 so it can be accessed from outside the container
EXPOSE 9696
# Run the application using uvicorn (ASGI server)
# predict:app → refers to 'app' object inside predict.py
# --host 0.0.0.0 → listen on all interfaces
# --port 9696 → listen on port 9696
ENTRYPOINT ["uvicorn", "predict:app", "--host", "0.0.0.0", "--port", "9696"]

(The comments were added by ChatGPT.)
Build it:

docker build -t predict-churn .

And run it:

docker run -it --rm -p 9696:9696 predict-churn

Once the application is dockerized, you can deploy it anywhere.
In the course, we showed Elastic Beanstalk. Other alternatives:
- Google Cloud Run
- AWS App Runner
- Fly.io
- Check the course for contributions from other students; there are a lot of other options
According to ChatGPT, using Fly.io is very simple, so let's do that:

# for other OSes, check https://fly.io/docs/flyctl/install/
# you may also need to replace fly with flyctl
curl -L https://fly.io/install.sh | sh

fly auth signup
fly launch --generate-name
fly deploy

Get the URL from the logs; it should be something along these lines:

Visit your newly deployed app at https://mlzoomcamp-flask-uv.fly.dev/

Put the URL into test.py and check that it works.
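For example, with the hostname from the log line above, the url variable in test.py becomes:

```python
# replace the hostname with the one fly launch generated for you
url = 'https://mlzoomcamp-flask-uv.fly.dev/predict'
```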
Now you can terminate the deployment:

fly apps destroy <app-name>

You can see the list of apps with the fly apps list command.
Note: check the pricing information.
In this workshop we dockerized our ML model and deployed it to the cloud.
If you want to learn more about ML Engineering, check our ML Zoomcamp course.