Deploys MLflow on Nebari with Keycloak authentication, PostgreSQL backend storage, and automatic TLS.
- Create the PostgreSQL credentials secret:

      kubectl create namespace mlflow
      kubectl create secret generic mlflow-pack-postgresql \
        --namespace mlflow \
        --from-literal=password="$(openssl rand -base64 32)" \
        --from-literal=postgres-password="$(openssl rand -base64 32)"

  The secret name must be `<release-name>-postgresql` and contain the keys `password` (mlflow DB user) and `postgres-password` (superuser).

- Copy the example ArgoCD Application and edit it for your cluster:

      cp examples/nebari-values.yaml /path/to/your/gitops-repo/apps/mlflow-pack.yaml

  Update `nebariapp.hostname`, `nebariapp.keycloakHostname`, and `mlflow.postgresql.primary.persistence.storageClass` for your environment.

- Add `mlflow.<your-domain>` to your gateway certificate and DNS.
See examples/nebari-values.yaml for the full ArgoCD Application manifest.
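The secret naming rule from the first step can also be expressed in code. The following is a minimal Python sketch, not part of the chart: the helper names `postgres_secret_name`, `random_password`, and `secret_data` are hypothetical, and `random_password` only approximates `openssl rand -base64 32` by base64-encoding 32 random bytes.

```python
import base64
import secrets


def postgres_secret_name(release_name: str) -> str:
    """Return the secret name the chart expects: <release-name>-postgresql."""
    return f"{release_name}-postgresql"


def random_password(num_bytes: int = 32) -> str:
    """Base64-encode random bytes, similar to `openssl rand -base64 32`."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def secret_data() -> dict:
    """Build the two keys the secret must contain (hypothetical helper)."""
    return {
        "password": random_password(),           # mlflow DB user
        "postgres-password": random_password(),  # superuser
    }


if __name__ == "__main__":
    # With the release name used in this README:
    print(postgres_secret_name("mlflow-pack"))  # mlflow-pack-postgresql
    print(sorted(secret_data()))                # ['password', 'postgres-password']
```

If you install under a different release name, the secret name changes with it, which is the most likely reason for the bundled PostgreSQL failing to find its credentials.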
To allow nebari-data-science-pack notebooks to log experiments to MLflow, add the following to your data-science-pack ArgoCD Application values:
    jupyterhub:
      singleuser:
        extraEnv:
          MLFLOW_TRACKING_URI: "http://mlflow-pack.mlflow.svc.cluster.local:80"
        networkPolicy:
          egress:
            - ports:
                - port: 5000
                  protocol: TCP
              to:
                - namespaceSelector:
                    matchLabels:
                      kubernetes.io/metadata.name: mlflow

The egress rule uses port 5000 (the pod port) because NetworkPolicy operates at the pod IP level, not the ClusterIP service level (which maps 80 to 5000).
After applying, existing JupyterLab sessions must be restarted (stop/start from the hub control panel) to pick up the new environment variable and NetworkPolicy.
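Before logging anything, it can help to confirm that a restarted session actually sees the variable and can reach the service. The snippet below is a rough, standard-library-only sketch; the helper names `tracking_endpoint` and `can_connect` are hypothetical and not part of MLflow or this chart. It connects to the service port from the URI (80 here), while the NetworkPolicy is evaluated against the pod port (5000) behind it.

```python
import os
import socket
from urllib.parse import urlparse


def tracking_endpoint(uri: str) -> tuple:
    """Split a tracking URI into (host, port), defaulting the port by scheme."""
    parsed = urlparse(uri)
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    uri = os.environ.get("MLFLOW_TRACKING_URI")
    if uri is None:
        print("MLFLOW_TRACKING_URI is not set; restart the JupyterLab session.")
    else:
        host, port = tracking_endpoint(uri)
        print(f"{host}:{port} reachable:", can_connect(host, port))
```

A timeout here, as opposed to a refused connection, may point to the egress rule not being in effect yet, although other causes such as DNS or a stopped MLflow pod are equally possible.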
Then log a test run from a notebook:

    import mlflow

    mlflow.set_experiment("test")
    with mlflow.start_run():
        mlflow.log_param("framework", "pytorch")
        mlflow.log_metric("accuracy", 0.95)

    print("Run ID:", mlflow.last_active_run().info.run_id)

By default, this chart bundles a Bitnami PostgreSQL instance. For dev/testing you can pass the password inline instead of creating a secret:
    helm install mlflow-pack . \
      --set mlflow.postgresql.auth.password=my-dev-password

Do not use inline passwords in production or commit them to a gitops repository.
To disable PostgreSQL and use in-memory SQLite (data lost on pod restart):
    mlflow:
      postgresql:
        enabled: false

The chart automatically whitelists the NebariApp hostname and the
cluster-internal service name via MLFLOW_SERVER_ALLOWED_HOSTS. To allow
additional hosts:
    security:
      additionalAllowedHosts:
        - custom-alias.internal

Inspect the NebariApp resource:

    kubectl get nebariapp -n mlflow
    kubectl describe nebariapp -n mlflow

Check the conditions: `RoutingReady`, `TLSReady`, and `AuthReady` should all be True.
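The condition check can be scripted against JSON output (for example from `kubectl get nebariapp -n mlflow -o json`, which returns a list whose `items` are the individual resources). The sketch below assumes the NebariApp resource reports conditions under `status.conditions` with `type` and `status` fields, following the usual Kubernetes convention; that layout is an assumption about the CRD, and the function name `unmet_conditions` and the sample data are made up for illustration.

```python
import json

# Condition types named in this README.
REQUIRED_CONDITIONS = ("RoutingReady", "TLSReady", "AuthReady")


def unmet_conditions(nebariapp: dict) -> list:
    """List required conditions that are missing or whose status is not "True"."""
    # Assumed layout: status.conditions is a list of {"type": ..., "status": ...}.
    conditions = nebariapp.get("status", {}).get("conditions", [])
    status_by_type = {c.get("type"): c.get("status") for c in conditions}
    return [name for name in REQUIRED_CONDITIONS if status_by_type.get(name) != "True"]


if __name__ == "__main__":
    # Made-up sample standing in for one item of the kubectl JSON output.
    sample = json.loads("""
    {"status": {"conditions": [
        {"type": "RoutingReady", "status": "True"},
        {"type": "TLSReady", "status": "False"},
        {"type": "AuthReady", "status": "True"}
    ]}}
    """)
    print(unmet_conditions(sample))  # ['TLSReady']
```

An empty list means all three conditions are True; any name in the list is the place to start reading the `kubectl describe` output.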
Apache 2.0 - see LICENSE.