  # MLflow Cloudsmith Plugin

- [![CI](https://github.com/bblizniak/mlflow-cloudsmith-plugin/actions/workflows/ci.yml/badge.svg)](https://github.com/bblizniak/mlflow-cloudsmith-plugin/actions/workflows/ci.yml)
+ [![CI](https://github.com/BartoszBlizniak/mlflow-cloudsmith-plugin/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/BartoszBlizniak/mlflow-cloudsmith-plugin/actions/workflows/ci.yml)
  [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
  [![Python 3.8–3.12](https://img.shields.io/badge/python-3.8%E2%80%933.12-blue.svg)](https://www.python.org/downloads/)
  [![MLflow 2.x](https://img.shields.io/badge/MLflow-2.x-orange.svg)](https://mlflow.org/)

- A minimal MLflow Artifact Repository plugin that stores artifacts as Cloudsmith RAW packages.
+ A minimal **MLflow Artifact Repository plugin** that stores artifacts as **Cloudsmith RAW packages**.

- Highlights
- - Seamless MLflow integration (entry point: cloudsmith://owner/repo)
- - Preloads and lists artifacts with correct immediate-children semantics for the MLflow UI
- - Organized via tags (mlflow, experiment-<id>, run-<id>, path-<artifact_path>)
+ ---

- ## Install
+ ## Highlights
+
+ * Seamless MLflow integration (`cloudsmith://owner/repo`)
+ * Preloads and lists artifacts with correct immediate-children semantics for the MLflow UI
+ * Organized via tags (`mlflow`, `experiment-<id>`, `run-<id>`, `path-<artifact_path>`)
+
+ ---
+
+ ## Installation

  ```bash
  pip install -e .
  ```

- ## Use with MLflow
+ ---
+
+ ## Usage with MLflow

  ```python
  import os
@@ -28,149 +35,189 @@ os.environ["CLOUDSMITH_API_KEY"] = "<your-api-key>"
  os.environ["MLFLOW_ARTIFACT_URI"] = "cloudsmith://<owner>/<repo>"

  with mlflow.start_run():
-     mlflow.log_param("learning_rate", 0.01)
-     mlflow.log_metric("accuracy", 0.95)
-     mlflow.log_artifact("model.pkl")
+     mlflow.log_param("learning_rate", 0.01)
+     mlflow.log_metric("accuracy", 0.95)
+     mlflow.log_artifact("model.pkl")
  ```

- Or use the repository directly:
+ ### Direct Repository Usage

  ```python
  from plugin.cloudsmith_repository import CloudsmithArtifactRepository

  repo = CloudsmithArtifactRepository("cloudsmith://<owner>/<repo>")
  repo.log_artifact("model.pkl", "models/production")
+
  for info in repo.list_artifacts("models"):
-     print(info.path, info.file_size, info.is_dir)
+     print(info.path, info.file_size, info.is_dir)
  ```

- ## URI format
+ ---
+
+ ## URI Format

  ```
  cloudsmith://<owner>/<repository>[/<path>]
  ```

- Examples:
- - cloudsmith://my-org/ml-artifacts
- - cloudsmith://my-org/ml-artifacts/experiments
+ **Examples:**
+
+ * `cloudsmith://my-org/ml-artifacts`
+ * `cloudsmith://my-org/ml-artifacts/experiments`
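
As a rough illustration of the URI format above, the sketch below splits a `cloudsmith://` URI into owner, repository, and optional path using only the standard library. The function name `parse_cloudsmith_uri` is hypothetical and is not taken from the plugin, whose own parsing may differ.

```python
from urllib.parse import urlparse


def parse_cloudsmith_uri(uri):
    """Split cloudsmith://<owner>/<repository>[/<path>] into its parts.

    Hypothetical helper for illustration; not the plugin's actual code.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "cloudsmith":
        raise ValueError(f"expected a cloudsmith:// URI, got {uri!r}")

    owner = parsed.netloc
    # Drop empty segments caused by leading or trailing slashes.
    segments = [s for s in parsed.path.split("/") if s]
    if not owner or not segments:
        raise ValueError(f"URI must name an owner and a repository: {uri!r}")

    repository = segments[0]
    path = "/".join(segments[1:])  # empty string when no sub-path is given
    return owner, repository, path
```

With the two example URIs, this returns `("my-org", "ml-artifacts", "")` and `("my-org", "ml-artifacts", "experiments")` respectively.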
+
+ ---

  ## Configuration

- - CLOUDSMITH_API_KEY: Cloudsmith API token (required)
- - CLOUDSMITH_DEBUG: true/false to toggle verbose logging (optional)
- - MLFLOW_EXPERIMENT_ID, MLFLOW_RUN_ID: used for tagging (optional)
-
- ## How it works (brief)
- - Each MLflow artifact is uploaded as a Cloudsmith RAW package with preserved original filename and descriptive metadata.
- - The plugin builds an in-memory tree of artifact paths and returns only immediate children for any requested subpath, matching MLflow UI browsing behavior.
- - Packages are tagged for simple filtering (mlflow, experiment-*, run-*, path-* with slashes mapped to dashes).
-
- ## Artifact representation and tagging (example)
-
- When you log artifacts during a run, each file becomes a Cloudsmith RAW package.
-
- Example run context
- - experiment_id: 123
- - run_id: 0123456789abcdef0123456789abcdef
- - files logged:
-   - models/model.pkl
-   - conda.yaml
-
- For each file, the package will have:
- - name: mlflow-<base-filename>-<run_id_first8>-<timestamp>
-   - e.g., mlflow-model-01234567-1754914964
- - version: "<experiment_id>+<run_id>"
-   - e.g., "123+0123456789abcdef0123456789abcdef"
- - filename: the original filename (e.g., model.pkl)
- - description:
-   - MLflow artifact: <artifact_path> (experiment: <experiment_id>, run: <run_id>)
-   - e.g., "MLflow artifact: models/model.pkl (experiment: 123, run: 0123456789abcdef0123456789abcdef)"
- - tags (info):
-   - mlflow
-   - experiment-123
-   - run-0123456789abcdef0123456789abcdef
-   - path-models-model.pkl
-
- Notes
- - The description contains the authoritative artifact path used by listing and downloads.
- - The path-* tag replaces slashes with dashes and is a fallback for path reconstruction.
- - Listing in MLflow UI uses immediate-children semantics, so for the example above:
-   - list_artifacts("") returns [models/ (dir), conda.yaml]
-   - list_artifacts("models") returns [models/model.pkl]
+ | Variable               | Description                              | Required |
+ | ---------------------- | ---------------------------------------- | -------- |
+ | `CLOUDSMITH_API_KEY`   | Cloudsmith API token                     | ✅       |
+ | `CLOUDSMITH_DEBUG`     | `true` / `false` to toggle verbose logging | ❌       |
+ | `MLFLOW_EXPERIMENT_ID` | Used for tagging                         | ❌       |
+ | `MLFLOW_RUN_ID`        | Used for tagging                         | ❌       |
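
To make the configuration table concrete, here is a minimal sketch of reading these variables from the environment. The helper `load_settings` and the returned dictionary keys are hypothetical, and treating only the literal string `true` (case-insensitive) as enabling debug is an assumption about how `CLOUDSMITH_DEBUG` is interpreted.

```python
import os


def load_settings(env=None):
    """Collect plugin-related settings from environment variables.

    Hypothetical helper for illustration; the plugin may read these differently.
    """
    env = os.environ if env is None else env

    api_key = env.get("CLOUDSMITH_API_KEY")
    if not api_key:
        # The only variable the README marks as required.
        raise RuntimeError("CLOUDSMITH_API_KEY is required")

    return {
        "api_key": api_key,
        # Assumption: only "true" (any case) switches verbose logging on.
        "debug": env.get("CLOUDSMITH_DEBUG", "false").strip().lower() == "true",
        # Optional values used for tagging; None when unset.
        "experiment_id": env.get("MLFLOW_EXPERIMENT_ID"),
        "run_id": env.get("MLFLOW_RUN_ID"),
    }
```

Passing a plain dictionary as `env` keeps the function easy to exercise in tests without touching the real process environment.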
+
+ ---
+
+ ## How It Works (Brief)
+
+ * Each MLflow artifact is uploaded as a **Cloudsmith RAW package** with preserved original filename and metadata.
+ * The plugin builds an **in-memory tree** of artifact paths and returns **only immediate children** for UI browsing.
+ * Packages are tagged for easy filtering:
+   `mlflow`, `experiment-*`, `run-*`, `path-*` (slashes replaced with dashes).
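
The immediate-children behavior described above can be sketched as follows. This is a simplified illustration over a flat list of artifact paths, not the plugin's in-memory tree; the function name `immediate_children` and its `(path, is_dir)` return shape are assumptions made for the example.

```python
def immediate_children(paths, prefix=""):
    """Return sorted (path, is_dir) pairs directly under `prefix`.

    Hypothetical sketch: deeper descendants are collapsed into their
    first-level directory instead of being listed individually.
    """
    prefix = prefix.strip("/")
    base = prefix + "/" if prefix else ""

    children = {}
    for path in paths:
        if not path.startswith(base):
            continue  # outside the requested subtree
        rest = path[len(base):]
        if not rest:
            continue  # the prefix itself, not a child
        head, sep, _ = rest.partition("/")
        # A remaining "/" means `head` is a directory containing the artifact.
        is_dir = bool(sep)
        child = base + head
        children[child] = children.get(child, False) or is_dir

    return sorted(children.items())
```

For the paths `models/model.pkl` and `conda.yaml`, listing the root yields `conda.yaml` as a file and `models` as a directory, while listing `models` yields only `models/model.pkl`.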
+
+ ---
+
+ ## Artifact Representation & Tagging
+
+ **Example Run Context**
+
+ * `experiment_id`: `123`
+ * `run_id`: `0123456789abcdef0123456789abcdef`
+ * Files logged:
+
+   * `models/model.pkl`
+   * `conda.yaml`
+
+ **Package Details**
+
+ * **Name:** `mlflow-<base-filename>-<run_id_first8>-<timestamp>`
+   e.g., `mlflow-model-01234567-1754914964`
+ * **Version:** `<experiment_id>+<run_id>`
+   e.g., `123+0123456789abcdef0123456789abcdef`
+ * **Filename:** Original filename (e.g., `model.pkl`)
+ * **Description:**
+
+   ```
+   MLflow artifact: <artifact_path> (experiment: <experiment_id>, run: <run_id>)
+   ```
+
+   e.g., `MLflow artifact: models/model.pkl (experiment: 123, run: 0123456789abcdef0123456789abcdef)`
+ * **Tags:**
+
+   * `mlflow`
+   * `experiment-123`
+   * `run-0123456789abcdef0123456789abcdef`
+   * `path-models-model.pkl`
+
+ **Notes**
+
+ * Descriptions hold the authoritative artifact path for listing & downloads.
+ * `path-*` tags replace `/` with `-` for fallback reconstruction.
+ * **MLflow UI listing:**
+
+   * `list_artifacts("")` → `[models/ (dir), conda.yaml]`
+   * `list_artifacts("models")` → `[models/model.pkl]`
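
The naming and tagging scheme above can be reproduced with a short sketch. The function `package_metadata` and its dictionary layout are hypothetical; taking `<base-filename>` to mean the filename without its extension is inferred from the `mlflow-model-...` example, and the timestamp is assumed to be Unix seconds.

```python
import posixpath
import time


def package_metadata(artifact_path, experiment_id, run_id, timestamp=None):
    """Build the package fields described above for one artifact.

    Hypothetical sketch; the plugin's real implementation may differ.
    """
    # Assumption: the timestamp is Unix time in whole seconds.
    timestamp = int(time.time()) if timestamp is None else timestamp

    filename = posixpath.basename(artifact_path)
    # Assumption: <base-filename> is the filename without its extension.
    base = posixpath.splitext(filename)[0]

    return {
        "name": f"mlflow-{base}-{run_id[:8]}-{timestamp}",
        "version": f"{experiment_id}+{run_id}",
        "filename": filename,
        "description": (
            f"MLflow artifact: {artifact_path} "
            f"(experiment: {experiment_id}, run: {run_id})"
        ),
        "tags": [
            "mlflow",
            f"experiment-{experiment_id}",
            f"run-{run_id}",
            # Slashes are replaced with dashes in the path tag.
            "path-" + artifact_path.replace("/", "-"),
        ],
    }
```

Because the `path-*` tag flattens slashes into dashes, it cannot distinguish `a/b.txt` from `a-b.txt`, which is consistent with the description being treated as the authoritative path.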
+
+ ---

  ## Testing

  ```bash
  pytest -q
  ```

- Integration test (opt-in):
+ ### Integration Tests (Opt-in)

  ```bash
  export CLOUDSMITH_RUN_INTEGRATION=1
- export CLOUDSMITH_API_KEY=... # required
- export CLOUDSMITH_TEST_OWNER=... # required
- export CLOUDSMITH_TEST_REPO=... # required
+ export CLOUDSMITH_API_KEY=... # required
+ export CLOUDSMITH_TEST_OWNER=... # required
+ export CLOUDSMITH_TEST_REPO=... # required
+
  pytest -q
  ```

- ## Cleanup script (delete by run/experiment)
+ ---
+
+ ## Cleanup Script (Delete by Run/Experiment)

- Use `scripts/cleanup_orphans.sh` to delete Cloudsmith RAW packages for a given run id and/or experiment id. No MLflow server is contacted. Requirements: bash, curl, jq.
+ `scripts/cleanup_orphans.sh` deletes Cloudsmith RAW packages for a given run or experiment ID.
+ **No MLflow server is contacted.** Requires: `bash`, `curl`, `jq`.

- Environment variables or flags:
+ **Environment Variables / Flags**

- - CLOUDSMITH_API_KEY – Cloudsmith API key (required)
- - CLOUDSMITH_OWNER – Cloudsmith owner/org slug (required)
- - CLOUDSMITH_REPO – Cloudsmith repo slug (required)
- - RUN_ID / --run-id – MLflow run id to match (optional)
- - EXPERIMENT_ID / --experiment-id – MLflow experiment id to match (optional)
- - CLEANUP_CONFIRM=1 or --confirm – Perform deletion (dry-run by default)
+ | Variable / Flag                     | Description                        | Required |
+ | ----------------------------------- | ---------------------------------- | -------- |
+ | `CLOUDSMITH_API_KEY`                | API key                            | ✅       |
+ | `CLOUDSMITH_OWNER`                  | Owner/org slug                     | ✅       |
+ | `CLOUDSMITH_REPO`                   | Repo slug                          | ✅       |
+ | `RUN_ID` / `--run-id`               | MLflow run ID                      | ❌       |
+ | `EXPERIMENT_ID` / `--experiment-id` | MLflow experiment ID               | ❌       |
+ | `CLEANUP_CONFIRM=1` / `--confirm`   | Perform deletion (dry-run default) | ❌       |

- Examples:
+ **Examples:**

  ```bash
  # Dry-run: show packages for a run-id
  CLOUDSMITH_API_KEY=... CLOUDSMITH_OWNER=myorg CLOUDSMITH_REPO=myrepo \
-   scripts/cleanup_orphans.sh --run-id 0123456789abcdef0123456789abcdef
+   scripts/cleanup_orphans.sh --run-id 0123456789abcdef0123456789abcdef

- # Delete for a run-id (confirm required)
+ # Delete for a run-id (confirmation required)
  CLOUDSMITH_API_KEY=... CLOUDSMITH_OWNER=myorg CLOUDSMITH_REPO=myrepo \
-   scripts/cleanup_orphans.sh --run-id 0123456789abcdef0123456789abcdef --confirm
+   scripts/cleanup_orphans.sh --run-id 0123456789abcdef0123456789abcdef --confirm

- # Combine experiment-id + run-id for precise matching
+ # Combine experiment-id + run-id
  CLOUDSMITH_API_KEY=... CLOUDSMITH_OWNER=myorg CLOUDSMITH_REPO=myrepo \
-   scripts/cleanup_orphans.sh --experiment-id 123 --run-id 0123456789abcdef0123456789abcdef --confirm
+   scripts/cleanup_orphans.sh --experiment-id 123 --run-id 0123456789abcdef0123456789abcdef --confirm
  ```
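
The selection step of such a cleanup can be pictured with the sketch below. It is not a port of `scripts/cleanup_orphans.sh`, which uses `curl` and `jq`; the function `select_for_cleanup`, the package dictionaries with a `tags` list, and matching purely on the `run-*` and `experiment-*` tags are all assumptions made for illustration.

```python
def select_for_cleanup(packages, run_id=None, experiment_id=None):
    """Pick packages whose tags match every criterion that was given.

    Hypothetical sketch: `packages` is assumed to be a list of dicts,
    each with a "tags" list of strings.
    """
    if run_id is None and experiment_id is None:
        # Assumption: refuse to match everything when no filter is given.
        raise ValueError("provide a run_id, an experiment_id, or both")

    wanted = set()
    if run_id is not None:
        wanted.add(f"run-{run_id}")
    if experiment_id is not None:
        wanted.add(f"experiment-{experiment_id}")

    # Combining both IDs narrows the match: every wanted tag must be present.
    return [p for p in packages if wanted.issubset(p.get("tags", []))]
```

A dry run would simply print the selected packages, while a confirmed run would go on to delete them.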

- ## Contributing
+ ---

- We welcome contributions.
+ ## Contributing

- Set up
- - Create and activate a Python environment.
- - Install dev dependencies and the package in editable mode.
+ We welcome contributions!

- Run checks
- - Lint/format: flake8, black --check
- - Tests: pytest
+ **Setup**

- Example
  ```bash
+ # Create & activate virtual environment
  pip install -e .
  pip install -r requirements.txt
+ ```

- # Lint
+ **Run Checks**
+
+ ```bash
+ # Lint & Format
  flake8 plugin
  black --check plugin tests

  # Tests
  pytest -q
  ```

+ ---
+
  ## Continuous Integration
- - GitHub Actions runs flake8, black --check, and pytest with coverage on pushes and PRs.
- - Python versions: 3.8, 3.9, 3.10, 3.11, 3.12.
+
+ * **GitHub Actions** runs:
+
+   * `flake8`
+   * `black --check`
+   * `pytest` (with coverage)
+ * Python versions: **3.8–3.12**
+
+ ---

  ## License
