---
icon: lucide/blocks
---

# Architecture


## STAC Catalog Structure

!!! info "Three collections, all items use MLM Extension and Version Extension"

```text
Catalog: fair-models
|
+-- Collection: base-models
|     Model blueprints contributed via PR.
|     Each item = complete model card (weights, code, Docker, MLM spec).
|     Versioned by contributors, registered via CLI utility.
|     |
|     +-- Item: unet-segmentation (v1)           category: semantic-segmentation
|     +-- Item: resnet18-classification (v1)      category: classification
|     +-- Item: yolo11n-detection (v1)            category: object-detection
|
+-- Collection: local-models
|     Finetuned models produced by ZenML pipelines.
|     Only promoted (production) versions appear here.
|     |
|     +-- Item: unet-segmentation-finetuned-banepa-v2   (production, latest-version)
|     +-- Item: unet-segmentation-finetuned-banepa-v1   (deprecated: true)
|     +-- Item: yolo11n-detection-finetuned-banepa-v1   (production)
|
+-- Collection: datasets
      Training data registered via fAIr UI/backend.
      |
      +-- Item: buildings-banepa-semantic-segmentation    category: semantic-segmentation
      +-- Item: buildings-banepa-object-detection         category: object-detection
```

??? note "What STAC Items Contain"

Standard STAC/MLM fields are used wherever possible. A small set of
`fair:*` fields fills gaps where MLM has no equivalent:
`fair:metrics_spec` (evaluation metrics vocabulary),
`fair:split_spec` (train/val split strategy), and
`fair:hyperparameters_spec` (hyperparameter types, ranges, descriptions).

### Base model item

See [`models/unet_segmentation/stac-item.json`](https://github.com/hotosm/fAIr-models/tree/master/models/unet_segmentation/stac-item.json) for a complete example.
All three base models (`unet_segmentation`, `resnet18_classification`, `yolo11n_detection`) follow this structure.

Key properties: `mlm:name`, `mlm:architecture`, `mlm:tasks`, `mlm:framework`,
`mlm:input` (with `pre_processing_function`), `mlm:output` (with `post_processing_function`
and `classification:classes`), `mlm:hyperparameters`, `keywords`,
`fair:metrics_spec`, `fair:split_spec`, `fair:hyperparameters_spec`.

Key assets: `checkpoint` (torch weights, HTTPS URL), `model` (ONNX, optional for base models),
`source-code` (with `mlm:entrypoint`), `mlm:training` / `mlm:inference` (Docker image OCI references).

The `mlm:entrypoint` tells the backend which Python function to call.
`pre_processing_function` / `post_processing_function` are standard MLM
Processing Expression fields.
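
As a rough illustration of how such dispatch could work, the sketch below resolves a `module:function` reference with `importlib`. The `module:function` string form and the helper name `resolve_entrypoint` are assumptions made for this example; the backend's actual loader may differ.

```python
import importlib
from typing import Any, Callable


def resolve_entrypoint(reference: str) -> Callable[..., Any]:
    """Resolve a 'package.module:function' reference to a callable."""
    module_name, sep, attr_name = reference.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:function', got {reference!r}")
    module = importlib.import_module(module_name)
    target = getattr(module, attr_name)
    if not callable(target):
        raise TypeError(f"{reference!r} does not point to a callable")
    return target


# Standard-library stand-in so the example runs without any model package.
sqrt = resolve_entrypoint("math:sqrt")
print(sqrt(9.0))  # 3.0
```

The same resolver would apply to `pre_processing_function` and `post_processing_function` if their expressions use the same reference form.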

### Local model item

Same MLM fields as base model, plus:

- `derived_from` link pointing to the base model item
- `derived_from` link pointing to the dataset item used for training
- `checkpoint` asset (torch weights) + `model` asset (ONNX) pointing to S3 finetuned artifacts
- Runtime assets reference the same Docker image as the parent base model
- Version Extension: `version`, `deprecated`, `predecessor-version` / `successor-version` / `latest-version` links
- `mlm:hyperparameters` reflects the actual training params used
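
The lineage and version parts of a local model item could look roughly like the plain-dictionary sketch below. The helper name `build_local_item_stub` and all `href` values are hypothetical; only the link relations and the `version` / `deprecated` properties come from the list above.

```python
from typing import Any, Optional


def build_local_item_stub(
    item_id: str,
    version: str,
    base_model_href: str,
    dataset_href: str,
    predecessor_href: Optional[str] = None,
) -> dict[str, Any]:
    """Return only the lineage and version parts of a local model item."""
    links = [
        {"rel": "derived_from", "href": base_model_href},  # parent base model
        {"rel": "derived_from", "href": dataset_href},  # training dataset
    ]
    if predecessor_href is not None:
        links.append({"rel": "predecessor-version", "href": predecessor_href})
    return {
        "id": item_id,
        "properties": {"version": version, "deprecated": False},
        "links": links,
    }


stub = build_local_item_stub(
    item_id="unet-segmentation-finetuned-banepa-v2",
    version="2",
    base_model_href="../base-models/unet-segmentation.json",  # hypothetical path
    dataset_href="../datasets/buildings-banepa-semantic-segmentation.json",  # hypothetical path
    predecessor_href="./unet-segmentation-finetuned-banepa-v1.json",  # hypothetical path
)
print([link["rel"] for link in stub["links"]])
# ['derived_from', 'derived_from', 'predecessor-version']
```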

### Dataset item

Uses the Label and File extensions. Properties: `label:type`, `label:tasks`, `label:classes`, `keywords`.
Assets: `chips` (image directory), `labels` (GeoJSON).

## Tagging and Classification

| Concept | Standard field | Example values |
| --- | --- | --- |
| ML task | `mlm:tasks` | `semantic-segmentation`, `object-detection` |
| Feature type tags | `keywords` (STAC core) | `building`, `road`, `tree` |
| Output geometry | `keywords` (STAC core) | `polygon`, `line`, `point` |
| Output classes | `classification:classes` | `{name: "building", value: 1}` |
| Dataset label type | `label:type` (Label ext) | `vector`, `raster` |
| Dataset label task | `label:tasks` (Label ext) | `segmentation`, `detection` |
| Pre/post processing | `pre_processing_function` / `post_processing_function` (MLM) | Python entrypoint |

## Compatibility Validation

!!! warning

The backend validates that a base model and dataset are compatible before
triggering finetuning. Validation is based on matching `keywords` and
`mlm:tasks` / `label:tasks` between the model and dataset STAC items.
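
A minimal sketch of such a check follows. The exact matching rule is not specified here, so this example assumes that compatibility means at least one shared task and at least one shared keyword, and it assumes a mapping between MLM task names and Label Extension task names (`MLM_TO_LABEL_TASK`). Both the rule and the function name `is_compatible` are illustrative, not the backend's confirmed logic.

```python
from typing import Any

# Assumed mapping from MLM task names to Label Extension task names.
MLM_TO_LABEL_TASK = {
    "semantic-segmentation": "segmentation",
    "object-detection": "detection",
    "classification": "classification",
}


def is_compatible(model_props: dict[str, Any], dataset_props: dict[str, Any]) -> bool:
    """Return True when the model and dataset share a task and a keyword."""
    model_tasks = {
        MLM_TO_LABEL_TASK.get(task, task) for task in model_props.get("mlm:tasks", [])
    }
    dataset_tasks = set(dataset_props.get("label:tasks", []))
    shared_keywords = set(model_props.get("keywords", [])) & set(
        dataset_props.get("keywords", [])
    )
    return bool(model_tasks & dataset_tasks) and bool(shared_keywords)


model = {"mlm:tasks": ["semantic-segmentation"], "keywords": ["building", "polygon"]}
dataset = {"label:tasks": ["segmentation"], "keywords": ["building"]}
print(is_compatible(model, dataset))  # True
```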

## Flows

[Figure: fAIr-models workflow]

### 1. Base Model Registration (PR workflow)

```mermaid
flowchart TD
    A[Model Developer] -->|Prepares PR| B[fAIr-Models GitHub]
    B -->|CI: build, validate, test| C{Review}
    C -->|Merge| D[Post-merge CLI / CI]
    D --> E[Build + push Docker image]
    D --> F[Upload weights to S3]
    D --> G[Register STAC item in base-models]
    G --> H[STAC: base-models/model-name v1]
```

### 2. Finetuning (ZenML pipeline)

```mermaid
flowchart TD
    A[User picks base model + dataset] --> B[fAIr Backend]
    B -->|Read STAC items| C[Validate compatibility]
    C --> D[Generate ZenML YAML config]
    D --> E[ZenML Pipeline in model Docker]
    E --> F[split_dataset]
    F --> G[train_model]
    G --> H[evaluate_model]
    G --> I[export_onnx]
    G --> J[ZenML Model Control Plane]
    H --> J
    I --> J
```

### 3. Promotion to STAC

```mermaid
flowchart TD
    A[User picks best version] --> B[fAIr Backend]
    B --> C[ZenML: set stage = production]
    B --> D[StacCatalogManager]
    D --> E[Build STAC MLM item]
    D --> F[Deprecate previous version]
    D --> G[Add Version Extension links]
    E --> H[STAC: local-models/model-v3 production]
```

| ZenML action | STAC effect |
| --- | --- |
| Promote to production | Create item, deprecate previous |
| Archive version | Set `deprecated: true` on item |
| Delete version | Remove item from collection |
| Delete model | Remove all items + clean up |
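
The action-to-effect mapping can be sketched against an in-memory dictionary of simplified items. This is not the `StacCatalogManager` API; the function names, the `Items` type, and the `model` bookkeeping key are all assumptions made for this example.

```python
from typing import Any

Items = dict[str, dict[str, Any]]  # item id -> simplified item


def promote(items: Items, model: str, version: int) -> str:
    """Create the item for the promoted version and deprecate its predecessors."""
    for item in items.values():
        if item["model"] == model:
            item["properties"]["deprecated"] = True
    new_id = f"{model}-v{version}"
    items[new_id] = {
        "model": model,  # bookkeeping key for this sketch, not a STAC field
        "properties": {"version": str(version), "deprecated": False},
    }
    return new_id


def archive_version(items: Items, item_id: str) -> None:
    """Mark a single version as deprecated without removing it."""
    items[item_id]["properties"]["deprecated"] = True


def delete_version(items: Items, item_id: str) -> None:
    """Remove a single version from the collection."""
    del items[item_id]


def delete_model(items: Items, model: str) -> None:
    """Remove every item that belongs to the given model."""
    for item_id in [i for i, item in items.items() if item["model"] == model]:
        del items[item_id]


items: Items = {}
promote(items, "unet-segmentation-finetuned-banepa", 1)
promote(items, "unet-segmentation-finetuned-banepa", 2)
print({i: item["properties"]["deprecated"] for i, item in items.items()})
```

After the two promotions, the `v1` item is deprecated and the `v2` item is not, which mirrors the `local-models` listing in the catalog structure.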

### 4. Inference

Works for both base models and local models. The STAC item always has enough information to run inference: model weights, inference runtime, input/output spec.
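
As a sketch of what "enough information" could mean in practice, the function below collects those three pieces from an item dictionary. The asset and property keys follow the names used earlier in this document; the preference for the ONNX `model` asset over `checkpoint`, the function name `inference_plan`, and the example hrefs are assumptions.

```python
from typing import Any


def inference_plan(item: dict[str, Any]) -> dict[str, Any]:
    """Collect weights, runtime, and input/output spec from a STAC item dict."""
    assets = item["assets"]
    props = item["properties"]
    # Assumption: prefer the ONNX `model` asset, fall back to the torch `checkpoint`.
    weights = assets.get("model") or assets.get("checkpoint")
    if weights is None:
        raise ValueError("item has neither a `model` nor a `checkpoint` asset")
    return {
        "weights_href": weights["href"],
        "runtime_href": assets["mlm:inference"]["href"],
        "input_spec": props["mlm:input"],
        "output_spec": props["mlm:output"],
    }


example_item = {
    "properties": {"mlm:input": [{"name": "image"}], "mlm:output": [{"name": "mask"}]},
    "assets": {
        "checkpoint": {"href": "https://example.com/weights.pt"},  # hypothetical URL
        "mlm:inference": {"href": "example.com/registry/image:tag"},  # hypothetical reference
    },
}
print(inference_plan(example_item)["weights_href"])  # https://example.com/weights.pt
```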

## Identity Model

| Concept | Example | ZenML | STAC |
| --- | --- | --- | --- |
| Base model | `unet-segmentation` | Not in ZenML MCP | Item in `base-models` |
| Finetuned model | `unet-segmentation-finetuned-banepa` | ZenML Model (many versions) | Item(s) in `local-models` |
| Specific version | `unet-segmentation-finetuned-banepa` v2 | ZenML Model Version 2 | Item `unet-segmentation-finetuned-banepa-v2` |
| Dataset | `buildings-banepa-semantic-segmentation` | Not in ZenML MCP | Item in `datasets` |
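
The naming pattern in the "Specific version" row (`<model name>-v<version>`) can be expressed as a small round-trip helper. The function names are hypothetical, and the pattern is inferred from the examples in this document rather than from a stated rule.

```python
import re

# Greedy model-name match, so only the trailing "-v<digits>" is read as the version.
_ITEM_ID = re.compile(r"^(?P<model>.+)-v(?P<version>\d+)$")


def to_item_id(model_name: str, version: int) -> str:
    """Map a ZenML model name and version number to a STAC item id."""
    return f"{model_name}-v{version}"


def from_item_id(item_id: str) -> tuple[str, int]:
    """Split a STAC item id back into (model name, version number)."""
    match = _ITEM_ID.match(item_id)
    if match is None:
        raise ValueError(f"not a versioned item id: {item_id!r}")
    return match["model"], int(match["version"])


print(to_item_id("unet-segmentation-finetuned-banepa", 2))
# unet-segmentation-finetuned-banepa-v2
print(from_item_id("unet-segmentation-finetuned-banepa-v2"))
# ('unet-segmentation-finetuned-banepa', 2)
```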

## Infrastructure

| Component | Local | Production |
| --- | --- | --- |
| STAC Catalog | pystac JSON catalog | stac-fastapi + pgstac |
| ZenML | SQLite | ZenML Server (PostgreSQL) |
| Orchestrator | local | Kubernetes |
| Artifact Store | local filesystem | S3 |
| Experiment Tracker | MLflow | MLflow |
| Container Registry | local Docker | ghcr.io |

??? abstract "Architecture Decisions"

1. **STAC replaces ZenML Model Registry** : STAC is a downstream publish target via `StacCatalogManager`, not a ZenML stack component.
2. **STAC item = self-sufficient source of truth** : contains everything needed to run training or inference.
3. **Finetuned models share parent pipeline code** : only weights differ between base and local models.
4. **Standards first, `fair:*` only when needed** : prefer `mlm:tasks`, `keywords`, `classification:classes` and other MLM/STAC fields; use `fair:*` only where MLM has no equivalent (metrics vocabulary, split strategy, hyperparameter spec).
5. **YAML-based training & inference** : every run is driven by a generated config logged as a ZenML artifact.
6. **MLM Processing Expression for dispatch** : `pre_processing_function` / `post_processing_function` use Python entrypoints.
7. **Pipeline contract** : every model must export `training_pipeline` and `inference_pipeline` as `@pipeline`-decorated functions.
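
The pipeline contract in decision 7 lends itself to a simple structural check. The sketch below only verifies that both names exist and are callable; it does not confirm that they are `@pipeline`-decorated, and the helper name `missing_pipelines` is hypothetical. A `SimpleNamespace` stands in for an imported model module so the example runs without ZenML installed.

```python
from types import SimpleNamespace
from typing import Any

REQUIRED_PIPELINES = ("training_pipeline", "inference_pipeline")


def missing_pipelines(module: Any) -> list[str]:
    """Return the required pipeline names that the module does not export as callables."""
    return [
        name
        for name in REQUIRED_PIPELINES
        if not callable(getattr(module, name, None))
    ]


# Stand-in for a model package that forgot to export `inference_pipeline`.
fake_model_module = SimpleNamespace(training_pipeline=lambda: None)
print(missing_pipelines(fake_model_module))  # ['inference_pipeline']
```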
