Merged
57 commits
85b5a6e
feat(cli): add plugin catalog services
johnnygreco May 7, 2026
b732258
feat(cli): add plugins command group
johnnygreco May 7, 2026
0f5ad7e
test(cli): cover plugin catalog workflows
johnnygreco May 7, 2026
4708b34
fix(cli): align plugin taps with schema v2
johnnygreco May 7, 2026
3fccc07
fix(cli): address plugin review feedback
johnnygreco May 7, 2026
9d825c6
fix(cli): gate incompatible plugin installs
johnnygreco May 7, 2026
eaca353
style(cli): format plugin catalog files
johnnygreco May 7, 2026
4cdf522
fix(cli): reject duplicate plugin entry names
johnnygreco May 7, 2026
c94bec2
fix(cli): preserve GitHub tree tap paths
johnnygreco May 8, 2026
29d5e7b
fix(cli): verify plugin entry point names
johnnygreco May 8, 2026
70acd3d
align plugin CLI with catalog schema
johnnygreco May 8, 2026
2b1be5c
tidy plugin catalog workflow docs
johnnygreco May 8, 2026
ccc6acc
align plugin catalog CLI with package contract
johnnygreco May 8, 2026
41cc4f5
add plugin package uninstall workflow
johnnygreco May 8, 2026
62ef244
test plugin package command targets
johnnygreco May 8, 2026
e812f7b
document plugin package aliases
johnnygreco May 8, 2026
f03de7c
address plugin catalog review feedback
johnnygreco May 8, 2026
2878382
prefer runtime plugin lookup matches
johnnygreco May 8, 2026
56d628b
rename plugins command to plugin
johnnygreco May 8, 2026
1e19c89
show plugin package descriptions
johnnygreco May 8, 2026
e72a90a
rename plugin catalogs command
johnnygreco May 8, 2026
13841ba
add protected plugin package installs
johnnygreco May 8, 2026
6fe2541
document plugin package install modes
johnnygreco May 8, 2026
de1bd28
avoid building project during plugin installs
johnnygreco May 9, 2026
42864a0
harden plugin package installs
johnnygreco May 9, 2026
3622496
tighten plugin catalog contracts
johnnygreco May 9, 2026
e04a776
fix no-args help exit code
johnnygreco May 9, 2026
cd4913c
make plugin docs links robust
johnnygreco May 9, 2026
51bd7c7
document plugin CLI catalog workflows
johnnygreco May 9, 2026
18efd19
clarify plugin entry point verification
johnnygreco May 9, 2026
31ec625
simplify plugin CLI docs
johnnygreco May 9, 2026
3449d2d
narrow plugin search fields
johnnygreco May 9, 2026
787b8be
hide plugin catalog cache ttl
johnnygreco May 9, 2026
4f7d5ba
remove plugin catalog trust flag
johnnygreco May 9, 2026
9e1fe1d
improve plugin CLI recovery UX
johnnygreco May 11, 2026
0dc90ff
polish plugin catalog table display
johnnygreco May 11, 2026
e70dda4
stabilize plugin catalog table test
johnnygreco May 11, 2026
7a29966
tighten plugin catalog edge cases
johnnygreco May 11, 2026
ecbb050
harden plugin catalog verification
johnnygreco May 12, 2026
faaeea7
simplify plugin entry point verification
johnnygreco May 12, 2026
401238a
require newer uv for plugin plans
johnnygreco May 12, 2026
1bd5cba
verify pip fallback availability
johnnygreco May 12, 2026
4b4b453
polish plugin CLI status markers
johnnygreco May 12, 2026
37907ee
clarify plugin compatibility labels
johnnygreco May 12, 2026
064e239
simplify plugin info install details
johnnygreco May 12, 2026
a07a1c6
address plugin CLI review nits
johnnygreco May 12, 2026
33fb59c
support versioned plugin package installs
johnnygreco May 12, 2026
3ca6b02
share plugin install metadata rendering
johnnygreco May 12, 2026
2dd2f74
show installed plugin packages
johnnygreco May 12, 2026
2717873
harden versioned plugin installs
johnnygreco May 12, 2026
332920f
harden plugin help tests
johnnygreco May 12, 2026
3c273bc
show plugin package versions
johnnygreco May 12, 2026
5c33794
format plugin catalog tests
johnnygreco May 12, 2026
8a0f55b
harden plugin package metadata checks
johnnygreco May 13, 2026
893c70a
harden plugin CLI test coverage
johnnygreco May 13, 2026
e04bf77
add plugin discovery docs (#642)
johnnygreco May 13, 2026
c330fa0
Merge branch 'main' into johnny/feat/617/plugin-catalog-cli-core
johnnygreco May 13, 2026
58 changes: 51 additions & 7 deletions architecture/cli.md
@@ -1,12 +1,12 @@
# CLI

-The CLI (`data-designer`) provides an interactive command-line interface for configuring models, providers, tools, and personas, as well as running dataset generation. It uses a layered architecture for config management and delegates generation to the public `DataDesigner` API.
+The CLI (`data-designer`) provides an interactive command-line interface for configuring models, providers, MCP providers, and tools, downloading managed persona datasets, discovering, installing, and uninstalling plugin packages from catalogs, and running dataset generation. It uses a layered architecture for setup workflows and delegates generation to the public `DataDesigner` API.

Source: `packages/data-designer/src/data_designer/cli/`

## Overview

-The CLI is built on Typer with lazy command loading to keep startup fast. Config management commands follow a **command → controller → service → repository** layering pattern. Generation commands bypass this stack and use the public `DataDesigner` class directly.
+The CLI is built on Typer with lazy command loading to keep startup fast. Config management and plugin catalog commands follow a **command → controller → service → repository** layering pattern. Generation commands bypass this stack and use the public `DataDesigner` class directly.

## Key Components

@@ -20,9 +20,9 @@ The CLI is built on Typer with lazy command loading to keep startup fast. Config

`create_lazy_typer_group` and `_LazyCommand` stubs defer importing command modules until a command is actually invoked. This keeps `data-designer --help` fast — only the command names and descriptions are loaded eagerly; the full module (and its dependencies) loads on first use.
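
The deferred-import idea can be illustrated with a minimal, stdlib-only sketch. This is not the real `create_lazy_typer_group` or `_LazyCommand` code: the registry, the `list_commands` and `load_command` helpers, and the use of `json` and `shlex` as stand-in "command modules" are all assumptions made for illustration.

```python
import importlib
from typing import Callable

# Hypothetical registry: command name -> (module path, attribute, help text).
# Stdlib modules stand in for real command modules so the sketch runs anywhere.
LAZY_COMMANDS = {
    "dumps": ("json", "dumps", "Serialize a value to JSON"),
    "quote": ("shlex", "quote", "Shell-quote a string"),
}


def list_commands() -> dict[str, str]:
    """Build help output from the registry alone; no command module is imported."""
    return {name: help_text for name, (_, _, help_text) in LAZY_COMMANDS.items()}


def load_command(name: str) -> Callable:
    """Import the owning module only when the command is actually invoked."""
    module_path, attribute, _ = LAZY_COMMANDS[name]
    module = importlib.import_module(module_path)
    return getattr(module, attribute)
```

Rendering help only touches `list_commands`, so the cost of importing a heavy module is paid by the one command that needs it.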

-### Layering Pattern (Config Management)
+### Layering Pattern (Setup Workflows)

-Config management commands (models, providers, tools, personas) follow a consistent four-layer pattern:
+Config management commands (models, providers, MCP providers, tools) follow a consistent four-layer pattern:

| Layer | Role | Example |
|-------|------|---------|
@@ -31,10 +31,22 @@ Config management commands (models, providers, tools, personas) follow a consist
| **Service** | Domain rules: uniqueness, merge, delete-all | `ModelService.add/update/delete` over `ModelRepository` |
| **Repository** | File I/O for typed config registries | `ModelRepository` extends `ConfigRepository[ModelConfigRegistry]` |

-Repositories: `ModelRepository`, `ProviderRepository`, `ToolRepository`, `MCPProviderRepository`, `PersonaRepository`.
+Repositories: `ModelRepository`, `ProviderRepository`, `MCPProviderRepository`, and `ToolRepository`.
+`PersonaRepository` provides read-only locale metadata for managed persona dataset downloads.

Services mirror the repository domains with business logic (validation, conflict resolution).
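
As a rough sketch of how the layers divide responsibility, the following shows a repository that only does file I/O, a service that enforces a uniqueness rule, and a controller that turns the outcome into a user-facing message. The class names echo the table, but the method signatures, the `models.json` file name, and the plain-dict registry are assumptions for illustration, not the project's actual code.

```python
import json
from pathlib import Path


class ModelRepository:
    """Repository layer: file I/O only, no domain rules."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> dict[str, dict]:
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def save(self, registry: dict[str, dict]) -> None:
        self.path.write_text(json.dumps(registry, indent=2))


class ModelService:
    """Service layer: domain rules such as alias uniqueness."""

    def __init__(self, repository: ModelRepository) -> None:
        self.repository = repository

    def add(self, alias: str, config: dict) -> None:
        registry = self.repository.load()
        if alias in registry:
            raise ValueError(f"model alias already exists: {alias}")
        registry[alias] = config
        self.repository.save(registry)


class ModelController:
    """Controller layer: returns the message a thin command would print."""

    def __init__(self, home: Path) -> None:
        self.service = ModelService(ModelRepository(home / "models.json"))

    def add_model(self, alias: str, config: dict) -> str:
        try:
            self.service.add(alias, config)
        except ValueError as error:
            return f"error: {error}"
        return f"added {alias}"
```

Because the service only talks to the repository interface, a test can point the controller at a temporary directory and exercise the rules without touching real config.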

Plugin catalog commands use the same layering shape:

| Layer | Role | Example |
|-------|------|---------|
| **Command** | Thin Typer entry, wires `DATA_DESIGNER_HOME` and command options | `plugin` subcommands (`list`, `search`, `info`, `install`, `uninstall`, `installed`, `catalog`) → `PluginCatalogController(DATA_DESIGNER_HOME)` |
| **Controller** | UX flow: catalog tables, package metadata, compatibility display, install/uninstall confirmations | `PluginCatalogController` composes catalog + install services |
| **Service** | Domain rules: package listing, compatibility checks, uv/pip install and uninstall commands, runtime entry-point checks | `PluginCatalogService`, `PluginInstallService` |
| **Repository** | File/cache I/O for catalog aliases and catalog documents | `PluginCatalogRepository` |

The built-in `nvidia` catalog points at `https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json`. `NVIDIA-NeMo/DataDesignerPlugins` defines the catalog format. Each catalog entry is an installable package with docs, install metadata, compatibility constraints, and one or more runtime plugins. Users install and uninstall packages, not individual runtime plugins. Commands that take a package name also accept the package alias from the `data-designer-{alias}` package-name pattern; for example, `data-designer-calculator` can be addressed as `calculator`. If a user passes a runtime plugin name where a package is required, the CLI reports the package that owns that runtime plugin.
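
The name-resolution rules in the paragraph above can be sketched as follows. The `resolve_package_name` function, the mapping from package name to runtime plugin names, and the `calculator-column` plugin name are hypothetical; only the `data-designer-{alias}` pattern and the "report the owning package" behavior come from the text.

```python
PACKAGE_PREFIX = "data-designer-"


def resolve_package_name(name: str, packages: dict[str, list[str]]) -> str:
    """Resolve a full package name or alias to a catalog package name.

    `packages` maps package name -> runtime plugin names (hypothetical shape).
    """
    if name in packages:
        return name  # full package name
    candidate = f"{PACKAGE_PREFIX}{name}"
    if candidate in packages:
        return candidate  # alias such as "calculator"
    for package, plugins in packages.items():
        if name in plugins:
            # A runtime plugin name was passed where a package is required.
            raise LookupError(
                f"'{name}' is a runtime plugin provided by package '{package}'"
            )
    raise LookupError(f"unknown plugin package: {name}")
```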

### Generation Commands

`preview`, `create`, and `validate` commands use `GenerationController`, which:
@@ -62,6 +74,37 @@ User invokes command (e.g., `data-designer config models`)
→ Repository reads/writes config files
```

### Plugin Catalog Discovery
```
User invokes command (e.g., `data-designer plugin list`)
→ Command function wires DATA_DESIGNER_HOME and catalog options
→ PluginCatalogController resolves the catalog alias and chooses table or narrow-terminal layout
→ PluginCatalogService loads packages and filters out incompatible packages by default
→ PluginCatalogRepository reads local config and cached/remote catalog JSON
```
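
The "filter out incompatible packages by default" step in the flow above might look roughly like this. The catalog schema is not shown in this document, so the `min_python` field, the `include_incompatible` switch, and both function names are assumptions; a real catalog more likely carries version specifiers than a bare minimum version.

```python
import sys
from typing import Optional


def is_compatible(package: dict, python_version: tuple[int, int]) -> bool:
    """Hypothetical rule: the running Python must meet the package's minimum."""
    return python_version >= tuple(package["min_python"])


def list_packages(
    packages: list[dict],
    python_version: Optional[tuple[int, int]] = None,
    include_incompatible: bool = False,
) -> list[dict]:
    """Return catalog packages, hiding incompatible ones unless asked not to."""
    version = python_version or sys.version_info[:2]
    return [
        package
        for package in packages
        if include_incompatible or is_compatible(package, version)
    ]
```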

### Plugin Install/Uninstall
```
User invokes command (e.g., `data-designer plugin install calculator`)
→ PluginCatalogController resolves the plugin package name or package alias
→ PluginCatalogService evaluates Python and Data Designer compatibility
→ PluginInstallService chooses uv or pip and builds the command.
In active uv projects it uses `uv add` so the package is recorded in
`pyproject.toml`; otherwise it installs into the current Python environment.
Data Designer itself is already installed, so its packages are not reinstalled
or replaced while installing plugin dependencies.
→ PluginInstallService verifies the package's runtime plugin entry points can load
```
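
A minimal sketch of the installer choice described above, under stated assumptions: `build_install_command` and its keyword arguments are hypothetical, and the `uv pip install --python ...` branch for the "uv available, but no active project" case is a guess at one reasonable behavior. The flow above only states that `uv add` is used in active uv projects and that the package otherwise goes into the current Python environment.

```python
import sys


def build_install_command(
    package: str,
    *,
    uv_available: bool,
    in_uv_project: bool,
    python: str = sys.executable,
) -> list[str]:
    """Return the argv for installing `package` (hypothetical helper)."""
    if uv_available and in_uv_project:
        # Records the dependency in pyproject.toml.
        return ["uv", "add", package]
    if uv_available:
        # Assumed: install into the interpreter that is running the CLI.
        return ["uv", "pip", "install", "--python", python, package]
    # Fallback: pip from the current interpreter.
    return [python, "-m", "pip", "install", package]
```

Returning an argv list rather than running the command keeps the decision testable and lets a controller show the plan before asking for confirmation.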

```
User invokes command (e.g., `data-designer plugin uninstall calculator`)
→ PluginCatalogController resolves the plugin package name or package alias
→ PluginInstallService chooses uv or pip and builds the uninstall command.
Active uv projects remove the dependency from project metadata and uninstall
the package from the current environment.
→ PluginInstallService verifies the package's runtime plugin entry-point metadata is removed
```

### Generation
```
User invokes command (e.g., `data-designer create config.yaml`)
@@ -73,8 +116,9 @@ User invokes command (e.g., `data-designer create config.yaml`)
## Design Decisions

- **Lazy command loading** keeps `data-designer --help` responsive: command modules (and their heavy dependencies, such as the engine and model stacks) load only when a command is invoked, not at process startup.
-- **Controller/service/repo for config, direct API for generation** — config management benefits from the layered pattern (testable services, swappable repositories). Generation doesn't need this indirection; it delegates to the same `DataDesigner` class that Python users call directly.
-- **`DATA_DESIGNER_HOME`** centralizes all CLI-managed state (model configs, provider configs, tool configs, personas) in a single directory, defaulting to `~/.data_designer/`.
+- **Controller/service/repo for setup workflows, direct API for generation** — config and plugin catalog workflows benefit from the layered pattern (testable services, swappable repositories). Generation doesn't need this indirection; it delegates to the same `DataDesigner` class that Python users call directly.
+- **`DATA_DESIGNER_HOME`** centralizes CLI-managed state (model configs, provider configs, MCP provider configs, tool configs, managed assets, plugin catalog aliases, and catalog caches) in a single directory, defaulting to `~/.data-designer/`.
+- **Package-first plugin catalogs** match how users install plugins: one package can provide one or more runtime plugins, but install and uninstall commands always target the package.
- **Rich-based UI** provides formatted tables, progress bars, and interactive prompts without requiring a web interface.
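
For the `DATA_DESIGNER_HOME` decision, a single-directory default might be resolved along these lines. This is a sketch under assumptions: the document does not say how the value is set, so treating `DATA_DESIGNER_HOME` as an environment-variable override is a guess, `resolve_home` is a hypothetical helper, and the default path is the `~/.data-designer/` value from the updated bullet.

```python
import os
from collections.abc import Mapping
from pathlib import Path
from typing import Optional


def resolve_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Pick the directory that holds CLI-managed state (hypothetical helper)."""
    env = os.environ if env is None else env
    override = env.get("DATA_DESIGNER_HOME")  # assumed override mechanism
    if override:
        return Path(override).expanduser()
    return Path.home() / ".data-designer"
```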

## Cross-References
20 changes: 0 additions & 20 deletions docs/plugins/available.md

This file was deleted.

101 changes: 101 additions & 0 deletions docs/plugins/discover.md
@@ -0,0 +1,101 @@
# Discover Plugins

The Data Designer CLI is the recommended way to discover and install published plugins. It uses plugin catalogs to show install details and compatibility before installing the selected plugin package into your current environment or active `uv` project.

Plugins are distributed as Python packages. A single package can expose one or more runtime plugins, so the CLI installs and uninstalls packages rather than individual runtime plugin names.

## NVIDIA catalog

The default `nvidia` catalog is maintained in the [DataDesignerPlugins repository](https://github.com/NVIDIA-NeMo/DataDesignerPlugins). You do not need to configure it before using the CLI.

You can also browse the first-party [plugin documentation](https://nvidia-nemo.github.io/DataDesignerPlugins/plugins/) and [plugin package source](https://github.com/NVIDIA-NeMo/DataDesignerPlugins/tree/main/plugins) directly.

## Find a plugin package

When a CLI command requires a plugin package argument, you can pass either the full package name or the package alias. The package alias is the package name without the `data-designer-` prefix. For example, `data-designer-github` can be addressed as `github`.

Start by listing or searching the compatible packages in the default catalog. Search can match package names, package aliases, descriptions, runtime plugin names, and runtime plugin types.

```bash
# List compatible plugin packages from the default NVIDIA catalog
data-designer plugin list

# Search for a package
data-designer plugin search github

# Inspect one package before installing it
data-designer plugin info github
```
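
To make the search fields concrete, here is a small sketch of matching a query against the fields listed above. The dictionary shape, the case-insensitive substring rule, and the example plugin name and type are assumptions; the real catalog schema and matching logic may differ.

```python
def matches(package: dict, query: str) -> bool:
    """Case-insensitive substring match over the searchable fields."""
    needle = query.casefold()
    alias = package["name"].removeprefix("data-designer-")
    fields = [package["name"], alias, package.get("description", "")]
    for plugin in package.get("plugins", []):
        # Runtime plugin names and types are searchable too.
        fields += [plugin["name"], plugin["type"]]
    return any(needle in field.casefold() for field in fields)
```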

## Install a plugin package

Install the package by full package name or package alias:

```bash
data-designer plugin install github
```

After installation, Data Designer discovers the package's `data_designer.plugins` entry points. Use `installed` to see the plugin packages available in the current Python environment and the runtime plugins they expose:

```bash
data-designer plugin installed
```

Uninstall with the same package name or alias:

```bash
data-designer plugin uninstall github
```

!!! note
    Plugins are ordinary Python packages. You can still publish a plugin to PyPI or another package index and install it directly with `pip` or `uv`. This is the path we recommend for individual plugin developers from the community. See [Community plugins](#community-plugins) below.

## How catalogs work

A plugin catalog is a JSON file that tells Data Designer which plugin packages are available and how to install them. The catalog can be hosted anywhere that serves raw JSON. Each entry points to an installable Python package and includes its docs URL, Python and Data Designer compatibility requirements, the runtime plugins it exposes after installation, and the installer metadata needed to fetch the package.

The package itself can live in any Python package index, or be referenced with any valid [PEP 508 direct reference](https://packaging.python.org/en/latest/specifications/dependency-specifiers/#direct-references). The package does not have to live in the same repository as the catalog.
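
The two ways of locating a package translate into different requirement strings. The sketch below builds a plain name, a pinned version, or a direct reference in the `name @ url` form from the dependency specifier specification. The `requirement_string` helper is hypothetical and says nothing about how catalog entries actually store this information.

```python
from typing import Optional


def requirement_string(
    name: str,
    *,
    version: Optional[str] = None,
    url: Optional[str] = None,
) -> str:
    """Build a requirement string for an installer (hypothetical helper)."""
    if url is not None:
        # Direct reference: the installer fetches this exact artifact.
        return f"{name} @ {url}"
    if version is not None:
        # Index-backed package pinned to one version.
        return f"{name}=={version}"
    # Index-backed package, latest compatible version.
    return name
```

For example, `requirement_string("data-designer-github", url="https://example.com/pkg.whl")` yields a string that `pip` or `uv` accepts in place of a bare package name; the URL here is a placeholder.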

The NVIDIA catalog is published at:

```text
https://nvidia-nemo.github.io/DataDesignerPlugins/catalog/plugins.json
```

The NVIDIA plugin packages are served from a PyPI-compatible Python Simple API index published beside that catalog:

```text
https://nvidia-nemo.github.io/DataDesignerPlugins/simple/
```

Catalog discovery and runtime plugin discovery are separate. Reading a catalog lets the CLI show available packages and install plans without importing plugin code. Runtime plugins become available only after their package is installed and Data Designer discovers the package's `data_designer.plugins` entry points.

Other catalogs can follow the same pattern as the NVIDIA plugin repository: publish a raw `catalog/plugins.json` file and, for index-backed packages, a PyPI-compatible hosted package index. Catalog entries can also point to packages on the installer's default index or to direct package references.

## Use another catalog

Add a catalog when a team or community publishes a compatible catalog JSON file. For example, an internal platform team might publish a catalog that lists approved Data Designer plugin packages and points each package at an internal Python package index. Teammates can then add that catalog once and install approved plugins by package name or alias.

Choose a short catalog name and use it with `--catalog`:

```bash
data-designer plugin catalog add <name> <catalog-url-or-path>
data-designer plugin --catalog <name> list
data-designer plugin --catalog <name> install <package-or-alias>
```

For published catalogs, prefer sharing the raw catalog JSON URL. Local catalog files and directories are useful while authoring or testing a catalog before publishing it.

```bash
# See configured catalog names
data-designer plugin catalog list

# Remove a catalog
data-designer plugin catalog remove <name>
```
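
Because a catalog source can be a URL, a local file, or a local directory, reading one comes down to a small dispatch. The following sketch is not the CLI's implementation: `read_catalog` is a hypothetical helper, caching is omitted, and resolving a directory to `catalog/plugins.json` inside it is an assumption based on the publishing layout described earlier.

```python
import json
from pathlib import Path
from typing import Any
from urllib.parse import urlparse
from urllib.request import urlopen


def read_catalog(source: str) -> Any:
    """Load catalog JSON from an http(s) URL or a local path."""
    if urlparse(source).scheme in {"http", "https"}:
        with urlopen(source, timeout=10) as response:
            return json.load(response)
    path = Path(source).expanduser()
    if path.is_dir():
        # Assumed layout for a local catalog directory.
        path = path / "catalog" / "plugins.json"
    return json.loads(path.read_text(encoding="utf-8"))
```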

## Community plugins

We do not have any community plugins to list here yet, but yours could be the first! If you build a plugin that could be useful to other Data Designer users, we would love to hear about it.

To get started, follow the patterns in the [plugin overview](overview.md) and [Build Your Own](build_your_own.md) guides, then publish your plugin package to PyPI. When your plugin is ready, open an issue on the [Data Designer GitHub repository](https://github.com/NVIDIA-NeMo/DataDesigner/issues) with the package name, source repository, documentation link, supported Data Designer versions, and the plugin types it provides. The Data Designer team will review the plugin and add it here if it seems generally useful for the community.
4 changes: 2 additions & 2 deletions docs/plugins/overview.md
@@ -12,7 +12,7 @@ Data Designer supports three plugin types:

## Use an Installed Plugin

-Plugin packages register their `Plugin` objects through Python package [entry points](https://packaging.python.org/en/latest/guides/creating-and-discovering-plugins/#using-package-metadata). Data Designer discovers installed plugin entry points automatically, so no extra registration code is required. Simply install the plugin package and use its new object types in your Data Designer workflow.
+Plugin packages register their `Plugin` objects through Python package [entry points](https://packaging.python.org/en/latest/guides/creating-and-discovering-plugins/#using-package-metadata). Data Designer discovers installed plugin entry points automatically, so no extra registration code is required. Once a plugin package is installed, use its new object types in your Data Designer workflow.

If you install a plugin after `data_designer` has already been imported, restart the Python process so plugin discovery can rebuild from the new entry points.

@@ -22,7 +22,7 @@ For implementation instructions across all plugin types, see the [Build Your Own

## Find Plugins

-NVIDIA-maintained plugin packages live in the [DataDesignerPlugins](https://github.com/NVIDIA-NeMo/DataDesignerPlugins) repository. See [Available Plugins](available.md) for lists of first-party and community-contributed plugins.
+Use the Data Designer CLI to discover and install published plugin packages from catalogs. See [Discover Plugins](discover.md) for the catalog workflow, first-party plugin documentation, and source links.

## Discovery troubleshooting

3 changes: 0 additions & 3 deletions fern/docs.yml
@@ -340,9 +340,6 @@ redirects:
destination: "/nemo/datadesigner/plugins/file-system-seed-reader-plugins"
- source: "/nemo/datadesigner/plugins/example"
destination: "/nemo/datadesigner/plugins/example-plugin"
-- source: "/nemo/datadesigner/plugins/available"
-destination: "/nemo/datadesigner/plugins/available-plugin-list"

# Code Reference: mkdocstrings tree -> Fern Topic Overviews subsection.
# Underscored page names get kebab'd at the page-slug level too (Fern's title
# slugifier drops underscores), so the snake_case modules need per-page rules.
4 changes: 2 additions & 2 deletions fern/versions/latest.yml
@@ -137,8 +137,8 @@ navigation:
path: ./v0.5.8/pages/plugins/example.mdx
- page: FileSystemSeedReader Plugins
path: ./v0.5.8/pages/plugins/filesystem_seed_reader.mdx
-- page: Available Plugin List
-path: ./v0.5.8/pages/plugins/available.mdx
+- page: Discover
+path: ./v0.5.8/pages/plugins/discover.mdx
- section: Code Reference
contents:
- section: Topic Overviews
4 changes: 2 additions & 2 deletions fern/versions/v0.5.8.yml
@@ -115,8 +115,8 @@ navigation:
path: ./v0.5.8/pages/plugins/example.mdx
- page: FileSystemSeedReader Plugins
path: ./v0.5.8/pages/plugins/filesystem_seed_reader.mdx
-- page: Available Plugin List
-path: ./v0.5.8/pages/plugins/available.mdx
+- page: Discover
+path: ./v0.5.8/pages/plugins/discover.mdx
- section: Code Reference
contents:
- section: Topic Overviews
6 changes: 0 additions & 6 deletions fern/versions/v0.5.8/pages/plugins/available.mdx

This file was deleted.
