Commit 307b3d7

Add torchx.plugins: @register decorator + namespace-package plugin discovery
Differential Revision: D95338096
Pull Request resolved: #1248
1 parent b3b5388 commit 307b3d7

31 files changed

Lines changed: 2806 additions & 170 deletions

docs/source/advanced.rst

Lines changed: 216 additions & 60 deletions
@@ -4,9 +4,9 @@ Advanced Usage
44
.. tip::
55

66
This guide covers TorchX's extension points: registering custom schedulers,
7-
named resources, components, trackers, and CLI commands via Python
8-
:term:`entry points <Entry Point>` -- a standard packaging mechanism that
9-
lets installed packages advertise plugins.
7+
named resources, components, trackers, and CLI commands. The recommended
8+
approach is the ``@register`` decorator with ``torchx_plugins.*`` namespace
9+
packages. See :doc:`plugins` for the full API reference.
1010

1111
**Audience:** Platform engineers who want to integrate TorchX with custom
1212
infrastructure. If you only need to **use** TorchX to launch jobs, the
@@ -15,71 +15,63 @@ sufficient.
1515

1616
**Prerequisites:** :doc:`basics` (core concepts) and :doc:`custom_components`.
1717

18+
.. deprecated::
19+
Entry-point based plugin registration (``[torchx.*]`` sections in
20+
``setup.py`` / ``pyproject.toml``) is deprecated. Use the ``@register``
21+
decorator with ``torchx_plugins.*`` namespace packages instead.
22+
23+
Namespace plugins are **always** loaded. By default
24+
(``TORCHX_NO_ENTRYPOINTS=0`` or unset), entry points are also loaded
25+
for backward compatibility. Set ``TORCHX_NO_ENTRYPOINTS=1`` to load
26+
only namespace plugins. Entry-point loading will be removed in a
27+
future release. See :doc:`plugins` for the full API reference.
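The opt-out described in the deprecation note can be pictured as a simple environment check. This is a minimal sketch, not TorchX's actual code: the helper name ``entry_points_enabled`` is hypothetical, and it assumes that only the literal value ``1`` disables entry-point loading (the diff does not show how other values are treated).

```python
import os
from typing import Mapping, Optional


def entry_points_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True unless TORCHX_NO_ENTRYPOINTS is set to "1".

    Assumption: only the literal value "1" disables entry-point loading;
    unset or "0" keeps the backward-compatible behaviour.
    """
    env = os.environ if environ is None else environ
    return env.get("TORCHX_NO_ENTRYPOINTS", "0") != "1"


# Unset: entry points are still loaded alongside namespace plugins.
print(entry_points_enabled({}))
# Set to "1": only namespace plugins would be loaded.
print(entry_points_enabled({"TORCHX_NO_ENTRYPOINTS": "1"}))
```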
28+
1829
.. code-block:: text
1930
2031
┌──────────────────────────────────────────────────────────────┐
2132
│ TorchX Extension Points │
2233
│ │
23-
│ Entry-Point Group What You Register │
34+
│ @register Decorator What You Register │
35+
│ ────────────────────── ────────────────────────────── │
36+
│ @register.scheduler() Scheduler factory function │
37+
│ @register.named_resource Resource factory function │
38+
│ @register.tracker() Tracker factory function │
39+
│ │
40+
│ Entry Points (not yet migrated to @register) │
2441
│ ────────────────────── ────────────────────────────── │
25-
│ torchx.schedulers Scheduler factory function │
26-
│ torchx.named_resources Resource factory function │
2742
│ torchx.components Component module path │
28-
│ torchx.tracker Tracker factory function │
2943
│ torchx.cli.cmds SubCommand class │
3044
│ │
31-
│ ┌───────────────────────────┐ │
32-
│ │ setup.py / pyproject.toml │ ◄── register here │
33-
│ └─────────────┬─────────────┘ │
34-
│ │ │
35-
│ ▼ │
45+
│ ┌──────────────────────────────────────────┐ │
46+
│ │ torchx_plugins/<group>/<module>.py │ ◄── write here │
47+
│ │ @register.scheduler() │ │
48+
│ │ def my_scheduler(...): ... │ │
49+
│ └──────────────────┬───────────────────────┘ │
50+
│ │ │
51+
│ ▼ │
3652
│ ┌───────────────────────┐ │
3753
│ │ pip install . │ ◄── install package │
3854
│ └───────────┬───────────┘ │
39-
│ │
40-
│ ▼
41-
│ ┌────────────────────────────────────────────────────┐
42-
│ │ TorchX Runtime Discovery
43-
│ │
44-
│ │ Runner ──► discovers schedulers, resources
45-
│ │ CLI ──► discovers components, subcommands
46-
│ │ AppRun ──► discovers tracker backends
47-
│ └────────────────────────────────────────────────────┘
55+
│ │ │
56+
│ ▼ │
57+
│ ┌───────────────────────────────────────────────────
58+
│ │ TorchX Runtime Discovery
59+
│ │
60+
│ │ Runner ──► discovers schedulers, resources
61+
│ │ CLI ──► discovers components, subcommands
62+
│ │ AppRun ──► discovers tracker backends
63+
│ └───────────────────────────────────────────────────
4864
└──────────────────────────────────────────────────────────────┘
4965
50-
Most configuration is done through Python's
51-
`entry points <https://packaging.python.org/specifications/entry-points/>`__
52-
-- a standard mechanism that lets installed packages advertise plugins for
53-
automatic discovery at runtime.
54-
55-
.. note::
56-
57-
Entry points require an installed Python package.
58-
59-
The entry points below can be specified in ``setup.py`` or ``pyproject.toml``.
60-
Each section shows both formats.
61-
62-
.. code-block:: python
63-
64-
from setuptools import setup
65-
66-
setup(
67-
name="project foobar",
68-
version="0.0.1",
69-
entry_points={
70-
"torchx.schedulers": [
71-
"my_scheduler = my.custom.scheduler:create_scheduler",
72-
],
73-
"torchx.named_resources": [
74-
"gpu_x2 = my_module.resources:gpu_x2",
75-
],
76-
}
77-
)
66+
Plugins are discovered from ``torchx_plugins.*`` namespace packages.
67+
Decorate your factory with ``@register.<type>()`` and TorchX finds it
68+
automatically after ``pip install``.
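To make the discovery step concrete, the sketch below shows the general Python mechanism for finding modules under a native namespace package that several distributions contribute to. It is an illustration under stated assumptions, not TorchX's implementation: the namespace ``demo_plugins``, the two fake distributions, and the helpers ``make_dist`` and ``discover`` are all hypothetical, and the "install" is simulated by adding temporary directories to ``sys.path``.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def make_dist(root: Path, dist: str, module: str) -> Path:
    """Create <root>/<dist>/demo_plugins/schedulers/<module>.py with no __init__.py."""
    pkg_dir = root / dist / "demo_plugins" / "schedulers"
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text(f"NAME = {module!r}\n")
    return root / dist


def discover(namespace: str) -> dict:
    """Import every module found under a namespace package, keyed by full name."""
    pkg = importlib.import_module(namespace)
    found = {}
    # __path__ of a namespace package spans every directory that contributes to it.
    for info in pkgutil.iter_modules(pkg.__path__, prefix=f"{namespace}."):
        found[info.name] = importlib.import_module(info.name)
    return found


# Simulate two independently installed distributions sharing one namespace.
tmp = tempfile.TemporaryDirectory()
root = Path(tmp.name)
for dist, module in [("dist_a", "alpha"), ("dist_b", "beta")]:
    sys.path.insert(0, str(make_dist(root, dist, module)))
importlib.invalidate_caches()

modules = discover("demo_plugins.schedulers")
names = sorted(m.NAME for m in modules.values())
print(names)
```

Importing each module is what would give a decorator such as ``@register.scheduler()`` the chance to run; how TorchX caches or orders this is not shown in the diff.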
7869

7970

8071

8172
Registering Custom Schedulers
8273
--------------------------------
74+
8375
Implement the :py:class:`~torchx.schedulers.Scheduler` interface (see
8476
:ref:`implementing-scheduler` for a full skeleton). The factory function
8577
signature:
@@ -91,7 +83,23 @@ signature:
9183
def create_scheduler(session_name: str, **kwargs: object) -> Scheduler:
9284
return MyScheduler(session_name, **kwargs)
9385
94-
Register it via entry points:
86+
**Recommended: ``@register`` decorator**
87+
88+
Place your scheduler in a ``torchx_plugins/schedulers/`` namespace package
89+
and decorate it:
90+
91+
.. code-block:: python
92+
93+
# torchx_plugins/schedulers/my_scheduler.py
94+
from torchx.plugins import register
95+
96+
@register.scheduler()
97+
def my_scheduler(session_name: str, **kwargs) -> Scheduler:
98+
return MyScheduler(session_name, **kwargs)
99+
100+
After ``pip install``, TorchX discovers it automatically.
101+
102+
**Legacy: entry points** *(deprecated)*
95103

96104
.. testcode::
97105

@@ -125,8 +133,39 @@ Registering Named Resources
125133
-------------------------------
126134

127135
A :term:`Named Resource <Resource>` maps a human-readable name (e.g.
128-
``gpu_x2``) to a :py:class:`~torchx.specs.Resource`. For example, on an AWS
129-
cluster with p3.16xlarge nodes:
136+
``gpu_x2``) to a :py:class:`~torchx.specs.Resource`.
137+
138+
**Recommended: ``@register`` decorator**
139+
140+
Place your resources in a ``torchx_plugins/named_resources/`` namespace package
141+
and decorate them:
142+
143+
.. code-block:: python
144+
145+
# torchx_plugins/named_resources/my_cluster.py
146+
from torchx.plugins import register, WHOLE
147+
from torchx.specs import Resource
148+
149+
@register.named_resource(fractionals=register.powers_of_two_gpus)
150+
def gpu_x(fractional: float = WHOLE) -> Resource:
151+
return Resource(
152+
cpu=int(64 * fractional),
153+
gpu=int(8 * fractional),
154+
memMB=int(488_000 * fractional),
155+
)
156+
# Registers: gpu_x (base), gpu_x_8, gpu_x_4, gpu_x_2, gpu_x_1
157+
158+
@register.named_resource()
159+
def cpu_x32() -> Resource:
160+
return Resource(cpu=32, gpu=0, memMB=131072)
161+
162+
The ``fractionals`` argument auto-generates fractional variants. See
163+
:py:meth:`~torchx.plugins.register.powers_of_two_gpus` and
164+
:py:meth:`~torchx.plugins.register.halve_mem_down_to` in :doc:`plugins`.
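The arithmetic behind fractional variants can be sketched as follows. This is an illustration only: ``expand_powers_of_two`` is a hypothetical helper, the ``Resource`` dataclass is a stand-in for ``torchx.specs.Resource``, and the ``<name>_<gpus>`` naming is inferred solely from the ``# Registers:`` comment in the example above.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Resource:
    """Stand-in for torchx.specs.Resource with only the fields used here."""

    cpu: int
    gpu: int
    memMB: int


def gpu_x(fractional: float = 1.0) -> Resource:
    # Same scaling as the documented example: a whole host is 64 CPU / 8 GPU.
    return Resource(
        cpu=int(64 * fractional),
        gpu=int(8 * fractional),
        memMB=int(488_000 * fractional),
    )


def expand_powers_of_two(
    factory: Callable[[float], Resource], whole_gpus: int
) -> Dict[str, Callable[[], Resource]]:
    """Hypothetical helper: one variant per power-of-two GPU count, plus the base name."""
    variants: Dict[str, Callable[[], Resource]] = {factory.__name__: lambda: factory(1.0)}
    gpus = whole_gpus
    while gpus >= 1:
        fraction = gpus / whole_gpus
        # Bind the fraction as a default argument so each lambda keeps its own value.
        variants[f"{factory.__name__}_{gpus}"] = lambda f=fraction: factory(f)
        gpus //= 2
    return variants


variants = expand_powers_of_two(gpu_x, whole_gpus=8)
print(sorted(variants))
print(variants["gpu_x_2"]())
```

With a whole host of 8 GPUs, the ``gpu_x_2`` variant corresponds to a fraction of 0.25, so it yields 16 CPUs, 2 GPUs, and 122000 MB.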
165+
166+
**Legacy: entry points** *(deprecated)*
167+
168+
For example, on an AWS cluster with p3.16xlarge nodes:
130169

131170
.. testcode:: python
132171

@@ -246,6 +285,7 @@ suggestion of close matches.
246285

247286
Registering Custom Components
248287
-------------------------------
288+
249289
Register custom components as CLI builtins:
250290

251291
.. code-block:: shell-session
@@ -431,7 +471,7 @@ subclassing :py:class:`~torchx.tracker.api.TrackerBase`.
431471
432472
def run_ids(self, **kwargs: str) -> Iterable[str]: ...
433473
434-
**Factory function.** Each entry point must point to a factory:
474+
**Factory function.** Each tracker plugin must provide a factory:
435475

436476
.. code-block:: python
437477
@@ -440,8 +480,23 @@ subclassing :py:class:`~torchx.tracker.api.TrackerBase`.
440480
def create(config: str | None) -> TrackerBase:
441481
return MyTracker(connection_str=config or "default://localhost")
442482
443-
**Entry-point registration.** Register the factory under the ``torchx.tracker``
444-
group:
483+
**Recommended: ``@register`` decorator**
484+
485+
Place your tracker in a ``torchx_plugins/tracker/`` namespace package
486+
and decorate the factory:
487+
488+
.. code-block:: python
489+
490+
# torchx_plugins/tracker/my_tracker.py
491+
from torchx.plugins import register
492+
493+
@register.tracker()
494+
def my_tracker(config: str | None) -> TrackerBase:
495+
return MyTracker(connection_str=config or "default://localhost")
496+
497+
**Legacy: entry points** *(deprecated)*
498+
499+
Register the factory under the ``torchx.tracker`` group:
445500

446501
.. code-block:: python
447502
@@ -508,8 +563,7 @@ Registering Custom CLI Commands
508563
----------------------------------
509564

510565
Extend the ``torchx`` CLI by implementing
511-
:py:class:`~torchx.cli.cmd_base.SubCommand` and registering via the
512-
``torchx.cli.cmds`` entry-point group.
566+
:py:class:`~torchx.cli.cmd_base.SubCommand`.
513567

514568
**The SubCommand ABC** defines two abstract methods:
515569

@@ -528,8 +582,8 @@ Extend the ``torchx`` CLI by implementing
528582
"""Execute the command with parsed arguments."""
529583
print(f"Running my_tool on {args.app_id} with config={args.config}")
530584
531-
**Entry-point registration.** Register the class (not a factory) under
532-
``torchx.cli.cmds``. The key becomes the subcommand name:
585+
Register the class (not a factory) under ``torchx.cli.cmds``. The key
586+
becomes the subcommand name:
533587

534588
.. code-block:: python
535589
@@ -562,8 +616,111 @@ The default built-in commands are: ``builtins``, ``cancel``, ``configure``,
562616
``delete``, ``describe``, ``list``, ``log``, ``run``, ``runopts``, ``status``,
563617
and ``tracker``.
564618

619+
620+
Packaging a Plugin
621+
---------------------
622+
623+
To create a TorchX plugin distribution, use
624+
`native namespace packages <https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages>`_
625+
-- do **not** add ``__init__.py`` files under ``torchx_plugins/``. This allows
626+
multiple independent distributions to contribute to the same namespace.
627+
628+
Structure your project like:
629+
630+
.. code-block:: text
631+
632+
my-torchx-plugin/
633+
├── pyproject.toml
634+
└── src/
635+
└── torchx_plugins/ # NO __init__.py
636+
└── schedulers/ # NO __init__.py
637+
└── my_scheduler.py # uses @register.scheduler()
638+
639+
Configure the build backend to include ``src/torchx_plugins`` as a package.
640+
For example with ``hatchling``:
641+
642+
.. code-block:: toml
643+
644+
[build-system]
645+
requires = ["hatchling"]
646+
build-backend = "hatchling.build"
647+
648+
[project]
649+
name = "my-torchx-plugin"
650+
version = "0.1.0"
651+
dependencies = ["torchx"]
652+
653+
[tool.hatch.build.targets.wheel]
654+
packages = ["src/torchx_plugins"]
655+
656+
After ``pip install my-torchx-plugin``, TorchX will automatically discover
657+
your plugins from the ``@register``-decorated functions.
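A stray ``__init__.py`` is the easiest way to break this layout, so a quick check that a package really is a native namespace package can be useful. The sketch below relies only on standard import behaviour; the helper ``is_namespace_package`` and the two demo package names are hypothetical, and the packages are created in a temporary directory purely for demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def is_namespace_package(name: str) -> bool:
    """True if `name` imports as a package that has no __init__.py behind it."""
    module = importlib.import_module(name)
    # Namespace packages expose __path__ but have no backing file.
    return hasattr(module, "__path__") and getattr(module, "__file__", None) is None


# Build one namespace-style and one regular package for comparison.
tmp = tempfile.TemporaryDirectory()
root = Path(tmp.name)
(root / "demo_ns_layout").mkdir()
(root / "demo_regular_layout").mkdir()
(root / "demo_regular_layout" / "__init__.py").write_text("")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

print(is_namespace_package("demo_ns_layout"))
print(is_namespace_package("demo_regular_layout"))
```

In an environment where a plugin distribution is installed, calling ``is_namespace_package("torchx_plugins")`` and getting ``False`` would suggest that some distribution shipped an ``__init__.py`` under ``torchx_plugins/``.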
658+
659+
**Single-project layout** — You don't need a separate package just for
660+
plugins. A single project can ship both your application code and a
661+
``torchx_plugins/`` namespace package side-by-side. The key is to list
662+
**both** packages in the build backend's ``packages`` configuration so the
663+
wheel includes them together.
664+
665+
The example below uses a `UV <https://docs.astral.sh/uv/>`_ project with
666+
``hatchling``. For other build backends (e.g. ``setuptools``), consult their
667+
documentation for the equivalent package-discovery configuration.
668+
669+
.. code-block:: text
670+
671+
my-project/
672+
├── pyproject.toml
673+
├── my_training/ # your application code (regular package)
674+
│ ├── __init__.py
675+
│ ├── train.py
676+
│ └── model.py
677+
└── torchx_plugins/ # NO __init__.py (namespace package)
678+
├── schedulers/ # NO __init__.py
679+
│ └── my_scheduler.py
680+
└── named_resources/ # NO __init__.py
681+
└── my_cluster.py
682+
683+
A single ``pyproject.toml`` manages both packages:
684+
685+
.. code-block:: toml
686+
687+
[project]
688+
name = "my-project"
689+
version = "0.1.0"
690+
dependencies = ["torchx", "torch"]
691+
692+
[build-system]
693+
requires = ["hatchling"]
694+
build-backend = "hatchling.build"
695+
696+
[tool.hatch.build.targets.wheel]
697+
packages = ["my_training", "torchx_plugins"]
698+
699+
After ``uv sync``, both ``my_training`` and the TorchX plugins are installed
700+
into the same virtual environment. TorchX discovers the
701+
``@register``-decorated plugins automatically — no entry points needed.
702+
703+
704+
Diagnostics
705+
-------------
706+
707+
Print a diagnostic report of all discovered plugins:
708+
709+
.. code-block:: python
710+
711+
from torchx import plugins
712+
print(plugins.registry())
713+
714+
This lists every discovered plugin, its source module, and the distribution
715+
package it belongs to. Errors encountered during discovery are reported at
716+
the bottom.
717+
718+
565719
.. seealso::
566720

721+
:doc:`plugins`
722+
Plugin API reference (``@register``, ``find()``, ``PluginRegistry``).
723+
567724
:doc:`cli`
568725
CLI module API reference.
569726

@@ -584,4 +741,3 @@ and ``tracker``.
584741

585742
:doc:`component_best_practices`
586743
Best practices for authoring reusable components.
587-
