@@ -4,9 +4,9 @@ Advanced Usage
44.. tip::
55
66 This guide covers TorchX's extension points: registering custom schedulers,
7- named resources, components, trackers, and CLI commands via Python
8- :term:`entry points <Entry Point>` -- a standard packaging mechanism that
9- lets installed packages advertise plugins.
7+ named resources, components, trackers, and CLI commands. The recommended
8+ approach is the ``@register`` decorator with ``torchx_plugins.*`` namespace
9+ packages. See :doc:`plugins` for the full API reference.
1010
1111**Audience:** Platform engineers who want to integrate TorchX with custom
1212infrastructure. If you only need to **use** TorchX to launch jobs, the
@@ -15,71 +15,63 @@ sufficient.
1515
1616**Prerequisites:** :doc:`basics` (core concepts) and :doc:`custom_components`.
1717
18+ .. deprecated::
19+ Entry-point based plugin registration (``[torchx.*]`` sections in
20+ ``setup.py`` / ``pyproject.toml``) is deprecated. Use the ``@register``
21+ decorator with ``torchx_plugins.*`` namespace packages instead.
22+
23+ Namespace plugins are **always** loaded. By default
24+ (``TORCHX_NO_ENTRYPOINTS=0`` or unset), entry points are also loaded
25+ for backward compatibility. Set ``TORCHX_NO_ENTRYPOINTS=1`` to load
26+ only namespace plugins. Entry-point loading will be removed in a
27+ future release. See :doc:`plugins` for the full API reference.
28+
1829.. code-block:: text
1930
2031 ┌──────────────────────────────────────────────────────────────┐
2132 │ TorchX Extension Points │
2233 │ │
23- │ Entry-Point Group What You Register │
34+ │ @register Decorator What You Register │
35+ │ ────────────────────── ────────────────────────────── │
36+ │ @register.scheduler() Scheduler factory function │
37+ │ @register.named_resource Resource factory function │
38+ │ @register.tracker() Tracker factory function │
39+ │ │
40+ │ Entry Points (not yet migrated to @register) │
2441 │ ────────────────────── ────────────────────────────── │
25- │ torchx.schedulers Scheduler factory function │
26- │ torchx.named_resources Resource factory function │
2742 │ torchx.components Component module path │
28- │ torchx.tracker Tracker factory function │
2943 │ torchx.cli.cmds SubCommand class │
3044 │ │
31- │ ┌───────────────────────────┐ │
32- │ │ setup.py / pyproject.toml │ ◄── register here │
33- │ └─────────────┬─────────────┘ │
34- │ │ │
35- │ ▼ │
45+ │ ┌──────────────────────────────────────────┐ │
46+ │ │ torchx_plugins/<group>/<module>.py │ ◄── write here │
47+ │ │ @register.scheduler() │ │
48+ │ │ def my_scheduler(...): ... │ │
49+ │ └──────────────────┬───────────────────────┘ │
50+ │ │ │
51+ │ ▼ │
3652 │ ┌───────────────────────┐ │
3753 │ │ pip install . │ ◄── install package │
3854 │ └───────────┬───────────┘ │
39- │ │ │
40- │ ▼ │
41- │ ┌────────────────────────────────────────────────────┐ │
42- │ │ TorchX Runtime Discovery │ │
43- │ │ │ │
44- │ │ Runner ──► discovers schedulers, resources │ │
45- │ │ CLI ──► discovers components, subcommands │ │
46- │ │ AppRun ──► discovers tracker backends │ │
47- │ └────────────────────────────────────────────────────┘ │
55+ │ │ │
56+ │ ▼ │
57+ │ ┌───────────────────────────────────────────────────┐ │
58+ │ │ TorchX Runtime Discovery │ │
59+ │ │ │ │
60+ │ │ Runner ──► discovers schedulers, resources │ │
61+ │ │ CLI ──► discovers components, subcommands │ │
62+ │ │ AppRun ──► discovers tracker backends │ │
63+ │ └───────────────────────────────────────────────────┘ │
4864 └──────────────────────────────────────────────────────────────┘
4965
50- Most configuration is done through Python's
51- `entry points <https://packaging.python.org/specifications/entry-points/>`__
52- -- a standard mechanism that lets installed packages advertise plugins for
53- automatic discovery at runtime.
54-
55- .. note::
56-
57- Entry points require an installed Python package.
58-
59- The entry points below can be specified in ``setup.py`` or ``pyproject.toml``.
60- Each section shows both formats.
61-
62- .. code-block:: python
63-
64- from setuptools import setup
65-
66- setup(
67-     name="project foobar",
68-     version="0.0.1",
69-     entry_points={
70-         "torchx.schedulers": [
71-             "my_scheduler = my.custom.scheduler:create_scheduler",
72-         ],
73-         "torchx.named_resources": [
74-             "gpu_x2 = my_module.resources:gpu_x2",
75-         ],
76-     }
77- )
66+ Plugins are discovered from ``torchx_plugins.*`` namespace packages.
67+ Decorate your factory with ``@register.<type>()`` and TorchX finds it
68+ automatically after ``pip install``.
7869
7970
8071
8172Registering Custom Schedulers
8273--------------------------------
74+
8375Implement the :py:class:`~torchx.schedulers.Scheduler` interface (see
8476:ref:`implementing-scheduler` for a full skeleton). The factory function
8577signature:
@@ -91,7 +83,23 @@ signature:
9183 def create_scheduler(session_name: str, **kwargs: object) -> Scheduler:
9284 return MyScheduler(session_name, **kwargs)
9385
94- Register it via entry points:
86+ **Recommended: ``@register`` decorator**
87+
88+ Place your scheduler in a ``torchx_plugins/schedulers/`` namespace package
89+ and decorate it:
90+
91+ .. code-block:: python
92+
93+ # torchx_plugins/schedulers/my_scheduler.py
94+ from torchx.plugins import register
95+
96+ @register.scheduler()
97+ def my_scheduler(session_name: str, **kwargs) -> Scheduler:
98+     return MyScheduler(session_name, **kwargs)
99+
100+ After ``pip install``, TorchX discovers it automatically.
101+
102+ **Legacy: entry points** *(deprecated)*
95103
96104.. testcode::
97105
@@ -125,8 +133,39 @@ Registering Named Resources
125133-------------------------------
126134
127135A :term:`Named Resource <Resource>` maps a human-readable name (e.g.
128- ``gpu_x2``) to a :py:class:`~torchx.specs.Resource`. For example, on an AWS
129- cluster with p3.16xlarge nodes:
136+ ``gpu_x2``) to a :py:class:`~torchx.specs.Resource`.
137+
138+ **Recommended: ``@register`` decorator**
139+
140+ Place your resources in a ``torchx_plugins/named_resources/`` namespace package
141+ and decorate them:
142+
143+ .. code-block:: python
144+
145+ # torchx_plugins/named_resources/my_cluster.py
146+ from torchx.plugins import register, WHOLE
147+ from torchx.specs import Resource
148+
149+ @register.named_resource(fractionals=register.powers_of_two_gpus)
150+ def gpu_x(fractional: float = WHOLE) -> Resource:
151+     return Resource(
152+         cpu=int(64 * fractional),
153+         gpu=int(8 * fractional),
154+         memMB=int(488_000 * fractional),
155+     )
156+ # Registers: gpu_x (base), gpu_x_8, gpu_x_4, gpu_x_2, gpu_x_1
157+
158+ @register.named_resource()
159+ def cpu_x32() -> Resource:
160+     return Resource(cpu=32, gpu=0, memMB=131072)
161+
162+ The ``fractionals`` argument auto-generates fractional variants. See
163+ :py:meth:`~torchx.plugins.register.powers_of_two_gpus` and
164+ :py:meth:`~torchx.plugins.register.halve_mem_down_to` in :doc:`plugins`.
165+
166+ **Legacy: entry points** *(deprecated)*
167+
168+ For example, on an AWS cluster with p3.16xlarge nodes:
130169
131170.. testcode:: python
132171
@@ -246,6 +285,7 @@ suggestion of close matches.
246285
247286Registering Custom Components
248287-------------------------------
288+
249289Register custom components as CLI builtins:
250290
251291.. code-block:: shell-session
@@ -431,7 +471,7 @@ subclassing :py:class:`~torchx.tracker.api.TrackerBase`.
431471
432472 def run_ids(self, **kwargs: str) -> Iterable[str]: ...
433473
434- **Factory function.** Each entry point must point to a factory:
474+ **Factory function.** Each tracker plugin must provide a factory:
435475
436476.. code-block:: python
437477
@@ -440,8 +480,23 @@ subclassing :py:class:`~torchx.tracker.api.TrackerBase`.
440480 def create(config: str | None) -> TrackerBase:
441481     return MyTracker(connection_str=config or "default://localhost")
442482
443- **Entry-point registration.** Register the factory under the ``torchx.tracker``
444- group:
483+ **Recommended: ``@register`` decorator**
484+
485+ Place your tracker in a ``torchx_plugins/tracker/`` namespace package
486+ and decorate the factory:
487+
488+ .. code-block:: python
489+
490+ # torchx_plugins/tracker/my_tracker.py
491+ from torchx.plugins import register
492+
493+ @register.tracker()
494+ def my_tracker(config: str | None) -> TrackerBase:
495+     return MyTracker(connection_str=config or "default://localhost")
496+
497+ **Legacy: entry points** *(deprecated)*
498+
499+ Register the factory under the ``torchx.tracker`` group:
445500
446501.. code-block:: python
447502
@@ -508,8 +563,7 @@ Registering Custom CLI Commands
508563----------------------------------
509564
510565Extend the ``torchx`` CLI by implementing
511- :py:class:`~torchx.cli.cmd_base.SubCommand` and registering via the
512- ``torchx.cli.cmds`` entry-point group.
566+ :py:class:`~torchx.cli.cmd_base.SubCommand`.
513567
514568**The SubCommand ABC** defines two abstract methods:
515569
@@ -528,8 +582,8 @@ Extend the ``torchx`` CLI by implementing
528582 """Execute the command with parsed arguments."""
529583 print(f"Running my_tool on {args.app_id} with config={args.config}")
530584
531- **Entry-point registration.** Register the class (not a factory) under
532- ``torchx.cli.cmds``. The key becomes the subcommand name:
585+ Register the class (not a factory) under ``torchx.cli.cmds``. The key
586+ becomes the subcommand name:
533587
534588.. code-block:: python
535589
@@ -562,8 +616,111 @@ The default built-in commands are: ``builtins``, ``cancel``, ``configure``,
562616``delete``, ``describe``, ``list``, ``log``, ``run``, ``runopts``, ``status``,
563617and ``tracker``.
564618
619+
620+ Packaging a Plugin
621+ ---------------------
622+
623+ To create a TorchX plugin distribution, use
624+ `native namespace packages <https://packaging.python.org/en/latest/guides/packaging-namespace-packages/#native-namespace-packages>`_
625+ -- do **not** add ``__init__.py`` files under ``torchx_plugins/``. This allows
626+ multiple independent distributions to contribute to the same namespace.
627+
628+ Structure your project like this:
629+
630+ .. code-block:: text
631+
632+ my-torchx-plugin/
633+ ├── pyproject.toml
634+ └── src/
635+     └── torchx_plugins/          # NO __init__.py
636+         └── schedulers/          # NO __init__.py
637+             └── my_scheduler.py  # uses @register.scheduler()
638+
639+ Configure the build backend to include ``src/torchx_plugins`` as a package.
640+ For example, with ``hatchling``:
641+
642+ .. code-block:: toml
643+
644+ [build-system]
645+ requires = ["hatchling"]
646+ build-backend = "hatchling.build"
647+
648+ [project]
649+ name = "my-torchx-plugin"
650+ version = "0.1.0"
651+ dependencies = ["torchx"]
652+
653+ [tool.hatch.build.targets.wheel]
654+ packages = ["src/torchx_plugins"]
655+
656+ After ``pip install my-torchx-plugin``, TorchX will automatically discover
657+ your plugins from the ``@register``-decorated functions.
658+
659+ **Single-project layout** -- You don't need a separate package just for
660+ plugins. A single project can ship both your application code and a
661+ ``torchx_plugins/`` namespace package side-by-side. The key is to list
662+ **both** packages in the build backend's ``packages`` configuration so the
663+ wheel includes them together.
664+
665+ The example below uses a `UV <https://docs.astral.sh/uv/>`_ project with
666+ ``hatchling``. For other build backends (e.g. ``setuptools``), consult their
667+ documentation for the equivalent package-discovery configuration.
668+
669+ .. code-block:: text
670+
671+ my-project/
672+ ├── pyproject.toml
673+ ├── my_training/            # your application code (regular package)
674+ │   ├── __init__.py
675+ │   ├── train.py
676+ │   └── model.py
677+ └── torchx_plugins/         # NO __init__.py (namespace package)
678+     ├── schedulers/         # NO __init__.py
679+     │   └── my_scheduler.py
680+     └── named_resources/    # NO __init__.py
681+         └── my_cluster.py
682+
683+ A single ``pyproject.toml`` manages both packages:
684+
685+ .. code-block:: toml
686+
687+ [project]
688+ name = "my-project"
689+ version = "0.1.0"
690+ dependencies = ["torchx", "torch"]
691+
692+ [build-system]
693+ requires = ["hatchling"]
694+ build-backend = "hatchling.build"
695+
696+ [tool.hatch.build.targets.wheel]
697+ packages = ["my_training", "torchx_plugins"]
698+
699+ After ``uv sync``, both ``my_training`` and the TorchX plugins are installed
700+ into the same virtual environment. TorchX discovers the
701+ ``@register``-decorated plugins automatically -- no entry points needed.
702+
703+
704+ Diagnostics
705+ -------------
706+
707+ Print a diagnostic report of all discovered plugins:
708+
709+ .. code-block:: python
710+
711+ from torchx import plugins
712+ print(plugins.registry())
713+
714+ This lists every discovered plugin, its source module, and the distribution
715+ package it belongs to. Errors encountered during discovery are reported at
716+ the bottom.
717+
718+
565719.. seealso::
566720
721+ :doc:`plugins`
722+ Plugin API reference (``@register``, ``find()``, ``PluginRegistry``).
723+
567724 :doc:`cli`
568725 CLI module API reference.
569726
@@ -584,4 +741,3 @@ and ``tracker``.
584741
585742 :doc:`component_best_practices`
586743 Best practices for authoring reusable components.
587-
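
Note on the "Packaging a Plugin" section added in this diff: the requirement to omit ``__init__.py`` follows from how Python's native namespace packages merge directories from several installed distributions. The sketch below illustrates that mechanism using only the standard library. It is not TorchX's discovery code: the ``demo_plugins`` namespace, the ``make_distribution`` and ``discover`` helpers, and the convention that each module exposes a ``create`` factory are all hypothetical stand-ins chosen for illustration.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path
from typing import Callable


def make_distribution(root: Path, group: str, module: str, body: str) -> None:
    """Lay out <root>/demo_plugins/<group>/<module>.py with no __init__.py files."""
    pkg_dir = root / "demo_plugins" / group
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text(body)


def discover(group: str) -> dict[str, Callable[[], str]]:
    """Import every module under demo_plugins.<group> and collect its `create` factory."""
    ns_pkg = importlib.import_module(f"demo_plugins.{group}")
    found: dict[str, Callable[[], str]] = {}
    # A namespace package's __path__ spans every sys.path entry that contributes to it.
    for info in pkgutil.iter_modules(ns_pkg.__path__, ns_pkg.__name__ + "."):
        module = importlib.import_module(info.name)
        found[info.name.rsplit(".", 1)[-1]] = module.create
    return found


# Two independent "distributions" (hypothetical) contribute to one namespace.
dist_a = Path(tempfile.mkdtemp())
dist_b = Path(tempfile.mkdtemp())
make_distribution(dist_a, "schedulers", "alpha", "def create():\n    return 'alpha scheduler'\n")
make_distribution(dist_b, "schedulers", "beta", "def create():\n    return 'beta scheduler'\n")
sys.path[:0] = [str(dist_a), str(dist_b)]
importlib.invalidate_caches()

plugins = discover("schedulers")
print(sorted(plugins))
```

If either directory contained an ``__init__.py``, Python would treat it as a regular package and stop searching, so the other distribution's modules would not be found. That is the likely reason the diff insists on leaving those files out.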