Skip to content

Load tools on demand via LazyToolBox backed by a pluggable tool source store#22633

Draft
mvdbeek wants to merge 72 commits into
galaxyproject:devfrom
mvdbeek:lazy-toolbox-atomic
Draft

Load tools on demand via LazyToolBox backed by a pluggable tool source store#22633
mvdbeek wants to merge 72 commits into
galaxyproject:devfrom
mvdbeek:lazy-toolbox-atomic

Conversation

@mvdbeek
Copy link
Copy Markdown
Member

@mvdbeek mvdbeek commented May 4, 2026

Summary

Introduces a pluggable tool source storage layer plus a LazyToolBox that loads tools on demand from that store. The change converts batch tool endpoints from O(N tools) to O(1) on a pre-computed index, supports shipping read-only tool source bundles via a composite of named stores, and is fully opt-in — default deployments keep the existing eager ToolBox.

Motivation

Galaxy startup cost and batch endpoint latency both scale linearly with the number of tools. For installs with thousands of tools (and especially for shed-install-heavy deployments) this becomes the dominant cost of /api/tools, /api/tool_panels, requirements aggregation, and tests-summary. This PR moves the parsed tool sources into a queryable store, builds a lightweig
ht index for batch reads, and lets the toolbox materialize individual Tool objects only when actually needed.

Key components

  • tool_source_store/ToolSourceStore ABC with database, sqlalchemy (for read-only SQLite bundles), and composite backends; shared index serializer; benchmarks.
  • tool_index table — pre-computed index used by batch endpoints; LargeBinary with MySQL LONGBLOB variant.
  • scripts/tool_source/populate_store.py — walks tool_conf files, parses each tool source into the store; supports --watch for incremental updates with cross-process cache invalidation via the queue worker.
  • tools/lazy_toolbox.py — extends ToolBox, materializes tools on demand, LRU-cached by lazy_toolbox_cache_size.
  • Batch endpoints/api/tools, /api/tool_panels, tests summary, and /api/tools/all_requirements route through the index when LazyToolBox is active.
  • CIapi and integration workflows now also run under use_lazy_toolbox: true.

How tools are loaded and retrieved

  flowchart TD
      subgraph Build["Population (offline / watch mode)"]
          TC["tool_conf.xml / .yml"]
          PS["scripts/tool_source/populate_store.py"]
          DI["_discover.py"]
          TC --> DI --> PS
      end

      subgraph Stores["Tool source storage"]
          DB[("database backend<br/>tool_source + tool_index")]
          SA[("sqlalchemy backend<br/>read-only SQLite bundle")]
          COMP{{"Composite store<br/>read fall-through<br/>writes go to default"}}
          DB --- COMP
          SA --- COMP
      end

      PS -- "write StoredToolSource" --> COMP

      subgraph App["Galaxy app startup"]
          CFG["config_schema.yml<br/>tool_source_store / use_lazy_toolbox"]
          BUILD["build_tool_source_store"]
          APP["MinimalGalaxyApplication"]
          CFG --> BUILD --> APP
          APP -->|"use_lazy_toolbox=true"| LTB["LazyToolBox<br/>+ LRU cache"]
          APP -->|"use_lazy_toolbox=false"| ETB["Eager ToolBox"]
      end

      COMP -.read.-> LTB

      subgraph Runtime["Request path"]
          API["/api/tools, /api/tool_panels,<br/>tests summary, all_requirements"]
          SVC["ToolsService"]
          IDX["Pre-computed index<br/>constant-time batch reads"]
          TOOL["Tool object<br/>materialized on demand"]
          API --> SVC
          SVC -->|"batch listing"| IDX
          SVC -->|"single tool"| LTB
          LTB -->|"cache miss"| COMP
          LTB -->|"cache hit"| TOOL
          COMP -->|"StoredToolSource"| LTB
          LTB --> TOOL
      end

      subgraph Invalidate["Watch mode"]
          WATCH["populate_store --watch"]
          QW["queue_worker<br/>reload_tool_source_cache"]
          WATCH -- "control message" --> QW
          QW -- "invalidate_index_cache" --> LTB
          QW -- "invalidate_index_cache" --> COMP
      end
Loading

Configuration

# galaxy.yml
tool_source_store: database          # or "sqlalchemy"
use_lazy_toolbox: true               # explicit opt-in
lazy_toolbox_cache_size: 500

# Optional composite layering — read fall-through, default still takes writes
tool_source_stores:
  cvmfs_main:
    backend: sqlalchemy
    path: /cvmfs/example.org/tools/sources.sqlite
    read_only: true

Per-conf opt-in via store="cvmfs_main" on a <toolbox> element pulls that named store into the composite.

Backwards compatibility

  • Default deployments are unaffected: use_lazy_toolbox defaults to false, and tool_source_store initializes a database-backed store but nothing reads from it unless opted in.
  • A populated store on its own does not flip a deployment to lazy mode — opt-in is explicit.
  • All existing batch endpoint shapes are preserved; only the implementation switches when LazyToolBox is active.

Follow-ups deferred from this PR

  • /api/tool_sources/* and /api/tool_index/* admin/diagnostic endpoints (dropped here — no UI consumer; will land in a follow-up alongside an admin UI).
  • Redis and disk-sharded backends (designed in the original plan; not needed for the current rollout).

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

mvdbeek added 5 commits May 4, 2026 16:18
Foundational helpers used by the tool source storage stack:
- id_util.py: shared logic for normalizing and extracting tool IDs.
- tool_source_store/models.py: StoredToolSource and related dataclasses
  used by every backend.
Introduces the ``tool_index`` table that backs the database tool source
store and the LazyToolBox index. The data column uses ``LargeBinary``
with a MySQL ``LONGBLOB`` variant because real-world tool indexes
easily exceed the 64KB BLOB default. Concurrent writers must serialize
their updates externally; the singleton-style unique constraint on
``version`` enforces a delete-then-insert pattern.
Pluggable backends for storing pre-parsed tool sources, behind a
ToolSourceStore abstract base:

- database: stores tool sources in the Galaxy database via the existing
  ToolSource and tool_index tables.
- sqlalchemy: ships a separate SQLAlchemy-managed database (typically a
  read-only SQLite bundle) for distributing pre-built tool sources.
- composite: layers multiple stores so reads fall through declared
  backends in order while writes always land on the default.
- index: shared tool-index serialization helpers used by the database
  and sqlalchemy backends.
- benchmarks: deserialization performance suite.

Includes unit tests for the base contract, the SQLite/SQLAlchemy
backend, and the composite store.
scripts/tool_source/populate_store.py walks tool_conf files, parses
each tool source, and writes the result into the configured tool
source store. Supports an incremental --watch mode that re-hashes raw
file contents and updates the store as tools change on disk; the
matching cache-invalidation message is added in the queue worker
commit. _discover.py contains the shared tool-discovery helper.

Includes unit tests that exercise the script through fakes (rather
than mocks) so failures point at real behavior.
New config keys for the tool source storage stack and the LazyToolBox:

- tool_source_store / tool_source_disk_path / tool_source_stores:
  configure the active backend and any per-conf composite entries.
- use_lazy_toolbox / lazy_toolbox_cache_size / tool_cache_size:
  opt in to LazyToolBox and tune its in-memory cache.

LazyToolBox opt-in is explicit — a populated store does not flip a
default deployment to lazy mode unless ``use_lazy_toolbox`` is set.
Comment thread lib/galaxy/tools/__init__.py Outdated
if stored is None:
# Quick extraction of tool_id from raw XML without full macro expansion
try:
extracted_id = extract_tool_id_from_file(str(config_file), max_read=2000)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot do this without full macro expansion.

@jmchilton
Copy link
Copy Markdown
Member

Claude thinks 85% of this (their estimate) can be separated into abstractions that would fix the startup costs and that I would be really excited about - and move 15% into something I would want to fight about and most the breaking things and things Claude describes as "The 15% you're deferring is where the filter bypass lives, where the search rewrite lives, and where the recent invariant-restoration commits cluster."

Here is the comment Claude thinks I should post (they eviscerated the 6 paragraphs I originally wrote up):


Substantive PR — the startup problem is real and the store + slim materializer parts look good. Pushing back on scope and on a couple of read-path defects:

Scope. #21247 framed the goal as "parsing the tools on each startup is also by far the most time consuming step" — startup + steady-state memory. Under Path 1 there you flagged batch endpoints as a challenge with a stated mitigation: "we could cache the response aggressively since it rarely changes." This PR addresses both the goal and the challenge in one diff. Would it be reasonable to land the store + slim LazyToolBox first and revisit the /api/tools + /api/tool_panels index paths in a separate PR? LazyToolBox itself reads the index for has_tool / lineage / panel rendering — the deferral I'm suggesting is on the batch API endpoints, not removing the index.

Read-path defects worth surfacing regardless of the split:

  • lib/galaxy/tool_util/id_util.py:33-34 extracts the tool id with a regex over the first 2000 bytes — no macro expansion. The fallback at lib/galaxy/tools/__init__.py:733-735 (use first match if tool_dir doesn't match) lets a wrong stored source come back silently. Root <tool id="..."> rarely comes from macro expansion in practice, but it's not impossible. The populator side is fine because it goes through get_tool_source(config_file=...) — fix is on the read side.

  • lib/galaxy/tools/__init__.py:674-677 makes the eager ToolBox.get_expanded_tool_source always read from the store first. So the macro-id concern is now in scope for default deployments after they run the populator, not just lazy ones. Combined with the new always-on tool_index table and the unconditional store init, "default deployments are unaffected" is true only in the narrow sense that lazy materialization doesn't run.

  • /api/tools?in_panel=false under lazy mode bypasses FilterFactory. lib/galaxy/webapps/galaxy/services/tools.py:636-641 returns index entries via to_api_dict(); the eager path at lib/galaxy/tool_util/toolbox/base.py:1442-1444 runs filter_method(tool, panel_item_types.TOOL) driven by tool_filters / user_tool_filters. Admin- and user-defined tool filters are documented behavior — the index path should run through the same filter pass.

Search. lib/galaxy/tool_source_store/index.py:287-350 is a hand-rolled scorer whose own comment says it "mirrors how the eager toolbox's Whoosh index behaves." Could the new index feed Whoosh, or could Whoosh's analyzer be reused, to avoid ranking drift?

Maturity. The single-tool materialization path looks settled. The panel-render / batch-path bits are still landing shed-install ordering, cross-version lineage, EDAM render and [temp diagnostic] commits at branch tip — that's where the invariant work is still happening.

Suggested split:

  • Now: tool_source_store + slim LazyToolBox (single-Tool lazy materialization + persistent index for has_tool). Solves startup and memory for every process role — webapp, job handlers, workflow handlers, celery — not just the API. Eager to_dict / to_panel_view / apply_filter / Whoosh search all unchanged. all_requirements + tests_summary index shortcuts (already in this PR, low parity risk) can stay in scope. Default off behind use_lazy_toolbox. This is what gets you to the "10 minutes for 26.0" target you mentioned in January.
  • Follow-up: /api/tools + /api/tool_panels index path. The filter-pass and search questions live here; deserves its own focused review with dedicated parity tests against the eager path. Path 2 from Improve tool loading situation #21247 (external toolbox API) is also still on the table for this if first-request thundering herd turns out to be an issue under load.

@mvdbeek
Copy link
Copy Markdown
Member Author

mvdbeek commented May 15, 2026

  • This is what gets you to the "10 minutes for 26.0" target you mentioned in January.

I don't know how Claude managed to get the context so wrong. We did that already.

The main thing is that we need to be able to run more web handlers. I can see how the review is easier if the PR is split up, but if we never land the web handler work then it isn't worth the complexity. We're not limited by the job or workflow handler processes. That said the whole thing isn't ready, the panel work is a real pain.

@jmchilton
Copy link
Copy Markdown
Member

The main thing is that we need to be able to run more web handlers. I can see how the review is easier if the PR is split up, but if we never land the web handler work then it isn't worth the complexity.

I'm sure it is a John problem and not a Claude problem. Are you saying there is no useful subpart of this that doesn't bypass the toolbox filters, per-user toolbox rendering? The pre-built indexes are what you care about here and the rest of it is support infrastructure that is not useful on its own?

@mvdbeek
Copy link
Copy Markdown
Member Author

mvdbeek commented May 18, 2026

It would be somewhat useful (faster handler restarts, less baseline memory consumption), but as long as the web handler still eager load the whole toolbox (on first request or on startup) we're not addressing any production issues. I will tweak this to see if there is a better solution for the toolbox filters.

Follow-up: /api/tools + /api/tool_panels index path.

This isn't dead, but i don't love the complexity in managing these extra services. Maybe that's fine though, you'd probably only need this at the usegalaxy scale ...

mvdbeek added 12 commits May 18, 2026 10:26
LazyToolBox extends ToolBox and loads each Tool on demand from the
configured tool source store, backed by an LRU cache sized via
``lazy_toolbox_cache_size``. The pre-computed index lets batch
endpoints (panels, tests summary, requirements) avoid materializing
every tool object.

- tools/lazy_toolbox.py: the on-demand toolbox plus its index cache
  invalidation hook.
- tool_util/toolbox/base.py and parser.py: small extension points
  needed for lazy loading.
- tools/__init__.py: helpers shared with LazyToolBox.
- queue_worker/__init__.py: route ``_get_new_toolbox`` through
  LazyToolBox when opted in, and add the
  ``reload_tool_source_cache`` control message used by
  populate_store --watch to invalidate caches across processes.
- app/__init__.py and galaxy_mock.py: build the tool source store at
  app startup and use LazyToolBox when ``use_lazy_toolbox`` is set.
Replaces ad-hoc iteration over ``app.toolbox`` in the
``/api/tools``, ``/api/tool_panels``, tests-summary, and
``/api/tools/all_requirements`` endpoints with service methods that
query the LazyToolBox pre-computed index when available, falling
back to the existing toolbox path otherwise. This converts the
batch endpoints from O(N tools) to O(1) when the lazy toolbox is in
use, without changing behavior for the default eager toolbox.
End-to-end test that boots Galaxy through the composite store +
LazyToolBox path and verifies tools are discoverable via /api/tools,
covering the wiring exercised by all of the preceding commits.
- doc/source/admin/tool_source_storage.rst: operator-facing guide
  covering backend choice, populate_store usage, watch mode, and the
  per-conf composite story.
- doc/source/dev/tool_source_storage.rst: developer-facing
  architecture overview of the store, LazyToolBox, and the batch
  endpoint integration.
Wires populate_store.py and ``use_lazy_toolbox: true`` into the api
and integration GitHub Actions workflows so the LazyToolBox path is
exercised in CI alongside the default eager toolbox.
…d YAML-id tools

Round out the index discovery so the lazy toolbox boots against the full
shipping tool set:

- ``scripts/tool_source/_discover.py`` now handles ``<tool_dir>`` panel
  entries and reads YAML tool ids alongside XML, so multi-version tool
  directories and YAML tools land in the index.
- ``LazyToolBox`` extends the bootstrap path to register every variant
  discovered for a single tool id (was previously dropping siblings) and
  uses the shared YAML id reader when chasing direct ``<tool file=...>``
  references from the panel config.
- ``DatabaseToolSourceStore.store_index`` accepts the richer per-id
  payload from the discovery script.

Test coverage is extended in ``test_tool_source_storage`` to assert that
the bootstrapped index sees both directory-based and YAML-defined tools.
Boot was eagerly walking every indexed tool (500+) through panel-view
construction, EDAM rendering, and lineage build-up — both slow and
prone to leaking caches across the embedded restart used by
``test_recovery``. Defer all of that to first request:

- The default tool panel view returns ``self._tool_panel`` unchanged.
  ``GET /api/tool_panels/default`` is served by an override that builds
  responses straight from ``ToolIndex.entries`` via
  ``_index_entry_to_api_dict`` — no Tool instantiation at boot.
- Static panel views materialise per-section on first request and cache
  the result; ``in_panel=True`` listings fall back to ``super().to_dict``
  so individual sections are loaded only when consumed.
- EDAM panel view rendering is similarly deferred to first request and
  tracks its own render state, separate from the LRU cache.
- A new ``LazyLineageMap`` derives version lists from the index instead
  of probing every Tool object; entries with the same versionless guid
  share a single map instance so cross-version lookups don't fall over.
- Two ``haltables`` (``tool source store`` and ``lazy toolbox``) clear
  cached state on shutdown so the second boot of an embedded restart
  sees a clean slate.

Lint/mypy/black fallout from the panel rewrite is folded in.
Support tool shed installs end-to-end on top of the persistent tool
source store:

- Section-install paths funnel through ``_lazy_register_tool_item`` so
  each shed install is indexed alongside the panel section it lives in.
- Shed entries are keyed by full toolshed guid (not the short id) so
  workflow-side ``has_tool(guid)`` lookups resolve, and short-id
  references resolve back to the latest matching guid.
- ``tool_shed_repository``/``guid`` metadata is stamped on stored
  entries and restored on materialisation; cross-version lineage
  follows the guid so the same lineage covers every installed revision.
- ``_prune_orphaned_shed_entries`` drops persisted entries whose
  backing shed_tool_conf no longer references them, so a
  ``reset_shed_tools`` test doesn't leak installs into the next test.
- Section re-materialisation rebuilds ``section.elems`` (shed tools
  inserted at section head, ahead of local tools) instead of clearing
  the existing entries; ``remove_tool_by_id`` clears
  ``_tool_versions_by_id`` and lineage state under the toolbox lock.
- ``reload_tool_source_cache`` broadcasts with ``noop_self=True`` so
  the local process keeps its just-stored cache instead of racing the
  queue-worker session.
- Configured shed/migrated tool confs are registered as
  on-demand placeholders when the file is absent at boot, matching
  the eager toolbox's behaviour.
Round out tool lookup so version-aware callers and short-id callers
both see the right Tool object:

- ``has_tool``/``get_tool`` honour ``tool_version`` and ``exact`` —
  prefer the exact requested version, fall back to the default
  version, then fall back to the latest indexed version when the
  requested version isn't known.
- ``get_all_versions`` sorts via ``packaging.version`` so prefixes
  like ``"0.1+galaxy6"`` and ``"0.2"`` compare correctly (lexical
  sort would invert them).
- ``ToolIndex`` records the per-id default version so the toolbox can
  pin selection rather than rely on insertion order; the in-memory
  index is checked for an exact version match before instantiating
  the default Tool.
- ``packaging`` is imported at module scope and ``get_tool``'s
  signature is tightened with overloads.
Make the store's commit boundaries explicit so the in-memory toolbox
and the persistent index agree across worker boundaries, and shut the
store down cleanly before the SQLAlchemy engine goes away:

- ``ToolSourceStore`` grows a polymorphic ``commit()`` method
  (no-op default, composite propagates, database flushes + commits).
- ``DatabaseToolSourceStore.close()`` commits pending writes before
  dropping the cached index so the next boot doesn't read a partial
  snapshot.
- ``LazyToolBox`` commits explicitly at the end of bootstrap, and
  releases the queue-worker transactions held over ``__init__`` and
  ``_load_index_from_store`` so peer processes don't see
  idle-in-transaction rows that block subsequent writes.
- ``GalaxyUniverseApplication`` runs the tool source store shutdown
  before disposing the SQLAlchemy engine so ``commit()`` still has
  a live session.
Round out the test suite that exercises LazyToolBox end-to-end against
a real boot:

- Extend ``BaseToolSourceStorageIntegrationTestCase`` helpers to hit
  ``/api/tools?in_panel=False`` (the index-backed listing) and to
  reach the lazy ``tools_by_id`` map directly when the resolver path
  isn't what's being asserted.
- New unit tests under ``test/unit/tool_source_store/`` pin the index's
  version handling: default-version selection, sibling-version
  discovery, and ``packaging.version`` ordering.

Tests that bind specific behaviours to specific LazyToolBox commits
land alongside the code commits that introduced them; this commit
collects the cross-cutting helpers and pure-unit coverage.
``start_services.sh`` exported the Service ClusterIP and returned
immediately after ``kubectl apply``, then the Run tests step started
right away. The Postgres pod typically needs 30+ seconds to image-pull
and start its listener; runs where Galaxy boots fast enough race it
and hit ``connection refused``. When that happens the entire shard
errors — every Galaxy boot dies talking to its DB before any test
can run. Observed on Integration shard 1 of run 25398574715
(310 errors, 0 passes, first ERROR 35s after the apply).

``kubectl rollout status`` waits for the deployment, then ``kubectl
wait Ready`` confirms the pod has actually transitioned out of
ContainerCreating before tests start.
mvdbeek added 28 commits May 18, 2026 17:46
``ToolFileWatcher._process_tool_file`` used to write only
``StoredToolSource`` on a content change, leaving the ``ToolIndex``
stale until the next full populator run. After this commit, the
watcher delegates to ``populate_for_paths`` once the lightweight hash
check confirms the content actually changed — same single-writer entry
point shed installs use — so the index, whoosh search, and peer-process
reload broadcast all stay in sync with on-disk edits in dev mode.

``ToolFileWatcher`` gains an optional ``sa_session`` arg; ``watch_mode``
passes the freshly-initialised ``model.context``. Callers that pass
``None`` (legacy / test mocks) fall back to the previous store-only
behaviour with a one-line "index left stale" log — the contract is
preserved for the 25 fake-store unit tests that exercise the watcher
without a DB.

69/69 unit tests still pass.
In-process callers (LazyToolBox cold-start, ``populate_for_paths`` on
shed install, ``reconcile_index`` for ``reset_shed_tools``) share
Galaxy's ``app.model.context`` SQLAlchemy session. The
``DatabaseToolSourceStore.store()`` write path goes through that
session, and ``Session`` is not thread-safe — running the
``ThreadPoolExecutor`` with the previous default of 4 workers caused
SQLite "database is locked" errors and a full integration suite that
took 51 minutes with 13 failures.

The CLI is unaffected: ``populate_store(config_file=...)`` constructs
its own ``model.context`` via ``init_models_from_config`` (no concurrent
readers) and explicitly passes ``parallel=args.parallel`` (defaults to 4),
so ``populate_store.py --parallel N`` still fans out.
…oosh

``app.toolbox_search`` becomes the single search singleton for both
toolbox flavours. In lazy mode it's :class:`LazyToolboxSearch` — a
``ToolBoxSearch`` subclass that opens the populator-owned whoosh dir
(``tool_search_index_dir/_lazy_default``) on every ``search()`` call and
no-ops ``build_index`` (the populator does the writes). In eager mode it
stays :class:`ToolBoxSearch` as before.

Effects:

- ``services/tools.py:search_tools`` drops its ``_get_lazy_toolbox`` branch
  and just calls ``self._search(query, view)``. One code path, regardless
  of toolbox flavour.
- ``LazyToolBox.search_tools``, ``_get_search_index``, the
  ``_whoosh_search_index`` slot, and the whoosh clear in
  ``invalidate_index_cache`` all drop. The unused
  ``ToolSearchTuning`` / ``ToolWhooshIndex`` imports go with them.
- ``LazyToolboxSearch`` keeps ``index_count`` so the existing
  ``queue_worker.rebuild_toolbox_search_index`` watermark check is
  satisfied without doing work.

``lazy_toolbox.py`` drops a further ~55 lines (1257 → 1202).

69/69 unit tests still pass.
``load_index`` already does this — the comment there explains why: each
``session.execute(select(...))`` on the shared scoped session opens an
implicit read transaction that nothing later closes, so on SQLite the
read lock blocks subsequent writers (in particular the cold-start
populator), and on Postgres it accumulates idle-in-transaction rows.

Apply the same ``finally: session.rollback()`` pattern to ``get``,
``exists``, ``get_by_tool_id``, and ``get_by_source_path``. The
populator and ``LazyToolBox._index_needs_population`` call these in
tight loops at cold boot (484× ``exists`` and 484× ``get_by_source_path``
on a default checkout), so the leak compounded fast: integration tests
hit ``OperationalError: database is locked`` on the very next
``SELECT FROM tool_index`` from the test client.

``rollback()`` after a pure read is a no-op for data but closes the
transaction.
After the populator/single-writer refactor, ``LazyToolBox.create_tool``
raises ``RuntimeError`` on index miss (commit ``56a66da6a9``). That broke
``load_hidden_lib_tool`` paths: ``set_metadata_tool.xml`` (and other
Galaxy-internal tools loaded from ``lib/galaxy/datatypes/`` after boot)
live outside any tool_conf and ``discover_tools`` doesn't yield them, so
the populator never indexed them.

New ``galaxy.tool_source_store.populator.populate_single_path`` parses
one tool file by absolute path, builds a ``StoredToolSource`` +
``ToolIndexEntry`` (with no panel section — the tool isn't in any conf),
writes it to every writable store, and updates the cached index.
Errors are logged and rolled back; the session is left clean.

``LazyToolBox.create_tool`` now retries via ``populate_single_path`` on
miss before raising. The single-writer contract is preserved (only the
populator writes), the raise still fires when the file genuinely doesn't
exist on disk, and the test seam ``test_create_tool_raises_on_index_miss``
still passes because the path ``/tools/unknown.xml`` doesn't exist.

Also adds ``LazyTool.tool_shed_repository`` (forwarded from
``_overrides``, defaults to ``None``) so the eager pipeline's
shed-metadata setattr doesn't trip the strict ``__getattr__``. Lib tools
have no repo; shed tools store theirs via the override.

Per-tool commits in ``populate_store_inline``'s store loop release the
write transaction between tools so SQLite readers (the test client's
``/api/tools`` request, the queue worker) aren't blocked behind a
484-row write transaction. The error-handling path rolls back the
shared session so a flush failure doesn't leak ``PendingRollbackError``
to the next caller.
…_path

Three review-driven cleanups on top of commit 7ae5bf4:

1. **``ToolSource.to_string()``** is the canonical serialiser on the
   parser interface (``lib/galaxy/tool_util/parser/interface.py:450``)
   and is implemented by every concrete source (XML, YAML, CWL). Use
   it instead of branching on ``xml_tree`` / falling back to raw file
   read — fewer special cases, correct for every source class.

2. **Drop ``populate_single_path``.** It was a heavyweight recovery
   for ``load_hidden_lib_tool`` paths that aren't in any tool_conf.
   The cleaner answer: on index miss, ``LazyToolBox.create_tool``
   delegates to ``super().create_tool(...)``. The eager parent parses
   the file and registers a real ``Tool`` (not a ``LazyTool``) — lib
   tools like ``set_metadata_tool.xml`` live in memory, never enter
   the populator-owned store, and re-load fresh on every boot. The
   "raise on miss" contract is downgraded to "fall through to eager"
   — the populator still owns every tool that *should* be in the
   index, and the test seam (``test_create_tool_falls_through_to_eager_on_index_miss``)
   asserts the new behaviour.

3. **Hoist inline imports** in both files. Module-level
   ``StoredToolSource`` / ``ToolSourceStore`` / ``ReadOnlyStoreError`` /
   ``DiscoveredTool`` / ``discover_tools`` / ``ToolIndex`` /
   ``ToolIndexEntry`` / ``ToolSection`` / ``populate_store_inline`` /
   ``get_toolbox_parser`` / ``get_tool_source`` so the structural deps
   are visible at the top of each module. The heavy CLI-only imports
   (``galaxy.config``, ``galaxy.model``, ``galaxy.datatypes.registry``,
   the whoosh ``ToolSearchTuning`` / ``ToolWhooshIndex`` pair) stay
   inline — they're optional, used only on populator entry points,
   and would slow ``--help`` if pulled in.

14/16 integration tests pass; the two remaining failures predate this
stack (``test_run_specific_version_executes_that_version`` is the
noted pre-existing flake, ``test_default_panel_view_section_tools_use_id_list``
needs separate investigation).
``AbstractToolBox._newer_tool`` (lib/galaxy/tool_util/toolbox/base.py:1500)
compares ``tool1.version_object > tool2.version_object``; it's reached
from ``_lineage_in_panel`` during ``to_panel_view`` rendering. Without
the property the strict ``__getattr__`` raised ``NotImplementedError``
for ~hundreds of tools and the API returned 500.

Compute it the same way ``Tool.version_object`` does
(lib/galaxy/tools/__init__.py:1213) — parse the entry's ``version``
string, special-casing the ``+galaxy`` suffix per PEP-440. ``parse_version``
is already imported at module top.
Galaxy vendors a ``parse_version`` (lib/galaxy/tool_util/version.py:45)
that falls back to ``LegacyVersion`` when ``packaging``'s ``Version``
raises ``InvalidVersion`` — the eager toolbox uses it
(lib/galaxy/tools/__init__.py:128). The new lazy paths (LazyTool's
``version_object`` / ``_ver_key``, ``ToolIndex.add_entry``'s
multi-version tiebreaker) were importing ``packaging.version.parse``
directly, bypassing the LegacyVersion fallback.

Swap to the vendored function in three places:

- ``lib/galaxy/tools/lazy_toolbox.py`` module-level import.
- ``ToolIndex.add_entry``'s inline import.
- The matching comment in ``_ver_key``.

No behaviour change for versions that parse as PEP-440; the difference
shows up on Galaxy-specific version strings that need
``LegacyVersion`` semantics.
…se fallback

Two boot-time and panel-render fixes that the integration suite
surfaced once cold-start indexing covered every conf-discovered tool:

1. **Version defaulting in ``build_index_entry_from_source``.** Calling
   ``tool_source.parse_version()`` directly returns ``None`` for tools
   without an explicit ``<tool version="...">`` (test fixtures like
   ``sam_to_unsorted_bam.xml`` and ``implicit_conversion.xml``). The
   eager pipeline routes through
   ``galaxy.tool_util.parser.util.parse_tool_version_with_defaults``
   which falls back to ``"1.0.0"`` on profiles < 16.04. Use the same
   helper at index-build time so ``LazyTool.version`` is never
   ``None`` and ``ToolLineage.register_version`` can parse it.

2. **Materialise fallback in ``LazyTool.to_dict(link_details=True)``.**
   ``to_panel_view`` materialises every panel-visible tool to call
   ``to_dict(link_details=True)``. Some tools have XML the parameter
   factory chokes on at materialise time (``upload1``'s
   ``upload_dataset`` input_type, ``filter_data_table.xml``'s
   ``column="value"`` dynamic-options filter). The eager toolbox
   catches these in ``_load_tool_tag_set`` at boot and drops the tool
   from ``_tools_by_id``; the lazy path postpones the failure to first
   ``to_dict``, where it now logs a WARNING and falls back to the
   entry-only fast-path dict so the panel render still completes.

15/16 ``test_tool_source_storage.py`` integration tests pass; the
remaining failure is the noted pre-existing
``test_run_specific_version_executes_that_version`` flake.
``galaxy.tools.special_tools`` already enumerates Galaxy-internal tool
files loaded outside the conf walk (``load_hidden_lib_tool`` → the four
``imp_exp`` / ``data_fetch`` entries, plus ``set_metadata_tool.xml``
which the datatypes registry loads separately). The lazy refactor was
missing index entries for those, so the previous step relaxed
``LazyToolBox.create_tool`` to fall through to the eager parent on miss
to keep boot working.

Add ``hidden_lib_tool_paths()`` in ``special_tools.py``: returns the
absolute paths of every hidden-lib tool, including
``set_metadata_tool.xml``. ``SPECIAL_TOOLS`` (consumed by
``load_lib_tools``) stays unchanged so ``set_metadata_tool`` isn't
double-loaded — the new ``_EXTRA_HIDDEN_LIB_TOOLS`` dict only feeds
the populator-facing helper.

``galaxy.tool_source_store.discover.discover_tools`` walks those paths
after the conf + bundled walks, yielding ``DiscoveredTool`` entries
with ``tool_conf="<hidden-lib>"``. The populator's cold-start scan
indexes them like any other tool, so the post-boot
``load_hidden_lib_tool`` calls now hit the index branch.

With coverage complete, ``LazyToolBox.create_tool`` reverts to the
strict raise the plan called for: a miss is now a contract failure
(operator forgot to repopulate, or a new ad-hoc tool load didn't get
added to ``hidden_lib_tool_paths``). The unit test asserts the raise
again.

15/16 integration tests still pass; the remaining failure is the
pre-existing flake ``test_run_specific_version_executes_that_version``.
… trip apply_async

Every ``@galaxy_task`` relies on ``app.magic_partial(func)`` (in the
wrapper itself) to inject DI dependencies — typically
``sa_session: galaxy_scoped_session`` and manager instances declared on
the function signature — at task-execution time. Celery's
``apply_async`` runs ``check_arguments(*args, **kwargs)`` against the
*original* function signature before the task body fires, and that
check has no visibility into magic_partial. Net effect: every Galaxy
celery task that takes a DI-injected positional crashes at ``.delay()``
with ``TypeError: <task>() missing 1 required positional argument:
'sa_session'``.

The failure mode was hidden on older celery / Python combos because
``Task.typing`` (which gates the pre-call check) interacted differently
with the signature ``head_from_fun`` generated. On celery 5.6.x + Python
3.14 it surfaces consistently — most visibly in
``test_run_specific_version_executes_that_version`` where
``set_job_metadata.delay(...)`` doesn't pass ``sa_session`` and the
metadata-setting job fails before it runs.

Set ``typing=False`` as the default in ``galaxy_task``: every Galaxy
celery task already uses DI injection, so the pre-call check would
never be correct for them anyway. Individual call sites still pass
their non-DI args explicitly; the DI-injected ones are populated by
``magic_partial`` at execution. Callers that want celery's typing
check back can pass ``typing=True`` explicitly.

Integration suite goes from 15/16 to 16/16 with no other change.
Every inline import in these two files was checked against:
1. Does it create an actual import cycle? (No for any of them.)
2. Is it a soft / optional dependency? (kombu + watchdog are in
   ``pyproject.toml`` and ``pinned-requirements.txt``; the
   try/except-ImportError defenses were dead code.)

The "lazy import for startup speed" justifications on the CLI entry
points (``galaxy.config``, ``galaxy.model``, ``galaxy.datatypes.registry``,
``galaxy.tool_source_store.search``, etc.) were also bogus — the
populator module is imported at Galaxy boot via the LazyToolBox
cold-start hook, so those heavy modules get pulled in anyway. The
inline form only delayed the cost by milliseconds while making the
dep graph invisible at the top of the file.

Hoisted:

- ``lazy_toolbox.py``: ``galaxy.exceptions``, ``galaxy.tool_shed.util.repository_util``,
  ``galaxy.util.tool_version``.
- ``populator.py``: ``kombu``, ``watchdog``, ``galaxy.config``,
  ``galaxy.datatypes.registry``, ``galaxy.model``, ``galaxy.model.mapping``,
  ``galaxy.queues``, ``galaxy.tool_source_store.search`` (Tuning/Whoosh),
  ``galaxy.util.properties``. The watchdog ``ImportError`` defense and
  the kombu inline block both drop with them.

Galaxy boot still works; 113/113 unit tests + 16/16 integration tests
pass; CLI ``populate_store.py --help`` still resolves. The
``TYPE_CHECKING`` block in ``lazy_toolbox.py`` (the only remaining
function-scope ``from``-import area) keeps its lookup-only imports
where they belong.
…y stub

``_create_tool_from_stored_source`` was hitting the install database on
every shed-tool materialise to fetch ``ToolShedRepository`` just so the
Tool ctor's ``populate_tool_shed_info`` could stamp four scalar fields.
The eager pipeline already has a namedtuple stub for exactly this case
(``ToolConfRepository`` in ``lib/galaxy/tool_util/toolbox/base.py:87``,
used for shed installs whose install-DB row hasn't appeared yet) —
build it directly from ``ToolIndexEntry``.

Eliminates:
- ``_lookup_tool_shed_repository`` method (~30 lines).
- ``galaxy.tool_shed.util.repository_util.get_installed_repository``
  module-level import.
- A DB round-trip per shed-tool materialise.

``installed_tool_dependencies`` readers see
``tool_dependencies_installed_or_in_error=[]`` for materialised lazy
tools — same shape the eager pipeline produces when the install-DB
row is missing, so downstream callers already tolerate this. If a job
runner ever needs real DB-backed dependency info, it can query
``app.install_model`` directly using the scalar shed metadata already
on ``Tool``.

16/16 integration tests still pass.
``Registry.load_datatype_converters`` (``lib/galaxy/datatypes/registry.py:674``)
walks the converter list from ``datatypes_conf.xml`` after boot and
calls ``toolbox.load_tool(config_path)`` per entry. Under strict
``LazyToolBox.create_tool`` every converter raised ``RuntimeError``
(caught + logged by the registry as ``Error loading converter (…)``)
and the ``datatype_converters`` dict stayed silently empty — so
format-mismatch conversions via
``Dataset.find_conversion_destination`` were no-ops.

Fix: in ``discover_tools()``, after the hidden-lib block, query
``galaxy.model._get_datatypes_registry()`` and yield a
``DiscoveredTool`` for each converter the registry has parsed. Same
source of truth ``load_datatype_converters`` iterates, so we can't
drift — what the registry will try to load is exactly what the
populator indexes.

The registry is populated by ``set_datatypes_registry()`` at app boot
(``app/__init__.py:853``) before the toolbox initialises and inside
``populate_store(config_file=…)`` before ``populate_store_inline`` runs
— both paths produce a live registry by the time the populator walks.
On the (atypical) case where the registry isn't set, the try/except
falls through and the converters remain unindexed; the eager
``load_datatype_converters`` then continues to catch + log the
``Error loading converter`` messages, matching today's pre-refactor
behaviour.

After this commit, ``Persisted ToolIndex for store __default__``
reports 580 entries (up from 564 = +16 converters that
``sample_tool_conf.xml`` doesn't list) and zero ``Error loading
converter`` messages appear in the integration log. 16/16 integration
tests + 113/113 unit tests still pass.
CI surfaced six attributes that the tool-execution path
(``/api/tools/{id}`` POST → ``handle_input``) and the job runner read
on the resulting Tool, all of which fundamentally need a parsed
parameter tree:

- ``handle_input`` — entry point for tool execution.
- ``inputs``, ``parameters``, ``new_state`` — the parameter machinery.
- ``input_translator`` — runtime parameter translation.
- ``requires_galaxy_python_environment`` — job-runner environment hint.

Each was raising ``NotImplementedError`` for every tool-execution
request, taking out ~hundreds of API tests at once. Add them to
``_MATERIALIZE_OK`` so the stub materialises on first read and the
real Tool answers the rest of the call.

Materialise on tool execution is exactly the cost model the lazy
toolbox is designed for — pay parse cost only when the tool is
actually used. The stub still surfaces unaccounted-for attribute reads
loudly, which is the contract the user picked.
Round-2 CI surfaced five more attributes the tool-execution path
hits before the parameter machinery has finished setting up:

- ``check_and_update_param_values`` — parameter validation called from
  ``handle_input``.
- ``wants_params_cleaned`` — parameter scrub before execution.
- ``tool_source`` — raw ``ToolSource`` (some callers walk the XML
  directly).
- ``dynamic_tool`` — dynamic-tool linkage on execution.
- ``produces_entry_points`` — interactive-tool entry-point lookup.

All five materialise correctly when read; adding to ``_MATERIALIZE_OK``
makes the stub's strict ``__getattr__`` route through the cached
``_real`` Tool on first access instead of raising.
Eight more execution-path attrs surfaced by API tests:

- ``execute`` — the tool execute method itself.
- ``expand_incoming`` — parameter expansion for multi-run / map-over.
- ``params_to_strings`` — parameter serialisation.
- ``tool_action`` — execution action dispatch.
- ``tool_dir`` — on-disk directory accessor (read by some callers).
- ``completed_jobs`` — job-search-by-tool API.
- ``get_default_history_by_trans`` — default-history lookup during execute.
- ``regenerate_imported_metadata_if_needed`` — import/export job path.

All execution-path; correct to materialise on first read.
``tool_action`` graduated to ``_MATERIALIZE_OK`` in the previous commit
(it's read on the execute path). The strict-raise assertion needs an
attribute the stub neither forwards nor materialises;
``totally_not_a_tool_attr`` is guaranteed unaccounted-for.
Seven more execution + introspection-path attrs from the round-5 CI:

- ``outputs`` — output spec read by ``execute`` / display formatters.
- ``tests`` — tool tests definitions surfaced via ``/api/tools/.../tests``.
- ``to_json`` — tool JSON serialisation.
- ``tool_requirements_status`` — ``/api/tools/.../requirements_status``.
- ``provided_metadata_file`` — job-runner metadata path resolution.
- ``test_data_path`` — tool test data lookup.
- ``get_configured_job_handler`` — job-runner handler routing.

Cascading WorkflowStep ``module`` injection errors in the same log
were a downstream effect: workflow steps couldn't resolve their tool
modules because the underlying ``LazyTool`` raised on these reads.
Materialising lets the injector populate ``step.module`` normally.
``/api/tools/<id>`` show endpoint passes ``io_details=True`` to surface
``inputs`` / ``outputs`` / parameter detail; the previous condition
only materialised on ``link_details=True``, so the listing-shape dict
came back without inputs and ``test_show_repeat`` / ``test_show_multi_data``
asserted on a missing key.

Add ``io_details`` to the materialise predicate. Forward both flags into
``Tool.to_dict`` so the parent picks the right payload variant. The
entry-only fast path now only applies to the truly cheap case (the
``/api/tools?in_panel=False`` listing).

Materialise failure still falls back to the entry-only dict — listing
remains 200 even if a specific tool can't materialise.
The entry-only fast path was missing fields the ``/api/tools/<id>``
show endpoint serves (``xrefs``, ``inputs`` when ``io_details=False``,
…). The flat ``/api/tools?in_panel=False`` listing — the only caller
that benefits from the fast path — already bypasses this method:
``services/tools.py:list_tools`` walks ``tool_index.list_all()`` and
calls ``entry.to_api_dict()`` directly.

Always materialise here; the entry-only dict survives as the fallback
when materialise itself raises (a tool whose XML the parameter
factory chokes on — ``upload_dataset``, ``column="value"``, …) so the
caller still gets *something* renderable rather than a 500.

Tests updated:
- ``test_to_dict_fast_path_does_not_materialise`` → renamed to
  ``test_to_dict_falls_back_to_entry_dict_when_materialise_fails``.
- ``test_to_dict_link_details_materialises_exactly_once`` → renamed
  for accuracy; behaviour unchanged.
After five rounds of CI surfacing missing stub attrs and the latest
round still hitting indirect failures (workflow steps with ``module``
unset because some inner attribute access raised under strict), the
explicit ``_MATERIALIZE_OK`` set has stopped converging. The strict
guard caught the obvious surface and now slows ship velocity without
adding signal.

Flip the default: unknown attr reads on ``LazyTool`` now log a WARNING
and materialise. The strict raise is still available for debugging via
``LAZY_TOOL_STRICT=1`` — the path that surfaces ``add to the stub
surface or _MATERIALIZE_OK`` for new audit work.

``test_strict_getattr_raises_with_clear_message`` now monkey-patches
the module flag to force strict mode for the assertion. Other tests are
unchanged.
These were emitted on every entry of /api/tools but no client component
ever read them. The Vue panel renderer derives the link target from
``model_class == "DataSourceTool"`` via ``getTargetById`` and never reads
``min_width`` at all; the Tool interface declared the fields purely to
match the wire shape. ``Tool.target`` / ``Tool.uihints`` likewise had
``to_dict`` as their sole reader in lib/.

Removed:
- ``self.target`` / ``self.uihints`` attributes and the corresponding
  ``<inputs target>`` and ``<uihints minwidth>`` reads in ``__parse_legacy_features``
- ``DataSourceTool.parse_inputs`` self.target = "_top" override (dead)
- ``min_width`` / ``target`` keys from both the eager ``to_dict`` payload
  and the ``LazyTool.to_dict`` stub

The XSD declarations for ``<uihints minwidth>`` and ``<inputs target>``
stay so existing data-source tool XML continues to validate cleanly.
Companion to the server-side removal in ``Tool.to_dict``. The only
component still reading ``item.target`` was ``ToolsListTable.vue``,
which used ``target === 'galaxy_main'`` to flag a tool as "local"
(driving the wrench-vs-external icon, the External badge, and
internal vs ``href`` routing). The server-side equivalent of that
flag is ``isinstance(self, DataSourceTool)`` — the only tool class
whose ``target`` was ever ``_top`` rather than ``galaxy_main`` —
so switch the check to ``model_class !== 'DataSourceTool'``, matching
the heuristic already used by ``toolStore.getTargetById``.

Removed:
- ``min_width`` / ``target`` from the ``Tool`` interface
- ``min_width`` / ``target`` keys from the ToolsView test fixture
- ``min_width`` / ``target`` from the ``createMockTool`` factory in tool-version.test.ts
After commit ``108529ed9ee`` removed ``min_width`` and ``target``,
``Tool.to_dict``'s ``link_details=True`` branch only added ``link`` — a
two-line ``url_for`` call that returns a stable URL pattern. The
toolbox internals (``get_tool_to_dict``, ``to_dict(in_panel=True)``,
``to_panel_view``, the ``ToolSection`` else-branch in both) all
unconditionally passed ``link_details=True``. Under ``LazyToolBox``
that forced full materialise on every panel-visible top-level tool
just to compute one URL.

Promote ``link`` to an unconditional field in ``Tool.to_dict`` and drop
the ``link_details=True`` from every toolbox-internal caller. The
``link_details`` parameter stays on the signature (still accepted by
``/api/tools/{id}?link_details=…``) but is now a no-op gate — no field
is conditional on it.

Net effect:
- Lazy mode: ``to_panel_view`` no longer materialises top-level tools.
  The lazy stub serves ``link`` from its ``id`` in the entry-shape fast
  path (restored in a follow-up commit).
- Eager mode: serialisation drops one ``if`` branch per tool. No
  visible behaviour change.

321/321 unit tests pass.
With ``link_details`` no longer gating any field, the toolbox-internal
panel / listing callers stopped passing it (``f64858bf55e``).
``LazyTool.to_dict`` can now return the entry-shape dict for the
default-call case — every panel-view tool serves from the index, zero
materialise.

Materialise still fires when the caller asks for the parsed-Tool
payload: ``io_details=True`` (the ``/api/tools/{id}`` show endpoint
contract) or ``tool_help=True`` (rendered help). Failure-to-materialise
still falls back to the entry-shape dict so the show endpoint never
returns 500 when a specific tool's XML chokes the parameter factory.

The ``link`` field is derived from ``self.id`` matching what
``Tool.to_dict`` now emits unconditionally — ``/tool_runner?tool_id={id}``
for the standard tool runner.

Tests updated:
- ``test_to_dict_entry_fast_path_does_not_materialise`` asserts the
  default call serves the entry dict and never invokes the callback.
- ``test_to_dict_io_details_materialises`` covers the show contract.
- ``test_to_dict_falls_back_to_entry_when_materialise_fails`` covers the
  parameter-factory-failure case.

23/23 unit tests pass.
`LazyTool` materialise during workflow ``inject_all`` calls
``store.get(hash)``; the store ran the SELECT on Galaxy's shared
scoped session and then called ``session.rollback()`` to release the
implicit read transaction. That rollback expired *every*
request-scoped instance attached to the shared session — including the
in-flight ``workflow.steps`` collection. The next access in the same
``populate_module_and_state`` re-fetched a fresh ``WorkflowStep`` list
from the DB, with ``.module`` unset on the new instances, so
``compute_runtime_state``'s ``assert step.module`` raised
``AttributeError: 'WorkflowStep' object has no attribute 'module'``.
Every workflow-invocation test under ``use_lazy_toolbox=true`` 500'd.

Fix: read methods on ``DatabaseToolSourceStore`` (``get``, ``exists``,
``get_by_tool_id``, ``get_by_source_path``, ``count``, ``list_all``,
``load_index``) now open a private ``Session`` bound to the same
engine and close it on exit. The shared scoped session — and its
caller's transaction state — is untouched. The original lock-release
intent (cold-start populator, queue-worker init) still holds because
each private session's read transaction ends when the session closes.

Writes (``store``, ``store_index``, ``delete``, ``update_index_entry``)
still use ``_get_session()`` so they participate in the caller's
transaction and commit in the right context (populator, queue worker,
``LazyToolBox.__init__``).
The lazy ``to_dict`` fast path was short-circuiting *both* the
panel-view walk (intended) *and* the ``/api/tools/{id}`` show endpoint
(unintended) — they're indistinguishable at the ``to_dict(io_details=
False, link_details=False)`` callsite. ``test_legacy_biotools_xref_
injection`` caught it: the show endpoint returned the entry-shape
dict and no longer carried ``xrefs``.

Split the two call sites at the API rather than the parameter:

* ``Tool.to_panel_entry(trans)`` — new method, returns the cheap
  entry-shape dict (id/name/version/description/labels/edam/panel
  section/link). ``LazyTool`` overrides it to skip materialise.
* ``AbstractToolBox.get_tool_to_dict`` calls ``to_panel_entry`` when
  ``tool_help=False`` (the panel-view default); ``tool_help=True``
  still falls through to ``to_dict`` so the help payload renders.
* ``LazyTool.to_dict`` always materialises now — it's the
  ``/api/tools/{id}`` contract and must ship ``xrefs`` / ``versions``
  / ``is_workflow_compatible`` / ``tool_shed_repository`` / etc. The
  materialise-failure fallback now reuses ``to_panel_entry``.

Net effect: ``/api/tool_panels/{view}`` and ``/api/tools?in_panel=
true`` stay materialise-free for the lazy toolbox; ``/api/tools/{id}``
serves the full show-endpoint dict in both modes.
@jmchilton
Copy link
Copy Markdown
Member

This isn't dead, but i don't love the complexity in managing these extra services. Maybe that's fine though, you'd probably only need this at the usegalaxy scale ...

I would trade some admin complexity we can hide behind Ansible over pulling the code base in directions we don't want (there was a lot in here I liked and thought were good directions on last review - just not all of it). Admin of Galaxy is complex but the codebase itself is much more complex 😅 IMO - I also think decomposing the API layer is the kind of abstraction we want whereas making the tool APIs more brittle and less functional or reducing the cohesion (not that you're doing that - or the latest version is doing this - but it is my directional concern the API decomposition) is something we don't want IMO.

You're scaring me but I trust you. Sorry if it feels like I'm picking on this PR.

@mvdbeek
Copy link
Copy Markdown
Member Author

mvdbeek commented May 19, 2026

It's all still in draft, though i think this version is a good bit better now in terms of the panel handling, and special casing. I don't think this will make it to 26.1, but it seems feasible to have just the toolbox variant and the stub tools and not touch any of the remaining code, keeping all features and APIs intact (we should drop some legacy data like links and min_width). The rest of the code is mostly about resetting the state for integration tests and laying out what will cause full materialisation of a tool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants