The library is split so that pipeline code sits at the top, operation drivers sit in the middle, and YT-facing adapters sit closer to the foundation. Tach encodes the allowed imports in tach.toml at the repo root: each yt_framework.* subtree lists its depends_on, and the file also sets the layer ordering, layers_explicit_depends_on, exact unused-edge detection (exact), and a ban on circular first-party dependencies. CI runs tach check, plus tach check-external to compare third-party imports against the runtime dependencies in pyproject.toml.
Canonical contract: treat tach.toml plus this page as the source of truth for who may import whom. Other checks echo the same rules in different forms (see below).
Roughly:
- Foundation (yt_framework.utils, yt_framework.job_command, yt_framework.typed_jobs, and the mostly empty yt_framework namespace): must not import core, operations, or yt.
- yt_framework.yt: factory and package __init__ only depend on yt_framework.yt.clients.
- yt_framework.yt.support: max row weight, dev simulator, prod/dev runtime helpers, secure-env splitting, and shared OperationResources dataclass. Depends on yt_framework (for example yt_framework._layout for PYTHONPATH roots) and may load ytjobs dynamically where needed. Nothing here may import yt_framework.yt.clients or the pipeline layers above YT.
- yt_framework.yt.clients: BaseYTClient, dev/prod clients, YQL request types under clients.yql, mixins under clients._client_split, and the public operation specs. Depends on support, job_command, and utils. Must not import yt_framework.contracts (contracts depend on this client surface instead).
- yt_framework.contracts: StageDependencies and StageContext for stage injection. Depends on yt_framework.yt.clients (for the BaseYTClient type used in the protocol). Must not import core or operations. Lets core depend on shared stage types without importing the whole operations package for that alone.
- yt_framework.operations: map/vanilla/map-reduce drivers, upload, S3 helpers. Declares yt_framework.yt.clients, yt_framework.contracts, and finer Tach modules under operations.command_ops, operations.common, and operations._internal where those subtrees have their own depends_on. Must not import yt_framework.core. Type-only imports still count toward Tach (ignore_type_checking_imports = false).
- yt_framework.core: BasePipeline, stage discovery/registry, BaseStage, concrete PipelineStageDependencies. Imports operations, contracts, utils, yt (factory entry), and yt.clients for types used by the pipeline.
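The layering above can be modelled as a small dependency table. The sketch below is illustrative only: the ALLOWED mapping is a simplified, hand-written mirror of the edges listed on this page (it omits the finer operations.* modules), the helper names owning_module and may_import are hypothetical, and submodule names such as yt_framework.operations.map are made up for the example. tach.toml remains the source of truth.

```python
from graphlib import TopologicalSorter
from typing import Optional

# Assumption: simplified copy of the edges described in the list above.
# Keys are modules, values are the modules they are allowed to import.
ALLOWED = {
    "yt_framework": set(),
    "yt_framework.utils": set(),
    "yt_framework.job_command": set(),
    "yt_framework.typed_jobs": set(),
    "yt_framework.yt.support": {"yt_framework"},
    "yt_framework.yt.clients": {
        "yt_framework.yt.support",
        "yt_framework.job_command",
        "yt_framework.utils",
    },
    "yt_framework.yt": {"yt_framework.yt.clients"},
    "yt_framework.contracts": {"yt_framework.yt.clients"},
    "yt_framework.operations": {
        "yt_framework.yt.clients",
        "yt_framework.contracts",
    },
    "yt_framework.core": {
        "yt_framework.operations",
        "yt_framework.contracts",
        "yt_framework.utils",
        "yt_framework.yt",
        "yt_framework.yt.clients",
    },
}


def owning_module(dotted: str) -> Optional[str]:
    """Return the longest declared module that contains the dotted path."""
    matches = [m for m in ALLOWED if dotted == m or dotted.startswith(m + ".")]
    return max(matches, key=len) if matches else None


def may_import(importer: str, imported: str) -> bool:
    """Check one import edge against the simplified table."""
    src, dst = owning_module(importer), owning_module(imported)
    if src is None or dst is None:
        return True  # outside the modelled graph; this sketch does not judge it
    return src == dst or dst in ALLOWED[src]


# static_order() raises graphlib.CycleError if the table contains a cycle,
# and otherwise yields dependencies before the modules that import them.
BUILD_ORDER = list(TopologicalSorter(ALLOWED).static_order())
```

Under these assumptions, may_import("yt_framework.core.pipeline", "yt_framework.operations.map") is allowed while the reverse edge is rejected, and BUILD_ORDER places yt_framework.yt.clients before yt_framework.contracts, which in turn comes before yt_framework.operations and yt_framework.core.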
yt_framework.operations.stage_contracts remains a thin re-export of yt_framework.contracts for older import paths.
ytjobs stays on the job side of the boundary: it must not import yt_framework (same tach.toml rules).
Pre-commit also runs strict BasedPyright, Ruff, Xenon, Vulture, and small repo policies (file length, directory width, binding-word limits) described in CONTRIBUTING.md at the repository root.
tests/test_architecture_boundaries.py applies a few line-based greps over yt_framework/operations and yt_framework/yt. That overlaps Tach for some edges (for example, operations must not import core). The duplication is intentional: Tach owns the full graph, while the tests fail with filenames and line numbers when someone bypasses the usual import layout. If you change boundaries, update tach.toml, and update those tests as well whenever the rule is one you still want enforced with that readable failure output.
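A line-based check of this kind might look roughly like the following. This is a minimal sketch, not the contents of tests/test_architecture_boundaries.py: the pattern, the function name find_violations, and the sample file names are assumptions, and the sources are passed in as in-memory strings so the example stays self-contained (a real test would read files from disk).

```python
import re
from typing import Iterable, List, Pattern, Tuple

# Matches "import yt_framework.core..." and "from yt_framework.core... import ...".
# Assumption: one plausible pattern; the real tests may use different ones.
FORBIDDEN_CORE = re.compile(r"^\s*(?:from|import)\s+yt_framework\.core(?:\.|\s|$)")


def find_violations(
    files: Iterable[Tuple[str, str]], pattern: Pattern[str]
) -> List[str]:
    """Return 'filename:lineno: source line' for every line matching pattern."""
    hits = []
    for filename, text in files:
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append(f"{filename}:{lineno}: {line.strip()}")
    return hits


# Hypothetical file contents standing in for yt_framework/operations.
SAMPLE_FILES = [
    (
        "yt_framework/operations/map.py",
        "import os\nfrom yt_framework.core.pipeline import BasePipeline\n",
    ),
    (
        "yt_framework/operations/upload.py",
        "from yt_framework.contracts import StageContext\n",
    ),
]

VIOLATIONS = find_violations(SAMPLE_FILES, FORBIDDEN_CORE)
```

A grep like this cannot see every import form (for example, from yt_framework import core would slip past this particular pattern), which is one reason Tach keeps ownership of the full graph.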
Pipeline orchestration is split across pipeline.py, pipeline_cli.py, pipeline_config.py, registry, discovery, and dependencies so a single file stays under the repo policy limit ([tool.yt_framework.pre_commit.max_file_lines] in pyproject.toml, currently 550 lines for yt_framework and ytjobs). Xenon (pre-commit) also discourages letting any one module absorb the whole flow.
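To illustrate the file-length policy, here is a small sketch of how such a limit could be checked. It is not the repository's pre-commit hook: the function name over_limit is hypothetical, the limit is hard-coded from the figure quoted above rather than read from pyproject.toml, and file contents are passed in as strings.

```python
from typing import Dict, List, Tuple

# Assumption: mirrors the limit quoted above; the real value lives in pyproject.toml.
MAX_FILE_LINES = 550


def over_limit(
    files: Dict[str, str], limit: int = MAX_FILE_LINES
) -> List[Tuple[str, int]]:
    """Return (filename, line_count) for every file longer than limit, sorted by name."""
    offenders = []
    for name, text in files.items():
        count = len(text.splitlines())
        if count > limit:
            offenders.append((name, count))
    return sorted(offenders)
```

With this sketch a 550-line file passes and a 551-line file is reported, so the limit is treated as inclusive; whether the real hook draws the line the same way is not stated here.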