Triton bump, py3.14 + CUDA 13.0 by mgorny · Pull Request #477 · conda-forge/pytorch-cpu-feedstock

mgorny · 2026-01-26T12:53:23Z

Checklist

Used a personal fork of the feedstock to propose changes
Bumped the build number (if the version is unchanged)
Reset the build number to 0 (if the version changed)
Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
Ensured the license file is being packaged.

Fixes #457
Fixes #420

Signed-off-by: Michał Górny <mgorny@quansight.com>

CUDA 13.0 requires architecture `sm_75` or higher, and renamed `sm_101` to `sm_110`. To build for these, maintainers will need to modify their existing list of specified architectures (e.g. `CMAKE_CUDA_ARCHITECTURES`, `TORCH_CUDA_ARCH_LIST`, etc.) for their package.

Since CUDA 12.8, the conda-forge nvcc package sets `CUDAARCHS` and `TORCH_CUDA_ARCH_LIST` in its activation script to a string containing all of the supported real architectures plus the virtual architecture of the latest. Recipes for packages that use these variables to control their build but do not want to build for all supported architectures will need to override these variables in their build script.

ref: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#new-features

> [!IMPORTANT]
> Remember to update any CUDA 11/12 specific selector syntax in the recipe to include
> CUDA 13. For example `# [(cuda_compiler_version or "None").startswith("12")]`
> might be replaced with `# [cuda_compiler_version != "None"]`.
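To make the two changes described above concrete, here is a minimal Python sketch. Both helpers (`adapt_arch_list_for_cuda13` and `selector_matches`) are hypothetical names introduced for illustration and are not part of conda-build or the feedstock. The first assumes a semicolon-separated list in the `TORCH_CUDA_ARCH_LIST` style with plain numeric entries and an optional `+PTX` suffix; CMake-style lists use a different notation and are not handled. The second merely evaluates a selector expression as ordinary Python to show why a `startswith("12")` check silently skips CUDA 13.

```python
def adapt_arch_list_for_cuda13(arch_list: str) -> str:
    """Drop architectures below 7.5 and rename 10.1 to 11.0.

    Hypothetical helper. Expects entries such as "7.5" or "10.1+PTX",
    separated by semicolons.
    """
    adapted = []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split off an optional "+PTX" suffix so it survives the rewrite.
        version, plus, suffix = entry.partition("+")
        major, minor = (int(part) for part in version.split("."))
        if (major, minor) < (7, 5):
            continue  # CUDA 13.0 requires sm_75 or higher
        if (major, minor) == (10, 1):
            major, minor = 11, 0  # sm_101 was renamed to sm_110
        adapted.append(f"{major}.{minor}{plus}{suffix}")
    return ";".join(adapted)


def selector_matches(selector: str, cuda_compiler_version: str) -> bool:
    """Evaluate a selector expression for one variant value (illustration only)."""
    return bool(eval(selector, {"cuda_compiler_version": cuda_compiler_version}))


old_selector = '(cuda_compiler_version or "None").startswith("12")'
new_selector = 'cuda_compiler_version != "None"'

print(adapt_arch_list_for_cuda13("5.0;7.0;7.5;8.0;10.1+PTX"))
print(selector_matches(old_selector, "13.0"), selector_matches(new_selector, "13.0"))
```

A recipe that wants a narrower set of targets would still have to export the resulting string itself in its build script; the sketch only shows the shape of the transformation.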

@carterbox


conda-forge-admin · 2026-01-26T12:54:56Z

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

ℹ️ The magma output has been superseded by libmagma-devel.
ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/21452160226. Examine the logs at this URL for more detail.

mgorny · 2026-01-26T15:44:52Z

Reminder to self: once this is merged, enable triton tests on py3.14.

mgorny · 2026-01-26T17:44:08Z

Minimal test run passed. Now let's do the full thing…

h-vetinari · 2026-01-26T20:56:14Z

@mgorny, this is still using GPU agents to build the CPU versions, c.f. my comment from #475

> @h-vetinari, do we want to include CUDA 13 migration for when the final is released?

As long as you use a development install of smithy (combined with the skip from #332, so that CPU builds run on non-GPU agents), that's OK for me.

jakirkham

Thanks Michał! 🙏

Had a question below about MAGMA usage

mgorny · 2026-01-26T21:23:51Z

> @mgorny, this is still using GPU agents to build the CPU versions, c.f. my comment from #475
>
> > @h-vetinari, do we want to include CUDA 13 migration for when the final is released?
>
> As long as you use a development install of smithy (combined with the skip from #332, so that CPU builds run on non-GPU agents), that's OK for me.

Ah, sorry, I was missing #332. I was wondering why git conda-smithy didn't produce any differences, and figured the relevant changes must've been released already.

mgorny · 2026-01-26T21:24:38Z

Looks like libmagma-devel change broke Windows.


h-vetinari · 2026-01-26T22:11:52Z

It looks like we might have a new must-fix issue for any new PRs here: #478


mgorny · 2026-01-28T19:17:46Z

Added the TorchConfig.cmake fixes tested in #480, since we'd be having another build round anyway.

h-vetinari · 2026-01-29T02:56:58Z

Your fix looks obviously correct™️, so I'm going to cancel CI. The server is heavily congested right now, so this has too little marginal benefit IMO. We can merge this without rerunning CI once the congestion has cleared a bit.

h-vetinari · 2026-01-29T11:11:53Z

OK, flash-attn is through (well, at least has stopped consuming agents), tensorflow is down to one job that's almost done, and while there's a stray webkit still around, that shouldn't stop us from merging this one.

Bombs away!

bdice · 2026-01-29T20:49:06Z

Congrats and great thanks to @mgorny and everyone who helped with this effort!

Tobias-Fischer · 2026-01-29T22:40:00Z

Not sure where the best place is to report, but in conda-forge/theseus-ai-feedstock#32 I get

  Theseus CUDA support: True (forced by THESEUS_FORCE_CUDA env var)
  Traceback (most recent call last):
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
      main()
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
      json_out["return_val"] = hook(**hook_input["kwargs"])
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 175, in prepare_metadata_for_build_wheel
      return hook(metadata_directory, config_settings)
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/setuptools/build_meta.py", line 378, in prepare_metadata_for_build_wheel
      self.run_setup()
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/setuptools/build_meta.py", line 518, in run_setup
      super().run_setup(setup_script=setup_script)
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/setuptools/build_meta.py", line 317, in run_setup
      exec(code, locals())
    File "<string>", line 136, in <module>
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1408, in CUDAExtension
      library_dirs += library_paths(device_type="cuda")
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1626, in library_paths
      if (not os.path.exists(_join_cuda_home(lib_dir)) and
    File "/home/conda/feedstock_root/build_artifacts/theseus-ai_1769725632963/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 3094, in _join_cuda_home
      raise OSError('CUDA_HOME environment variable is not set. '
  OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
  error: subprocess-exited-with-error

h-vetinari · 2026-01-29T23:28:07Z

None of the CUDA builds has been uploaded yet. Your PR ends up using CPU pytorch.

(I noticed too late that this PR closed #457 and #420; given the enormous amount of time necessary to build out pytorch, we should have waited until builds are online before giving the migrator the sign to move on)
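The traceback above only surfaces as a missing `CUDA_HOME`, which hides the fact that a CPU-only PyTorch was installed. As a hedged sketch, a downstream build script could check `torch.version.cuda` (which is `None` for CPU-only builds) before constructing a CUDA extension. The helper names below are hypothetical, and the `SimpleNamespace` objects are stand-ins for `import torch` so the example runs without PyTorch installed.

```python
from types import SimpleNamespace
from typing import Optional


def cuda_version_of(torch_module) -> Optional[str]:
    """Return the CUDA version the given torch build targets, or None for CPU builds."""
    return getattr(torch_module.version, "cuda", None)


def require_cuda_build(torch_module) -> str:
    """Fail early with a clear message when a CPU-only PyTorch was picked up."""
    cuda = cuda_version_of(torch_module)
    if cuda is None:
        raise RuntimeError(
            "A CPU-only PyTorch build was installed; "
            "CUDA extensions cannot be built against it."
        )
    return cuda


# Stand-ins for the real `torch` module (hypothetical values for demonstration).
cpu_torch = SimpleNamespace(version=SimpleNamespace(cuda=None))
gpu_torch = SimpleNamespace(version=SimpleNamespace(cuda="12.8"))

print(require_cuda_build(gpu_torch))
```

In a real `setup.py` the argument would simply be the imported `torch` module.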

jakirkham · 2026-01-30T00:41:20Z

Alternatively, we could configure downstream feedstocks to wait until the PyTorch packages are available before attempting migration:

bot:
    # only open PRs if resulting environment is solvable, useful for tightly coupled packages
    check_solvable: true

h-vetinari · 2026-01-30T02:32:32Z

I much prefer PRs to be opened even if they don't pass yet. That's infinitely more visible than something lost in the bowels of the bot infrastructure. It's only a minor inconvenience if CI on those PRs has to be restarted, which would have been nice to avoid, but it's ultimately not a big deal IMO.

h-vetinari · 2026-01-30T03:00:57Z

win+CUDA12.8:

$ gh run download 21475924468 --repo conda-forge/pytorch-cpu-feedstock --name conda_artifacts_21475924468_win_64_channel_targetsconda-forge_maincu_hca575dce
$ unzip pytorch-cpu-feedstock_conda_artifacts_.zip
$ cd bld/win-64 && rm current_repodata.json index.html repodata*
$ ls
libtorch-2.10.0-cuda128_mkl_h97e3598_301.conda       pytorch-gpu-2.10.0-cuda128_mkl_hc88b545_301.conda
pytorch-2.10.0-cuda128_mkl_py310_hdd2a298_301.conda  pytorch-tests-2.10.0-cuda128_mkl_py310_hf0eca92_301.conda
pytorch-2.10.0-cuda128_mkl_py311_h0cb71aa_301.conda  pytorch-tests-2.10.0-cuda128_mkl_py311_hc85c64c_301.conda
pytorch-2.10.0-cuda128_mkl_py312_hc4f88d7_301.conda  pytorch-tests-2.10.0-cuda128_mkl_py312_hb3d0777_301.conda
pytorch-2.10.0-cuda128_mkl_py313_h716786b_301.conda  pytorch-tests-2.10.0-cuda128_mkl_py313_hd85d54a_301.conda
pytorch-2.10.0-cuda128_mkl_py314_hc058aa6_301.conda  pytorch-tests-2.10.0-cuda128_mkl_py314_hfe9566a_301.conda
$ ls | xargs anaconda upload
$ DELEGATE=h-vetinari
PACKAGE_VERSION=2.10.0
for package in libtorch pytorch pytorch-gpu pytorch-tests; do
  anaconda copy --from-label main --to-label main --to-owner conda-forge ${DELEGATE}/${package}/${PACKAGE_VERSION}
done

The CUDA 13.0 build failed due to losing connection with the agent; if we're lucky the reduction in GPU arches means that libtorch will be small enough to succeed uploading upon restart.
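The artifact names in the listing above encode the variant in their build strings. The following sketch shows one way to pull that information out, for example to confirm that a download contains the expected CUDA variant before uploading. `parse_artifact` is a hypothetical helper; it assumes the usual `name-version-build.conda` layout, a build string whose last underscore-separated field is the build number, and that a leading `cudaNNN` field marks a CUDA variant.

```python
import re

# name may contain hyphens; version and build string may not.
ARTIFACT_RE = re.compile(r"^(?P<name>.+)-(?P<version>[^-]+)-(?P<build>[^-]+)\.conda$")


def parse_artifact(filename: str) -> dict:
    """Split a conda artifact filename into name, version, and build details."""
    match = ARTIFACT_RE.match(filename)
    if match is None:
        raise ValueError(f"not a .conda artifact name: {filename!r}")
    build = match["build"]
    cuda = re.match(r"cuda\d+", build)         # e.g. "cuda128"
    python = re.search(r"(?:^|_)(py\d+)_", build)  # e.g. "py310"
    return {
        "name": match["name"],
        "version": match["version"],
        "cuda_tag": cuda.group(0) if cuda else None,
        "python_tag": python.group(1) if python else None,
        "build_number": int(build.rsplit("_", 1)[1]),
    }


info = parse_artifact("pytorch-tests-2.10.0-cuda128_mkl_py310_hf0eca92_301.conda")
print(info)
```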

h-vetinari · 2026-01-30T10:43:31Z

Obviously, with an extra python version to build & test for pytorch & pytorch-tests, our runtime for the CUDA 12.9 builds has blown up further still - 22h30 for the longest single job.

I also noticed that the libtorch builds for 12.9 and 13.0 have a massive size difference; ~470MB for 13.0 and ~850MB for 12.9. Part of that is explained by -compress-mode=size (5c1be2d), but we should perhaps consider thinning out the GPU arches a bit also for 12.9...

mgorny · 2026-01-30T11:18:27Z

I suppose we could start considering removing some of the targets common to CUDA 12.x and 13.x, but we probably need to be careful (though I think PTX should make this less harmful?)
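To reason about which targets could be dropped, it may help to spell out the compatibility rules in code. The sketch below is a simplification under stated assumptions: native code built for compute capability X.y runs on devices X.z with z ≥ y, PTX for X.y can be JIT-compiled on any device with an equal or higher capability, and architecture-specific targets (such as those with an `a` suffix) are ignored. `parse_arch_list` and `how_it_runs` are hypothetical helpers that read a `TORCH_CUDA_ARCH_LIST`-style string.

```python
def parse_arch_list(arch_list: str):
    """Return (native_targets, ptx_targets) as sets of (major, minor) tuples."""
    native, ptx = set(), set()
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        version, _, suffix = entry.partition("+")
        major, minor = (int(part) for part in version.split("."))
        native.add((major, minor))
        if suffix.upper() == "PTX":
            ptx.add((major, minor))  # "+PTX" additionally embeds PTX for this target
    return native, ptx


def how_it_runs(device, arch_list: str):
    """Return "native", "jit", or None for a device given as (major, minor)."""
    native, ptx = parse_arch_list(arch_list)
    # Native code is usable within the same major version, at or below the device's minor.
    if any(t[0] == device[0] and t[1] <= device[1] for t in native):
        return "native"
    # PTX can be JIT-compiled for any device at or above the PTX target.
    if any(t <= device for t in ptx):
        return "jit"
    return None


print(how_it_runs((8, 6), "7.5;8.0;9.0+PTX"))
print(how_it_runs((12, 0), "7.5;8.0;9.0+PTX"))
print(how_it_runs((7, 5), "8.0;9.0+PTX"))
```

Under these simplified rules, PTX emitted only for the newest architecture covers newer devices but not older ones whose native target was removed, which is one reason to be careful when thinning the list.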

jakirkham · 2026-01-30T20:23:33Z

Perhaps this would be worth discussing in a new issue?

h-vetinari · 2026-01-30T20:56:58Z

Feel free to open an issue!

h-vetinari · 2026-01-31T06:15:20Z

All builds are online now. I've just started CI for d392c50, which is the backport of the CMake fix to v2.9.x.

Closes #296

Restores CUDA 13 conda test CI jobs, now that there are conda-forge PyTorch packages with CUDA 13 support (conda-forge/pytorch-cpu-feedstock#477).

Also modifies the `pytorch` conda dependency to meet these requirements:

* `cugraph-pyg` must be installable on a system without a GPU
* `cugraph-pyg`'s tests require CUDA-enabled builds of PyTorch

With the following mix of things:

* add a `require_gpu` matrix filter in `dependencies.yaml` which pulls in `pytorch-gpu`; test CI jobs opt into it, but nothing else does
  - *`conda-forge::pytorch-gpu` is a metapackage that forces the installation of CUDA variants of `conda-forge::pytorch`... that should replace the "accidentally pulled in a CPU-only variant" case with a loud, clear conda solver error*
* depend on `mkl` in the test x86_64 environment but without version constraints
  - *allow `pytorch` to declare its range of compatible `mkl` versions*
  - *this still prevents OpenBLAS variants from getting installed, which I think was part of the goal of #161*
  - *keeping this out of `cugraph-pyg`'s dependencies still makes it possible to install alongside `nomkl`, even though that combination is untested*
* add comments in the `cugraph-pyg` conda recipe explaining why it doesn't depend on `pytorch-gpu`

I hope this will be a relatively future-proof way to guarantee CI here keeps picking up the PyTorch versions this project wants to test against.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Alex Barghi (https://github.com/alexbarghi-nv)
- Bradley Dice (https://github.com/bdice)

URL: #395

There are now `pytorch` CUDA 13 packages (started with `pytorch` 2.10: conda-forge/pytorch-cpu-feedstock#477).

This adds them to the test environment so they'll be tested in CUDA 13 integration testing jobs.

More details on the history of PyTorch in those jobs: #20748 (comment)

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #21663

mgorny and others added 10 commits January 26, 2026 13:27

Bump triton to 3.6.0

2a91d78

Signed-off-by: Michał Górny <mgorny@quansight.com>

specify arches for CUDA 13; add fatbin compression

5c1be2d

do not run the set of python tests on CUDA 13.0; server has too old GPU

4db8428

simplify GPU requirements

feadde7

update recipe to use newer magma naming where possible

5f260d8

fix link check for cuda builds

e74090b

Rebuild for python 3.14

30000ef

Use gcc-13 for aarch64

69f6bb6

Thanks to @carterbox for the patch: conda-forge#457 (comment) Signed-off-by: Michał Górny <mgorny@quansight.com>

MNT: Re-rendered with conda-smithy 3.54.1 and conda-forge-pinning 202…

cfb7619

…6.01.26.08.52.07 Other tools: - conda-build 25.11.1 - rattler-build 0.55.1 - rattler-build-conda-compat 1.4.10

mgorny mentioned this pull request Jan 26, 2026

Upgrade to CUDA 13.0 #457

Closed

jameslamb mentioned this pull request Jan 26, 2026

PyTorch CUDA 13 packages + RAPIDS rapidsai/build-planning#215

Closed

8 tasks

mgorny force-pushed the triton-py314-cuda13 branch from f5e1df4 to cfb7619 Compare January 26, 2026 17:47

jakirkham reviewed Jan 26, 2026


h-vetinari changed the title ~~Triton bump, py3.14 + CUDA~~ Triton bump, py3.14 + CUDA 13.0 Jan 26, 2026

h-vetinari mentioned this pull request Jan 26, 2026

Publish output refactor for CUDA 12.8 conda-forge/libmagma-feedstock#32

Closed

h-vetinari and others added 2 commits January 27, 2026 08:34

build non-CUDA builds on CPU agents

0fa03a7

Co-Authored-By: Isuru Fernando <isuruf@gmail.com>

MNT: Re-rendered with conda-smithy 3.54.2.dev14+gea21a2e41 and conda-…

e1975df

…forge-pinning 2026.01.26.08.52.07 Other tools: - conda-build 25.11.1.dev19+dirty - rattler-build 0.55.1 - rattler-build-conda-compat 1.4.10

h-vetinari force-pushed the triton-py314-cuda13 branch from 30514e3 to e1975df Compare January 26, 2026 21:38

mgorny marked this pull request as ready for review January 27, 2026 09:51

mgorny requested review from Tobias-Fischer, baszalmstra and beckermr as code owners January 27, 2026 09:51

mgorny requested review from hmaarrfk, jeongseok-meta and sodre as code owners January 27, 2026 09:51

mgorny mentioned this pull request Jan 27, 2026

Try to fix TorchConfig.cmake #480

Closed

mgorny added 2 commits January 28, 2026 20:12

Add a regression test for bug conda-forge#479

b3d1078

Signed-off-by: Michał Górny <mgorny@quansight.com>

Fix torch include dirs in TorchConfig.cmake

37b7252

Fixes conda-forge#479 Signed-off-by: Michał Górny <mgorny@quansight.com>

h-vetinari merged commit 238fe50 into conda-forge:main Jan 29, 2026
13 of 32 checks passed

traversaro mentioned this pull request Jan 29, 2026

use fatbin compression for CUDA 13.0+ #419

Closed

mgorny deleted the triton-py314-cuda13 branch January 30, 2026 10:11

jameslamb mentioned this pull request Jan 30, 2026

restore conda-python-tests on CUDA 13 rapidsai/cugraph-gnn#395

Merged

lucascolley mentioned this pull request Feb 1, 2026

MAINT: officially support Python 3.14 data-apis/array-api-extra#567

Merged

h-vetinari mentioned this pull request Feb 2, 2026

Status Report (2026-01-30) Quansight-Labs/conda-ecosystem-sta-mgmt#110

Closed

jameslamb mentioned this pull request Feb 5, 2026

Python 3.14 support rapidsai/build-planning#205

Closed

38 tasks

jameslamb mentioned this pull request Mar 5, 2026

include pytorch conda packages in CUDA 13 test env rapidsai/cudf#21663

Merged

3 tasks
