Add per-file (PEP 503) #sha256= hashes to the py.mujoco.org simple index #3344

Description

@hartikainen

The feature, motivation and pitch

Hey folks,

We've been consuming the nightly index at https://py.mujoco.org/ from Bazel's rules_python, and we're hitting a small issue: the index doesn't expose per-file hashes, so rules_python can't pin the wheels and falls back to the host pip. That in turn breaks cross-platform image builds (e.g., building a Linux image on a macOS host) because of architecture mismatches. I poked at the index a bit to confirm what's going on, and found that the anchor URLs carry no #sha256= fragment and that data-requires-python is also missing. Furthermore, the server seems to ignore the Accept header (it responds with vary: Accept-Encoding only, and always with text/html, so there's no PEP 691 JSON endpoint). I think the fix on your side is pretty small, so I wanted to bring this to your attention.

My main ask is just to include the per-file hashes in the index, which the PEP 503 simple index format allows. That way rules_python (and pip/uv in general) can pin against them.

A PEP 503 "simple" index is just HTML anchor tags, and the hash is an optional URL fragment on the href.

Here's what's currently emitted:

<a href="https://py.mujoco.org/mujoco-3.x.x-cp311-...whl">mujoco-3.x.x-...whl</a>

Here's what should be emitted:

<a href="https://py.mujoco.org/mujoco-3.x.x-cp311-...whl#sha256=abcd1234...">mujoco-3.x.x-...whl</a>

I'd guess you just need a script that walks the wheel bucket, computes something like hashlib.sha256(wheel_bytes).hexdigest() for each file, and appends #sha256=... to that file's href in the HTML.

An even better, more modern and robust option would be to expose a PEP 691 (JSON) endpoint in addition to the existing PEP 503 (HTML) endpoint. Both are served from the same URL; the client selects the format through its Accept header.

The JSON for the project page, e.g. GET https://py.mujoco.org/mujoco/, would look something like:

{
  "meta": { "api-version": "1.0" },
  "name": "mujoco",
  "files": [
    {
      "filename": "mujoco-3.3.0-cp311-cp311-manylinux2014_x86_64.whl",
      "url": "https://py.mujoco.org/files/mujoco-3.3.0-cp311-cp311-manylinux2014_x86_64.whl",
      "hashes": { "sha256": "abcd1234..." },
      "requires-python": ">=3.9",
      "yanked": false
    }
  ]
}
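For illustration, here's a rough sketch of how a document like this could be generated. It assumes the wheels sit in a local directory and are served under a single base URL; `project_json`, `wheel_dir`, and `base_url` are made-up names, not anything from your infrastructure. It only emits the keys PEP 691 requires per file (filename, url, hashes) and skips requires-python, since that would mean reading each wheel's metadata.

```python
import hashlib
import json
from pathlib import Path


def project_json(name: str, wheel_dir: Path, base_url: str) -> str:
    """Build a PEP 691 style project page for every wheel in wheel_dir.

    Hypothetical helper: the directory layout and base_url are assumptions.
    """
    files = []
    for wheel in sorted(wheel_dir.glob("*.whl")):
        # Reads the whole wheel into memory; fine for a sketch.
        digest = hashlib.sha256(wheel.read_bytes()).hexdigest()
        files.append(
            {
                "filename": wheel.name,
                "url": f"{base_url.rstrip('/')}/{wheel.name}",
                "hashes": {"sha256": digest},
            }
        )
    doc = {"meta": {"api-version": "1.0"}, "name": name, "files": files}
    return json.dumps(doc, indent=2)
```

The result would be served with content type application/vnd.pypi.simple.v1+json when the client asks for it.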

So basically, here's what I think you'd have to do:

  1. Walk the wheel bucket (or wherever your wheels live), and for each wheel compute hashlib.sha256(wheel_bytes).hexdigest().
  2. Append #sha256=<digest> to that file's href when writing the index HTML (and, while you're in there, optionally emit data-requires-python from each wheel's metadata, which is a nice-to-have but not needed for our case).
  3. Regenerate/re-upload the index, ideally backfilling the existing wheels and not just the new ones, so that historical pins keep working.
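Steps 1 and 2 could look roughly like the sketch below. I don't know how your index is actually generated, so this assumes the wheels are available in a local directory and that every file is served under one base URL; `sha256_of`, `render_index`, `wheel_dir`, and `base_url` are hypothetical names. It leaves out data-requires-python.

```python
import hashlib
import html
from pathlib import Path
from urllib.parse import quote


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large wheels are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def render_index(wheel_dir: Path, base_url: str) -> str:
    """Render a PEP 503 project page with a #sha256= fragment on every href.

    Hypothetical helper: directory layout and base_url are assumptions.
    """
    anchors = []
    for wheel in sorted(wheel_dir.glob("*.whl")):
        # Percent-encode the filename for the URL, then add the hash fragment.
        href = f"{base_url.rstrip('/')}/{quote(wheel.name)}#sha256={sha256_of(wheel)}"
        anchors.append(
            f'<a href="{html.escape(href, quote=True)}">{html.escape(wheel.name)}</a><br>'
        )
    body = "\n".join(anchors)
    return f"<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>\n"
```

Because it re-hashes every wheel it finds, running it over the full bucket would also cover the backfill in step 3.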

You should be able to check the results with (this is roughly what I ran to confirm the current state):

# Before: no hashes anywhere in the index.
$ curl -sS https://py.mujoco.org/mujoco/ | rg -c 'sha256='
0

# After: every file should carry a hash (one per anchor, so this should
# match the anchor count, ~4055 at the time of writing).
$ curl -sS https://py.mujoco.org/mujoco/ | rg -c 'sha256='
4055
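The rg count is per matching line, so it only equals the anchor count if each anchor sits on its own line. If you want a check that doesn't depend on the line layout, a small parser along these lines should work (a sketch; `AnchorHashCounter` and `count_hashed` are names I made up):

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class AnchorHashCounter(HTMLParser):
    """Count <a> tags and how many of them carry a #sha256= fragment."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.hashed = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        self.total += 1
        if urlsplit(href).fragment.startswith("sha256="):
            self.hashed += 1


def count_hashed(page: str) -> tuple[int, int]:
    """Return (anchors with a sha256 fragment, total anchors) for an index page."""
    parser = AnchorHashCounter()
    parser.feed(page)
    return parser.hashed, parser.total
```

Feed it the body returned by the curl command above; once the index is fixed, the two numbers should be equal.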

If you also decide to add the PEP 691 (JSON) endpoint, you can confirm the content negotiation works with:

$ curl -sS -D - -o /dev/null \
  -H 'Accept: application/vnd.pypi.simple.v1+json' \
  https://py.mujoco.org/mujoco/ \
| rg -i 'content-type|vary'

content-type: application/vnd.pypi.simple.v1+json
vary: Accept

(Right now this returns content-type: text/html and vary: Accept-Encoding, i.e. the Accept header is ignored.)
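On the server side, the negotiation could be as simple as the sketch below. This is a deliberately simplified illustration, not a full Accept parser: it returns JSON whenever the client lists the PEP 691 JSON type with a non-zero q value and doesn't compare q values across types. `choose_content_type` is a hypothetical name, and I'm assuming text/html stays the default, as it is today.

```python
JSON_TYPE = "application/vnd.pypi.simple.v1+json"
HTML_TYPE = "text/html"


def choose_content_type(accept: str) -> str:
    """Pick the response type from an Accept header value (simplified)."""
    for item in accept.split(","):
        media, _, params = item.strip().partition(";")
        if media.strip().lower() != JSON_TYPE:
            continue
        q = 1.0  # q defaults to 1 when the client does not specify it
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if q > 0:
            return JSON_TYPE
    return HTML_TYPE
```

Whichever type is chosen, the response should also carry vary: Accept so caches keep the two representations apart.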

The #sha256= fragment alone would make our life a bit easier, so the JSON endpoint is just a nice bonus/best practice.

Happy to help out or test against the updated index if that's useful.

Alternatives

I can work around this by pinning the wheel by direct URL instead of by version range, with something like:

-mujoco>=3.10.0.dev929828707,<3.11
+mujoco @ https://py.mujoco.org/mujoco/mujoco-3.10.0.dev929828707-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl
