Commit d0b9baa

refactor!: create full venv for bootstrap=system_python (#3473)
The system_python bootstrap is basically just an alternative stage1 bootstrap now. Finish unifying it with bootstrap=script by having it support and create a full venv.

While this is the new default behavior, it's disabled for:
* Windows: some difficult to debug failures came up, so disable using it there
* Bazel 7: Bazel 7 doesn't have the `File.is_symlink` APIs, so it's problematic to enable it there due to zipapp support.

Switching from non-venv to venv layouts is a significant change and likely to break something, so marking this as a breaking change.

Additional notes:
* rules_pkg 1.2+ is needed to package a venv-based binary, due to extensive use of symlinks.

Work towards #2156
1 parent 41fccfd commit d0b9baa

10 files changed: +203 -58 lines changed

.bazelci/presubmit.yml

Lines changed: 4 additions & 1 deletion
@@ -31,7 +31,8 @@ buildifier:
     # As a regression test for #225, check that wheel targets still build when
     # their package path is qualified with the repo name.
     - "@rules_python//examples/wheel/..."
-  build_flags:
+  build_flags: &reusable_config_build_flags
+    - "--experimental_repository_cache_hardlinks=false"
     - "--keep_going"
     - "--build_tag_filters=-integration-test"
     - "--verbose_failures"
@@ -42,6 +43,7 @@ buildifier:
     - "--test_tag_filters=-integration-test"
 .common_workspace_flags_min_bazel: &common_workspace_flags_min_bazel
   build_flags:
+    - "--experimental_repository_cache_hardlinks=false"
     - "--noenable_bzlmod"
     - "--build_tag_filters=-integration-test"
   test_flags:
@@ -292,6 +294,7 @@ tasks:
     name: "RBE: Ubuntu, minimum Bazel"
     platform: rbe_ubuntu2204
     build_flags:
+      - "--experimental_repository_cache_hardlinks=false"
       # BazelCI sets --action_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1,
       # which prevents cc toolchain autodetection from working correctly
       # on Bazel 5.4 and earlier. To workaround this, manually specify the

.bazelrc

Lines changed: 4 additions & 0 deletions
@@ -30,6 +30,9 @@ common --incompatible_use_plus_in_repo_names
 # See https://github.com/bazel-contrib/rules_python/issues/3655
 common --incompatible_strict_action_env=false

+# To work around bug on bazel 7
+common:ci --experimental_repository_cache_hardlinks=false
+
 # Windows makes use of runfiles for some rules
 build --enable_runfiles

@@ -50,3 +53,4 @@ common --incompatible_python_disallow_native_rules
 common --incompatible_no_implicit_file_export

 build --lockfile_mode=update
+

CHANGELOG.md

Lines changed: 7 additions & 3 deletions
@@ -63,13 +63,17 @@ END_UNRELEASED_TEMPLATE
 * {obj}`--windows_enable_symlinks` is required. Add `startup
   --windows_enable_symlinks` to your `.bazelrc` to enable Bazel using full
   symlink support on Windows.
+* venv-based binaries are created by default ({obj}`--bootstrap_impl=system_python`)
+  on supported platforms (Linux/Mac with Bazel 8+).

 Other changes:
 * (pypi) Update dependencies used for `compile_pip_requirements`, building
   sdists in the `whl_library` rule and fetching wheels using `pip`.
-* (pypi) We will set `allow_fail` to `False` if the {attr}`experimental_index_url_overrides` is set
-  to a non-empty value. This means that failures will be no-longer cached in this particular case.
-  ([#3260](https://github.com/bazel-contrib/rules_python/issues/3260) and
+* (pypi) We will set `allow_fail` to `False` if the
+  {attr}`experimental_index_url_overrides` is set
+  to a non-empty value. This means that failures will be no-longer cached in
+  this particular case.
+  ([#3260](https://github.com/bazel-contrib/rules_python/issues/3260) and
   [#2632](https://github.com/bazel-contrib/rules_python/issues/2632))

 {#v0-0-0-fixed}

MODULE.bazel

Lines changed: 1 addition & 1 deletion
@@ -223,7 +223,7 @@ bazel_dep(name = "rules_testing", version = "0.6.0", dev_dependency = True)
 bazel_dep(name = "rules_shell", version = "0.3.0", dev_dependency = True)
 bazel_dep(name = "rules_multirun", version = "0.9.0", dev_dependency = True)
 bazel_dep(name = "bazel_ci_rules", version = "1.0.0", dev_dependency = True)
-bazel_dep(name = "rules_pkg", version = "1.0.1", dev_dependency = True)
+bazel_dep(name = "rules_pkg", version = "1.2.0", dev_dependency = True)
 bazel_dep(name = "other", version = "0", dev_dependency = True)
 bazel_dep(name = "another_module", version = "0", dev_dependency = True)

python/private/py_executable.bzl

Lines changed: 29 additions & 2 deletions
@@ -523,9 +523,36 @@ def _create_zip_main(ctx, *, stage2_bootstrap, runtime_details, venv):
 # * https://github.com/python/cpython/blob/main/Modules/getpath.py
 # * https://github.com/python/cpython/blob/main/Lib/site.py
 def _create_venv(ctx, output_prefix, imports, runtime_details, add_runfiles_root_to_sys_path, extra_deps):
-    create_full_venv = BootstrapImplFlag.get_value(ctx) == BootstrapImplFlag.SCRIPT
     venv = "_{}.venv".format(output_prefix.lstrip("_"))

+    # The pyvenv.cfg file must be present to trigger the venv site hooks.
+    # Because it's paths are expected to be absolute paths, we can't reliably
+    # put much in it. See https://github.com/python/cpython/issues/83650
+    pyvenv_cfg = ctx.actions.declare_file("{}/pyvenv.cfg".format(venv))
+    ctx.actions.write(pyvenv_cfg, "")
+
+    is_bootstrap_script = BootstrapImplFlag.get_value(ctx) == BootstrapImplFlag.SCRIPT
+    is_windows = target_platform_has_any_constraint(ctx, ctx.attr._windows_constraints)
+
+    create_full_venv = True
+
+    # The legacy build_python_zip codepath (enabled by default on windows) isn't
+    # compatible with full venv.
+    # TODO: Use non-build_python_zip codepath for Windows
+    if is_windows:
+        create_full_venv = False
+    elif not rp_config.bazel_8_or_later and not is_bootstrap_script:
+        # Full venv for Bazel 7 + system_python is disabled because packaging
+        # it using build_python_zip=true or rules_pkg breaks.
+        # * Using build_python_zip=true breaks because the legacy zipapp support
+        #   doesn't handle symlinks correctly.
+        # * Using rules_pkg breaks for two reasons:
+        #   1. It requires rules_pkg 1.2, which crashes under Bazel 7
+        #   2. It requires File.is_symlink, which is a Bazel 8+ API.
+        # While bootstrap=script has the same problems, it has always been like
+        # that.
+        create_full_venv = False
+
     if create_full_venv:
         # The pyvenv.cfg file must be present to trigger the venv site hooks.
         # Because it's paths are expected to be absolute paths, we can't reliably
@@ -534,7 +561,6 @@ def _create_venv(ctx, output_prefix, imports, runtime_details, add_runfiles_root
         ctx.actions.write(pyvenv_cfg, "")
     else:
         pyvenv_cfg = None
-
     runtime = runtime_details.effective_runtime

     venvs_use_declare_symlink_enabled = (
@@ -561,6 +587,7 @@ def _create_venv(ctx, output_prefix, imports, runtime_details, add_runfiles_root
         # needed or used at runtime. However, the zip code uses the interpreter
         # File object to figure out some paths.
         interpreter = ctx.actions.declare_file("{}/{}".format(bin_dir, py_exe_basename))
+
         ctx.actions.write(interpreter, "actual:{}".format(interpreter_actual_path))

     elif runtime.interpreter:
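
The gating that this change adds to `_create_venv` in `py_executable.bzl` can be restated as a small truth function. The following is a minimal Python sketch, not the Starlark from the commit: the name `should_create_full_venv` and its three boolean parameters are hypothetical stand-ins for the results of `target_platform_has_any_constraint(...)`, `rp_config.bazel_8_or_later`, and the `BootstrapImplFlag` comparison.

```python
def should_create_full_venv(*, is_windows, bazel_8_or_later, is_bootstrap_script):
    """Hypothetical restatement of the create_full_venv gating in the diff."""
    # Windows defaults to the legacy build_python_zip path, which the commit
    # describes as incompatible with a full venv.
    if is_windows:
        return False
    # Bazel 7 combined with bootstrap=system_python: packaging a venv that
    # relies on symlinks breaks, so the full venv stays disabled.
    if not bazel_8_or_later and not is_bootstrap_script:
        return False
    return True


# Linux/Mac on Bazel 8+ gets the full venv with either bootstrap.
linux_bazel8_system_python = should_create_full_venv(
    is_windows=False, bazel_8_or_later=True, is_bootstrap_script=False
)
# Bazel 7 keeps the full venv only for bootstrap=script.
bazel7_script = should_create_full_venv(
    is_windows=False, bazel_8_or_later=False, is_bootstrap_script=True
)
bazel7_system_python = should_create_full_venv(
    is_windows=False, bazel_8_or_later=False, is_bootstrap_script=False
)
```

Under these assumptions the Windows check wins over everything else, which matches the order of the `if`/`elif` in the diff.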

python/private/python_bootstrap_template.txt

Lines changed: 120 additions & 22 deletions
@@ -8,8 +8,11 @@ from __future__ import print_function
 import sys

 import os
+from os.path import dirname, join, basename
 import subprocess
 import uuid
+import shutil
+
 # NOTE: The sentinel strings are split (e.g., "%stage2" + "_bootstrap%") so that
 # the substitution logic won't replace them. This allows runtime detection of
 # unsubstituted placeholders, which occurs when native py_binary is used in
@@ -51,7 +54,14 @@ IS_ZIPFILE = "%is_zipfile%" == "1"
 # 0 or 1.
 # If 1, then a venv will be created at runtime that replicates what would have
 # been the build-time structure.
-RECREATE_VENV_AT_RUNTIME="%recreate_venv_at_runtime%"
+RECREATE_VENV_AT_RUNTIME = "%recreate_venv_at_runtime%" == "1"
+# 0 or 1
+# If 1, then the path to python will be resolved by running
+# PYTHON_BINARY_ACTUAL to determine the actual underlying interpreter.
+RESOLVE_PYTHON_BINARY_AT_RUNTIME = "%resolve_python_binary_at_runtime%" == "1"
+# venv-relative path to the site-packages
+# e.g. lib/python3.12t/site-packages
+VENV_REL_SITE_PACKAGES = "%venv_rel_site_packages%"

 WORKSPACE_NAME = "%workspace_name%"

@@ -65,6 +75,7 @@ else:
     INTERPRETER_ARGS = [arg for arg in _INTERPRETER_ARGS_RAW.split("\n") if arg]

 ADDITIONAL_INTERPRETER_ARGS = os.environ.get("RULES_PYTHON_ADDITIONAL_INTERPRETER_ARGS", "")
+EXTRACT_ROOT = os.environ.get("RULES_PYTHON_EXTRACT_ROOT")

 def is_running_from_zip():
     return IS_ZIPFILE
@@ -149,7 +160,7 @@ def print_verbose(*args, mapping=None, values=None):
     if mapping is not None:
         for key, value in sorted((mapping or {}).items()):
             print(
-                "bootstrap: stage 1: ",
+                "bootstrap: stage 1:",
                 *(list(args) + ["{}={}".format(key, repr(value))]),
                 file=sys.stderr,
                 flush=True
@@ -254,10 +265,17 @@ def extract_zip(zip_path, dest_dir):
             # https://docs.microsoft.com/en-us/windows/desktop/fileio/naming-a-file#maximum-path-length-limitation
             file_path = os.path.abspath(os.path.join(dest_dir, info.filename))
             # The Unix st_mode bits (see "man 7 inode") are stored in the upper 16
-            # bits of external_attr. Of those, we set the lower 12 bits, which are the
-            # file mode bits (since the file type bits can't be set by chmod anyway).
+            # bits of external_attr.
             attrs = info.external_attr >> 16
-            if attrs != 0:  # Rumor has it these can be 0 for zips created on Windows.
+            # Symlink bit in st_mode is 0o120000.
+            if (attrs & 0o170000) == 0o120000:
+                with open(file_path, "r") as f:
+                    target = f.read()
+                os.remove(file_path)
+                os.symlink(target, file_path)
+            # Of those, we set the lower 12 bits, which are the
+            # file mode bits (since the file type bits can't be set by chmod anyway).
+            elif attrs != 0:  # Rumor has it these can be 0 for zips created on Windows.
                 os.chmod(file_path, attrs & 0o7777)

 # Create the runfiles tree by extracting the zip file
# Create the runfiles tree by extracting the zip file
@@ -268,6 +286,57 @@ def create_runfiles_root():
268286
# important that deletion code be in sync with this directory structure
269287
return os.path.join(temp_dir, 'runfiles')
270288

289+
def _create_venv(runfiles_root):
290+
runfiles_venv = join(runfiles_root, dirname(dirname(PYTHON_BINARY)))
291+
if EXTRACT_ROOT:
292+
venv = join(EXTRACT_ROOT, runfiles_venv)
293+
os.makedirs(venv, exist_ok=True)
294+
cleanup_dir = None
295+
else:
296+
import tempfile
297+
venv = tempfile.mkdtemp("", f"bazel.{basename(runfiles_venv)}.")
298+
cleanup_dir = venv
299+
300+
python_exe_actual = find_binary(runfiles_root, PYTHON_BINARY_ACTUAL)
301+
302+
# See stage1_bootstrap_template.sh for details on this code path. In short,
303+
# this handles when the build-time python version doesn't match runtime
304+
# and if the initially resolved python_exe_actual is a wrapper script.
305+
if RESOLVE_PYTHON_BINARY_AT_RUNTIME:
306+
src = f"""
307+
import sys, site
308+
print(sys.executable)
309+
print(site.getsitepackages(["{venv}"])[-1])
310+
"""
311+
output = subprocess.check_output([python_exe_actual, "-I"], shell=True,
312+
encoding = "utf8", input=src)
313+
output = output.strip().split("\n")
314+
python_exe_actual = output[0]
315+
venv_site_packages = output[1]
316+
os.makedirs(dirname(venv_site_packages), exist_ok=True)
317+
runfiles_venv_site_packages = join(runfiles_venv, VENV_REL_SITE_PACKAGES)
318+
else:
319+
python_exe_actual = find_binary(runfiles_root, PYTHON_BINARY_ACTUAL)
320+
venv_site_packages = join(venv, "lib")
321+
runfiles_venv_site_packages = join(runfiles_venv, "lib")
322+
323+
if python_exe_actual is None:
324+
raise AssertionError('Could not find python binary: ' + repr(PYTHON_BINARY_ACTUAL))
325+
326+
venv_bin = join(venv, "bin")
327+
try:
328+
os.mkdir(venv_bin)
329+
except FileExistsError as e:
330+
pass
331+
332+
# Match the basename; some tools, e.g. pyvenv key off the executable name
333+
venv_python_exe = join(venv_bin, os.path.basename(python_exe_actual))
334+
_symlink_exist_ok(from_=venv_python_exe, to=python_exe_actual)
335+
_symlink_exist_ok(from_=join(venv, "lib"), to=join(runfiles_venv, "lib"))
336+
_symlink_exist_ok(from_=venv_site_packages, to=runfiles_venv_site_packages)
337+
_symlink_exist_ok(from_=join(venv, "pyvenv.cfg"), to=join(runfiles_venv, "pyvenv.cfg"))
338+
return cleanup_dir, venv_python_exe
339+
271340
def runfiles_envvar(runfiles_root):
272341
"""Finds the runfiles manifest or the runfiles directory.
273342
@@ -311,7 +380,7 @@ def runfiles_envvar(runfiles_root):
311380
return (None, None)
312381

313382
def execute_file(python_program, main_filename, args, env, runfiles_root,
314-
workspace, delete_runfiles_root):
383+
workspace, delete_dirs):
315384
# type: (str, str, list[str], dict[str, str], str, str|None, str|None) -> ...
316385
"""Executes the given Python file using the various environment settings.
317386
@@ -326,8 +395,8 @@ def execute_file(python_program, main_filename, args, env, runfiles_root,
326395
runfiles_root: (str) Path to the runfiles root directory
327396
workspace: (str|None) Name of the workspace to execute in. This is expected to be a
328397
directory under the runfiles tree.
329-
delete_runfiles_root: (bool), True if the runfiles root should be deleted
330-
after a successful (exit code zero) program run, False if not.
398+
delete_dirs: (list[str]) directories that should be deleted after the user
399+
program has finished running.
331400
"""
332401
argv = [python_program]
333402
argv.extend(INTERPRETER_ARGS)
@@ -351,20 +420,19 @@ def execute_file(python_program, main_filename, args, env, runfiles_root,
351420
# can't execv because we need control to return here. This only
352421
# happens for targets built in the host config.
353422
#
354-
if not (is_windows() or workspace or delete_runfiles_root):
423+
if not (is_windows() or workspace or delete_dirs):
355424
_run_execv(python_program, argv, env)
356425

426+
print_verbose("run: subproc: environ:", mapping=os.environ)
427+
print_verbose("run: subproc: cwd:", workspace)
428+
print_verbose("run: subproc: argv:", values=argv)
357429
ret_code = subprocess.call(
358-
argv,
359-
env=env,
360-
cwd=workspace
361-
)
430+
argv, env=env, cwd=workspace)
362431

363-
if delete_runfiles_root:
364-
# NOTE: dirname() is called because create_runfiles_root() creates a
365-
# sub-directory within a temporary directory, and we want to remove the
366-
# whole temporary directory.
367-
shutil.rmtree(os.path.dirname(runfiles_root), True)
432+
if delete_dirs:
433+
for delete_dir in delete_dirs:
434+
print_verbose("rmtree:", delete_dir)
435+
shutil.rmtree(delete_dir, True)
368436
sys.exit(ret_code)
369437

370438
def _run_execv(python_program, argv, env):
@@ -374,9 +442,27 @@ def _run_execv(python_program, argv, env):
374442
print_verbose("RunExecv: environ:", mapping=os.environ)
375443
print_verbose("RunExecv: python:", python_program)
376444
print_verbose("RunExecv: argv:", values=argv)
377-
os.execv(python_program, argv)
445+
try:
446+
os.execv(python_program, argv)
447+
except:
448+
with open(python_program, 'rb') as f:
449+
print_verbose("pyprog head:" + str(f.read(50)))
450+
raise
451+
452+
def _symlink_exist_ok(*, from_, to):
453+
try:
454+
os.symlink(to, from_)
455+
except FileExistsError:
456+
pass
457+
458+
378459

379460
def main():
461+
print_verbose("sys.version:", sys.version)
462+
print_verbose("initial argv:", values=sys.argv)
463+
print_verbose("initial cwd:", os.getcwd())
464+
print_verbose("initial environ:", mapping=os.environ)
465+
print_verbose("initial sys.path:", values=sys.path)
380466
print_verbose("STAGE2_BOOTSTRAP:", STAGE2_BOOTSTRAP)
381467
print_verbose("PYTHON_BINARY:", PYTHON_BINARY)
382468
print_verbose("PYTHON_BINARY_ACTUAL:", PYTHON_BINARY_ACTUAL)
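
The `_symlink_exist_ok` helper added in this hunk makes venv creation repeatable, which matters when `RULES_PYTHON_EXTRACT_ROOT` points at a directory that survives between runs. Below is a minimal standalone sketch of the same pattern, assuming a POSIX system where unprivileged symlink creation works; the `demo` function and the file names in it are hypothetical.

```python
import os
import tempfile


def symlink_exist_ok(*, from_, to):
    """Create symlink `from_` pointing at `to`; ignore an existing path."""
    try:
        # os.symlink takes the target first and the new link path second.
        os.symlink(to, from_)
    except FileExistsError:
        pass


def demo():
    """Create the same link twice and report what ended up on disk."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "python3.real")
        open(target, "w").close()
        link = os.path.join(tmp, "python3")
        symlink_exist_ok(from_=link, to=target)
        symlink_exist_ok(from_=link, to=target)  # second call is a no-op
        return os.path.islink(link), os.readlink(link) == target
```

One consequence of this design is that a pre-existing path at `from_` is left untouched even if it points somewhere else, so a stale link is not repaired.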
@@ -399,12 +485,16 @@ def main():
         main_rel_path = os.path.normpath(STAGE2_BOOTSTRAP)
     print_verbose("main_rel_path:", main_rel_path)

+    delete_dirs = []
+
     if is_running_from_zip():
         runfiles_root = create_runfiles_root()
-        delete_runfiles_root = True
+        # NOTE: dirname() is called because create_runfiles_root() creates a
+        # sub-directory within a temporary directory, and we want to remove the
+        # whole temporary directory.
+        delete_dirs.append(dirname(runfiles_root))
     else:
         runfiles_root = find_runfiles_root(main_rel_path)
-        delete_runfiles_root = False

     print_verbose("runfiles root:", runfiles_root)

@@ -433,6 +523,14 @@ def main():
             repr(PYTHON_BINARY_ACTUAL)
         ))

+    if RECREATE_VENV_AT_RUNTIME:
+        # When the venv is created at runtime, python_program is PYTHON_BINARY_ACTUAL
+        # so we have to re-point it to the symlink in the venv
+        venv, python_program = _create_venv(runfiles_root)
+        delete_dirs.append(venv)
+    else:
+        python_program = find_python_binary(runfiles_root)
+
     # Some older Python versions on macOS (namely Python 3.7) may unintentionally
     # leave this environment variable set after starting the interpreter, which
     # causes problems with Python subprocesses correctly locating sys.executable,
@@ -456,7 +554,7 @@ def main():
         execute_file(
             python_program, main_filename, args, new_env, runfiles_root,
             workspace,
-            delete_dirs = delete_runfiles_root,
+            delete_dirs = delete_dirs,
         )

     except EnvironmentError:
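
The switch from a single `delete_runfiles_root` flag to a `delete_dirs` list means the bootstrap has to stay alive as the parent process, run the program as a subprocess, and clean up afterwards, which is why `os.execv` is skipped whenever the list is non-empty. Below is a minimal sketch of that flow under stated assumptions: `run_then_cleanup` is a hypothetical name, and the `None` guard is an addition of this sketch. It is included because `_create_venv` in the diff returns `cleanup_dir = None` when `RULES_PYTHON_EXTRACT_ROOT` is set.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_then_cleanup(argv, delete_dirs):
    """Run argv as a subprocess, then remove each directory in delete_dirs."""
    ret_code = subprocess.call(argv)
    for delete_dir in delete_dirs:
        # Assumption of this sketch: None means "nothing to delete".
        if delete_dir is None:
            continue
        shutil.rmtree(delete_dir, ignore_errors=True)
    return ret_code


def demo():
    """Run a trivial child process and confirm the temp dir is removed."""
    scratch = tempfile.mkdtemp(prefix="bazel.demo.")
    ret_code = run_then_cleanup([sys.executable, "-c", "pass"], [scratch, None])
    return ret_code, os.path.exists(scratch)
```

Unlike the flag it replaces, whose docstring limited deletion to successful runs, the new docstring describes deletion after the program has finished, and the loop in the diff runs regardless of exit code.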

python/private/stage1_bootstrap_template.sh

Lines changed: 4 additions & 4 deletions
@@ -6,14 +6,14 @@ if [[ -n "${RULES_PYTHON_BOOTSTRAP_VERBOSE:-}" ]]; then
   set -x
 fi

-# runfiles-relative path
+# runfiles-root-relative path
 STAGE2_BOOTSTRAP="%stage2_bootstrap%"

-# runfiles-relative path to python interpreter to use.
+# runfiles-root-relative path to python interpreter to use.
 # This is the `bin/python3` path in the binary's venv.
 PYTHON_BINARY='%python_binary%'
 # The path that PYTHON_BINARY should symlink to.
-# runfiles-relative path, absolute path, or single word.
+# runfiles-root-relative path, absolute path, or single word.
 # Only applicable for zip files or when venv is recreated at runtime.
 PYTHON_BINARY_ACTUAL="%python_binary_actual%"

@@ -211,7 +211,7 @@ elif [[ "$RECREATE_VENV_AT_RUNTIME" == "1" ]]; then
     read -r resolved_py_exe
     read -r resolved_site_packages
   } < <("$python_exe_actual" -I <<EOF
-import sys, site
+import sys, site, os
 print(sys.executable)
 print(site.getsitepackages(["$venv"])[-1])
 EOF
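
The heredoc above asks the candidate interpreter for two facts: its real `sys.executable` and the `site-packages` directory that a venv rooted at `$venv` would use. The same probe can be sketched in Python as follows. The function name `resolve_interpreter` and the demo path `/tmp/demo_venv` are hypothetical, and the current interpreter stands in for `python_exe_actual`.

```python
import subprocess
import sys


def resolve_interpreter(python_exe, venv):
    """Ask python_exe for its real sys.executable and the venv's site-packages."""
    src = (
        "import sys, site\n"
        "print(sys.executable)\n"
        f"print(site.getsitepackages([{venv!r}])[-1])\n"
    )
    output = subprocess.check_output(
        [python_exe, "-I"],  # -I runs the interpreter in isolated mode
        input=src,           # the probe script is fed on stdin
        encoding="utf8",
    )
    resolved_exe, site_packages = output.strip().split("\n")
    return resolved_exe, site_packages


resolved_exe, site_packages = resolve_interpreter(sys.executable, "/tmp/demo_venv")
```

The sketch passes the argument list without `shell=True`. Per the `subprocess` documentation, on POSIX a list combined with `shell=True` hands the extra items to the shell rather than to the command, so the list form is what lets `-I` reach the interpreter.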
