Download (bzlmod)

See also

For WORKSPACE instructions see here.

To add PyPI dependencies to your MODULE.bazel file, use the pip.parse extension and call it to create the central external repo and individual wheel external repos. Include the toolchain extension in the MODULE.bazel file as shown in the first bzlmod example above.

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")

pip.parse(
    hub_name = "my_deps",
    python_version = "3.13",
    requirements_lock = "//:requirements_lock_3_11.txt",
)

use_repo(pip, "my_deps")

For more documentation, see the Bzlmod examples under the examples folder or the documentation for the @rules_python//python/extensions:pip.bzl extension.

We are using a host-platform compatible toolchain by default to setup pip dependencies. During the setup phase, we create some symlinks, which may be inefficient on Windows by default. In that case use the following .bazelrc options to improve performance if you have admin privileges:

startup --windows_enable_symlinks

This will enable symlinks on Windows and help with bootstrap performance of setting up the hermetic host python interpreter on this platform. Linux and OSX users should see no difference.

Interpreter selection

The pip.parse bzlmod extension by default uses the hermetic Python toolchain for the host platform, but you can customize the interpreter using pip.parse.python_interpreter and pip.parse.python_interpreter_target.

You can use the pip extension multiple times. This configuration will create multiple external repos that have no relation to one another and may result in downloading the same wheels numerous times.

As with any repository rule or extension, if you would like to ensure that pip_parse is re-executed to pick up a non-hermetic change to your environment (e.g., updating your system python interpreter), you can force it to re-execute by running bazel sync --only [pip_parse name].

Requirements for a specific OS/Architecture

In some cases, you may need to use different requirements files for different OS and architecture combinations. This is enabled via the requirements_by_platform attribute in the pip.parse extension and the pip.parse tag class. The keys of the dictionary are labels to the file, and the values are a list of comma-separated target (os, arch) tuples.

For example:

    # ...
    requirements_by_platform = {
        "requirements_linux_x86_64.txt": "linux_x86_64",
        "requirements_osx.txt": "osx_*",
        "requirements_linux_exotic.txt": "linux_exotic",
        "requirements_some_platforms.txt": "linux_aarch64,windows_*",
    },
    # For the list of standard platforms that the rules_python has toolchains for, default to
    # the following requirements file.
    requirements_lock = "requirements_lock.txt",

In case of duplicate platforms, rules_python will raise an error, as there has to be an unambiguous mapping of the requirement files to the (os, arch) tuples.

An alternative way is to use per-OS requirement attributes.

    # ...
    requirements_windows = "requirements_windows.txt",
    requirements_darwin = "requirements_darwin.txt",
    # For the remaining platforms (which is basically only linux OS), use this file.
    requirements_lock = "requirements_lock.txt",
)

Note

If you are using a universal lock file but want to restrict the list of platforms that the lock file will be evaluated against, consider using the aforementioned requirements_by_platform attribute and listing the platforms explicitly.

Multi-platform support

Historically, the pip_parse and pip.parse have only been downloading/building Python dependencies for the host platform that the bazel commands are executed on. Over the years, people started needing support for building containers, and usually, that involves fetching dependencies for a particular target platform that may be different from the host platform.

Multi-platform support for cross-building the wheels can be done in two ways:

  1. using experimental_index_url for the pip.parse bzlmod tag class

  2. using the pip.parse.download_only setting.

Warning

This will not work for sdists with C extensions, but pure Python sdists may still work using the first approach.

By default, rules_python selects the host {os}_{arch} platform from its MODULE.bazel file. This means that rules_python by default does not provide cross-platform building support because some packages have very large wheels and users should be able to use bazel query with minimal overhead. As a result, users should configure their pip.parse calls and select which platforms they want to target via the pip.parse.target_platforms attribute:

    # Example of enabling free threaded and non-freethreaded switching on the host platform:
    target_platforms = ["{os}_{arch}", "{os}_{arch}_freethreaded"],

    # As another example, to enable building for `linux_x86_64` containers and the host platform:
    # target_platforms = ["{os}_{arch}", "linux_x86_64"],
)

Using download_only attribute

Let’s say you have two requirements files:

# requirements.linux_x86_64.txt
--platform=manylinux_2_17_x86_64
--python-version=39
--implementation=cp
--abi=cp39

foo==0.0.1 --hash=sha256:deadbeef
bar==0.0.1 --hash=sha256:deadb00f
# requirements.osx_aarch64.txt contents
--platform=macosx_10_9_arm64
--python-version=39
--implementation=cp
--abi=cp39

foo==0.0.3 --hash=sha256:deadbaaf

With these 2 files your pip.parse could look like:

pip.parse(
    hub_name = "pip",
    python_version = "3.9",
    # Tell `pip` to ignore sdists
    download_only = True,
    requirements_by_platform = {
        "requirements.linux_x86_64.txt": "linux_x86_64",
        "requirements.osx_aarch64.txt": "osx_aarch64",
    },
)

With this, pip.parse will create a hub repository that is going to support only two platforms - cp39_osx_aarch64 and cp39_linux_x86_64 - and it will only use wheels and ignore any sdists that it may find on the PyPI- compatible indexes.

Warning

Because bazel is not aware what exactly is downloaded, the same wheel may be downloaded multiple times.

Note

This will only work for wheel-only setups, i.e., all of your dependencies need to have wheels available on the PyPI index that you use.

Customizing Requires-Dist resolution

In order to understand what dependencies to pull for a particular package, rules_python parses the whl file METADATA. Packages can express dependencies via Requires-Dist, and they can add conditions using “environment markers”, which represent the Python version, OS, etc.

While the PyPI integration provides reasonable defaults to support most platforms and environment markers, the values it uses can be customized in case more esoteric configurations are needed.

To customize the values used, you need to do two things:

  1. Define a target that returns EnvMarkerInfo

  2. Set the //python/config_settings:pip_env_marker_config flag to the target defined in (1).

The keys and values should be compatible with the PyPA dependency specifiers specification. This is not strictly enforced, however, so you can return a subset of keys or additional keys, which become available during dependency evaluation.

Bazel downloader and multi-platform wheel hub repository.

Warning

This is currently still experimental, and whilst it has been proven to work in quite a few environments, the APIs are still being finalized, and there may be changes to the APIs for this feature without much notice.

The issues that you can subscribe to for updates are:

The pip extension supports pulling information from PyPI (or a compatible mirror), and it will ensure that the bazel downloader is used for downloading the wheels.

This provides the following benefits:

  • Integration with the credential_helper to authenticate with private mirrors.

  • Cache the downloaded wheels speeding up the consecutive re-initialization of the repositories.

  • Reuse the same instance of the wheel for multiple target platforms.

  • Allow using transitions and targeting free-threaded and musl platforms more easily.

  • Avoids pip for wheel fetching and results in much faster dependency fetching.

To enable the feature specify pip.parse.experimental_index_url as shown in the examples/bzlmod/MODULE.bazel example.

Similar to uv, one can override the index that is used for a single package. By default, we first search in the index specified by pip.parse.experimental_index_url, then we iterate through the pip.parse.experimental_extra_index_urls unless there are overrides specified via pip.parse.experimental_index_url_overrides.

When using this feature during the pip extension evaluation you will see the accessed indexes similar to below:

Loading: 0 packages loaded
    Fetching module extension @@//python/extensions:pip.bzl%pip; Fetch package lists from PyPI index
    Fetching https://pypi.org/simple/jinja2/

This does not mean that rules_python is fetching the wheels eagerly; rather, it means that it is calling the PyPI server to get the Simple API response to get the list of all available source and wheel distributions. Once it has gotten all of the available distributions, it will select the right ones depending on the sha256 values in your requirements_lock.txt file. If sha256 hashes are not present in the requirements file, we will fall back to matching by version specified in the lock file.

Fetching the distribution information from the PyPI allows rules_python to know which whl should be used on which target platform and it will determine that by parsing the whl filename based on PEP600, PEP656 standards. This allows the user to configure the behaviour by using the following publicly available flags:

Internal dependencies and private repositories

The rules_python Bazel module downloads Python interpreters and dependencies as part of its functionality. These artifacts are fetched using Bazel’s internal HTTP downloader, not using the pip tool.

If you are in a network-restricted environment and must use internal registries, you can configure the Bazel downloader to redirect all of these downloads to a different registry.

Example of a bazel_downloader.cfg:

all_blocked_message See internal.mirror.lan/registry/ for more information
allow s3.amazon.com

# Rewrite everything to files.pythonhosted to the internal mirror with two
# capture groups: the first group matches the host and is appended first,
# the second matches the entire path and is appended second
rewrite (files.pythonhosted.org)/(.*) internal.mirror.lan/python/$1/$2
rewrite (pypi.python.org)/(.*) internal.mirror.lan/python/$1/$2

# Allow the internal mirror and block everything else
allow internal.mirror.lan
block *

Use the config file with --experimental_downloader_config=bazel_downloader.cfg.

How the config is parsed:

  • Uses Java regular expressions

  • Matching is performed only on host and path components of the URL, not the scheme

  • Directives are applied in the following order: rewrite, allow, block

  • Back references are numbered starting from $1

  • Expressions must match the entire string being tested, not just find a substring.

If your patterns don’t seem to match or rewrite:

  • Begin with simple patterns to ensure they match as expected.

  • Be cautious when using block statements to avoid unintentionally blocking necessary downloads. Add block statements incrementally and test thoroughly after each change.

References:

Credential Helper

The Bazel downloader usage allows for the Bazel Credential Helper. Your Python artifact registry may provide a credential helper for you. Refer to your index’s docs to see if one is provided.

The simplest form of a credential helper is a bash script that accepts an argument and spits out JSON to stdout. For a service like Google Artifact Registry that uses ‘Basic’ HTTP Auth and does not provide a credential helper that conforms to the spec, the script might look like:

#!/bin/bash
# cred_helper.sh
ARG=$1  # but we don't do anything with it as it's always "get"

# formatting is optional
echo '{'
echo '  "headers": {'
echo '    "Authorization": ["Basic dGVzdDoxMjPCow=="]'
echo '  }'
echo '}'

Configure Bazel to use this credential helper for your Python index example.com:

# .bazelrc
build --credential_helper=example.com=/full/path/to/cred_helper.sh

Bazel will call this file like cred_helper.sh get and use the returned JSON to inject headers into whatever HTTP(S) request it performs against example.com.

See the Credential Helper Spec for more details.