feat: unified pypi hub repository #3837
Introduces the finalized implementation blueprint under .agents/plans/ to document our automatic canonical proxy repository and executable transition specifications.
Incorporate disjoint package execution-phase failure targets so cquery/query pass successfully over unrepresented select branches. Adds multi-hub memory minimization guidelines and test specifications.
Incorporate dynamic unionizing of custom extra_hub_aliases into the canonical proxy repository to fully support specialized wheel targets. Adds extra alias test specifications.
Currently, standard `py_library` targets cannot easily depend on third-party PyPI packages that might resolve to different concrete hubs based on the `py_binary` target including them. To enable this, synthesize an automatic canonical `@pypi` proxy repository that exposes package aliases using `select()` expressions over a new `pypi_hub` Starlark build setting. This builds upon executable transitions so `py_binary` automatically configures the appropriate PyPI resolution spoke. Sibling or disjoint packages missing from a specific spoke fail gracefully during the execution phase.

* Adds `pypi_hub` build setting and base executable transition labels
* Implemented `proxy_hub_repository` and `missing_package_error`
* Unified pip wheel modifications and extra spoke aliases in proxy
* Author integration test verifying multi-hub fallbacks and CLI overrides
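To make the `select()`-based routing concrete, here is a minimal Python sketch of how one alias in the proxy repository might be rendered. It is not the code from this PR: the `:_hub_<name>` config_setting labels and the `:_missing_<package>` target are hypothetical names chosen for illustration.

```python
def render_package_alias(package, hubs_with_package, all_hubs):
    """Render BUILD text for one package alias in the proxy repo.

    Assumption: `:_hub_<hub>` config_settings and a `:_missing_<pkg>`
    target exist; both names are hypothetical.
    """
    branches = []
    for hub in all_hubs:
        if hub in hubs_with_package:
            # The hub provides the package: point at the concrete target.
            target = "@{hub}//{pkg}".format(hub=hub, pkg=package)
        else:
            # The hub lacks the package: point at a target that only
            # fails when an action actually runs.
            target = ":_missing_{pkg}".format(pkg=package)
        branches.append(
            '        ":_hub_{hub}": "{target}",'.format(hub=hub, target=target)
        )
    lines = [
        "alias(",
        '    name = "{}",'.format(package),
        "    actual = select({",
    ] + branches + [
        "    }),",
        ")",
    ]
    return "\n".join(lines)


print(render_package_alias("numpy", {"pypi_a"}, ["pypi_a", "pypi_b"]))
```

Every hub gets a branch, so analysis always finds a match; only the branch for a hub without the package resolves to the failing target.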
Currently, if a `pip.default` tag is used solely to delete a platform (by passing only `platform`), it fails validation because `config_settings` is enforced. Furthermore, unit test mock structs lacked `default_hub`. To fix, refactor `build_config` to identify platform deletions and skip `config_settings` validation for them. Additionally, update `_default` and `_parse_modules` mock helpers in Starlark unit tests.
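The rule described above can be sketched in Python as follows. This is an illustration under assumptions, not the `build_config` implementation: the tag is modelled as a plain dict, and `validate_default_tag` is a hypothetical helper.

```python
def validate_default_tag(tag):
    """Classify a pip.default-like tag (modelled as a dict).

    Assumption: an attribute counts as "set" when it is not
    None or empty.
    """
    set_attrs = {k for k, v in tag.items() if v not in (None, "", [], {})}

    # A tag that sets only `platform` is a deletion request, so the
    # config_settings requirement does not apply to it.
    if set_attrs == {"platform"}:
        return "delete_platform"

    # Any other tag that names a platform must carry config_settings.
    if "platform" in set_attrs and not tag.get("config_settings"):
        raise ValueError(
            "config_settings is required for platform {!r}".format(tag["platform"])
        )
    return "ok"
```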
- Rename proxy_hub_repository to unified_hub_repository and match _impl naming conventions
- Factor JSON configuration encoding out of repository rule attributes into distinct Starlark attributes
- Reference public pypi_hub config_setting flag by canonical label instead of string attribute
- Update missing package error messages to refer to PyPI packages
- Factor unified hub synthesis in extension.bzl into a dedicated helper function
…et setup

- Add documentation for pypi_hub build flag to config_settings API docs
- Update missing package error execution prefix to 'ERROR:'
- Rename unified_hub_repository.bzl to unified_hub_repo.bzl to comply with repository rule file naming guidelines
- Factor BUILD file population logic out of unified_hub_repo.bzl into dedicated define_pypi_hub_flag_config_settings and define_pypi_package_targets helper functions in setup_unified_hub.bzl
- Remove mandatory enforcement of default_hub to allow seamless error reporting when 0 fallback hubs exist
- Rename setup_unified_hub to unified_hub_setup to align with file naming conventions
- Update config_settings docs for pypi_hub flag to clarify target transitions
- Implement render.str in text_util.bzl to cleanly handle None values
- Refactor unified_hub_repo.bzl to utilize render.str and parenthesized boolean expressions
Code Review
This pull request implements the Canonical Automatic PyPI Proxy Hub feature for rules_python under bzlmod, introducing a @pypi proxy repository that dynamically routes package dependencies to concrete hubs based on the new pypi_hub build setting. It also handles disjoint packages by generating execution-phase action failures and supports unionizing custom extra hub aliases. The feedback highlights three key issues in python/private/pypi/extension.bzl: a bug where multiple pip.default tags trigger false duplicate tag failures, a backward compatibility issue with the mandatory config_settings check on platform overrides, and a lack of validation to ensure default_hub matches a defined concrete hub.
- Rename setup_unified_hub_bzl target to unified_hub_setup_bzl referencing the correct file
- Move target to maintain strict alphabetical sorting in BUILD.bazel
- Add missing text_util_bzl dependency to unified_hub_repo_bzl target
- Regenerate tests/integration/bzlmod_lockfile/MODULE.bazel.lock following upstream uv changes
Only enforce duplicate default hub tag check when tags explicitly define default_hub, allowing multiple pip.default tags to co-exist without failures. Also validate default_hub is defined at extension phase.

- Update duplicate tag loop in python/private/pypi/extension.bzl
- Add default_hub definition validation in python/private/pypi/extension.bzl
- Add unit test scenario in tests/pypi/extension/extension_tests.bzl
- Add integration test in tests/integration/unified_pypi_test.py
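A minimal Python sketch of the two checks described in this commit, assuming tags are modelled as dicts and that "defined" means the name matches one of the concrete hubs (as the code review summary suggests). `resolve_default_hub` is a hypothetical helper, not the function in extension.bzl.

```python
def resolve_default_hub(default_tags, hub_names):
    """Return the explicitly configured default hub, or None.

    Assumption: each tag is a dict; only tags that set `default_hub`
    take part in the duplicate check.
    """
    explicit = [t["default_hub"] for t in default_tags if t.get("default_hub")]

    # Several pip.default tags may coexist; only more than one
    # explicit default_hub is an error.
    if len(explicit) > 1:
        raise ValueError("default_hub set more than once: {}".format(explicit))
    if not explicit:
        return None

    # The chosen default must name a hub that actually exists.
    if explicit[0] not in hub_names:
        raise ValueError("default_hub {!r} is not a defined hub".format(explicit[0]))
    return explicit[0]
```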
Detail the unified PyPI hub proxy feature and related robustness fixes in the new news fragment.
Update phrasing to match user's requested concise description.
Add section under docs/pypi/download.md to explain the unified @pypi proxy hub repository for multi-hub configurations.
Address review feedback on download.md and update Starlark docstrings in extension.bzl to link to the new unified hub section.
Move versionadded block under H2 heading in download.md, and update Bzlmod extension docstrings in extension.bzl to clarify that the unified @pypi hub is always generated.
Delete temporary files generated during CI failure analysis from the repository.
Delete the temporary PR monitoring state file from the repository.
Register the pypi_hub build setting flag target in the features detection map to allow downstream users to detect pypi_hub support.
Update sphinxdocs MODULE.bazel and integration tests to import and use the automatically generated @pypi unified hub repository instead of the concrete dev_pip hub. Also add local_path_overrides to ensure the nested workspaces utilize our local rules_python development version.
This reverts commit 3b6bd82.
Ready for review. I think the main win of this, in comparison to using
named `pypi` (`pip.parse(hub_name = "pypi")`), the automatic proxy
synthesis is skipped so the user maintains absolute control over that
In this particular case we could:
- Print a warning that it is skipped.
- Make the hub `<module_name>.pypi` in this case and tell this to the user. Since this is a special name, we should not end up in a situation where we break if a non-root module uses this name.
- If the user defines a `pip.parse` with `pypi`, then set that as the default automatically.
Thanks for coming up with some ideas. My initial thinking was to skip the logic to ensure existing behavior wasn't affected, then figure out how we could transition to @pypi being a pip-extension owned name.
The idea to make name=pypi an implicit default is appealing. I'm a bit concerned it may over-complicate how a default is selected. But... I do like it. It feels like a pretty reasonable behavior. So let's do that.
What do you think of:
Print a warning if the name collision occurs. If an env var is set (RULES_PYTHON_PYPI_HUB_RESERVED=1), then the hub is silently renamed module_name.pypi and is used as the default hub (if pip.default wasn't used). In a future release, we flip the default.
Remove unnecessary use_repo() calls for concrete hubs pypi_a and pypi_b, since only the unified @pypi hub needs to be imported.
Move the _whl_mods_repo and _whl_mods_repo_impl definitions back to the very end of the file to match the upstream structure.
Rename the string flag //python/config_settings:pypi_hub to :venv to align with standard virtualenv terminology. Update all transition definitions, private macro helpers, dynamic repository setups, integration tests, and user documentation accordingly.
Clarify that hub_name in the venv flag values corresponds to a pip.parse.hub_name value, and add a Sphinx MyST cross-reference to the --venv flag in the extension docstrings.
Wrap the <hub_name> description block to 79 columns in the config_settings API index, and shorten the --venv flag cross-reference in the extension docstring to use the implicit bzl domain.
Omit the redundant target name <venv> from the flag cross-reference,
simplifying it to just {flag}`--venv=auto`.
Update the design plan to rename pypi_hub to venv, matching our global renaming implementation.
Load the canonical target name constants from labels.bzl and redefine _STANDARD_ALIASES in unified_hub_setup.bzl. This eliminates local redefinition and resolves a silent naming mismatch bug on the extracted wheel files target.
Parse default_hub from the pip.default tag inside the build_config function instead of parse_modules. Inline default_hub in the parse_modules return struct and remove the redundant parsing logic.
Update the transitivedigest of the uv extension in the integration test workspace's lockfile to account for the Starlark refactoring changes in rules_python.
To prevent name collisions with the automatically generated unified @pypi hub proxy, we reserve the hub name 'pypi' and introduce an optional fallback renaming mechanism controlled by the RULES_PYTHON_PYPI_HUB_RESERVED environment variable. When enabled, concrete hubs named 'pypi' are automatically renamed to '<module_name>_pypi' and assigned fallback default hub precedence.
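The renaming rule can be sketched in Python as below. This is an illustration only: `resolve_hub_name` is a hypothetical helper, the environment is passed in as a plain dict, and treating the value `"1"` as "enabled" is an assumption taken from the review discussion.

```python
RESERVED_HUB_NAME = "pypi"
RESERVED_ENV_VAR = "RULES_PYTHON_PYPI_HUB_RESERVED"


def resolve_hub_name(hub_name, module_name, environ):
    """Return (effective_hub_name, warning_or_None) for a concrete hub."""
    if hub_name != RESERVED_HUB_NAME:
        return hub_name, None

    if environ.get(RESERVED_ENV_VAR) == "1":
        # Opt-in behavior: rename so the unified @pypi proxy owns the name.
        return "{}_pypi".format(module_name), None

    # Default behavior: keep the name, but warn about the future rename.
    warning = (
        "hub name '{name}' is reserved; in a future release it will be " +
        "renamed to '{module}_pypi'"
    ).format(name=hub_name, module=module_name)
    return hub_name, warning
```

For example, `resolve_hub_name("pypi", "my_module", {"RULES_PYTHON_PYPI_HUB_RESERVED": "1"})` yields the renamed hub with no warning, while an empty environment keeps `pypi` and returns the warning text.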
We update the warning-only message to clearly state that in a future release, the concrete hub named 'pypi' will be automatically renamed to '<module_name>_pypi'. This provides a clearer warning to users about the upcoming migration and its default resolution.
We wrap the reserved hub name print warning statements at 80 columns in extension.bzl. We also correct a string interpolation formatting bug by enclosing the concatenated string literals in parentheses before calling .format().
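The interpolation bug mentioned here is easy to reproduce. In Python, as in Starlark, a method call binds more tightly than `+`, so `.format()` applies only to the last literal unless the concatenation is parenthesized. The message text below is made up for illustration.

```python
hub = "pypi"

# Buggy: .format() applies only to the second literal, so the first
# placeholder is left unfilled.
buggy = "hub '{}' is reserved; " + "rename '{}' first".format(hub)

# Fixed: parenthesize the concatenation, then format the whole string.
fixed = ("hub '{}' is reserved; " + "rename '{}' first").format(hub, hub)

print(buggy)
print(fixed)
```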
…-hub-dependency-resolution
This implements a pypi hub that is the union of all pypi hubs.
The basic design is:
- `pip` extension always creates a `@pypi` repo unless the name is already taken by another hub definition.
- `--pypi_hub` flag dispatches to one of the hubs. If not set, then it uses a default one (first, or as configured)
The set of packages and targets the unified hub exposes is a union
of all other hubs. If the unified hub routes to a hub that doesn't
support such a target, then it points to a target that fails at
execution time. This is to allow query and cquery to work even if
some targets don't exist in some hubs.
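
The design above can be modelled in a few lines of Python. This is a sketch under assumptions, not the Starlark in this PR: hubs are a dict of package sets, `//:_missing_<package>` is a hypothetical label for the target that fails at execution time, and at least one hub is assumed to exist.

```python
def unified_packages(hub_packages):
    """The unified hub exposes the union of every hub's packages."""
    return sorted(set().union(*hub_packages.values()))


def route(hub_packages, package, flag_value=None, default_hub=None):
    """Pick the label the unified hub would point at for `package`.

    Precedence: explicit flag value, then the configured default,
    then the first hub that was defined.
    """
    hubs = list(hub_packages)
    hub = flag_value or default_hub or hubs[0]
    if package in hub_packages[hub]:
        return "@{}//{}".format(hub, package)
    # Still a valid label, so query and cquery keep working; the
    # failure only happens when the target is actually built.
    return "//:_missing_{}".format(package)


hubs = {
    "pypi_a": {"numpy", "requests"},
    "pypi_b": {"requests", "flask"},
}
print(unified_packages(hubs))
print(route(hubs, "numpy"))
print(route(hubs, "numpy", flag_value="pypi_b"))
```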