feat: unified pypi hub repository #3837
Open: rickeylev wants to merge 42 commits into bazel-contrib:main from rickeylev:pypi-hub-dependency-resolution
Commits (42, all by rickeylev):
db26e5f Add architectural plan for Unified PyPI Hub dynamic dependencies
b708dd5 Update Unified PyPI Hub plan with execution-phase failure mechanism
391ada0 Update Unified PyPI Hub plan with unionized extra aliases logic
3eaa93d feat(pypi): allow unified PyPI proxy hub and dynamic dependencies
d1c7cfe fix(pypi): handle platform deletion tags and mock structs
f39c8a2 Merge upstream/main into pypi-hub-dependency-resolution
f1e993e refactor(pypi): address PR review comments for unified hub proxy
65d3c41 refactor(pypi): complete architectural separation of unified hub targ…
5898c55 refactor(pypi): address final code review comments for unified hub setup
a6fcdc4 Merge upstream/main into pypi-hub-dependency-resolution
d73d294 Merge upstream/main into pypi-hub-dependency-resolution
eccb0f3 fix(pypi): rename setup_unified_hub_bzl target to unified_hub_setup_bzl
d46cb22 fix(pypi): update integration test lockfile
3d3a6a7 Resolve PR #3837 review comments for unified hub
97e382e Add news fragment for PR #3837
75da17a Refine news fragment for PR #3837
d284b4f Document Bzlmod Unified @pypi Hub feature
745ae20 Refine unified hub docs and update Bzlmod API docstrings
4ea3971 Refine unified hub docs and docstrings based on review
2265810 Remove temporary CI log and plan files
0f0cd10 Remove monitored PR state file
9c52822 Add //python/config_settings:pypi_hub to features.targets
501f25c Merge upstream/main into pypi-hub-dependency-resolution
4655cc7 Merge upstream/main into pypi-hub-dependency-resolution
3b6bd82 Switch sphinxdocs codebase to use unified @pypi hub
0d22a25 Revert "Switch sphinxdocs codebase to use unified @pypi hub"
b0db850 docs(pypi): simplify Bzlmod unified hub example
ed76414 refactor(pypi): move _whl_mods_repo definitions to end of extension.bzl
04231ce refactor(pypi): rename pypi_hub build flag to venv
5a6313c docs(pypi): address review comments on venv flag documentation
a57059e docs(pypi): refine stardoc and myst flag cross-references
cc50ae1 docs(pypi): simplify flag cross-reference in extension docstring
e45f37f refactor(docs): update implementation plan to use venv flag
1d33b0b refactor(pypi): align standard aliases list with labels.bzl constants
4a5604f refactor(pypi): move default_hub parsing into build_config
7343444 refactor(pypi): update integration test MODULE.bazel.lock
bc1854c Merge branch 'main' into pypi-hub-dependency-resolution
9393965 Merge branch 'upstream/main' into pypi-hub-dependency-resolution
116b013 feat(pypi): reserve 'pypi' hub name and add fallback renaming
d484bc2 chore(pypi): improve reserved pypi hub name warning message
cfa6db2 style(pypi): wrap reserved pypi hub warning messages at 80 columns
45df215 Merge remote branch 'origin/pypi-hub-dependency-resolution' into pypi…
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,296 @@ | ||
| # Implementation Plan: Canonical Automatic PyPI Proxy Hub | ||
|
|
||
| This document defines the finalized architecture, Starlark API, and testing | ||
| specifications for implementing dynamic PyPI dependency resolution in | ||
| `rules_python` using the `venv` flag. | ||
|
|
||
| --- | ||
|
|
||
| ## 1. Architectural Strategy: The Canonical `@pypi` Proxy | ||
|
|
||
| The `pip` bzlmod extension will automatically synthesize a canonical `@pypi` | ||
| proxy repository that routes each package to the underlying concrete hubs. | ||
|
|
||
| ### Bzlmod-Exclusive Scope | ||
|
|
||
| The Unified PyPI Hub Proxy is an **exclusive feature of `bzlmod`**. Legacy | ||
| `WORKSPACE` evaluations using independent `pip_parse` repository macros are not | ||
| supported, as bzlmod's module extension architecture provides the required | ||
| centralized coordination to inspect and interlink cross-module hubs. | ||
|
|
||
| ### Automatic Proxy Construction & Collision Logic | ||
|
|
||
| During the evaluation of the `pip` extension across the dependency graph: | ||
| 1. **Unconditional Creation**: Unless a name collision occurs (see item 2), | ||
| the extension will **always** synthesize a proxy repository with the | ||
| apparent name `pypi`, even if zero `pip.parse` concrete hubs are defined in | ||
| the dependency graph (in which case the proxy is valid but empty). | ||
| 2. **Collision Prevention**: If a user explicitly defines a concrete hub | ||
| named `pypi` (`pip.parse(hub_name = "pypi")`), the automatic proxy | ||
| synthesis is skipped so the user maintains absolute control over that | ||
| repository name. | ||
|
|
||
| In `MODULE.bazel`: | ||
| ```starlark | ||
| pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip") | ||
|
|
||
| # Concrete hubs defined for different execution contexts | ||
| pip.parse(hub_name = "pypi_a", ...) | ||
| pip.parse(hub_name = "pypi_b", ...) | ||
|
|
||
| # Designate 'pypi_b' as the default hub for the unified '@pypi' repository | ||
| pip.default(default_hub = "pypi_b") | ||
|
|
||
| # The canonical proxy is automatically created unconditionally: | ||
| use_repo(pip, "pypi") | ||
| ``` | ||
|
|
||
| ### Unified PyPI Hub | ||
|
|
||
| The canonical `@pypi` proxy repository matches exactly how concrete hubs create | ||
| their directory structure: a root package for shared configuration settings, and | ||
| a dedicated subdirectory (subpackage) for each PyPI package. | ||
|
|
||
| Here is a complete, representative code example of what the generated files in | ||
| `@pypi` will look like when resolving packages between `pypi_a` and `pypi_b`: | ||
|
|
||
| #### 1. `@pypi//BUILD.bazel` (Root Package) | ||
| The root package contains the shared `config_setting` targets following the | ||
| `_is_venv_<name>` private naming convention. Leading underscores are strictly | ||
| applied because these configuration settings are an internal implementation | ||
| detail of the proxy repository and are not intended to be a public API. | ||
|
|
||
| ```starlark | ||
| package(default_visibility = ["//visibility:public"]) | ||
|
|
||
| config_setting( | ||
| name = "_is_venv_pypi_a", | ||
| flag_values = { | ||
| "@rules_python//python/config_settings:venv": "pypi_a", | ||
| }, | ||
| ) | ||
|
|
||
| config_setting( | ||
| name = "_is_venv_pypi_b", | ||
| flag_values = { | ||
| "@rules_python//python/config_settings:venv": "pypi_b", | ||
| }, | ||
| ) | ||
| ``` | ||
|
|
||
| #### 2. `@pypi//foo/BUILD.bazel` (PyPI Package Subpackage) | ||
| Each PyPI package subpackage defines the standard aliases (`pkg`, `whl`, `data`, | ||
| `dist_info`, `extracted_wheel_files`), plus a complete **union of all custom | ||
| `extra_hub_aliases`** defined across all concrete hubs. | ||
|
|
||
| Each alias resolves dynamically to the active concrete hub based on the root | ||
| private configuration settings: | ||
|
|
||
| ```starlark | ||
| package(default_visibility = ["//visibility:public"]) | ||
|
|
||
| alias( | ||
| name = "foo", | ||
| actual = ":pkg", | ||
| ) | ||
|
|
||
| alias( | ||
| name = "pkg", | ||
| actual = select({ | ||
| "//:_is_venv_pypi_a": "@pypi_a//foo:pkg", | ||
| "//:_is_venv_pypi_b": "@pypi_b//foo:pkg", | ||
| # When venv is "auto" (unset), it defaults to the designated fallback | ||
| # (or first defined concrete hub). | ||
| "//conditions:default": "@pypi_b//foo:pkg", | ||
| }), | ||
| ) | ||
|
|
||
| alias( | ||
| name = "whl", | ||
| actual = select({ | ||
| "//:_is_venv_pypi_a": "@pypi_a//foo:whl", | ||
| "//:_is_venv_pypi_b": "@pypi_b//foo:whl", | ||
| "//conditions:default": "@pypi_b//foo:whl", | ||
| }), | ||
| ) | ||
|
|
||
| # ... standard aliases for data, dist_info, extracted_wheel_files ... | ||
|
|
||
| # 3. Unionized custom extra alias (defined in pypi_a but missing in pypi_b): | ||
| alias( | ||
| name = "my_custom_tool", | ||
| actual = select({ | ||
| "//:_is_venv_pypi_a": "@pypi_a//foo:my_custom_tool", | ||
| # Unrepresented branch routes to execution failure target: | ||
| "//:_is_venv_pypi_b": "//:_missing_package_error_pypi_b_foo", | ||
| "//conditions:default": "@pypi_a//foo:my_custom_tool", | ||
| }), | ||
| ) | ||
| ``` | ||
|
|
||
| ### Disjoint Hub Packages & Execution-Phase Failure | ||
|
|
||
| If a package exists in one concrete hub but is missing in another (e.g., `scipy` | ||
| is in `pypi_b` but not `pypi_a`), the proxy still synthesizes a subpackage for | ||
| every package in the union of all hubs' packages. | ||
|
|
||
| To ensure that `bazel cquery` and `bazel query` can traverse the entire | ||
| transitive build graph without failing, select branches for hubs that lack the | ||
| package must route to a dedicated **execution-phase error rule**. | ||
|
|
||
| ```starlark | ||
| # In @pypi//scipy/BUILD.bazel | ||
| alias( | ||
| name = "pkg", | ||
| actual = select({ | ||
| # Routes to execution-phase action failure target: | ||
| "//:_is_venv_pypi_a": "//:_missing_package_error_pypi_a_scipy", | ||
| "//:_is_venv_pypi_b": "@pypi_b//scipy:pkg", | ||
| "//conditions:default": "@pypi_b//scipy:pkg", | ||
| }), | ||
| ) | ||
| ``` | ||
|
|
||
| The synthesized `//:_missing_package_error_XX` rule in `@pypi//BUILD.bazel` | ||
| returns standard Starlark Python providers so analysis/cquery passes, but | ||
| registers a build action that fails when executed: | ||
|
|
||
| ``` | ||
| Dependency Error: Third-party package 'scipy' is not available when building under PyPI hub 'pypi_a'. | ||
| ``` | ||
|
|
||
| ### Fallback Hub Precedence (`"auto"`) | ||
|
|
||
| When a target depends on `@pypi//foo` and the active build setting is `"auto"`, | ||
| the proxy resolves to a concrete hub using the following precedence: | ||
| 1. **Designated Fallback**: If the user has explicitly designated a fallback | ||
| concrete hub via `pip.default(default_hub = "...")` in their root | ||
| `MODULE.bazel`, the proxy routes to it. | ||
| 2. **First Defined Hub**: If no fallback is explicitly designated via | ||
| `pip.default()`, the proxy **automatically routes to the first defined | ||
| concrete hub** parsed during extension evaluation (e.g., `pypi_a`). | ||
|
|
||
| ```starlark | ||
| # Explicitly override the "auto" fallback hub | ||
| pip.default( | ||
| default_hub = "pypi_b", | ||
| ) | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 2. Core Rule Integration: `config_settings` Transitions | ||
|
|
||
| Users will switch active hubs using the standard, highly generic | ||
| `config_settings` transition attribute on executable targets. | ||
|
|
||
| ### Build Setting Definition | ||
|
|
||
| In `python/config_settings/BUILD.bazel`: | ||
|
|
||
| ```starlark | ||
| string_flag( | ||
| name = "venv", | ||
| build_setting_default = "auto", # Default value is "auto" | ||
| visibility = ["//visibility:public"], | ||
| ) | ||
| ``` | ||
|
|
||
| In `python/private/common_labels.bzl`: | ||
| ```starlark | ||
| VENV = str(Label("//python/config_settings:venv")), | ||
| ``` | ||
|
|
||
| In `python/private/transition_labels.bzl`: | ||
| ```starlark | ||
| _BASE_TRANSITION_LABELS = [ | ||
| # ... existing transition labels ... | ||
| labels.VENV, | ||
| ] | ||
| ``` | ||
|
|
||
| Because `py_binary` and `py_test` implement an incoming transition | ||
| (`_transition_executable_impl`) that automatically processes any | ||
| `config_settings` keys matching `TRANSITION_LABELS`, **this provides complete | ||
| transition capabilities with zero changes to our core rule definitions**. | ||
|
|
||
| ### Usage in BUILD.bazel | ||
|
|
||
| Libraries consume packages through the canonical proxy: | ||
|
|
||
| ```starlark | ||
| py_library( | ||
| name = "common", | ||
| deps = ["@pypi//foo"], # Apparent proxy repository | ||
| ) | ||
| ``` | ||
|
|
||
| Binaries change the active hub by transitioning the build setting: | ||
|
|
||
| ```starlark | ||
| # Resolves @pypi -> pypi_b (default hub / designated fallback) | ||
| py_binary( | ||
| name = "bin_default", | ||
| deps = [":common"], | ||
| ) | ||
|
|
||
| # Resolves @pypi -> pypi_a via transition | ||
| py_binary( | ||
| name = "bin_a", | ||
| deps = [":common"], | ||
| config_settings = { | ||
| "@rules_python//python/config_settings:venv": "pypi_a", | ||
| }, | ||
| ) | ||
| ``` | ||
|
|
||
| ### Analysis Cache & Memory Best Practices | ||
|
|
||
| Because transitions fork the Bazel configuration, building targets with many | ||
| distinct `config_settings` values across large build graphs will cause shared | ||
| dependencies to be re-analyzed and their actions re-executed per configuration. | ||
|
|
||
| We will include explicit documentation guidelines advising users to keep their | ||
| `venv` transition configurations localized and minimized to preserve Bazel | ||
| caching and memory efficiency. | ||
|
|
||
| --- | ||
|
|
||
| ## 3. Integration Testing Specification | ||
|
|
||
| We will construct a comprehensive Bazel-in-Bazel integration test suite in | ||
| `tests/integration/unified_pypi/` to guarantee correctness and verify | ||
| transitions. | ||
|
|
||
| The integration test suite will assert: | ||
| 1. **`"auto"` Precedence**: Author a test asserting `bazel run //:bin_default` | ||
| correctly inherits `"auto"` and resolves dependencies from the designated fallback. | ||
| 2. **Transitional Resolution**: Author a test asserting two binary targets in | ||
| the same package with different `config_settings` successfully resolve | ||
| dependencies and execute against their respective concrete hubs (`pypi_a` | ||
| vs `pypi_b`). | ||
| 3. **Command Line Override**: Author a test asserting | ||
| `bazel run --@rules_python//python/config_settings:venv=pypi_a //:bin_default` | ||
| successfully forces the executable to run using imports resolved from | ||
| `pypi_a`. | ||
| 4. **Disjoint Execution Failure**: Author a test asserting `bazel cquery` over | ||
| a target depending on an unrepresented missing package succeeds, while | ||
| `bazel run` on that target gracefully fails during execution with the exact | ||
| synthesized error message. | ||
| 5. **Unionized Extra Hub Aliases**: Author a test asserting that a binary | ||
| successfully runs using a custom `extra_hub_aliases` target resolved | ||
| through the `@pypi` proxy. | ||
|
|
||
| --- | ||
|
|
||
| ## 4. Execution Steps | ||
|
|
||
| 1. **Phase 1**: Define `venv` `string_flag` and register it in | ||
| `common_labels.bzl` and `transition_labels.bzl`. | ||
| 2. **Phase 2**: Update `python/private/pypi/extension.bzl` to synthesize the | ||
| canonical `pypi` proxy repository rule. | ||
| 3. **Phase 3**: Implement `missing_package_error` execution failure rule and | ||
| the `proxy_hub_repository` generation logic. | ||
| 4. **Phase 4**: Author the Bazel-in-Bazel integration test suite in | ||
| `tests/integration/unified_pypi/`. | ||
| 5. **Phase 5**: Run all tests and verify full pass before PR submission. | ||
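
For readers of the plan, here is a minimal Starlark sketch of Phase 3: an execution-phase error rule and the per-alias `select()` branch generation. It is an illustration under stated assumptions, not the code in this PR. The file name, `resolve_default_hub`, `alias_select_branches`, the attribute names, and the `mnemonic` are all hypothetical; only the label shapes (`//:_is_venv_<hub>`, `//:_missing_package_error_<hub>_<pkg>`) and the error text come from the plan. The sketch returns only `DefaultInfo`, whereas the plan says the real rule also returns the standard Python providers.

```starlark
# proxy_hub_sketch.bzl: illustrative only; every name here is hypothetical.

def _missing_package_error_impl(ctx):
    # Analysis succeeds: we only declare an output and register an action.
    marker = ctx.actions.declare_file(ctx.label.name + ".missing")

    # The action always fails, so the error surfaces only at execution time.
    ctx.actions.run_shell(
        outputs = [marker],
        command = 'echo "$1" >&2; exit 1',
        arguments = [
            "Dependency Error: Third-party package '{}' is not available when building under PyPI hub '{}'.".format(
                ctx.attr.package,
                ctx.attr.hub,
            ),
        ],
        mnemonic = "PyPiMissingPackage",
    )

    # Assumption: placing the marker in runfiles makes a dependent binary
    # request it, which triggers the failing action. Python providers such
    # as PyInfo are omitted from this sketch.
    return [DefaultInfo(
        files = depset([marker]),
        runfiles = ctx.runfiles(files = [marker]),
    )]

missing_package_error = rule(
    implementation = _missing_package_error_impl,
    attrs = {
        "hub": attr.string(mandatory = True),
        "package": attr.string(mandatory = True),
    },
)

def resolve_default_hub(hub_names, default_hub = None):
    """Picks the hub used when the venv flag is "auto".

    Follows the plan's precedence: an explicit pip.default(default_hub = ...)
    wins, otherwise the first defined concrete hub is used.
    """
    if default_hub != None:
        return default_hub
    return hub_names[0] if hub_names else None

def alias_select_branches(package, target, hubs, default_hub):
    """Builds the dict passed to select() for one alias in @pypi//<package>.

    Args:
        package: package name, e.g. "scipy".
        target: alias name, e.g. "pkg".
        hubs: dict of hub name -> list of package names that hub provides.
        default_hub: a key of `hubs`, e.g. from resolve_default_hub().
    """
    branches = {}
    for hub, packages in hubs.items():
        if package in packages:
            actual = "@{}//{}:{}".format(hub, package, target)
        else:
            # Hubs lacking the package route to the failing target.
            actual = "//:_missing_package_error_{}_{}".format(hub, package)
        branches["//:_is_venv_" + hub] = actual
    branches["//conditions:default"] = branches["//:_is_venv_" + default_hub]
    return branches
```

With `hubs = {"pypi_a": ["foo"], "pypi_b": ["foo", "scipy"]}` and `default_hub = "pypi_b"`, calling `alias_select_branches("scipy", "pkg", hubs, "pypi_b")` yields the same three branches as the plan's `@pypi//scipy/BUILD.bazel` example. The plan's `my_custom_tool` example routes `//conditions:default` to `pypi_a` even though `pypi_b` is the default hub, so the actual generation logic for extra aliases may differ from this helper.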
In this particular case we could:
- `<module_name>.pypi` in this case and tell this to the user. Since this is a special name, we should not get ourselves into a situation where we break if there is a non-root module using this name.
- `pip.parse` with `pypi`, then set that as default automatically.
Thanks for coming up with some ideas. My initial thinking was to skip the logic to ensure existing behavior wasn't affected, then figure out how we could transition to `@pypi` being a pip-extension owned name.

The idea to make `name=pypi` an implicit default is appealing. I'm a bit concerned it may over-complicate how a default is selected. But... I do like it. Feels like a pretty reasonable behavior. So let's do that.

What do you think of:

Print a warning if the name collision occurs. If an env var is set (`RULES_PYTHON_PYPI_HUB_RESERVED=1`), then the hub is silently renamed `<module_name>.pypi` and is used as the default hub (if `pip.default` wasn't used). In a future release, we flip the default.
SGTM
Done
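
A minimal sketch of the collision handling proposed in this thread, written as a pure Starlark helper. It assumes the caller has already read `RULES_PYTHON_PYPI_HUB_RESERVED` and passes the result in as a boolean. The function name, the returned struct fields, and the warning text are hypothetical and not taken from the PR.

```starlark
RESERVED_HUB_NAME = "pypi"

def resolve_hub_name(hub_name, module_name, reserved):
    """Decides the final repository name for one pip.parse hub.

    Args:
        hub_name: the hub_name passed to pip.parse.
        module_name: name of the module that declared the hub.
        reserved: True when RULES_PYTHON_PYPI_HUB_RESERVED=1 is set.

    Returns:
        A struct with the final name, whether the hub should become the
        default hub when pip.default was not used, and an optional warning.
    """
    if hub_name != RESERVED_HUB_NAME:
        # No collision: keep the name as written.
        return struct(name = hub_name, implicit_default = False, warning = None)

    if reserved:
        # Opt-in behavior: rename silently and treat the hub as the default.
        return struct(
            name = module_name + ".pypi",
            implicit_default = True,
            warning = None,
        )

    # Current default: keep the user's hub and warn about the collision.
    return struct(
        name = hub_name,
        implicit_default = False,
        warning = "hub name 'pypi' collides with the unified hub; set " +
                  "RULES_PYTHON_PYPI_HUB_RESERVED=1 to opt in to renaming",
    )
```

For example, `resolve_hub_name("pypi", "my_module", True)` returns a struct whose `name` is `"my_module.pypi"` and whose `implicit_default` is `True`. Flipping the default in a future release, as suggested above, would amount to changing what `reserved` defaults to at the call site.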