nfpm.native_libs: new backend for nfpm pkg deps (only elfdeps subsystem)#22873
cognifloyd merged 32 commits into main
Without the elfdeps req, we can't run pytest or mypy on the nfpm.native_libs backend's analyze_wheels.py script.

Lockfile diff: 3rdparty/python/user_reqs.lock [python-default]

== Upgraded dependencies ==

  graphql-core    3.2.6  -->  3.2.7
  pbr             7.0.1  -->  7.0.3
  pydantic        2.12.3 -->  2.12.4
  pydantic-core   2.41.4 -->  2.41.5

== Added dependencies ==

  elfdeps         0.2.0
  pyelftools      0.32
I had to use a dummy backend in the new PythonTool(...) in generate_builtin_lockfiles.py to generate the initial lockfile, because the backend isn't loadable until the lockfile exists. After generating the lockfile for the first time, the backend could actually load, so I updated the PythonTool(...) entry to use the actual backend. Regeneration works just fine after all of that.

Lockfile diff: elfdeps.lock [elfdeps]

== Added dependencies ==

  elfdeps         0.2.0
  pyelftools      0.32
move the sort logic into the class, so tests don't need to sort it at the usage site.
We need slightly different things for deb vs rpm. For deb, we need just the soname to search for relevant packages, and parsing the so_info string did not seem wise when I can just preserve the data as elfdeps returned it. So, we now have a SOInfo dataclass (a limited mirror of the elfdeps.SOInfo dataclass).
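A minimal sketch of the two ideas above: a small dataclass that preserves structured data instead of a pre-formatted string, and a container that sorts on construction so tests never sort at the usage site. The field names (`soname`, `version`, `marker`), the `so_info` property with its string format, and the `ELFInfo` container are assumptions for illustration. They are not copied from the backend or from `elfdeps`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SOInfo:
    """Hypothetical limited mirror of the data reported for one shared object."""

    soname: str
    version: str = ""
    marker: str = ""

    @property
    def so_info(self) -> str:
        # Assumed rpm-style rendering: SONAME(VERSION)MARKER.
        return f"{self.soname}({self.version}){self.marker}"


@dataclass(frozen=True)
class ELFInfo:
    """Hypothetical container that normalizes ordering once, at construction."""

    provides: tuple[SOInfo, ...]
    requires: tuple[SOInfo, ...]

    def __post_init__(self) -> None:
        # Frozen dataclasses need object.__setattr__ to normalize their fields.
        object.__setattr__(self, "provides", tuple(sorted(set(self.provides))))
        object.__setattr__(self, "requires", tuple(sorted(set(self.requires))))


info = ELFInfo(
    provides=(),
    requires=(
        SOInfo("libm.so.6", "GLIBC_2.29", "(64bit)"),
        SOInfo("libc.so.6", "GLIBC_2.4", "(64bit)"),
        SOInfo("libc.so.6", "GLIBC_2.4", "(64bit)"),  # duplicate, dropped by set()
    ),
)

# deb only needs the bare SONAME to search for packages; rpm can use the full string.
deb_search_terms = sorted({so.soname for so in info.requires})
rpm_requires = [so.so_info for so in info.requires]
```

Keeping the SONAME as its own field means the deb side never has to parse it back out of a formatted string.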
As it will start analyzing more than just wheels.
As it will have analysis of more than just wheels.
e64a330 to a158f40
I would like to see multiple reviews on this, especially around how I put I could add a separate resolve but that adds complexity in |
The rules should not depend on `elfdeps` (a pypi package), because that will make the pants wheel/pex include `elfdeps` which is not useful. So, instead of depending on the script `python_source` targets, add deps on `resource` targets instead. Each script is both `python_source` and `resource` so that rule code can depend on the script as a resource, and ruff, mypy, pytest, and other tools can work with the script as python. To avoid inadvertent dependencies between rule python code and script python code, this adds some `__dependents_rules__` to ensure rule code can only depend on the `resource`, not the `python_source`.
I added Now, dependents are as follows for the Now, dependents are as follows for the 3rdparty dependencies of No 3rdparty dependencies of 3rdparty dependencies of 3rdparty dependencies of the pants wheel do NOT include |
benjyw left a comment
Looks fine to me. I am comfortable with elfdeps in python-default, with the visibility rules.
Go now compiles with trimpath to strip sandbox paths from output, allowing for reproducible builds.

#### NEW: nFPM Native Libs
Wondering: why is this a separate backend and not a new feature in the existing one?
This is separate because--I assume--it will not be desirable for everyone. If you want a quick rpm that serves as merely a container for files--which nfpm can happily provide--then something that automatically (magically/implicitly) analyzes binaries and injects dependencies is antithetical to that explicit simplicity.
I could probably add a field to pex_binary that can disable or enable this behavior, but then I'd have to add that field to every other target that could be binary data (added once the backend supports that target). I could also add a flag to control this in pants.toml, but I don't think that is significantly different than adding one more backend to enable the functionality.
I think we could combine the backends later after we have more experience using it and more feedback from others' experiences. If people generally want this feature on by default once the nfpm backend is enabled, then combining the backends would make sense.
e0b91ac to 92f5836
…22899)

## PR Series Overview

This is the second in a series of PRs that introduces a new backend: `pants.backend.nfpm.native_libs`

Initially, the backend will be available as: `pants.backend.experimental.nfpm.native_libs`

I proposed this new backend (originally named `bindeps`) in discussion #22396.

This backend will inspect ELF bin/lib files (like `lib*.so`) in packaged contents (for this PR series, only in `pex_binary` targets) to identify package dependency metadata and inject that metadata on the relevant `nfpm_deb_package` or `nfpm_rpm_package` targets. Effectively, it will provide an approximation of these native packager features:

- `rpm`: `rpmdeps` + `elfdeps`
- `deb`: `dh_shlibdeps` + `dpkg-shlibdeps` (These substitute `${shlibs:Depends}` in debian control files.)

### Goal: Host-agnostic package builds

This pants backend is designed to be host-agnostic, like [nFPM](https://nfpm.goreleaser.com/).

Native packaging tools are often restricted to a single release of a single distro. Unlike native package builders, this new pants backend does not use any of those distro-specific or distro-release-specific utilities or local package databases. This new backend should be able to run (help with building deb and rpm packages) anywhere that pants can run (MacOS, rpm linux distros, deb linux distros, other linux distros, docker, ...).

### Previous PRs in series

- #22873

## PR Overview

This PR adds rules in `nfpm.native_libs` to add package dependency metadata to `nfpm_rpm_package`. The 2 new rules are:

- `inject_native_libs_dependencies_in_package_fields`:
  - An implementation of the polymorphic rule `inject_nfpm_package_fields`. This rule is low priority (`priority = 2`) so that in-repo plugins can override/augment what it injects. (See #22864)
  - Rule logic overview:
    - find any pex_binaries that will be packaged in an `nfpm_rpm_package` (using utility introduced in #22863)
    - Run new `rpm_depends_from_pex` rule (see below)
    - Inject identified SONAMEs in `nfpm_rpm_package` dependency fields (rpm accepts raw SONAMEs in these fields, so the SONAME does not need to be translated to a package name when building the package).
      - The `requires` field gets SONAMEs required by ELF binaries or libraries in the package contents
      - The `provides` field gets SONAMEs provided by ELF libraries in the package contents
  - How the rule outputs are used: The package dependency fields (like `requires` and `provides`) will be used when generating the config passed to `nFPM` so that `nFPM` includes the package dependency metadata in the built rpm package.
- `rpm_depends_from_pex`:
  - runs `elfdeps_analyze_pex` on a pex (added in #22873)
  - returns only the ELF metadata that can be injected in `nfpm_rpm_package` fields.
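The injection step described above can be pictured as merging detected SONAMEs into whatever a target already declares. This is only a sketch under assumptions: `merge_rpm_field` and the sample values are hypothetical, and the real rule operates on Pants field objects rather than plain lists.

```python
from __future__ import annotations

from typing import Iterable


def merge_rpm_field(declared: Iterable[str], detected: Iterable[str]) -> tuple[str, ...]:
    """Combine declared entries with detected SONAMEs (hypothetical helper).

    Declared entries keep their order; detected entries follow in sorted order,
    and anything already declared is not repeated.
    """
    merged = list(dict.fromkeys(declared))  # de-duplicate while preserving order
    seen = set(merged)
    for soname in sorted(set(detected)):
        if soname not in seen:
            merged.append(soname)
            seen.add(soname)
    return tuple(merged)


# Example values only; rpm accepts raw SONAME strings in these fields.
declared_requires = ["bash", "libc.so.6()(64bit)"]
detected_requires = ["libssl.so.3()(64bit)", "libc.so.6()(64bit)"]
requires = merge_rpm_field(declared_requires, detected_requires)
```

Sorting the detected entries keeps the generated nFPM config stable between runs, which matters for caching and reproducible packages.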
) #22890 changed the interpreter constraints to include python 3.14. I missed including that in #22873 (which was merged after #22890), so this PR regenerates `elfdeps.lock` with the bumped ICs.

#22873 also missed registering the `pants.backend.experimental.nfpm.native_libs` backend, which I discovered when looking for script deps in the output of:

```
pants dependencies --transitive src/python/pants:pants-packaged | less
```
## PR Series Overview

This is the first in a series of PRs that introduces a new backend: `pants.backend.nfpm.native_libs`

Initially, the backend will be available as: `pants.backend.experimental.nfpm.native_libs`

I proposed this new backend (originally named `bindeps`) in discussion #22396.

This backend will inspect ELF bin/lib files (like `lib*.so`) in packaged contents (for this PR series, only in `pex_binary` targets) to identify package dependency metadata and inject that metadata on the relevant `nfpm_deb_package` or `nfpm_rpm_package` targets. Effectively, it will provide an approximation of these native packager features:

- `rpm`: `rpmdeps` + `elfdeps`
- `deb`: `dh_shlibdeps` + `dpkg-shlibdeps` (These substitute `${shlibs:Depends}` in debian control files.)

### Goal: Host-agnostic package builds
This pants backend is designed to be host-agnostic, like nFPM.
Native packaging tools are often restricted to a single release of a single distro. Unlike native package builders, this new pants backend does not use any of those distro-specific or distro-release-specific utilities or local package databases. This new backend should be able to build deb and rpm packages anywhere that pants can run (MacOS, rpm linux distros, deb linux distros, other linux distros, docker, ...).
## PR Overview

To achieve the host-agnostic goal, the scripts in this new backend use pure-python deps to search the ELF bin/lib files for provided and/or required SONAMEs:

- `elfdeps`
- `pyelftools`

This PR focuses only on `pants.backend.nfpm.native_libs.elfdeps`, which includes some rules, a subsystem, a lockfile, and an `analyze.py` script. Future PRs will actually translate and inject these SONAME provides/requires into the appropriate `nfpm_*_package` dependency fields.

### Subsystem and Lockfile
`elfdeps` is a subsystem (with a default `elfdeps.lock` file) that allows configuring an alternate version of `elfdeps`. This might be useful if an alternate version (or a fork?) has fixes/features that apply to libraries contained in the wheels. As with all other backends, we only test with one version in CI, so users may encounter issues with other versions, which they are welcome to report on GitHub or in Slack.

The subsystem's new lockfile summary (note the minimal number of deps, all of which are pure-python):

There is also a rule bundled with the subsystem, `setup_elfdeps_analyze_wheels_tool`, that constructs a venv for the `analyze.py` script to run in.

### Rules
The `elfdeps_analyze_pex` rule:

- `pex3 repository extract` to create a directory with all of the wheels in the pex.
- aside: I would rather not unpack the pex like this, as the pex-internal structure feels like an internal implementation detail. However, I couldn't find any official pex utilities to extract the non-wheel (non-deps) files, so `UnzipBinary` it is.
- `subsystem.setup_elfdeps_analyze_wheels_tool`: Prepare a pex venv using `elfdeps.lock`
- `analyze.py` script (see below) in a pex venv. This runs (concurrently) twice: once with the wheels, and once with the non-wheel files.
- `analyze.py`, returning it in a dataclass for use by the calling rule.

### `analyze.py` script

The `analyze.py` script in `pants.backend.nfpm.native_libs.elfdeps` does the following:

- `--mode` arg (`wheels` and `files` are separate modes because `elfdeps` can descend into zip files, but it can't tell that a `.whl` is a zip file.)
  - `--mode wheels`: open each wheel as a zip file, then pass it to `elfdeps` to iterate over and inspect the wheel's contents,
  - `--mode files`: pass directory to `elfdeps` to recursively inspect all files in that directory (descending into tar or zip files if found),
- `elfdeps` analyzes ELF metadata of each file that looks like a `.so` library or an executable ELF binary
- `requires` and `provides` ELF metadata returned by elfdeps

### Tests
Python scripts in other backends are often treated as a resource, so linters are either skipped entirely for them, or mypy is instructed to ignore the 3rd party imports. I want the full help of all of these tools to keep my code clean and catch as many issues as possible. I also want unit tests for the parts of the script, but that means pytest needs to be able to import the script, so its dependencies have to be available in the python resolve that includes the `analyze.py` script.

I don't want to create another resolve just for this one script and its tests. So, I added `elfdeps` to the `python-default` resolve, and added visibility rules that ensure the dep is only ever used by the script and its tests, and that the script (and its `elfdeps` dep) is only a resource dependency (not a python dependency) of rule code.

Here is the lockfile diff from regenerating the python-default resolve's lockfile (note the package upgrades are unrelated to elfdeps):

At 18.5 kB and 188.5 kB, the pure-python wheels for `elfdeps` and `pyelftools` are remarkably small. However, this should only matter during development, because the visibility rules should ensure that `elfdeps` never becomes a transitive dep of the pants wheel or pants pex.
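Returning to the two `analyze.py` modes described earlier, here is a stdlib-only sketch of why wheels and plain files are handled separately. It does not call `elfdeps` at all. The helper names are hypothetical, and checking the `\x7fELF` magic bytes is an assumed stand-in for the real ELF analysis, used only to show how a `.whl` has to be opened explicitly as a zip archive.

```python
from __future__ import annotations

import tempfile
import zipfile
from pathlib import Path

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def elf_members_in_wheel(wheel: Path) -> list[str]:
    """Wheels mode (sketch): a .whl is a zip archive, so open it explicitly."""
    found = []
    with zipfile.ZipFile(wheel) as zf:
        for zinfo in zf.infolist():
            if zinfo.is_dir():
                continue
            with zf.open(zinfo) as member:
                if member.read(4) == ELF_MAGIC:
                    found.append(zinfo.filename)
    return sorted(found)


def elf_files_in_dir(root: Path) -> list[str]:
    """Files mode (sketch): walk a directory of already-extracted files.

    Unlike the real script, this sketch does not descend into archives.
    """
    found = []
    for path in root.rglob("*"):
        if path.is_file():
            with path.open("rb") as f:
                if f.read(4) == ELF_MAGIC:
                    found.append(path.relative_to(root).as_posix())
    return sorted(found)


# Tiny demonstration with fake content (the bytes are not a valid ELF file).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "bin").mkdir()
    (root / "bin" / "tool").write_bytes(ELF_MAGIC + b"\x00" * 12)
    (root / "README.txt").write_text("not a binary")
    wheel = root / "demo-1.0-py3-none-any.whl"
    with zipfile.ZipFile(wheel, "w") as zf:
        zf.writestr("demo/_native.so", ELF_MAGIC + b"\x00" * 12)
        zf.writestr("demo/__init__.py", "")
    wheel_hits = elf_members_in_wheel(wheel)
    file_hits = elf_files_in_dir(root)
```

In the demonstration, the directory walk sees the `.whl` only as an opaque file, so the library inside it is found only when the wheel is opened as a zip.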