nfpm.native_libs: new backend for nfpm pkg deps (only elfdeps subsystem)#22873

Merged
cognifloyd merged 32 commits into main from cognifloyd/nfpm-native_libs-elfdeps on Nov 19, 2025
@cognifloyd (Member) commented Nov 8, 2025

PR Series Overview

This is the first in a series of PRs that introduces a new backend: pants.backend.nfpm.native_libs
Initially, the backend will be available as: pants.backend.experimental.nfpm.native_libs

I proposed this new backend (originally named bindeps) in discussion #22396.

This backend will inspect ELF bin/lib files (like lib*.so) in packaged contents (for this PR series, only in pex_binary targets) to identify package dependency metadata and inject that metadata on the relevant nfpm_deb_package or nfpm_rpm_package targets. Effectively, it will provide an approximation of these native packager features:

  • rpm: rpmdeps + elfdeps
  • deb: dh_shlibdeps + dpkg-shlibdeps (these substitute ${shlibs:Depends} in debian control files)

Goal: Host-agnostic package builds

This pants backend is designed to be host-agnostic, like nFPM.

Native packaging tools are often restricted to a single release of a single distro. Unlike native package builders, this new pants backend does not use any of those distro-specific or distro-release-specific utilities or local package databases. This new backend should be able to build deb and rpm packages anywhere that pants can run (MacOS, rpm linux distros, deb linux distros, other linux distros, docker, ...).

PR Overview

To achieve the host-agnostic goal, the scripts in this new backend use pure-python deps to search the ELF bin/lib files for provided and/or required SONAMEs:

  • elfdeps (📦 | :octocat:):

    "Python implementation of RPM elfdeps."

  • pyelftools (📦 | :octocat:):

    "A pure-Python library for parsing and analyzing ELF files and DWARF debugging information."
    (elfdeps is only host-agnostic because of its dependency on pyelftools. We do not use pyelftools directly.)

This PR focuses only on pants.backend.nfpm.native_libs.elfdeps, which includes some rules, a subsystem, a lockfile, and an analyze.py script. Future PRs will actually translate and inject these SONAME provides/requires into the appropriate nfpm_*_package dependency fields.

Subsystem and Lockfile

elfdeps is a subsystem (with a default elfdeps.lock file) that allows configuring an alternate version of elfdeps. This might be useful if an alternate version (or a fork?) has fixes/features that apply to libraries contained in the wheels. As with all other backends, we only test with one version in CI, so users may encounter issues with other versions, which they are welcome to report on GitHub or in Slack.

The subsystem's new lockfile summary (note the minimal number of deps, all of which are pure-python):

__________________________________________________________________
Lockfile diff: elfdeps.lock [elfdeps]
__________________________________________________________________
==                      Added dependencies                      ==
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
  elfdeps                        0.2.0
  pyelftools                     0.32

There is also a rule bundled with the subsystem, setup_elfdeps_analyze_wheels_tool, that constructs a venv for the analyze.py script to run in.

Rules

The elfdeps_analyze_pex rule:

  • Run pex3 repository extract to create a directory with all of the wheels in the pex.
  • Unzip the pex to get any remaining files that are not included in wheels.
    aside: I would rather not unpack the pex like this, as the pex-internal structure feels like an internal implementation detail. However, I couldn't find any official pex utilities to extract the non-wheel (non-deps) files, so UnzipBinary it is.
  • subsystem.setup_elfdeps_analyze_wheels_tool: Prepare a pex venv using elfdeps.lock
  • Run analyze.py script (see below) in a pex venv. This runs (concurrently) twice: once with the wheels, and once with the non-wheel files.
  • Parse the JSON output of analyze.py, returning it in a dataclass for use by the calling rule.
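
The last step above (parsing the script's JSON into a dataclass) might look roughly like the sketch below. This is only an illustration: the JSON keys (`provides`, `requires`), the `PexELFInfo` name, and the two `SOInfo` fields are assumptions made for the example, not the actual schema or classes in this PR.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SOInfo:
    """Hypothetical limited mirror of one shared-object entry (field names assumed)."""

    soname: str  # e.g. "libc.so.6"
    so_info: str  # e.g. "libc.so.6(GLIBC_2.17)(64bit)"


@dataclass(frozen=True)
class PexELFInfo:
    """Hypothetical result type handed back to the calling rule."""

    provides: tuple[SOInfo, ...]
    requires: tuple[SOInfo, ...]

    def __post_init__(self) -> None:
        # Sort and deduplicate inside the class so that callers and tests
        # do not have to sort at the usage site.
        object.__setattr__(self, "provides", tuple(sorted(set(self.provides))))
        object.__setattr__(self, "requires", tuple(sorted(set(self.requires))))

    @classmethod
    def from_json(cls, raw: str) -> "PexELFInfo":
        # Assumed layout: {"provides": [{...}, ...], "requires": [{...}, ...]}
        data = json.loads(raw)
        return cls(
            provides=tuple(SOInfo(**entry) for entry in data.get("provides", ())),
            requires=tuple(SOInfo(**entry) for entry in data.get("requires", ())),
        )
```

Keeping both the bare `soname` and the full `so_info` string means a consumer does not have to parse one out of the other.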

analyze.py script

The analyze.py script in pants.backend.nfpm.native_libs.elfdeps does the following:

  • Branch based on the --mode arg (wheels and files are separate modes because elfdeps can descend into zip files, but it can't tell that a .whl is a zip file.)
    • --mode wheels: open each wheel as a zip file, then pass it to elfdeps to iterate over and inspect the wheel's contents,
    • --mode files: pass directory to elfdeps to recursively inspect all files in that directory (descending into tar or zip files if found),
  • elfdeps analyzes ELF metadata of each file that looks like a .so library or an executable ELF binary
  • Collect SONAMEs in requires and provides ELF metadata returned by elfdeps
  • Return collected results as JSON (rules can use this as simple structured data)
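
To illustrate why the two modes are separate, here is a minimal stand-alone sketch that does not use elfdeps at all. It opens each `.whl` as a zip archive in one mode and walks loose files in the other, flagging ELF candidates by their 4-byte magic number. The real script delegates the analysis (including SONAME extraction) to elfdeps; the positional `directory` argument and the shape of the JSON output here are assumptions for the example.

```python
import argparse
import json
import zipfile
from pathlib import Path

ELF_MAGIC = b"\x7fELF"  # every ELF file starts with these four bytes


def elf_members_in_zip(archive: zipfile.ZipFile) -> list[str]:
    """Names of archive members that start with the ELF magic number."""
    found = []
    for info in archive.infolist():
        if info.is_dir():
            continue
        with archive.open(info) as member:
            if member.read(4) == ELF_MAGIC:
                found.append(info.filename)
    return sorted(found)


def elf_files_in_dir(root: Path) -> list[str]:
    """Relative paths of regular files under root that start with the ELF magic."""
    found = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            with path.open("rb") as handle:
                if handle.read(4) == ELF_MAGIC:
                    found.append(path.relative_to(root).as_posix())
    return found


def main(argv: list[str]) -> str:
    parser = argparse.ArgumentParser()
    parser.add_argument("--mode", choices=("wheels", "files"), required=True)
    parser.add_argument("directory", type=Path)  # hypothetical positional argument
    args = parser.parse_args(argv)
    if args.mode == "wheels":
        # A .whl is a zip archive, but nothing in its name says so:
        # open it explicitly before looking inside.
        result = {}
        for wheel in sorted(args.directory.glob("*.whl")):
            with zipfile.ZipFile(wheel) as archive:
                result[wheel.name] = elf_members_in_zip(archive)
    else:
        result = {"files": elf_files_in_dir(args.directory)}
    return json.dumps(result, sort_keys=True)
```

In this sketch, `--mode files` sees a wheel as an opaque non-ELF file, which mirrors the reason given above for handling wheels separately.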

Tests

Python scripts in other backends are often treated as a resource, so linters are either skipped entirely for them, or mypy is instructed to ignore the 3rd party imports. I want the full help of all of these tools to keep my code clean and catch as many issues as possible. I also want unit tests for the parts of the script, but that means pytest needs to be able to import the script, so its dependencies have to be available in the python resolve that includes the analyze.py script.

I don't want to create another resolve just for this one script and its tests. So, I added elfdeps to the python-default resolve, and added visibility rules that ensure the dep is only ever used by the script and its tests, and that the script (and its elfdeps dep) is only a resource dependency (not a python dependency) of rule code.

Here is the lockfile diff from regenerating the python-default resolve's lockfile (note the package upgrades are unrelated to elfdeps):

__________________________________________________________________
Lockfile diff: 3rdparty/python/user_reqs.lock [python-default]
__________________________________________________________________
==                    Upgraded dependencies                     ==
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
  graphql-core                   3.2.6        -->   3.2.7
  pbr                            7.0.1        -->   7.0.3
  pydantic                       2.12.3       -->   2.12.4
  pydantic-core                  2.41.4       -->   2.41.5
__________________________________________________________________
==                      Added dependencies                      ==
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
  elfdeps                        0.2.0
  pyelftools                     0.32

At 18.5 kB and 188.5 kB, the pure-python wheels for elfdeps and pyelftools are remarkably small. However, this should only matter during development, because the visibility rules should ensure that elfdeps never becomes a transitive dep of the pants wheel or pants pex.

@cognifloyd cognifloyd self-assigned this Nov 8, 2025
@cognifloyd cognifloyd changed the title Introduce nfpm.native_libs backend with pex+elf analysis support nfpm.native_libs: new backend with elfdeps subsystem for use with nfpm Nov 8, 2025
@cognifloyd cognifloyd changed the title nfpm.native_libs: new backend with elfdeps subsystem for use with nfpm nfpm.native_libs: new backend for nfpm pkg deps (only elfdeps subsystem) Nov 8, 2025
Without the elfdeps req, we can't run pytest or mypy on the
nfpm.native_libs backend's analyze_wheels.py script.

__________________________________________________________________
Lockfile diff: 3rdparty/python/user_reqs.lock [python-default]
__________________________________________________________________
==                    Upgraded dependencies                     ==
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
  graphql-core                   3.2.6        -->   3.2.7
  pbr                            7.0.1        -->   7.0.3
  pydantic                       2.12.3       -->   2.12.4
  pydantic-core                  2.41.4       -->   2.41.5
__________________________________________________________________
==                      Added dependencies                      ==
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
  elfdeps                        0.2.0
  pyelftools                     0.32
I had to use a dummy backend in the new PythonTool(...) in generate_builtin_lockfiles.py to generate the initial lockfile,
because the backend isn't loadable until the lockfile exists. After generating the lockfile for the first time,
the backend could actually load, so I updated the PythonTool(...) entry to use the actual backend.
Regeneration works just fine after all of that.

__________________________________________________________________
Lockfile diff: elfdeps.lock [elfdeps]
__________________________________________________________________
==                      Added dependencies                      ==
‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
  elfdeps                        0.2.0
  pyelftools                     0.32
move the sort logic into the class, so tests don't need to sort it at the usage site.
We need slightly different things for deb vs rpm. For deb, we need just
the soname to search for relevant packages, and parsing the so_info
string did not seem wise when I can just preserve the data as elfdeps
returned it. So, we now have a SOInfo dataclass (a limited mirror of the
elfdeps.SOInfo dataclass).
As it will start analyzing more than just wheels.
As it will have analysis of more than just wheels.
@cognifloyd cognifloyd force-pushed the cognifloyd/nfpm-native_libs-elfdeps branch from e64a330 to a158f40 Compare November 8, 2025 17:16
@cognifloyd
Member Author

This PR was mostly extracted from #22861, which is far too big to ask anyone to review it. But, if you want more context about how these rules get used, you can look in #22861.

@cognifloyd
Member Author

I would like to see multiple reviews on this, especially around how I put elfdeps in the python-default resolve.

I could add a separate resolve but that adds complexity in pants.toml and BUILD metadata. If I did that, I would probably point that resolve at the subsystem's lockfile instead of putting it in 3rdparty/python/. Would anyone prefer I use a separate resolve for this?

The rules should not depend on `elfdeps` (a pypi package), because that
will make the pants wheel/pex include `elfdeps` which is not useful.
So, instead of depending on the script `python_source` targets, add
deps on `resource` targets instead.

Each script is both `python_source` and `resource` so that rule code can
depend on the script as a resource, and ruff, mypy, pytest, and other
tools can work with the script as python. To avoid inadvertent
dependencies between rule python code and script python code, this adds
some `__dependents_rules__` to ensure rule code can only depend on the
`resource`, not the `python_source`.
@cognifloyd
Member Author

cognifloyd commented Nov 12, 2025

I would like to see multiple reviews on this, especially around how I put elfdeps in the python-default resolve.

I could add a separate resolve but that adds complexity in pants.toml and BUILD metadata. If I did that, I would probably point that resolve at the subsystem's lockfile instead of putting it in 3rdparty/python/. Would anyone prefer I use a separate resolve for this?

I added resources and __dependents_rules__ so that the rule code can depend on analyze.py as a resource, and only analyze_test.py can depend on analyze.py as a python_source (and transitively on its elfdeps dep). This should minimize the impact of adding elfdeps to the python-default resolve, ensuring that pants itself does not pull in that dependency.


Now, dependents are as follows for the :elfdeps (a python_sources target):

$ pants dependents src/python/pants/backend/nfpm/native_libs/elfdeps/analyze.py:elfdeps
src/python/pants/backend/nfpm/native_libs/elfdeps:elfdeps
src/python/pants/backend/nfpm/native_libs/elfdeps/analyze_test.py:tests
$ pants dependents --transitive src/python/pants/backend/nfpm/native_libs/elfdeps/analyze.py:elfdeps
src/python/pants/backend/nfpm/native_libs/elfdeps:elfdeps
src/python/pants/backend/nfpm/native_libs/elfdeps:tests
src/python/pants/backend/nfpm/native_libs/elfdeps/analyze_test.py:tests

Now, dependents are as follows for the :scripts (a resources target):

$ pants dependents src/python/pants/backend/nfpm/native_libs/elfdeps/analyze.py:scripts
src/python/pants/backend/nfpm/native_libs/elfdeps:scripts
$ pants dependents --transitive src/python/pants/backend/nfpm/native_libs/elfdeps/analyze.py:scripts
build-support/bin:py_scripts
build-support/bin/generate_builtin_lockfiles.py:py_scripts
src/python/pants/backend/experimental/nfpm/native_libs:native_libs
src/python/pants/backend/experimental/nfpm/native_libs/register.py
src/python/pants/backend/nfpm/native_libs:native_libs
src/python/pants/backend/nfpm/native_libs/rules.py
src/python/pants/backend/nfpm/native_libs/elfdeps:elfdeps
src/python/pants/backend/nfpm/native_libs/elfdeps:scripts
src/python/pants/backend/nfpm/native_libs/elfdeps:tests
src/python/pants/backend/nfpm/native_libs/elfdeps/rules.py
src/python/pants/backend/nfpm/native_libs/elfdeps/rules_integration_test.py:tests
src/python/pants/backend/nfpm/native_libs/elfdeps/subsystem.py

3rdparty dependencies of .../elfdeps/analyze.py:elfdeps (a python_source target) include elfdeps:

$ pants dependencies --transitive native_libs/elfdeps/analyze.py:elfdeps | grep 3rdparty
3rdparty/python#elfdeps
3rdparty/python/requirements.txt
3rdparty/python/user_reqs.lock:_python-default_lockfile

No 3rdparty dependencies of .../elfdeps/analyze.py:scripts (a resource target):

$ pants dependencies --transitive native_libs/elfdeps/analyze.py:scripts | grep 3rdparty

3rdparty dependencies of .../elfdeps/subsystem.py do NOT include elfdeps:

$ pants dependencies --transitive native_libs/elfdeps/subsystem.py | grep 3rdparty
3rdparty/python#ansicolors
3rdparty/python#packaging
3rdparty/python#typing-extensions
3rdparty/python/requirements.txt
3rdparty/python/user_reqs.lock:_python-default_lockfile

3rdparty dependencies of the pants wheel do NOT include elfdeps (neither does pants-pex):

$ pants dependencies --transitive src/python/pants:pants-packaged | grep 3rdparty
3rdparty/python#PyYAML
3rdparty/python#ansicolors
3rdparty/python#chevron
3rdparty/python#fasteners
3rdparty/python#hdrhistogram
3rdparty/python#ijson
3rdparty/python#libcst
3rdparty/python#node-semver
3rdparty/python#packaging
3rdparty/python#psutil
3rdparty/python#python-lsp-jsonrpc
3rdparty/python#setproctitle
3rdparty/python#toml
3rdparty/python#types-PyYAML
3rdparty/python#types-toml
3rdparty/python#typing-extensions
3rdparty/python/requirements.txt
3rdparty/python/user_reqs.lock:_python-default_lockfile
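
A check like the one above could be automated. The following is a minimal sketch, assuming the command output has already been captured as text; the function names are hypothetical and nothing like this is part of the PR.

```python
def third_party_requirements(dependencies_output: str) -> set[str]:
    """Collect requirement names from `3rdparty/...#<name>` addresses."""
    names = set()
    for line in dependencies_output.splitlines():
        prefix, separator, name = line.strip().partition("#")
        if separator and prefix.startswith("3rdparty/"):
            names.add(name)
    return names


def assert_not_shipped(dependencies_output: str, forbidden=("elfdeps",)) -> None:
    """Raise if any forbidden requirement appears among the transitive deps."""
    leaked = third_party_requirements(dependencies_output) & set(forbidden)
    if leaked:
        raise AssertionError(f"unexpected 3rdparty deps: {sorted(leaked)}")
```

Feeding it the output of `pants dependencies --transitive src/python/pants:pants-packaged` would fail loudly if elfdeps ever became a transitive dep of the packaged pants target.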

Contributor

@benjyw benjyw left a comment


Looks fine to me. I am comfortable with elfdeps in python-default, with the visibility rules.


Go now compiles with trimpath to strip sandbox paths from output, allowing for reproducible builds.

#### NEW: nFPM Native Libs
Contributor


Wondering: why is this a separate backend and not a new feature in the existing one?

Member Author

@cognifloyd cognifloyd Nov 16, 2025


This is separate because, I assume, it will not be desirable for everyone. If you want a quick rpm that serves as merely a container for files, which nfpm can happily provide, then something that automatically (magically/implicitly) analyzes binaries and injects dependencies is antithetical to that explicit simplicity.

I could probably add a field to pex_binary that can disable or enable this behavior, but then I'd have to add that field to every other target that could be binary data (added once the backend supports that target). I could also add a flag to control this in pants.toml, but I don't think that is significantly different than adding one more backend to enable the functionality.

I think we could combine the backends later after we have more experience using it and more feedback from others' experiences. If people generally want this feature on by default once the nfpm backend is enabled, then combining the backends would make sense.

@cognifloyd cognifloyd force-pushed the cognifloyd/nfpm-native_libs-elfdeps branch from e0b91ac to 92f5836 Compare November 18, 2025 20:23
@cognifloyd cognifloyd enabled auto-merge (squash) November 19, 2025 00:34
@cognifloyd cognifloyd merged commit dcfff5c into main Nov 19, 2025
70 of 75 checks passed
@cognifloyd cognifloyd deleted the cognifloyd/nfpm-native_libs-elfdeps branch November 19, 2025 01:42
cognifloyd added a commit that referenced this pull request Nov 20, 2025
…22899)

## PR Series Overview

This is the second in a series of PRs that introduces a new backend:
`pants.backend.nfpm.native_libs`
Initially, the backend will be available as:
`pants.backend.experimental.nfpm.native_libs`

I proposed this new backend (originally named `bindeps`) in discussion
#22396.

This backend will inspect ELF bin/lib files (like `lib*.so`) in packaged
contents (for this PR series, only in `pex_binary` targets) to identify
package dependency metadata and inject that metadata on the relevant
`nfpm_deb_package` or `nfpm_rpm_package` targets. Effectively, it will
provide an approximation of these native packager features:
- `rpm`: `rpmdeps` + `elfdeps`
- `deb`: `dh_shlibdeps` + `dpkg-shlibdeps` (these substitute
`${shlibs:Depends}` in debian control files)

### Goal: Host-agnostic package builds

This pants backend is designed to be host-agnostic, like
[nFPM](https://nfpm.goreleaser.com/).

Native packaging tools are often restricted to a single release of a
single distro. Unlike native package builders, this new pants backend
does not use any of those distro-specific or distro-release-specific
utilities or local package databases. This new backend should be able to
run (help with building deb and rpm packages) anywhere that pants can
run (MacOS, rpm linux distros, deb linux distros, other linux distros,
docker, ...).

### Previous PRs in series

- #22873

## PR Overview

This PR adds rules in `nfpm.native_libs` to add package dependency
metadata to `nfpm_rpm_package`. The 2 new rules are:

- `inject_native_libs_dependencies_in_package_fields`:

    - An implementation of the polymorphic rule `inject_nfpm_package_fields`.
      This rule is low priority (`priority = 2`) so that in-repo plugins can
      override/augment what it injects. (See #22864)

    - Rule logic overview:
        - find any pex_binaries that will be packaged in an `nfpm_rpm_package`
          (using utility introduced in #22863)
        - Run new `rpm_depends_from_pex` rule (see below)
        - Inject identified SONAMEs in `nfpm_rpm_package` dependency fields
          (rpm accepts raw SONAMEs in these fields, so the SONAME does not need to
          be translated to a package name when building the package).
            - The `requires` field gets SONAMEs required by ELF binaries or
              libraries in the package contents
            - The `provides` field gets SONAMEs provided by ELF libraries in the
              package contents

    - How the rule outputs are used: The package dependency fields (like
      `requires` and `provides`) will be used when generating the config
      passed to `nFPM` so that `nFPM` includes the package dependency metadata
      in the built rpm package.

- `rpm_depends_from_pex`:
    - runs `elfdeps_analyze_pex` on a pex (added in #22873)
    - returns only the ELF metadata that can be injected in
      `nfpm_rpm_package` fields.
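
As a rough illustration of the injection step described above, the sketch below merges detected SONAMEs into existing `requires` and `provides` values. Keeping user-specified entries first and appending the detected ones sorted and deduplicated is an assumption made for the example, and `inject_sonames` is a hypothetical helper, not the rule's actual logic.

```python
def inject_sonames(
    existing: dict[str, tuple[str, ...]],
    *,
    requires: list[str],
    provides: list[str],
) -> dict[str, tuple[str, ...]]:
    """Return new field values with detected SONAMEs appended (hypothetical helper)."""

    def merged(field: str, detected: list[str]) -> tuple[str, ...]:
        current = existing.get(field, ())
        # Keep what the user wrote, in order; append only new entries, sorted.
        extra = sorted(set(detected) - set(current))
        return (*current, *extra)

    return {
        **existing,
        "requires": merged("requires", requires),
        "provides": merged("provides", provides),
    }
```

Because rpm accepts raw SONAMEs in these fields, strings such as `libc.so.6()(64bit)` can be passed through unchanged in this sketch.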
cognifloyd added a commit that referenced this pull request Nov 21, 2025
)

#22890 changed the interpreter constraints to include python 3.14. I
missed including that in #22873 (which was merged after #22890),
so this PR regenerates `elfdeps.lock` with the bumped ICs.

#22873 also missed registering the
`pants.backend.experimental.nfpm.native_libs` backend, which I
discovered when looking for script deps in the output of:
```
pants dependencies --transitive src/python/pants:pants-packaged | less
```