bindings/python: free-threaded Python (3.14t) support #2041
Merged
Adds dedicated 3.14t support to the Python bindings without breaking the regular CPython API surface.

Key changes:

- Wrap PyTokenizer's inner Tokenizer in `std::sync::RwLock<Tokenizer>`. Setters take `&self` plus a write guard; readers take a read guard. This removes the per-pyclass `&mut self` borrow check that races under free-threaded Python (`RuntimeError: Already borrowed`) and replaces it with a stdlib RwLock that admits concurrent encode operations while serializing mutations.
- Each #[pymodule] is now declared as `#[cfg_attr(Py_GIL_DISABLED, pymodule(gil_used = false))]` / `#[cfg_attr(not(Py_GIL_DISABLED), pymodule)]`. 3.14t builds opt into Py_MOD_GIL_NOT_USED so importing tokenizers does not re-enable the GIL; regular CPython behaviour is unchanged.
- Add bindings/python/build.rs calling `pyo3_build_config::use_pyo3_cfgs()`. PyO3 detects free-threaded Python and emits Py_GIL_DISABLED on its own crate, but cargo's rustc-cfg directives don't propagate to dependents; use_pyo3_cfgs re-emits them so our cfg_attr fires.
- Promote `abi3` from a hardcoded pyo3 dep-feature to a project-level cargo feature (default on). This allows building without abi3 on free-threaded Python (`maturin develop --no-default-features --features ext-module`), since abi3 / the limited API is not available under free-threading.
- Add bindings/python/docs/free-threading-audit.md walking through every mutation surface (single-field setter, top-level swap, compound mutation, sequence components, trainer-during-train, encode hot path) with verdicts and audit-trail references.
- Add bindings/python/tests/test_freethreaded.py: stress tests racing N encoders against M setters on the same Tokenizer. All pass on 3.14t (4/4) and regular 3.14 (3 pass + 1 skip for the 3.14t-specific GIL check).
- Update README and __init__.py docstring describing the 3.14t behaviour and the documented compound-mutation caveat (`tokenizer.post_processor.special_tokens = X` is two Python steps, not atomic; same class as `dict[k]=v` racing `dict.clear()`).

Building 3.14t wheels:

    maturin develop --release --no-default-features --features ext-module

Regular CPython wheels are unchanged; keep `default-features = true`.
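The locking shape described in the first bullet can be sketched with the standard library alone. This is a minimal illustration, not the bindings' code: `Tokenizer`, `SharedTokenizer`, `set_lowercase`, and `race` are hypothetical stand-ins, and the real wrapper holds the full tokenizer rather than a single flag.

```rust
use std::sync::RwLock;
use std::thread;

// Hypothetical stand-in for the real tokenizer: a single setting is enough
// to show the locking pattern.
struct Tokenizer {
    lowercase: bool,
}

impl Tokenizer {
    fn encode(&self, text: &str) -> Vec<String> {
        text.split_whitespace()
            .map(|w| if self.lowercase { w.to_lowercase() } else { w.to_string() })
            .collect()
    }
}

// Hypothetical wrapper with the shape the description gives for PyTokenizer.
struct SharedTokenizer {
    inner: RwLock<Tokenizer>,
}

impl SharedTokenizer {
    fn new(tokenizer: Tokenizer) -> Self {
        Self { inner: RwLock::new(tokenizer) }
    }

    // Reader: any number of threads may hold a read guard at the same time.
    fn encode(&self, text: &str) -> Vec<String> {
        self.inner.read().unwrap().encode(text)
    }

    // Setter: takes `&self`, not `&mut self`; the write guard provides exclusivity.
    fn set_lowercase(&self, value: bool) {
        self.inner.write().unwrap().lowercase = value;
    }
}

// Race `encoders` reader threads against one setter thread on the same
// tokenizer and return the total number of completed encode calls.
fn race(encoders: usize, rounds: usize) -> usize {
    let tok = SharedTokenizer::new(Tokenizer { lowercase: false });
    thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..encoders {
            handles.push(s.spawn(|| {
                let mut done = 0usize;
                for _ in 0..rounds {
                    let out = tok.encode("Hello World");
                    // Each call runs under one read guard, so it sees one
                    // consistent setting, never a mix of the two.
                    assert!(out == ["Hello", "World"] || out == ["hello", "world"]);
                    done += 1;
                }
                done
            }));
        }
        s.spawn(|| {
            for i in 0..rounds {
                tok.set_lowercase(i % 2 == 0);
            }
        });
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    println!("completed {} encodes", race(4, 1_000));
}
```

The sketch unwraps lock results for brevity. Note that it only protects single operations: a read followed by a separate write is still two steps, which is the compound-mutation caveat the last bullet documents.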
Drop the cfg_attr(Py_GIL_DISABLED, …) gating on every #[pymodule]. PyO3 0.28's `pymodule(gil_used = false)` emits the Py_mod_gil slot only when the target Python recognizes it (3.13+); on older Python versions the slot is simply not emitted. So always declaring `gil_used = false` is a no-op on 3.10–3.12, the right thing on 3.13, and the load-bearing thing on 3.14t. Verified by building a single abi3 wheel and importing it on stock CPython 3.10 / 3.11 / 3.12 / 3.13 (all clean: import + setter work) and re-running the 3.14t stress suite (still 4/4 passing, GIL stays off as before).
Mechanical: run `cargo fmt` over tokenizer.rs (the .read().unwrap() / .write().unwrap() chains it produced were too long for one line) and `ruff format` over test_freethreaded.py. No behavioural change. 3.14t stress suite still 4/4 passing; abi3 wheel still imports cleanly on 3.10–3.13. The .pyi stub regeneration from `make style` is intentionally NOT included — the current pipeline emits stubs without docstrings (the `stub.py` enrichment step the README documents isn't actually wired up), so re-running it on this branch would shrink every .pyi by ~80% and lose all the inline doc text. Pre-existing issue, separate PR.
Output of `cargo run --manifest-path ./tools/stub-gen/Cargo.toml` against the current FT-aware build.
Two related changes that fix the silent docstring-stripping in the stub generation pipeline, plus the regenerated stubs.

1. bindings/python/Cargo.toml: pin pyo3 back to 0.28.2 (was 0.28.3). The Makefile's `make style` injects a `[patch.crates-io]` block in `.cargo/config.toml` pointing pyo3 at git rev 2ba9cda5 (which is pyo3 0.28.2 with the introspection metadata pyo3-introspection needs to read docstrings out of the cdylib). Cargo only honours a patch when the requested version matches, so requiring 0.28.3 in our deps caused cargo to silently ignore the patch. The cdylib then built against vanilla 0.28.3 from crates.io, with no docstring metadata for pyo3-introspection to find.
2. tools/stub-gen/src/main.rs: walk the introspected module and abort if no docstrings are present anywhere. The previous behaviour was to write out 7 docstring-less stubs and exit successfully, which only got noticed when the .pyi diff in a PR was -2800 lines. The new check fails loudly with a pointer at `[patch.crates-io]` drift, which is the root cause when this regresses.
3. py_src/tokenizers/*.pyi: regenerated against the patched build, so the docstring contents are back in.
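The guard described in item 2 could look roughly like the following. This is a sketch under stated assumptions: `Item`, `coverage`, and `check_docstrings` are hypothetical names over a simplified tree, not the types pyo3-introspection exposes or the code in tools/stub-gen/src/main.rs.

```rust
// Hypothetical, simplified model of an introspected module tree.
struct Item {
    name: String,
    doc: Option<String>,
    children: Vec<Item>,
}

// Count (documented, total) items in the tree rooted at `item`.
fn coverage(item: &Item) -> (usize, usize) {
    let has_doc = item.doc.as_deref().is_some_and(|d| !d.trim().is_empty());
    let mut documented = usize::from(has_doc);
    let mut total = 1;
    for child in &item.children {
        let (d, t) = coverage(child);
        documented += d;
        total += t;
    }
    (documented, total)
}

// Refuse to continue when nothing carries a docstring, instead of writing
// empty stubs and exiting successfully.
fn check_docstrings(root: &Item) -> Result<(usize, usize), String> {
    let (documented, total) = coverage(root);
    if documented == 0 {
        return Err(format!(
            "no docstrings found in any of the {total} items under `{}`; \
             check for [patch.crates-io] drift",
            root.name
        ));
    }
    Ok((documented, total))
}

// Small constructor to keep the example readable.
fn item(name: &str, doc: Option<&str>, children: Vec<Item>) -> Item {
    Item { name: name.to_string(), doc: doc.map(str::to_string), children }
}

fn main() {
    let module = item(
        "tokenizers",
        None,
        vec![item("Tokenizer", Some("A tokenizer."), vec![item("encode", None, vec![])])],
    );
    match check_docstrings(&module) {
        Ok((d, t)) => println!("Docstring coverage: {d}/{t} items carry a docstring"),
        Err(msg) => {
            eprintln!("error: {msg}");
            std::process::exit(1);
        }
    }
}
```

Failing only when the count is exactly zero keeps the check cheap and avoids false alarms on partially documented modules, at the cost of not catching a partial regression.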
The 3.14t job in `python.yml` was hitting `SystemError: init function of tokenizers returned uninitialized object` because the install step ran `pip install -e .[dev]`, which goes through maturin's PEP 660 editable path and keeps the `abi3` cargo feature on regardless of the target interpreter. Free-threaded Python can't load an abi3 extension (no limited API), so the resulting .so failed to initialize.

Fix the install + test steps to detect free-threading and switch build/test behavior:

- Install: use `maturin develop --release` directly. On a GIL-enabled interpreter, defaults are fine (abi3 on). On free-threaded, pass `--no-default-features --features ext-module` so the abi3 cargo feature is dropped and the resulting wheel is `cp314t`-tagged rather than abi3-tagged.
- Run tests: `make test` runs `cargo test --no-default-features`, which uses pyo3's `auto-initialize` and links libpython. Free-threaded Python on the macOS runner doesn't ship libpython3.14t in the framework path, so on 3.14t we run only `make test-py` and skip the cargo half.
- Makefile: split `test` into `test-py` (just pytest) and `test-rs` (cargo test); keep the original `test` target as `test-py + test-rs` for parity. Lets CI pick the appropriate subset per interpreter without duplicating the test command line.

Verified locally on 3.14t: 195 pytest items pass (4 new test_freethreaded.py stress tests included). The 2 documentation-test failures are a pre-existing truncated `tokenizer-wiki.json` fixture issue, unrelated to this PR.
`make check-style` runs the stub-gen tool, which calls `maturin develop --release` with the default cargo features (abi3 on) and then imports the cdylib for introspection. abi3 extensions can't load on free-threaded Python, so on 3.14t the import fails with the familiar `SystemError: init function of tokenizers returned uninitialized object`. Style checks (rustfmt, ruff, ty, stub-gen) are matrix-invariant, so gate to a single canonical combo (ubuntu-latest + 3.14) — avoids the 3.14t failure and also drops 3 redundant runs from the matrix.
…pply
Even with the dependency requirement at "0.28.2" (i.e. ^0.28.2),
cargo's resolver picks the highest matching version on crates.io —
0.28.3 — and the patched git source at rev 2ba9cda5 has manifest
version 0.28.2, so the patch's source-version pair doesn't match the
resolved 0.28.3 and cargo emits:
warning: patch `pyo3 v0.28.2 (...)` was not used in the crate graph
The Makefile's `cargo update` doesn't downgrade — it only refreshes
within the existing requirement. Pinning exactly (`=0.28.2`) forces
the resolver to that version, which then matches the patch's source.
Switched the three pyo3 dep entries:
pyo3 0.28.2 -> =0.28.2
pyo3-ffi 0.28 -> =0.28.2
pyo3-build-config 0.28 -> =0.28.2 (build-dep)
Verified: `make style` now shows
Docstring coverage: 188/483 items carry a docstring
with no "patch was not used" warning, and the regenerated stubs are
docstring-rich. 3.14t stress suite still 4/4 passing.
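The difference between the default requirement and the exact pin can be illustrated with a hand-rolled matcher. This is a deliberately simplified model, not cargo's resolver: `Version`, `Req`, and `resolve` are hypothetical names, and pre-release tags, lockfiles, and multi-crate constraints are ignored.

```rust
// Simplified semantic version; the derived ordering compares major, then
// minor, then patch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

// Two of the requirement forms a Cargo.toml dependency can use.
enum Req {
    Caret(Version), // "0.28.2" is shorthand for "^0.28.2"
    Exact(Version), // "=0.28.2"
}

// Exclusive upper bound of a caret requirement: the leftmost non-zero
// component is the one that must not change.
fn caret_upper_bound(min: Version) -> Version {
    if min.major > 0 {
        Version { major: min.major + 1, minor: 0, patch: 0 }
    } else if min.minor > 0 {
        Version { major: 0, minor: min.minor + 1, patch: 0 }
    } else {
        Version { major: 0, minor: 0, patch: min.patch + 1 }
    }
}

impl Req {
    fn matches(&self, v: Version) -> bool {
        match self {
            Req::Exact(want) => v == *want,
            Req::Caret(min) => v >= *min && v < caret_upper_bound(*min),
        }
    }
}

// Pick the highest available version that satisfies the requirement.
fn resolve(req: &Req, available: &[Version]) -> Option<Version> {
    available.iter().copied().filter(|v| req.matches(*v)).max()
}

fn main() {
    let v0_28_2 = Version { major: 0, minor: 28, patch: 2 };
    let v0_28_3 = Version { major: 0, minor: 28, patch: 3 };
    let published = [v0_28_2, v0_28_3];

    // The caret form admits 0.28.3, so the 0.28.2 patch source goes unused.
    println!("caret: {:?}", resolve(&Req::Caret(v0_28_2), &published));
    // The exact pin can only resolve to 0.28.2, which the patch can replace.
    println!("exact: {:?}", resolve(&Req::Exact(v0_28_2), &published));
}
```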
McPatate (Member) reviewed on Apr 27, 2026 and left a comment:
the only thing I'm mad about is that you unwrap everywhere, other than that lgtm! 🔥
Two follow-ups on the PyTokenizer locking work, per review.

1. Wrap the inner tokenizer in Arc<RwLock<…>> instead of just RwLock<…>. Restores the cheap Clone semantics the pre-RwLock PyTokenizer had: `clone()` is now a refcount bump rather than a deep copy of the entire Tokenizer (model, normalizer, post-processor, etc.). Matches how component wrappers elsewhere in the bindings already share their inner state.
2. Stop unwrapping lock acquisitions; propagate errors instead. Add `read_inner()` / `write_inner()` helpers that map a poisoned RwLock to a `PyException` and return `PyResult<RwLock*Guard>`. Every call site goes through them with `?`, including the one in decoders.rs used by `step_decode_stream`. Methods that previously returned a plain type and now use one of the helpers were widened to `PyResult<T>` accordingly. PyO3 treats `T` and `PyResult<T>` identically on the Python side, so there's no public API change, just an explicit failure path for the (rare) case of lock poisoning, instead of an opaque process panic.

Verified: 190 regular tests pass on CPython 3.14, 4/4 stress tests pass on 3.14t. The Arc::clone is observable as a faster `t = clone(t)` for any caller that does it.
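Both follow-ups can be sketched without PyO3. In this minimal, std-only illustration, `LockError` is a hypothetical stand-in for the `PyException` the bindings raise, `Result<T, LockError>` plays the role of `PyResult<T>`, and `Handle`, `Tokenizer`, `max_length`, and `poison` are invented for the example; only the `read_inner` / `write_inner` names come from the commit message.

```rust
use std::fmt;
use std::sync::{Arc, RwLock, RwLockReadGuard, RwLockWriteGuard};
use std::thread;

// Hypothetical error type standing in for the Python exception.
#[derive(Debug)]
struct LockError(&'static str);

impl fmt::Display for LockError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "tokenizer lock poisoned during {}", self.0)
    }
}

// Hypothetical stand-in for the real tokenizer.
#[derive(Default)]
struct Tokenizer {
    max_length: usize,
}

// Cloning copies the Arc (a refcount bump), so clones share one tokenizer.
#[derive(Clone)]
struct Handle {
    inner: Arc<RwLock<Tokenizer>>,
}

impl Handle {
    fn new(tokenizer: Tokenizer) -> Self {
        Self { inner: Arc::new(RwLock::new(tokenizer)) }
    }

    // Map a poisoned lock to an error instead of unwrapping.
    fn read_inner(&self) -> Result<RwLockReadGuard<'_, Tokenizer>, LockError> {
        self.inner.read().map_err(|_| LockError("read"))
    }

    fn write_inner(&self) -> Result<RwLockWriteGuard<'_, Tokenizer>, LockError> {
        self.inner.write().map_err(|_| LockError("write"))
    }

    // Call sites use `?`, so their return types widen to `Result`.
    fn max_length(&self) -> Result<usize, LockError> {
        Ok(self.read_inner()?.max_length)
    }

    fn set_max_length(&self, value: usize) -> Result<(), LockError> {
        self.write_inner()?.max_length = value;
        Ok(())
    }
}

// Poison the lock by panicking in another thread while holding the write guard.
fn poison(handle: &Handle) {
    let h = handle.clone();
    let _ = thread::spawn(move || {
        let _guard = h.inner.write().unwrap();
        panic!("simulated failure while holding the write lock");
    })
    .join();
}

fn main() {
    let a = Handle::new(Tokenizer::default());
    let b = a.clone();
    b.set_max_length(128).expect("lock is healthy");
    println!("a sees max_length = {:?}", a.max_length());

    poison(&a);
    match a.max_length() {
        Ok(n) => println!("unexpected success: {n}"),
        Err(e) => println!("error surfaced to the caller: {e}"),
    }
}
```

One consequence worth keeping in mind: because clones share the same `Arc`, a poisoned lock is visible through every clone, and so is every mutation.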
Force-pushed from cd7c0b2 to b652b1a.


Adds dedicated 3.14t support to the Python bindings without breaking the regular CPython API surface.

The release workflow was updated, but we did not specify "use_gil", so it would be a pointless release.

Key changes:

- Wrap PyTokenizer's inner Tokenizer in `std::sync::RwLock`.
- Each #[pymodule] is now declared as `#[pymodule(gil_used = false)]`.
- Promote `abi3` from a hardcoded pyo3 dep-feature to a project-level cargo feature (default on). This allows building without abi3 on free-threaded Python (`maturin develop --no-default-features --features ext-module`), since abi3 / the limited API is not available under free-threading.
- Add bindings/python/tests/test_freethreaded.py: stress tests racing N encoders against M setters on the same Tokenizer. All pass on 3.14t (4/4) and regular 3.14 (3 pass + 1 skip for the 3.14t-specific GIL check).

Building 3.14t wheels:

    maturin develop --release --no-default-features --features ext-module