Skip to content

[ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations#22878

Merged
Gankra merged 10 commits intomainfrom
gankra/string-annot3
Jan 28, 2026
Merged

[ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations#22878
Gankra merged 10 commits intomainfrom
gankra/string-annot3

Conversation

@Gankra
Copy link
Contributor

@Gankra Gankra commented Jan 26, 2026

I ultimately hunted down the core issue with my previous approach to "if you give string annotation sub-AST nodes any kind of NodeIndex, a bunch of random places in the code will start thinking it's ok to store info about them, which is a problem because they all count up from index 0 which creates conflicts/crashes".

Rather than trying to play whackamole I decided to create a scheme to give string annotation nodes a unique NodeIndex by shifting up the parent node's index -- so 0xAB's sub-AST nodes look like 0xAB00...0000, 0xAB00..0001, etc. This scheme avoids any collisions for any reasonable AST (most string annotations are like, a dozen sub-nodes, so they need maybe 4 or 5 bits, which would require hundreds of MB of python code to run out of bits...).

As a bonus, this admits an extremely simple implementation of recording and fetching sub-AST types... they just are stored now, and you can just pass in their NodeIndex and get back actual results.

@Gankra Gankra added the ty Multi-file analysis & type inference label Jan 26, 2026
@astral-sh-bot
Copy link

astral-sh-bot bot commented Jan 26, 2026

Typing conformance results

No changes detected ✅

@astral-sh-bot
Copy link

astral-sh-bot bot commented Jan 26, 2026

mypy_primer results

Changes were detected when running on open source projects
pwndbg (https://github.com/pwndbg/pwndbg)
- pwndbg/aglib/kernel/__init__.py:264:20: error[unsupported-operator] Operator `+` is not supported between objects of type `None | Unknown` and `Literal[1]`
+ pwndbg/aglib/kernel/__init__.py:264:20: error[unsupported-operator] Operator `+` is not supported between objects of type `Unknown | None` and `Literal[1]`
- pwndbg/aglib/kernel/__init__.py:270:22: error[unsupported-operator] Operator `+` is not supported between objects of type `None | Unknown` and `int`
+ pwndbg/aglib/kernel/__init__.py:270:22: error[unsupported-operator] Operator `+` is not supported between objects of type `Unknown | None` and `int`

prefect (https://github.com/PrefectHQ/prefect)
+ src/integrations/prefect-dbt/prefect_dbt/core/settings.py:94:28: error[invalid-assignment] Object of type `dict[Any, Any] | int | dict[str, Any] | ... omitted 4 union elements` is not assignable to `dict[str, Any]`
+ src/integrations/prefect-dbt/prefect_dbt/core/settings.py:99:28: error[invalid-assignment] Object of type `int | dict[Any, Any] | float | ... omitted 3 union elements` is not assignable to `dict[str, Any]`
+ src/prefect/cli/deploy/_core.py:86:21: error[invalid-assignment] Object of type `dict[Any, Any] | int | dict[str, Any] | ... omitted 4 union elements` is not assignable to `dict[str, Any]`
+ src/prefect/cli/deploy/_core.py:87:21: error[invalid-assignment] Object of type `int | dict[Any, Any] | float | ... omitted 3 union elements` is not assignable to `dict[str, Any]`
+ src/prefect/deployments/steps/core.py:137:38: error[invalid-argument-type] Argument is incorrect: Argument type `dict[Any, Any] | int | dict[str, Any] | ... omitted 4 union elements` does not satisfy constraints (`str`, `int`, `int | float`, `bool`, `dict[Any, Any]`, `list[Any]`, `None`) of type variable `T`
- src/prefect/utilities/templating.py:320:13: error[invalid-assignment] Invalid subscript assignment with key of type `object` and value of type `Unknown | dict[str, Any]` on object of type `dict[str, Any]`
+ src/prefect/utilities/templating.py:320:13: error[invalid-assignment] Invalid subscript assignment with key of type `object` and value of type `Unknown | int | dict[str, Any] | ... omitted 4 union elements` on object of type `dict[str, Any]`
- src/prefect/utilities/templating.py:323:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_block_document_references | dict[str, Any]`, found `list[Unknown | dict[str, Any]]`
+ src/prefect/utilities/templating.py:323:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_block_document_references | dict[str, Any]`, found `list[Unknown | int | dict[str, Any] | ... omitted 4 union elements]`
- src/prefect/utilities/templating.py:437:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `dict[object, Unknown]`
+ src/prefect/utilities/templating.py:437:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `dict[object, Unknown | int | float | ... omitted 4 union elements]`
- src/prefect/utilities/templating.py:442:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `list[Unknown]`
+ src/prefect/utilities/templating.py:442:16: error[invalid-return-type] Return type does not match returned value: expected `T@resolve_variables`, found `list[Unknown | int | float | ... omitted 4 union elements]`
- src/prefect/workers/base.py:232:13: error[invalid-argument-type] Argument is incorrect: Argument type `str | dict[str, Any]` does not satisfy constraints (`str`, `int`, `int | float`, `bool`, `dict[Any, Any]`, `list[Any]`, `None`) of type variable `T`
+ src/prefect/workers/base.py:232:13: error[invalid-argument-type] Argument is incorrect: Argument type `str | int | dict[str, Any] | ... omitted 3 union elements` does not satisfy constraints (`str`, `int`, `int | float`, `bool`, `dict[Any, Any]`, `list[Any]`, `None`) of type variable `T`
+ src/prefect/workers/base.py:234:20: error[invalid-argument-type] Argument expression after ** must be a mapping type: Found `int | Unknown | float | ... omitted 4 union elements`
- Found 5368 diagnostics
+ Found 5374 diagnostics

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
- src/scikit_build_core/build/wheel.py:99:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 47 diagnostics
+ Found 46 diagnostics

core (https://github.com/home-assistant/core)
- homeassistant/util/variance.py:47:12: error[invalid-return-type] Return type does not match returned value: expected `(**_P@ignore_variance) -> _R@ignore_variance`, found `_Wrapped[_P@ignore_variance, int | _R@ignore_variance | float | datetime, _P@ignore_variance, _R@ignore_variance | int | float | datetime]`
- Found 14476 diagnostics
+ Found 14475 diagnostics

No memory usage changes detected ✅

@Gankra Gankra added the server Related to the LSP server label Jan 26, 2026
@astral-sh-bot
Copy link

astral-sh-bot bot commented Jan 26, 2026

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@AlexWaygood
Copy link
Member

Those are all just our usual non-deterministic flakes in the primer report, nothing to worry about

@astral-sh-bot
Copy link

astral-sh-bot bot commented Jan 26, 2026

ecosystem-analyzer results

Lint rule Added Removed Changed
invalid-return-type 5 0 4
invalid-argument-type 1 4 2
invalid-parameter-default 0 0 7
invalid-assignment 0 4 2
unused-type-ignore-comment 2 0 0
Total 8 8 15

Full report with detailed diff (timing results)

@Gankra
Copy link
Contributor Author

Gankra commented Jan 26, 2026

Yeah i mostly just ran the ecosystem stuff to give my self the best (worst) chance of hitting a horrible crasher or something

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks overall reasonable!

The inline comment seems like a not great situation to me; if we land this as-is I wouldn't want to defer that TODO for very long.

Comment on lines 9729 to 9730
// TODO: We store (sub)string annotations now but the code is a bit inconsistent about
// only doing it once, so downgrade this into `MultiInferenceState::Overwrite`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we understand where this inconsistency comes from? Do we infer the same string annotation multiple times sometimes?

I don't love the combination of this handwavy exception, with "we don't validate overflow of the node index scheme", since I think it means that if we did have an overlap, rather than seeing any kind of debuggable panic, we'd see highly confusing overwriting of types for random nodes.

Copy link
Contributor Author

@Gankra Gankra Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You were totally right to chaffe at this combination, I indeed had conflicts in my encoding from sub-sub-ASTs being possible and me not dealing with that. My new encoding is robust to that and includes rigorous checks and errors for all the boundary conditions (while also setting a hard requirement that you can't have triply-nested string annotations).

@AlexWaygood AlexWaygood removed their request for review January 27, 2026 13:36
@Gankra
Copy link
Contributor Author

Gankra commented Jan 27, 2026

I should probably write some more tests for these whacky wild sub-sub-AST situations...

@Gankra
Copy link
Contributor Author

Gankra commented Jan 27, 2026

(That said I'd be interested in concurrent review of the approach; whether we're ok with accepting the constraints the comments discuss)

@carljm
Copy link
Contributor

carljm commented Jan 27, 2026

I think the documented limitations are manageable, given that if anyone runs into them the solution will likely be quite simple ("stop using string-within-string annotations in your massive file").

How hard would it be to have the failure case be "we don't assign types to your sub-nodes" rather than "panic"? (I just read the comment, didn't review the implementation yet, so apologies if that is already the implementation)

@Gankra
Copy link
Contributor Author

Gankra commented Jan 27, 2026

I've removed all the panics yeah, it all produces a parse error which I think becomes Unknown (except for one case that's like "panic if someone did a bug in the bitmasking code")

@Gankra
Copy link
Contributor Author

Gankra commented Jan 27, 2026

h'ok I've updated semantic-tokens, goto-*, and hover to properly use the new machinery and added a pile of tests to confirm it works.

@Gankra Gankra changed the title [ty] Add support for goto-type, goto-declaration, and hover of string annotations [ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations Jan 27, 2026
@carljm
Copy link
Contributor

carljm commented Jan 27, 2026

This looks reasonable to me at a high level but I'd love to have @ibraheemdev take a look.

@carljm carljm requested a review from ibraheemdev January 27, 2026 21:10
Copy link
Member

@ibraheemdev ibraheemdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty clever, and makes sense to me. I'm slightly uncomfortable with the limits here, but it seems fine for now.

An alternative approach would be to store larger indices indirectly in a separate table in the sub-AST parsed module, but I imagine that makes it trickier to treat sub-AST expressions like regular nodes, so this seems like a fair compromise.

pub index: u32,
pub nodes: Vec<AnyRootNodeRef<'a>>,
pub max_index: u32,
pub nodes: Option<Vec<AnyRootNodeRef<'a>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to always be set to Some?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, good catch! (It should be None for the new path, since we don't care about that result at all)

@Gankra Gankra merged commit 45acbd0 into main Jan 28, 2026
48 checks passed
@Gankra Gankra deleted the gankra/string-annot3 branch January 28, 2026 04:26
carljm added a commit that referenced this pull request Jan 30, 2026
* main: (76 commits)
  [ty] Improve the check for `NewType`s with generic bases (#22961)
  [ty] Ban legacy `TypeVar` bounds or constraints from containing type variables (#22949)
  Bump the typing conformance suite pin (#22960)
  [ty] Emit an error if a TypeVarTuple is used to subscript `Generic` or `Protocol` without being unpacked (#22952)
  [ty] Reduce false positives when subscripting classes generic over `TypeVarTuple`s (#22950)
  [ty] Detect invalid attempts to subclass `Protocol[]` and `Generic[]` simultaneously (#22948)
  Fix suppression indentation matching (#22903)
  Remove hidden `--output-format` warning (#22944)
  [ty] Validate signatures of dataclass `__post_init__` methods (#22730)
  [ty] extend special-cased `numbers` diagnostic to `invalid-argument-type` errors (#22938)
  [ty] Avoid false positive for `not-iterable` with no-positive intersection types (#22089)
  [ty] Preserve pure negation types in descriptor protocol (#22907)
  [ty] add special-case diagnostic for `numbers` module (#22931)
  [ty] Move the location of more `invalid-overload` diagnostics (#22933)
  [ty] Fix unary and comparison operators for TypeVars with union bounds (#22925)
  [ty] Rule Selection: ignore/warn/select all rules (unless subsequently overriden) (#22832)
  [ty] Fix TypedDict construction from existing TypedDict values (#22904)
  [ty] fix bug in string annotations and clean up diagnostics (#22913)
  [ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations (#22878)
  [ty] Rename old typing imports to new on `unresolved-reference`. (#22827)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ecosystem-analyzer server Related to the LSP server ty Multi-file analysis & type inference

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Go to Definition not working for stringified nested class Hover rendering has no title on non-trival stringified annotations

4 participants