[ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations#22878
[ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations#22878
Conversation
Typing conformance resultsNo changes detected ✅ |
|
|
|
Those are all just our usual non-deterministic flakes in the primer report, nothing to worry about |
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
invalid-return-type |
5 | 0 | 4 |
invalid-argument-type |
1 | 4 | 2 |
invalid-parameter-default |
0 | 0 | 7 |
invalid-assignment |
0 | 4 | 2 |
unused-type-ignore-comment |
2 | 0 | 0 |
| Total | 8 | 8 | 15 |
|
Yeah i mostly just ran the ecosystem stuff to give my self the best (worst) chance of hitting a horrible crasher or something |
carljm
left a comment
There was a problem hiding this comment.
Looks overall reasonable!
The inline comment seems like a not great situation to me; if we land this as-is I wouldn't want to defer that TODO for very long.
| // TODO: We store (sub)string annotations now but the code is a bit inconsistent about | ||
| // only doing it once, so downgrade this into `MultiInferenceState::Overwrite` |
There was a problem hiding this comment.
Do we understand where this inconsistency comes from? Do we infer the same string annotation multiple times sometimes?
I don't love the combination of this handwavy exception, with "we don't validate overflow of the node index scheme", since I think it means that if we did have an overlap, rather than seeing any kind of debuggable panic, we'd see highly confusing overwriting of types for random nodes.
There was a problem hiding this comment.
You were totally right to chaffe at this combination, I indeed had conflicts in my encoding from sub-sub-ASTs being possible and me not dealing with that. My new encoding is robust to that and includes rigorous checks and errors for all the boundary conditions (while also setting a hard requirement that you can't have triply-nested string annotations).
|
I should probably write some more tests for these whacky wild sub-sub-AST situations... |
|
(That said I'd be interested in concurrent review of the approach; whether we're ok with accepting the constraints the comments discuss) |
|
I think the documented limitations are manageable, given that if anyone runs into them the solution will likely be quite simple ("stop using string-within-string annotations in your massive file"). How hard would it be to have the failure case be "we don't assign types to your sub-nodes" rather than "panic"? (I just read the comment, didn't review the implementation yet, so apologies if that is already the implementation) |
|
I've removed all the panics yeah, it all produces a parse error which I think becomes Unknown (except for one case that's like "panic if someone did a bug in the bitmasking code") |
|
h'ok I've updated semantic-tokens, goto-*, and hover to properly use the new machinery and added a pile of tests to confirm it works. |
|
This looks reasonable to me at a high level but I'd love to have @ibraheemdev take a look. |
ibraheemdev
left a comment
There was a problem hiding this comment.
This is pretty clever, and makes sense to me. I'm slightly uncomfortable with the limits here, but it seems fine for now.
An alternative approach would be to store larger indices indirectly in a separate table in the sub-AST parsed module, but I imagine that makes it trickier to treat sub-AST expressions like regular nodes, so this seems like a fair compromise.
| pub index: u32, | ||
| pub nodes: Vec<AnyRootNodeRef<'a>>, | ||
| pub max_index: u32, | ||
| pub nodes: Option<Vec<AnyRootNodeRef<'a>>>, |
There was a problem hiding this comment.
This seems to always be set to Some?
There was a problem hiding this comment.
Whoops, good catch! (It should be None for the new path, since we don't care about that result at all)
* main: (76 commits) [ty] Improve the check for `NewType`s with generic bases (#22961) [ty] Ban legacy `TypeVar` bounds or constraints from containing type variables (#22949) Bump the typing conformance suite pin (#22960) [ty] Emit an error if a TypeVarTuple is used to subscript `Generic` or `Protocol` without being unpacked (#22952) [ty] Reduce false positives when subscripting classes generic over `TypeVarTuple`s (#22950) [ty] Detect invalid attempts to subclass `Protocol[]` and `Generic[]` simultaneously (#22948) Fix suppression indentation matching (#22903) Remove hidden `--output-format` warning (#22944) [ty] Validate signatures of dataclass `__post_init__` methods (#22730) [ty] extend special-cased `numbers` diagnostic to `invalid-argument-type` errors (#22938) [ty] Avoid false positive for `not-iterable` with no-positive intersection types (#22089) [ty] Preserve pure negation types in descriptor protocol (#22907) [ty] add special-case diagnostic for `numbers` module (#22931) [ty] Move the location of more `invalid-overload` diagnostics (#22933) [ty] Fix unary and comparison operators for TypeVars with union bounds (#22925) [ty] Rule Selection: ignore/warn/select all rules (unless subsequently overriden) (#22832) [ty] Fix TypedDict construction from existing TypedDict values (#22904) [ty] fix bug in string annotations and clean up diagnostics (#22913) [ty] Improve support for goto-type, goto-declaration, hover, and highlighting of string annotations (#22878) [ty] Rename old typing imports to new on `unresolved-reference`. (#22827) ...
I ultimately hunted down the core issue with my previous approach to "if you give string annotation sub-AST nodes any kind of NodeIndex, a bunch of random places in the code will start thinking it's ok to store info about them, which is a problem because they all count up from index 0 which creates conflicts/crashes".
Rather than trying to play whackamole I decided to create a scheme to give string annotation nodes a unique NodeIndex by shifting up the parent node's index -- so 0xAB's sub-AST nodes look like 0xAB00...0000, 0xAB00..0001, etc. This scheme avoids any collisions for any reasonable AST (most string annotations are like, a dozen sub-nodes, so they need maybe 4 or 5 bits, which would require hundreds of MB of python code to run out of bits...).
As a bonus, this admits an extremely simple implementation of recording and fetching sub-AST types... they just are stored now, and you can just pass in their NodeIndex and get back actual results.