Commit 02ec968
Add work-in-progress implementation of a new Python parser (#20856)
The new "native" parser (`mypy.nativeparse`) will eventually replace the
current parser (`mypy.fastparse`). The native parser uses a Rust
extension that wraps the Ruff parser to generate a serialized AST, and
mypy will deserialize the AST directly into a mypy AST. The binary
format is the same one we already use for mypy fixed-format incremental
caches.
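To illustrate deserializing a flat binary buffer directly into AST node objects, here is a toy sketch. The tag values, node classes, and byte layout are invented for illustration; they are not the actual `ast_serialize` output or the mypy fixed-format cache format.

```python
import struct
from dataclasses import dataclass

# Hypothetical node tags; the real format uses a different encoding.
TAG_INT = 1
TAG_NAME = 2
TAG_ADD = 3


@dataclass
class IntExpr:
    value: int


@dataclass
class NameExpr:
    name: str


@dataclass
class AddExpr:
    left: object
    right: object


def read_node(data: bytes, pos: int = 0):
    """Decode one node starting at pos and return (node, new_pos)."""
    tag = data[pos]
    pos += 1
    if tag == TAG_INT:
        # 4-byte little-endian signed integer payload.
        (value,) = struct.unpack_from("<i", data, pos)
        return IntExpr(value), pos + 4
    if tag == TAG_NAME:
        # One length byte followed by UTF-8 bytes.
        length = data[pos]
        pos += 1
        name = data[pos:pos + length].decode("utf-8")
        return NameExpr(name), pos + length
    if tag == TAG_ADD:
        # Two child nodes stored back to back.
        left, pos = read_node(data, pos)
        right, pos = read_node(data, pos)
        return AddExpr(left, right), pos
    raise ValueError(f"unknown tag {tag}")


# Serialized form of `x + 1` in this toy format.
buf = bytes([TAG_ADD, TAG_NAME, 1]) + b"x" + bytes([TAG_INT]) + struct.pack("<i", 1)
tree, end = read_node(buf)
```

The point of the sketch is that the reader builds the final node objects in a single pass over the buffer, with no intermediate Python-level AST in between.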
This is still work in progress and some features aren't supported. The
most important missing feature is probably function type comments. Also,
the Rust extension needs to be manually compiled from
https://github.com/mypyc/ast_serialize. Refer to the `ast_serialize`
repository for instructions. There is no CI support for the new parser
yet: there are tests, but they are skipped unless the
`ast_serialize` extension is installed, and CI doesn't install it.
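A minimal sketch of skipping tests when the extension is missing, assuming the compiled module is importable under the name `ast_serialize`. The test class is hypothetical and this is not necessarily how mypy's own test suite implements the skip.

```python
import importlib.util
import unittest

# Assumption: the compiled Rust extension is importable as `ast_serialize`.
HAS_AST_SERIALIZE = importlib.util.find_spec("ast_serialize") is not None


@unittest.skipUnless(HAS_AST_SERIALIZE, "ast_serialize extension is not installed")
class NativeParserSmokeTest(unittest.TestCase):
    """Hypothetical test case that only runs when the extension is available."""

    def test_extension_imports(self) -> None:
        import ast_serialize  # noqa: F401
```

With this pattern the suite reports the tests as skipped instead of failing on machines (or CI jobs) without the extension.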
Once the Rust extension is installed, use `--native-parser` to enable
the new parser. The main type checker test suite can be run using the
native parser via `TEST_NATIVE_PARSER=1 pytest mypy/test/testcheck.py`
(the `TEST_NATIVE_PARSER` environment variable needs to be set). A bunch
of tests are still failing.
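For scripting the test run, the invocation above could be built roughly like this. The helper name is hypothetical, and the test module path is assumed to be `mypy/test/testcheck.py`; run it from the root of a mypy checkout.

```python
import os
import sys


def native_parser_test_command() -> tuple[list[str], dict[str, str]]:
    """Return (command, environment) for running the checker tests natively."""
    env = dict(os.environ)
    # The test suite only uses the native parser when this variable is set.
    env["TEST_NATIVE_PARSER"] = "1"
    cmd = [sys.executable, "-m", "pytest", "mypy/test/testcheck.py"]
    return cmd, env


cmd, env = native_parser_test_command()
# To actually run the tests:
#     import subprocess
#     subprocess.run(cmd, env=env, check=True)
```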
Related issue with more context: #19776
Remaining work is tracked here for now:
https://github.com/mypyc/ast_serialize/issues
Here are the expected benefits over the old mypy parser, adapted from
the docstring of `mypy/nativeparse.py`:
* No intermediate non-mypyc Python-level AST created, to improve
  performance
* Parsing doesn't need GIL => can use multithreading to construct
  serialized ASTs in parallel
* Produce import dependencies without having to build an AST => helps
  parallel type checking
* Support all Python syntax even if mypy is running on an older Python
  version
* Generate an AST even if there are syntax errors
* Potential to support incremental parsing (quickly process modified
  sections in a file)
* Stripping function bodies in third-party code can happen earlier, for
  extra performance
* We have the option to easily add support for `# mypy: ignore` comments
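As a rough sketch of the multithreading point in the list above: if the native parse call releases the GIL, a thread pool can serialize many files concurrently. `parse_to_bytes` here is a hypothetical stand-in, not the real `ast_serialize` API; it only encodes the source text and does not release the GIL itself.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_to_bytes(source: str) -> bytes:
    """Hypothetical stand-in for a GIL-releasing native parse call."""
    # A real extension would return a serialized AST; this just encodes the text.
    return source.encode("utf-8")


def parse_all(sources: dict[str, str], max_workers: int = 4) -> dict[str, bytes]:
    """Serialize every source in a thread pool, keyed by file name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as its inputs.
        results = pool.map(parse_to_bytes, sources.values())
        return dict(zip(sources.keys(), results))


serialized = parse_all({"a.py": "x = 1\n", "b.py": "import os\n"})
```

The serialized buffers could then be deserialized into mypy ASTs on the main thread, which is the part that still needs the GIL.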
Most of the code is straightforward and repetitive deserialization code.
I used plenty of coding agent assistance to implement deserialization and to
add tests. The tests are separate from the pre-existing parser tests,
but we can unify them later (or delete the old tests once we delete the
old parser).
@ilevkivskyi contributed to this PR.
---------
Co-authored-by: Ivan Levkivskyi <levkivskyi@gmail.com>
1 parent a62b691 commit 02ec968
16 files changed, +6637 -1 lines changed (in mypy, test, test-data/unit)