perf: improve general performance#172
Conversation
d1eec34 to
abd3d1d
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #172 +/- ##
==========================================
- Coverage 95.43% 95.37% -0.06%
==========================================
Files 7 7
Lines 2147 2142 -5
Branches 481 483 +2
==========================================
- Hits 2049 2043 -6
Misses 56 56
- Partials 42 43 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
MatthieuDartiailh
left a comment
There was a problem hiding this comment.
Thanks for starting this, just one small question. Also you may need some convincing for Mypy to be happy with you without all the assertions.
We try to use sets for quick lookups and cache the result of some repetitive operations to speed up some operations. We also remove a good deal of asserts embedded within the code that adde extra overhead.
abd3d1d to
7ac2792
Compare
d5a0515 to
7ac2792
Compare
|
This looks good to me even though the removal of some assertion looks slightly scary to me but I think it is fine. I will merge shortly. |
|
Regarding version checks I wonder if we could try to cythonize the code and use tempita to actually remove the runtime version checks. Is this something you would be interested in ? H5py uses tempita with their cython code h5py/h5py#2637 |
I have Cythonised this library in the past as an experiment and observed a general 2x speedup. Whilst not massive, it still a good improvement. So I would definitely be interested in this project moving to native. What I would ask is that wheels are built for as many platforms as possible, including e.g. alpine/musl. |
|
cibuildwheel make the process rather straightforward so I have no problem on this front. |
We try to use sets for quick lookups and cache the result of some repetitive operations to speed up some operations. We also remove a good deal of asserts embedded within the code that adde extra overhead. Where appropriate, these should be replaced with actual tests.
The current change squeezes about 3/4% performance from a decompile/recompile loop.
Benchmarking
The following round-trip script has been used to benchmark the performance of the proposed changes
Baseline: 1.10 +/- 0.02
This PR: 1.03 +/- 0.02