Commit 853caad
optimizations to improve performance, add ormar-utils (#1571)
* optimizations to improve performance, add optional ormar-utils package written in rust
* update lock
* update coverage and lock
* bump poetry in workflows
* bump lock
* make rust utils required dep and simplify the code to only use optimized versions
* update lock
* Optimize hot paths with caching and Rust reverse alias map
Profile-driven optimizations targeting the most expensive ormar functions.
Key changes:
- Cache alias<->field_name mappings per model class, using Rust
build_reverse_alias_map for O(1) lookups (was O(n) linear scan,
called 406K times in profiling)
- Cache (col_name, field_name) pairs to avoid repeated SA column
iteration in own_table_columns and extract_prefixed_table_columns
- Use set instead of list for selected_columns membership checks
- Cache get_name(lower=True), extract_db_own_fields, ormar_fields_set,
and ForeignKey constructors dict
- Use frozenset for RelationProxy method check
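The alias caching above can be illustrated with a minimal pure-Python sketch. It does not call the Rust `build_reverse_alias_map` (its signature is not shown here); `FIELD_TO_ALIAS`, `_REVERSE_ALIAS_CACHE`, `field_name_linear`, `reverse_alias_map` and `Person` are hypothetical names used only for illustration.

```python
from typing import Dict

# Hypothetical field_name -> database alias mapping for one model class.
FIELD_TO_ALIAS: Dict[str, str] = {
    "id": "id",
    "first_name": "fname",
    "last_name": "lname",
}

# One reverse map per model class, built on first use and then reused.
_REVERSE_ALIAS_CACHE: Dict[type, Dict[str, str]] = {}


def field_name_linear(alias: str, field_to_alias: Dict[str, str]) -> str:
    """Old shape: scan every field until the alias matches, O(n) per call."""
    for field_name, field_alias in field_to_alias.items():
        if field_alias == alias:
            return field_name
    return alias


def reverse_alias_map(model_cls: type, field_to_alias: Dict[str, str]) -> Dict[str, str]:
    """New shape: build alias -> field_name once per class, then O(1) lookups."""
    cached = _REVERSE_ALIAS_CACHE.get(model_cls)
    if cached is None:
        cached = {alias: name for name, alias in field_to_alias.items()}
        _REVERSE_ALIAS_CACHE[model_cls] = cached
    return cached


class Person:  # placeholder model class, used only as a cache key
    pass


reverse = reverse_alias_map(Person, FIELD_TO_ALIAS)
```

Both functions return the same answer; the cached map only pays the scan once per class instead of once per lookup.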
End-to-end benchmark improvements:
- iterate: 24-30% faster
- first: 26-36% faster
- get_all: 18-19% faster
- saving: 17-25% faster
- select_related: 12-17% faster
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* bump ormar-utils version
* add nocover to alias dict access, not hit on normal usage
* chore: regenerate lock and reorder imports after rebase
Post-rebase tidying: poetry lock bumped to 2.3.3 plus mkdocstrings
patch bump, and ruff reordered the third-party `ormar_rust_utils`
import in queryset/utils.py.
* perf: cache & specialize _process_kwargs hot path (#1649)
Profile-driven refactor of NewBaseModel._process_kwargs, which fires on
every Model.__init__ (user construction and row hydration). Removes
redundant per-init work and skips no-op conversion calls for fields that
are neither JSON nor bytes.
Changes:
- Cache _pydantic_field_names, _extra_is_ignore, _allowed_kwarg_names
on the class (lazy-populated on first init); installed via metaclass
add_cached_properties alongside the existing _json_fields/_bytes_fields
caches.
- Replace nested _convert_to_bytes(_convert_json(...)) wrapping with an
explicit dispatch loop. Common path (regular ormar field, no JSON, no
bytes) avoids the function-call overhead entirely.
- Inline _remove_extra_parameters_if_they_should_be_ignored behind the
cached _extra_is_ignore bool; remove the method (no external callers).
- Remove now-unused _convert_to_bytes / _convert_json methods and the
orphaned _convert_json entry in quick_access_views.
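A rough sketch of the dispatch-loop shape described above, under stated assumptions: `process_kwargs_sketch` is a hypothetical name, and the JSON and base64 conversions are simplified placeholders rather than ormar's actual conversion rules.

```python
import base64
import json
from typing import Any, Dict, Set


def process_kwargs_sketch(
    kwargs: Dict[str, Any], json_fields: Set[str], bytes_fields: Set[str]
) -> Dict[str, Any]:
    """Convert only the fields that need it; plain fields are copied as-is."""
    out: Dict[str, Any] = {}
    for name, value in kwargs.items():
        if name in json_fields:
            # Placeholder JSON handling: serialize anything that is not a str.
            if not isinstance(value, str):
                value = json.dumps(value)
        elif name in bytes_fields:
            # Placeholder bytes handling: treat str input as base64 text.
            if isinstance(value, str):
                value = base64.b64decode(value)
        # Common path: neither branch ran, so no conversion call was made.
        out[name] = value
    return out


converted = process_kwargs_sketch(
    {"name": "a", "meta": {"k": 1}, "blob": "aGk="},
    json_fields={"meta"},
    bytes_fields={"blob"},
)
```

The point of the shape is that a regular field costs two set membership checks and one dict store, with no nested function calls.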
Behavior unchanged: same ModelError messaging on unknown fields, same
JSON encoding, same base64 bytes handling. 628 tests pass at 100 %
coverage.
Benchmark deltas (median, pytest-benchmark):
- test_initializing_models[250] -34.9%
- test_initializing_models[500] -8.8%
- test_iterate[500] / test_iterate[1000] -7.9% / -7.4%
- test_get_all_with_related_models[40] -3.2%
- I/O-bound get_one / first / get_or_none within noise
cProfile (all_with_related): _process_kwargs tottime 0.365s -> 0.238s
(-35%). The line-397 dict-comp is no longer a separate hotspot.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf: replace RelationProxy.__getattribute__ with explicit count/clear (#1650)
Profile-driven removal of the per-attribute-access Python override on
RelationProxy. The previous __getattribute__ existed solely to redirect
two list-method names ("count", "clear") to the QuerysetProxy versions;
every other attribute access paid the cost of a Python-level
__getattribute__ call only to fall through to super.
cProfile (all_with_related, 40 000 row hydrations): __getattribute__
was 0.354 s tottime / 6.2 % of scenario time, with 420 000 calls. After
this change it is no longer in the top 25 by tottime — attribute access
goes through the C-level lookup path.
Replacement: define count() and clear() as async methods directly on
RelationProxy. They shadow list.count / list.clear by virtue of MRO and
delegate to queryset_proxy after self._initialize_queryset(). The
observable async semantics are unchanged: callers always did
``await proxy.count(...)`` / ``await proxy.clear(...)``; the previous
override returned a bound async method, the new methods are themselves
async — both produce the same coroutine.
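A minimal sketch of the replacement pattern, assuming a stand-in queryset class: `FakeQuerysetProxy` and `RelationProxySketch` are hypothetical and only mirror the method names and defaults mentioned above.

```python
import asyncio
from typing import Any, List, Optional


class FakeQuerysetProxy:
    """Hypothetical stand-in for QuerysetProxy, backed by an in-memory list."""

    def __init__(self, rows: List[Any]) -> None:
        self._rows = rows

    async def count(self, distinct: bool = True) -> int:
        return len(set(self._rows)) if distinct else len(self._rows)

    async def clear(self, keep_reversed: bool = True) -> int:
        removed = len(self._rows)
        self._rows.clear()
        return removed


class RelationProxySketch(list):
    """List subclass with no __getattribute__ override."""

    def __init__(self, rows: List[Any]) -> None:
        super().__init__(rows)
        self._db_rows = list(rows)
        self._queryset_proxy: Optional[FakeQuerysetProxy] = None

    def _initialize_queryset(self) -> None:
        if self._queryset_proxy is None:
            self._queryset_proxy = FakeQuerysetProxy(self._db_rows)

    # Defined on the subclass, so the MRO finds these before list.count/clear.
    async def count(self, distinct: bool = True) -> int:
        self._initialize_queryset()
        return await self._queryset_proxy.count(distinct=distinct)

    async def clear(self, keep_reversed: bool = True) -> int:
        self._initialize_queryset()
        return await self._queryset_proxy.clear(keep_reversed=keep_reversed)


proxy = RelationProxySketch([1, 1, 2])
distinct_count = asyncio.run(proxy.count())
```

Every other attribute (`append`, `__len__`, and so on) resolves through the normal lookup path, because nothing intercepts attribute access any more.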
Behavior unchanged: same signatures (distinct=True / keep_reversed=True
defaults), same delegation path, same QuerysetProxy initialization
trigger. 628 tests pass at 100 % coverage.
Benchmark deltas (median, pytest-benchmark, --warmup=on, 10+ rounds):
- test_get_all_with_related_models[10] -17.2 %
- test_get_all_with_related_models[20] -15.6 %
- test_get_all_with_related_models[40] -7.7 %
- test_get_all[250] -13.1 %
- test_get_all[500] -9.9 %
- test_get_all[1000] -6.4 %
- test_iterate[*] within noise
- single-row get_one / first I/O-dominated, ±15 % noise
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf: lazy relation machinery in Model.__init__ (#1652)
* perf: defer QuerysetProxy construction in RelationProxy
Every model materialized from a row eagerly built a QuerysetProxy per
reverse/m2m relation, even though the queryset machinery is rarely
touched on read paths. Make `RelationProxy.queryset_proxy` a lazy
property so the allocation only happens when something actually queries
through the relation.
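The lazy-property idea can be sketched as follows. `QuerysetProxySketch` and `LazyRelationProxy` are hypothetical classes; the `created` counter exists only to make the deferred allocation visible.

```python
from typing import Optional


class QuerysetProxySketch:
    """Hypothetical stand-in that counts how many instances were built."""

    created = 0

    def __init__(self, relation_name: str) -> None:
        type(self).created += 1
        self.relation_name = relation_name


class LazyRelationProxy(list):
    def __init__(self, relation_name: str) -> None:
        super().__init__()
        self.relation_name = relation_name
        self._queryset_proxy: Optional[QuerysetProxySketch] = None

    @property
    def queryset_proxy(self) -> QuerysetProxySketch:
        # Allocate on first access only; later reads reuse the same object.
        if self._queryset_proxy is None:
            self._queryset_proxy = QuerysetProxySketch(self.relation_name)
        return self._queryset_proxy


# Materializing 1000 proxies allocates no queryset machinery at all.
proxies = [LazyRelationProxy("posts") for _ in range(1000)]
created_before_access = QuerysetProxySketch.created
first = proxies[0].queryset_proxy
```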
Benchmarks (sqlite, on-disk):
- init[1000]: 9.74 ms -> 8.42 ms (-13.6%)
- get_all[1000]: 15.64 ms -> 14.20 ms (-9.2%)
- iterate[1000]: 22.83 ms -> 21.33 ms (-6.6%)
* perf: defer RelationProxy construction in Relation
Reverse / many-to-many relations allocated their RelationProxy in
Relation.__init__ for every model, even when the relation was never
read. Construct on first add/get instead, and read related_models
directly from the hash-cache update path so a PK change doesn't force
materialization of every reverse proxy on the model.
Combined with the previous QuerysetProxy change, vs baseline:
- init[1000]: 9.74 ms -> 7.06 ms (-27.5%)
- get_all[1000]: 15.64 ms -> 12.06 ms (-22.9%)
- iterate[1000]: 22.83 ms -> 18.66 ms (-18.3%)
* perf: defer Relation construction in RelationsManager
RelationsManager.__init__ used to build a Relation for every declared
FK on every Model.__init__. Most of those Relation instances are never
read — they exist only to be checked for membership or never touched
at all on row-materialization paths. Build them on demand in _get(name)
using a precomputed name->field lookup instead.
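A minimal sketch of on-demand construction behind a `_get(name)` accessor. `RelationSketch` and `LazyRelationsManager` are hypothetical, and the field lookup is simplified to a plain name-to-target-name dict.

```python
from typing import Dict, Optional


class RelationSketch:
    """Hypothetical stand-in for a per-field Relation object."""

    def __init__(self, field_name: str, to: str) -> None:
        self.field_name = field_name
        self.to = to


class LazyRelationsManager:
    def __init__(self, fields_by_name: Dict[str, str]) -> None:
        # The name -> field lookup is precomputed once and shared.
        self._fields_by_name = fields_by_name
        # Starts empty: nothing is built during __init__.
        self._relations: Dict[str, RelationSketch] = {}

    def __contains__(self, name: str) -> bool:
        # Membership checks never need to build a Relation.
        return name in self._fields_by_name

    def _get(self, name: str) -> Optional[RelationSketch]:
        relation = self._relations.get(name)
        if relation is None:
            target = self._fields_by_name.get(name)
            if target is None:
                return None
            relation = self._relations[name] = RelationSketch(name, target)
        return relation


manager = LazyRelationsManager({"author": "Author", "category": "Category"})
has_author = "author" in manager
built_after_membership = len(manager._relations)
author_relation = manager._get("author")
```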
Combined with the previous lazy-RelationProxy and lazy-QuerysetProxy
changes, vs baseline:
- init[1000]: 9.74 ms -> 6.67 ms (-31.6%)
- get_all[1000]: 15.64 ms -> 11.58 ms (-26.0%)
- iterate[1000]: 22.83 ms -> 18.83 ms (-17.5%)
- init[250]: 2.65 ms -> 1.66 ms (-37.4%)
* perf: cache row-extraction plan in from_row (#1654)
from_row used to recompute, for every row × every join level, the
selected-columns set, the prefixed column key strings, and the
exclude set. All three depend only on (model_cls, table_prefix,
excludable), not on the row, so they can be built once per query and
reused across rows.
- New RowExtractionPlan dataclass holds the precomputed work.
- build_row_extraction_plan / get_or_build_row_plan / apply_row_plan
split the lifecycle so callers can build once and apply many.
- _process_query_result_rows allocates a per-call plan cache; iterate
shares one cache across all yielded chunks so 1-row chunks still
amortize.
- prefetch_query._instantiate_models builds the plan once before the
row loop.
- _construct_with_excluded widened to AbstractSet[str] so the plan
can store the exclude set as a hashable frozenset.
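The build-once, apply-many lifecycle might look roughly like this. The names below (`RowPlanSketch`, `build_plan`, `get_or_build_plan`, `apply_plan`) are hypothetical simplifications, not the signatures of the functions listed above, and the `prefix_column` key format is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Iterable, Mapping, Tuple


@dataclass(frozen=True)
class RowPlanSketch:
    # (prefixed row key, field name) pairs, computed once per query.
    column_keys: Tuple[Tuple[str, str], ...]
    excluded: FrozenSet[str]


PlanKey = Tuple[type, str, FrozenSet[str]]


def build_plan(table_prefix: str, columns: Iterable[str],
               excluded: FrozenSet[str]) -> RowPlanSketch:
    prefix = f"{table_prefix}_" if table_prefix else ""
    keys = tuple((f"{prefix}{col}", col) for col in columns if col not in excluded)
    return RowPlanSketch(column_keys=keys, excluded=excluded)


def get_or_build_plan(cache: Dict[PlanKey, RowPlanSketch], model_cls: type,
                      table_prefix: str, columns: Iterable[str],
                      excluded: Iterable[str]) -> RowPlanSketch:
    frozen = frozenset(excluded)
    key = (model_cls, table_prefix, frozen)  # nothing here depends on the row
    plan = cache.get(key)
    if plan is None:
        plan = cache[key] = build_plan(table_prefix, columns, frozen)
    return plan


def apply_plan(plan: RowPlanSketch, row: Mapping[str, Any]) -> Dict[str, Any]:
    # Per-row work is reduced to precomputed key lookups.
    return {field: row[key] for key, field in plan.column_keys}


class Author:  # placeholder model class, used only as a cache key
    pass


cache: Dict[PlanKey, RowPlanSketch] = {}
rows = [
    {"a1_id": 1, "a1_name": "x", "a1_secret": "s"},
    {"a1_id": 2, "a1_name": "y", "a1_secret": "t"},
]
extracted = [
    apply_plan(
        get_or_build_plan(cache, Author, "a1", ("id", "name", "secret"), {"secret"}),
        row,
    )
    for row in rows
]
```

Sharing one `cache` dict across chunks is what lets even 1-row chunks reuse the plan.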
Benchmarks vs the lazy-relation baseline:
- get_all[1000]: 11.58 ms -> 9.11 ms (-21.3%)
- get_all[500]: 6.80 ms -> 4.88 ms (-28.3%)
- iterate[1000]: 18.83 ms -> 15.67 ms (-16.8%)
- iterate[250]: 5.62 ms -> 4.54 ms (-19.2%)
* perf: unify Relation reverse/m2m container with __dict__ slot (#1655)
Reverse / m2m relations were tracked in two parallel containers — the
RelationProxy on the Relation, and a plain list on the owner's __dict__
read by pydantic for serialization. Every Relation.add did:
1. Hash-based membership check on the proxy
2. Append to the proxy
3. Dict load of the parallel list
4. Linear `child not in rel` scan over weakproxy entries (in a try
wrapper that caught ReferenceError as a "nuke and restart" signal)
5. Append to the parallel list
6. Dict store
Steps 3–6 collapse to a single dict store once the proxy itself is the
__dict__ slot. The proxy is a list subclass, so pydantic serialization
and pickling round-trip unchanged. Kept _find_existing's call site —
its dead-weakref probe is a side effect we still rely on (hash
collision on a stale entry populates _to_remove so the next get()
runs _clean_related).
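A simplified sketch of the single-container idea, with hypothetical classes (`ProxySketch`, `RelationSketch`, `Owner`). It ignores weak references and the stale-entry cleanup mentioned above and only shows that one list-subclass object can serve as both the relation container and the `__dict__` slot.

```python
import json
from typing import Any, Set


class ProxySketch(list):
    """List subclass with a side set for hash-based membership checks."""

    def __init__(self) -> None:
        super().__init__()
        self._seen: Set[Any] = set()

    def add(self, child: Any) -> bool:
        if child in self._seen:  # step 1: hash-based membership check
            return False
        self._seen.add(child)
        self.append(child)       # step 2: append to the proxy
        return True


class Owner:
    pass


class RelationSketch:
    def __init__(self, owner: Owner, name: str) -> None:
        self.owner = owner
        self.name = name
        self.proxy = ProxySketch()

    def add(self, child: Any) -> None:
        if self.proxy.add(child):
            # Single dict store: the proxy itself is the slot readers see,
            # so there is no parallel list to load, scan, and append to.
            self.owner.__dict__[self.name] = self.proxy


owner = Owner()
relation = RelationSketch(owner, "posts")
for child in ("a", "b", "a"):
    relation.add(child)
serialized = json.dumps(owner.posts)
```

Because the proxy is a real `list` subclass, code that expects a list (here `json.dumps`) keeps working on the slot value.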
Benchmarks vs the post-#1654 optimization tip:
- init_with_related[40]: 1983 µs -> 918 µs (-53.7%)
- init_with_related[20]: 718 µs -> 457 µs (-36.4%)
- init_with_related[10]: 303 µs -> 239 µs (-21.1%)
- get_all_with_related[40]: 5079 µs -> 4552 µs (-10.4%)
- workloads with no related-model registration: within noise
* perf: fast-path expand_relationship for already-typed Model values (#1656)
expand_relationship had a per-call try/except framework around its
constructor cache and dispatched on value.__class__.__name__ via a
string-keyed dict — even though the dominant case (row materialization,
user kwargs that already hold constructed Models) is just "register if
asked, return value".
Two changes:
- Identity-based fast path before the dispatch table: if value is an
instance of self.to (checked with a ``__class__ is`` comparison, the
cheapest possible identity check), inline the register-if-asked +
return. Skips the dict lookup and the _register_existing_model
indirection entirely for the most common path.
- Replace the lazy try/except cache with a @cached_property. By the
time expand_relationship runs, _verify_model_can_be_initialized has
already gated on requires_ref_update, so self.to is resolved and the
bound methods captured in the dict are stable.
_register_existing_model is dead post-fast-path (it was the dispatch
target for the same-class case the fast path now covers); deleted, and
its dispatch entry removed.
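A minimal sketch of both changes together, assuming a toy field class: `ForeignKeySketch`, its `registered` list, and the two constructors are hypothetical and much simpler than the real dispatch table.

```python
from functools import cached_property
from typing import Any, Callable, Dict, List, Optional


class Author:
    def __init__(self, pk: int) -> None:
        self.pk = pk


class ForeignKeySketch:
    def __init__(self, to: type) -> None:
        self.to = to
        self.registered: List[Any] = []

    @cached_property
    def _constructors(self) -> Dict[str, Callable[[Any], Any]]:
        # Built once on first use; replaces a per-call try/except cache probe.
        return {"dict": self._from_dict, "int": self._from_pk}

    def _from_dict(self, value: Dict[str, Any]) -> Any:
        return self.to(**value)

    def _from_pk(self, value: int) -> Any:
        return self.to(pk=value)

    def expand_relationship(self, value: Any, to_register: bool = True) -> Optional[Any]:
        if value is None:
            return None
        # Fast path: exact-class identity check, no dict lookup, no extra call.
        if value.__class__ is self.to:
            if to_register:
                self.registered.append(value)
            return value
        # Slow path: string-keyed dispatch on the value's class name.
        model = self._constructors[value.__class__.__name__](value)
        if to_register:
            self.registered.append(model)
        return model


fk = ForeignKeySketch(Author)
existing = Author(pk=1)
same_object = fk.expand_relationship(existing)
from_pk = fk.expand_relationship(2)
from_dict = fk.expand_relationship({"pk": 3})
```

Note that an exact `__class__ is` check does not match subclasses; in this sketch a subclass instance would fall through to the dispatch table.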
Benchmarks vs the post-#1655 optimization tip:
- init_with_related[40]: 858 µs -> 810 µs (-5.6%)
- init_with_related[10]: 235 µs -> 222 µs (-5.5%)
- init[1000]: 7010 µs -> 6517 µs (-7.0%)
- get_all_with_related[40]: 4351 µs -> 4124 µs (-5.2%)
- iterate[1000]: 17152 µs -> 16188 µs (-5.6%)
* perf: in-place index assignment in _merge_items_lists (#1657)
* perf: in-place index assignment in _merge_items_lists
The matched branch rebuilt value_to_set with a list comprehension that
filtered by pk and concatenated [new_val] — O(N) per match, O(K*N)
overall. The Rust planner already returns the destination index
(other_idx); use it for an O(1) in-place write.
Behavior preserved on the workloads exercised by the existing benchmark
suite (matched branch always fires on size-1 value_to_set there). Where
the worst case actually fires — full PK overlap between current_field
and other_value at large list sizes — the new microbenchmark shows
33x speedup at N=100, scaling linearly with N.
Side effect: matched items now keep their original other_value index
instead of being shuffled to the tail. Today's "shuffle to tail" was
an artifact of the rebuild pattern, not a deliberate semantic; the
existing tests/test_ordering/ suite passes unchanged.
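The difference between the two write patterns, including the ordering side effect, can be shown with a small sketch. `merge_rebuild` and `merge_in_place` are hypothetical helpers operating on `(pk, payload)` tuples; the real code works on model instances and gets the index from the Rust planner.

```python
from typing import List, Tuple

Item = Tuple[int, str]  # (pk, payload)


def merge_rebuild(items: List[Item], pk: int, new_val: Item) -> List[Item]:
    """Old shape: filter out the matched pk and append, O(N) per match."""
    return [item for item in items if item[0] != pk] + [new_val]


def merge_in_place(items: List[Item], other_idx: int, new_val: Item) -> List[Item]:
    """New shape: the destination index is already known, O(1) per match."""
    items[other_idx] = new_val
    return items


original = [(1, "a"), (2, "b"), (3, "c")]
rebuilt = merge_rebuild(list(original), pk=2, new_val=(2, "B"))
in_place = merge_in_place(list(original), other_idx=1, new_val=(2, "B"))
```

The rebuild moves the matched item to the tail, while the in-place write keeps it at its original index.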
* test: add merge benchmarks (integration + microbench)
Adds two benchmark workloads for the row-merging path:
- test_select_related_nested_merge: integration benchmark over
Project -> Tasks (FK) -> Tags (m2m) covering the full join /
materialize / merge pipeline.
- test_merge_items_lists_pk_overlap: microbenchmark calling
_merge_items_lists directly with full-PK-overlap inputs at
list_size in {10, 50, 100}. This is the worst case the matched
branch is O(K*N) on; it isolates the inner loop from query and
row-materialization noise.
The microbenchmark shows the difference clearly:
list_size    old (O(K*N))    new (O(K))    delta
10              31.4 us        7.0 us      -77.6%
50             531.2 us       30.8 us      -94.2%
100           2028.5 us       61.8 us      -97.0%
Growth confirms theory: baseline scales quadratically, new code
scales linearly. The benchmark is included so future regressions in
this path get caught.
* perf: three small wins (#5, #3 remainder, #8) (#1658)
* perf: skip _recursive_add wrapper for size-1 merge groups
merge_instances_list always built a 1-element wrapper list and called
_recursive_add even when group_indices held a single index — the
common case for queries with no parent duplication. _recursive_add
short-circuits at len(model_group) <= 1 anyway, so the wrapper list,
the call, and the [0] index were pure overhead.
Hoist the early return one frame up: if the group has exactly one row,
read it directly from result_rows.
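A small sketch of hoisting the early return, assuming a generic `merge` callable in place of `_recursive_add`; `merge_groups_sketch` and `counting_merge` are hypothetical names.

```python
from typing import Callable, List, Sequence

merge_calls = 0


def counting_merge(group: List[str]) -> str:
    """Stand-in for the recursive merge; counts how often it is invoked."""
    global merge_calls
    merge_calls += 1
    return "+".join(group)


def merge_groups_sketch(
    result_rows: Sequence[str],
    groups: Sequence[Sequence[int]],
    merge: Callable[[List[str]], str],
) -> List[str]:
    merged: List[str] = []
    for group_indices in groups:
        if len(group_indices) == 1:
            # Size-1 group: read the row directly, no wrapper list, no call.
            merged.append(result_rows[group_indices[0]])
            continue
        merged.append(merge([result_rows[i] for i in group_indices]))
    return merged


merged_rows = merge_groups_sketch(["a", "b", "c"], [[0], [1, 2]], counting_merge)
```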
Benchmarks vs the post-#1657 optimization tip:
- init[1000]: 6571 us -> 6491 us (-1.2%)
- get_all[1000]: 9139 us -> 9069 us (-0.8%)
- get_all[500]: 4951 us -> 4799 us (-3.1%)
- iterate[1000]: 16373 us -> 16068 us (-1.9%)
* perf: cheaper Model.__hash__ and __same__
__hash__ used hash(str(pk) + cls.__name__) on every cache miss —
two string allocations per call. Replace with hash((pk, type(self)))
which uses CPython's identity hash for type objects and skips the
string concat entirely.
__same__ used hash(self) == other.__hash__() to compare two saved
models, which fills both sides' hash caches just to test equality.
Direct (pk, type) compare on the saved-pk path skips both hash
allocations; fall through to hash equality only when both sides are
unsaved.
Saved-pk path is the hot one (every relation-cache lookup goes
through __hash__). Unsaved-pk path keeps the str(vals) shape because
__dict__ can hold list/dict values (json fields, reverse-relation
slots) that aren't hashable directly.
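A minimal sketch of the two paths, with a hypothetical `ModelSketch` class. It omits the hash cache, and treating a saved model and an unsaved model as never equal is an assumption of this sketch, not something stated above.

```python
from typing import Any


class ModelSketch:
    """Hypothetical model: `pk` stays None until the row is saved."""

    def __init__(self, pk: Any = None, **values: Any) -> None:
        self.pk = pk
        self.values = values

    def __hash__(self) -> int:
        if self.pk is not None:
            # Saved path: tuple hash, no string building.
            return hash((self.pk, type(self)))
        # Unsaved path: values may hold lists/dicts, so hash their str form.
        return hash(str(self.values) + type(self).__name__)

    def __same__(self, other: "ModelSketch") -> bool:
        if self.pk is not None and other.pk is not None:
            # Direct (pk, type) compare; neither side computes a hash.
            return type(self) is type(other) and self.pk == other.pk
        if self.pk is None and other.pk is None:
            return hash(self) == hash(other)
        return False  # assumption: saved vs unsaved never match


class OtherSketch(ModelSketch):
    pass


saved_a = ModelSketch(pk=1)
saved_b = ModelSketch(pk=1)
unsaved_a = ModelSketch(tags=["x"])
unsaved_b = ModelSketch(tags=["x"])
```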
Benchmarks (workloads that exercise relation hashing):
- init_with_related[40]: 828 us -> 781 us (-5.7%)
* perf: specialize _process_kwargs per field-set
The per-key dispatch loop ran four set membership checks on every kwarg
even though typical models have empty json_fields and bytes_fields and
only a couple of relation fields. Restructure so the empty-set checks
are hoisted out of the loop:
- Pre-compute relation_field_names on the class (cached, like
_pydantic_field_names) so expand_relationship is only called for
fields that actually need it.
- Validate unknown kwargs once up front rather than per-iteration.
- Fast path for plain models (no json/bytes/relations) reduces to
``return dict(kwargs), through_tmp_dict`` — a single C-level dict copy.
- Slow path skips per-iteration empty-set checks via has_json /
has_bytes guards.
Side effect: BaseField.expand_relationship is no longer called (only
relation fields go through expand_relationship now). Marked as
``# pragma: no cover`` rather than removed since it remains a public
API contract on the field base class.
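The hoisting described above can be sketched as a factory that decides the path once per class. Everything here is hypothetical: `make_process_kwargs`, the converter callables, and `ValueError` (standing in for ModelError); the `through_tmp_dict` return value is left out.

```python
from typing import Any, Callable, Dict, FrozenSet

Converter = Callable[[str, Any], Any]


def make_process_kwargs(
    allowed: FrozenSet[str],
    json_fields: FrozenSet[str],
    bytes_fields: FrozenSet[str],
    relation_fields: FrozenSet[str],
    convert: Converter,
    expand: Converter,
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    # Computed once per class instead of once per kwarg.
    has_json = bool(json_fields)
    has_bytes = bool(bytes_fields)
    is_plain = not (has_json or has_bytes or relation_fields)

    def process(kwargs: Dict[str, Any]) -> Dict[str, Any]:
        unknown = kwargs.keys() - allowed  # validated once, up front
        if unknown:
            raise ValueError(f"Unknown fields: {sorted(unknown)}")
        if is_plain:
            return dict(kwargs)  # fast path: a single dict copy
        out: Dict[str, Any] = {}
        for name, value in kwargs.items():
            if name in relation_fields:
                value = expand(name, value)
            elif (has_json and name in json_fields) or (
                has_bytes and name in bytes_fields
            ):
                value = convert(name, value)
            out[name] = value
        return out

    return process


empty: FrozenSet[str] = frozenset()
plain = make_process_kwargs(
    frozenset({"id", "name"}), empty, empty, empty,
    convert=lambda name, value: value,
    expand=lambda name, value: value,
)
with_relation = make_process_kwargs(
    frozenset({"id", "author"}), empty, empty, frozenset({"author"}),
    convert=lambda name, value: value,
    expand=lambda name, value: ("expanded", value),
)
source = {"id": 1, "name": "a"}
plain_result = plain(source)
relation_result = with_relation({"id": 1, "author": 7})
```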
Benchmarks vs the previous commit:
- get_all[1000]: 9670 us -> 9024 us (-6.7%)
- get_all[500]: 5008 us -> 4769 us (-4.8%)
- iterate[1000]: 16642 us -> 15723 us (-5.5%)
- get_all_with_related[40]: 4408 us -> 4205 us (-4.6%)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent d3a7a51 commit 853caad
31 files changed
Lines changed: 838 additions & 468 deletions
File tree
- .github/workflows
- benchmarks
- ormar
- fields
- models
- helpers
- mixins
- queryset
- queries
- relations
- utils