Commit eb92aa9
einsum: ToT x ToT -> T denest via phantom-unit dot, not expand-then-reduce
A denested ToT x ToT contraction (inner indices fully contracted, plain-T
result) was evaluated by einsum<DeNest::True> as expand-then-reduce: it formed
the full uncontracted product C0 (external x contracted-outer x inner) before
reducing. With a large contracted-outer index (e.g. a DF/RI index) this
materializes an enormous intermediate -- ~20 GB for one CSV-CC term in C8H18
PNO-CCSD (256 s, 26.6 GB peak) for a 412 KB result.
Reformulate as a single contraction whose inner product is a Frobenius dot.
The inner reduction is expressed with a phantom unit-extent result mode
(reserved label prefix U+2297, is_phantom_unit_label) so that each result
inner cell is a genuine tensor of order >= 1 (TA has no order-0 tensors). The
dot reads operand cells flat: no operand carries the phantom mode, no inner
GEMM rank matching is needed, no order-0 tensor appears, and C0 is never
built. The result stays correct even when an inner extent depends on a
contracted-outer index.
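The difference between the two evaluation strategies can be sketched without TiledArray. The block below is a minimal illustration only: `Cell`, `Denested`, `expand_then_reduce`, and `fused_dot` are hypothetical names, nested `std::vector`s stand in for ToT operands A(e,k;inner) and B(k;inner), and the inner extent is allowed to vary with the contracted-outer index k. It is not the code in einsum/tiledarray.h.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one flat inner cell of a tensor-of-tensors.
using Cell = std::vector<double>;

struct Denested {
  std::vector<double> c;         // plain-T result C(e)
  std::size_t intermediate = 0;  // elements stored in the temporary C0
};

// Expand-then-reduce: build C0(e,k;inner) = A(e,k;inner) * B(k;inner)
// elementwise, then sum it over k and the inner index.
Denested expand_then_reduce(const std::vector<std::vector<Cell>>& a,
                            const std::vector<Cell>& b) {
  Denested out;
  std::vector<std::vector<Cell>> c0(a.size());
  for (std::size_t e = 0; e < a.size(); ++e) {
    assert(a[e].size() == b.size());
    c0[e].resize(b.size());
    for (std::size_t k = 0; k < b.size(); ++k) {
      assert(a[e][k].size() == b[k].size());
      c0[e][k].resize(b[k].size());
      for (std::size_t i = 0; i < b[k].size(); ++i)
        c0[e][k][i] = a[e][k][i] * b[k][i];
      out.intermediate += b[k].size();
    }
  }
  out.c.assign(a.size(), 0.0);
  for (std::size_t e = 0; e < c0.size(); ++e)
    for (const Cell& cell : c0[e])
      for (double x : cell) out.c[e] += x;
  return out;
}

// Fused form: each (e,k) pair contributes one Frobenius dot of the flat
// inner cells, accumulated directly into C(e); no C0 is allocated.
Denested fused_dot(const std::vector<std::vector<Cell>>& a,
                   const std::vector<Cell>& b) {
  Denested out;
  out.c.assign(a.size(), 0.0);
  for (std::size_t e = 0; e < a.size(); ++e) {
    assert(a[e].size() == b.size());
    for (std::size_t k = 0; k < b.size(); ++k) {
      assert(a[e][k].size() == b[k].size());
      double dot = 0.0;
      for (std::size_t i = 0; i < b[k].size(); ++i)
        dot += a[e][k][i] * b[k][i];
      out.c[e] += dot;
    }
  }
  return out;
}
```

Both functions return the same C(e); only the first one holds an intermediate whose size scales with external x contracted-outer x inner, which is the term that dominates when k is a DF/RI-sized index.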
- util/annotation.h: phantom-unit label prefix + is_phantom_unit_label.
- einsum/tiledarray.h: DeNest::True builds C(c;U) = A(..) * B(..;..,U) then
unwraps the unit-extent inner cells to scalars.
- tensor/arena_einsum.h: RegimeAInnerKind::phantom_dot, ArenaInnerShapeKind::
unit_range, and arena_hadamard_phantom_dot (view cells).
- expressions/cont_engine.h: inner phantom-dot op in the owning and view-cell
inner-op paths, for both outer-Contraction and outer-Hadamard regimes.
- tests/einsum.cpp: external-index (e-present) denest case.
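Two of the pieces listed above, the reserved label prefix and the final unwrap of unit-extent inner cells, can be illustrated in isolation. This is a hedged sketch, not the contents of util/annotation.h: it assumes labels are UTF-8 strings, and `kPhantomPrefixSketch`, `looks_like_phantom_unit_label`, and `unwrap_unit_cells` are hypothetical names chosen for the example. The real `is_phantom_unit_label` signature is not shown in this commit message.

```cpp
#include <cassert>
#include <string_view>
#include <vector>

// U+2297 (CIRCLED TIMES) as UTF-8 bytes. Assumption: labels are UTF-8.
inline constexpr std::string_view kPhantomPrefixSketch = "\xE2\x8A\x97";

// Hypothetical check: does the index label start with the reserved prefix?
constexpr bool looks_like_phantom_unit_label(std::string_view label) {
  return label.substr(0, kPhantomPrefixSketch.size()) == kPhantomPrefixSketch;
}

// Hypothetical unwrap step: every inner cell of C(c;U) has extent 1 along
// the phantom mode, so the plain-T result takes its single element.
std::vector<double> unwrap_unit_cells(
    const std::vector<std::vector<double>>& cells) {
  std::vector<double> out;
  out.reserve(cells.size());
  for (const auto& cell : cells) {
    assert(cell.size() == 1);  // unit-extent inner cell
    out.push_back(cell.front());
  }
  return out;
}
```

Reserving a prefix that ordinary index labels are unlikely to use keeps the phantom mode from colliding with user-chosen labels, which is presumably why a non-ASCII code point was picked.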
C8H18 PNO-CCSD: the motivating term drops 256 s/26.6 GB -> 0.25 s/2.9 GB and
the run now completes; c4h10 converges as before.
1 parent db0bff5 commit eb92aa9
5 files changed
Lines changed: 353 additions & 143 deletions