Implement in-place child-before-parent ExprTree.Canonicalize() with status reporting#532
Implement in-place child-before-parent ExprTree.Canonicalize() with status reporting#532Copilot wants to merge 8 commits into
ExprTree.Canonicalize() with status reporting#532Conversation
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/5ca8e295-4a63-4eb7-92fb-705afa82cdf4 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/5ca8e295-4a63-4eb7-92fb-705afa82cdf4 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
ExprTree.ToCanonical() for deterministic flat-expression layout
dadhi
left a comment
There was a problem hiding this comment.
The Windows workflow is hanging, Please double check and potential infinite recursion.
Run dotnet run --no-build -c:Release -f:net8.0 --project test/FastExpressionCompiler.TestsRunner/FastExpressionCompiler.TestsRunner.csproj
//## .NET Latest (Core): Running UnitTests and IssueTests in parallel...
//IssueTests are passing in 1810 ms.
Then I don't get it.
- Good ideas to keep - find non-linked nodes (add optional parameter config to ToCanonical) to raise exceptions or silently drop such nodes. It seems to help to implement a kind of tree shaking.
- Bad ideas: what I mean to to canonical is the canonical order of nodes where child nodes should be immediately follow-up by their parent. This is naturally happening if build expression in one builder statement with the factory methods (except for parameter - for them below). The order is broken if we create different part of the tree at different times (or copy them from the different tree, places).
If we canonicalize this ordering then comparison of 2 tree becomes a single loop-zip of 2 lists.
Emit of arguments before operation is naturally matches the order, etc.
Note: We may be forgiven for the empty nodes (Node with null Type) - such nodes may or may not happen in future when optimizing/lifting expression, etc. then when compare, etc. we can just skip such nodes, may be even store the live node Count in the Tree to help with that.
Another idea around identity of variables - for the FE it is possible means the identity of their location/index. Here we free to interpret they canonical order. May be to enforce we split them into the Separate SmallList or ParametNode (which has only Type, optional Name, and NextIdx) , then in ExprNode we may add a bit to Node.ChildIdx to forward us to this Param list.
Then we can ignore the actual order int ParamList because what we need is just the same index referenced by the same nodes in the same ExprTree and the Param Type is the same as well (we may ignore Names as they do not ifluence the semnatics).
Please do a though research and apply it in the changes.
Evaluate alternatives and output what alternative you evaluated and prose/cons/gaps/tbd you have found in my current idea.
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/d8340c41-257d-435e-9387-26da14b51485 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/d8340c41-257d-435e-9387-26da14b51485 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Addressed in 947e53f, with a small follow-up cleanup in 8a15905. I replaced the |
ExprTree.ToCanonical() for deterministic flat-expression layoutExprTree.ToCanonical() with orphan handling
| private readonly ExprTree _source; | ||
| private SmallMap16<int, int, IntEq> _parameterIds; | ||
| private SmallMap16<int, int, IntEq> _labelIds; | ||
| private bool[] _reachable; |
There was a problem hiding this comment.
Why arrays? No GC remember, use small list, small map, etc. from ImTools
| private SmallMap16<int, int, IntEq> _labelIds; | ||
| private bool[] _reachable; | ||
| private bool[] _visiting; | ||
| private ExprTree _canonical; |
There was a problem hiding this comment.
Canonicalize should work in place, may be rename ToCanonical -> Canonicalize.
User knows what she is doing and do a preventive ExprTree.Copy (add this method).
Add IsCanonical as well to check the current expresdion, in addition to bool return if the parameter passed, we can collect wrong/unreacable/etc node indexes.
Idea of in place change is to minimize expr tree rewrites and avoid allocations altogether - we just swap the unordered nodes into the right place (may be tempprary storing the replaced in nodes variable, etc) to maximize performane and reduce compute.
There was a problem hiding this comment.
Addressed in 24c9cc1, with follow-up polish in 27fc92a. The API is now centered on in-place Canonicalize(...), plus Copy() and IsCanonical(...). I kept the rewrite behavior in-place at the ExprTree level and added CanonicalizeFlags.CompactGaps so callers can choose whether to preserve cleared gaps or compact them away.
| { | ||
| for (var i = 0; i < _reachable.Length; ++i) | ||
| if (!_reachable[i] && _source.Nodes[i].Type != null) | ||
| throw new InvalidOperationException($"Flat expression contains an orphan node at index {i}."); |
There was a problem hiding this comment.
Please avoid exceptions as much as possible, prefer return result in parameters the same as IsCanonical. Change the api to mirror the IsCanonical. Canonicalize should return the number of mispaced nodes where 0 means that it was already in canonical form and have an out parameyer of unreachible node count.
The unreachible nodes shoud be cleared and maybe used/rewritten by canonicalize operation. The canonicalize may include optional flags pamameter with enum values saying CompactGaps.
There was a problem hiding this comment.
Addressed in 24c9cc1, with follow-up polish in 27fc92a. I removed the orphan-node exception path from the public API and changed it to status reporting: IsCanonical(out misplacedNodeCount, out unreachableNodeCount) and Canonicalize(out unreachableNodeCount, CanonicalizeFlags flags = CanonicalizeFlags.None). Canonicalize now returns the misplaced live-node count, clears unreachable nodes by default, and supports CompactGaps.
|
|
||
| private int CopyNode(int index) | ||
| { | ||
| if ((uint)index >= (uint)_source.Nodes.Count) |
There was a problem hiding this comment.
For exceptions if unavoidable (we may avoid them by chanhing the Api to TryCopy and teturn the responsibility to consumer) please use static Throw.. wrapperns inttoduced in modern .net and add the same wrappets for older targets.
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/74c4eb78-a408-492c-9db8-fe31b2731951 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/74c4eb78-a408-492c-9db8-fe31b2731951 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
Agent-Logs-Url: https://github.com/dadhi/FastExpressionCompiler/sessions/74c4eb78-a408-492c-9db8-fe31b2731951 Co-authored-by: dadhi <39516+dadhi@users.noreply.github.com>
ExprTree.ToCanonical() with orphan handlingExprTree.Canonicalize() with status reporting
|
Too early too soon |
This introduces a canonicalization path for
FlatExpression.ExprTreethat rewrites the reachable flat nodes directly into deterministic child-before-parent order, so equivalent trees become trivially comparable at the flat-node level. The scope remains focused on canonical flat layout, but the implementation now follows the flat-node model directly instead of round-tripping throughToExpression().Canonical flat-expression API
ExprTree.Copy().ExprTree.IsCanonical(out int misplacedNodeCount, out int unreachableNodeCount).ExprTree.Canonicalize(out int unreachableNodeCount, CanonicalizeFlags flags = CanonicalizeFlags.None).CanonicalizeFlags.CompactGaps.Behavioral guarantees covered
Type == null) are treated as ignorable gaps.goto/label) are exercised explicitly.ToExpression()round-trip path.Implementation details aligned with perf feedback
ImTools-style tracking.Related flat-tree correctness fix
ExprTree.Invoke(...)to store the invocation result type instead of the delegate type, so directly built flat trees match the canonical flattened representation.Focused regression coverage
IsCanonical(...)reports misplaced and unreachable nodes before rewriteCanonicalize(...)rewrites into the expected flattened shape while ignoring parameter names and constant valuesCanonicalizeFlags.CompactGaps