Skip to content

Optimize FlatExpression node layout for constants, parameters, parent links, and lambda/block owner metadata#534

Closed
Copilot wants to merge 8 commits intomasterfrom
copilot/optimize-flat-expression-representation
Closed

Optimize FlatExpression node layout for constants, parameters, parent links, and lambda/block owner metadata#534
Copilot wants to merge 8 commits intomasterfrom
copilot/optimize-flat-expression-representation

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 22, 2026

FlatExpression needs a more direct node model to support upcoming canonicalization and chaining work: compact constant storage, explicit parameter declaration/usage identity, reliable parent reachability from child chains, and owner-side lambda/block metadata that can absorb part of later compilation-prep analysis. This update reshapes the flat node metadata so those semantics are encoded directly instead of inferred indirectly.

  • Constant node layout

    • Inline 32-bit constant payloads directly into constant leaf metadata.
    • Keep larger/non-inlineable values in Obj instead of routing them through ClosureConstants.
    • Preserve round-trip reconstruction for enums and primitive 32-bit payloads.
  • Parameter declaration vs usage

    • Split parameter semantics into:
      • declaration nodes: own scope/position metadata
      • usage nodes: point back to the declaration index
    • Store declaration scope in ChildIdx and declaration position in ChildCount.
    • Populate declaration metadata when lambda/block owners are created.
    • Fail fast on duplicate declarations in the same binding path.
  • Child chain parent linkage

    • Make the last node in every child chain point back to its owning node.
    • Reserve a metadata flag to distinguish “next sibling” from “parent link”.
    • Keep chaining traversal simple while enabling upward navigation.
  • Lambda / block owner metadata

    • Add separate Lambdas and Blocks small lists on ExprTree.
    • Record lambda owner metadata during FE construction:
      • owner node index
      • body node index
      • owned parameter declaration indexes
    • Record block owner metadata during FE construction:
      • owner node index
      • body child-list node index
      • owned variable declaration indexes
    • Provide keyed lookup helpers by owner node index via TryGetLambdaInfo and TryGetBlockInfo.
    • Populate the metadata both for direct FE construction and for flattening from System.Linq.Expressions.
  • Reader / builder updates

    • Update flattening to emit declaration nodes once and usage nodes on reference.
    • Update reconstruction to intern parameters by declaration index rather than prior synthetic IDs.
    • Move key node access paths to GetSurePresentRef/direct refs in hot paths.
    • Bind lambda/block declaration ownership at the same time owner metadata is recorded.
  • Focused coverage

    • Adjust FlatExpression tests to assert:
      • no closure-constant spill for the covered constant cases
      • declaration metadata is written to lambda/block-owned parameters
      • usage nodes point to declarations
      • terminal child links point back to the parent owner
      • lambda/block owner metadata is recorded for both direct FE construction and system-expression flattening

Example of the new parameter model:

var p = fe.Parameter(typeof(int), "p");   // declaration node
var assign = fe.Assign(local, p);         // creates parameter usage node(s)

var decl = fe.Nodes[p];
var usage = fe.Nodes[fe.Nodes[assign].ChildIdx];

decl.ChildIdx   // owning lambda/block index
decl.ChildCount // position in owner
usage.ChildIdx  // declaration node index

Example of the new owner metadata:

var lambdaInfo = fe.TryGetLambdaInfo(fe.RootIndex, out var lambda);
var blockInfo = fe.TryGetBlockInfo(blockIndex, out var block);

lambda.BodyIndex;
lambda.ParameterDeclarations;

block.BodyExpressionsIndex;
block.VariableDeclarations;

Copilot AI linked an issue Apr 22, 2026 that may be closed by this pull request
Copilot AI changed the title [WIP] Optimize FlatExpression representation Optimize FlatExpression node layout for constants, parameters, and parent links Apr 22, 2026
Copilot AI requested a review from dadhi April 22, 2026 21:13
Copy link
Copy Markdown
Owner

@dadhi dadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ubuntu build is failed:
/home/runner/work/FastExpressionCompiler/FastExpressionCompiler/src/FastExpressionCompiler.LightExpression/FlatExpression.cs(1793,6): error CS0246: The type or namespace name 'DoesNotReturnAttribute' could not be found (are you missing a using directive or an assembly reference?) [/home/runner/work/FastExpressionCompiler/FastExpressionCompiler/src/FastExpressionCompiler.LightExpression/FastExpressionCompiler.LightExpression.csproj::TargetFramework=net472]

Also just to verify. We are using the same condition for FE as for SE and LE in FEC to decide what to store in ClosureConstants. We did not change that. The only change is for non closure constants that previosly can be stored inplace in Constant node for up to 8 bytes, but now - just up to 4 bytes. So bigger than 4 bytes non-closure values dtored in Obj (possibly boxed). Please check that.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 22, 2026

The ubuntu build is failed:
/home/runner/work/FastExpressionCompiler/FastExpressionCompiler/src/FastExpressionCompiler.LightExpression/FlatExpression.cs(1793,6): error CS0246: The type or namespace na...

Fixed in def9eb9. I removed the DoesNotReturn annotations so net472 can compile again, and restored the original closure-constant policy: object-like constants still go through ClosureConstants, while only non-closure values that no longer fit in 4 inline bytes stay boxed in Obj.

Copilot AI requested a review from dadhi April 22, 2026 21:39
Copy link
Copy Markdown
Owner

@dadhi dadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vi errors:
Error: /home/runner/work/FastExpressionCompiler/FastExpressionCompiler/src/FastExpressionCompiler.LightExpression/FlatExpression.cs(1315,40): error CS0117: 'BitConverter' does not contain a definition for 'SingleToInt32Bits' [/home/runner/work/FastExpressionCompiler/FastExpressionCompiler/src/FastExpressionCompiler.LightExpression/FastExpressionCompiler.LightExpression.csproj::TargetFramework=net472]

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 22, 2026

Vi errors:
Error: /home/runner/work/FastExpressionCompiler/FastExpressionCompiler/src/FastExpressionCompiler.LightExpression/FlatExpression.cs(1315,40): error CS0117: 'BitConverter' does not contain a...

Fixed in 5ce2ffc. I replaced the BitConverter float bit conversions with a small explicit-layout helper so the inline float constant path stays the same but compiles on net472 too.

Copilot AI requested a review from dadhi April 22, 2026 21:46
tag = pointsToParent
? (byte)(tag | nextPointsToParentMask)
: (byte)(tag & ~nextPointsToParentMask);
_data = (_data & KeepWithoutTagAndNextMask)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let split to 4 bytes _data and 4 bytes _nodeTypeAndKind

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 1484adf. ExprNode now uses a 4-byte _data field plus a 4-byte _nodeTypeAndKind field, while keeping the 24-byte explicit layout intact.

Copilot AI requested a review from dadhi April 22, 2026 22:28
Copy link
Copy Markdown
Owner

@dadhi dadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I am not sure about shape of the complex expression for declaration and usage nodes. Could you please add a comments in code describing the decl and usage shape of Lambda, Block, etc.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 23, 2026

Right now I am not sure about shape of the complex expression for declaration and usage nodes. Could you please add a comments in code describing the decl and usage shape of Lambda, Block, etc.

Added the shape comments in 331ecce. FlatExpression.cs now documents the declaration/usage contract plus the concrete lambda and block child layouts close to the builder/binding code.

Copilot AI requested a review from dadhi April 23, 2026 06:03
Copy link
Copy Markdown
Owner

@dadhi dadhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let look at how do prepare the expression for compilation in FEC by gathering the info in TryCollectInfo. We are doing so by walking throw expression tree and collecting info about nested lambdas, closue constants, parameter usage, gotos, etc. We collect this info into list which is very remeniscent of our current FE.

So my idea would to do as much as possible of this separate TryCollectInfo work why building FE incrementally without additional step. And may be find/prevent some invalid expressions by construction.

We already building ClosureConstants as we go.
Can we do the same for lambdas, e.g. create a separate Lambdas small list with Lsmbda specialized declaration node to simplify its construction and handling? Create and link those nodes in such a way to directly help.nested lambda emit and closure construction whic TryCollectInfo does? Can we solve identical lsmbdas should be compilef once problem right hetr?

What about blocks? They very similar to lambdas in the regard of having parameter declarations. And teir parameters may be set to closure to be used by the nested lambdas.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 23, 2026

Let look at how do prepare the expression for compilation in FEC by gathering the info in TryCollectInfo. We are doing so by walking throw expression tree and collecting info about nested lambdas, clo...

Yes, partially. FE can absorb a good chunk of TryCollectInfo incrementally: closure constants already do, and the next natural step is owner-side metadata for lambdas/blocks keyed by node index (body, declaration range/list, nested-lambda links, captured declaration indexes, maybe mutated-capture bits). That would directly help nested-lambda emit and closure construction. Blocks fit the same model for declaration ownership/capture, although they are scopes, not compile units.

What I would keep separate for now is the part that depends on whole-subtree/outer-context analysis: propagation of non-passed params to outer lambdas, goto/label usage accounting, complex-expression classification, and lambda dedup/compile-once. The compile-once part especially wants a stable canonicalized key, not just “same FE node built here”, otherwise we may dedup too early or by the wrong identity.

So my take is: use FE construction to accumulate the structural data and some capture facts, then shrink TryCollectInfo into a lighter analysis/finalization pass over FE instead of trying to remove it in one step.

Copilot AI requested a review from dadhi April 23, 2026 06:46
@dadhi
Copy link
Copy Markdown
Owner

dadhi commented Apr 23, 2026

@copilot Agree. Let do the first part here for lmbdas and blocks, this will clarify the next steps.

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 23, 2026

@copilot Agree. Let do the first part here for lmbdas and blocks, this will clarify the next steps.

Implemented in 83123e9. FE now records lambda and block owner metadata during construction in separate small lists, with keyed lookup by owner node index: lambdas store body plus parameter declarations, and blocks store body-expression-list plus variable declarations. I also added coverage for both direct FE construction and flattening from system expressions.

Copilot AI changed the title Optimize FlatExpression node layout for constants, parameters, and parent links Optimize FlatExpression node layout for constants, parameters, parent links, and lambda/block owner metadata Apr 23, 2026
@dadhi
Copy link
Copy Markdown
Owner

dadhi commented Apr 23, 2026

@copilot Could summarize and bullet list all my ideas and remarks during the PR and pjt them as comment into the #533

Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 23, 2026

@copilot Could summarize and bullet list all my ideas and remarks during the PR and pjt them as comment into the #533

I don’t have a tool here to post a regular issue comment to #533 directly, but here is the summary text ready to paste there:

  • FlatExpression should move toward a more direct representation to prepare for FlatExpression: explore canonical representation pros and cons #531, FlatExpression: structural equality and hash for content addressing #530, and follow-up canonicalization/chaining work.
  • Constant node layout:
    • Reduce inline constant storage from 8 bytes to 4 bytes.
    • Keep the closure-constant decision aligned with SE/LE/FEC; that policy should not change.
    • Values that no longer fit inline but are not closure constants should stay boxed in Obj.
    • Use the freed metadata space to simplify node representation (NodeType, parent/sibling chaining info).
  • Parameter model:
    • Split parameter nodes into declaration nodes and usage nodes.
    • Declaration node should store owner scope (ChildIdx) and position in owner (ChildCount).
    • Usage node should point back to the declaration index.
    • Writing declaration ownership in Lambda/Block construction helps simplify parameter comparison by scope/position.
    • Re-declaring the same variable in the same binding path should fail fast.
    • Preserve cheap checks by keeping parameter type/name on usage nodes too.
  • Child-chain / parent linkage:
    • The last node in every child chain should point back to its parent/owner expression.
    • Reserve one metadata bit to distinguish “next sibling” vs “parent link”.
    • This should make upward navigation available without complicating chain traversal.
  • General perf / representation remarks:
    • Avoid non-inlinable bare throws in hot paths; prefer Try... patterns or throw-helper wrappers.
    • Move bounds checks to the API edge where possible and use GetSurePresentRef/direct refs internally.
    • Keep Debug.Assert where useful for safety without paying runtime costs in release.
  • Documentation / clarity:
    • The FE code should explicitly document the declaration/usage shapes for Lambda, Block, and similar complex nodes.
  • Bigger design direction discussed during the PR:
    • Try to absorb as much of TryCollectInfo as practical while building FE incrementally instead of requiring a separate analysis step.
    • FE is already accumulating closure constants during construction; lambdas and blocks are the next natural area to do the same.
  • Lambdas:
    • Maintain a separate small Lambdas list keyed by FE node index.
    • Store lambda-oriented owner metadata directly to simplify construction and later handling.
    • Shape that metadata so it directly helps nested-lambda emit and closure construction.
  • Blocks:
    • Treat blocks similarly to lambdas for declaration ownership metadata.
    • Block variables may also need closure/lifting information because nested lambdas can capture them.
  • What likely stays separate for now:
    • Whole-tree / outer-context dependent analysis such as propagation of non-passed params to outer lambdas.
    • Goto/label usage accounting.
    • Complex-expression classification.
    • Lambda dedup / compile-once decisions, because that likely needs a stable canonical key rather than just current FE node identity.
  • Net effect of the PR direction:
    • Use FE construction to accumulate structural data and some capture facts by construction.
    • Then shrink TryCollectInfo into a lighter finalization/analysis pass over FE rather than trying to eliminate it in one jump.

@dadhi
Copy link
Copy Markdown
Owner

dadhi commented Apr 28, 2026

dead end

@dadhi dadhi closed this Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants