Fix GroupBy.Select EmptyProjectionMember#38140
Merged
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a query translation failure involving GroupBy(...).Select(...) where applying an inner projection could inadvertently mutate a shared SelectExpression, leading to KeyNotFoundException: EmptyProjectionMember. It adds new regression coverage around GroupBy().Select(first).Where(...) and ordering patterns (and documents remaining gaps via skipped tests).
Changes:
- Prevents projection application side-effects by cloning the subquery
SelectExpressionbefore replacing/applying projections. - Adds new Northwind GroupBy specification tests targeting the
GroupBy(...).Select(...).Where(...)and ordering scenarios related to #31209. - Adds SQL Server baselines for the newly-enabled scenarios and skips the failing variants for InMemory.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/EFCore.Relational/Query/RelationalSqlTranslatingExpressionVisitor.cs | Clones SelectExpression for subquery property binding to avoid mutating shared query state. |
| test/EFCore.Specification.Tests/Query/NorthwindGroupByQueryTestBase.cs | Adds new GroupBy+Select+Where/Order spec tests (and keeps currently-failing shapes skipped). |
| test/EFCore.SqlServer.FunctionalTests/Query/NorthwindGroupByQuerySqlServerTest.cs | Adds SQL Server baselines for the new spec tests. |
| test/EFCore.InMemory.FunctionalTests/Query/NorthwindGroupByQueryInMemoryTest.cs | Skips the newly-added tests which currently fail for InMemory. |
henriquewr
commented
Jun 6, 2026
AndriySvyryd
approved these changes
Jun 6, 2026
Member
|
Thanks for your contribution |
ajcvickers
added a commit
to ajcvickers/efcore
that referenced
this pull request
Jun 16, 2026
…31209, dotnet#26748) - Member-mode ReplaceProjection left stale _clientProjections, producing an invalid mixed projection state. - Made the member-mode overload clear _clientProjections, mirroring the index-mode overload. - Fixes member Select / second GroupBy composed on top of GroupBy(k).Select(g => g.First()). - Enables 4 previously-skipped tests and adds 4 regression tests; InMemory's separate limitation stays skipped. - Verified across ~27k query tests (SQLite + SqlServer) with zero regressions. A SelectExpression holds its projection as either a projection-member mapping or a client-projection (index) list, never both. The two ReplaceProjection overloads were asymmetric: the index-mode overload cleared both representations, but the member-mode overload cleared only the member mapping. So a member-mode Select following an index-mode Select left stale client projections behind, and ApplyProjection then took the index path over a member-based shaper and threw. This change makes the member-mode overload clear the client-projection list too, restoring the invariant. This is a partial follow-up to dotnet#38140. It covers the member-Select / second-GroupBy family on top of the "entity per group" idiom. InMemory has an independent limitation in the same area and remains skipped. ## The original issue dotnet#31209 (and the older, more general dotnet#26748) collect a large family of GroupBy translation failures, all sharing one shape: a GroupBy whose Select projects an entire entity per group (the "latest row per group" idiom), with a further operator composed on top: db.Set<Order>() .GroupBy(o => o.CustomerID) .Select(g => g.OrderByDescending(x => x.Date).First()) // one whole entity per group .<further operator> // the trigger Reporters hit it through many surface forms - a second Select projecting a member, a final Where/OrderBy, a result-selector GroupBy, a second GroupBy, FirstOrDefault instead of First. The original repros ended in a second GroupBy (dotnet#31209) or in Count/Select/OrderBy over a composite (anonymous-type) key (dotnet#26748). The symptom was a confusing KeyNotFoundException: The given key 'EmptyProjectionMember' was not present in the dictionary, thrown deep in projection post-processing rather than at the offending operator, which is why the threads accumulated years of dead-end workarounds (FromSqlRaw round-trips, query-tag interceptors, ToList().AsQueryable()). dotnet#38140 fixed part of this by cloning the subquery SelectExpression in RelationalSqlTranslatingExpressionVisitor.BindProperty, which resolved the Where/OrderBy/ anonymous-type and Count cases. It explicitly left the member-Select-after case unsolved, and after that fix the remaining cases surfaced a different error, InvalidOperationException: Nullable object must have a value. ## What is actually being fixed Root cause is an invariant violation in SelectExpression. A projection is represented as exactly one of: - _projectionMapping - keyed by ProjectionMember (member mode), or - _clientProjections - an index list (client/index mode) GetProjection reads one or the other depending on whether the binding carries a ProjectionMember or an Index; the two are never meant to coexist. ReplaceProjection has two overloads, and they were asymmetric: - ReplaceProjection(IReadOnlyList<Expression>) clears _projectionMapping AND _clientProjections - ReplaceProjection(IReadOnlyDictionary<...>) cleared ONLY _projectionMapping <-- bug Sequence that breaks: 1. .Select(g => g.First()) falls back to index-based binding -> populates _clientProjections. 2. .Select(r => r.EmployeeID) translates in member mode -> repopulates _projectionMapping via the Dictionary overload, but leaves _clientProjections stale and non-empty. 3. ApplyProjection sees _clientProjections.Count > 0, takes the index path, and the ClientProjectionRemappingExpressionVisitor walks the member-based shaper (Index == null) -> "Nullable object must have a value" (was EmptyProjectionMember before dotnet#38140). The fix makes the Dictionary overload also clear _clientProjections and _aliasForClientProjections, making it symmetric with the List overload and restoring the one-representation invariant. ## Challenges - Misleading symptom. The exception is thrown in post-processing (ApplyProjection), far from the Select that creates the bad state, and it had already mutated once (EmptyProjectionMember -> Nullable-must-have-a-value) across dotnet#38140, so the stack trace points away from the cause. - Two plausible fix sites. The contributor on dotnet#38140 pursued the upstream NavigationExpandingExpressionVisitor.ProcessSelect path (making the GroupBy Select return a pending selector) and found it cascades into broad regressions requiring a redesign of pending- selector handling. The actual defect is downstream, in the relational SelectExpression, where a one-line symmetry restoration suffices. - Shared, hot code path. ReplaceProjection(IReadOnlyDictionary) is called from ~7 sites across the relational, SqlServer, and Sqlite providers. The change had to be proven safe everywhere, not just for the reported query. ## Outcome Production change: clear _clientProjections / _aliasForClientProjections in the member-mode ReplaceProjection overload. The clear is a no-op whenever the list is already empty (the normal case), so the only behavioral change is in the previously-invalid mixed state. Tests: - Un-skipped GroupBy_Select_Entire_Entity_Select and _Where_Select. - Added regression tests: GroupBy_Select_Entire_Entity_FirstOrDefault_Where, GroupBy_ResultSelector_Entire_Entity_Where (result-selector entry point), GroupBy_Select_Entire_Entity_GroupBy (the literal dotnet#31209 repro), GroupBy_Select_Entire_Entity_composite_key_Select (the literal dotnet#26748 repro). - Added SqlServer AssertSql baselines for all of the above (clean correlated-subquery / nested GROUP BY SQL). - InMemory keeps these skipped - it has a separate limitation in the same area. ## Review and validation Three independent reviewers analysed the change for blast radius, cross-shape consequences, and coverage gaps; their proposed experiments were then run. - Blast radius: all 7 callers of the member-mode overload traced. Six provably have an empty _clientProjections at call time (fresh select, just-cleared, or member-mode by construction), so the clear is a literal no-op; the seventh is the bug's target. - One genuine behavioral delta - the set-operation guard (SetOperationsNotAllowedAfterClientEvaluation). Verified empirically: * GroupBy(k).Select(g => g.First()).Select(o => o.OrderID).Union(...) previously threw spuriously and now returns correct results - corrective. * GroupBy(k).Select(g => g.First()).Union(...) (no intervening member Select) still correctly throws - the fix does not suppress the legitimate guard. - Structural invariant confirmed: when a kept value depends on a collection / correlated subquery / JSON node / entity held in _clientProjections, member-mode translation bails to the index fallback (List overload), so the Dictionary clear never fires destructively. No silent drop of collections, includes, JSON, or split-query metadata. - Pre-existing limitation, not a regression: projecting a navigation off the picked entity (GroupBy(k).Select(g => g.First()).Select(r => r.Customer.City)) fails as "ProjectionBindingExpression: 0 could not be translated". Confirmed by reverting the fix that it fails identically with and without this change. Separate, deeper issue. - dotnet#26748 cross-check: its Count and OrderBy repros were already resolved by dotnet#38140; its remaining member-Select repro (composite/anonymous key) failed with the exact "Nullable object must have a value" error and is resolved by this change. Verified by removing the fix and re-running. - Regression suites: ~12.1k SQLite and ~14.9k SqlServer query tests pass with zero failures; the NorthwindGroupBy class is green on SQLite, SqlServer, and InMemory. Fixes dotnet#31209. Fixes dotnet#26748. Follow-up to dotnet#38140.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
"partially" fixes #30052, #31209 and probably others
The issue was that the QueryExpression instance is the same reference as the external instance
So the inner projection was wrongly applied to the external too
This fixes the original
EmptyProjectionMembererror, nowGroupBy(...).Select(...).Where(...)(or any method that follows the Select call, as long as it is not other Select call) works, but isn't the entire thingIf the
Selectcall following aGroupByfalls back to_indexBasedBinding(RelationalProjectionBindingExpressionVisitor or InMemory)GroupBy(...).Select(SomethingThatFallsbackToIndexBasing)it is no longer possible to apply any projection that doesn't also falls back to
_indexBasedBindingThis occurs because the second projection cannot override the previous one (
_clientProjectionhas a projection and_projectionMappinghas other projection)It seems that this entire thing is caused because the
GroupByversion ofNavigationExpandingExpressionVisitor.ProcessSelectwhich does not return a pending selector. Instead, it returns a direct Select call that is not extensibleI tested the approach returning a pending selector, which resolved the issue (allowing
GroupBy(...).Select(SomethingThatFallsbackToIndexBasing).Select(...)to work)However doing it generates lots of other problems, it would need a different approach to how pending selectors are handled
Certain methods would need to resolve the pending selector before the call
GroupBy().Select().Join(),Joinwould need to resolve the pending selector before the callAt least a workaround exists:
Instead of chaining Select calls:
You can nest Select calls:
About the skipped InMemory test:
It seems to be caused by the same issue
Is supposed to be transformed into
And since
GroupBy.Selectdoes not return a pending selector, theWherecall fails to retrieve the value