feat: collection expression support#402

Merged
tris203 merged 1 commit into
tree-sitter:masterfrom
gustavoaca1997:user/gustavoaca1997/fix-collection-expression
Apr 14, 2026

@gustavoaca1997

@gustavoaca1997 gustavoaca1997 commented Jan 17, 2026

Contributor

This PR closes #401

This is my first time contributing to a Tree Sitter grammar, so please let me know if you have any concerns, suggestions, extra test cases, etc.

Change details

  • Add new test cases to expressions.txt.
  • Modify the test case Array with trailing comma. Roslyn actually parses [ y, ] as a collection_expression, so I've updated the test to reflect that. I verified this with SharpLab.
  • Add grammar rules for collection_element, expression_element, spread_element, and collection_expression.

How the changes were verified

I ran the tests locally to verify that they pass.

Closes #409

@tris203

tris203 commented Jan 19, 2026

Collaborator

I have updated CI.
Could you rebase and regenerate with 0.26.3, please?

@gustavoaca1997

gustavoaca1997 commented Jan 19, 2026

Contributor Author

I have updated CI. Could you rebase and regenerate with 0.26.3, please?

Done @tris203 :) Thanks

@gustavoaca1997 gustavoaca1997 force-pushed the user/gustavoaca1997/fix-collection-expression branch from e54ff03 to 3685d1a Compare January 19, 2026 23:50
@gustavoaca1997

Contributor Author

@tris203 I have force-pushed the branch again, because the PR was accidentally showing changes that were already in the master branch.

@gustavoaca1997

gustavoaca1997 commented Jan 20, 2026

Contributor Author

@tris203 the files failing in the CI are valid C# 14, but are using expressions that are not yet supported: #392

These are the statements that are failing to be parsed in my branch, which should fail as well in master:

  • This line from PowerToys' ListViewModel. It has a Conditional Access Expression as an L-Value, which the current grammar does not yet support.
 listPage.Filters?.CurrentFilterId = currentFilterId;
  • This line from PowerToys' CommandItemViewModel.cs. It also has a Conditional Access Expression as an L-Value.
_defaultCommandContextItemViewModel?.Command = Command;
  • This line from PowerToys' KernelServiceBase.cs. I believe this is a bug in the grammar, but that still could be solved by supporting Conditional Access Expressions as L-Values. I think it's a mistake that it is parsing the macros, instead of pre-processing the file, which is causing the Conditional Access Expression to be treated as an L-Value rather than the return expression of the lambda function.
static string Redact(object data) =>
#if DEBUG
            data?.ToString();
#else
            "[Redacted]";
#endif

It's possible that #400 fixes it.

@tris203 tris203 force-pushed the user/gustavoaca1997/fix-collection-expression branch from 3685d1a to 791f279 Compare April 13, 2026 22:13
@tris203 tris203 force-pushed the user/gustavoaca1997/fix-collection-expression branch from 791f279 to 095f7cf Compare April 14, 2026 14:54
@tris203 tris203 changed the title Add support for collection expression. feat: collection expression support Apr 14, 2026
@tris203 tris203 merged commit ff2a624 into tree-sitter:master Apr 14, 2026
4 checks passed
@guillaume86

great! what is the timeline for a release on npm usually?

@tris203

tris203 commented Apr 15, 2026

Collaborator

great! what is the timeline for a release on npm usually?

It's already done.

@guillaume86

guillaume86 commented Apr 15, 2026 via email

@AlexLaroche AlexLaroche mentioned this pull request Jun 2, 2026
4 tasks
damieng pushed a commit that referenced this pull request Jun 2, 2026
* fix: parse [type]/[field]/etc. as collection_expression, and list_pattern slice subpatterns

Two latent grammar gaps surfaced by upstream-corpus drift in CI's example
clones (microsoft/PowerToys, nunit/nunit) since master's last CI run.

## Gap 1 — attribute-target keywords in collection-element position

`[type]`, `[field]`, `[method]`, `[param]`, `[property]`, `[typevar]` (and
nunit's `{ TypeArgs = [type] }` in particular) all failed to parse as
`collection_expression` — the outer node was produced, but the inner
element became an ERROR. The 8 literal keywords used by
`attribute_target_specifier` (`field`, `event`, `method`, `param`,
`property`, `return`, `type`, `typevar`) won the LR resolution over
`_identifier_token` even though `word: $._identifier_token` is set.

Fix: add the 6 non-keyword members of that set (`field`, `method`,
`param`, `property`, `type`, `typevar`) to `_reserved_identifier`, the
same list that already houses `var`/`from`/`scoped`/etc. for the same
reason. Exclude `event` and `return` — those are real C# keywords and
must not parse as identifiers anywhere else.

Bump `attribute_target_specifier` to `prec(1)` so the LR table prefers
the attribute-target reading when `:` follows (resolves the conflict
introduced by the new `_reserved_identifier` entries).
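
The set arithmetic behind this fix (8 attribute-target words, minus the 2 real keywords, leaves 6 reserved identifiers) can be sketched in C. This is only an illustration: the helper names `word_list_contains` and `is_new_reserved_identifier` are hypothetical, and the actual change is made in the grammar's `_reserved_identifier` rule, not in C code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The 8 literal words used by attribute_target_specifier (per the commit message). */
static const char *const ATTRIBUTE_TARGET_WORDS[] = {
    "field", "event", "method", "param", "property", "return", "type", "typevar",
};

/* The 2 members that are real C# keywords and must never parse as identifiers. */
static const char *const TRUE_KEYWORDS[] = { "event", "return" };

/* Linear search over a small word list. */
bool word_list_contains(const char *const *words, size_t count, const char *word) {
    assert(words != NULL && word != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(words[i], word) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: true for exactly the 6 words the fix adds to
 * _reserved_identifier, i.e. attribute-target words that are not keywords. */
bool is_new_reserved_identifier(const char *word) {
    const size_t target_count =
        sizeof ATTRIBUTE_TARGET_WORDS / sizeof ATTRIBUTE_TARGET_WORDS[0];
    const size_t keyword_count = sizeof TRUE_KEYWORDS / sizeof TRUE_KEYWORDS[0];
    return word_list_contains(ATTRIBUTE_TARGET_WORDS, target_count, word) &&
           !word_list_contains(TRUE_KEYWORDS, keyword_count, word);
}
```

Under this sketch, `type` and `field` qualify as identifiers in element position, while `event` and `return` do not.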

## Gap 2 — slice pattern with bound subpattern (C# 11)

`[a, .. var rest, z]`, `[.. List<int> all]`, etc. failed to parse —
`list_pattern` accepted bare `..` as a token but no subpattern after.
PowerToys' `case ['"', .. var inner, '"']` (in
`ImplicitWildcardQueryBuilder.cs`) is the canonical example.

Fix: introduce a `slice_pattern` rule (`'..' optional($.pattern)`) and
use it in place of the bare `'..'` token inside `list_pattern`. The bare
`..` case continues to parse — it produces a `(slice_pattern)` node
instead of consuming `..` as a token-only element. Existing "List
patterns" corpus test updated to reflect the new node shape; new "List
pattern with slice subpatterns (C# 11)" fixture covers the bound cases.

Both gaps have been latent since the relevant features landed
(collection-expression support in PR #402, April 2026; list patterns
much earlier). They surfaced only now because master's CI workflow
doesn't run on a schedule and master hasn't received a triggering
commit since 2026-04-14, so 49 days of upstream corpus drift went
unvalidated until the next contributor's PR re-ran the example parse.

* feat: support null-conditional assignment (C# 14)

C# 14 allows the assignment-target side to be a null-conditional
member or element access:

  obj?.field = value;
  obj?.list[0] = value;
  obj?[key] = value;

Previously, `conditional_access_expression` was only listed under
`non_lvalue_expression`, so any `obj?.x = ...` produced ERROR nodes
and the rest of the surrounding method failed to parse correctly.

Add `conditional_access_expression` to `lvalue_expression` and
declare the resulting `lvalue_expression` / `non_lvalue_expression`
overlap as an explicit conflict so tree-sitter uses GLR to keep both
alternatives alive until the `=` (or absence thereof) disambiguates.

Add a corpus test covering all three shapes (member, member+index,
direct index).

References:
- https://learn.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-14#null-conditional-assignment
- #392

* chore: regenerate src/* for Gaps 1/2/3 grammar changes

Regenerates the parser tables, grammar.json, and node-types.json after
applying the three grammar fixes on this branch:

- `slice_pattern` and the new `_reserved_identifier` members produce new
  parser states.
- `node-types.json` gains `slice_pattern`; `_reserved_identifier`'s
  subtype list expands.
- `attribute_target_specifier`'s `prec(1)` flows through the LR table.

`src/tree_sitter/array.h` intentionally left untouched — local
tree-sitter CLI version emits a slightly different layout from what
master shipped, and that drift is incidental, not part of these gaps.

* test: pin nameof support for unbound generic types (C# 14)

C# 14 allows `nameof(List<>)` / `nameof(Dictionary<,>)` etc. The
grammar already accepts these because `type_argument_list` permits
`repeat(',')` (line 764), but there was no corpus test pinning the
behavior. Add an explicit "Nameof with unbound generic types" test
covering arity-1/2/3 unbound generics, so any future tightening of
type_argument_list is caught.

References:
- https://learn.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-14#unbound-generic-types-and-nameof
- #392

* test: add partial constructors and partial events corpus (C# 14)

Partial events and partial constructors are already accepted by the
existing grammar because 'partial' is in the modifier list and both
constructor_declaration and event_declaration consume repeat(modifier).
Add corpus tests to pin the behavior so it cannot silently regress.

Refs: https://learn.microsoft.com/en-us/dotnet/csharp/whats-new/csharp-14#partial-events-and-constructors
Refs: #392

* test: pin field-backed properties with `field` contextual keyword (C# 14)

C# 14 introduces the `field` contextual keyword to refer to the
auto-generated backing field inside property accessor bodies, and
permits a property to mix an auto-accessor (`get;` / `set;`) with
a full accessor (with a body). See:

  https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-14.0/field-keyword

Per the spec, `field` is a *contextual* keyword (a primary expression
inside property accessor bodies), not a reserved word. Tree-sitter's
existing identifier rule and accessor_declaration / property_declaration
rules already accept every example in the proposal:

  { get; set => field = value; }                          // mix auto/full
  { get => field ?? "x"; set; }                          // mix full/auto
  Lazy => field ??= Compute();                            // expression-bodied
  [field: Xyz] public string D => field ??= Compute();   // field-targeted attr
  IsActive { get; set => Set(ref field, value); } = true; // mix + initializer

Disambiguating `field`-as-backing-field vs `field`-as-local is a
semantic concern, not a grammar one, so no grammar change is needed.
This corpus test pins the syntactic coverage so it cannot regress.

Part of #392.

* feat: C# 14 user-defined compound assignment and instance increment operators

Add the new operator tokens introduced by the C# 14
"user-defined compound assignment" proposal so that operator
declarations of the form

  public void operator +=(int x) { ... }
  public void operator checked +=(int x) => ...;
  public void operator ++()    { ... }
  public void operator checked ++() => ...;

parse to operator_declaration nodes.

Per the proposal, compound assignment and instance increment
operators are instance members (no static), return void, and have
one parameter (compound) or no parameters (instance increment).
The shared operator_declaration rule keeps its existing shape; the
only grammar change is to extend the operator token choice list to
include +=, -=, *=, /=, %=, ^=, |=, &=, <<=, >>=, >>>=. The existing
'checked' optional keyword already covers checked +=, -=, *=, /=, and
checked ++/--. Existing modifier rules already accept
override/new/readonly which the proposal newly permits.

Refs: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-14.0/user-defined-compound-assignment
Refs: #392

* feat: C# 14 extension declarations

Add the C# 14 `extension` declaration form so that

  public static class E
  {
      extension<T>(IEnumerable<T> source) where T : INumber<T>
      {
          public bool IsEmpty => !source.Any();
          public IEnumerable<T> Where(...) { ... }
      }

      extension<T>(IEnumerable<T>)
      {
          public static IEnumerable<T> operator +(...) => ...;
          public static IEnumerable<T> Identity => ...;
      }
  }

parses to an extension_declaration node nested under the enclosing
static class. Per the proposal, the grammar is:

  extension_declaration:
    'extension' type_parameter_list? '(' receiver_parameter ')'
    type_parameter_constraints_clause* extension_body
  extension_body: '{' extension_member_declaration* '}' ';'?
  extension_member_declaration:
    method_declaration | property_declaration | operator_declaration
  receiver_parameter: attributes? parameter_modifiers? type identifier?

`extension` is a contextual keyword; it is recognized as the start of
an extension declaration only in class-member position followed by an
optional type parameter list and a parenthesized receiver. A
prec.dynamic boost biases ambiguous parses toward the declaration form
when the context fits. One new conflict is registered to keep the
receiver parameter, scoped_type, and reserved-identifier
interpretations of `scoped` alive until disambiguation.

Refs: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-14.0/extensions
Refs: #392

* feat: C# 14 simple-lambda parameters with modifiers

Adds the four genuinely-new C# 14 lambda forms — `(ref x) => x`,
`(out y) => y = 0`, `(in z) => z`, and the mixed `(text, out result)
=> ...` — by introducing an external lookahead token
(`_lambda_paren_open`) plus a new grammar rule.

The scanner forward-scans past the opening `(` and emits the token
only when it confirms a closing `)=>` after a parameter list
containing at least one hard parameter modifier
(`ref`/`out`/`in`/`readonly`). Because the token appears in exactly
one rule, the parser commits to the lambda interpretation
unambiguously — with zero conflict against parenthesized expressions,
tuples, arguments, or method-declaration parameter lists.

`scoped` is intentionally NOT a hard modifier here: the existing
`Scoped contextual keyword` corpus test pins `(scoped p) => null` as
`parameter_list(parameter(type=identifier(scoped), name=p))`. This
matches the C# 14 spec where `scoped` is auxiliary and combines with
`ref`/`ref readonly`.
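
A minimal sketch of the forward-scan decision, assuming a plain C string in place of tree-sitter's lexer. The function names `is_hard_modifier` and `looks_like_modifier_lambda` are hypothetical, and the real scanner may differ in many details; the sketch only mirrors the rule stated above: accept when a parameter list containing at least one hard modifier is followed by `=>`.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if c can appear in a (simplified, ASCII-only) identifier. */
bool is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* True if word[0..length) is one of the hard modifiers named in the commit.
 * `scoped` is deliberately absent from the list. */
bool is_hard_modifier(const char *word, size_t length) {
    static const char *const HARD_MODIFIERS[] = { "ref", "out", "in", "readonly" };
    for (size_t i = 0; i < sizeof HARD_MODIFIERS / sizeof HARD_MODIFIERS[0]; i++) {
        if (strlen(HARD_MODIFIERS[i]) == length &&
            strncmp(HARD_MODIFIERS[i], word, length) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical forward scan: src must start at '('. Returns true only when
 * the parenthesized list holds nothing but identifiers, commas, and
 * whitespace, includes at least one hard modifier, and is followed by "=>". */
bool looks_like_modifier_lambda(const char *src) {
    assert(src != NULL);
    const char *p = src;
    if (*p != '(') {
        return false;
    }
    p++;
    bool saw_modifier = false;
    while (*p != ')') {
        if (*p == '\0') {
            return false; /* unterminated list */
        }
        if (isspace((unsigned char)*p) || *p == ',') {
            p++;
            continue;
        }
        if (!is_ident_char(*p)) {
            return false; /* e.g. '$' or '"': not a simple parameter list */
        }
        const char *word_start = p;
        while (is_ident_char(*p)) {
            p++;
        }
        if (is_hard_modifier(word_start, (size_t)(p - word_start))) {
            saw_modifier = true;
        }
    }
    p++; /* step past ')' */
    while (isspace((unsigned char)*p)) {
        p++;
    }
    return saw_modifier && p[0] == '=' && p[1] == '>';
}
```

With this sketch, `(text, out result) => ...` is accepted, while `(scoped p) => null` and `(x, y) => x` are rejected because neither contains a hard modifier.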

Refs: #392

* docs: update README to reflect C# 14 support

C# 14 features now supported across the grammar:
- nameof with unbound generic types
- null-conditional assignment (`obj?.x = v`, `obj?[i] = v`)
- partial events and constructors
- user-defined compound assignment + instance increment/decrement
- field-backed properties (`field` contextual keyword)
- extension declarations
- simple-lambda parameters with modifiers (`(ref x) => x`, etc.)

File-based-apps preprocessor directives (#:property, #:package, #:sdk,
#:project) remain unrecognized; noted as a known limitation.

* fix(scanner): lambda-paren forward-scan leaves the lexer dirty on bail

The lambda-paren scanner introduced in 7c7b5f3 advances forward
speculatively past `(` to decide whether the next `(` opens a C# 14
simple-lambda parameter list. On bail-out the wrapper "fell through"
to the rest of the external_scanner_scan function, but the lexer
cursor was already past `(false, $` (etc.) and the next condition
that fired — typically INTERPOLATION_REGULAR_START — emitted a token
whose span ran from the original scan position to the dirty cursor.

Result: any `return (false, $"...");`-shaped tuple expression with
an interpolated string inside a tuple body became unparseable —
the inner `interpolation_start` swallowed `(false, $` and the rest
of the tuple cascaded into ERROR.

Minimal repro:
    return (false, $"Invalid {p}", "Error");

Fix: change `scan_lambda_paren_open` to a tri-state return so the
wrapper can distinguish three cases:

  - LAMBDA_SCAN_SUCCESS     emit LAMBDA_PAREN_OPEN.
  - LAMBDA_SCAN_NO_PAREN    lookahead wasn't `(`; the function did
                            nothing except skip leading whitespace
                            (via `skip(lexer)` which doesn't dirty
                            token boundaries). Wrapper falls through
                            so other handlers (e.g. INTERPOLATION_*)
                            still get a chance.
  - LAMBDA_SCAN_FAILED_AFTER_PAREN
                            cursor advanced past at least `(`.
                            Wrapper returns false; tree-sitter
                            discards every advance and retries with
                            the built-in `(` tokenizer.

A BAIL macro in `scan_lambda_paren_open` makes the failure path
unambiguous at every early return.
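
The three outcomes can be sketched as follows. This is an illustration under stated assumptions, not the scanner's actual code: `MockLexer`, `WrapperAction`, and `dispatch_lambda_scan` are hypothetical stand-ins for tree-sitter's lexer and the wrapper logic, and the accepted shape is reduced to `(ref <name>) =>` to keep the example short. Only the enum constants, the `BAIL` idea, and the name `scan_lambda_paren_open` come from the commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real lexer: a string plus a cursor. */
typedef struct {
    const char *src;
    size_t pos;
} MockLexer;

typedef enum {
    LAMBDA_SCAN_SUCCESS,
    LAMBDA_SCAN_NO_PAREN,
    LAMBDA_SCAN_FAILED_AFTER_PAREN,
} LambdaScanResult;

/* Every early exit after the cursor passed '(' goes through BAIL. */
#define BAIL() return LAMBDA_SCAN_FAILED_AFTER_PAREN

LambdaScanResult scan_lambda_paren_open(MockLexer *lexer) {
    assert(lexer != NULL && lexer->src != NULL);
    /* Models skipping leading whitespace (spaces only in this sketch). */
    while (lexer->src[lexer->pos] == ' ') {
        lexer->pos++;
    }
    if (lexer->src[lexer->pos] != '(') {
        return LAMBDA_SCAN_NO_PAREN; /* nothing consumed beyond whitespace */
    }
    lexer->pos++; /* from here on the cursor is "dirty" */
    if (strncmp(lexer->src + lexer->pos, "ref ", 4) != 0) {
        BAIL();
    }
    lexer->pos += 4;
    size_t name_start = lexer->pos;
    while (isalnum((unsigned char)lexer->src[lexer->pos]) ||
           lexer->src[lexer->pos] == '_') {
        lexer->pos++;
    }
    if (lexer->pos == name_start) {
        BAIL(); /* modifier without a parameter name */
    }
    if (strncmp(lexer->src + lexer->pos, ") =>", 4) != 0) {
        BAIL();
    }
    return LAMBDA_SCAN_SUCCESS;
}

/* Hypothetical wrapper decisions, one per scan result. */
typedef enum {
    EMIT_LAMBDA_PAREN_OPEN,         /* report the external token */
    FALL_THROUGH_TO_OTHER_HANDLERS, /* let e.g. interpolation handlers run */
    REJECT_AND_RETRY_BUILTIN,       /* return false so advances are discarded */
} WrapperAction;

WrapperAction dispatch_lambda_scan(MockLexer *lexer) {
    switch (scan_lambda_paren_open(lexer)) {
        case LAMBDA_SCAN_SUCCESS:
            return EMIT_LAMBDA_PAREN_OPEN;
        case LAMBDA_SCAN_NO_PAREN:
            return FALL_THROUGH_TO_OTHER_HANDLERS;
        case LAMBDA_SCAN_FAILED_AFTER_PAREN:
            return REJECT_AND_RETRY_BUILTIN;
    }
    return REJECT_AND_RETRY_BUILTIN; /* unreachable; keeps compilers quiet */
}
```

The design point is that a two-state (boolean) result cannot tell "did not start" apart from "started and failed", and only the former is safe to fall through on, because in the latter case the cursor has already moved.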

Verified:
  - 184/184 corpus tests pass.
  - 9173/9173 example files parse (full local Parse examples).
  - SettingsBackupAndRestoreUtils.cs (the original PowerToys CI
    failure) now parses 100%.
  - Existing "Simple lambda parameters with modifiers" and
    "Modifier-prefixed parens that are NOT lambda parameter lists"
    corpus tests still pass — the success path is unchanged.

Development

Successfully merging this pull request may close these issues.

Add Support for C# 12 Collection Expressions
