Skip to content

Commit ee4793e

Browse files
T-GroCopilotabonie
authored
Fix quotations leaking empty-string match lowering into AST (#19873) (#19923)
* Move empty-string match optimization to Optimizer so quotations stay clean The empty-string lowering added in #19189 produced `if x <> null then x.Length = 0 else false` inside `BuildSwitch`, which leaked through pattern-match compilation into captured quotations (#19873). Move the rewrite to `OptimizeExprOp`, which already skips `Expr.Quote` bodies. `match s with "" -> _` keeps the same IL and quotations now see `op_Equality(s, "")`. As a side effect, `if s = "" then _` gets the same null-safe length-check IL — improvement, not regression. Also adds a generic `verifyOutputAgainstBaseline` helper in `Compiler.fs` and a `.bsl`-driven quotation rendering test suite under `Conformance/Expressions/ ExpressionQuotations/QuotationRendering/` to catch future leaks of pattern-match lowerings into quotation ASTs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Lock in IL trade-offs from rubber-duck review; tighten optimizer guard Adversarial reviews flagged two IL-quality concerns. Verified both by inspecting actual ildasm output, accepted both, and added IL tests locking the current behavior in so future changes are conscious: 1. --optimize- (debug) builds no longer get the null+Length fast path for `match s with "" -> _` because F#'s `(=)` is only inlined when LocalOptimizationsEnabled is true. The fallback `String.Equals(s,"")` call is still correct; JIT tiered compilation reaches the fast path. 2. `match s with null | "" -> _` emits one redundant `brfalse` on the empty-string branch because the optimizer cannot see the enclosing null-filtered context that BuildSwitch's `isNullFiltered` flag tracked. JIT trivially eliminates the redundant branch. Also tightens `IsILMethodRefSystemStringEquals` to require the call to be static (rejects hypothetical instance/user-defined `System.String` type with a 2-arg static `Equals`) and fixes a stale function-name comment in PatternMatchCompilation.fs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Polish quotation rendering test suite Driver: extract `printerProgram` helper using a triple-quoted F# string (no more `\"\"` escape hell); group tests into Regression / NoLeakReference / SideEffect / PreExistingError submodules. Baselines: drop the `Match` suffix (folder is already `QuotationRendering`), delete redundant `SparseIntMatch` (same code path as Consecutive), shrink all multi-case tests to the minimum that still demonstrates the point: - ConsecutiveInts: 1..10 → 1..3 (was 41 lines, now 7) - Chars, NonEmptyString: 3 → 2 cases - Int64, Float, Decimal: 3 → 1 case Result: 9 baselines totalling 37 lines, each fits on screen. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Tighten test comments; collapse no-leak quotation tests to Theory The empty-string regression tests and surrounding trade-off comments grew beyond what they need to convey. This trims prose to the causal mechanism only, flattens four banner-only sub-modules into one, collapses the five no-leak reference tests into a parametrised Theory, and corrects a stale docstring claim that Int64/Float/Decimal each take a distinct BuildSwitch arm (Int64 and Float share the mkILAsmCeq path, so Float.bsl was redundant with Int64.bsl and is removed). Release note shrinks from a paragraph to the one user-observable change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Refactor quotation rendering tests to in-process literals after multi-architect review Three rounds of adversarial multi-model (gpt-5.5, opus-4.7-xhigh, opus-4.8) architect review converged on a cleaner design for the empty-string regression coverage: - Quotation literals now live directly in test method bodies (`<@ … @>`) instead of inside a generated FSI script. They are desugared at test-project compile time by the proto fsc, which converts a runtime baseline diff into a compile-time gate: if #19873 ever regresses, the test source itself fails to compile with FS0452. - Removes the shared-FSI session, the no-op `RunTestCasesInSequence` cargo cult, the string-templated printer with its triple-quote escape guard, and the unused `verifyOutputAgainstBaseline` helper this PR briefly introduced. - Baselines carry a `// <name>` provenance header so a `.bsl` opened in isolation identifies itself in PR diffs. - Adds an orphan-guard `[<Fact>]` that fails when `*.bsl` on disk drifts from the test method set (skipped during `TEST_UPDATE_BSL=1` to avoid racing with writes). - Adds a structural Expr walker as belt-and-suspenders alongside the baseline for the two known leaked-lowering shapes (`String.Length`, `String.Equals`). - Adds a convergence assertion that the `match x with ""` and `if x = ""` quotations desugar to byte-identical Exprs, using shared module-level let bindings so the proof and the baselines cannot drift apart. - Splits the FS0452 array-pattern diagnostic test into its own file since it uses a different harness (`FSharp |> typecheck |> shouldFail`, not a `.bsl` snapshot). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Use tname_* constants instead of magic strings in IL method-ref matchers `tname_String`, `tname_Bool`, `tname_Type` already exist in `src/Compiler/AbstractIL/il.fs` as `[<Literal>]` constants alongside ~20 other type-name literals, but they were not exposed via `il.fsi` so callers in `Optimize/Optimizer.fs` and `Symbols/Exprs.fs` had been duplicating the magic strings (`"System.String"`, `"System.Boolean"`, `"System.Type"`). Exposes the three names actually referenced outside `il.fs` via `il.fsi` and switches the four affected `IL{TypeRef,MethodRef}.Name` comparisons to use them. No behavior change. The fix applies to the empty-string-equals matcher this PR added plus the pre-existing `String.Concat` / `String.GetHashCode` / `Type.GetTypeFromHandle` matchers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Trim bloated comments in QuotationRenderingTests.fs 5-line walker docstring, 3-line section-divider blocks, 6-line bullet list above the no-leak reference Facts, and a 12-line "Design notes" header were rationalising self-evident code or restating test names. Trimmed to the essentials: top-of-file purpose + bootstrap caveat, one-line note next to the shared `let` bindings, and the one genuinely-non-obvious comment (Int64 takes the mkILAsmCeq arm). Net effect: file shrinks from 156 to 100 lines, signal density up, tests unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Remove orphan-baseline guard Fact It tested filesystem bookkeeping (does *.bsl on disk equal a hand-maintained expectedBaselines set), not the compiler. The drift it catches — a deleted test leaving its .bsl behind — is already visible in `git diff` and adds zero signal anyone would actually use to find a regression. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Cut surviving garbage after multi-agent adversarial scan Three garbage hunters (production / tests / prose) found redundancies that slipped through the earlier iterations: Production: - Merge two String.Equals(_, "")/("", _) optimizer arms into one or-pattern. - Drop the redundant CallingConv.IsStatic guard in IsILMethodRefSystemStringEquals (ArgCount=2 + both args String + return Bool already pin the static overload). - Inline single-use let bindings in MakeOptimizedStringEqualsEmptyCall. - Shrink 4-line docstring to 2 lines; shrink match-arm comment to one line. - Shrink il.fsi block comment and PatternMatchCompilation.fs trade-off comment. Tests: - Delete QuotationUnsupportedConstructsTests.fs (FS0452 array-pattern coverage already exists in Regressions/E_DecomposeArray01.fs — pure duplication). - Delete IfEqualEmpty.bsl + its test + the convergence Fact + the shared `let` bindings + assertNoEmptyStringLowering walker; all duplicated EmptyString.bsl coverage either as identical AST or as belt-and-suspenders restating the baseline diff. - Strip the `// <name>` provenance header from the printer and from all 7 remaining baselines — the test method name already identifies each baseline. - Trim trade-off comment tails in TypeTestsInPatternMatching.fs (drop the JIT-folds-the-duplicate / tiered-compilation trivia). Release notes: drop the implementation-detail bullet about `if s = ""` getting the same null-safe IL — implied by moving the rewrite to the optimizer. Net: -115/+30 across 16 files. 14 tests still pass; build clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Refactor String IL-method matchers via active pattern + InlineIfLambda Extracts a `(|StringTy|_|)` partial active pattern over `ILType` and a small `isILMethodRefOnSystemString` helper that bakes in `tname_String` as the declaring type and takes the per-arg-shape check as an `[<InlineIfLambda>]`. Each `IsILMethodRefSystem*` now reads as the literal shape it matches: `[StringTy; StringTy]` for Equals(string, string); the three overload arities of String.Concat as alternative list patterns; `[ILType.Array (shape, StringTy)]` for Concat(string[]). No `ArgCount` book-keeping or `List.forall` walks left. Behaviour unchanged: build clean, 14 quotation + IL tests still green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address PR review + fix CI: drop misplaced comment, run quotations via FSI - PatternMatchCompilation.fs: remove the misplaced `// Empty-string is rewritten…` comment on the (now-pristine) string arm (review feedback from @abonie). - QuotationRenderingTests.fs: replace literal `<@…@>` quotations with source-string evaluation through a shared FSI session. The bootstrap fsc that builds the test project still has the pre-fix desugar and rejects literal `match s with ""` quotations with FS0452 — the literals only become valid AFTER this PR is in the bootstrap. FSI uses the just-built FCS (with the fix), so source-as-string evaluation is bootstrap-immune. The .bsl baselines are unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Adam Boniecki <20281641+abonie@users.noreply.github.com>
1 parent b8b40f5 commit ee4793e

15 files changed

Lines changed: 261 additions & 27 deletions

File tree

docs/release-notes/.FSharp.Compiler.Service/11.0.100.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -75,6 +75,7 @@
7575
([PR #19724](https://github.com/dotnet/fsharp/pull/19724))
7676
* Emit debug points at a stack-empty position ([PR #19877](https://github.com/dotnet/fsharp/pull/19877))
7777
* Fix spurious XmlDoc warnings (unknown parameter / no documentation for parameter) under `--warnon:3390` when a get/set property documents the full parameter set across both accessors. ([Issue #13684](https://github.com/dotnet/fsharp/issues/13684), [PR #19884](https://github.com/dotnet/fsharp/pull/19884))
78+
* Quotations of `match s with "" -> _` no longer leak the `s <> null && s.Length = 0` lowering; the empty-string optimization moved from pattern-match compilation to the optimizer so quoted expressions keep `op_Equality(s, "")`. ([Issue #19873](https://github.com/dotnet/fsharp/issues/19873))
7879

7980
### Added
8081

src/Compiler/AbstractIL/il.fsi

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2471,6 +2471,17 @@ val internal ecmaPublicKey: PublicKey
24712471
/// Strips ILType.Modified from the ILType.
24722472
val internal stripILModifiedFromTy: ILType -> ILType
24732473

2474+
// Built-in type names exposed for `mref.DeclaringTypeRef.Name = tname_X` matchers outside il.fs.
2475+
2476+
[<Literal>]
2477+
val internal tname_String: string = "System.String"
2478+
2479+
[<Literal>]
2480+
val internal tname_Type: string = "System.Type"
2481+
2482+
[<Literal>]
2483+
val internal tname_Bool: string = "System.Boolean"
2484+
24742485
/// Discriminating different important built-in types.
24752486
val internal isILObjectTy: ILGlobals -> ILType -> bool
24762487
val internal isILStringTy: ILGlobals -> ILType -> bool

src/Compiler/Checking/PatternMatchCompilation.fs

Lines changed: 0 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -810,13 +810,6 @@ let rec BuildSwitch inpExprOpt g isNullFiltered expr edges dflt m =
810810
let test = mkILAsmCeq g m (mkLdlen g m vExpr) (mkInt g m n)
811811
let finalTest = if isNullFiltered then test else mkLazyAnd g m (mkNonNullTest g m vExpr) test
812812
mkLetBind m bind finalTest
813-
| DecisionTreeTest.Const (Const.String "") ->
814-
// Optimize empty string check to use null-safe length check
815-
let _v, vExpr, bind = mkCompGenLocalAndInvisibleBind g "testExpr" m testexpr
816-
let test = mkILAsmCeq g m (mkGetStringLength g m vExpr) (mkInt g m 0)
817-
// Skip null check if we're in a null-filtered context
818-
let finalTest = if isNullFiltered then test else mkLazyAnd g m (mkNonNullTest g m vExpr) test
819-
mkLetBind m bind finalTest
820813
| DecisionTreeTest.Const (Const.String _ as c) ->
821814
mkCallEqualsOperator g m g.string_ty testexpr (Expr.Const (c, m, g.string_ty))
822815
| DecisionTreeTest.Const (Const.Decimal _ as c) ->

src/Compiler/Optimize/Optimizer.fs

Lines changed: 40 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -2291,27 +2291,35 @@ let TryDetectQueryQuoteAndRun cenv (expr: Expr) =
22912291
//printfn "Not eliminating because no Run found"
22922292
None
22932293

2294+
let inline (|StringTy|_|) (ilTy: ILType) : bool =
2295+
ilTy.IsNominal && ilTy.TypeRef.Name = tname_String
2296+
2297+
let inline private isILMethodRefOnSystemString
2298+
(methodName: string)
2299+
(returnTypeName: string)
2300+
([<InlineIfLambda>] argsCheck: ILType list -> bool)
2301+
(mref: ILMethodRef) =
2302+
mref.Name = methodName &&
2303+
mref.DeclaringTypeRef.Name = tname_String &&
2304+
mref.ReturnType.IsNominal && mref.ReturnType.TypeRef.Name = returnTypeName &&
2305+
argsCheck mref.ArgTypes
2306+
2307+
let IsILMethodRefSystemStringEquals (mref: ILMethodRef) =
2308+
mref |> isILMethodRefOnSystemString "Equals" tname_Bool (function
2309+
| [StringTy; StringTy] -> true
2310+
| _ -> false)
2311+
22942312
let IsILMethodRefSystemStringConcat (mref: ILMethodRef) =
2295-
mref.Name = "Concat" &&
2296-
mref.DeclaringTypeRef.Name = "System.String" &&
2297-
(mref.ReturnType.IsNominal && mref.ReturnType.TypeRef.Name = "System.String") &&
2298-
(mref.ArgCount >= 2 && mref.ArgCount <= 4 &&
2299-
mref.ArgTypes
2300-
|> List.forall (fun ilTy ->
2301-
ilTy.IsNominal && ilTy.TypeRef.Name = "System.String"))
2313+
mref |> isILMethodRefOnSystemString "Concat" tname_String (function
2314+
| [StringTy; StringTy]
2315+
| [StringTy; StringTy; StringTy]
2316+
| [StringTy; StringTy; StringTy; StringTy] -> true
2317+
| _ -> false)
23022318

23032319
let IsILMethodRefSystemStringConcatArray (mref: ILMethodRef) =
2304-
mref.Name = "Concat" &&
2305-
mref.DeclaringTypeRef.Name = "System.String" &&
2306-
(mref.ReturnType.IsNominal && mref.ReturnType.TypeRef.Name = "System.String") &&
2307-
(mref.ArgCount = 1 &&
2308-
mref.ArgTypes
2309-
|> List.forall (fun ilTy ->
2310-
match ilTy with
2311-
| ILType.Array (shape, ilTy) when shape = ILArrayShape.SingleDimensional &&
2312-
ilTy.IsNominal &&
2313-
ilTy.TypeRef.Name = "System.String" -> true
2314-
| _ -> false))
2320+
mref |> isILMethodRefOnSystemString "Concat" tname_String (function
2321+
| [ILType.Array (shape, StringTy)] when shape = ILArrayShape.SingleDimensional -> true
2322+
| _ -> false)
23152323

23162324
let rec IsDebugPipeRightExpr cenv expr =
23172325
let g = cenv.g
@@ -2540,6 +2548,14 @@ and MakeOptimizedSystemStringConcatCall cenv env m args =
25402548
| _ ->
25412549
OptimizeExpr cenv env expr
25422550

2551+
/// Rewrite `String.Equals(x, "")` to `x <> null && x.Length = 0` (issue #19873
2552+
/// done here so quotation bodies, which the optimizer skips, keep `op_Equality(_, "")`).
2553+
and MakeOptimizedStringEqualsEmptyCall cenv env m nonEmptyArg =
2554+
let g = cenv.g
2555+
let _, vExpr, bind = mkCompGenLocalAndInvisibleBind g "testExpr" m nonEmptyArg
2556+
let lengthIsZero = mkILAsmCeq g m (mkGetStringLength g m vExpr) (mkInt g m 0)
2557+
OptimizeExpr cenv env (mkLetBind m bind (mkLazyAnd g m (mkNonNullTest g m vExpr) lengthIsZero))
2558+
25432559
/// Optimize/analyze an application of an intrinsic operator to arguments
25442560
and OptimizeExprOp cenv env (op, tyargs, args, m) =
25452561

@@ -2611,6 +2627,12 @@ and OptimizeExprOp cenv env (op, tyargs, args, m) =
26112627
when IsILMethodRefSystemStringConcat ilMethRef ->
26122628
MakeOptimizedSystemStringConcatCall cenv env m args
26132629

2630+
// See MakeOptimizedStringEqualsEmptyCall (issue #19873). `(=)` lowers to `String.Equals` post-inlining.
2631+
| TOp.ILCall(_, _, _, _, _, _, _, ilMethRef, _, _, _), _,
2632+
([nonEmpty; Expr.Const(Const.String "", _, _)] | [Expr.Const(Const.String "", _, _); nonEmpty])
2633+
when IsILMethodRefSystemStringEquals ilMethRef ->
2634+
MakeOptimizedStringEqualsEmptyCall cenv env m nonEmpty
2635+
26142636
| _ ->
26152637
// Reductions
26162638
OptimizeExprOpReductions cenv env (op, tyargs, args, m)

src/Compiler/Symbols/Exprs.fs

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -759,14 +759,14 @@ module FSharpExprConvert =
759759
ConvExprPrim cenv env op
760760

761761
| TOp.ILAsm ([ I_call (Normalcall, mspec, None) ], _), _, [arg]
762-
when mspec.MethodRef.DeclaringTypeRef.Name = "System.String" && mspec.Name = "GetHashCode" ->
762+
when mspec.MethodRef.DeclaringTypeRef.Name = tname_String && mspec.Name = "GetHashCode" ->
763763
let ty = tyOfExpr g arg
764764
let op = mkCallHash g m ty arg
765765
ConvExprPrim cenv env op
766766

767767
| TOp.ILCall (_, _, _, _, _, _, _, ilMethRef, _, _, _), [],
768768
[Expr.Op (TOp.ILAsm ([ I_ldtoken (ILToken.ILType _) ], _), [ty], _, _)]
769-
when ilMethRef.DeclaringTypeRef.Name = "System.Type" && ilMethRef.Name = "GetTypeFromHandle" ->
769+
when ilMethRef.DeclaringTypeRef.Name = tname_Type && ilMethRef.Name = "GetTypeFromHandle" ->
770770
let op = mkCallTypeOf g m ty
771771
ConvExprPrim cenv env op
772772

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
Lambda (x,
2+
IfThenElse (Call (None, op_Equality, [x, Value ('a')]), Value (1),
3+
IfThenElse (Call (None, op_Equality, [x, Value ('b')]),
4+
Value (2), Value (0))))
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
Lambda (x,
2+
IfThenElse (Call (None, op_Equality, [x, Value (1)]), Value ("a"),
3+
IfThenElse (Call (None, op_Equality, [x, Value (2)]),
4+
Value ("b"),
5+
IfThenElse (Call (None, op_Equality,
6+
[x, Value (3)]), Value ("c"),
7+
Value ("z")))))
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
Lambda (x,
2+
IfThenElse (Call (None, op_Equality,
3+
[x,
4+
Call (None, MakeDecimal,
5+
[Value (1), Value (0), Value (0), Value (false),
6+
Value (0uy)])]), Value ("a"), Value ("b")))
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
Lambda (x,
2+
IfThenElse (Call (None, op_Equality, [x, Value ("")]), Value (1),
3+
Value (0)))
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
Lambda (x,
2+
IfThenElse (Call (None, op_Equality, [x, Value (1L)]), Value ("a"),
3+
Value ("b")))

0 commit comments

Comments
 (0)