
[Draft] Add generated C# full-generation benchmarks #10885

Draft
live1206 wants to merge 23 commits into microsoft:main from live1206:mtg-manual-name-reduction-experiment

@live1206 live1206 commented Jun 4, 2026


Summary

This PR is now a benchmark/profiling PR for generated C# full generation and post-processing performance.

It adds full-generation benchmark/profiling coverage and records the current performance evidence that motivated the production hybrid reference-map work in #10976. The earlier manual C# name-reduction experiment was removed because it did not improve performance on scaled generated-code corpora.

What Changed

  • Added full-generation benchmark coverage for Sample-TypeSpec.
  • Added optional per-step profiling around provider/code writing, PostProcessAsync(), GetGeneratedFilesAsync(), and file writes.
  • Added benchmark modes comparing Roslyn reference-map construction with provider reference-map analysis.
  • Kept Simplifier.ReduceAsync as the existing final document simplification path.

How To Run

Full generation benchmark:

DOTNET_ROOT="$HOME/.dotnet" PATH="$HOME/.dotnet:$PATH" \
dotnet run --project packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/perf/Microsoft.TypeSpec.Generator.Tests.Perf.csproj \
  -c Release --framework net10.0 --filter "*FullGenerationBenchmark*"

Enable per-step profile files by setting these environment variables for the benchmark run:

POSTPROCESSING_BENCHMARK_PROFILE_STEPS=true
POSTPROCESSING_BENCHMARK_PROFILE_DIR="/tmp/typespec-post-processing-profiles"
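The opt-in profiling described above can be pictured with a small sketch. This is a Python analogue, not the generator's C# instrumentation: only the two environment variable names come from this PR, while `profile_step`, `flush_profile`, and the `steps.json` file name are hypothetical.

```python
import json
import os
import time
from contextlib import contextmanager
from pathlib import Path
from typing import Optional


def profiling_enabled() -> bool:
    # Variable name taken from the PR; the accepted values are an assumption.
    return os.environ.get("POSTPROCESSING_BENCHMARK_PROFILE_STEPS", "").lower() == "true"


@contextmanager
def profile_step(name: str, records: list):
    """Time the wrapped block; append a record only when profiling is on."""
    if not profiling_enabled():
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        records.append({"step": name, "elapsed_ms": elapsed_ms})


def flush_profile(records: list) -> Optional[Path]:
    """Write collected records as JSON into the profile directory, if one is set."""
    directory = os.environ.get("POSTPROCESSING_BENCHMARK_PROFILE_DIR")
    if not records or not directory:
        return None
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / "steps.json"  # hypothetical file name
    out_file.write_text(json.dumps(records, indent=2))
    return out_file
```

When the flag is unset, the wrapper does no timing or recording, so a default benchmark run is not measurably affected by the instrumentation.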

Latest Benchmark Data

Latest combined BenchmarkDotNet run after final provider dependency/parity fixes:

| Mode | Mean | Allocated |
| --- | --- | --- |
| Roslyn reference map | 1,298.5 ms | 63.51 MB |
| Provider reference map | 899.8 ms | 49.38 MB |

Approximate improvement by mean:

Time:       ~30.7% faster
Allocation: ~22.2% less
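For readers who want to re-derive these percentages, here is a minimal sketch of the arithmetic. The numbers are copied from the table; the helper name `percent_reduction` is only illustrative.

```python
def percent_reduction(baseline: float, candidate: float) -> float:
    """Return how much smaller candidate is than baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


# Roslyn reference map vs. provider reference map.
time_gain = percent_reduction(1298.5, 899.8)   # mean, ms
alloc_gain = percent_reduction(63.51, 49.38)   # allocated, MB

print(f"Time:       ~{time_gain:.1f}% faster")
print(f"Allocation: ~{alloc_gain:.1f}% less")
```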

Benchmark notes:

  • Both modes were run in the same BenchmarkDotNet invocation using a UseProviderReferenceMap parameter.
  • Configuration: WarmupCount=1, MinIterationCount=15, MaxIterationCount=20, IterationTime=250 ms.
  • Runtime: .NET 10.0.9, Ubuntu 26.04, AMD EPYC 7763.
  • Benchmark report path from the local run: /tmp/typespec-bench-explicit-deps/results/Microsoft.TypeSpec.Generator.Perf.FullGenerationBenchmark-report-github.md.

Focused Reference-Map Data

Focused profiling separates the two Roslyn reference-map phases and compares them with the provider replacement:

| Path | Avg Time | Avg Allocated |
| --- | --- | --- |
| Roslyn public reference map, internalize | 172.6 ms | 11.43 MB |
| Roslyn all reference map, remove | 233.3 ms | 11.21 MB |
| Roslyn combined reference maps | 405.8 ms | 22.64 MB |
| Provider map analysis, computes both | 225.9 ms | 12.81 MB |
| Provider candidate consumption | ~0.55 ms | ~38 KB |

Approximate focused improvement for the reference-map replacement:

Time:       ~44.2% faster
Allocation: ~43.2% less

Provider analysis computes internalize/remove candidates together in one pre-pass, so it does not naturally split the same way as the old Roslyn InternalizeAsync and RemoveAsync phases.
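To illustrate why a single pre-pass can yield both candidate sets, here is a toy model. It is not the provider analysis from #10976. It assumes that "remove" candidates are types unreachable from any root, and that "internalize" candidates are types reachable only through non-public references. All names (`reachable`, `analyze`, the sample types) are hypothetical.

```python
from collections import deque


def reachable(roots, edges):
    """Breadth-first walk over a reference map (type name -> referenced names)."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for target in edges.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def analyze(types, public_refs, all_refs, roots):
    """Compute internalize and remove candidates from one set of inputs.

    types:       every generated type name
    public_refs: references that appear in public signatures
    all_refs:    every reference, including non-public ones
    roots:       types that must stay public (for example, clients)
    """
    publicly_reachable = reachable(roots, public_refs)
    reachable_at_all = reachable(roots, all_refs)
    remove = set(types) - reachable_at_all
    internalize = set(types) - publicly_reachable - remove
    return internalize, remove


types = {"Client", "Widget", "WidgetHelper", "Orphan"}
public_refs = {"Client": ["Widget"]}
all_refs = {"Client": ["Widget", "WidgetHelper"]}
internalize, remove = analyze(types, public_refs, all_refs, roots=["Client"])
```

In this toy input, `WidgetHelper` is used only internally and `Orphan` is not used at all, so both answers come out of the same call and there is no separate internalize phase to time.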

Full-Generation Hotspot Data

Dry-run smoke result for FullGenerationBenchmark.GenerateSampleTypeSpecProject:

| Method | Mean | Total Allocated |
| --- | --- | --- |
| GenerateSampleTypeSpecProject | 1.603 s | 64.28 MB |

Full-generation phase profile:

| Phase | Total | Share | Allocated | Notes |
| --- | --- | --- | --- | --- |
| Generation.PostProcessAsync | 949.775 ms | 60.4% | 29,755,928 B | Remove/internalize processing |
| Generation.WriteGeneratedFilesToDisk | 369.231 ms | 23.5% | 18,160,224 B | Includes GetGeneratedFilesAsync() final document processing plus file writes |
| Generation.WriteTypeProviders | 116.061 ms | 7.4% | 5,799,832 B | Code writer path |
| Generation.CreateSourceInputModel | 50.538 ms | 3.2% | 1,533,832 B | Source input model setup |
| Generation.BuildTypeProviders | 40.014 ms | 2.5% | 4,618,616 B | Type provider construction |
| Generation.ResolveExternalTypeReferences | 30.133 ms | 1.9% | 6,538,112 B | External type/reference setup |

PostProcessAsync() Breakdown

Reference-map construction dominates GeneratedCodeWorkspace.PostProcessAsync():

| Step | Total | Allocated |
| --- | --- | --- |
| PostProcess.InternalizeAsync | 517.453 ms | 16,235,712 B |
| PostProcess.RemoveAsync | 432.280 ms | 13,520,216 B |
| PostProcessor.Internalize.BuildPublicReferenceMapAsync | 353.832 ms | 11,319,552 B |
| PostProcessor.Remove.BuildAllReferenceMapAsync | 390.248 ms | 11,306,400 B |

Reference-map construction inside PostProcessAsync() accounts for roughly:

(353.832 ms + 390.248 ms) / (517.453 ms + 432.280 ms) = 744.080 ms / 949.733 ms ≈ 78.3%

GetGeneratedFilesAsync() Subphase

Simplifier.ReduceAsync runs in GetGeneratedFilesAsync(), not in PostProcessAsync().

In the same full-generation run:

| Step | Total | Average/File | Allocated |
| --- | --- | --- | --- |
| Roslyn.Simplifier.ReduceAsync | 349.972 ms | 6.862 ms | 17,455,096 B |

Compared with the larger full-generation hotspot:

Reference-map construction inside PostProcessAsync(): 744.080 ms
Simplifier.ReduceAsync inside GetGeneratedFilesAsync(): 349.972 ms
Reference maps are ~2.13x hotter than ReduceAsync in full generation.
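The hotspot comparison can be re-derived from the step timings in the tables above. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
# Timings in milliseconds, copied from the PostProcessAsync() and
# GetGeneratedFilesAsync() tables.
internalize_total = 517.453   # PostProcess.InternalizeAsync
remove_total = 432.280        # PostProcess.RemoveAsync
public_map = 353.832          # BuildPublicReferenceMapAsync
all_map = 390.248             # BuildAllReferenceMapAsync
reduce_async = 349.972        # Roslyn.Simplifier.ReduceAsync

reference_maps = public_map + all_map
post_process = internalize_total + remove_total

share = reference_maps / post_process * 100.0   # share of PostProcessAsync()
ratio = reference_maps / reduce_async           # relative to ReduceAsync

print(f"Reference maps: {reference_maps:.3f} ms")
print(f"Share of PostProcessAsync(): {share:.1f}%")
print(f"Ratio to ReduceAsync: {ratio:.2f}x")
```

Reference-map construction comes out at over three quarters of PostProcessAsync() and a little more than twice the cost of ReduceAsync.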

Latest Validation Data

Local validation performed while stabilizing the benchmark and production follow-up included:

  • dotnet build packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/perf/Microsoft.TypeSpec.Generator.Tests.Perf.csproj -c Release passed.
  • dotnet test packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/test/Microsoft.TypeSpec.Generator.Tests.csproj --filter "FullyQualifiedName~GeneratedCodeWorkspaceTests" passed: 8/8.
  • Focused post-processing and client-model tests passed in the production follow-up branch.
  • A local regeneration of all data-plane Azure SDK libraries using RegenPreview.ps1 -Azure -Unbranded completed successfully for 42/42 libraries.
  • Runtime of the latest local all data-plane regeneration: 00:14:28; runtime of the previous clean run: 00:11:41.

Representative regenerated package builds exposed additional Azure custom-stub accessibility issues in the production follow-up branch. Those are correctness follow-ups for #10976, not benchmark instrumentation issues in this PR.

Conclusion

The largest full-generation hotspot is PostProcessor reference-map construction inside GeneratedCodeWorkspace.PostProcessAsync().

Simplifier.ReduceAsync is a separate hotspot inside GetGeneratedFilesAsync() final document processing, but it is not part of PostProcessAsync() and is not the largest full-generation issue.

This PR should be used as the benchmark/profiling basis for generated C# post-processing optimization work. The production implementation lives separately in #10976.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@microsoft-github-policy-service microsoft-github-policy-service Bot added the emitter:client:csharp Issue for the C# client emitter: @typespec/http-client-csharp label Jun 4, 2026
@pkg-pr-new

pkg-pr-new Bot commented Jun 4, 2026

npm i https://pkg.pr.new/@typespec/http-client-csharp@10885

commit: e4019d3

@github-actions

github-actions Bot commented Jun 4, 2026

No changes needing a change description found.

live1206 and others added 7 commits June 4, 2026 01:54
@live1206

live1206 commented Jun 4, 2026

Network post-processing performance experiment results

Documenting the current experiment state before switching back to the Network MPG migration work.

Network generator-only timing

All runs used the saved Network generator inputs from sdk/network/Azure.ResourceManager.Network/{Configuration.json,tspCodeModel.json} and produced 3,542 .cs files.

| Variant | Wall time | Roslyn/post-processing time | Result |
| --- | --- | --- | --- |
| main (ccc3f6004) | 712.482s (~11m52s) | 11m48.286s | Fastest baseline |
| PR #10846 / scoped simplifier (53124ea7f) | 851.905s (~14m12s) | 14m07.350s | ~19.6% slower than main |
| This PR / manual no-Roslyn reducer (e4019d359) | 1152.665s (~19m13s) | 19m08.268s | ~61.8% slower than main |

Artifacts: /tmp/network-three-branch-20260604052728/.
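The slowdown figures and minute conversions in the timing table follow from the wall-clock seconds. A small sketch, with hypothetical helper names:

```python
def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """How much longer the candidate ran than the baseline, in percent."""
    return (candidate_s / baseline_s - 1.0) * 100.0


def as_minutes(seconds: float) -> str:
    """Format a duration as minutes and whole seconds, e.g. 11m52s."""
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes}m{rest:02d}s"


main_s = 712.482       # main (ccc3f6004)
scoped_s = 851.905     # PR #10846 / scoped simplifier
manual_s = 1152.665    # this PR / manual no-Roslyn reducer

for label, value in [("scoped simplifier", scoped_s), ("manual reducer", manual_s)]:
    print(f"{label}: {as_minutes(value)}, ~{slowdown_percent(main_s, value):.1f}% slower than main")
```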

Bounded parallelism follow-up

I also tried an env-var-gated bounded document-processing experiment (TYPESPEC_GENERATOR_POSTPROCESSING_PARALLELISM=16). The synthetic BenchmarkDotNet corpus looked promising, but real Network generation did not beat main:

| Variant | Wall time | Roslyn/post-processing time | Files |
| --- | --- | --- | --- |
| main baseline | 712s | 11m48s | 3542 |
| manual branch + parallelism=16 | 722s | 11m57s | 3542 |
| main + parallelism=16 | 733s | 12m08s | 3542 |

Artifacts: /tmp/network-parallelism16-20260604062933/ and /tmp/network-main-parallelism16-20260604064307/.
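For context, "bounded" document processing generally means capping how many documents are in flight at once. The sketch below shows that pattern in Python with a semaphore. It is not the removed C# experiment: only the environment variable name comes from this comment, and `process_document` is a stand-in for the real per-document work.

```python
import asyncio
import os


def parallelism_from_env(default: int = 1) -> int:
    """Read the degree of parallelism; fall back to the default on bad input."""
    raw = os.environ.get("TYPESPEC_GENERATOR_POSTPROCESSING_PARALLELISM", "")
    return int(raw) if raw.isdigit() and int(raw) > 0 else default


async def process_document(name: str) -> str:
    await asyncio.sleep(0)  # stand-in for per-document post-processing
    return name.upper()


async def process_all(documents, limit: int):
    """Process every document, with at most `limit` in flight at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(document: str) -> str:
        async with semaphore:
            return await process_document(document)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(bounded(d) for d in documents))


results = asyncio.run(process_all(["a.cs", "b.cs", "c.cs"], parallelism_from_env()))
```

A cap like this helps only when documents can be processed independently. The Network numbers above suggest that was not the case here.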

Conclusion

This PR is useful as an experiment, but the manual reducer approach is not currently a Network performance improvement. The bounded-parallelism experiment was removed from the working tree after it failed to beat main on Network.

The recommended next direction is deeper phase and document-size instrumentation around Roslyn workspace post-processing, followed by reducing the number or size of documents that go through semantic simplification, rather than replacing Roslyn simplification wholesale.

@live1206

Latest shadow-map replacement results from local BenchmarkDotNet runs.

Correctness Shadow Comparison

Replacement mode still compares the provider/custom-code hybrid candidates against the Roslyn candidates before using them.

Internalize:
Roslyn candidates: 32
Provider candidates: 32
Missing: 0
Extra: 0

Remove:
Roslyn candidates: 37
Provider candidates: 37
Missing: 0
Extra: 0

This is for Sample-TypeSpec.
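The Missing/Extra counts above amount to two set differences. Here is a minimal sketch, assuming "Missing" means candidates Roslyn found that the provider path did not, and "Extra" means the reverse. The function name and sample type names are hypothetical.

```python
def shadow_compare(roslyn: set, provider: set) -> dict:
    """Compare provider candidates against the Roslyn candidates."""
    missing = roslyn - provider   # found by Roslyn only (assumed meaning)
    extra = provider - roslyn     # found by the provider path only
    return {
        "roslyn": len(roslyn),
        "provider": len(provider),
        "missing": sorted(missing),
        "extra": sorted(extra),
        "exact": not missing and not extra,
    }


# Illustrative input, not data from the PR.
report = shadow_compare({"ModelA", "ModelB"}, {"ModelB", "ModelC"})
print(report)
```

Equal counts alone would not prove parity, because one missing and one extra candidate cancel out. The comparison is exact only when both differences are empty, as reported here.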

Full-Generation Benchmark

Benchmark command shape:

DOTNET_ROOT="$HOME/.dotnet" PATH="$HOME/.dotnet:$PATH" \
dotnet run --project packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/perf/Microsoft.TypeSpec.Generator.Tests.Perf.csproj \
  -c Release --framework net10.0 --filter "*FullGenerationBenchmark*"

Replacement mode additionally used:

TYPESPEC_PROVIDER_REFERENCE_MAP_SHADOW=true
TYPESPEC_PROVIDER_REFERENCE_MAP_USE_SHADOW=true
TYPESPEC_PROVIDER_REFERENCE_MAP_SHADOW_DIR="/tmp/typespec-provider-reference-map-shadow"

| Mode | Mean | Error | StdDev | Allocated |
| --- | --- | --- | --- | --- |
| Baseline Roslyn reference-map path | 1.042 s | 0.363 s | 0.418 s | 63.98 MB |
| Hybrid provider/custom map replacement | 641.3 ms | 98.37 ms | 113.3 ms | 44.41 MB |

Approximate improvement:

Time:       ~38.5% faster
Allocation: ~30.6% less

Interpretation

The hybrid provider/custom-code map is now exact for Sample-TypeSpec in shadow comparison and shows a meaningful local benchmark improvement when used to replace Roslyn reference-map construction.

Next step: create a clean PR from latest main with this replacement path and run a proper regeneration across SDK services to validate output correctness broadly.

@live1206 live1206 changed the title from "[Draft] Experiment with manual C# name reduction" to "[Draft] Add generated C# full-generation benchmarks" on Jun 18, 2026
live1206 and others added 2 commits June 19, 2026 08:45
…tion-experiment

# Conflicts:
#	packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator.ClientModel/src/Providers/MrwSerializationTypeDefinition.cs
#	packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/PostProcessing/GeneratedCodeWorkspace.cs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@azure-sdk-automation

You can try these changes here

🛝 Playground 🌐 Website 🛝 VSCode Extension
