Modernize SIP string handling by paulomorgado · Pull Request #1645 · sipsorcery-org/sipsorcery

paulomorgado · 2026-05-29T14:28:17Z

Update SIP and user-agent code to use clearer interpolation, span/string helpers, and explicit string comparison semantics. Add Polyfill support required by the new string APIs and remove the duplicate KeyValuePair Deconstruct extension.

Split of #1639

sipsorcery · 2026-05-29T21:30:37Z

This PR did introduce a few new bugs in the SIP logic. One would have broken any SIP request that used the Route or Record-Route headers. The unit tests that show the bugs have been committed to master and if you rebase this PR the new tests will fail.

Overall the improvements do generally modernise the code but the size of PRs like this is unmanageable. The concerns need to be done separately, i.e. if it's jsut to update string formatting that should be one PR. Any functional changes should be done in a different PR. It tkaes mroe work to review PRs of this size than it does to write them.

paulomorgado · 2026-05-29T22:19:34Z

The issue was the ToString() method declared as public new String() instead of public override String().

I had already run into 2 of these that were covered by tests.

And now I found there are other 6. Why?

sipsorcery · 2026-05-29T22:41:22Z

And now I found there are other 6. Why?

Just the way I did it at the time, it was ~20 years ago! It doesn't matter so much as to the "why" as to the fact that the current behaviour works. Adding breaking "optimisations" would be a BIG problem for library users.

sipsorcery · 2026-05-31T08:51:50Z

Another regression detected. Unit tests added to master see here.

Modernize SIP message parsing and formatting for performance: - Replace string concat/Regex with Span-based parsing - Use StringBuilder and interpolated strings in ToString() - Improve header/parameter parsing, folded header handling - Update code style, error messages, and logging - Add Polyfill, set C# 14, ensure .NET 4.6.2 compatibility - Remove obsolete code, update project metadata and docs

paulomorgado · 2026-05-31T11:32:23Z

Another regression detected. Unit tests added to master see here.

All tests are passing now.

paulomorgado · 2026-05-31T11:33:10Z

using System;

var c = new C();
Console.WriteLine(c.ToString());
Console.WriteLine(c);

class C
{
    public new string ToString()
    {
        return "Hello, World!";
    }
}

$ dotnet run app.cs
Hello, World!
C

paulomorgado · 2026-05-31T11:33:39Z

Benchmark Analysis — sipsorcery-org/sipsorcery PR #1645

"Modernize SIP string handling"

Environment: BenchmarkDotNet v0.15.8, Windows 11, Intel Core i9-13900K,
.NET 10.0.8 (RyuJIT x86-64-v3), Job=ShortRun (3 iterations, 3 warmup)

1. `ParseSIPExtensions`

Results

Scenario	Old (ns)	New (ns)	Speedup	Alloc Old	Alloc New	Alloc Δ
Single known (`100rel`)	38.6	17.2	2.2×	104 B	72 B	−31%
3 known extensions (`100rel, norefersub, replaces`)	92.1	37.1	2.5×	336 B	72 B	−79%
3 known, upper-case input (`100REL, REPLACES, NOREFERSUB`)	119.1	34.3	3.5×	464 B	72 B	−85%
Mixed known + unknown (`100rel, x-unknown, multiple-refer`)	107.4	39.9	2.7×	400 B	112 B	−72%
Two unknowns (`x-foo, x-bar`)	78.2	43.7	1.8×	296 B	184 B	−38%
Single unknown only (`x-unknown-ext`)	34.2	16.1	2.1×	64 B	80 B	+25% ⚠️

What changed

Old implementation:

string[] extensions = extensionList.Trim().Split(',');
foreach (string extension in extensions)
{
    string trimmedExtension = extension.Trim().ToLower();
    switch (trimmedExtension) { ... }
    // unknowns collected via string concatenation (+=)
}

Trim().Split(',') allocates a string[] array plus a new string per token
ToLower() allocates a new lowercased string per token regardless of case
Unknown extensions collected via += string concatenation (further allocations per unknown)

New implementation:

var extensions = extensionList.AsSpan().Trim();
foreach (var extensionRange in extensions.Split(','))
{
    var trimmedExtension = extensions[extensionRange].Trim(); // stack-only slice
    switch (trimmedExtension) { ... }  // OrdinalIgnoreCase span comparison — no ToLower
}
// firstUnknown held as ReadOnlySpan<char>, ToString() deferred until end
// 2+ unknowns collected via StringBuilder.Append(ReadOnlySpan<char>)

AsSpan().Trim() is stack-only — zero heap allocations for splitting
OrdinalIgnoreCase span comparisons replace ToLower() — no per-token string allocation for known extensions
firstUnknown stored as ReadOnlySpan<char> and only materialised via .ToString() at the very end, and only if it is the sole unknown
2+ unknowns collected in a StringBuilder via Append(ReadOnlySpan<char>) — no intermediate strings per token, replaces List<string> + string.Join

Why known-extension cases improve so dramatically

Every known token (e.g. 100rel, replaces, norefersub) never touches the heap at
all in the new path — the OrdinalIgnoreCase span comparison is done entirely on the
stack. The old path allocated one ToLower() string per token regardless of whether
the token was known or unknown.

Upper-case input (100REL, REPLACES, NOREFERSUB) shows the largest gain (3.5×
faster, −85% allocations) because ToLower was doing real character-by-character
work and creating a full new string for every token.

The single-unknown +16 B explained

Both paths allocate new List<SIPExtensions>() (32 B) — this cancels out.
The delta comes entirely from what each path allocates for the unknown token itself.

Old path allocations for `"x-unknown-ext"`

Object	Bytes	Reason
`string[1]` from `.Split(',')`	32 B	Array object; no comma found so .NET returns `new[]{ this }` — the single element is a pointer to the already-existing input string, no new string object is created for the token itself
`.ToLower()` on `"x-unknown-ext"`	0 B	The string is already all-lowercase ASCII — .NET's `ChangeCaseCommon` scans for a character that needs changing, finds none, and returns the same instance
`unknownExtensions = extension.Trim()`	0 B	No leading/trailing whitespace — `Trim()` returns the same instance

Total: 32 B (List) + 32 B (array) = 64 B

New path allocations for `"x-unknown-ext"`

Object	Bytes	Reason
`firstUnknown.ToString()`	48 B	Must create a brand-new heap string: 8 (pre-header) + 8 (MT pointer) + 4 (length field) + 26 (13 chars × 2 bytes) + 2 (null terminator) = 48 B

Total: 32 B (List) + 48 B (string) = 80 B

Visualised

Old:  [ string[1] array: 32 B ] ──points to──▶ [ "x-unknown-ext" (pre-allocated input, 0 B extra) ]

New:  (no array: 0 B)           ──creates──▶   [ "x-unknown-ext" (new heap string, 48 B)          ]

Δ = 48 B − 32 B = +16 B

The old code's Split allocates a pointer array (32 B) whose single element
references the already-existing input string — no new string object is allocated for
the token itself. The new span-based code eliminates the array entirely, but
span.ToString() must materialise a brand-new string object (48 B) on the managed heap.

This gap cannot be closed without changing the method's out string unknownExtensions
contract to out ReadOnlySpan<char> — a breaking API change. The 2.1× time
improvement makes this a net win despite the marginal allocation increase.

2. `ParseCustomHeaders`

Results

Scenario	Old (Regex, ns)	New (Span, ns)	Speedup	Alloc Old
`Via` (match)	86.9	0.9	94× faster	248 B
`Content-Length` (match)	234.4	3.0	77× faster	248 B
`Max-Forwards` (trim + match)	193.7	8.2	24× faster	296 B
`X-Custom-Header` (no match)	83.4	2.7	31× faster	0 B
`X-Sip-Callid: value` (no match)	90.5	2.6	35× faster	0 B

What changed

Old implementation:

Regex.Match(header, SIP_CUSTOM_HEADER_PREFIX, RegexOptions.IgnoreCase)

New implementation:

header.AsSpan().Trim().StartsWith(..., StringComparison.OrdinalIgnoreCase)

Key findings

Replacing Regex.Match() with ReadOnlySpan<char> comparisons delivers a
24–94× speedup across all inputs.
AsSpan().Trim() is fully stack-based — zero heap allocations in every scenario,
compared to the Match object the regex engine allocated on every call.
Both matching and non-matching paths benefit equally; regex engine initialisation
dominates the cost regardless of whether the pattern is ultimately found.
The Max-Forwards case (leading/trailing whitespace) shows the lowest speedup
(24×) because AsSpan().Trim() adds a small scanning cost — still 24× faster and
still allocates nothing.
Content-Length shows the highest variance in the old path (error: 280 ns) due to
regex JIT compilation occasionally being triggered during the short run.

Summary

Change	Best case	Worst case	Allocations
`ParseSIPExtensions`	3.5× faster	2.1× faster	−85% to −31%; one structural edge case: +16 B (explained above)
`ParseCustomHeaders`	94× faster	24× faster	0 B in all cases (was 248–296 B)

paulomorgado mentioned this pull request May 29, 2026

Reduce string allocations #1639

Closed

sipsorcery requested changes May 29, 2026

View reviewed changes

Comment thread src/SIPSorcery/core/SIP/SIPConstants.cs Outdated

paulomorgado force-pushed the strings-4 branch 2 times, most recently from 98d5b86 to 4d3b642 Compare May 29, 2026 16:37

paulomorgado force-pushed the strings-4 branch from f2596e3 to d730ee5 Compare May 29, 2026 22:19

paulomorgado marked this pull request as draft May 30, 2026 20:53

paulomorgado force-pushed the strings-4 branch from d730ee5 to 673e253 Compare May 31, 2026 11:25

paulomorgado marked this pull request as ready for review May 31, 2026 11:33

sipsorcery approved these changes May 31, 2026

View reviewed changes

sipsorcery merged commit ef1e6e8 into sipsorcery-org:master May 31, 2026
7 checks passed

paulomorgado deleted the strings-4 branch May 31, 2026 19:50

This was referenced Jun 1, 2026

Refactor SIP code for clarity and consistency #1653

Merged

Implement turns #1434

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modernize SIP string handling#1645

Modernize SIP string handling#1645
sipsorcery merged 1 commit into
sipsorcery-org:masterfrom
paulomorgado:strings-4

paulomorgado commented May 29, 2026

Uh oh!

Uh oh!

sipsorcery commented May 29, 2026

Uh oh!

paulomorgado commented May 29, 2026

Uh oh!

sipsorcery commented May 29, 2026

Uh oh!

sipsorcery commented May 31, 2026

Uh oh!

paulomorgado commented May 31, 2026

Uh oh!

paulomorgado commented May 31, 2026

Uh oh!

paulomorgado commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

paulomorgado commented May 29, 2026

Uh oh!

Uh oh!

sipsorcery commented May 29, 2026

Uh oh!

paulomorgado commented May 29, 2026

Uh oh!

sipsorcery commented May 29, 2026

Uh oh!

sipsorcery commented May 31, 2026

Uh oh!

paulomorgado commented May 31, 2026

Uh oh!

paulomorgado commented May 31, 2026

Uh oh!

paulomorgado commented May 31, 2026

Benchmark Analysis — sipsorcery-org/sipsorcery PR #1645

"Modernize SIP string handling"

1. ParseSIPExtensions

Results

What changed

Why known-extension cases improve so dramatically

The single-unknown +16 B explained

Old path allocations for "x-unknown-ext"

New path allocations for "x-unknown-ext"

Visualised

2. ParseCustomHeaders

Results

What changed

Key findings

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `ParseSIPExtensions`

Old path allocations for `"x-unknown-ext"`

New path allocations for `"x-unknown-ext"`

2. `ParseCustomHeaders`