perf: replace execution-context map copies with a non-copying scope chain by Richtermeister · Pull Request #378 · flosch/pongo2

Richtermeister · 2026-06-07T16:57:01Z

Summary

This PR removes the map copies that the execution context performs on every render, replacing them with a non-copying scope chain. It is a performance change with identical rendering output; the only API impact is that ExecutionContext.Private and ExecutionContext.Public become method-only types instead of bare maps (a mechanical, compiler-flagged migration; details below).

Two copies were happening on every Execute, both scaling O(N) with map size:

- Per-Execute merge: the user context and the set's globals were merged into a freshly allocated map (newContextForExecution), copying every entry even though a template typically references only a few keys.
- Per-child copy: every child context ({% for %}, {% with %}, {% macro %}, block.Super) copied the entire parent private map. This compounds with nesting depth.
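To make the cost concrete, here is a minimal sketch of the two copying patterns described above. It is an illustration only, not pongo2's actual code: `mergeForExecute` and `childCopy` are hypothetical names, and the 1000-key map simply mirrors the benchmark size used later in this PR.

```go
package main

import "fmt"

// mergeForExecute mimics the per-Execute merge (hypothetical helper):
// every entry of both maps is copied into a fresh map.
func mergeForExecute(globals, user map[string]any) map[string]any {
	merged := make(map[string]any, len(globals)+len(user))
	for k, v := range globals {
		merged[k] = v
	}
	for k, v := range user {
		merged[k] = v // the user context takes precedence over globals
	}
	return merged
}

// childCopy mimics the per-child copy (hypothetical helper):
// each nested tag duplicates the whole parent map.
func childCopy(parent map[string]any) map[string]any {
	child := make(map[string]any, len(parent))
	for k, v := range parent {
		child[k] = v
	}
	return child
}

func main() {
	user := make(map[string]any, 1000)
	for i := 0; i < 1000; i++ {
		user[fmt.Sprintf("key%d", i)] = i
	}
	merged := mergeForExecute(map[string]any{"site": "example"}, user)
	loopCtx := childCopy(merged)  // e.g. one {% for %}
	withCtx := childCopy(loopCtx) // a nested {% with %}
	fmt.Println(len(merged), len(loopCtx), len(withCtx)) // 1001 1001 1001
}
```

Every level copies all 1001 entries, even if the template body reads only a handful of them.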

Approach

- ExecutionContext.Private: Context → Scope. A child layers its own variables over its parent via a chain walked on lookup; nothing is copied. Scope is a lightweight handle backed by the ExecutionContext itself, so there is no extra per-scope allocation, and the own layer is allocated lazily on first write. Delete uses a tombstone to mask a parent key without mutating the parent, so delete/range semantics are fully preserved.
- ExecutionContext.Public: Context → PublicContext. A read-only view over the user context and the set globals. The caller's map is used directly, never copied; globals are a resolution fallback reached through Get/Range. Because access is method-only, callers cannot bypass globals by indexing a map, so resolution results are unchanged.
- Precedence is preserved exactly: Private → user context → globals.
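The scope-chain idea can be sketched as follows. This is a simplified stand-in under stated assumptions, not the PR's implementation: the `scope` type, its `child` constructor, and the `tombstone` marker are hypothetical names, and the real Scope is described as a handle backed by the ExecutionContext rather than a separate struct.

```go
package main

import "fmt"

// tombstone marks a key as deleted in one layer, masking any parent value.
type tombstone struct{}

// scope is a hypothetical stand-in for the PR's Scope type.
type scope struct {
	parent *scope
	own    map[string]any // allocated lazily on first write
}

// child layers a new, empty scope over s without copying anything.
func (s *scope) child() *scope { return &scope{parent: s} }

// Get walks outward from the innermost layer until the key is found.
func (s *scope) Get(k string) (any, bool) {
	for cur := s; cur != nil; cur = cur.parent {
		if v, ok := cur.own[k]; ok { // reading a nil map is safe in Go
			if _, deleted := v.(tombstone); deleted {
				return nil, false
			}
			return v, true
		}
	}
	return nil, false
}

// Set writes to this layer only, allocating it on first use.
func (s *scope) Set(k string, v any) {
	if s.own == nil {
		s.own = make(map[string]any)
	}
	s.own[k] = v
}

// Delete masks the key in this layer; the parent is never mutated.
func (s *scope) Delete(k string) { s.Set(k, tombstone{}) }

func main() {
	root := &scope{}
	root.Set("user", "alice")

	loop := root.child() // e.g. the body of a {% for %}
	loop.Set("item", 1)
	loop.Delete("user")

	_, inChild := loop.Get("user")
	v, inRoot := root.Get("user")
	fmt.Println(inChild, v, inRoot) // false alice true
}
```

Creating a child costs one small struct regardless of how many keys the parent holds; the trade-off is that a lookup may visit several layers in deeply nested templates.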

Two follow-on wins on the same hot path:

- Skip the macro-clash scan entirely when the template exports no macros.
- New TemplateSet.SkipContextValidation (default false) to opt out of the per-Execute identifier validation in hot paths with large, trusted contexts.
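The two guards above might look roughly like the sketch below. Only `SkipContextValidation` and `exportedMacros` are names taken from this PR; the lowercase stand-in types, `checkContext`, and the identifier rule in `validIdentifier` are assumptions for illustration, and pongo2's real validation rule may differ.

```go
package main

import "fmt"

// templateSet and template are simplified stand-ins for the real types.
type templateSet struct {
	SkipContextValidation bool // default false: validation stays enabled
}

type template struct {
	set            *templateSet
	exportedMacros map[string]struct{}
}

// validIdentifier is an assumed rule: non-empty, ASCII letters, digits, '_'.
func validIdentifier(k string) bool {
	if k == "" {
		return false
	}
	for _, r := range k {
		isLetter := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
		if !isLetter && !(r >= '0' && r <= '9') && r != '_' {
			return false
		}
	}
	return true
}

// checkContext runs each O(N) scan only when it can actually matter.
func (t *template) checkContext(ctx map[string]any) error {
	if !t.set.SkipContextValidation {
		for k := range ctx {
			if !validIdentifier(k) {
				return fmt.Errorf("context key %q is not a valid identifier", k)
			}
		}
	}
	if len(t.exportedMacros) > 0 { // nothing to clash with otherwise
		for k := range ctx {
			if _, clash := t.exportedMacros[k]; clash {
				return fmt.Errorf("context key %q clashes with a macro", k)
			}
		}
	}
	return nil
}

func main() {
	ctx := map[string]any{"bad key": 1}
	strict := &template{set: &templateSet{}}
	trusted := &template{set: &templateSet{SkipContextValidation: true}}
	fmt.Println(strict.checkContext(ctx) != nil, trusted.checkContext(ctx) == nil) // true true
}
```

With both guards inactive, the check returns without touching a single key, which is consistent with the last row of the large-context benchmark in this PR.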

Benchmarks

go test -bench . -benchmem. Existing benchmarks (small contexts), median of 3:

Benchmark	Before (per op)	After (per op)
`ExecuteBlocks`	907 ns, 1365 B, 13 allocs	650 ns, 1045 B, 11 allocs
`ExecuteBlocksDeep`	1596 ns, 2360 B, 22 allocs	1170 ns, 1720 B, 18 allocs
`ExecuteComplex`	15110 ns, 6560 B, 230 allocs	14950 ns, 6243 B, 228 allocs

The win scales with context size and nesting depth. A new BenchmarkExecuteLargeContext (1000-key context, template references 3 keys) makes the structural cost visible:

Stage	Time/op	Bytes/op
Original (copy + validate + clash)	~71 µs	82,720 B
+ scope chain (no copy)	~29 µs	672 B
+ skip clash when no macros	~19 µs	672 B
+ `SkipContextValidation`	~0.65 µs	672 B

Context handling no longer scales with map size — bytes are flat at 672 B whether the context has 30 keys or 10,000.

Why this is safe

- Rendering output is unchanged. Lookup precedence, variable shadowing, {% for %}/{% with %}/{% macro %} scoping, block.Super, {% include %}/{% ssi %} context passing, and delete/range semantics all behave identically. The full suite (2960+ tests) passes; go vet and gofmt are clean.
- Verified race-clean with go test -race on linux/arm64 (CGO_ENABLED=1): the full suite and the concurrent BenchmarkParallelExecuteComplex report no data races. The scope chain is per-ExecutionContext, so concurrent renders never share private state. The only newly-shared read is set.Globals, which is read-only during rendering (no tag writes globals), so concurrent reads are safe.
- SkipContextValidation defaults to false, so existing behavior is unchanged unless explicitly opted in.
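The concurrency argument above can be pictured with a small sketch: each render owns its private state, while the globals map is shared but only read. All names here (`globals`, `render`, `renderAll`) are hypothetical and stand in for the template set and Execute; this is not pongo2 code.

```go
package main

import (
	"fmt"
	"sync"
)

// globals stands in for set.Globals: populated once, then only read.
var globals = map[string]any{"site": "example"}

// render keeps its private state local, so renders share nothing mutable.
func render(user map[string]any) string {
	private := map[string]any{"greeting": "hello"} // per-render state
	return fmt.Sprintf("%v %v from %v",
		private["greeting"], user["name"], globals["site"])
}

// renderAll runs one render per name concurrently.
func renderAll(names []string) []string {
	out := make([]string, len(names))
	var wg sync.WaitGroup
	for i, n := range names {
		wg.Add(1)
		go func(i int, n string) {
			defer wg.Done()
			out[i] = render(map[string]any{"name": n}) // distinct slot per goroutine
		}(i, n)
	}
	wg.Wait()
	return out
}

func main() {
	for _, line := range renderAll([]string{"alice", "bob"}) {
		fmt.Println(line)
	}
}
```

Concurrent reads of a Go map are safe as long as no goroutine writes to it, which is why the guarantee depends on globals staying read-only during rendering.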

Breaking change (mechanical, compiler-flagged)

Custom tags that touch ExecutionContext.Private/Public must use methods instead of indexing. The compiler flags every site, so nothing fails silently:

ctx.Private[k]            → ctx.Private.Get(k)   // returns (any, bool)
ctx.Private[k] = v        → ctx.Private.Set(k, v)
delete(ctx.Private, k)    → ctx.Private.Delete(k)
range ctx.Private         → ctx.Private.Range(func(k string, v any) bool { ... })
ctx.Public[k]             → ctx.Public.Get(k)    // includes globals

All built-in tags and the bundled docs are updated in this PR. A CHANGELOG.md entry documents the migration.
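As a rough illustration of the `// includes globals` note and of the callback form of Range, here is a minimal read-only view over two maps. `publicView` is a hypothetical stand-in for PublicContext; only the Get/Range method shapes come from the migration table above.

```go
package main

import "fmt"

// publicView is a hypothetical stand-in for PublicContext:
// the user map is used directly and globals act as a fallback.
type publicView struct {
	user    map[string]any
	globals map[string]any
}

// Get resolves the user context first, then globals.
func (p publicView) Get(k string) (any, bool) {
	if v, ok := p.user[k]; ok {
		return v, true
	}
	v, ok := p.globals[k]
	return v, ok
}

// Range visits each visible key once; returning false stops the walk.
func (p publicView) Range(fn func(k string, v any) bool) {
	for k, v := range p.user {
		if !fn(k, v) {
			return
		}
	}
	for k, v := range p.globals {
		if _, shadowed := p.user[k]; shadowed {
			continue // already reported from the user context
		}
		if !fn(k, v) {
			return
		}
	}
}

func main() {
	pub := publicView{
		user:    map[string]any{"name": "alice"},
		globals: map[string]any{"name": "default", "site": "example"},
	}
	name, _ := pub.Get("name")
	site, _ := pub.Get("site")

	// Was: for k, v := range ctx.Public { ... }
	count := 0
	pub.Range(func(k string, v any) bool {
		count++
		return true // keep iterating
	})
	fmt.Println(name, site, count) // alice example 2
}
```

Because the maps are unexported fields behind methods, a caller cannot index the user map directly and accidentally miss a global.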

One intentional behavior note: with the un-merge, the set's globals are no longer re-validated/macro-clash-checked on every Execute (only the user context is). Globals are set once and effectively trusted infrastructure.

🤖 Generated with Claude Code

…chain

Child execution contexts (created by {% for %}, {% with %}, {% macro %}, and block.Super) previously copied the entire parent private map on every creation, and each Execute merged the user context and set globals into a fresh map. Both copies scaled O(N) with map size and ran on every render.

Replace both with a non-copying scope chain:

- ExecutionContext.Private changes from Context (map) to Scope, a method-only handle backed by the ExecutionContext itself (no per-scope allocation). A child layers its own data over the parent via the context chain; lookups walk outward. The own layer is allocated lazily on first write. Delete uses a tombstone to mask a parent key without mutating the parent, so delete and range semantics are preserved.
- ExecutionContext.Public changes from Context (map) to PublicContext, a read-only view over the user context and set globals. The user's map is used directly (never copied); globals are a resolution fallback reached through Get/Range. Method-only access means callers cannot bypass globals via direct indexing, so resolution output is unchanged.

Rendering output is identical. The only peripheral change: globals are no longer re-validated or macro-clash-checked on every Execute (only the user context is).

Measured (BenchmarkExecuteLargeContext, 1000-key input map, template uses 3):

before: ~71us/op 82,720 B/op 21 allocs/op
after: ~29us/op 672 B/op 15 allocs/op

Context handling no longer scales with input-map size. Nested/block-heavy templates render ~27% faster with ~25% less memory.

BREAKING CHANGE: custom tags must use methods instead of map indexing on ExecutionContext.Private/Public (ctx.Private[k] -> ctx.Private.Get/Set, etc.). See CHANGELOG.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The per-Execute clash check iterated the entire input context comparing each key against the template's exported macros. When the template exports no macros (the common case) there is nothing to clash with, so the O(N) scan is pure overhead. Guard it on len(tpl.exportedMacros) > 0.

BenchmarkExecuteLargeContext (1000-key map, no exported macros): ~29us -> ~19us.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The per-Execute checkForValidIdentifiers scan byte-checks every key in the user context on every render. It is a usability guard (an invalid identifier key cannot be referenced from a template anyway), and its cost scales with the context size, which is wasteful for hot paths that render large, trusted maps.

Add TemplateSet.SkipContextValidation (default false, validation enabled). When set, the check is skipped. Behavior is unchanged by default.

BenchmarkExecuteLargeContext (1000-key map): ~19us/op with validation, ~0.65us/op with SkipContextValidation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

sonarqubecloud · 2026-06-07T16:57:36Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Richtermeister and others added 3 commits June 7, 2026 16:12

Richtermeister wants to merge 3 commits into flosch:master from TheDMSGroup:perf/context-scope-chain
