*: support cache prepared statement in plan cache#67857
Conversation
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a session-level prepare-statement deduplication cache and planner helpers to reuse prepared-plan templates across repeated COM_STMT_PREPARE calls, with a session toggle, sysvar, helper functions, rebuild logic, and integration tests validating behavior and isolation. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Session
participant Cache as DedupCache
participant Parser
participant Preprocessor
participant Executor
Client->>Session: PrepareStmt(sql)
alt EnableCachePrepareStmt
Session->>Session: ComputeDedupKey(sql, charset, collation, DB, sqlMode)
Session->>Cache: Get(key)
alt Cache hit && SchemaVersion matches
Cache-->>Session: PrepareStmtCacheEntry
Session->>Parser: Parse(sql)
Parser-->>Session: AST
Session->>Session: ExtractAndSortParamMarkers(AST)
Session->>Preprocessor: Preprocess(ctx, AST)
Preprocessor-->>Session: ResolveCtx
Session->>Session: rebuildFromPrepareCache -> new PlanCacheStmt
Session-->>Client: stmtID, paramCount, fields
else Cache miss or mismatch
Session->>Executor: NewPrepareExec().Next()
Executor-->>Session: PlanCacheStmt, fields, paramCount
Session->>Cache: Set(key, PrepareStmtCacheEntry)
Session-->>Client: stmtID, paramCount, fields
end
else
Session->>Executor: NewPrepareExec().Next()
Executor-->>Session: PlanCacheStmt, fields, paramCount
Session-->>Client: stmtID, paramCount, fields
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.4)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🧹 Nitpick comments (4)
pkg/planner/core/plan_cache_utils.go (1)
627-630: Consider returning defensive copies from slice accessors
DBName()/Tbls()expose mutable internal slices. External mutation can break ownership guarantees introduced bySetDBNameAndTbls.💡 Proposed refactor
-func (s *PlanCacheStmt) DBName() []model.CIStr { return s.dbName } +func (s *PlanCacheStmt) DBName() []model.CIStr { return slices.Clone(s.dbName) } -func (s *PlanCacheStmt) Tbls() []table.Table { return s.tbls } +func (s *PlanCacheStmt) Tbls() []table.Table { return slices.Clone(s.tbls) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/plan_cache_utils.go` around lines 627 - 630, DBName() and Tbls() currently expose internal mutable slices (s.dbName and s.tbls), allowing external callers to mutate PlanCacheStmt internals and break the ownership guarantees set by SetDBNameAndTbls; fix by having DBName() return a new slice copy of s.dbName (copy the elements into a freshly allocated []model.CIStr) and having Tbls() return a new slice copy of s.tbls (allocate a []table.Table and copy references), so callers get defensive copies instead of direct references to the internal fields.pkg/sessionctx/variable/session.go (1)
2698-2702: Nit: encodesqlModewithout embedding raw bytes that can collide with the separator.
modeBufis the little-endian encoding of auint64, which itself can contain0x00bytes. Since\x00is used as the field separator, a mode byte pattern could in principle alias with a boundary. Collisions are extraordinarily unlikely here becausesqlModeis the last field, but a trivial formatting likestrconv.FormatUint(uint64(sqlMode), 16)would make the key unambiguous and human-readable in debugging.♻️ Proposed tweak
-func PrepareDedupCacheKey(sql, charset, collation, currentDB string, sqlMode mysql.SQLMode) string { - var modeBuf [8]byte - binary.LittleEndian.PutUint64(modeBuf[:], uint64(sqlMode)) - return sql + "\x00" + charset + "\x00" + collation + "\x00" + currentDB + "\x00" + string(modeBuf[:]) -} +func PrepareDedupCacheKey(sql, charset, collation, currentDB string, sqlMode mysql.SQLMode) string { + return sql + "\x00" + charset + "\x00" + collation + "\x00" + currentDB + "\x00" + strconv.FormatUint(uint64(sqlMode), 16) +}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/sessionctx/variable/session.go` around lines 2698 - 2702, PrepareDedupCacheKey currently appends raw little-endian bytes of sqlMode which can contain 0x00 and collide with the "\x00" separator; change it to append a string-encoded sqlMode (e.g. use strconv.FormatUint(uint64(sqlMode), 16) or base10) instead of modeBuf so the key fields are unambiguous and human-readable, update the concatenation in PrepareDedupCacheKey to use that formatted string, and add the strconv import if necessary.pkg/session/session.go (1)
2495-2495: Avoid== truecomparisons on bools.Idiomatic Go writes
if s.sessionVars.EnableCachePrepareStmt {. Both sites should be updated.♻️ Proposed change
- if s.sessionVars.EnableCachePrepareStmt == true { + if s.sessionVars.EnableCachePrepareStmt {(apply in both places)
Also applies to: 2532-2532
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/session/session.go` at line 2495, Replace the explicit boolean comparison "s.sessionVars.EnableCachePrepareStmt == true" with the idiomatic Go form "s.sessionVars.EnableCachePrepareStmt" at both occurrences in the file; locate the two uses of the EnableCachePrepareStmt field on the session receiver (the checks using s.sessionVars.EnableCachePrepareStmt) and update them to remove the "== true" comparison.pkg/session/test/common/prepare_dedup_cache_test.go (1)
87-102: Prepared stmts are never dropped in this test (resource accounting).Minor:
id1andid2are left registered on the session until teardown. Not a correctness bug but inconsistent withTestPrepareStmtDedupCachePrepareExecuteCloseLoop(which does callDropPreparedStmt). Consider addingdefer tk.Session().DropPreparedStmt(id1)/id2for symmetry.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/session/test/common/prepare_dedup_cache_test.go` around lines 87 - 102, The test leaves prepared statements registered on the session (resource leak); add cleanup calls to drop them: after creating id1 and id2 via tk.Session().PrepareStmt(...) register deferred calls to tk.Session().DropPreparedStmt(id1) and tk.Session().DropPreparedStmt(id2) (or equivalent) so both prepared stmts are unregistered at test teardown while leaving the rest of the test logic (runAndCollect, assertions) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/plan_cache_utils.go`:
- Around line 621-624: CollectPlanCacheStmtInfo currently only appends
AST-derived data to a reused PlanCacheStmt, so before creating the
planCacheStmtProcessor and calling stmtNode.Accept(processor) you should
clear/reset all statement-derived fields on the PlanCacheStmt (e.g., tables,
columns/usedCols, tablesInExprs, table/stat tracking maps or slices) so stale
data from previous uses is removed; modify CollectPlanCacheStmtInfo to zero
those fields on the passed-in PlanCacheStmt (or call a Reset method on
PlanCacheStmt if you add one) then proceed to instantiate planCacheStmtProcessor
and accept the node.
In `@pkg/planner/core/plan_cache_utils.go.orig`:
- Around line 1-745: A stale backup file (plan_cache_utils.go.orig) was
accidentally committed and is missing recent additions like
PrepareStmtCacheEntry, ExtractAndSortParamMarkers, and CollectPlanCacheStmtInfo;
remove it from the repo and history by running git rm on that backup file and
commit the change so only the canonical plan_cache_utils.go (which contains the
latest functions such as GeneratePlanCacheStmtWithAST, NewPlanCacheKey,
PlanCacheValue, and related helpers) remains tracked.
In `@pkg/session/session.go`:
- Line 2512: The assignment fields = cached.Fields wrongly aliases the
cache-owned slice; replace it with a shallow copy to avoid mutating the cache
(e.g. fields = append([]*resolve.ResultField(nil), cached.Fields...)) so the
caller gets a new slice header while reusing the same element pointers, leaving
paramCount as-is; update the code paths around the cached.Fields usage where
fields is returned to callers (the variables named fields and cached.Fields)
accordingly.
- Around line 2505-2515: The rebuild error from rebuildFromPrepareCache is being
ignored and rollbackOnError is skipped on AddPreparedStmt failure; modify the
dedup-hit path to (1) if rebuildErr != nil call logutil.Logger(ctx).Warnf (or
Warn) with a clear message including rebuildErr before falling through, and (2)
ensure that after s.sessionVars.AddPreparedStmt fails you call
s.rollbackOnError(ctx) before returning (matching the full-prepare path); update
the block around rebuildFromPrepareCache, rebuildErr,
s.sessionVars.AddPreparedStmt and stmtID to add the warning log and guaranteed
rollbackOnError on all early returns.
- Around line 2582-2586: Clarify the defensive check comment to explain that
after the earlier is.SchemaMetaVersion() check and the Preprocess call,
ret.InfoSchema can be mutated (e.g., via the stale-read processor path inside
Preprocess) even when InPrepare is set, so we must compare the final
ret.InfoSchema.SchemaMetaVersion() against the original cached prepared
statement version cached.Stmt.SchemaVersion (not the pre-Preprocess snapshot);
if they differ, return the error and fall back to the full prepare path because
the cached plan may be unsafe.
In `@pkg/session/test/common/prepare_dedup_cache_test.go`:
- Around line 29-52: The tests (TestPrepareStmtDedupCacheBasic and
TestPrepareStmtDedupCacheSchemaChange) never enable the new prepare-stmt cache
and use a non-distinguishing stmtID check; explicitly enable the session
variable (e.g., call tk.MustExec("set @@tidb_enable_cache_prepare_stmt = ON") or
the correct variable name) at each test start, and replace the id1/id2 NotEqual
assertion with a verifiable cache-hit/miss signal: either inspect the session's
dedup-cache structure on the session object (e.g., s.sessionVars or the map
added for caching) before/after PrepareStmt, or increment/read a test-only
metric/counter exposed by the PrepareStmt path to assert a cache hit on the
second prepare; for TestPrepareStmtDedupCacheSchemaChange also assert the
resulting prepared statement uses the updated schema (e.g., execute the prepared
stmt and verify the new column visibility) rather than relying on stmtID
changes.
In `@pkg/sessionctx/variable/session.go`:
- Around line 2692-2702: PrepareDedupCacheKey currently omits parser-affecting
session flags and CharacterSetClient, so update PrepareDedupCacheKey to
incorporate the parser configuration used by BuildParserConfig (at least
EnableWindowFunction and EnableStrictDoubleTypeCheck) and the CharacterSetClient
value from GetParseParams into the returned key (e.g. append flag bytes or a
small encoded struct along with the existing sqlMode bytes) so that
PlanCacheStmt ASTs are only reused when those parser-affecting session variables
match; alternatively, if you prefer not to change the key, add a clear comment
on PrepareDedupCacheKey and callers documenting that it is only safe when
EnableWindowFunction, EnableStrictDoubleTypeCheck and CharacterSetClient are
stable for the session.
---
Nitpick comments:
In `@pkg/planner/core/plan_cache_utils.go`:
- Around line 627-630: DBName() and Tbls() currently expose internal mutable
slices (s.dbName and s.tbls), allowing external callers to mutate PlanCacheStmt
internals and break the ownership guarantees set by SetDBNameAndTbls; fix by
having DBName() return a new slice copy of s.dbName (copy the elements into a
freshly allocated []model.CIStr) and having Tbls() return a new slice copy of
s.tbls (allocate a []table.Table and copy references), so callers get defensive
copies instead of direct references to the internal fields.
In `@pkg/session/session.go`:
- Line 2495: Replace the explicit boolean comparison
"s.sessionVars.EnableCachePrepareStmt == true" with the idiomatic Go form
"s.sessionVars.EnableCachePrepareStmt" at both occurrences in the file; locate
the two uses of the EnableCachePrepareStmt field on the session receiver (the
checks using s.sessionVars.EnableCachePrepareStmt) and update them to remove the
"== true" comparison.
In `@pkg/session/test/common/prepare_dedup_cache_test.go`:
- Around line 87-102: The test leaves prepared statements registered on the
session (resource leak); add cleanup calls to drop them: after creating id1 and
id2 via tk.Session().PrepareStmt(...) register deferred calls to
tk.Session().DropPreparedStmt(id1) and tk.Session().DropPreparedStmt(id2) (or
equivalent) so both prepared stmts are unregistered at test teardown while
leaving the rest of the test logic (runAndCollect, assertions) unchanged.
In `@pkg/sessionctx/variable/session.go`:
- Around line 2698-2702: PrepareDedupCacheKey currently appends raw
little-endian bytes of sqlMode which can contain 0x00 and collide with the
"\x00" separator; change it to append a string-encoded sqlMode (e.g. use
strconv.FormatUint(uint64(sqlMode), 16) or base10) instead of modeBuf so the key
fields are unambiguous and human-readable, update the concatenation in
PrepareDedupCacheKey to use that formatted string, and add the strconv import if
necessary.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 41b5bfd8-c9eb-408d-a9b5-8f0275e55447
📒 Files selected for processing (10)
pkg/planner/core/plan_cache_utils.gopkg/planner/core/plan_cache_utils.go.origpkg/session/session.gopkg/session/test/common/prepare_dedup_cache_test.gopkg/sessionctx/variable/session.gopkg/sessionctx/variable/setvar_affect.gopkg/sessionctx/variable/sysvar.gopkg/sessionctx/variable/tidb_vars.gotests/integrationtest/r/sessionctx/setvar.resulttests/integrationtest/t/sessionctx/setvar.test
| return | ||
| } | ||
| paramCount = cached.ParamCount | ||
| fields = cached.Fields |
There was a problem hiding this comment.
fields = cached.Fields aliases the cache-owned slice returned to the caller.
cached.Fields lives inside the dedup cache entry. Handing the same underlying slice to the caller creates a hidden mutation hazard: if any caller (now or in the future) appends/rewrites entries in the returned fields slice or its *resolve.ResultField pointees, the cached entry silently gets corrupted and all subsequent dedup hits for this SQL return wrong metadata.
paramCount is an int and safe to alias, but fields deserves a shallow copy.
🛡️ Proposed fix
paramCount = cached.ParamCount
- fields = cached.Fields
+ // Defensive copy: cached.Fields must not be handed out and mutated.
+ fields = append([]*resolve.ResultField(nil), cached.Fields...)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fields = cached.Fields | |
| paramCount = cached.ParamCount | |
| // Defensive copy: cached.Fields must not be handed out and mutated. | |
| fields = append([]*resolve.ResultField(nil), cached.Fields...) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/session/session.go` at line 2512, The assignment fields = cached.Fields
wrongly aliases the cache-owned slice; replace it with a shallow copy to avoid
mutating the cache (e.g. fields = append([]*resolve.ResultField(nil),
cached.Fields...)) so the caller gets a new slice header while reusing the same
element pointers, leaving paramCount as-is; update the code paths around the
cached.Fields usage where fields is returned to callers (the variables named
fields and cached.Fields) accordingly.
There was a problem hiding this comment.
deep copy fields will affect the performance, for now, caller doesn't change fields array
There was a problem hiding this comment.
@guo-shaoge, understood — just to clarify, my suggestion was a shallow copy (only copying the n pointer values into a new slice header, not duplicating the ResultField structs themselves), so the allocation overhead is minimal even for wide tables. That said, you're right that if no current caller mutates the returned slice, the aliasing is harmless in practice.
I'll accept your decision — no change needed.
✏️ Learnings added
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/session/session.go:2512-2512
Timestamp: 2026-04-18T01:18:14.660Z
Learning: In `pkg/session/session.go` (pingcap/tidb PR `#67857`), `PrepareStmt` returns `fields = cached.Fields` directly from the dedup-cache entry (a `*plannercore.PrepareStmtCacheEntry`). This is an intentional alias (no defensive copy) because all current callers treat the returned `fields` slice as read-only. Do not flag this as an aliasing/mutation-hazard bug; the author (guo-shaoge) explicitly chose aliasing over a shallow copy for performance reasons.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/session/test/common/prepare_dedup_cache_test.go:29-52
Timestamp: 2026-04-17T14:08:27.332Z
Learning: In `pkg/sessionctx/variable/tidb_vars.go` (pingcap/tidb), `DefEnableCachePrepareStmt = true`, meaning the `tidb_enable_cache_prepare_stmt` session variable is **enabled by default**. Tests for the prepare-stmt dedup-cache feature do not need an explicit `SET `@tidb_enable_cache_prepare_stmt` = ON` — the cache is active in all test sessions by default.
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/planner/core/plan_cache_utils.go:621-624
Timestamp: 2026-04-17T14:03:09.042Z
Learning: In `pkg/planner/core/plan_cache_utils.go` (pingcap/tidb), `CollectPlanCacheStmtInfo` is always called with a **newly constructed** `PlanCacheStmt` (created inside `rebuildFromPrepareCache` in `pkg/session/session.go`), not a reused one. The `limits`, `hasSubquery`, and `tables` fields are therefore always zero-valued on entry. Do not flag the absence of reset/clear guards in `CollectPlanCacheStmtInfo` as a potential stale-data bug.
Learnt from: hawkingrei
Repo: pingcap/tidb PR: 67499
File: pkg/executor/prepared_test.go:0-0
Timestamp: 2026-04-13T03:14:21.604Z
Learning: In `pkg/executor/prepared_test.go` (pingcap/tidb), `TestExecuteWithWrongType` verifies that prepared statements with `c2 IN (?, ?)` on a `decimal(32,30)` column remain executable even when a wrong-type parameter (e.g., string `'aa'`) is passed. Both the first and subsequent executions always report `@last_plan_from_cache = 0` because wrong-type parameter shapes (type mismatch vs. the cached plan's shape) cause the plan cache to be bypassed entirely for all executions of that prepared statement. Asserting `1` for the second execution is incorrect; the stable expected value is `0` throughout.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 67054
File: pkg/lightning/mydump/parquet_parser.go:542-545
Timestamp: 2026-03-17T06:01:45.974Z
Learning: In `pkg/lightning/mydump/parquet_parser.go`, `ParquetParser.lastRow.SkipCast` intentionally aliases the mutable `pp.skipCast` buffer (same as `Row.Row` aliasing the pool datum slice). Both are only valid until the next `ReadRow()` call. This is by design: callers (e.g., the encode path in `pkg/executor/importer/chunk_process.go`) consume `LastRow().SkipCast` immediately before the next read. Do not flag the direct assignment `pp.lastRow.SkipCast = pp.skipCast` as an aliasing bug.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 64752
File: pkg/planner/core/planbuilder.go:3987-3991
Timestamp: 2026-03-12T02:22:08.966Z
Learning: In `pkg/planner/core/planbuilder.go` (pingcap/tidb), `resolveGeneratedColumns` uses `onDups != nil` (not `len(onDups) > 0`) as the intentional guard for all mutations to the `onDups` map. `nil` means the IMPORT INTO / LOAD DATA call path (these callers pass nil explicitly, and the map must not be mutated). Non-nil — even if the map is empty — means the INSERT path; in this case ON UPDATE NOW columns must be recorded in `onDups` so that dependent generated columns propagate to `igc.OnDuplicates` when a duplicate-key conflict arises. Suggesting `len(onDups) > 0` instead of `onDups != nil` is a false positive: the nil/non-nil contract was established by PR `#63146` ("fix missing nilness check for passed onDup map") as a follow-up to PR `#58401`.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 64752
File: pkg/planner/core/planbuilder.go:3987-3991
Timestamp: 2026-03-12T02:22:08.966Z
Learning: In `pkg/planner/core/planbuilder.go`, `resolveGeneratedColumns` uses `onDups != nil` (not `len(onDups) > 0`) as the intentional guard to distinguish call sites: `nil` means IMPORT INTO / LOAD DATA (must not mutate the map, and ON UPDATE NOW columns should NOT be recorded); non-nil (even if empty) means the INSERT path, where ON UPDATE NOW columns should be added to `onDups` so that dependent generated columns can be included in `igc.OnDuplicates`. Suggesting `len(onDups) > 0` instead of `onDups != nil` is a false positive — the nil/non-nil contract was established by PR `#63146` to fix a panic when IMPORT INTO passed nil.
Learnt from: AilinKid
Repo: pingcap/tidb PR: 66677
File: pkg/sessionctx/stmtctx/stmtctx.go:614-614
Timestamp: 2026-03-05T08:55:21.797Z
Learning: In `pkg/sessionctx/stmtctx/stmtctx.go`, `sc.TableStats` stores `*statistics.Table` pointers that are treated as read-only references obtained from `GetPhysicalTableStats`. These values are never mutated after being stored. The map is only used as a backup/restore snapshot for selecting the winner logical plan. Therefore, a shallow clone via `maps.Clone` is sufficient in `SaveLogicalPlanBuildState`; deep-copying the `*statistics.Table` values is not needed.
Learnt from: winoros
Repo: pingcap/tidb PR: 67411
File: pkg/planner/core/plan_cache_utils.go:0-0
Timestamp: 2026-04-01T11:13:21.523Z
Learning: In pingcap/tidb PR `#67411` (`pkg/planner/core/rule/rule_collect_plan_stats.go`), the correct fix for sync-load timeout fallback plans being cached with stale/pseudo stats is to call `stmtCtx.SetSkipPlanCache(skipPlanCacheReasonSyncLoadFallback)` immediately when `RequestLoadStats` or `SyncWaitStatsLoad` fails under `vardef.StatsLoadPseudoTimeout`. This prevents the fallback plan from ever being written to the plan cache, avoiding the stale-entry re-hit problem. Do not suggest the alternative of caching the fallback plan and invalidating it on fresh stats (via `SyncLoadFallbackItems` or similar mechanisms), as that approach leaves a stale entry resident in the cache.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: pkg/planner/core/logical_plan_builder.go:6455-6472
Timestamp: 2026-03-11T06:29:00.122Z
Learning: Ensure code reviews verify the VirtualAssignmentsOffset semantics: the planner sets Update.VirtualAssignmentsOffset = len(update.List). The executor should only apply OrderedList[:VirtualAssignmentsOffset] when composing new rows, and only after the 'changed' check should it group/evaluate OrderedList[VirtualAssignmentsOffset:] per table. This pattern applies to files under pkg/planner/core and pkg/executor (e.g., common_plans.go and update.go). Reviewers should check that updates respect slicing behavior, that the offset is consistently derived from the planner, and that downstream code does not bypass the offset when creating new rows. Add tests validating both branches: the slice before the offset for new rows, and the per-table handling of the slice after the offset.
| // Defensive: if schema changed between our earlier check and Preprocess, | ||
| // fall through to the full prepare path. | ||
| if ret.InfoSchema.SchemaMetaVersion() != cached.Stmt.SchemaVersion { | ||
| return nil, errors.New("schema version changed during rebuild") | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check whether Preprocess (with InPrepare) can mutate PreprocessorReturn.InfoSchema.
rg -nP --type=go -C3 '\.InfoSchema\s*=' -g 'pkg/planner/core/preprocess*.go'Repository: pingcap/tidb
Length of output: 1315
🏁 Script executed:
# Check the InPrepare flag usage in preprocess.go to see if it gates the mutations
rg -nP --type=go 'InPrepare' -A 2 -B 2 'pkg/planner/core/preprocess.go' | head -60Repository: pingcap/tidb
Length of output: 266
🏁 Script executed:
# Look at the context around lines 2582-2586 in session.go to understand initialization
sed -n '2560,2600p' pkg/session/session.goRepository: pingcap/tidb
Length of output: 1548
🏁 Script executed:
# Check what cached.Stmt.SchemaVersion is and when it's set
rg -nP --type=go 'SchemaVersion' -B 2 -A 2 'pkg/session/session.go' | head -50Repository: pingcap/tidb
Length of output: 902
Add a comment clarifying why the schema-version comparison uses cached.Stmt.SchemaVersion rather than the pre-Preprocess snapshot.
The defensive check is valid: Preprocess can mutate ret.InfoSchema via the stale-read processor path (lines 1809–1810 of pkg/planner/core/preprocess.go), and this mutation is not prevented when InPrepare is set.
The comparison against cached.Stmt.SchemaVersion (the original prepared statement's schema version) is the correct reference—not against the is.SchemaMetaVersion() checked just before the Preprocess call. The intent is to detect if the schema changed during the Preprocess step itself. However, the current comment "Defensive: if schema changed between our earlier check and Preprocess, fall through to the full prepare path" does not make this clear. Consider rephrasing to explain: "Between the check at line 2504 and Preprocess, the InfoSchema may change (e.g., via stale-read processor). We verify that the final InfoSchema still matches the cached prepared statement's original schema version; if not, the cached plan is unsafe."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/session/session.go` around lines 2582 - 2586, Clarify the defensive check
comment to explain that after the earlier is.SchemaMetaVersion() check and the
Preprocess call, ret.InfoSchema can be mutated (e.g., via the stale-read
processor path inside Preprocess) even when InPrepare is set, so we must compare
the final ret.InfoSchema.SchemaMetaVersion() against the original cached
prepared statement version cached.Stmt.SchemaVersion (not the pre-Preprocess
snapshot); if they differ, return the error and fall back to the full prepare
path because the cached plan may be unsafe.
| func TestPrepareStmtDedupCacheBasic(t *testing.T) { | ||
| store := testkit.CreateMockStore(t) | ||
| tk := testkit.NewTestKit(t, store) | ||
| tk.MustExec("use test") | ||
| tk.MustExec("create table t (id bigint primary key, age int, city varchar(32))") | ||
|
|
||
| sql := "select id, city from t where age > ? and city = ?" | ||
|
|
||
| id1, paramCount1, fields1, err := tk.Session().PrepareStmt(sql) | ||
| require.NoError(t, err) | ||
| require.EqualValues(t, 2, paramCount1) | ||
| require.Len(t, fields1, 2) | ||
| require.Equal(t, "id", fields1[0].Column.Name.L) | ||
| require.Equal(t, "city", fields1[1].Column.Name.L) | ||
|
|
||
| // Second prepare of the same SQL — should hit the dedup cache. | ||
| id2, paramCount2, fields2, err := tk.Session().PrepareStmt(sql) | ||
| require.NoError(t, err) | ||
| require.NotEqual(t, id1, id2, "each Prepare must return a distinct stmtID") | ||
| require.Equal(t, paramCount1, paramCount2) | ||
| require.Len(t, fields2, 2) | ||
| require.Equal(t, "id", fields2[0].Column.Name.L) | ||
| require.Equal(t, "city", fields2[1].Column.Name.L) | ||
| } |
There was a problem hiding this comment.
Tests don't actually verify dedup-cache behavior.
Two structural problems across these tests:
-
The
EnableCachePrepareStmtsession variable is never enabled. The PR description mentions adding a new server/system variable for this feature; if its default is OFF (typical for new opt-in features), every assertion here is being evaluated against the legacy full-prepare path, not the dedup-cache path. The tests will pass whether the new code works or not. -
require.NotEqual(t, id1, id2)is trivially satisfied.GetNextPreparedStmtID()always returns a monotonically increasing ID, whether or not dedup reuse happened, so this assertion does not distinguish a cache hit from a full build. Same applies toTestPrepareStmtDedupCacheSchemaChange, whereNotEqualafter DDL does not prove invalidation.
Suggestions:
- Explicitly
tk.MustExec("set @@tidb_enable_cache_prepare_stmt = ON")(or whatever the actual name is) at the top of each test — don't rely on defaults. - Add an observable signal of cache-hit vs. miss: a metric counter, a session-level hit/miss counter, a test-only hook, or inspect
s.sessionVarsdedup-cache contents before/after. Without that, these tests cannot regress-protect the feature. - For
TestPrepareStmtDedupCacheSchemaChange, additionally assert the new schema is reflected (e.g., executeSELECT * ...onid2and verifyemailcolumn visibility), not just that two IDs differ.
Based on learnings: "For session variables or protocol behavior changes, perform targeted package tests plus SQL integration tests for user-visible behavior." — the behavior under test needs to be actually observable from the assertions.
Also applies to: 107-138, 174-211
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/session/test/common/prepare_dedup_cache_test.go` around lines 29 - 52,
The tests (TestPrepareStmtDedupCacheBasic and
TestPrepareStmtDedupCacheSchemaChange) never enable the new prepare-stmt cache
and use a non-distinguishing stmtID check; explicitly enable the session
variable (e.g., call tk.MustExec("set @@tidb_enable_cache_prepare_stmt = ON") or
the correct variable name) at each test start, and replace the id1/id2 NotEqual
assertion with a verifiable cache-hit/miss signal: either inspect the session's
dedup-cache structure on the session object (e.g., s.sessionVars or the map
added for caching) before/after PrepareStmt, or increment/read a test-only
metric/counter exposed by the PrepareStmt path to assert a cache hit on the
second prepare; for TestPrepareStmtDedupCacheSchemaChange also assert the
resulting prepared statement uses the updated schema (e.g., execute the prepared
stmt and verify the new column visibility) rather than relying on stmtID
changes.
There was a problem hiding this comment.
EnableCachePrepareStmtis enabled by default- minor issue, ok to ignore
There was a problem hiding this comment.
@guo-shaoge, thanks for the clarification!
- Acknowledged — since
DefEnableCachePrepareStmt = true, the tests naturally exercise the dedup-cache path without an explicitSETcall. My concern was a false positive. - Understood, I'll drop that suggestion.
✏️ Learnings added
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/session/test/common/prepare_dedup_cache_test.go:29-52
Timestamp: 2026-04-17T14:08:27.332Z
Learning: In `pkg/sessionctx/variable/tidb_vars.go` (pingcap/tidb), `DefEnableCachePrepareStmt = true`, meaning the `tidb_enable_cache_prepare_stmt` session variable is **enabled by default**. Tests for the prepare-stmt dedup-cache feature do not need an explicit `SET `@tidb_enable_cache_prepare_stmt` = ON` — the cache is active in all test sessions by default.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: hawkingrei
Repo: pingcap/tidb PR: 67499
File: pkg/executor/prepared_test.go:0-0
Timestamp: 2026-04-13T03:14:21.604Z
Learning: In `pkg/executor/prepared_test.go` (pingcap/tidb), `TestExecuteWithWrongType` verifies that prepared statements with `c2 IN (?, ?)` on a `decimal(32,30)` column remain executable even when a wrong-type parameter (e.g., string `'aa'`) is passed. Both the first and subsequent executions always report `@last_plan_from_cache = 0` because wrong-type parameter shapes (type mismatch vs. the cached plan's shape) cause the plan cache to be bypassed entirely for all executions of that prepared statement. Asserting `1` for the second execution is incorrect; the stable expected value is `0` throughout.
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/planner/core/plan_cache_utils.go:621-624
Timestamp: 2026-04-17T14:03:09.042Z
Learning: In `pkg/planner/core/plan_cache_utils.go` (pingcap/tidb), `CollectPlanCacheStmtInfo` is always called with a **newly constructed** `PlanCacheStmt` (created inside `rebuildFromPrepareCache` in `pkg/session/session.go`), not a reused one. The `limits`, `hasSubquery`, and `tables` fields are therefore always zero-valued on entry. Do not flag the absence of reset/clear guards in `CollectPlanCacheStmtInfo` as a potential stale-data bug.
Learnt from: zimulala
Repo: pingcap/tidb PR: 66911
File: pkg/util/topsql/stmtstats/aggregator_bench_test.go:147-152
Timestamp: 2026-03-11T09:45:33.267Z
Learning: In `pkg/util/topsql/stmtstats/aggregator_bench_test.go` (pingcap/tidb PR `#66911`), `BenchmarkDrainAndPushRU160KKeysPreloaded` intentionally reuses prebuilt `RUIncrementMap` batches across benchmark iterations. This is safe because: (1) `StatementStats.execCtx` is nil in the benchmark setup, so `sampleActiveRUDeltaLocked` is never reached and does not mutate the source maps; (2) batches use non-overlapping keys (via `userOffset`). The batches are effectively immutable for the purposes of the benchmark, and data is truly reusable for isolating merge/drain cost from data-setup cost.
Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-20T13:18:29.186Z
Learning: Applies to **/*_test.go,**/testdata/** : Follow `.agents/skills/tidb-test-guidelines` for placement, naming, `shard_count` guidance, planner-specific casetest rules, and related testdata conventions.
Learnt from: ChangRui-Ryan
Repo: pingcap/tidb PR: 67145
File: pkg/planner/core/casetest/tici/tici_test.go:470-487
Timestamp: 2026-03-24T10:04:49.982Z
Learning: In `pkg/planner/core/casetest/tici/tici_test.go` (pingcap/tidb PR `#67145`), TiCI FTS tests (e.g., TestTiCINonCoveringUsesDoubleRead, TestTiCIJoinNoMPPPlan, TestTiCIWithTiFlashReplicaNoMPPNoTiFlashCop) use positive assertions for `mpp[tiflash]` and do NOT include negative assertions for `mpp[tiflash]` or `ExchangeSender`. This is intentional: TiCI is MPP-only by design, so all TiCI-side plans are expected to include `mpp[tiflash]`. Do not suggest adding `CheckNotContain("mpp[tiflash]")` or `CheckNotContain("ExchangeSender")` to these TiCI test cases.
Learnt from: AilinKid
Repo: pingcap/tidb PR: 66677
File: pkg/sessionctx/stmtctx/stmtctx.go:614-614
Timestamp: 2026-03-05T08:55:21.797Z
Learning: In `pkg/sessionctx/stmtctx/stmtctx.go`, `sc.TableStats` stores `*statistics.Table` pointers that are treated as read-only references obtained from `GetPhysicalTableStats`. These values are never mutated after being stored. The map is only used as a backup/restore snapshot for selecting the winner logical plan. Therefore, a shallow clone via `maps.Clone` is sufficient in `SaveLogicalPlanBuildState`; deep-copying the `*statistics.Table` values is not needed.
Learnt from: D3Hunter
Repo: pingcap/tidb PR: 67543
File: pkg/util/traceevent/test/integration_test.go:336-349
Timestamp: 2026-04-03T05:25:45.158Z
Learning: In `pkg/util/traceevent/test/integration_test.go` (pingcap/tidb PR `#67543`), the sampling integration test uses `require.Len(t, matchedEvents, 2)` intentionally. The flakiness was fixed not by relaxing the assertion but by adding trace-ID filtering via `eventsBelongToAnyTrace`: the drain loop only counts event batches belonging to the 10 known trace IDs from the test run, so with `sampling=5` over 10 executions exactly 2 hits are deterministic. Do NOT suggest relaxing this to `require.NotEmpty` or `require.GreaterOrEqual` — the exact count is correct and intentional.
Learnt from: zimulala
Repo: pingcap/tidb PR: 67265
File: pkg/util/topsql/reporter/ru_datamodel_test.go:259-308
Timestamp: 2026-03-25T03:46:10.574Z
Learning: In `pkg/util/topsql/reporter/ru_datamodel_test.go` (pingcap/tidb PR `#67265`), `TestRUCollectingOthersWireLabelNoCollisionWithRuntimeUserShape` intentionally uses `"app127.0.0.1"` (not `othersUserWireLabel`) as the runtime user. The regression is scoped to runtime user shapes (`userhost` / empty string); broadening to arbitrary raw user strings matching the wire label is out of contract. The empty-user and merge-path regressions (`TestRUCollectingEmptyUserAndGlobalOthersRemainDistinct`, `TestRUCollectingMergeFromKeepsEmptyUserDistinctFromGlobalOthers`) plus aggregator-side assertion hardening collectively cover the fix.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 67054
File: pkg/executor/importer/kv_encode.go:89-96
Timestamp: 2026-03-18T01:55:56.861Z
Learning: In `pkg/executor/importer/kv_encode.go` (pingcap/tidb), `TableKVEncoder.initInsertColFileMapping` uses `intest.Assert` (not a returned error) to guard the invariant that `fieldMappings` column IDs match `insertColumns` in order. This invariant is established at construction time and is covered by the broader test suite; it cannot be violated in production. Suggesting a production `error` return for this check is a false positive — the `intest.Assert` pattern is intentional and preferred here.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: tests/realtikvtest/importintotest/import_into_test.go:253-267
Timestamp: 2026-03-11T06:26:14.065Z
Learning: In pingcap/tidb, PR `#66878` (cherry-pick of `#58401` to release-7.5) also bundles the follow-up nil-check fix from PR `#63146`. `TestOnUpdateColumn` in `tests/realtikvtest/importintotest/import_into_test.go` was cherry-picked from PR `#63146` and tests that `IMPORT INTO` on a table with an `ON UPDATE CURRENT_TIMESTAMP` column does NOT panic due to a nil `onDup` map in `resolveGeneratedColumns`. It is NOT intended to cover the transitive generated-column / index-inconsistency fix from `#58401/`#58400.
Learnt from: winoros
Repo: pingcap/tidb PR: 67411
File: pkg/planner/core/plan_cache_utils.go:0-0
Timestamp: 2026-04-01T11:13:21.523Z
Learning: In pingcap/tidb PR `#67411` (`pkg/planner/core/rule/rule_collect_plan_stats.go`), the correct fix for sync-load timeout fallback plans being cached with stale/pseudo stats is to call `stmtCtx.SetSkipPlanCache(skipPlanCacheReasonSyncLoadFallback)` immediately when `RequestLoadStats` or `SyncWaitStatsLoad` fails under `vardef.StatsLoadPseudoTimeout`. This prevents the fallback plan from ever being written to the plan cache, avoiding the stale-entry re-hit problem. Do not suggest the alternative of caching the fallback plan and invalidating it on fresh stats (via `SyncLoadFallbackItems` or similar mechanisms), as that approach leaves a stale entry resident in the cache.
Learnt from: ChangRui-Ryan
Repo: pingcap/tidb PR: 67145
File: pkg/planner/core/operator/physicalop/physical_indexlookup_reader.go:342-347
Timestamp: 2026-03-24T10:01:51.352Z
Learning: In `pkg/planner/core/operator/physicalop/physical_indexlookup_reader.go` (pingcap/tidb PR `#67145`), `BuildIndexLookUpTask` injects a `PhysicalExchangeSender` wrapper around the TiCI index subtree without calling `sender.SetSchema(indexPlan.Schema())`. This is intentional — the downstream pipeline consumes the child schema correctly without explicit schema propagation on the sender, and the path is covered by UT/integration tests. Do not flag the missing `SetSchema` call on this injected sender as a required correctness fix.
Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T20:19:06.925Z
Learning: Applies to pkg/session/**|pkg/sessionctx/** : For session variables or protocol behavior changes, perform targeted package tests plus SQL integration tests for user-visible behavior.
Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-20T13:18:29.186Z
Learning: Applies to **/*_test.go : Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required.
Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T12:48:57.065Z
Learning: Applies to pkg/executor/** : For SQL behavior changes in executor, perform targeted unit test plus relevant integration test.
Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T12:05:28.866Z
Learning: Applies to **/*_test.go : Keep test changes minimal and deterministic; avoid broad golden/testdata churn unless required.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: pkg/planner/core/logical_plan_builder.go:6455-6472
Timestamp: 2026-03-11T06:29:00.122Z
Learning: Ensure code reviews verify the VirtualAssignmentsOffset semantics: the planner sets Update.VirtualAssignmentsOffset = len(update.List). The executor should only apply OrderedList[:VirtualAssignmentsOffset] when composing new rows, and only after the 'changed' check should it group/evaluate OrderedList[VirtualAssignmentsOffset:] per table. This pattern applies to files under pkg/planner/core and pkg/executor (e.g., common_plans.go and update.go). Reviewers should check that updates respect slicing behavior, that the offset is consistently derived from the planner, and that downstream code does not bypass the offset when creating new rows. Add tests validating both branches: the slice before the offset for new rows, and the per-table handling of the slice after the offset.
| // PrepareDedupCacheKey builds the lookup key for the prepare dedup cache. | ||
| // Including charset, collation, currentDB and sqlMode ensures that the cached | ||
| // PlanCacheStmt is only reused when the session context that affects parsing, | ||
| // name-resolution and cacheability decisions is identical. sqlMode is included | ||
| // because flags like PIPES_AS_CONCAT and ANSI_QUOTES change AST shape, and | ||
| // IsASTCacheable (which computes StmtCacheable) runs on that AST. | ||
| func PrepareDedupCacheKey(sql, charset, collation, currentDB string, sqlMode mysql.SQLMode) string { | ||
| var modeBuf [8]byte | ||
| binary.LittleEndian.PutUint64(modeBuf[:], uint64(sqlMode)) | ||
| return sql + "\x00" + charset + "\x00" + collation + "\x00" + currentDB + "\x00" + string(modeBuf[:]) | ||
| } |
There was a problem hiding this comment.
Dedup cache key omits parser-affecting flags, risking stale AST reuse.
BuildParserConfig (line 1936) also depends on EnableWindowFunction and EnableStrictDoubleTypeCheck, and GetParseParams (line 2487) additionally includes CharacterSetClient. If any of these change within a session (they are session-scoped sysvars), a subsequent PrepareStmt with identical sql/charset/collation/currentDB/sqlMode will reuse a PlanCacheStmt whose AST was produced under a different parser configuration — notably, window-function tokens may have been treated as identifiers (or vice versa), and double-type validation may have been applied differently. That can yield incorrect execution for the new prepare without any plan-cache miss signal.
Consider folding the full parser config (and CharacterSetClient) into the key — or explicitly documenting that the helper is only safe when callers guarantee these flags are stable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/sessionctx/variable/session.go` around lines 2692 - 2702,
PrepareDedupCacheKey currently omits parser-affecting session flags and
CharacterSetClient, so update PrepareDedupCacheKey to incorporate the parser
configuration used by BuildParserConfig (at least EnableWindowFunction and
EnableStrictDoubleTypeCheck) and the CharacterSetClient value from
GetParseParams into the returned key (e.g. append flag bytes or a small encoded
struct along with the existing sqlMode bytes) so that PlanCacheStmt ASTs are
only reused when those parser-affecting session variables match; alternatively,
if you prefer not to change the key, add a clear comment on PrepareDedupCacheKey
and callers documenting that it is only safe when EnableWindowFunction,
EnableStrictDoubleTypeCheck and CharacterSetClient are stable for the session.
There was a problem hiding this comment.
I'll leave it as a minor todo and fix later
There was a problem hiding this comment.
@guo-shaoge, understood — acknowledged. Would you like me to open a follow-up GitHub issue to track adding EnableWindowFunction, EnableStrictDoubleTypeCheck, and CharacterSetClient to PrepareDedupCacheKey so it doesn't get lost?
(•ᵕ•)っ jots note carefully
✏️ Learnings added
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/sessionctx/variable/session.go:2692-2702
Timestamp: 2026-04-18T01:35:24.957Z
Learning: In `pkg/sessionctx/variable/session.go` (pingcap/tidb PR `#67857`), `PrepareDedupCacheKey` intentionally omits `EnableWindowFunction`, `EnableStrictDoubleTypeCheck`, and `CharacterSetClient` from the dedup cache key. The author (guo-shaoge) is aware of this gap and has deferred fixing it to a later PR. Do not flag this omission as a new bug in future reviews of this file; it is a known, tracked TODO.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/session/session.go:2512-2512
Timestamp: 2026-04-18T01:18:14.660Z
Learning: In `pkg/session/session.go` (pingcap/tidb PR `#67857`), `PrepareStmt` returns `fields = cached.Fields` directly from the dedup-cache entry (a `*plannercore.PrepareStmtCacheEntry`). This is an intentional alias (no defensive copy) because all current callers treat the returned `fields` slice as read-only. Do not flag this as an aliasing/mutation-hazard bug; the author (guo-shaoge) explicitly chose aliasing over a shallow copy for performance reasons.
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/session/test/common/prepare_dedup_cache_test.go:29-52
Timestamp: 2026-04-17T14:08:27.332Z
Learning: In `pkg/sessionctx/variable/tidb_vars.go` (pingcap/tidb), `DefEnableCachePrepareStmt = true`, meaning the `tidb_enable_cache_prepare_stmt` session variable is **enabled by default**. Tests for the prepare-stmt dedup-cache feature do not need an explicit `SET `@tidb_enable_cache_prepare_stmt` = ON` — the cache is active in all test sessions by default.
Learnt from: guo-shaoge
Repo: pingcap/tidb PR: 67857
File: pkg/planner/core/plan_cache_utils.go:621-624
Timestamp: 2026-04-17T14:03:09.042Z
Learning: In `pkg/planner/core/plan_cache_utils.go` (pingcap/tidb), `CollectPlanCacheStmtInfo` is always called with a **newly constructed** `PlanCacheStmt` (created inside `rebuildFromPrepareCache` in `pkg/session/session.go`), not a reused one. The `limits`, `hasSubquery`, and `tables` fields are therefore always zero-valued on entry. Do not flag the absence of reset/clear guards in `CollectPlanCacheStmtInfo` as a potential stale-data bug.
Learnt from: cryo-zd
Repo: pingcap/tidb PR: 65532
File: pkg/planner/core/preprocess.go:754-760
Timestamp: 2026-03-13T00:34:41.277Z
Learning: In TiDB's AST visitor pattern (pkg/planner/core/preprocess.go), when a child node's Accept returns (node, false), the parent node's Accept propagates false upward WITHOUT calling Leave on the parent. Because preprocessor.Leave returns `ok = (p.err == nil)`, any subtree error causes the entire walk to short-circuit before the enclosing SelectStmt's Leave is reached. Therefore, code inside `case *ast.SelectStmt:` in `Leave` (e.g., checkLockClauseTables) is only ever executed when all children completed successfully and p.err is still nil. A guard like `if p.err == nil { checkLockClauseTables(...) }` would be purely cosmetic/defensive, not fixing any reachable bug.
Learnt from: cryo-zd
Repo: pingcap/tidb PR: 65532
File: pkg/planner/core/preprocess.go:1887-1898
Timestamp: 2026-03-12T08:41:44.509Z
Learning: In pkg/planner/core/preprocess.go (pingcap/tidb PR `#65532`), the `checkLockClauseTables` function intentionally accepts `FOR UPDATE OF <base_table>` even when the FROM item was introduced with an alias (`matched.alias.L != "" && !matchedByAlias`). Rather than returning an error (strict MySQL behavior), TiDB emits a warning to guide users toward the alias form. This backward-compatibility fallback is a deliberate acceptance criterion for the change — do not flag it as a MySQL-incompatibility bug in future reviews.
Learnt from: qw4990
Repo: pingcap/tidb PR: 67532
File: pkg/parser/digester.go:568-590
Timestamp: 2026-04-02T13:37:28.684Z
Learning: In `pkg/parser/digester.go` (pingcap/tidb PR `#67532`), the `isAtomicPredicateTokens` function and `isComparisonOperator` intentionally do not handle `BETWEEN ... AND` predicates. The PR only focuses on CMP (comparison) function predicates. Do not flag the missing `BETWEEN` support in reviews of this file — it is a known gap to be addressed separately.
Learnt from: hawkingrei
Repo: pingcap/tidb PR: 67499
File: pkg/executor/prepared_test.go:0-0
Timestamp: 2026-04-13T03:14:21.604Z
Learning: In `pkg/executor/prepared_test.go` (pingcap/tidb), `TestExecuteWithWrongType` verifies that prepared statements with `c2 IN (?, ?)` on a `decimal(32,30)` column remain executable even when a wrong-type parameter (e.g., string `'aa'`) is passed. Both the first and subsequent executions always report `@last_plan_from_cache = 0` because wrong-type parameter shapes (type mismatch vs. the cached plan's shape) cause the plan cache to be bypassed entirely for all executions of that prepared statement. Asserting `1` for the second execution is incorrect; the stable expected value is `0` throughout.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 67054
File: pkg/lightning/mydump/parquet_parser.go:542-545
Timestamp: 2026-03-17T06:01:45.974Z
Learning: In `pkg/lightning/mydump/parquet_parser.go`, `ParquetParser.lastRow.SkipCast` intentionally aliases the mutable `pp.skipCast` buffer (same as `Row.Row` aliasing the pool datum slice). Both are only valid until the next `ReadRow()` call. This is by design: callers (e.g., the encode path in `pkg/executor/importer/chunk_process.go`) consume `LastRow().SkipCast` immediately before the next read. Do not flag the direct assignment `pp.lastRow.SkipCast = pp.skipCast` as an aliasing bug.
Learnt from: Reminiscent
Repo: pingcap/tidb PR: 65367
File: pkg/planner/core/rule_join_reorder.go:787-798
Timestamp: 2026-04-15T02:48:41.178Z
Learning: In `pkg/planner/core/rule_join_reorder.go` (pingcap/tidb), `baseSingleGroupJoinOrderSolver.injectExpr` intentionally passes the original `expr` (not the `substituted` result of `expression.ColumnSubstitute`) to `proj.AppendExpr(expr)`. This is correct because `AppendExpr` performs the projection-schema substitution internally. Passing the pre-substituted expression would cause double-substitution and incorrectly rematerialize non-reusable projection outputs. Do not flag the use of `expr` instead of `substituted` in this `AppendExpr` call as a bug.
Learnt from: winoros
Repo: pingcap/tidb PR: 67411
File: pkg/planner/core/plan_cache_utils.go:0-0
Timestamp: 2026-04-01T11:13:21.523Z
Learning: In pingcap/tidb PR `#67411` (`pkg/planner/core/rule/rule_collect_plan_stats.go`), the correct fix for sync-load timeout fallback plans being cached with stale/pseudo stats is to call `stmtCtx.SetSkipPlanCache(skipPlanCacheReasonSyncLoadFallback)` immediately when `RequestLoadStats` or `SyncWaitStatsLoad` fails under `vardef.StatsLoadPseudoTimeout`. This prevents the fallback plan from ever being written to the plan cache, avoiding the stale-entry re-hit problem. Do not suggest the alternative of caching the fallback plan and invalidating it on fresh stats (via `SyncLoadFallbackItems` or similar mechanisms), as that approach leaves a stale entry resident in the cache.
Learnt from: CR
Repo: pingcap/tidb PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-12T20:19:06.925Z
Learning: Applies to pkg/session/**|pkg/sessionctx/** : For session variables or protocol behavior changes, perform targeted package tests plus SQL integration tests for user-visible behavior.
Learnt from: joechenrh
Repo: pingcap/tidb PR: 66878
File: pkg/planner/core/logical_plan_builder.go:6455-6472
Timestamp: 2026-03-11T06:29:00.122Z
Learning: Ensure code reviews verify the VirtualAssignmentsOffset semantics: the planner sets Update.VirtualAssignmentsOffset = len(update.List). The executor should only apply OrderedList[:VirtualAssignmentsOffset] when composing new rows, and only after the 'changed' check should it group/evaluate OrderedList[VirtualAssignmentsOffset:] per table. This pattern applies to files under pkg/planner/core and pkg/executor (e.g., common_plans.go and update.go). Reviewers should check that updates respect slicing behavior, that the offset is consistently derived from the planner, and that downstream code does not bypass the offset when creating new rows. Add tests validating both branches: the slice before the offset for new rows, and the per-table handling of the slice after the offset.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## release-8.5-20260418-v8.5.4 #67857 +/- ##
================================================================
Coverage ? 57.1985%
================================================================
Files ? 1784
Lines ? 636995
Branches ? 0
================================================================
Hits ? 364352
Misses ? 248121
Partials ? 24522
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
/retest |
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
/retest |
6 similar comments
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
/retest |
|
@guo-shaoge: You cannot manually add or delete the reviewing state labels, only I and the tursted members have permission to do so. DetailsIn response to adding label named approved. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hawkingrei, yudongusa The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
3f03f9f
into
pingcap:release-8.5-20260418-v8.5.4
What problem does this PR solve?
Issue Number: close #67815
Problem Summary: CPU profiling showed
HandleStmtPrepareconsuming ~21% of total CPU in workloads using the prepare-per-request pattern (COM_STMT_PREPARE → COM_STMT_EXECUTE →COM_STMT_CLOSE per request with the same SQL). Each Prepare redundantly executed Parse + Preprocess + PlanBuilder.Build for identical SQL text within the same
session.
What changed and how does it work?
Add a session-level LRU cache that deduplicates repeated
COM_STMT_PREPAREcalls for the same SQL. On a cache hit, we skip the expensivePlanBuilder.Buildstep(~9.6% CPU) while still re-parsing the SQL (to get independent AST/ParamMarkerExpr nodes) and re-running Preprocess (to get a fresh
ResolveCtxaligned with thenew AST).
Why re-parse instead of cloning the AST?
ParamMarkerExprnodes are written with parameter values during Execute. Each prepared statement needs its own AST tree.Why still run Preprocess?
ResolveCtx.tableNamesis amap[*ast.TableName]*TableNameWkeyed by AST pointer. Reusing the cachedResolveCtxwith a freshly-parsed AST would causenil-dereference panics on plan-cache miss, because the old pointer keys don't match the new AST's nodes.
Cache key
sql + charset + collation + currentDB + sqlMode— covers all session variables that affect parsing semantics.(todo might be better to add
enableWindowFunction + enableStrictDoubleTypeCheck, check #67857 (comment))Check List
Tests
./bench_client -set && ./bech_client -d '10m' -c 512 -conn 512 -threads 16bench_client.go.txt
we can see QPS increase from 8K to 10K wich CPU are both full:
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
New Features
Tests