fix(compose): fix ToolInfo.ParamsOneOf loss and eliminate deep-copy in checkpoint serialization#1007
Open
shentongmartin wants to merge 1 commit intoalpha/09from
Open
fix(compose): fix ToolInfo.ParamsOneOf loss and eliminate deep-copy in checkpoint serialization#1007shentongmartin wants to merge 1 commit intoalpha/09from
shentongmartin wants to merge 1 commit intoalpha/09from
Conversation
… InternalSerializer When InternalSerializer marshals a struct value that implements json.Marshaler via pointer receiver (e.g. *ToolInfo), rv.Interface() produces a non-addressable copy. json.Marshal then cannot call the pointer method and falls back to default struct encoding, which skips unexported fields — causing ParamsOneOf data loss after deepCopyState during interrupt/resume. Fix: pass a pointer to json.Marshal by using rv.Addr() when addressable, or copying into reflect.New() otherwise. Change-Id: Ib325f5cbb1f97271f1609d8f29b9669e9ec01d60
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## alpha/09 #1007 +/- ##
===========================================
Coverage ? 82.97%
===========================================
Files ? 162
Lines ? 22040
Branches ? 0
===========================================
Hits ? 18288
Misses ? 2529
Partials ? 1223 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6c1d86a to
58b8751
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
ToolInfo.ParamsOneOflost during checkpoint serialization —InternalSerializer.internalMarshalpasses a non-addressable value tojson.Marshal, preventing(*ToolInfo).MarshalJSON()(pointer receiver) from being dispatchedjson.Marshalwhen the reflect.Value is not addressable, so custom pointer-receiver marshalers are correctly invokedInternalSerializerwas inherently fragile — any type with unexported fields or custom marshaling could silently lose datadeepCopyStatewith mutex-based synchronization: acquirestate.muaroundcheckPointer.set()to get mutual exclusion with concurrentProcessStatecalls, eliminating the need for serialization-based deep-copy entirelyKey Insight
The root cause is a subtle interaction between Go's
reflectpackage andencoding/json:internalMarshalextracted a struct value viarv.Interface()and passed it tojson.Marshal. Whenrvis not addressable (e.g., it came from a map value or was created viareflect.New(...).Elem()-style paths that lost addressability), Go cannot obtain a pointer to it. SinceMarshalJSONis defined on*ToolInfo(pointer receiver),json.Marshalnever sees the custom marshaler and falls back to default struct encoding — which silently dropsParamsOneOf(a field with unexported sub-fields only accessible through the custom method).The second insight is architectural: deep-copying state via marshal/unmarshal was introduced to prevent races between checkpoint persistence and concurrent
ProcessStatemutations. ButProcessStatealready acquiresstate.mu, andconvertCheckPointonly readscp.Channelsandcp.Inputs(notcp.State). Therefore, acquiring the same mutex aroundcheckPointer.set()provides the same mutual exclusion guarantee without relying on serialization fidelity.Changed Files
internal/serialization/serialization.go— ininternalMarshal, check addressability; if not addressable, create a new pointer and set the value before callingjson.Marshalcompose/graph_run.go— removedeepCopyStatefunction; acquirestate.mu.Lock()aroundcheckPointer.set(cp)inrunPregel概要
ToolInfo.ParamsOneOf丢失 —InternalSerializer.internalMarshal传入不可寻址的值给json.Marshal,导致(*ToolInfo).MarshalJSON()(指针接收者)无法被调用json.Marshal的始终是指针,使自定义指针接收者的 marshaler 能被正确派发InternalSerializer的深拷贝本质上脆弱 — 任何包含未导出字段或自定义序列化的类型都可能静默丢失数据deepCopyState:在checkPointer.set()周围获取state.mu锁,与并发的ProcessState调用互斥,从而完全消除对序列化深拷贝的依赖核心洞察
根因是 Go 的
reflect包与encoding/json之间的微妙交互:internalMarshal通过rv.Interface()提取结构体值并传给json.Marshal。当rv不可寻址时(例如来自 map 值、或经过丧失可寻址性的 reflect 路径),Go 无法获取其指针。由于MarshalJSON定义在*ToolInfo(指针接收者)上,json.Marshal看不到自定义 marshaler,回退到默认结构体编码 — 静默丢弃ParamsOneOf(一个仅通过自定义方法可访问未导出子字段的字段)。第二个洞察是架构层面的:通过 marshal/unmarshal 深拷贝 state 是为了防止 checkpoint 持久化与并发
ProcessState修改之间的竞争。但ProcessState已经获取state.mu,且convertCheckPoint只读取cp.Channels和cp.Inputs(不读cp.State)。因此在checkPointer.set()周围获取同一把锁即可提供相同的互斥保证,无需依赖序列化的正确性。变更文件
internal/serialization/serialization.go— 在internalMarshal中检查可寻址性;不可寻址时创建新指针并设置值后再调用json.Marshalcompose/graph_run.go— 删除deepCopyState函数;在runPregel中的checkPointer.set(cp)周围获取state.mu.Lock()