[feat] [evaluate] code evaluator commercial#210
Conversation
… coverage (LogID: 20250926135253010091110134060FD66) Co-Authored-By: Coda <coda@bytedance.com>
(LogID: 20250927194646172020010002380FB80) Co-Authored-By: Coda <coda@bytedance.com>
370557e to
95d4509
Compare
…、FaaS集成、监控配置等完整功能实现 (LogID: 20250928143922010091110134199CBD2) Co-Authored-By: Coda <coda@bytedance.com>
c6d5e14 to
c36439a
Compare
CozeLoop
left a comment
There was a problem hiding this comment.
Aime CR Stage 2 summary:
- Must-Fix: New evaluator/experiment routes register with middleware stubs returning nil; add auth/rate/audit/trace/recover chains. Fix ValidateEvaluator and insight_analysis handlers to call kitex services and return structured errors.
- Should-Fix: Avoid raw
err.Error()in handlers; protect mock_output; verify ACL on records. - Suggestion: Verify Hertz v0.10.1 compatibility and strengthen HTTP client connection reuse.
See inline comments for precise lines and proposals.
|
|
||
| func _validateevaluatorMw(handler *apis.APIHandler) []app.HandlerFunc { | ||
| // your code... | ||
| return nil |
There was a problem hiding this comment.
🚫 Must-Fix: Middleware stub returns nil — missing security and operational guarantees.
Problem: This middleware currently returns nil, which effectively attaches no middleware to the new /evaluators/validate route. That leaves the endpoint without authentication, rate limiting, audit logging, or standardized error handling. In CozeLoop and ByteDance service specs, public-facing write endpoints must enforce authn/authz, request limits, and trace/audit propagation.
Why it matters: Without these guards, the route can be abused (e.g., batch requests, unauthenticated access), observability is reduced, and consistent error/metrics propagation breaks. Hertz v0.10.1 does not add defaults; explicit handlers are required.
Proposal: Provide a middleware chain that at minimum includes: auth (token/session), rate limiter, request size/timeout limits, audit/trace injection, and a panic/error recovery shim. Example (illustrative):
func _validateevaluatorMw(handler *apis.APIHandler) []app.HandlerFunc {
return []app.HandlerFunc{
handler.Middleware.Auth(), // authn/authz
handler.Middleware.RateLimit(100, time.Minute), // QoS per user/workspace
handler.Middleware.RequestLimit(1<<20), // 1MB body max
handler.Middleware.Trace(), // tracing span/context
handler.Middleware.Audit("evaluator.validate"),
handler.Middleware.Recover(), // uniform error mapping
}
}Ensure the same baseline is applied to other newly-added evaluator endpoints (batch_debug, etc.), tuned by route sensitivity.
|
|
||
| func _batchdebugevaluatorMw(handler *apis.APIHandler) []app.HandlerFunc { | ||
| // your code... | ||
| return nil |
There was a problem hiding this comment.
🚫 Must-Fix: Batch debug route lacks protective middleware.
Problem: _batchdebugevaluatorMw returns nil, so /evaluators/batch_debug runs without auth, rate limiting, payload size limits, or audit/trace context. Batch operations are resource-intensive and can be abused for DoS if not guarded.
Impact: Elevated compute and sandbox invocations without QoS lead to performance hotspots and noisy neighbors. Missing audit breaks accountability.
Fix: Attach a strict middleware chain tailored for batch ops:
func _batchdebugevaluatorMw(handler *apis.APIHandler) []app.HandlerFunc {
return []app.HandlerFunc{
handler.Middleware.Auth(),
handler.Middleware.RateLimitPerUser(20, time.Minute), // conservative per-user burst
handler.Middleware.ConcurrencyLimit(4), // bound concurrent batch execution
handler.Middleware.RequestLimit(1<<20), // cap body size
handler.Middleware.Trace(),
handler.Middleware.Audit("evaluator.batch_debug"),
handler.Middleware.Recover(),
}
}Also consider deferring long-running execution via job queue with polling APIs to avoid tying up request threads.
|
|
||
| func _insightanalysisexperimentMw(handler *apis.APIHandler) []app.HandlerFunc { | ||
| // your code... | ||
| return nil |
There was a problem hiding this comment.
🚫 Must-Fix: Insight analysis endpoints are exposed without middleware.
Problem: _insightanalysisexperimentMw returns nil, so /experiments/:expt_id/insight_analysis executes without authn/authz checks, rate limiting, or audit/trace context. Insight analysis is typically expensive and should be permission-scoped to the experiment’s space/owner.
Fix: Attach auth (space-scoped), rate/QoS, audit, and tracing middleware. Example:
func _insightanalysisexperimentMw(handler *apis.APIHandler) []app.HandlerFunc {
return []app.HandlerFunc{
handler.Middleware.AuthSpaceScoped(), // verify access to :expt_id
handler.Middleware.RateLimit(30, time.Minute),
handler.Middleware.Trace(),
handler.Middleware.Audit("experiment.insight_analysis"),
handler.Middleware.Recover(),
}
}Consider async job submission with status polling to avoid request blocking for long-running analysis.
|
|
||
| func _insight_analysis_recordsMw(handler *apis.APIHandler) []app.HandlerFunc { | ||
| // your code... | ||
| return nil |
There was a problem hiding this comment.
🚫 Must-Fix: Records/feedback/comments routes lack auth and audit middleware.
Problem: _insight_analysis_recordsMw (and related _insight_analysis_record_idMw, _feedbackexptinsightanalysisreportMw, _listexptinsightanalysiscommentMw, etc.) all return nil. These endpoints modify/access experiment-scoped records; without authn/authz, rate limiting, and audit, they are vulnerable to unauthorized access and abuse.
Fix: Define a shared middleware set for the records group that enforces space-scoped permission checks, rate limits, and audit/trace propagation:
func _insight_analysis_recordsMw(handler *apis.APIHandler) []app.HandlerFunc {
return []app.HandlerFunc{
handler.Middleware.AuthSpaceScoped(),
handler.Middleware.RateLimit(60, time.Minute),
handler.Middleware.Trace(),
handler.Middleware.Audit("experiment.insight_analysis.records"),
handler.Middleware.Recover(),
}
}Apply complementary guards for child routes (DELETE/POST) to ensure proper authorization on record IDs.
|
|
||
| func _mockevaltargetoutputMw(handler *apis.APIHandler) []app.HandlerFunc { | ||
| // your code... | ||
| return nil |
There was a problem hiding this comment.
🚨 Should-Fix: Mock output endpoint should be guarded to avoid misuse.
Problem: _mockevaltargetoutputMw returns nil for /eval_targets/mock_output. Even for mock/testing paths, unauthenticated use in production environments can leak target metadata or be abused for traffic amplification.
Suggestion: Attach lightweight but explicit guards: auth (or admin-only), rate limiting, and audit tagging so mock usage is observable.
func _mockevaltargetoutputMw(handler *apis.APIHandler) []app.HandlerFunc {
return []app.HandlerFunc{
handler.Middleware.Auth(),
handler.Middleware.RateLimit(60, time.Minute),
handler.Middleware.Audit("eval_target.mock_output"),
handler.Middleware.Trace(),
}
}| _expt_id.POST("/associate_tag", append(_associateannotationtagMw(handler), apis.AssociateAnnotationTag)...) | ||
| _expt_id.POST("/clone", append(_cloneexperimentMw(handler), apis.CloneExperiment)...) | ||
| _expt_id.DELETE("/delete_tag", append(_deleteannotationtagMw(handler), apis.DeleteAnnotationTag)...) | ||
| _expt_id.POST("/insight_analysis", append(_insightanalysisexperimentMw(handler), apis.InsightAnalysisExperiment)...) |
There was a problem hiding this comment.
🚨 Should-Fix: Insight analysis should verify experiment-level permissions and guard resource usage.
Observation: /experiments/:expt_id/insight_analysis is registered with _insightanalysisexperimentMw(handler) which returns nil. This endpoint should verify access rights to :expt_id (space-scoped), add rate limiting, and emit audit/trace. Consider async job submission to avoid request blocking for long-running analysis.
| _export_records.POST("/list", append(_listexptresultexportrecordMw(handler), apis.ListExptResultExportRecord)...) | ||
| } | ||
| { | ||
| _insight_analysis_records := _expt_id.Group("/insight_analysis_records", _insight_analysis_recordsMw(handler)...) |
There was a problem hiding this comment.
🚨 Should-Fix: Records and nested routes need ACL checks and audit.
Observation: The group /experiments/:expt_id/insight_analysis_records attaches _insight_analysis_recordsMw(handler) which returns nil. Ensure space-scoped ACL, rate limiting, trace/audit, and consistent error handling across nested routes (DELETE, POST list/get, feedback, comments).
|
|
||
| # Database configuration - use SQLite for testing | ||
| sqlite: | ||
| path: "/tmp/coze_loop.db" |
There was a problem hiding this comment.
🚨 Should-Fix: Local-only infrastructure.yaml risks environment parity and accidental production use.
Observation: New config defaults to SQLite and disables Redis/ClickHouse/MQ, suitable for local testing but problematic if committed without environment scoping. Database规范 requires parity across Docker/K8s deployments and proper SQL init paths.
Recommendation:
- Move this into a clearly local/dev profile (e.g.,
conf/infrastructure.local.yaml) and exclude from production packaging. - Ensure docker-compose and helm chart paths continue to apply the same SQL init changes; do not diverge schema.
- Add comments and build-time gating to prevent accidental use in production.
| Python = 1 | ||
| JS = 2 | ||
| } | ||
| typedef string LanguageType(ts.enum="true") |
There was a problem hiding this comment.
💡 Suggestion: Confirm generated code and usage align with LanguageType as string enum.
Observation: LanguageType moved from numeric enum to typedef string with constants. Ensure downstream code (builders, runtime selection, validation) treat language type as canonical strings and normalize appropriately. Domain code CodeEvaluatorVersion.ValidateBaseInfo appears to normalize casing — good. Verify kitex-generated types and conversion paths are updated accordingly.
|
|
||
| // 检查HTTP状态码 | ||
| if resp.StatusCode < 200 || resp.StatusCode >= 300 { | ||
| bodyBytes, _ := io.ReadAll(resp.Body) |
There was a problem hiding this comment.
💡 Suggestion: Improve HTTP client robustness and connection reuse.
- Consider draining response body on non-2xx before close to allow persistent connection reuse (e.g., io.Copy(io.Discard, resp.Body)).
- For
Bodytyped as io.Reader, defaulting Content-Type totext/plainmay be surprising; prefer requiring explicit content-type from caller or default toapplication/octet-stream. - Optionally add retry/backoff for transient 5xx, with context-aware cancellation.
What type of PR is this?
Check the PR title.
(Optional) Translate the PR title into Chinese.
(Optional) More detailed description for this PR(en: English/zh: Chinese).
en:
zh(optional):
(Optional) Which issue(s) this PR fixes: