[feat][evaluation]open api evaluator#418
Conversation
704da7f to
f347866
Compare
2198b7b to
66e6e2e
Compare
5f44e6d to
923e6e8
Compare
CozeLoop
left a comment
There was a problem hiding this comment.
Automated review by Aime: adding line-level comments for identified issues.
923e6e8 to
4329950
Compare
CozeLoop
left a comment
There was a problem hiding this comment.
Added one must-fix issue: nil-safe metrics defer for new OpenAPI handlers (and highlighted all affected methods).
4329950 to
1bd9d63
Compare
CozeLoop
left a comment
There was a problem hiding this comment.
High-level summary:
Thanks for the comprehensive OpenAPI evaluator/evaluation work. The overall layering (IDL → convertors → application → domain services) is consistent with the existing evaluation module and the new endpoints are wired into the router/handlers in a clean, DDD-friendly way.
However, I found one compatibility-breaking issue and a few behavioural edge cases that should be addressed before merging:
-
[Must-Fix] Thrift/IDL compatibility
domain_openapi.evaluator.thriftchanges existing field IDs instruct Evaluator(insertingworkspace_idas field 2 and shifting other fields). This breaks the wire format and violates the IDL compatibility rules. Please keep existing IDs stable and addworkspace_idwith a new ID.
-
[Should-Fix] OpenAPI input validation and error mapping
OpenAPIRateLimitDTO2DObubbles uptime.ParseDurationerrors directly, and the OpenAPI layer wraps them as generic internal errors. Invalidrate_limit.periodvalues should be mapped to a well-defined invalid-parameter error (4xx) instead of 500.OpenAPIExptTemplateFilterDTO2DOcurrently returnsnilwhenlogic_op != "and", silently discarding both filter conditions and the fuzzy-name keyword. This makes some client inputs appear to be accepted while actually ignored.
-
[Should-Fix] Template-based experiment submission robustness
TemplateToSubmitExperimentRequestassumestemplate.Metais always non-nil and will panic if that invariant is violated. Given it’s used bySubmitExptFromTemplateOApi, it’s safer to guardMetaand returnnilso the caller can surface a controlled error.
I’ve left detailed line comments with concrete suggestions and example code for each issue. Once these are addressed, the PR should be in good shape.
What type of PR is this?
Check the PR title
(Optional) Translate the PR title into Chinese
(Optional) More detailed description for this PR(en: English/zh: Chinese)
en:
zh(optional):
(Optional) Which issue(s) this PR fixes