Add OpenAIRecorder#91
Conversation
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
xenoscopic
left a comment
There was a problem hiding this comment.
LGTM. Only thing I can think of is maybe we also want to record User-Agent in case the user has multiple components interacting with the model runner (e.g. in an agentic app) and wants to be able to distinguish between them.
ilopezluna
left a comment
There was a problem hiding this comment.
Nice! I added a couple of comments, nothing blocking, they can be addressed in follow up PRs if needed
| } | ||
| } | ||
|
|
||
| func (r *OpenAIRecorder) convertStreamingResponse(streamingBody string) string { |
There was a problem hiding this comment.
Maybe we could use the OpenAI Go SDK to get this conversion.
They have the acc := openai.ChatCompletionAccumulator{} which might be useful:
https://github.com/openai/openai-go/blob/main/examples/chat-completion-accumulating/main.go
There was a problem hiding this comment.
I can do that, but I would still need to split the full body on chunks and pass them separately to the accumulator as it is designed to work with individual streaming chunks 🤔 .
There was a problem hiding this comment.
I think I used the sdk to convert chunks into a full response in the past, but I actually don't remember if I used the accumulator.
In any case, not a blocker so we can revisit eventually
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Good point, thanks @xenoscopic! |
| UserAgent string `json:"user_agent,omitempty"` | ||
| } | ||
|
|
||
| type ModelData struct { |
There was a problem hiding this comment.
I agree it's a bit confusing, but the backend configuration are backend specific, not OpenAI specific.
| } | ||
| } | ||
|
|
||
| func (r *OpenAIRecorder) convertStreamingResponse(streamingBody string) string { |
There was a problem hiding this comment.
I think I used the sdk to convert chunks into a full response in the past, but I actually don't remember if I used the accumulator.
In any case, not a blocker so we can revisit eventually
Inform standalone installations of their operating environment.
Adds recording functionality for OpenAI inference requests and responses:
GET /engines/requests?model=<model>endpointNecessary work left to be done:
remove records on model eviction(2nd commit)include the configured context size for the model(3rd commit)With the backend configuration included (the 3rd commit):
With the User-Agent captured in records if it's not empty (the 4th commit):