requestcostmetadata: add cost-sample extractor publishing per-model t-digests #1
Closed
davidbreitgand wants to merge 3 commits into
…md#42) * Adds CostDigest t-digest data structure to the pricing package Signed-off-by: David Breitgand <davidbreitgand@users.noreply.github.com> * Removes Cloneable contract tests to address reviewer's feedback Signed-off-by: David Breitgand <davidbreitgand@users.noreply.github.com> --------- Signed-off-by: David Breitgand <davidbreitgand@users.noreply.github.com>
Signed-off-by: David Breitgand <davidbreitgand@users.noreply.github.com>
Signed-off-by: David Breitgand <davidbreitgand@users.noreply.github.com>
545eb2f to 521861d
Superseded by PR#43 in The prerequisite ( Thank you!
/kind feature
What this PR does / why we need it:
Depends on PR ms-llmd#42 (the `tdigest` branch, which adds `pricing.CostDigest` and the `caio/go-tdigest/v5` dependency). This branch is opened against `tdigest`, not `main` (I will rebase onto `main` after PR#42 lands).

Adds the `requestcostmetadata` extractor: on each `ResponseEventType` event, it reads `prompt_tokens` / `completion_tokens` from the response's `usage` block, looks up the model's `pricing.TokenPrices`, computes the per-request cost, and adds it to a per-model running t-digest. At the end of each batch (i.e., once the flush interval has elapsed), models that were updated during the flush interval get a digest snapshot published to their `AttributeMap` under `pricing.CostDigestAttributeKey`.

No epoch handling: the digest accumulates without bound. Epoch boundary semantics will be added in a follow-up PR.

No warmup counter in `CostDigest` or `requestcostmetadata`. It will be added in a follow-up PR.

Part of the CostGuard implementation track (proposal). Roadmap items ms-llmd#2/ms-llmd#3 (partially).

`README.md` added; it documents the plugin and lists known limitations.

The code also has several `TODO` comments that go beyond the scope of this PR. They will be captured as issues and handled in separate PRs.

Partially fixes ms-llmd#35
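For readers without the diff at hand, the extractor flow described above (read the usage block, look up prices, compute the per-request cost, add it to a per-model digest, publish snapshots of updated models on flush) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: `TokenPrices`, `Usage`, `Extractor`, `Observe`, and `Flush` are hypothetical names, the per-million-token price unit is assumed, and `sampleDigest` is a naive keep-every-sample stand-in used instead of a real t-digest so the example needs only the standard library.

```go
package main

import (
	"fmt"
	"sort"
)

// TokenPrices is a hypothetical stand-in for pricing.TokenPrices.
// The field names and the per-million-token unit are assumptions.
type TokenPrices struct {
	PromptPerMillion     float64
	CompletionPerMillion float64
}

// Usage mirrors the prompt_tokens / completion_tokens fields of a usage block.
type Usage struct {
	PromptTokens     int
	CompletionTokens int
}

// requestCost prices one request from its token counts.
func requestCost(u Usage, p TokenPrices) float64 {
	return (float64(u.PromptTokens)*p.PromptPerMillion +
		float64(u.CompletionTokens)*p.CompletionPerMillion) / 1e6
}

// sampleDigest is NOT a t-digest: it stores every sample and sorts on query.
// It only stands in for the Add / Quantile / snapshot shape of a digest.
type sampleDigest struct{ samples []float64 }

func (d *sampleDigest) Add(x float64) { d.samples = append(d.samples, x) }
func (d *sampleDigest) Count() int    { return len(d.samples) }

// Quantile returns a rank-based quantile; it assumes 0 <= q <= 1.
func (d *sampleDigest) Quantile(q float64) float64 {
	if len(d.samples) == 0 {
		return 0
	}
	s := append([]float64(nil), d.samples...)
	sort.Float64s(s)
	return s[int(q*float64(len(s)-1))]
}

// Snapshot copies the samples so later Adds do not change published data.
func (d *sampleDigest) Snapshot() *sampleDigest {
	return &sampleDigest{samples: append([]float64(nil), d.samples...)}
}

// Extractor keeps one running digest per model plus the set of models
// updated since the last flush. It is not safe for concurrent use.
type Extractor struct {
	prices  map[string]TokenPrices
	digests map[string]*sampleDigest
	dirty   map[string]bool
}

func NewExtractor(prices map[string]TokenPrices) *Extractor {
	return &Extractor{
		prices:  prices,
		digests: make(map[string]*sampleDigest),
		dirty:   make(map[string]bool),
	}
}

// Observe handles one response event. It returns false when the model has
// no known prices, in which case nothing is recorded.
func (e *Extractor) Observe(model string, u Usage) bool {
	p, ok := e.prices[model]
	if !ok {
		return false
	}
	d, ok := e.digests[model]
	if !ok {
		d = &sampleDigest{}
		e.digests[model] = d
	}
	d.Add(requestCost(u, p))
	e.dirty[model] = true
	return true
}

// Flush returns snapshots for models updated since the previous flush.
// Digests are never reset here, matching "accumulates without bound".
func (e *Extractor) Flush() map[string]*sampleDigest {
	out := make(map[string]*sampleDigest, len(e.dirty))
	for model := range e.dirty {
		out[model] = e.digests[model].Snapshot()
	}
	e.dirty = make(map[string]bool)
	return out
}

func main() {
	e := NewExtractor(map[string]TokenPrices{
		"model-a": {PromptPerMillion: 0.5, CompletionPerMillion: 1.5},
	})
	e.Observe("model-a", Usage{PromptTokens: 1000, CompletionTokens: 500})
	e.Observe("model-a", Usage{PromptTokens: 200, CompletionTokens: 100})
	for model, d := range e.Flush() {
		fmt.Printf("%s: n=%d median=%.6f\n", model, d.Count(), d.Quantile(0.5))
	}
}
```

In this sketch the map returned by `Flush` plays the role of the per-model snapshot that the description says is published under `pricing.CostDigestAttributeKey`; how the real plugin writes into the `AttributeMap` is not shown here.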
Release note (write `NONE` if no user-facing change):