[Feature]: running-requests scorer #93

@Mohammad-nassar10

Description

Problem Statement

Add a simple running-requests scorer that scores models by their current in-flight request count. Scores are normalized to [0.0, 1.0] so that the least loaded model ranks highest: the least loaded model receives a score of 1.0, and the most loaded model receives a score of 0.0.

Proposed Solution

  • RunningRequestsScorer implements the Scorer interface embedding Plugin.
  • The scorer reads the running-requests attribute from each model, populated by RunningRequestsExtractor in the datalayer.
  • Models without a running-requests attribute are treated as idle (0 requests).
  • If all models have the same count (including all zero), all receive a score of 1.0.
  • If there is only one model, it receives a score of 1.0.
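
The rules above can be sketched roughly as follows. This is only an illustration, not the proposed implementation: the Model type, its Attributes map, and the ScoreByRunningRequests function are hypothetical stand-ins rather than the project's actual Scorer or Plugin API. The issue only fixes the endpoints (1.0 for the least loaded model, 0.0 for the most loaded), so the linear min-max interpolation for models in between is an assumption.

```go
package main

// Model is a hypothetical stand-in for the project's model type.
// Attributes maps an attribute name to an integer value, e.g. the
// in-flight request count that RunningRequestsExtractor would populate.
type Model struct {
	Name       string
	Attributes map[string]int
}

// runningRequestsKey is the attribute name used in this sketch.
const runningRequestsKey = "running-requests"

// ScoreByRunningRequests returns a score in [0.0, 1.0] per model name.
// The least loaded model gets 1.0 and the most loaded gets 0.0.
// Values in between use linear min-max normalization (an assumption).
func ScoreByRunningRequests(models []Model) map[string]float64 {
	scores := make(map[string]float64, len(models))
	if len(models) == 0 {
		return scores
	}

	counts := make([]int, len(models))
	var minC, maxC int
	for i, m := range models {
		// A missing key (or a nil map) yields 0, so models without
		// the attribute are treated as idle.
		c := m.Attributes[runningRequestsKey]
		counts[i] = c
		if i == 0 || c < minC {
			minC = c
		}
		if i == 0 || c > maxC {
			maxC = c
		}
	}

	for i, m := range models {
		if maxC == minC {
			// Covers a single model and models with identical counts.
			scores[m.Name] = 1.0
			continue
		}
		scores[m.Name] = float64(maxC-counts[i]) / float64(maxC-minC)
	}
	return scores
}
```

Under these assumptions, three models with 10, 5, and 0 running requests would score 0.0, 0.5, and 1.0 respectively, and any set of models with equal counts would all score 1.0.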

Alternatives Considered

No response

Willingness to Contribute

Yes, I can submit a PR

Additional Context

No response

Metadata

Labels

enhancement: New feature or request
