fix: Update OpenAI backend to parse logprobs and token distributions#13

Merged
NullPointerDepressiveDisorder merged 5 commits into main from fix/openai-compat on Apr 14, 2026

@NullPointerDepressiveDisorder

Owner
  • Add unit tests to validate equal and uneven category prompt distribution
  • Extend load_suite to support limiting prompts by category balance
  • Update OpenAI backend to parse logprobs and token distributions

Copilot AI review requested due to automatic review settings April 13, 2026 02:49
@codecov

codecov Bot commented Apr 13, 2026


Codecov Report

❌ Patch coverage is 98.25581% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/infer_check/backends/openai_compat.py | 95.83% | 2 Missing and 1 partial ⚠️ |

Copilot AI left a comment

Contributor

Pull request overview

This PR improves prompt-suite sampling and inference result introspection by adding category-balanced prompt limiting in load_suite, updating the CLI to use it, and extending the OpenAI-compatible backend to request/parse chat-completions logprobs and top-token distributions.

Changes:

  • Added unit tests validating equal/uneven category distribution behavior in load_suite.
  • Extended load_suite(..., num_prompts=...) to select prompts via round-robin across categories (instead of simple slicing in the CLI).
  • Updated the OpenAI-compatible chat backend to request and parse logprobs/top_logprobs, populating per-token distributions and metadata.
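The round-robin selection described in the changes above can be sketched roughly as follows. This is a minimal illustration, not the code from `loader.py`: the function name `select_balanced`, the dict-shaped prompts, and the `"category"` key are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import zip_longest


def select_balanced(prompts, num_prompts=None):
    """Pick up to num_prompts prompts, cycling across categories.

    Hypothetical helper: prompts are assumed to be dicts with an optional
    "category" key. Prompts without a category share the None bucket.
    """
    if num_prompts is None or num_prompts >= len(prompts):
        return list(prompts)  # no limit requested, or limit covers everything
    if num_prompts <= 0:
        return []

    # Group prompts by category, preserving first-seen category order.
    buckets = defaultdict(list)
    for prompt in prompts:
        buckets[prompt.get("category")].append(prompt)

    # Take one prompt per category per round until the limit is reached.
    # zip_longest pads exhausted categories with None, which is skipped.
    selected = []
    for round_items in zip_longest(*buckets.values()):
        for prompt in round_items:
            if prompt is None:
                continue
            selected.append(prompt)
            if len(selected) == num_prompts:
                return selected
    return selected
```

With three prompts in category `a`, one in `b`, and two in `c`, a limit of 4 yields one prompt from each category first and then a second from `a`, whereas plain slicing would have returned three `a` prompts and one `b`.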

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/unit/test_loader_distribution.py | Adds unit coverage for category-balanced prompt selection and no-limit behavior. |
| src/infer_check/suites/loader.py | Implements num_prompts support with category-balanced selection and updated logging. |
| src/infer_check/cli.py | Delegates --num-prompts limiting to load_suite (removes post-load slicing). |
| src/infer_check/backends/openai_compat.py | Requests and parses chat logprobs/top-logprobs into InferenceResult fields. |
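For readers unfamiliar with the response shape involved, here is a hedged sketch of parsing chat-completions logprobs. It relies only on the documented OpenAI response layout (`choices[0].logprobs.content`, where each entry carries `token`, `logprob`, and `top_logprobs`). The function name `parse_chat_logprobs` and the returned tuple are assumptions for illustration; how the PR maps these values onto `InferenceResult` is not shown here.

```python
import math


def parse_chat_logprobs(response):
    """Return (tokens, logprobs, distributions) from a chat-completions dict.

    Hypothetical helper. Each distribution maps a candidate token to its
    probability, recovered from the log-probability with exp().
    """
    choices = response.get("choices") or []
    if not choices:
        return [], [], []

    # Servers that do not support logprobs may send null or omit the field.
    content = (choices[0].get("logprobs") or {}).get("content") or []

    tokens, logprobs, distributions = [], [], []
    for entry in content:
        token = entry.get("token")
        logprob = entry.get("logprob")
        if token is None or logprob is None:
            continue  # skip malformed entries instead of raising
        tokens.append(token)
        logprobs.append(logprob)
        distributions.append({
            alt["token"]: math.exp(alt["logprob"])
            for alt in entry.get("top_logprobs") or []
            if "token" in alt and "logprob" in alt
        })
    return tokens, logprobs, distributions
```

Guarding every lookup keeps the parser tolerant of OpenAI-compatible servers that return partial or missing logprob data.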

Comment thread src/infer_check/suites/loader.py
Comment thread src/infer_check/backends/openai_compat.py Outdated
Comment thread src/infer_check/backends/openai_compat.py Outdated
Comment thread src/infer_check/backends/openai_compat.py
NullPointerDepressiveDisorder changed the title from "test: add coverage for load_suite category distribution logic" to "fix: Update OpenAI backend to parse logprobs and token distributions" on Apr 13, 2026

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.


Comment thread src/infer_check/suites/loader.py
Comment thread src/infer_check/backends/openai_compat.py Outdated
Comment thread src/infer_check/backends/openai_compat.py Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.


Comment thread src/infer_check/backends/openai_compat.py
Comment thread src/infer_check/backends/openai_compat.py Outdated
Comment thread src/infer_check/backends/openai_compat.py
Comment thread src/infer_check/suites/loader.py
- Introduce _ServerHTTPError for clearer HTTP error propagation in OpenAICompatBackend
- Refine logprobs retry logic to use status codes instead of string matching
- Adjust prompt category assignment to allow None values instead of defaulting to "default"
- Safeguard token extraction in logprobs parsing
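The status-code-based retry mentioned in this commit could look roughly like the sketch below. It is an illustration under stated assumptions, not the merged code: `ServerHTTPError` is a stand-in for the PR's `_ServerHTTPError` (whose real attributes are not shown in this conversation), the `send` callable is a hypothetical transport hook, and the set of status codes treated as "logprobs rejected" is a guess.

```python
class ServerHTTPError(Exception):
    """Stand-in for the PR's _ServerHTTPError; the fields are assumed."""

    def __init__(self, status_code, body=""):
        super().__init__(f"HTTP {status_code}: {body[:200]}")
        self.status_code = status_code
        self.body = body


# Assumption: servers that reject logprobs parameters answer with a
# client-side validation status rather than a server fault.
LOGPROBS_REJECTED_STATUSES = frozenset({400, 422})


def complete_with_logprobs_fallback(send, payload, top_logprobs=5):
    """Request logprobs first; retry without them on a rejection status.

    `send` is a hypothetical callable that posts a payload dict and either
    returns the parsed response or raises ServerHTTPError.
    """
    with_logprobs = {**payload, "logprobs": True, "top_logprobs": top_logprobs}
    try:
        return send(with_logprobs)
    except ServerHTTPError as exc:
        # Decide on the numeric status, not on text in the error message.
        if exc.status_code not in LOGPROBS_REJECTED_STATUSES:
            raise
    # Second attempt with the original payload, without logprobs fields.
    return send(dict(payload))
```

Branching on `status_code` avoids the brittleness of matching substrings in error text, which varies between OpenAI-compatible servers.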
NullPointerDepressiveDisorder merged commit aa75120 into main on Apr 14, 2026
5 checks passed
NullPointerDepressiveDisorder deleted the fix/openai-compat branch on April 14, 2026 08:55
2 participants