Skip to content

chore: refactor http-client, http-client-sample-issuer#42

Merged
viraatc merged 3 commits intomainfrom
fix-viraatc-http-client-issue-clear
Dec 19, 2025
Merged

chore: refactor http-client, http-client-sample-issuer#42
viraatc merged 3 commits intomainfrom
fix-viraatc-http-client-issue-clear

Conversation

@viraatc
Copy link
Copy Markdown
Collaborator

@viraatc viraatc commented Dec 5, 2025

What does this PR do?

Cleanup of Http-Client APIs:

  • remove explicit http_client.start() ; cleanup
  • remove explicit http_sample_issuer.start() ; cleanup
  • remove wait_for_all_complete from http-client (unused, causing overhead in hotpath)
  • abstract out HttpEndpointClient(AsyncHttpEndpointClient) to isolate sync/async code in http_client.py
  • update probe.py / futures-client.py as required
  • removed awaits / reduced function calls in hot-path (issue, recv response)
  • remove max-concurrency from http-client (and its config/yaml-schema)

Type of change

  • Bug fix
  • New feature
  • Documentation update
  • Refactor/cleanup

Related issues

Testing

  • Tests added/updated
  • All tests pass locally
  • Manual testing completed

Checklist

  • Code follows project style
  • Pre-commit hooks pass
  • Documentation updated (if needed)

Copilot AI review requested due to automatic review settings December 5, 2025 07:11
@viraatc viraatc requested a review from a team as a code owner December 5, 2025 07:11
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Dec 5, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @viraatc, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a significant refactor and cleanup of the http-client APIs, aiming to make them more idiomatic Python, especially concerning asynchronous operations and concurrency. The changes streamline the lifecycle management of the HTTPEndpointClient, enhance concurrency control, and modernize the futures-client test utility to align with standard Python asynchronous patterns. The overall goal is to improve the clarity, robustness, and maintainability of the client-side inference endpoint components.

Highlights

  • API Renaming and Simplification: The HTTPEndpointClient's asynchronous methods have been renamed from async_start() to start_async() and async_shutdown() to shutdown_async(), and get_ready_responses_async() to try_recv_response_async(). This change is reflected across various client usages and documentation.
  • Autonomous Client Lifecycle Management: The HTTPEndpointClient now manages its own event loop and dedicated thread internally if no event loop is explicitly provided during initialization. This simplifies client setup by removing the need for external start() and shutdown() calls in many scenarios.
  • Concurrency Control Refinement: A new with_semaphore utility has been introduced in async_utils.py to wrap asynchronous functions for semaphore-based concurrency control. The HTTPEndpointClient now dynamically applies this semaphore to its issue_query_async method based on the configured max_concurrency.
  • FuturesHttpClient Test Utility Overhaul: The FuturesHttpClient test utility has been significantly refactored to use standard concurrent.futures.Future objects, providing a more Pythonic and thread-safe way to handle asynchronous results in tests. This change removes the custom StreamingFuture class and simplifies its internal logic.
  • HttpClientSampleIssuer Streamlining: The HttpClientSampleIssuer has been simplified by removing internal state management for idle events and inflight queries. Its response handling task is now initiated directly in its constructor, eliminating the need for a separate start() method and wait_for_all_complete().
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@viraatc viraatc changed the title chore: pythonize http-client apis, further cleanup futures-client test-util chore: pythonize http-client apis Dec 5, 2025
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the HTTP client APIs to be more Pythonic and simplifies the futures-based test client. The changes focus on making the client initialization more automatic and the API more intuitive for Python developers.

Key Changes:

  • HTTPEndpointClient now auto-starts on initialization (when loop=None), eliminating manual start/stop calls
  • Renamed async methods to follow Python conventions: async_start_start_async, async_shutdownshutdown_async
  • Simplified FuturesHttpClient to use thread-safe concurrent.futures instead of asyncio futures
  • Removed redundant streaming features (first chunk access) from test utilities

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/futures_client.py Simplified to use concurrent.futures, removed StreamingFuture class and first-chunk access
tests/integration/endpoint_client/conftest.py Updated fixture to use simplified client API
tests/integration/endpoint_client/test_http_client_core.py Updated tests to use concurrent.futures and new auto-start behavior
tests/integration/endpoint_client/test_http_client_streaming.py Removed streaming first-chunk tests, simplified to use concurrent.futures
tests/integration/endpoint_client/test_external_serving.py Updated to use new client API
tests/integration/endpoint_client/test_zmq_communication.py Increased async sleep timeouts for connection stability
tests/integration/test_end_to_end_oracle.py Removed manual start() calls (now auto-starts)
tests/performance/endpoint_client/test_http_client_performance_single.py Removed wait_for_all_complete() call
src/inference_endpoint/endpoint_client/http_client.py Added auto-start logic, renamed async methods, refactored initialization
src/inference_endpoint/endpoint_client/http_sample_issuer.py Simplified response handling, removed wait_for_all_complete()
src/inference_endpoint/endpoint_client/worker.py Improved error message handling for empty exceptions
src/inference_endpoint/endpoint_client/zmq_utils.py Moved uvloop event loop policy setup to zmq_utils
src/inference_endpoint/endpoint_client/async_utils.py Added with_semaphore utility for cleaner concurrency control
src/inference_endpoint/endpoint_client/README.md Updated documentation for renamed methods
src/inference_endpoint/commands/probe.py Updated to use new API methods
src/inference_endpoint/commands/benchmark.py Removed manual start() calls

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/inference_endpoint/endpoint_client/http_sample_issuer.py Outdated
Comment thread src/inference_endpoint/endpoint_client/zmq_utils.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py Outdated
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the HTTPEndpointClient to improve its asynchronous capabilities and simplify its API, particularly for testing and integration. Key changes include renaming async methods (e.g., async_start to start_async, async_shutdown to shutdown_async, get_ready_responses_async to try_recv_response_async), and introducing a with_semaphore utility for concurrency control. The client's __init__ method now supports auto-starting its event loop in a dedicated thread if no loop is provided, or integrating with an existing loop. The HttpClientSampleIssuer was simplified by removing its explicit start() method and n_inflight tracking, with its response handler now initiated directly in its constructor. The FuturesHttpClient test utility was updated to use concurrent.futures.Future for thread-safe futures and now relies on the HTTPEndpointClient's auto-start mechanism, removing explicit async_start() and async_shutdown() calls from tests. Additionally, error message generation in worker.py was enhanced, and the uvloop event loop policy for ZMQ was centralized in zmq_utils.py. Review comments highlighted a confusing comment about _start() vs _start_async() in http_client.py, suggested removing or clarifying a TODO about try_recv_response's synchronous API, and questioned the implicit start behavior of HttpClientSampleIssuer's response handler. A concern was also raised about increasing asyncio.sleep durations in ZMQ tests, suggesting a more robust readiness check instead of arbitrary delays.

Comment thread src/inference_endpoint/endpoint_client/http_client.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_sample_issuer.py Outdated
Comment thread tests/integration/endpoint_client/test_zmq_communication.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py
@viraatc viraatc changed the title chore: pythonize http-client apis WIP: chore: pythonize http-client apis Dec 8, 2025
@viraatc viraatc changed the title WIP: chore: pythonize http-client apis chore: refactor http-client, http-client-sample-issuer Dec 8, 2025
Copilot AI review requested due to automatic review settings December 15, 2025 20:09
@viraatc viraatc force-pushed the fix-viraatc-http-client-issue-clear branch from 1c31437 to 6f1b2f1 Compare December 15, 2025 20:09
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/inference_endpoint/endpoint_client/async_utils.py Outdated
Comment thread tests/futures_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Copilot AI review requested due to automatic review settings December 15, 2025 20:21
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_zmq_communication.py Outdated
Comment thread src/inference_endpoint/endpoint_client/zmq_utils.py
Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Copilot AI review requested due to automatic review settings December 15, 2025 20:59
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread tests/integration/endpoint_client/test_http_client_streaming.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
@viraatc viraatc requested a review from nv-alicheng December 15, 2025 21:34
Copy link
Copy Markdown
Collaborator

@nv-alicheng nv-alicheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember you asked about removing 'stop_sample_issuer_on_test_end' from BenchmarkSession.start, but it wasn't included in this PR - Was that missed or intentional?

Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread tests/futures_client.py
Comment thread tests/integration/endpoint_client/test_zmq_communication.py Outdated
Copilot AI review requested due to automatic review settings December 16, 2025 00:20
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread tests/futures_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py Outdated
Copilot AI review requested due to automatic review settings December 16, 2025 00:34
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_http_client_streaming.py Outdated
Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Copilot AI review requested due to automatic review settings December 16, 2025 01:54
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_http_client_streaming.py
Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py
@viraatc viraatc force-pushed the fix-viraatc-http-client-issue-clear branch from 5a50f39 to bd918e9 Compare December 16, 2025 02:01
Copilot AI review requested due to automatic review settings December 16, 2025 02:13
@viraatc viraatc force-pushed the fix-viraatc-http-client-issue-clear branch from 1f88625 to 4cff236 Compare December 16, 2025 02:15
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_http_client_streaming.py
Comment thread tests/integration/endpoint_client/test_http_client_core.py
Comment thread tests/integration/endpoint_client/test_http_client_core.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Copilot AI review requested due to automatic review settings December 16, 2025 02:27
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_http_client_core.py
Comment thread tests/integration/endpoint_client/test_http_client_core.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread tests/futures_client.py
Comment thread tests/integration/endpoint_client/conftest.py
@viraatc
Copy link
Copy Markdown
Collaborator Author

viraatc commented Dec 16, 2025

stop_sample_issuer_on_test_end' from BenchmarkSession.start

included in separate MR :
#55

EDIT:

  • included in this mr since it was a small change

Comment thread src/inference_endpoint/endpoint_client/worker.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread src/inference_endpoint/endpoint_client/README.md Outdated
Comment thread tests/futures_client.py
Comment thread tests/integration/endpoint_client/test_zmq_communication.py Outdated
Comment thread src/inference_endpoint/endpoint_client/worker.py Outdated
Comment thread src/inference_endpoint/endpoint_client/worker.py
Copilot AI review requested due to automatic review settings December 16, 2025 02:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/futures_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread tests/integration/endpoint_client/conftest.py
Comment thread src/inference_endpoint/endpoint_client/http_sample_issuer.py Outdated
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Comment thread src/inference_endpoint/endpoint_client/http_client.py
Copilot AI review requested due to automatic review settings December 17, 2025 23:36
@viraatc viraatc force-pushed the fix-viraatc-http-client-issue-clear branch from 4677f10 to 34ac55b Compare December 17, 2025 23:37
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_http_client_streaming.py
Comment thread tests/integration/endpoint_client/conftest.py
Comment thread tests/futures_client.py
Comment thread src/inference_endpoint/endpoint_client/http_sample_issuer.py Outdated
Comment thread tests/integration/endpoint_client/test_http_client_core.py
Copilot AI review requested due to automatic review settings December 18, 2025 23:33
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/integration/endpoint_client/test_http_client_core.py
Comment thread tests/futures_client.py
Comment thread tests/integration/endpoint_client/test_http_client_core.py
@viraatc viraatc merged commit 0411574 into main Dec 19, 2025
4 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Dec 19, 2025
@viraatc viraatc deleted the fix-viraatc-http-client-issue-clear branch February 6, 2026 23:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants