[Experimental] HTTP agent runtime#6356

Open
lspinheiro wants to merge 6 commits into
microsoft:mainfrom
lspinheiro:feat-6355-experimental-http-runtime


@lspinheiro
Collaborator

Why are these changes needed?

This is an initial proposal for an experimental HTTP agent runtime based on FastAPI and JSON-RPC. The goal is to support the existing middleware system so that agent functionality can be extended, for example by enabling authentication.
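
For readers unfamiliar with JSON-RPC 2.0, the sketch below shows the general shape of a request and its matching success and error responses. The method name `agent.send_message` and the parameter fields are hypothetical, chosen only for illustration; the methods actually defined by this PR may differ.

```python
import json
import uuid


def make_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope with a fresh id."""
    return {"jsonrpc": "2.0", "id": str(uuid.uuid4()), "method": method, "params": params}


def make_result(request: dict, result: object) -> dict:
    """Build a success response; the id echoes the request's id."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


def make_error(request: dict, code: int, message: str) -> dict:
    """Build an error response; -32601 is the spec's "Method not found" code."""
    return {"jsonrpc": "2.0", "id": request["id"], "error": {"code": code, "message": message}}


# Hypothetical method name and params, for illustration only.
req = make_request("agent.send_message", {"recipient": "assistant/default", "message": {"text": "hi"}})
ok = make_result(req, {"text": "hello"})
err = make_error(req, -32601, "Method not found")
wire = json.dumps(req)  # the text that would travel over HTTP or the WebSocket
```

Matching responses to requests by `id` is what lets one connection carry several calls in flight at once.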

Related issue number

Related to #6355

@lspinheiro lspinheiro requested review from ekzhu and jackgerrits April 22, 2025 02:58
@codecov

codecov Bot commented Apr 22, 2025

Codecov Report

Attention: Patch coverage is 61.65884% with 245 lines in your changes missing coverage. Please review.

Project coverage is 78.85%. Comparing base (cc2693b) to head (8ab2a22).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...t/src/autogen_ext/runtimes/http/_worker_runtime.py | 63.55% | 129 Missing ⚠️ |
| ...ext/runtimes/http/_worker_runtime_host_servicer.py | 53.50% | 73 Missing ⚠️ |
| .../autogen_ext/runtimes/http/_worker_runtime_host.py | 60.90% | 43 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6356      +/-   ##
==========================================
- Coverage   79.52%   78.85%   -0.67%     
==========================================
  Files         225      230       +5     
  Lines       16641    17280     +639     
==========================================
+ Hits        13233    13627     +394     
- Misses       3408     3653     +245     
| Flag | Coverage Δ |
|---|---|
| unittests | 78.85% <61.65%> (-0.67%) ⬇️ |



@yogitasrivastava yogitasrivastava left a comment


  1. Content-type
    Consider enforcing a stricter Content‑Type: application/json on incoming requests early, with clear 415/400 responses for bad payloads.

  2. Runtime vs. Host separation
    The split between HttpWorkerAgentRuntime and HttpWorkerAgentRuntimeHost is clean, but the naming feels a bit opaque (the “HostServicer” suffix sounds gRPC‑ish). Kindly consider renaming the host side to something like HttpAgentServer or HttpAgentHost for clarity.
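
The Content-Type suggestion in item 1 could look roughly like the following. This is a framework-agnostic sketch rather than code from the PR: `validate_json_request` is a hypothetical helper, and in a FastAPI app the same checks would typically live in a dependency or middleware that turns the status code into an HTTP response.

```python
import json
from typing import Optional, Tuple


def validate_json_request(content_type: Optional[str], body: bytes) -> Tuple[int, Optional[dict]]:
    """Return (status, payload): 415 for a wrong media type, 400 for a bad body, 200 otherwise."""
    # Ignore parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return 415, None
    try:
        payload = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return 400, None
    if not isinstance(payload, dict):
        # Simplifying assumption: this sketch rejects JSON-RPC batch arrays.
        return 400, None
    return 200, payload
```

Rejecting the request before any JSON-RPC dispatch happens keeps malformed payloads away from agent code and gives clients an unambiguous status.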

@nidhishgajjar

Orb Code Review (powered by GLM 5.1 on Orb Cloud)

PR #6356 — [Experimental] HTTP agent runtime

Thanks for this ambitious contribution! An HTTP-based distributed runtime is a valuable addition to AutoGen. Here are my findings from a thorough review:

🔴 Security concerns (expected for experimental, but worth documenting)

  1. No authentication: The x-client-id header is a client-provided UUID with no verification. Any client can impersonate another. Please add a note in the module docstring about security requirements for production use.
  2. No TLS/encryption support documented: All communication is plain HTTP/WS. For distributed deployments, HTTPS/WSS is essential.
  3. WebSocket URL construction: self._base_url.replace("http", "ws") correctly maps http:// to ws:// and https:// to wss://, but str.replace rewrites every occurrence of "http", so a URL that contains "http" elsewhere (for example in the hostname or path) would be corrupted. Consider using urllib.parse for URL manipulation.
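
A minimal sketch of the urllib.parse approach from item 3, using only the standard library. The function name `to_websocket_url` is hypothetical and does not appear in the PR.

```python
from urllib.parse import urlsplit, urlunsplit

_WS_SCHEMES = {"http": "ws", "https": "wss"}


def to_websocket_url(base_url: str) -> str:
    """Swap only the scheme, leaving host, path, and query untouched."""
    parts = urlsplit(base_url)
    try:
        ws_scheme = _WS_SCHEMES[parts.scheme]
    except KeyError:
        # Unknown schemes are rejected instead of being silently rewritten.
        raise ValueError(f"unsupported scheme: {parts.scheme!r}") from None
    return urlunsplit(parts._replace(scheme=ws_scheme))


# Plain str.replace also rewrites "http" inside the host and path:
naive = "https://http-proxy.example.com/http".replace("http", "ws")
safe = to_websocket_url("https://http-proxy.example.com/http")
```

Here `naive` ends up as `wss://ws-proxy.example.com/ws`, while `safe` keeps the host and path intact.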

🔴 Potential bugs

  1. Test import mismatch: The test file imports HttpWorkerAgentRuntimeHost but __init__.py exports HttpAgentServer and HttpAgentService. This suggests the test may not run as-is. Could you verify the export names are consistent?

  2. Pending futures in _process_request: In _worker_runtime_service.py, the _process_request method forwards requests but doesn't create a pending future to track the response (unlike rpc_agent_call which does). This means responses from WebSocket messages routed through _process_request may not be properly awaited.

  3. _get_agent async handling: The code checks asyncio.iscoroutine(agent) after calling factory(), but the variable naming suggests this might not correctly handle all factory patterns. Consider using inspect.iscoroutinefunction on the factory itself.
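
The pending-future pattern that item 2 refers to can be sketched as below. `PendingRequests` is a hypothetical helper, not a class from the PR; it only illustrates registering a future per request id before forwarding, resolving it when the response arrives, and bounding the wait with a configurable timeout.

```python
import asyncio
import itertools
from typing import Any, Dict, Tuple


class PendingRequests:
    """Tracks in-flight requests so that responses can be matched by id."""

    def __init__(self, timeout: float = 20.0) -> None:
        self._timeout = timeout
        self._ids = itertools.count(1)
        self._pending: Dict[int, asyncio.Future] = {}

    def create(self) -> Tuple[int, asyncio.Future]:
        """Register a future before the request is sent."""
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        return request_id, future

    def resolve(self, request_id: int, result: Any) -> bool:
        """Called by the read loop; returns False for unknown or stale ids."""
        future = self._pending.pop(request_id, None)
        if future is None or future.done():
            return False
        future.set_result(result)
        return True

    async def wait(self, request_id: int, future: asyncio.Future) -> Any:
        try:
            return await asyncio.wait_for(future, self._timeout)
        finally:
            self._pending.pop(request_id, None)  # never leak entries on timeout


async def demo() -> str:
    pending = PendingRequests(timeout=1.0)
    request_id, future = pending.create()
    # Simulate the read loop delivering the response a moment later.
    asyncio.get_running_loop().call_later(0.01, pending.resolve, request_id, "pong")
    return await pending.wait(request_id, future)
```

Running `asyncio.run(demo())` returns the simulated response; a request that never receives one raises `asyncio.TimeoutError` after the configured timeout.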

🟡 Design observations

  1. Excessive INFO logging: There are dozens of logger.info() calls that would be very noisy in production. Most should be logger.debug(). Consider a logging best practices pass.
  2. SubscriptionManager exported publicly: Adding SubscriptionManager to autogen_core.__init__.py exposes an internal implementation detail. Is this intentional? If so, consider documenting it as a stable API.
  3. Duplicated serialization helpers: subscription_from_json/subscription_to_json appear to be duplicated. If they exist in autogen_core._runtime_impl_helpers, import them instead.
  4. Background task tracking: Tasks created in _read_loop (asyncio.create_task(self._process_request(raw_msg))) are not tracked in _background_tasks, which could lead to unhandled exceptions.
  5. Hardcoded 20s timeout: The RPC timeout in rpc_agent_call is hardcoded. Consider making it configurable.
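
For item 4, one common way to track fire-and-forget tasks is sketched here. `BackgroundTasks` is a hypothetical helper; how it would map onto the PR's `_background_tasks` attribute is an assumption.

```python
import asyncio
import logging
from typing import Coroutine, List, Set

logger = logging.getLogger(__name__)


class BackgroundTasks:
    """Keeps strong references to tasks and surfaces their exceptions."""

    def __init__(self) -> None:
        self._tasks: Set[asyncio.Task] = set()
        self.failures: List[BaseException] = []  # kept so the sketch is easy to inspect

    def spawn(self, coro: Coroutine) -> asyncio.Task:
        task = asyncio.create_task(coro)
        self._tasks.add(task)  # the event loop itself only holds a weak reference
        task.add_done_callback(self._on_done)
        return task

    def _on_done(self, task: asyncio.Task) -> None:
        self._tasks.discard(task)
        if task.cancelled():
            return
        exc = task.exception()  # retrieving it avoids "exception was never retrieved"
        if exc is not None:
            self.failures.append(exc)
            logger.error("background task failed", exc_info=exc)

    async def drain(self) -> None:
        """Wait for all outstanding tasks, for example during shutdown."""
        if self._tasks:
            await asyncio.gather(*self._tasks, return_exceptions=True)


async def demo() -> int:
    tasks = BackgroundTasks()

    async def ok() -> None:
        await asyncio.sleep(0)

    async def boom() -> None:
        raise RuntimeError("bad message")

    tasks.spawn(ok())
    tasks.spawn(boom())
    await tasks.drain()
    return len(tasks.failures)
```

In the read loop, `asyncio.create_task(self._process_request(raw_msg))` would become a call to `spawn`, so a failing handler is logged instead of disappearing.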

🟡 Missing for production use (document as experimental limitations)

  • No reconnection logic for dropped WebSocket connections
  • No heartbeat/keepalive mechanism
  • Several NotImplementedError methods (save_state, load_state, agent_metadata, etc.)
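
As an illustration of the missing reconnection logic, the sketch below retries a connection attempt with exponential backoff and jitter. It is generic on purpose: `connect` is any coroutine function supplied by the caller, and the helper name and default values are assumptions, not part of the PR.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def connect_with_backoff(
    connect: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
) -> T:
    """Call `connect` until it succeeds, sleeping with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await connect()
        except OSError:  # ConnectionError is a subclass of OSError
            if attempt == max_attempts - 1:
                raise  # out of attempts: propagate the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            await asyncio.sleep(random.uniform(0, delay))  # full jitter
    raise AssertionError("unreachable")


async def demo() -> tuple:
    calls = 0

    async def flaky_connect() -> str:
        nonlocal calls
        calls += 1
        if calls < 3:
            raise ConnectionRefusedError("server not ready")
        return "connected"

    result = await connect_with_backoff(flaky_connect, base_delay=0.001)
    return result, calls
```

A heartbeat would be a separate concern; this covers only re-establishing a dropped connection.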

✅ Good aspects

  • Clean JSON-RPC 2.0 protocol choice for communication
  • Separation of concerns between server, service, and client runtime
  • Test uses ephemeral ports for parallel execution
  • Proper async patterns throughout

Summary

This is a promising experimental runtime with a solid architecture, but it needs security documentation, test consistency fixes, and attention to pending request tracking. Because the feature is experimental, clearly documenting the security and completeness gaps would help set expectations.

Assessment: request-changes — Please fix the test import mismatch, verify pending future tracking in _process_request, and add security caveat documentation.

@0xbrainkid

The HTTP agent runtime + JSON-RPC approach enables a class of deployments that the in-process model cannot support — agents running in separate processes, on different hosts, or behind load balancers. The middleware system for authentication is the right extension point.

One authentication dimension worth designing for from the start: agent identity as a first-class HTTP credential, not just user identity. The current framing (FastAPI + JSON-RPC) naturally maps to user-facing auth (Bearer tokens, API keys). But in multi-agent AutoGen deployments, the caller is often another agent, not a user.

Agent-to-agent authentication over HTTP needs a different credential model than user authentication:

  • Agents rotate more frequently than users
  • Trust should be scoped to interaction history, not just credential validity
  • A revoked agent identity should propagate across all HTTP endpoints without requiring user session invalidation

A lightweight addition to the middleware design:

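from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response

class AgentAuthMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, satp, min_agent_trust: float):
        super().__init__(app)
        self.satp = satp  # trust-verification client; assumed to expose `await verify(agent_id)`
        self.min_agent_trust = min_agent_trust

    async def dispatch(self, request: Request, call_next):
        # Only requests that identify themselves as an agent are trust-gated.
        agent_id = request.headers.get("X-Agent-ID")
        if agent_id:
            # Verify agent identity + behavioral trust
            trust = await self.satp.verify(agent_id)
            if trust.score < self.min_agent_trust:
                return Response(status_code=403, content="Agent trust below threshold")
            request.state.agent_trust = trust
        return await call_next(request)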

This plugs into the existing middleware system without changing the JSON-RPC layer — agent callers get trust-gated access, human callers continue through standard Bearer token auth.

6 participants