m serve: generic except-Exception handler masks errors and leaks message to client #991

@planetf1

Problem

The chat-completion endpoint wraps the whole handler in a generic except Exception at cli/serve/app.py:252-258:

except Exception as e:
    # Catch-all for any unexpected errors (including AttributeError)
    return create_openai_error_response(
        status_code=500,
        message=f"Internal server error: {e!s}",
        error_type="server_error",
    )

This has two observable problems:

  1. No server-side log trail. When the endpoint hits an unexpected error in production, nothing is written to the logger, so operators have no stack trace to diagnose the 500. The code comment singles out AttributeError, which is exactly the class of bug that needs a visible traceback rather than a silent swallow.

  2. Exception message leaked verbatim to the client. str(e) on arbitrary exceptions can contain file paths, internal module names, schema fragments, Pydantic internals, or state-specific details that should not cross the API boundary. OpenAI-compatible clients expect a generic "server_error" message; debuggable detail belongs in the server log.

This is pre-existing (not introduced by #884) but was surfaced during review of that PR because the PR adds new exception sources (schema-conversion failures) that land in this handler when not caught earlier.

AGENTS.md §5 flags silent exception swallowing as an anti-pattern:

Fail fast and loud. Never write try/except blocks with silent pass or swallowed exceptions. Expose the root cause.

The current handler is not quite a "silent pass", since it does return a 500, but it discards the stack trace before anyone can see it.

Observed behaviour

A malformed request triggering an AttributeError inside module.serve produces:

  • Client sees: {"error": {"message": "Internal server error: 'NoneType' object has no attribute 'value'", "type": "server_error"}}
  • Server log: nothing.
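
The leak can be reproduced in isolation with a minimal sketch of the same pattern. Everything here is a stand-in: `current_catch_all` and `malformed_request` are hypothetical names, and a plain dict replaces whatever `create_openai_error_response` actually returns.

```python
def current_catch_all(run_request):
    """Hypothetical stand-in mirroring the catch-all pattern quoted above."""
    try:
        return run_request()
    except Exception as e:
        # Same f-string as the handler: the exception text goes straight to the client.
        return {
            "error": {
                "message": f"Internal server error: {e!s}",
                "type": "server_error",
            }
        }


def malformed_request():
    """Simulates a handler body that dereferences a missing field."""
    payload = None
    return payload.value  # raises AttributeError


response = current_catch_all(malformed_request)
print(response["error"]["message"])
```

Nothing is logged along the way, so the text in the response body is the only trace of the failure.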

Scope

This concerns the catch-all only. The explicit except ValueError at L245 that maps validation errors to 400 is correct and should stay.
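
One possible shape for a fix, as a sketch rather than a patch against `cli/serve/app.py`: log the traceback with `logger.exception` and return a fixed message. The logger name, the `handle_chat_completion` wrapper, the dict-returning stand-in for `create_openai_error_response`, and the error type used in the 400 branch are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("serve_sketch")  # hypothetical logger name


def create_openai_error_response(status_code, message, error_type):
    """Stand-in for the real helper; its actual return type is not shown in this issue."""
    return {"status_code": status_code, "error": {"message": message, "type": error_type}}


def handle_chat_completion(run_request):
    """Hypothetical wrapper showing only the exception branches."""
    try:
        return run_request()
    except ValueError as e:
        # Unchanged in spirit: validation errors still map to 400.
        # The error type string here is an assumption.
        return create_openai_error_response(
            status_code=400,
            message=str(e),
            error_type="invalid_request_error",
        )
    except Exception:
        # logger.exception logs at ERROR level and appends the active traceback.
        logger.exception("Unhandled error in chat-completion endpoint")
        # The client gets a fixed message with no exception detail.
        return create_openai_error_response(
            status_code=500,
            message="Internal server error",
            error_type="server_error",
        )
```

With this shape the AttributeError case above would return only "Internal server error" to the client, while the full traceback lands in the server log.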

Labels: bug