Topic load shedding causes Producers (including built in one in connection object) to fail until you manually close them after error #378

@chamons

Description

Sample Project

Here are two unit tests that show the issue at hand:

https://gist.github.com/chamons/f78672c4bb659aeb1e8499a6925e5f3b

Replace broker-1 with the address of your local Pulsar cluster, or use this Docker Compose file.

Details

After a topic load moves to another broker, due to:

  • Pulsar broker restarts
  • Load shedding due to high CPU
  • Calling admin/v2/persistent/public/default/{topic}/unload or the pulsar-admin command

Sending notifications via a MultiTopicProducer to that topic will now fail with:

Producer(Connection(Io(Custom { kind: TimedOut, error: " connection c0010f8b-9ae8-4b31-abba-91d83e283695 timedout sending message to the Pulsar server" }))

However, if you close the producer and then try again, it works just fine.

This is rather non-obvious. Worse, it makes the simplified Pulsar::send API unusable in these cases, as there is no way to manually close its built-in producer. Once you have written to a topic with it and that topic is load shed, Pulsar::send on that connection cannot write to the topic again.

We work around this in our code base by only using MultiTopicProducer and closing it after a subset of errors.
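For anyone hitting the same problem, here is a minimal sketch of the shape of that workaround: send once, and on a timeout close the cached producer and retry. It uses only the standard library so it can run on its own. The `TopicSender` trait, `send_with_reset`, and `FakeSender` are hypothetical names made up for illustration and are not part of pulsar-rs; in real code the trait would wrap whatever producer type you use, and the error check would match on the client's own error type rather than `std::io::Error`.

```rust
use std::collections::HashSet;
use std::io;

/// Hypothetical abstraction over a multi-topic producer (NOT a pulsar-rs type).
trait TopicSender {
    /// Send one message to `topic`.
    fn send(&mut self, topic: &str, payload: &[u8]) -> io::Result<()>;
    /// Drop the cached producer for `topic` so the next send builds a fresh one.
    fn close_producer(&mut self, topic: &str);
}

/// Send once; if the send times out, close the cached producer and retry once.
/// Any other error, or a second failure, is returned to the caller unchanged.
fn send_with_reset<S: TopicSender>(
    sender: &mut S,
    topic: &str,
    payload: &[u8],
) -> io::Result<()> {
    match sender.send(topic, payload) {
        Err(e) if e.kind() == io::ErrorKind::TimedOut => {
            sender.close_producer(topic);
            sender.send(topic, payload)
        }
        other => other,
    }
}

/// In-memory stand-in that mimics the reported behaviour: after a topic is
/// "unloaded", sends to it time out until its producer is closed.
struct FakeSender {
    stale: HashSet<String>,
    delivered: Vec<(String, Vec<u8>)>,
}

impl FakeSender {
    fn new() -> Self {
        FakeSender { stale: HashSet::new(), delivered: Vec::new() }
    }

    /// Simulate the topic moving to another broker.
    fn unload(&mut self, topic: &str) {
        self.stale.insert(topic.to_string());
    }
}

impl TopicSender for FakeSender {
    fn send(&mut self, topic: &str, payload: &[u8]) -> io::Result<()> {
        if self.stale.contains(topic) {
            return Err(io::Error::new(
                io::ErrorKind::TimedOut,
                "timed out sending message",
            ));
        }
        self.delivered.push((topic.to_string(), payload.to_vec()));
        Ok(())
    }

    fn close_producer(&mut self, topic: &str) {
        self.stale.remove(topic);
    }
}
```

With the fake, calling `unload("t")` makes a plain `send` fail with `TimedOut`, while `send_with_reset` recovers on its single retry. Retrying only once is a deliberate choice in this sketch: if the fresh producer also fails, the problem is probably not a stale producer and the error should surface.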
