feat(vector sink): vector connection concurrency #25432

Open
splitice wants to merge 3 commits into vectordotdev:master from X4BNet:feat/vector-concurrency

Conversation

@splitice
Contributor

@splitice splitice commented May 14, 2026

Summary

Open multiple connections from the vector sink.

Why? So that load can be balanced across multiple backend servers that are either DNS round-robined or balanced at the IP layer.

Vector configuration

vector-receiver.yaml

api:
  enabled: true
  address: "127.0.0.1:8686"

sources:
  in:
    type: vector
    address: "127.0.0.1:6000"

sinks:
  out:
    type: console
    inputs:
      - in
    encoding:
      codec: json

vector-sender.yaml

sources:
  in:
    type: demo_logs
    interval: 0.0
    format: json
    count: 10000

transforms:
  mark:
    type: remap
    inputs:
      - in
    source: |
      .sender = "vector-sender"
      .ts = now()

sinks:
  out:
    type: vector
    inputs:
      - mark
    address: "http://127.0.0.1:6000"

    connection:
      concurrency: 2

    request:
      concurrency: 4

    batch:
      max_events: 1
      timeout_secs: 1

How did you test this PR?

Start the receiver with debug logging:

VECTOR_LOG=debug vector --config vector-receiver.yaml

Start the sender with debug logging:

VECTOR_LOG=debug vector --config vector-sender.yaml

In another shell, watch localhost connections:

watch -n 0.5 'ss -tanp | grep 6000'

You should see two established client connections plus a healthcheck connection from the sender to 127.0.0.1:6000.

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

This will require documentation of the connection concurrency field.

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

My general performance discussion.

@splitice splitice requested a review from a team as a code owner May 14, 2026 07:36
@github-actions github-actions Bot added the domain: sinks Anything related to the Vector's sinks label May 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b6466e9a71

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/sinks/vector/config.rs Outdated
@splitice
Contributor Author

splitice commented May 14, 2026

Notes:

  • Could this have an ARC-like (adaptive request concurrency) system? Potentially - but let's KISS for now.
  • Could this be implemented in more sinks? Potentially - but I'm not sure how much code reuse there would be.
  • Should ARC and request concurrency sit inside or outside the connection logic? I went for inside, both for implementation simplicity and because ARC and connection concurrency have different goals.
  • I'm still learning Rust, so sorry in advance for any best-practice violations. I needed a lot of AI help with errors along the way. Definitely review this code.

The main goal is to stop the huge load that can occur when one producer is sending logs to one consumer and that producer peaks in its log production. While there are many ways to do this, the simplest is to balance between multiple consumers. In the future, if an adaptive connection system were introduced, this could be much smarter, e.g. spin up a new connection at $backlog.
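To make the "spin up a new connection at $backlog" idea concrete, here is a minimal sketch of one possible policy. Nothing here is part of this PR: `desired_connections`, `backlog_per_connection` and `max_connections` are hypothetical names, and the linear rule is only an assumption about how such a system might scale.

```rust
/// Hypothetical policy (not part of this PR): start with one connection and
/// add one more for every `backlog_per_connection` queued events, never
/// exceeding `max_connections`.
pub fn desired_connections(
    backlog: usize,
    backlog_per_connection: usize,
    max_connections: usize,
) -> usize {
    // Guard against a zero threshold or a zero cap so the result is always >= 1.
    let step = backlog_per_connection.max(1);
    let cap = max_connections.max(1);
    (1 + backlog / step).min(cap)
}
```

With a threshold of 1000 events and a cap of 4, a backlog of 2500 events would ask for 3 connections under this rule.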

Additional connection balancing modes could also be introduced in the future (e.g. batch, round-robin). Currently it uses "least request".

Ideas for concurrency mode:

  • batch: Take the least-loaded connection and push the next $batch events there (good for compression)
  • round-robin: one for every destination in turn (I'm not sure if there would be any benefit to this)
  • least connections: Whichever connection has the fewest requests queued gets the request (good for balance)
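The three modes above can be sketched as a small picker over per-connection in-flight counters. This is an illustration only, not the code in this PR: `BalanceMode`, `Picker`, `pick` and `complete` are hypothetical names, and breaking ties toward the lowest index is an assumption.

```rust
/// Hypothetical balancing modes, named after the ideas listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BalanceMode {
    /// Pin the least-loaded connection for `size` consecutive requests.
    Batch { size: usize },
    /// Cycle through the connections in order.
    RoundRobin,
    /// Always pick the connection with the fewest in-flight requests.
    LeastRequests,
}

/// Tracks in-flight requests per connection and chooses the next connection.
pub struct Picker {
    mode: BalanceMode,
    in_flight: Vec<usize>,
    cursor: usize,    // next index for round-robin
    current: usize,   // connection pinned by batch mode
    remaining: usize, // requests left in the current batch
}

impl Picker {
    pub fn new(mode: BalanceMode, connections: usize) -> Self {
        assert!(connections > 0, "at least one connection is required");
        Self { mode, in_flight: vec![0; connections], cursor: 0, current: 0, remaining: 0 }
    }

    /// Index of the connection with the fewest in-flight requests.
    /// `min_by_key` returns the first minimum, so ties go to the lowest index.
    fn least_loaded(&self) -> usize {
        self.in_flight
            .iter()
            .enumerate()
            .min_by_key(|(_, n)| **n)
            .map(|(i, _)| i)
            .unwrap_or(0)
    }

    /// Choose a connection for the next request and count it as in flight.
    pub fn pick(&mut self) -> usize {
        let idx = match self.mode {
            BalanceMode::LeastRequests => self.least_loaded(),
            BalanceMode::RoundRobin => {
                let i = self.cursor;
                self.cursor = (self.cursor + 1) % self.in_flight.len();
                i
            }
            BalanceMode::Batch { size } => {
                if self.remaining == 0 {
                    // Start a new batch on whichever connection is least loaded.
                    self.current = self.least_loaded();
                    self.remaining = size.max(1);
                }
                self.remaining -= 1;
                self.current
            }
        };
        self.in_flight[idx] += 1;
        idx
    }

    /// Record that a request on connection `idx` has finished.
    pub fn complete(&mut self, idx: usize) {
        self.in_flight[idx] = self.in_flight[idx].saturating_sub(1);
    }
}
```

A caller would invoke `pick()` before dispatching a request and `complete(idx)` once the response arrives, so the counters reflect what is actually queued on each connection.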

@splitice splitice changed the title from "feat(vector): vector connection concurrency" to "feat(vector sink): vector connection concurrency" May 14, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12076d0152

Comment thread src/sinks/vector/config.rs Outdated
Labels

domain: sinks Anything related to the Vector's sinks
