fix: support self-hosted Elasticsearch via --host/--port in elasticcloud commands #761

Merged
XuanYang-cn merged 2 commits into main from fix/elastic-cloud-self-hosted-host-port on Apr 20, 2026

@xiaofan-luan
Collaborator

Summary

  • Extend ElasticCloudConfig to accept either cloud_id or a self-hosted connection (scheme / host / port / user).
  • Add --host / --port / --scheme / --user flags (and make --cloud-id optional) on all four elasticcloudhnsw* subcommands.
  • Existing --cloud-id based usage is unchanged — cloud_id still takes precedence when both are supplied.
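
The selection logic summarized above can be sketched roughly as follows. This is a minimal illustration, not the project's actual ElasticCloudConfig: the class name `ConnectionSketch`, the method `to_client_kwargs`, and the defaults (`http`, `9200`, `elastic`) are assumptions made for the example. The output shapes and the error message follow the test plan in this PR.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ConnectionSketch:
    """Hypothetical stand-in for ElasticCloudConfig (not the project's class)."""

    password: str
    cloud_id: Optional[str] = None
    scheme: str = "http"        # assumed default
    host: Optional[str] = None
    port: int = 9200            # assumed default
    user: str = "elastic"       # assumed default

    def to_client_kwargs(self) -> dict[str, Any]:
        """Build keyword arguments for an Elasticsearch client constructor."""
        auth = (self.user, self.password)
        # cloud_id wins when both cloud_id and host are supplied.
        if self.cloud_id:
            return {"cloud_id": self.cloud_id, "basic_auth": auth}
        if self.host:
            node = {"scheme": self.scheme, "host": self.host, "port": self.port}
            return {"hosts": [node], "basic_auth": auth}
        raise ValueError(
            "ElasticCloudConfig requires either cloud_id or host to be set"
        )
```

With this shape, a caller that only sets `cloud_id` gets the same dictionary as before, which is what keeps existing usage backward compatible.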

Motivation (#758)

Users running stock Elasticsearch 8.x (self-hosted, on-prem, or Tencent/Aliyun-hosted stock ES) had no working VectorDBBench entrypoint:

  • elasticcloudhnsw* required --cloud-id — no way to point at a host/port.
  • tencentelasticsearch accepts --host/--port but hardcodes index_options.type = "vsearch" (Tencent-specific), which stock ES rejects with Unknown vector index options type [vsearch] for field [vector].

With this change, the correct mapping (hnsw, int8_hnsw, int4_hnsw, bbq_hnsw) can finally be used against self-hosted ES clusters:

vectordbbench elasticcloudhnsw \
  --scheme http --host rw-xxxxx --port 9204 \
  --user elastic --password 'pw' \
  --case-type Performance768D1M --m 16 --ef-construction 200 --k 10
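
For context, the difference between the Tencent-specific mapping and a stock one comes down to the `index_options.type` value on the `dense_vector` field. The sketch below builds such a mapping body; the helper name `build_vector_mapping` is hypothetical, the field name `vector` is taken from the error message quoted above, and whether a given quantized type is accepted depends on the Elasticsearch version in use.

```python
from typing import Any

# Index option types named in this PR as valid for stock Elasticsearch.
STOCK_INDEX_TYPES = ("hnsw", "int8_hnsw", "int4_hnsw", "bbq_hnsw")


def build_vector_mapping(
    index_type: str, dims: int, m: int, ef_construction: int
) -> dict[str, Any]:
    """Return a mapping body for a dense_vector field (hypothetical helper)."""
    if index_type not in STOCK_INDEX_TYPES:
        # e.g. "vsearch" is Tencent-specific and rejected by stock ES.
        raise ValueError(f"unsupported index_options type: {index_type}")
    return {
        "properties": {
            "vector": {
                "type": "dense_vector",
                "dims": dims,
                "index_options": {
                    "type": index_type,
                    "m": m,
                    "ef_construction": ef_construction,
                },
            }
        }
    }


# Values mirror the command above: 768 dimensions, --m 16, --ef-construction 200.
mapping = build_vector_mapping("hnsw", dims=768, m=16, ef_construction=200)
```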

Test plan

  • elasticcloudhnsw --cloud-id ... --password ... --dry-run — builds {"cloud_id": ..., "basic_auth": (...)} (backward compatible).
  • elasticcloudhnsw --host rw-test --port 9204 --scheme http --password ... --dry-run — builds {"hosts": [{...}], "basic_auth": (...)}.
  • elasticcloudhnsw --password ... --dry-run (neither flag) — rejected with ElasticCloudConfig requires either cloud_id or host to be set.
  • ruff check vectordb_bench/backend/clients/elastic_cloud/ — passes.
  • Manual run against a real self-hosted Elasticsearch 8.x cluster (requires reporter env).
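
The first three checks in the test plan could be reproduced in isolation with something like the sketch below. It uses `argparse` purely as a stand-in; the project's real CLI wiring may differ, and the defaults shown here are assumptions. Only the flag names and the rejection message come from this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-in parser mirroring the flags described in this PR."""
    parser = argparse.ArgumentParser(prog="elasticcloudhnsw")
    parser.add_argument("--cloud-id", default=None)    # now optional
    parser.add_argument("--scheme", default="http")    # assumed default
    parser.add_argument("--host", default=None)
    parser.add_argument("--port", type=int, default=9200)  # assumed default
    parser.add_argument("--user", default="elastic")   # assumed default
    parser.add_argument("--password", required=True)
    parser.add_argument("--dry-run", action="store_true")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    """Parse argv and reject invocations with neither --cloud-id nor --host."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.cloud_id and not args.host:
        # parser.error prints the message and exits with status 2.
        parser.error("ElasticCloudConfig requires either cloud_id or host to be set")
    return args
```

Calling `parse(["--password", "pw", "--dry-run"])` exits with an error, while supplying either `--cloud-id` or `--host` returns the parsed namespace.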

Closes #758

🤖 Generated with Claude Code

…oud commands

ElasticCloudConfig previously required cloud_id, so the elasticcloudhnsw*
subcommands could only target Elastic Cloud. Users benchmarking self-hosted
stock Elasticsearch had no working path: tencentelasticsearch accepts
host/port but forces Tencent's vsearch index_options type, which stock ES
rejects with "Unknown vector index options type [vsearch]".

Extend ElasticCloudConfig with scheme/host/port/user fields (mutually
exclusive with cloud_id) and expose them on all four ElasticCloudHNSW*
CLI subcommands. Existing cloud_id callers are unchanged.

Refs #758

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sre-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xiaofan-luan, XuanYang-cn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@XuanYang-cn merged commit 0c20701 into main on Apr 20, 2026
4 checks passed
Development

Successfully merging this pull request may close these issues.

how does the VectorDBBench support self-built ElasticSearch
