Skip to content

Aisaq improvments#1637

Open
liorf95 wants to merge 5 commits into
zilliztech:mainfrom
liorf95:aisaq_improvments
Open

Aisaq improvments#1637
liorf95 wants to merge 5 commits into
zilliztech:mainfrom
liorf95:aisaq_improvments

Conversation

@liorf95
Copy link
Copy Markdown
Contributor

@liorf95 liorf95 commented May 20, 2026

Related to #1611

@sre-ci-robot
Copy link
Copy Markdown
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: liorf95
To complete the pull request process, please assign liliu-z after the PR has been reviewed.
You can assign the PR to them by writing /assign @liliu-z in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

liorf95 added 5 commits May 20, 2026 09:43
    1.      Shorten build time for AiSAQ in scale configuration
    •       In this version build process will no longer rearrange the index, it will only generate the rearranged data, this data shall be used during search for reading the PQ vectors in a rearranged manner.
    •       Use multiple threads when generating the rearranged data during build.
    2.      Inline PQ optimization - Utilize sector unused space for additional inline PQ vectors when possible (this will not increase the index footprint on the disk).
    3.      When rearranged is enabled, instead of saving both aligned rearranged PQ vectors file and original PQ vectors file, only the first one is saved.
    4.      Set AiSAQ default beamwidth value from 8 to 2 to avoid excessive disk load.
    5.      Improved RAM estimation for index build with GPU
    6.      Fixed consumed memory size calculation function.
    7.      Fixed AiSAQ thread context leak when index 'size' API is called and all PQ vectors are stored inline.
    8.      Fixed context release to be done before notifying that it is free.
    9.      Fixed index file existence check.
    10.     Fixed index build with COSINE metric
    11.     Fixed context resource release order to avoid race condition.

    Signed-off-by: Gili Buzaglo <Gili.Buzaglo@il.kioxia.com>
    Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>

Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Gili Buzaglo <Gili.Buzaglo@il.kioxia.com>
Signed-off-by: Hiroshi Murayama <hiroshi4.murayama@kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Gili Buzaglo <Gili.Buzaglo@il.kioxia.com>
Signed-off-by: Hiroshi Murayama <hiroshi4.murayama@kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Gili Buzaglo <Gili.Buzaglo@il.kioxia.com>
Signed-off-by: Hiroshi Murayama <hiroshi4.murayama@kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
Signed-off-by: Gili Buzaglo <Gili.Buzaglo@il.kioxia.com>
Signed-off-by: Hiroshi Murayama <hiroshi4.murayama@kioxia.com>
Signed-off-by: Lior Friedman <lior.friedman@il.kioxia.com>
@liorf95 liorf95 force-pushed the aisaq_improvments branch from fb3c3b2 to f46ea53 Compare May 20, 2026 09:44
@mergify mergify Bot added dco-passed and removed needs-dco labels May 20, 2026
@mergify
Copy link
Copy Markdown

mergify Bot commented May 20, 2026

@liorf95 e2e jenkins job failed, comment /run-e2e can trigger the job again.

Copy link
Copy Markdown
Collaborator

@alexanderguzhva alexanderguzhva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, are there any changes on the backward compatibility?

knowhere::KnowhereConfig::SetSimdType(knowhere::KnowhereConfig::SimdType::AVX2);
knowhere::KnowhereConfig::SetBuildThreadPoolSize(default_build_thread_num);
knowhere::KnowhereConfig::SetSearchThreadPoolSize(default_search_thread_num);
knowhere::KnowhereConfig::SetBuildThreadPoolSize(48);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep these numbers as is, and add environment overrides, if needed. This is related only to these two numbers of threads. We also run these benchmarks internally.
Alternatively, just please introduce a new benchmark .cpp file, which is more suitable for your needs.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

SetUp() override {
T0_ = elapsed();
set_ann_test_name("sift-128-euclidean");
const char* dataset = std::getenv("DATASET");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use not DATASET, but BENCH_DATASET, in order to specialize environment variables and ensure that theirs names are related to the benchmark code. DATASET is a too generic name.

Same for others

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

constexpr size_t default_pool_size = default_max_nr / default_max_events;

struct AioInfo {
long aio_nr;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please initialize fields with default values

size_t available_aio = (size_t)(aio_info.aio_max_nr - aio_info.aio_nr);
if(num_ctx*max_events > available_aio) {
LOG_KNOWHERE_INFO_ << " max events requested is too high: " << max_events << " setting to: " << available_aio/num_ctx;
max_events= max_events_ = available_aio/num_ctx;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use several a = b; instead

auto item = params.find(name);
if (item == params.end()) {
throw std::invalid_argument("Invalid parameter name.");
throw std::invalid_argument("Invalid parameter name."+name);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw std::invalid_argument("Invalid parameter name (" + name + ")");

vamana_reader.read((char *)&nnbrs, sizeof(uint32_t));

// sanity checks on nnbrs
assert(nnbrs > 0);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please confirm that these are assert() calls, not a C++ check-and-throw

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, these are asserts.

inline_pq_vectors, max_node_len);


uint64_t *vamana_reader_node_to_pos_map = nullptr;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this a major upgrade, then maybe it is time to upgrade these naked pointers and T* a = new b[]; calls into std::unique_ptr<T[]> a = std::make_unique<T[]>(b); C++ calls. Custom deallocators can also be passed (like std::unique_ptr<FILE, decltype(&fclose)>)

#include <stdio.h>
#include <memory>

bool Avx2SupportedCPU = true;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these two used anywhere?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants