docs(hgraph): document brute_force_threshold and add example #2124
wxyucs wants to merge 1 commit into
The HGraph search-time parameter `brute_force_threshold` (added in #1631 to let HGraph skip the graph walk and run an exact scan when the active filter's `ValidRatio()` is small) ships in code but was never documented or demonstrated, leaving the feature undiscoverable to users. This commit closes that gap without changing any code paths:

- Add a parameter row + dedicated subsection to the EN and ZH HGraph index pages (`docs/docs/{en,zh}/src/indexes/hgraph.md`) covering trigger condition, storage-preference order, reorder-skip behavior, the iterator-search opt-out, and value-picking guidance.
- Add a `brute_force_threshold` entry plus an extra JSON snippet to the top-level `docs/hgraph.md` reference.
- Mention the new key in the HGraph search-params snippet of `docs/docs/{en,zh}/src/resources/index_parameters.md`.
- Cross-link from the EN/ZH `advanced/filtered_search.md` Performance Notes section, since that is where users hitting "very selective filter" guidance currently land.
- Add `examples/cpp/322_feature_hgraph_brute_force_threshold.cpp`, register it in `examples/cpp/CMakeLists.txt`, and list it in `examples/cpp/README.md`. The example builds a 10k-vector HGraph, defines a `Filter` with `ValidRatio() = 0.02`, and shows three runs: baseline graph search, threshold-below-ratio (fallback NOT triggered), threshold-above-ratio (fallback triggered). A hand-rolled exhaustive reference is printed so users can verify the exact-scan branch matches.

No source code changes; the parameter, its default (`0.0`, disabled), dispatch logic, and tests already exist (`src/algorithm/hgraph/hgraph_parameter.{h,cpp}`, `src/algorithm/hgraph/hgraph_search.cpp`, `tests/test_hgraph.cpp`).

Closes: #2123
Signed-off-by: Xiangyu Wang <wxy407827@antgroup.com>
Assisted-by: OpenCode:claude-opus-4.7
Code Review
This pull request introduces documentation and a runnable C++ example for the HGraph search-time parameter brute_force_threshold, which triggers an exact brute-force scan when the active filter is highly selective (its ValidRatio() is small). The review comments identify three issues in the newly added files: missing standard library headers (<algorithm>, <memory>, and <utility>) and an unchecked .value() call on CreateIndex in the C++ example, and a broken relative link in docs/hgraph.md.
| #include <iostream> | ||
| #include <random> | ||
| #include <vector> |
The example uses std::sort (which requires <algorithm>), std::shared_ptr/std::make_shared (which requires <memory>), and std::pair (which requires <utility>). These headers are not explicitly included, which can lead to compilation failures on compilers/standard libraries that do not transitively include them. Please include them explicitly.
#include <algorithm>
#include <iostream>
#include <memory>
#include <random>
#include <utility>
#include <vector>
| "ef_construction": 200 | ||
| } | ||
| })"; | ||
| auto index = vsag::Factory::CreateIndex("hgraph", build_params).value(); |
The return value of vsag::Factory::CreateIndex is a tl::expected (or similar), but .value() is called directly without checking if the creation succeeded. If index creation fails (e.g., due to invalid parameters), this will cause an unhandled exception or crash. It is safer and more consistent with the rest of the example to check has_value() and handle the error gracefully.
| auto index = vsag::Factory::CreateIndex("hgraph", build_params).value(); | |
| auto index_res = vsag::Factory::CreateIndex("hgraph", build_params); | |
| if (not index_res.has_value()) { | |
| std::cerr << "Failed to create index: " << index_res.error().message << std::endl; | |
| return -1; | |
| } | |
| auto index = index_res.value(); |
| - **Optional Values**: any float in `[0.0, 1.0]` | ||
| - **Default Value**: 0.0 (disabled — preserves legacy behavior) | ||
| - **Applies to**: `KnnSearch` (non-iterator overload, also used by `SearchWithRequest`) and `RangeSearch`. The iterator-style `KnnSearch` does not use this parameter. | ||
| - **Note**: The decision relies on `Filter::ValidRatio()` returning a meaningful selectivity estimate; see [filtered search](docs/docs/en/src/advanced/filtered_search.md). The brute-force scan visits every indexed id once to call `CheckValid`, so its cost is roughly `O(N × dim)` regardless of selectivity. A runnable example is [`322_feature_hgraph_brute_force_threshold.cpp`](https://github.com/antgroup/vsag/blob/main/examples/cpp/322_feature_hgraph_brute_force_threshold.cpp). |
The relative link to filtered_search.md is broken. Since docs/hgraph.md is located in the docs/ directory, the path docs/docs/en/src/advanced/filtered_search.md resolves to docs/docs/docs/en/... which results in a 404 error. It should be corrected to docs/en/src/advanced/filtered_search.md.
| - **Note**: The decision relies on `Filter::ValidRatio()` returning a meaningful selectivity estimate; see [filtered search](docs/docs/en/src/advanced/filtered_search.md). The brute-force scan visits every indexed id once to call `CheckValid`, so its cost is roughly `O(N × dim)` regardless of selectivity. A runnable example is [`322_feature_hgraph_brute_force_threshold.cpp`](https://github.com/antgroup/vsag/blob/main/examples/cpp/322_feature_hgraph_brute_force_threshold.cpp). | |
| - **Note**: The decision relies on `Filter::ValidRatio()` returning a meaningful selectivity estimate; see [filtered search](docs/en/src/advanced/filtered_search.md). The brute-force scan visits every indexed id once to call `CheckValid`, so its cost is roughly `O(N × dim)` regardless of selectivity. A runnable example is [`322_feature_hgraph_brute_force_threshold.cpp`](https://github.com/antgroup/vsag/blob/main/examples/cpp/322_feature_hgraph_brute_force_threshold.cpp). |
Pull request overview
Documentation-only PR that closes the discoverability gap for the already-shipped HGraph search-time parameter brute_force_threshold. It adds an English/Chinese reference, cross-links from the filtered-search guide, updates the top-level HGraph doc, and ships a runnable C++ example that contrasts graph-search vs. fallback-triggered vs. fallback-not-triggered modes against a hand-rolled exact reference.
Changes:
- New parameter row + dedicated subsection in docs/docs/{en,zh}/src/indexes/hgraph.md and docs/hgraph.md, plus mentions in index_parameters.md and a cross-link from filtered_search.md (en/zh).
- New example examples/cpp/322_feature_hgraph_brute_force_threshold.cpp registered in CMakeLists.txt and README.md.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| examples/cpp/README.md | Lists the new 322 example. |
| examples/cpp/CMakeLists.txt | Registers the new example executable. |
| examples/cpp/322_feature_hgraph_brute_force_threshold.cpp | New runnable example demonstrating the fallback. |
| docs/hgraph.md | Adds brute_force_threshold reference entry and extra JSON snippet. |
| docs/docs/en/src/indexes/hgraph.md | Adds parameter row and dedicated subsection (English). |
| docs/docs/zh/src/indexes/hgraph.md | Adds parameter row and dedicated subsection (Chinese). |
| docs/docs/en/src/resources/index_parameters.md | Mentions the new key in HGraph search-params snippet (English). |
| docs/docs/zh/src/resources/index_parameters.md | Mentions the new key in HGraph search-params snippet (Chinese). |
| docs/docs/en/src/advanced/filtered_search.md | Cross-links to the new option from Performance Notes (English). |
| docs/docs/zh/src/advanced/filtered_search.md | Cross-links to the new option from Performance Notes (Chinese). |
| #include <vsag/vsag.h> | ||
| | ||
| #include <iostream> | ||
| #include <random> | ||
| #include <vector> |
Change Type
Linked Issue
Closes: #2123
(Tracks the docs/example gap left by the now-merged feature issue #1631, which only covered the code change for brute_force_threshold and did not include user-facing documentation or an example.)
What Changed
The HGraph search-time parameter brute_force_threshold already ships in code (src/algorithm/hgraph/hgraph_parameter.{h,cpp}, src/algorithm/hgraph/hgraph_search.cpp, tests at tests/test_hgraph.cpp:2480-2619) but is not mentioned in any user-facing doc and has no example. This PR closes that discoverability gap without touching any code path.
- docs/docs/{en,zh}/src/indexes/hgraph.md: new row in the Search parameters table plus a dedicated subsection explaining trigger condition, storage-preference order, reorder-skip behavior, the intentional iterator-search opt-out, and how to pick a value.
- docs/hgraph.md: new brute_force_threshold reference entry and an extra JSON snippet.
- docs/docs/{en,zh}/src/resources/index_parameters.md: mention the key in the HGraph search-params snippet.
- docs/docs/{en,zh}/src/advanced/filtered_search.md: cross-link from the Performance Notes section so users hitting "very selective filter" guidance see the new option alongside the existing "raise ef_search" advice.
- examples/cpp/322_feature_hgraph_brute_force_threshold.cpp with three runs (baseline graph search, threshold-below-ratio → fallback NOT triggered, threshold-above-ratio → fallback triggered) plus a hand-rolled exact reference. Registered in examples/cpp/CMakeLists.txt; listed in examples/cpp/README.md.
Test Evidence
- make fmt
- make lint
- make test
- make cov, run tests, and collect coverage
Test details:
Compatibility Impact
(0.0, disabled), and its dispatch logic are unchanged.
Performance and Concurrency Impact
Documentation Impact
- README.md
- DEVELOPMENT.md
- CONTRIBUTING.md
- docs/docs/{en,zh}/src/indexes/hgraph.md, docs/docs/{en,zh}/src/resources/index_parameters.md, docs/docs/{en,zh}/src/advanced/filtered_search.md, docs/hgraph.md, examples/cpp/README.md, examples/cpp/322_feature_hgraph_brute_force_threshold.cpp, examples/cpp/CMakeLists.txt
Risk and Rollback
target changed besides adding a new add_executable).
Checklist
Closes: #2123)
(N/A: pre-existing functional tests cover the parameter; the new example serves as additional manual verification)