|
| 1 | +def explainPrams(st): |
| 2 | + st.markdown("## descriptions") |
| 3 | + st.markdown("### 1.Overview") |
| 4 | + st.markdown( |
| 5 | + """ |
| 6 | +- **VectorDBBench** is an open-source benchmarking tool designed specifically for vector databases. Its main features include: |
| 7 | + - (1) An easy-to-use **web UI** for configuration of tests and visual analysis of results. |
| 8 | + - (2) A comprehensive set of **standards for testing and metric collection**. |
| 9 | + - (3) Support for **various scenarios**, including additional support for **Filter** and **Streaming** based on standard tests. |
| 10 | +- VectorDBBench embraces open source and welcomes contributions of code and test results. The sections below introduce the testing process, the extended scenarios, and the intention behind our design. |
| 11 | +""" |
| 12 | + ) |
| 13 | + st.markdown("### 2.Dataset") |
| 14 | + st.markdown( |
| 15 | + """ |
| 16 | +- We provide three embedding datasets: |
| 17 | + - (1)*[Cohere 768dim](https://huggingface.co/datasets/Cohere/wikipedia-22-12)*, generated using the **Cohere** model based on the Wikipedia corpus. |
| 18 | + - (2)*[Cohere 1024dim](https://huggingface.co/datasets/Cohere/beir-embed-english-v3)*, generated using the **Cohere** embed-english-v3.0 model based on the bioasq corpus. |
| 19 | + - (3)*OpenAI 1536dim*, generated using the **OpenAI** model based on the [C4 corpus](https://huggingface.co/datasets/legacy-datasets/c4). |
| 20 | +""" |
| 21 | + ) |
| 22 | + st.markdown("### 3.Standard Test") |
| 23 | + st.markdown( |
| 24 | + """ |
| 25 | +The test is divided into 3 sub-processes: |
| 26 | +- **3.1 Test Part 1 - Load (Insert + Optimize)** |
| 27 | + - (1) Use a single process to perform serial inserts until all data is inserted, and record the time taken as **insert_duration**. |
| 28 | + - (2) Most vector databases need additional time after insertion to build and optimize the index before reaching an optimal state. Record this time as **optimize_duration**. |
| 29 | + - (3) **load_duration (insert_duration + optimize_duration)** can be understood as the time from the start of insertion until the database is ready to query. |
| 30 | + - load_duration can serve as a reference for the insert capability of a vector database to some extent. However, it should be noted that some vector databases may perform better under **concurrent insert operations**. |
| 31 | +- **3.2 Test Part 2 - Serial Search Test** |
| 32 | + - (1) Use a single process to perform serial searches, record the results and time taken for each search, and calculate **recall** and **latency**. |
| 33 | + - (2) **Recall**: For vector databases, most searches are approximate nearest neighbor (ANN) searches, which do not return perfectly accurate results. In production environments, commonly targeted recall rates are 0.9 or 0.95. |
| 34 | + - Note that there is a **trade-off** between **accuracy** and **search performance**. By adjusting parameters, it is possible to sacrifice some accuracy in exchange for better performance. We recommend comparing performance while ensuring that the recall rates remain reasonably close. |
| 35 | + - (3) **Latency**: we report **p99** rather than the average. **latency_p99** focuses on **the slowest 1% of requests**. In many high-demand applications, keeping nearly all user requests within acceptable latency limits is critical, whereas **latency_avg** can hide slow outliers behind many fast requests. |
| 36 | + - **serial_latency** can serve as a reference for a database's search capability to some extent. However, serial_latency is significantly affected by network conditions. We recommend running the test client and database server within the same local network. |
| 37 | +- **3.3 Test Part 3 - Concurrent Search Test** |
| 38 | + - (1) Create multiple processes, each performing serial searches independently, to test the database's **maximum throughput (max-qps)**. |
| 39 | + - (2) Since different databases may reach peak throughput under different conditions, we conduct multiple test rounds. The number of processes **starts at 1 by default and gradually increases up to 80**, with each test group running for **30 seconds**. |
| 40 | + - Detailed latency and QPS metrics at different concurrency levels can be viewed on the <a href="concurrent" target="_self" style="text-decoration: none;">*concurrent*</a> page. |
| 41 | + - The highest recorded QPS value from these tests will be selected as the final max-qps. |
| 42 | +""", |
| 43 | + unsafe_allow_html=True, |
| 44 | + ) |
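The metrics described in section 3 can be sketched in a few lines of Python. This is only an illustration of the definitions in the text, not VectorDBBench's actual implementation: the function names, the nearest-rank percentile method, and the shape of the `rounds` argument are all assumptions made for this example.

```python
import math


def load_duration(insert_duration, optimize_duration):
    """Time from the start of insertion until the database is ready to query."""
    return insert_duration + optimize_duration


def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors that appear in the retrieved top-k."""
    truth = set(ground_truth[:k])
    return len(truth.intersection(retrieved[:k])) / k


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (an assumed method; others interpolate)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def max_qps(rounds):
    """rounds maps concurrency -> (completed_queries, duration_seconds).

    Returns the highest QPS observed across all concurrency levels.
    """
    return max(count / duration for count, duration in rounds.values())


if __name__ == "__main__":
    # 98 fast requests and 2 slow ones: the mean stays low, p99 does not.
    latencies = [0.010] * 98 + [0.500] * 2
    print("avg:", sum(latencies) / len(latencies))
    print("p99:", percentile_nearest_rank(latencies, 99))
```

The usage at the bottom shows why p99 is preferred here: a couple of slow requests barely move the average but dominate the p99 value.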
| 45 | + st.markdown("### 4.Filter Search Test") |
| 46 | + st.markdown( |
| 47 | + """ |
| 48 | +- Compared to the Standard Test, the **Filter Search** introduces additional scalar constraints (e.g. **color == red**) during the Search Test. Different **filter_ratios** present varying levels of challenge to the VectorDB's search performance. |
| 49 | +- We provide an additional **string column** containing 10 labels with different distribution ratios (50%,20%,10%,5%,2%,1%,0.5%,0.2%,0.1%). For each label, we conduct both a **Serial Test** and a **Concurrency Test** to observe the VectorDB's performance in terms of **QPS, latency, and recall** under different filtering conditions. |
| 50 | +""" |
| 51 | + ) |
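To make the label distribution concrete, here is a minimal sketch of how a string column with the listed ratios could be generated. The label names (`label_0`, `label_other`) are hypothetical. The text lists nine ratios that sum to 88.8% and does not say how the remaining rows are labeled, so this sketch assumes they share a single placeholder label.

```python
# Ratios copied from the description above (50% down to 0.1%).
RATIOS = [0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001]


def assign_labels(num_rows, ratios=RATIOS):
    """Build a label column where label_i covers roughly ratios[i] of the rows."""
    labels = []
    for i, ratio in enumerate(ratios):
        labels.extend([f"label_{i}"] * round(num_rows * ratio))
    # Assumption: rows not covered by the listed ratios get a placeholder label.
    labels.extend(["label_other"] * (num_rows - len(labels)))
    return labels


def matching_fraction(labels, wanted):
    """Share of rows that a filter such as `label == wanted` would keep."""
    return labels.count(wanted) / len(labels)
```

A filter on `label_8` keeps about 0.1% of the rows, while a filter on `label_0` keeps about half, which is why each label gets its own Serial Test and Concurrency Test.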
| 52 | + st.markdown("### 5.Streaming Search Test") |
| 53 | + st.markdown( |
| 54 | + """ |
| 55 | +Unlike the Standard Test, which separates load and search, the Streaming Search Test focuses on **search performance during the insertion process**. |
| 56 | +Different **base dataset sizes** and varying **insertion rates** pose distinct challenges to the VectorDB's search capabilities. |
| 57 | +VectorDBBench will send insert requests at a **fixed rate**, maintaining consistent insertion pressure. The search test consists of three steps as follows: |
| 58 | +- 1.**Streaming Search** |
| 59 | + - Users can configure **multiple search stages**. When the inserted data volume reaches a specified stage, a **Serial Test** and a **Concurrent Test** will be conducted, recording qps, latency, and recall performance. |
| 60 | +- 2.**Streaming Final Search** |
| 61 | + - After all of the data is inserted, a Serial Test and a Concurrent Test are immediately performed, recording qps, latency, and recall performance. |
| 62 | + - Note: at this time, the insertion pressure drops to zero since data insertion is complete. |
| 63 | +- 3.**Optimized Search (Optional)** |
| 64 | + - Users can optionally perform an additional optimization step followed by a Serial Test and a Concurrent Test, recording qps, latency, and recall performance. This step **compares performance in the streaming stages with the theoretically optimal performance**. |
| 65 | +""" |
| 66 | + ) |
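Because inserts are sent at a fixed rate, the moment each search stage fires follows directly from the dataset size and the rate. The sketch below illustrates that arithmetic only. Expressing stages as fractions of the dataset, and the name `stage_schedule`, are assumptions for this example and may not match how VectorDBBench configures stages.

```python
def stage_schedule(total_rows, insert_rate, stages):
    """Return (stage, rows_inserted, seconds_elapsed) for each search stage.

    total_rows  -- number of rows to insert in total
    insert_rate -- fixed insertion rate in rows per second
    stages      -- fractions of total_rows at which a search test is triggered
                   (assumed representation)
    """
    schedule = []
    for stage in sorted(stages):
        rows = round(total_rows * stage)
        schedule.append((stage, rows, rows / insert_rate))
    return schedule


if __name__ == "__main__":
    for stage, rows, seconds in stage_schedule(1_000_000, 500, [0.5, 0.8]):
        print(f"stage {stage:.0%}: {rows} rows inserted after {seconds:.0f}s")
```

The final search in step 2 corresponds to a stage of 1.0, at which point the insertion pressure is zero.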