fix(pgvector): normalize index_type to lowercase in _create_index to …#760
foxspy merged 3 commits into zilliztech:main
…match PostgreSQL access method names

The PostgreSQL pgvector extension registers its index access methods in lowercase (e.g. "hnsw", "ivfflat"), but the frontend passes IndexType.HNSW.value, which is the uppercase string "HNSW". This causes an "access method HNSW does not exist" error.
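The failure mode can be sketched without a database. PostgreSQL folds unquoted identifiers to lowercase but matches double-quoted identifiers case-sensitively, and psycopg's sql.Identifier emits quoted identifiers, so "HNSW" is looked up verbatim. The snippet below is only an illustration under stated assumptions: the IndexType enum, the quote_ident helper (a stand-in for sql.Identifier), the build_create_index_sql function, and the "items" table name are all hypothetical and are not the project's actual code.

```python
from enum import Enum


class IndexType(str, Enum):
    """Hypothetical stand-in for the frontend enum, whose values are uppercase."""

    HNSW = "HNSW"


def quote_ident(name: str) -> str:
    """Double-quote an identifier (illustrative stand-in for sql.Identifier)."""
    return '"' + name.replace('"', '""') + '"'


def build_create_index_sql(index_param: dict, table: str = "items") -> str:
    """Build a CREATE INDEX statement with a lowercase access method name."""
    # Normalize once, then use the lowercase name everywhere below.
    index_type_lower = index_param["index_type"].lower()
    return "CREATE INDEX ON {table} USING {method} ({column} vector_cosine_ops)".format(
        table=quote_ident(table),
        method=quote_ident(index_type_lower),
        column=quote_ident("embedding"),
    )


sql_text = build_create_index_sql({"index_type": IndexType.HNSW.value})
print(sql_text)
```

Without the .lower() call, the quoted access method would be "HNSW", which does not match the lowercase name that pgvector registers.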
/assign @jamesgao-jpg

/assign @XuanYang-cn
Replaced index_param['index_type'] with index_type_lower for consistency.
else sql.Identifier("embedding")
),
index_type=sql.Identifier(index_param["index_type"]),
[FIX] Use lowercase index_type_lower instead of original index_param["index_type"]
I should have added a # before [FIX]; I have submitted the latest minor version.
Latest minor version URL: 1a2c0c1

@shaohuasong-fang Mind running this locally first to make sure it works? Thanks!

I have submitted a minor version, URL: 1a2c0c1

I forgot to copy the "#" sign when I copied that line, sorry.

The latest minor version is 1a2c0c1.
I have added the # before [FIX].
shaohuasong-fang
left a comment
I have added the comment sign '#' before [FIX]; please review the latest code.
Latest version: a9d32b8

[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: foxspy, shaohuasong-fang
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing

I believe this could fix #759.
For non-thread-safe DBs (e.g. PgVector), ConcurrentInsertRunner clamps max_workers to 1, so there is always exactly one worker thread. There is no need to deepcopy self.db per thread: the single worker can use self.db directly via the connection already opened by task()'s `with self.db.init():`.

The original code called deepcopy(self.db) inside _get_thread_db() after task() had already opened a live psycopg C-extension Connection on self.db. C-extension objects cannot be deep-copied, causing:

    TypeError: no default __reduce__ due to non-trivial __cinit__

Fix: remove the deepcopy branch entirely. All workers (thread-safe or not) now use self.db directly; thread-safety is guaranteed for non-thread-safe DBs by the max_workers=1 clamp.

Also clean up stale comments in pgvector.py left over from zilliztech#760/zilliztech#763.

Adds tests/test_pgvector.py with:
- unit test that reproduces the bug (fails on original, passes on fix)
- e2e regression test via ConcurrentInsertRunner + OpenAI 50K dataset

See also: zilliztech#756

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
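The pattern described in the commit message above can be sketched with the standard library alone. This is a rough illustration under assumptions, not the project's code: FakeDB, run_inserts, and deepcopy_fails are hypothetical names, and a threading.Lock stands in for the live psycopg connection because it is another object that copy.deepcopy cannot handle (the exact error message differs from the one quoted above).

```python
import copy
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class FakeDB:
    """Hypothetical stand-in for a non-thread-safe client such as PgVector."""

    thread_safe = False

    def __init__(self):
        self.conn = None
        self.rows = []

    @contextmanager
    def init(self):
        # A Lock stands in for a live connection: it cannot be deep-copied.
        self.conn = threading.Lock()
        try:
            yield
        finally:
            self.conn = None

    def insert(self, batch):
        with self.conn:
            self.rows.extend(batch)
        return len(batch)


def run_inserts(db, batches, max_workers=4):
    """Insert all batches, sharing db directly instead of copying it per thread."""
    if not db.thread_safe:
        max_workers = 1  # the clamp that makes sharing db safe
    with db.init():
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return sum(pool.map(db.insert, batches))


def deepcopy_fails(db):
    """Return True if deep-copying db raises TypeError while it is initialized."""
    with db.init():
        try:
            copy.deepcopy(db)
        except TypeError:
            return True
    return False
```

Calling run_inserts(FakeDB(), [[1, 2], [3]]) goes through the single worker and never copies the client, while deepcopy_fails(FakeDB()) shows why copying after init() has opened the connection is a problem.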