feat: Support List in storage#157
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
Review Summary by QodoAdd List type support in storage layer with ListView and ListColumn
WalkthroughsDescription• Add comprehensive List type support in storage layer • Implement ListView and ListViewBuilder for efficient list serialization • Create ListColumn for persistent list property storage • Add conversion functions between execution Values and storage ListViews • Support nested lists and both POD and variable-length element types Diagramflowchart LR
A["Execution Value::LIST"] -->|ListViewToValue| B["ListView"]
B -->|get_view| C["ListColumn"]
C -->|set_value| D["ListViewBuilder"]
D -->|finish_pod/varlen| E["Binary Blob"]
E -->|storage| F["Persistent Storage"]
A -->|ValueToListBlob| E
G["Property"] -->|as_list_data| E
E -->|from_list_data| G
File Changes1. include/neug/utils/property/list_view.h
|
Code Review by Qodo
1. Dangling list blob view
|
|
Currently the implementation is finished. It turns out the it used the |
36b6978 to
df9110a
Compare
cd tools/python_bind
python -m pytest tests/test_ddl.py -k test_list_type_with_default_values
# got error
)
E RuntimeError: Failed to execute query: CREATE NODE TABLE TaggedNode(id INT64 PRIMARY KEY, tags STRING[] DEFAULT ['default_tag']);. Error code: 3000, Error Message: ERR_COMPILATION: Default value expression should be a constant at /Users/zhanglei/code/alibaba/neug/src/compiler/gopt/g_expr_converter.cpp:301 func: convertDefaultValue, Error Code: 9999 |
d5d2cc5 to
d4ca7d1
Compare
14caa51 to
f5bd0c9
Compare
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds end-to-end support for List/array properties in storage and execution, including serialization, column storage, planner conversions, and tests (fixes #159).
Changes:
- Introduces
ListView/ListViewBuilderblob format and aListColumnimplementation for list-typed properties. - Updates Value↔Property conversions with
OwnedPropertyto safely handle non-owningPropertyviews for var-len types (esp. list blobs). - Adds storage/execution/compiler + Python binding tests for list and nested-list behavior and persistence.
Reviewed changes
Copilot reviewed 42 out of 42 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/python_bind/tests/test_ddl.py | Adds Python DDL/query tests for list columns, defaults, and persistence. |
| tools/python_bind/tests/test_db_list.py | Adds Python binding test for nested list properties. |
| tests/storage/test_list_column.cc | New C++ storage tests for list blobs, list columns, persistence, and conversions. |
| tests/storage/CMakeLists.txt | Registers the new test_list_column target and fixes formatting. |
| tests/execution/test_value.cc | Updates conversion tests to use OwnedProperty + adds list round-trip coverage. |
| tests/compiler/gopt_test.h | Updates GPhysicalConvertor construction to pass client context. |
| src/utils/yaml_utils.cc | Adds YAML encoding for list types (array.component_type). |
| src/utils/property/property.cc | Adds default list value + serialization/deserialization support for list properties. |
| src/utils/property/column.cc | Enables ListColumn/ref-column creation when schema type is kList. |
| src/utils/pb_utils.cc | Adds proto-to-DataType support for arrays + value conversion for array values. |
| src/execution/expression/exprs/path_expr.cc | Passes expected DataType into property_to_value for list-aware decoding. |
| src/execution/expression/accessors/vertex_accessor.cc | Passes expected DataType into property_to_value. |
| src/execution/expression/accessors/record_accessor.cc | Passes expected DataType into property_to_value. |
| src/execution/expression/accessors/edge_accessor.cc | Passes expected DataType into property_to_value. |
| src/execution/execute/ops/retrieve/scan_utils.cc | Uses OwnedProperty when converting parameter values to Properties. |
| src/execution/execute/ops/ddl/create_vertex_type.cc | Keeps default property blobs alive via OwnedProperty while building schema params. |
| src/execution/execute/ops/ddl/create_edge_type.cc | Keeps default property blobs alive via OwnedProperty while building schema params. |
| src/execution/execute/ops/ddl/add_vertex_property.cc | Keeps default property blobs alive via OwnedProperty while building alter params. |
| src/execution/execute/ops/ddl/add_edge_property.cc | Keeps default property blobs alive via OwnedProperty while building alter params. |
| src/execution/execute/ops/batch/batch_update_vertex.cc | Uses OwnedProperty during updates and fixes type-mismatch reporting. |
| src/execution/common/types/value.cc | Implements list blob (de)serialization + switches value_to_property to OwnedProperty. |
| src/execution/common/operators/retrieve/join.cc | Uses OwnedProperty for PK join lookups. |
| src/execution/common/operators/insert/create_vertex.cc | Uses OwnedProperty to keep PK/property buffers alive during inserts. |
| src/execution/common/operators/insert/create_edge.cc | Uses OwnedProperty to keep edge property buffers alive during inserts. |
| src/compiler/planner/gopt_planner.cc | Passes client context into GPhysicalConvertor. |
| src/compiler/gopt/g_query_converter.cpp | Threads client context through query conversion; updates nested convertor usage. |
| src/compiler/gopt/g_expr_converter.cpp | Adds constant folding for list default expressions; allows empty array literals. |
| src/compiler/gopt/g_ddl_converter.cpp | Uses logical-type conversion that can represent list types. |
| src/compiler/binder/expression_visitor.cpp | Treats list creation functions as constant if children are constant. |
| include/neug/utils/property/property.h | Adds list accessors + PropUtils<ListView> + list stringification. |
| include/neug/utils/property/list_view.h | New list blob view/builder with POD and var-len encodings (incl. nested lists). |
| include/neug/utils/property/column.h | Refactors var-len column logic, adds ListColumn + TypedRefColumn<ListView>. |
| include/neug/storages/graph/schema.h | Preserves ownership of default list blobs similarly to default strings. |
| include/neug/execution/common/types/value.h | Updates API: value_to_property→OwnedProperty, property_to_value requires DataType. |
| include/neug/execution/common/types/owned_property.h | New wrapper to keep var-len buffers alive for non-owning Property views. |
| include/neug/compiler/gopt/g_type_utils.h | Adds YAML conversion for LIST logical types. |
| include/neug/compiler/gopt/g_scalar_type.h | Removes noisy logging for list/struct scalar type detection. |
| include/neug/compiler/gopt/g_query_converter.h | Extends converter to carry client context. |
| include/neug/compiler/gopt/g_physical_convertor.h | Extends physical convertor to carry client context and pass it through. |
| include/neug/compiler/gopt/g_expr_converter.h | Updates expr converter to accept client context and expose folding helper. |
| include/neug/compiler/gopt/g_ddl_converter.h | Updates DDL converter to pass client context to expr converter. |
| include/neug/common/types.h | Adds is_pod_type() helper for list encoding/decoding decisions. |
Comments suppressed due to low confidence (1)
tools/python_bind/tests/test_ddl.py:1
- These new tests write to hard-coded
/tmp/...paths. This can make the suite flaky under parallel runs and can fail in environments without/tmpor with restricted permissions. Suggested fix: use pytest’stmp_pathfixture (as used in other tests) to create per-test temporary directories, and avoidshutil.rmtreeon fixed paths.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1345597 to
d225b9c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 43 out of 43 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (5)
tools/python_bind/tests/test_db_list.py:1
- The pattern syntax uses a space in the label (
(p: PERSON)/MATCH (p: PERSON)), which is inconsistent with the other tests in this PR (:TestNode) and is likely invalid in Cypher-like grammars. Also,MATCH ... RETURNwithoutORDER BYcan yield nondeterministic row ordering, making the test flaky. Remove the label whitespace (e.g.,:PERSON) and addORDER BY p.id(or equivalent) before asserting on row order.
tools/python_bind/tests/test_ddl.py:1 - These tests write to fixed
/tmp/...directories, which can collide under parallel test execution (or when rerunning after a crash), leading to flaky failures and cross-test interference. Prefer using a pytest-provided temporary directory (e.g.,tmp_path) or generating a unique temp directory per test (e.g.,tempfile.mkdtemp) and cleaning up reliably.
tools/python_bind/tests/test_ddl.py:1 - These tests write to fixed
/tmp/...directories, which can collide under parallel test execution (or when rerunning after a crash), leading to flaky failures and cross-test interference. Prefer using a pytest-provided temporary directory (e.g.,tmp_path) or generating a unique temp directory per test (e.g.,tempfile.mkdtemp) and cleaning up reliably.
tools/python_bind/tests/test_ddl.py:1 - These tests write to fixed
/tmp/...directories, which can collide under parallel test execution (or when rerunning after a crash), leading to flaky failures and cross-test interference. Prefer using a pytest-provided temporary directory (e.g.,tmp_path) or generating a unique temp directory per test (e.g.,tempfile.mkdtemp) and cleaning up reliably.
tests/storage/test_list_column.cc:1 - The storage tests use predictable
/tmp/test_list_column_*directories, which can conflict when tests are run concurrently (or if a prior run left behind partial state). Use a per-test unique directory understd::filesystem::temp_directory_path()(e.g., include PID + random suffix) or a test framework temp-dir helper, and consider cleaning up inTearDown()to minimize leftover state.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // if the function can be folded to a constant, return true, otherwise return | ||
| // false | ||
| bool canFold(std::shared_ptr<binder::Expression> expr) { | ||
| if (expr->expressionType != common::ExpressionType::FUNCTION) { | ||
| return false; | ||
| } | ||
|
|
||
| auto& funcExpr = expr->constCast<binder::ScalarFunctionExpression>(); | ||
| auto children = funcExpr.getChildren(); | ||
| for (auto child : children) { | ||
| if (child->expressionType != common::ExpressionType::LITERAL && | ||
| !canFold(child)) { | ||
| return false; | ||
| } | ||
| } | ||
| return true; | ||
| } |
There was a problem hiding this comment.
canFold is defined with external linkage in the .cpp and folds any function expression whose inputs are literal/foldable, without checking whether the function is deterministic/immutable. For default values, folding non-deterministic functions at compile time can change semantics (e.g., time/random-like functions). Recommend putting helpers in an anonymous namespace (or making them static) and restricting folding to a known-safe function set (or checking function metadata for foldability/determinism).
support to_list expression in compiler Committed-by: Xiaoli Zhou from Dev container resolve some comments use owned property to hold the string memory temporarily when passing it to storage fix default value of list type in ddl Committed-by: Xiaoli Zhou from Dev container fix fix fold constant bugs Committed-by: Xiaoli Zhou from Dev container Committed-by: Xiaoli Zhou from Dev container todo: resolve the test cases minor fix Revert "fix fold constant bugs" This reverts commit 287e2a9. Committed-by: Xiaoli Zhou from Dev container convert list const value to to_list expression Committed-by: Xiaoli Zhou from Dev container Committed-by: Xiaoli Zhou from Dev container Committed-by: Xiaoli Zhou from Dev container fix bug fix
Support Type List in storage.
Fix #159