Skip to content

feat: Support List in storage#157

Open
zhanglei1949 wants to merge 8 commits into
alibaba:mainfrom
zhanglei1949:zl/feat-list
Open

feat: Support List in storage#157
zhanglei1949 wants to merge 8 commits into
alibaba:mainfrom
zhanglei1949:zl/feat-list

Conversation

@zhanglei1949
Copy link
Copy Markdown
Member

@zhanglei1949 zhanglei1949 commented Apr 1, 2026

Support Type List in storage.
Fix #159

Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add List type support in storage layer with ListView and ListColumn

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add comprehensive List type support in storage layer
• Implement ListView and ListViewBuilder for efficient list serialization
• Create ListColumn for persistent list property storage
• Add conversion functions between execution Values and storage ListViews
• Support nested lists and both POD and variable-length element types
Diagram
flowchart LR
  A["Execution Value::LIST"] -->|ListViewToValue| B["ListView"]
  B -->|get_view| C["ListColumn"]
  C -->|set_value| D["ListViewBuilder"]
  D -->|finish_pod/varlen| E["Binary Blob"]
  E -->|storage| F["Persistent Storage"]
  A -->|ValueToListBlob| E
  G["Property"] -->|as_list_data| E
  E -->|from_list_data| G
Loading

Grey Divider

File Changes

1. include/neug/utils/property/list_view.h ✨ Enhancement +241/-0

New ListView and ListViewBuilder implementations

include/neug/utils/property/list_view.h


2. include/neug/utils/property/column.h ✨ Enhancement +199/-0

Add ListColumn class for list storage

include/neug/utils/property/column.h


3. include/neug/utils/property/property.h ✨ Enhancement +31/-0

Add list data accessors to Property class

include/neug/utils/property/property.h


View more (19)
4. include/neug/execution/common/types/value.h ✨ Enhancement +6/-0

Add ListViewToValue conversion function declaration

include/neug/execution/common/types/value.h


5. src/execution/common/types/value.cc ✨ Enhancement +122/-0

Implement list serialization and conversion functions

src/execution/common/types/value.cc


6. src/utils/pb_utils.cc ✨ Enhancement +66/-2

Add protobuf to list type conversion support

src/utils/pb_utils.cc


7. src/utils/property/column.cc ✨ Enhancement +3/-0

Register ListColumn in CreateColumn factory

src/utils/property/column.cc


8. src/utils/property/property.cc ✨ Enhancement +6/-0

Add list default value support

src/utils/property/property.cc


9. src/utils/yaml_utils.cc ✨ Enhancement +5/-0

Add YAML serialization for list types

src/utils/yaml_utils.cc


10. src/compiler/gopt/g_ddl_converter.cpp ✨ Enhancement +5/-5

Replace convertSimpleLogicalType with convertLogicalType

src/compiler/gopt/g_ddl_converter.cpp


11. src/compiler/gopt/g_expr_converter.cpp 🐞 Bug fix +2/-4

Allow empty lists in array conversion

src/compiler/gopt/g_expr_converter.cpp


12. tests/storage/test_list_column.cc 🧪 Tests +527/-0

Comprehensive test suite for list functionality

tests/storage/test_list_column.cc


13. tests/storage/CMakeLists.txt 🧪 Tests +3/-1

Register test_list_column test target

tests/storage/CMakeLists.txt


14. tools/python_bind/tests/test_ddl.py 🧪 Tests +15/-0

Add test for list type DDL and queries

tools/python_bind/tests/test_ddl.py


15. tools/python_bind/tests/test_lsqb.py Formatting +0/-1

Minor formatting cleanup

tools/python_bind/tests/test_lsqb.py


16. tools/python_bind/neug/proto/basic_type_pb2.py Formatting +0/-1

Remove unnecessary blank line in proto file

tools/python_bind/neug/proto/basic_type_pb2.py


17. tools/python_bind/neug/proto/common_pb2.py Formatting +0/-1

Remove unnecessary blank line in proto file

tools/python_bind/neug/proto/common_pb2.py


18. tools/python_bind/neug/proto/error_pb2.py Formatting +0/-1

Remove unnecessary blank line in proto file

tools/python_bind/neug/proto/error_pb2.py


19. tools/python_bind/neug/proto/expr_pb2.py Formatting +0/-1

Remove unnecessary blank line in proto file

tools/python_bind/neug/proto/expr_pb2.py


20. tools/python_bind/neug/proto/results_pb2.py Formatting +0/-1

Remove unnecessary blank line in proto file

tools/python_bind/neug/proto/results_pb2.py


21. tools/python_bind/neug/proto/type_pb2.py Formatting +0/-1

Remove unnecessary blank line in proto file

tools/python_bind/neug/proto/type_pb2.py


22. tools/python_bind/setup.py Formatting +0/-1

Remove unnecessary blank line in setup

tools/python_bind/setup.py


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented Apr 1, 2026

Code Review by Qodo

🐞 Bugs (7) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider


Action required

1. Dangling list blob view 🐞 Bug ≡ Correctness
Description
execution::value_to_property builds a temporary std::string blob for list values and returns
Property::from_list_data(blob), but Property stores list data as std::string_view so it dangles
immediately. This can cause use-after-free when inserting into storage (Table::insert ->
ColumnBase::set_any) and can corrupt stored list data or crash.
Code

src/execution/common/types/value.cc[R817-820]

+  case DataTypeId::kList: {
+    std::string blob = ValueToListBlob(value);
+    return Property::from_list_data(blob);
+  }
Evidence
The list Property stores only a string_view (no ownership), but value_to_property constructs the
blob as a local std::string and returns a Property that references it; later, insert paths store the
Property in vectors and then call set_any which reads the dangling view to memcpy into column
buffers.

src/execution/common/types/value.cc[817-820]
include/neug/utils/property/property.h[146-166]
src/execution/common/operators/insert/create_vertex.cc[60-81]
src/utils/property/table.cc[278-285]
include/neug/utils/property/column.h[573-575]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`Property` stores list blobs as `std::string_view`, but `execution::value_to_property` creates the blob as a temporary `std::string` and returns `Property::from_list_data(blob)`. The returned `Property` immediately holds a dangling pointer.

### Issue Context
List blobs are commonly produced dynamically (serialization from `Value::LIST`), so the system needs an ownership strategy. The current Property implementation uses `memcpy`-based assignment, so adding owning fields requires updating copy/assignment semantics.

### Fix Focus Areas
- src/execution/common/types/value.cc[817-820]
- include/neug/utils/property/property.h[146-166]
- src/utils/property/table.cc[278-285]

### Suggested fix
1. Introduce ownership for variable-length payloads (at minimum for `kList`, and ideally unify with `kVarchar`). Options:
  - Add an owning buffer inside `Property` (e.g., `std::shared_ptr<std::string>` or `std::string` + proper copy/move/assignment/destructor) and make `as_list_data()` return a view into owned storage.
  - Or redesign the insert path to avoid storing list blobs inside `Property` as a view (e.g., pass an owned blob alongside the Property vector).
2. Update `Property::operator=` and any serialization/deserialization code to handle the new ownership model safely (remove `memcpy`-based copying for non-trivial members).
3. Add tests that insert list properties via the normal execution/storage insert path to catch lifetime bugs (not just unit tests over builders).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. WAL cannot serialize lists 🐞 Bug ☼ Reliability
Description
WAL redo serialization writes Property values with arc << prop, but Property archive operators do
not handle DataTypeId::kList, so inserts/updates with list properties will throw during WAL writing
or fail during recovery ingestion. This breaks durability/recovery once list properties are used.
Code

src/utils/property/property.cc[R60-65]

+  case DataTypeId::kList:
+    // An empty list blob (no elements) serves as the default value for list
+    // properties.  ListView::size() returns 0 when the blob is shorter than
+    // 4 bytes, so an empty string_view is a valid representation.
+    default_value.set_list_data(std::string_view{});
+    break;
Evidence
WAL InsertVertexRedo serializes every Property via the Property operator<<, but that operator only
supports scalar types and throws for unsupported types; there is no kList case implemented, so any
list Property will fail serialization.

src/transaction/wal/wal.cc[374-383]
src/utils/property/property.cc[74-103]
include/neug/utils/property/property.h[146-158]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
WAL serialization uses `arc << Property`, but `Property` serialization/deserialization does not support `DataTypeId::kList`. This will throw when a list property is present.

### Issue Context
Redo logs serialize all properties on insert/update. With list properties enabled, WAL must be able to write/read list blobs.

### Fix Focus Areas
- src/utils/property/property.cc[74-160]
- src/transaction/wal/wal.cc[374-415]

### Suggested fix
1. Extend `InArchive& operator<<(InArchive&, const Property&)` to handle `DataTypeId::kList` by writing the raw list blob (`as_list_data()`) similarly to how varchar writes `as_string_view()`.
2. Extend `OutArchive& operator>>(OutArchive&, Property&)` to handle reading `kList` by reading a `std::string_view` blob and calling `set_list_data(...)` (or the new owning setter if you implement ownership).
3. Add a WAL roundtrip test that inserts a vertex with a list property, flushes WAL, and replays it to ensure list blobs survive recovery.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Missing type_traits include 🐞 Bug ≡ Correctness
Description
include/neug/utils/property/list_view.h uses std::is_trivially_copyable_v but does not include
<type_traits>, which can cause compilation failures depending on transitive include order. This
breaks header self-sufficiency.
Code

include/neug/utils/property/list_view.h[R167-170]

+  template <typename T>
+  void append_pod(const T& val) {
+    static_assert(std::is_trivially_copyable_v<T>,
+                  "append_pod requires a trivially copyable type");
Evidence
The header directly references std::is_trivially_copyable_v yet its include list omits
<type_traits>, so builds that include this header without other incidental includes will fail.

include/neug/utils/property/list_view.h[15-26]
include/neug/utils/property/list_view.h[166-173]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`list_view.h` uses `std::is_trivially_copyable_v` but doesn't include `<type_traits>`.

### Fix Focus Areas
- include/neug/utils/property/list_view.h[15-26]

### Suggested fix
Add:
```cpp
#include <type_traits>
```
near the other standard includes.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (2)
4. Missing POD list child types 🐞 Bug ≡ Correctness
Description
ValueToListBlob and ListViewToValue only support a subset of POD child types, but is_pod_type()
classifies int8/int16/uint8/uint16 as POD. Lists of these element types will fail conversion between
execution Values and storage blobs (either falling into the non-POD path and throwing, or being
treated as unsupported).
Code

src/execution/common/types/value.cc[R753-775]

+  if (is_pod_type(child_type.id())) {
+    switch (child_type.id()) {
+#define APPEND_POD(type_enum, cpp_type)           \
+  case DataTypeId::type_enum:                     \
+    for (const auto& c : children) {              \
+      builder.append_pod(c.GetValue<cpp_type>()); \
+    }                                             \
+    return builder.finish_pod<cpp_type>();
+      APPEND_POD(kBoolean, bool)
+      APPEND_POD(kInt32, int32_t)
+      APPEND_POD(kInt64, int64_t)
+      APPEND_POD(kUInt32, uint32_t)
+      APPEND_POD(kUInt64, uint64_t)
+      APPEND_POD(kFloat, float)
+      APPEND_POD(kDouble, double)
+      APPEND_POD(kDate, date_t)
+      APPEND_POD(kTimestampMs, timestamp_ms_t)
+      APPEND_POD(kInterval, interval_t)
+#undef APPEND_POD
+    default:
+      break;
+    }
+  }
Evidence
is_pod_type() returns true for kInt8/kInt16/kUInt8/kUInt16, but ValueToListBlob’s POD switch does
not handle them; ListViewToValue’s switch similarly omits them, so these list types cannot roundtrip
between execution and storage.

include/neug/utils/property/list_view.h[35-55]
src/execution/common/types/value.cc[752-790]
src/execution/common/types/value.cc[943-997]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`is_pod_type()` includes `kInt8`, `kInt16`, `kUInt8`, `kUInt16`, but `ValueToListBlob` / `ListViewToValue` don't implement these cases. This makes some list schemas unusable.

### Fix Focus Areas
- src/execution/common/types/value.cc[745-790]
- src/execution/common/types/value.cc[937-1000]
- include/neug/utils/property/list_view.h[35-55]

### Suggested fix
1. Add APPEND_POD cases in `ValueToListBlob` for:
  - `kInt8` -> `int8_t`
  - `kInt16` -> `int16_t`
  - `kUInt8` -> `uint8_t`
  - `kUInt16` -> `uint16_t`
2. Add corresponding cases in `ListViewToValue` to construct `Value::INT8/INT16/UINT8/UINT16` if those constructors exist; otherwise, implement them or decide to explicitly mark these child types unsupported and remove them from `is_pod_type()`.
3. Add unit tests for these missing element types.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. ListColumn ignores non-empty defaults 🐞 Bug ≡ Correctness
Description
ListColumn::resize(size, default_value) validates the type and then does not apply default_value to
newly-added rows, leaving them empty lists. This violates the ColumnBase contract used elsewhere
(TypedColumn fills new rows with the provided default) and breaks schema DEFAULT list values.
Code

include/neug/utils/property/column.h[R538-545]

+  void resize(size_t size, const Property& default_value) override {
+    if (default_value.type() != DataTypeId::kList &&
+        default_value.type() != DataTypeId::kEmpty) {
+      THROW_RUNTIME_ERROR("Default value type does not match list column");
+    }
+    resize(size);
+    // Leave entries zero-initialized (empty lists) for new slots.
+  }
Evidence
TypedColumn::resize(size, default_value) explicitly writes default_value into each newly added row,
and VertexTable grows storage by calling Table::resize(capacity, schema default values).
ListColumn’s implementation discards non-empty defaults, so newly allocated rows will not match the
schema defaults for list-typed properties.

include/neug/utils/property/column.h[140-153]
include/neug/utils/property/column.h[538-545]
src/storages/graph/vertex_table.cc[185-187]
src/utils/property/table.cc[295-305]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`ListColumn::resize(size, default_value)` does not populate new rows with `default_value`, unlike other column implementations.

### Issue Context
Schema default values are used when tables grow (e.g., `EnsureCapacity`). For list columns, non-empty defaults will be silently lost.

### Fix Focus Areas
- include/neug/utils/property/column.h[538-545]
- include/neug/utils/property/column.h[554-575]
- src/storages/graph/vertex_table.cc[185-187]

### Suggested fix
1. Implement default filling for newly-added indices similar to `TypedColumn<T>::resize`:
  - Compute `old_size`, resize storage, then for i in [old_size, size): set items_ to reference the default blob.
2. To avoid duplicating blob bytes N times, you can store the default blob once in `data_` and set all new `items_[i]` to the same `{offset,length}`.
3. If the default blob is empty, leave the zero-initialized entries as-is.
4. Add a unit/integration test that defines a list column with a non-empty DEFAULT and verifies inserted rows (or resized capacity rows) read back that default.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

6. ListView lacks blob validation 🐞 Bug ☼ Reliability
Description
ListView parses raw bytes with reinterpret_cast and uses only asserts for bounds;
malformed/corrupted blobs can cause out-of-bounds reads or undefined behavior (including unaligned
loads). This can crash the process when reading corrupted storage or untrusted inputs.
Code

include/neug/utils/property/list_view.h[R93-139]

+  // Number of elements stored in this list.
+  size_t size() const {
+    if (data_.size() < sizeof(uint32_t)) {
+      return 0;
+    }
+    return static_cast<size_t>(
+        *reinterpret_cast<const uint32_t*>(data_.data()));
+  }
+
+  // Direct element access for POD child types.
+  // The template argument T must match the actual element C++ type.
+  template <typename T>
+  const T& GetElem(size_t idx) const {
+    assert(is_pod_type(ListType::GetChildType(type_).id()));
+    assert(idx < size());
+    return reinterpret_cast<const T*>(data_.data() + sizeof(uint32_t))[idx];
+  }
+
+  // Element access for varchar (string) child type.
+  std::string_view GetChildStringView(size_t idx) const {
+    return child_span(idx);
+  }
+
+  // Element access for a nested List child type.
+  // The returned ListView borrows from the same underlying buffer.
+  ListView GetChildListView(size_t idx) const {
+    const DataType& child = ListType::GetChildType(type_);
+    return ListView(child, child_span(idx));
+  }
+
+  const DataType& type_;
+  std::string_view data_;
+
+ private:
+  // Returns the raw byte span for the variable-length element at index idx.
+  // Must only be called for non-POD encodings.
+  std::string_view child_span(size_t idx) const {
+    assert(!data_.empty());
+    const uint32_t count = *reinterpret_cast<const uint32_t*>(data_.data());
+    assert(idx < static_cast<size_t>(count));
+    const uint32_t* offsets =
+        reinterpret_cast<const uint32_t*>(data_.data() + sizeof(uint32_t));
+    // data region starts right after the (count+1) offsets
+    const char* data_start =
+        data_.data() + sizeof(uint32_t) + (count + 1) * sizeof(uint32_t);
+    return std::string_view(data_start + offsets[idx],
+                            offsets[idx + 1] - offsets[idx]);
Evidence
ListView reads count and offsets via reinterpret_cast<const uint32_t*> and then indexes into the
buffer without verifying that the buffer length is sufficient for the declared count/offsets. These
checks are only asserts, so release builds may read past the end of the buffer on corrupted data.

include/neug/utils/property/list_view.h[93-109]
include/neug/utils/property/list_view.h[129-140]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`ListView` does not validate blob size/layout before reading count/offsets/elements; it relies on `assert` and performs potentially unaligned `reinterpret_cast` loads.

### Fix Focus Areas
- include/neug/utils/property/list_view.h[93-140]

### Suggested fix
1. Replace direct dereferences of `reinterpret_cast<uint32_t*>` with `std::memcpy` into a local `uint32_t` to avoid alignment UB.
2. Add explicit size checks:
  - Ensure `data_.size() >= 4` before reading count.
  - For POD: ensure `data_.size() >= 4 + count*sizeof(T)`.
  - For varlen: ensure `data_.size() >= 4 + (count+1)*4`, then validate `off[count]` and each `off[i]` is monotonic and within bounds.
3. Decide on behavior for invalid blobs (throw exception vs. return empty list) and apply consistently.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


7. List length truncation risk 🐞 Bug ☼ Reliability
Description
ListColumn stores blob length in a uint32_t without checking overflow, so blobs larger than 4GiB
will wrap/truncate and later reads will view incorrect spans. This can lead to corrupted reads and
potential memory errors.
Code

include/neug/utils/property/column.h[R569-571]

+    items_.set(idx, {static_cast<uint64_t>(offset),
+                     static_cast<uint32_t>(blob.size())});
+  }
Evidence
set_value casts blob.size() to uint32_t when writing list_storage_item.length, but the API accepts
arbitrary std::string_view sizes; without validation, large lists can be stored with incorrect
length metadata.

include/neug/utils/property/column.h[436-440]
include/neug/utils/property/column.h[554-571]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`ListColumn` truncates blob sizes to `uint32_t`.

### Fix Focus Areas
- include/neug/utils/property/column.h[436-440]
- include/neug/utils/property/column.h[554-571]

### Suggested fix
Either:
1. Add an explicit guard in `set_value`:
  - If `blob.size() > std::numeric_limits<uint32_t>::max()`, throw a runtime error.

Or:
2. Change `list_storage_item.length` to `uint64_t` (and update serialization/layout accordingly) if you want to support blobs >4GiB.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@zhanglei1949 zhanglei1949 marked this pull request as draft April 1, 2026 08:43
@zhanglei1949
Copy link
Copy Markdown
Member Author

zhanglei1949 commented Apr 1, 2026

Currently the implementation is finished.
@shirly121 Could you please implement the compiler side code? Currently, it fails at compiler side for the test:
python3 -m pytest -sv tests/test_ddl.py -k test_list_type

E           RuntimeError: Failed to execute query: CREATE (:TestNode {id: 1, tags: ['tag1', 'tag2']});. Error code: 1011, Error Message: ERR_NOT_SUPPORTED: Not supported: Unexpected type: 100 at /data/zhanglei/alibaba/neug/src/execution/common/types/value.cc:822 func: value_to_property, Error Code: 1011

It turns out the it used the to_tuple message: https://github.com/alibaba/neug/blob/main/src/compiler/gopt/g_expr_converter.cpp#L717

Comment thread src/execution/common/types/value.cc
Comment thread src/utils/property/property.cc
Comment thread include/neug/utils/property/list_view.h
Comment thread src/execution/common/types/value.cc
Comment thread include/neug/utils/property/column.h
@zhanglei1949 zhanglei1949 force-pushed the zl/feat-list branch 2 times, most recently from 36b6978 to df9110a Compare April 27, 2026 02:23
@zhanglei1949
Copy link
Copy Markdown
Member Author

cd tools/python_bind
python -m pytest tests/test_ddl.py -k test_list_type_with_default_values

# got error
            )
E           RuntimeError: Failed to execute query: CREATE NODE TABLE TaggedNode(id INT64 PRIMARY KEY, tags STRING[] DEFAULT ['default_tag']);. Error code: 3000, Error Message: ERR_COMPILATION: Default value expression should be a constant at /Users/zhanglei/code/alibaba/neug/src/compiler/gopt/g_expr_converter.cpp:301 func: convertDefaultValue, Error Code: 9999

@shirly121

@zhanglei1949 zhanglei1949 marked this pull request as ready for review April 28, 2026 02:39
@zhanglei1949 zhanglei1949 force-pushed the zl/feat-list branch 3 times, most recently from 14caa51 to f5bd0c9 Compare April 28, 2026 08:14
@zhanglei1949 zhanglei1949 requested a review from Copilot April 28, 2026 08:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds end-to-end support for List/array properties in storage and execution, including serialization, column storage, planner conversions, and tests (fixes #159).

Changes:

  • Introduces ListView/ListViewBuilder blob format and a ListColumn implementation for list-typed properties.
  • Updates Value↔Property conversions with OwnedProperty to safely handle non-owning Property views for var-len types (esp. list blobs).
  • Adds storage/execution/compiler + Python binding tests for list and nested-list behavior and persistence.

Reviewed changes

Copilot reviewed 42 out of 42 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tools/python_bind/tests/test_ddl.py Adds Python DDL/query tests for list columns, defaults, and persistence.
tools/python_bind/tests/test_db_list.py Adds Python binding test for nested list properties.
tests/storage/test_list_column.cc New C++ storage tests for list blobs, list columns, persistence, and conversions.
tests/storage/CMakeLists.txt Registers the new test_list_column target and fixes formatting.
tests/execution/test_value.cc Updates conversion tests to use OwnedProperty + adds list round-trip coverage.
tests/compiler/gopt_test.h Updates GPhysicalConvertor construction to pass client context.
src/utils/yaml_utils.cc Adds YAML encoding for list types (array.component_type).
src/utils/property/property.cc Adds default list value + serialization/deserialization support for list properties.
src/utils/property/column.cc Enables ListColumn/ref-column creation when schema type is kList.
src/utils/pb_utils.cc Adds proto-to-DataType support for arrays + value conversion for array values.
src/execution/expression/exprs/path_expr.cc Passes expected DataType into property_to_value for list-aware decoding.
src/execution/expression/accessors/vertex_accessor.cc Passes expected DataType into property_to_value.
src/execution/expression/accessors/record_accessor.cc Passes expected DataType into property_to_value.
src/execution/expression/accessors/edge_accessor.cc Passes expected DataType into property_to_value.
src/execution/execute/ops/retrieve/scan_utils.cc Uses OwnedProperty when converting parameter values to Properties.
src/execution/execute/ops/ddl/create_vertex_type.cc Keeps default property blobs alive via OwnedProperty while building schema params.
src/execution/execute/ops/ddl/create_edge_type.cc Keeps default property blobs alive via OwnedProperty while building schema params.
src/execution/execute/ops/ddl/add_vertex_property.cc Keeps default property blobs alive via OwnedProperty while building alter params.
src/execution/execute/ops/ddl/add_edge_property.cc Keeps default property blobs alive via OwnedProperty while building alter params.
src/execution/execute/ops/batch/batch_update_vertex.cc Uses OwnedProperty during updates and fixes type-mismatch reporting.
src/execution/common/types/value.cc Implements list blob (de)serialization + switches value_to_property to OwnedProperty.
src/execution/common/operators/retrieve/join.cc Uses OwnedProperty for PK join lookups.
src/execution/common/operators/insert/create_vertex.cc Uses OwnedProperty to keep PK/property buffers alive during inserts.
src/execution/common/operators/insert/create_edge.cc Uses OwnedProperty to keep edge property buffers alive during inserts.
src/compiler/planner/gopt_planner.cc Passes client context into GPhysicalConvertor.
src/compiler/gopt/g_query_converter.cpp Threads client context through query conversion; updates nested convertor usage.
src/compiler/gopt/g_expr_converter.cpp Adds constant folding for list default expressions; allows empty array literals.
src/compiler/gopt/g_ddl_converter.cpp Uses logical-type conversion that can represent list types.
src/compiler/binder/expression_visitor.cpp Treats list creation functions as constant if children are constant.
include/neug/utils/property/property.h Adds list accessors + PropUtils<ListView> + list stringification.
include/neug/utils/property/list_view.h New list blob view/builder with POD and var-len encodings (incl. nested lists).
include/neug/utils/property/column.h Refactors var-len column logic, adds ListColumn + TypedRefColumn<ListView>.
include/neug/storages/graph/schema.h Preserves ownership of default list blobs similarly to default strings.
include/neug/execution/common/types/value.h Updates API: value_to_propertyOwnedProperty, property_to_value requires DataType.
include/neug/execution/common/types/owned_property.h New wrapper to keep var-len buffers alive for non-owning Property views.
include/neug/compiler/gopt/g_type_utils.h Adds YAML conversion for LIST logical types.
include/neug/compiler/gopt/g_scalar_type.h Removes noisy logging for list/struct scalar type detection.
include/neug/compiler/gopt/g_query_converter.h Extends converter to carry client context.
include/neug/compiler/gopt/g_physical_convertor.h Extends physical convertor to carry client context and pass it through.
include/neug/compiler/gopt/g_expr_converter.h Updates expr converter to accept client context and expose folding helper.
include/neug/compiler/gopt/g_ddl_converter.h Updates DDL converter to pass client context to expr converter.
include/neug/common/types.h Adds is_pod_type() helper for list encoding/decoding decisions.
Comments suppressed due to low confidence (1)

tools/python_bind/tests/test_ddl.py:1

  • These new tests write to hard-coded /tmp/... paths. This can make the suite flaky under parallel runs and can fail in environments without /tmp or with restricted permissions. Suggested fix: use pytest’s tmp_path fixture (as used in other tests) to create per-test temporary directories, and avoid shutil.rmtree on fixed paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread include/neug/utils/property/column.h
Comment thread include/neug/utils/property/column.h Outdated
Comment thread include/neug/utils/property/list_view.h
Comment thread src/compiler/gopt/g_expr_converter.cpp Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 43 out of 43 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (5)

tools/python_bind/tests/test_db_list.py:1

  • The pattern syntax uses a space in the label ((p: PERSON) / MATCH (p: PERSON)), which is inconsistent with the other tests in this PR (:TestNode) and is likely invalid in Cypher-like grammars. Also, MATCH ... RETURN without ORDER BY can yield nondeterministic row ordering, making the test flaky. Remove the label whitespace (e.g., :PERSON) and add ORDER BY p.id (or equivalent) before asserting on row order.
    tools/python_bind/tests/test_ddl.py:1
  • These tests write to fixed /tmp/... directories, which can collide under parallel test execution (or when rerunning after a crash), leading to flaky failures and cross-test interference. Prefer using a pytest-provided temporary directory (e.g., tmp_path) or generating a unique temp directory per test (e.g., tempfile.mkdtemp) and cleaning up reliably.
    tools/python_bind/tests/test_ddl.py:1
  • These tests write to fixed /tmp/... directories, which can collide under parallel test execution (or when rerunning after a crash), leading to flaky failures and cross-test interference. Prefer using a pytest-provided temporary directory (e.g., tmp_path) or generating a unique temp directory per test (e.g., tempfile.mkdtemp) and cleaning up reliably.
    tools/python_bind/tests/test_ddl.py:1
  • These tests write to fixed /tmp/... directories, which can collide under parallel test execution (or when rerunning after a crash), leading to flaky failures and cross-test interference. Prefer using a pytest-provided temporary directory (e.g., tmp_path) or generating a unique temp directory per test (e.g., tempfile.mkdtemp) and cleaning up reliably.
    tests/storage/test_list_column.cc:1
  • The storage tests use predictable /tmp/test_list_column_* directories, which can conflict when tests are run concurrently (or if a prior run left behind partial state). Use a per-test unique directory under std::filesystem::temp_directory_path() (e.g., include PID + random suffix) or a test framework temp-dir helper, and consider cleaning up in TearDown() to minimize leftover state.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread include/neug/utils/property/list_view.h
Comment thread include/neug/utils/property/list_view.h
Comment thread include/neug/utils/property/list_view.h
Comment thread include/neug/utils/property/list_view.h
Comment thread include/neug/utils/property/column.h
Comment thread src/execution/common/types/value.cc
Comment thread src/compiler/gopt/g_expr_converter.cpp Outdated
Comment on lines +271 to +287
// if the function can be folded to a constant, return true, otherwise return
// false
bool canFold(std::shared_ptr<binder::Expression> expr) {
if (expr->expressionType != common::ExpressionType::FUNCTION) {
return false;
}

auto& funcExpr = expr->constCast<binder::ScalarFunctionExpression>();
auto children = funcExpr.getChildren();
for (auto child : children) {
if (child->expressionType != common::ExpressionType::LITERAL &&
!canFold(child)) {
return false;
}
}
return true;
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

canFold is defined with external linkage in the .cpp and folds any function expression whose inputs are literal/foldable, without checking whether the function is deterministic/immutable. For default values, folding non-deterministic functions at compile time can change semantics (e.g., time/random-like functions). Recommend putting helpers in an anonymous namespace (or making them static) and restricting folding to a known-safe function set (or checking function metadata for foldability/determinism).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 44 out of 44 changed files in this pull request and generated 2 comments.

Comment thread src/execution/common/operators/retrieve/join.cc Outdated
Comment thread src/execution/execute/ops/batch/batch_update_vertex.cc Outdated
@zhanglei1949 zhanglei1949 requested a review from liulx20 May 6, 2026 07:34
support to_list expression in compiler

Committed-by: Xiaoli Zhou from Dev container

resolve some comments

use owned property to hold the string memory temporarily when passing it to storage

fix default value of list type in ddl

Committed-by: Xiaoli Zhou from Dev container

fix

fix fold constant bugs

Committed-by: Xiaoli Zhou from Dev container

Committed-by: Xiaoli Zhou from Dev container

todo: resolve the test cases

minor fix

Revert "fix fold constant bugs"

This reverts commit 287e2a9.

Committed-by: Xiaoli Zhou from Dev container

convert list const value to to_list expression

Committed-by: Xiaoli Zhou from Dev container

Committed-by: Xiaoli Zhou from Dev container

Committed-by: Xiaoli Zhou from Dev container

fix bug

fix
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 48 out of 48 changed files in this pull request and generated 3 comments.

Comment thread src/utils/pb_utils.cc Outdated
Comment thread tools/python_bind/tests/test_ddl.py
Comment thread src/execution/common/types/value.cc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce support for List type in storage.

2 participants