
Support uninitialized tensors in file IO reads #1187

Open: cliffburdick wants to merge 1 commit into main from cburdick/io-uninitialized-tensors


cliffburdick (Collaborator) commented May 26, 2026

Allocate destination tensors during NumPy-backed file reads when the tensor has no data pointer, using the shape discovered from the file before copying data.
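For readers following along, the allocate-on-read behavior described above can be pictured with a small standalone sketch. It does not use MatX: `ToyTensor`, `FileArray`, `NumElements`, and `ReadInto` are hypothetical names invented for illustration, and the real change lives in the NumPy-backed conversion path, so treat this as the idea under those assumptions rather than the PR's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a dynamic tensor; this is NOT the MatX type.
struct ToyTensor {
  std::vector<std::size_t> shape;  // empty until storage is allocated
  std::vector<float> data;

  // Mirrors the idea of IsInitialized(): true once storage exists.
  bool IsInitialized() const { return !data.empty(); }
};

// Hypothetical stand-in for an array already decoded from a CSV/MAT/NPY file.
struct FileArray {
  std::vector<std::size_t> shape;
  std::vector<float> values;  // row-major; size equals the product of shape
};

inline std::size_t NumElements(const std::vector<std::size_t>& shape) {
  return std::accumulate(shape.begin(), shape.end(), std::size_t{1},
                         std::multiplies<std::size_t>());
}

// Copy the file contents into `t`. If `t` has no storage yet, allocate it
// from the shape discovered in the file; otherwise require a matching shape.
inline void ReadInto(ToyTensor& t, const FileArray& file) {
  assert(file.values.size() == NumElements(file.shape));
  if (!t.IsInitialized()) {
    t.shape = file.shape;
    t.data.resize(file.values.size());
  } else if (t.shape != file.shape) {
    throw std::invalid_argument("tensor shape does not match file shape");
  }
  std::copy(file.values.begin(), file.values.end(), t.data.begin());
}
```

In this sketch a default-constructed `ToyTensor` passed to `ReadInto` comes back with the file's shape and data, while a pre-allocated tensor of a different shape causes a throw instead of a silent partial copy.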

Add CSV, MAT, and NPY regression coverage for reading into default-constructed tensors.

Closes #816

copy-pr-bot (Bot) commented May 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cliffburdick force-pushed the cburdick/io-uninitialized-tensors branch from bc496a8 to 061a97e (May 26, 2026 20:37)

greptile-apps (Bot, Contributor) commented May 26, 2026

Greptile Summary

This PR enables file IO read functions (read_csv, read_mat, read_npy) to accept default-constructed (uninitialized) tensors, allocating storage from the file's discovered shape before copying data. It also tightens the existing path with an exact_shape flag and adds regression coverage for all three formats.

  • NumpyToTensorView is changed to a forwarding-reference signature so that make_tensor can modify the caller's tensor in-place; a has_is_initialized_v concept check gates the auto-allocation path.
  • IsInitialized() is added to both tensor_impl.h (shared base) and dynamic_tensor.h, and all three file-IO read helpers pass exact_shape=true; three new "uninitialized" test cases plus a shape-mismatch throw test are included.
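The two mechanisms named in the bullets above, a forwarding-reference parameter and a member-detection check, can be sketched in isolation. The code below re-implements a trait with the same name, `has_is_initialized_v`, using the standard `std::void_t` detection idiom; the actual definition in the PR may differ. `ToyTensor`, `ToyView`, and `EnsureStorage` are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Detection idiom: true when T has a const-callable IsInitialized() member.
template <typename T, typename = void>
struct has_is_initialized : std::false_type {};

template <typename T>
struct has_is_initialized<
    T, std::void_t<decltype(std::declval<const T&>().IsInitialized())>>
    : std::true_type {};

// Strip references and cv-qualifiers so the trait also works on the
// deduced type of a forwarding reference (e.g. ToyTensor&).
template <typename T>
inline constexpr bool has_is_initialized_v =
    has_is_initialized<std::remove_cv_t<std::remove_reference_t<T>>>::value;

// Hypothetical tensor-like type that provides the helper.
struct ToyTensor {
  std::vector<float> data;
  bool IsInitialized() const { return !data.empty(); }
};

// Hypothetical type without the helper (for example, a fixed view).
struct ToyView {
  float* ptr = nullptr;
};

// Forwarding reference: an lvalue argument binds by reference, so any
// allocation done here is visible to the caller. Returns true if this
// call allocated storage.
template <typename Tensor>
bool EnsureStorage(Tensor&& t, std::size_t count) {
  (void)count;  // unused when the branch below is discarded
  if constexpr (has_is_initialized_v<Tensor>) {
    if (!t.IsInitialized()) {
      t.data.assign(count, 0.0f);
      return true;
    }
  }
  return false;
}
```

Because the check is made with `if constexpr`, the allocation branch is never instantiated for `ToyView`, so a type without `IsInitialized()` or `data` still compiles and simply skips allocation.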

Confidence Score: 4/5

Safe to merge for the common dynamic-tensor path, but adding IsInitialized() to the shared tensor_impl base class introduces a compile error for any caller that uses static tensors with file IO.

The allocation and copy logic in NumpyToTensorView is correct for dynamic tensors (the only type tested). However, because IsInitialized() now lives on the tensor_impl base class, the has_is_initialized_v guard evaluates true for every tensor type. The compiler is then required to instantiate make_tensor(ten, shape) for non-dynamic types inside the non-constexpr if body, and no matching overload exists for those types.

include/matx/core/pybind.h — the auto-allocation guard needs is_dynamic_tensor_v added alongside has_is_initialized_v.
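To illustrate the kind of guard suggested above, here is a standalone sketch under the assumption that the concern is as the review describes: an allocation call that only exists for dynamic tensors gets instantiated for every type that exposes `IsInitialized()`. The names `ToyDynamic`, `ToyStatic`, `is_dynamic_v`, and `AllocateIfNeeded` are hypothetical; `is_dynamic_v` merely plays the role the review assigns to `is_dynamic_tensor_v`.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical dynamic tensor: storage can be allocated at run time.
struct ToyDynamic {
  std::vector<float> data;
  bool IsInitialized() const { return !data.empty(); }
  void Allocate(std::size_t n) { data.assign(n, 0.0f); }
};

// Hypothetical static tensor: fixed storage and no Allocate(), yet it
// still exposes IsInitialized(), like a helper on a shared base would.
struct ToyStatic {
  float data[4] = {};
  bool IsInitialized() const { return true; }
};

// Hypothetical trait standing in for a "dynamic tensor" check.
template <typename T>
struct is_dynamic : std::false_type {};
template <>
struct is_dynamic<ToyDynamic> : std::true_type {};

template <typename T>
inline constexpr bool is_dynamic_v =
    is_dynamic<std::remove_cv_t<std::remove_reference_t<T>>>::value;

// Returns true if this call allocated storage.
template <typename Tensor>
bool AllocateIfNeeded(Tensor& t, std::size_t n) {
  (void)n;  // unused when the branch below is discarded
  // A plain run-time `if (!t.IsInitialized())` here would require
  // t.Allocate(n) to compile for ToyStatic as well, which it cannot.
  // Gating on the type at compile time discards the branch instead.
  if constexpr (is_dynamic_v<Tensor>) {
    if (!t.IsInitialized()) {
      t.Allocate(n);
      return true;
    }
  }
  return false;
}
```

With the compile-time gate, `AllocateIfNeeded` instantiates cleanly for both toy types: the dynamic one is allocated on first use, and the static one is left untouched.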

Important Files Changed

| Filename | Overview |
| --- | --- |
| include/matx/core/pybind.h | Adds auto-allocation of uninitialized tensors in NumpyToTensorView; the has_is_initialized_v guard is too broad and causes a hard compile error when file IO is used with non-dynamic tensor types. |
| include/matx/core/dynamic_tensor.h | Adds IsInitialized() helper to dynamic_tensor_t; straightforward null-pointer check on ldata_. |
| include/matx/core/tensor_impl.h | Adds IsInitialized() to the shared tensor_impl base, which makes has_is_initialized_v true for all tensor types (including static), widening the allocation guard unintentionally. |
| include/matx/file_io/file_io.h | Threads exact_shape=true through all three read paths; no issues in these call sites. |
| test/00_io/FileIOTests.cu | Adds uninitialized-tensor regression tests for CSV, MAT, and NPY reads, plus a shape-mismatch throw test; coverage looks correct. |

Reviews (5): Last reviewed commit: "Support uninitialized tensors in file IO..."

Comment thread include/matx/core/pybind.h Outdated
Comment thread include/matx/core/pybind.h Outdated
cliffburdick force-pushed the cburdick/io-uninitialized-tensors branch 3 times, most recently from 5a4b775 to 589b922 (May 26, 2026 20:58)

cliffburdick (Collaborator, Author) commented

@greptile review

cliffburdick (Collaborator, Author) commented

/build

cliffburdick force-pushed the cburdick/io-uninitialized-tensors branch from 589b922 to 64dd792 (May 27, 2026 17:01)

cliffburdick (Collaborator, Author) commented

/build

Add IsInitialized helpers to tensor types so callers do not inspect Data() directly.

Allocate destination tensors during NumPy-backed file reads when the tensor has no data pointer, using the shape discovered from the file before copying data.

Document CSV, MAT, and NPY IO examples, and clarify that MAT helpers operate on MAT-file variables rather than providing a general HDF5 interface.

Add CSV, MAT, NPY, and tensor initialization regression coverage.
cliffburdick force-pushed the cburdick/io-uninitialized-tensors branch from 64dd792 to c30d2e3 (May 28, 2026 00:36)

Development

Successfully merging this pull request may close these issues.

[FEA] Smart Tensor Creation with File Reading APIs

1 participant