Skip to content

[codex] Improve dynamic table validation and colnames sync#818

Draft
ehennestad wants to merge 3 commits intomainfrom
improve-dynamic-table-validation
Draft

[codex] Improve dynamic table validation and colnames sync#818
ehennestad wants to merge 3 commits intomainfrom
improve-dynamic-table-validation

Conversation

@ehennestad
Copy link
Copy Markdown
Collaborator

@ehennestad ehennestad commented Apr 17, 2026

Motivation

This PR fixes gaps in DynamicTable validation and synchronization around colnames.

It adds validation that materialized table columns are listed in colnames, enforces uniqueness of colnames, and updates generated setters for schema-defined DynamicTable columns so direct assignments keep colnames in sync. It also re-checks DynamicTable configuration during export so low-level drift is caught before writing.

How to test the behavior?

nwbtest('Name','tests.unit.dynamicTableTest')
nwbtest('Name','tests.system.DynamicTableTest')

Checklist

  • Have you ensured the PR description clearly describes the problem and solutions?
  • Have you checked to ensure that there aren't other open or previously closed Pull Requests for the same change?
  • If this PR fixes an issue, is the first line of the PR description fix #XX where XX is the issue number?

🤖 Generated with Codex

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens DynamicTable invariants around colnames by adding normalization/validation, syncing colnames when schema-defined column properties are assigned, and re-checking table configuration during export to catch drift.

Changes:

  • Added validateColnames (uniqueness + normalization) and syncNamedColumn utilities under types.util.dynamictable.
  • Updated checkConfig to validate colnames and detect materialized columns missing from colnames.
  • Updated class generation to inject validator/setter hooks (and added/updated tests) so direct assignments keep colnames consistent and export re-validates constraints.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
+types/+util/+dynamictable/validateColnames.m New helper to normalize and enforce uniqueness of colnames.
+types/+util/+dynamictable/syncNamedColumn.m New helper to ensure schema-defined columns are registered in colnames on assignment.
+types/+util/+dynamictable/checkConfig.m Extends DynamicTable config validation (now checks for missing detected columns and normalizes colnames).
+types/+hdmf_common/DynamicTable.m Uses validateColnames in validate_colnames and adds checkCustomConstraint to re-run checkConfig (e.g., on export).
+types/+core/Units.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/TimeIntervals.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/SweepTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/SimultaneousRecordingsTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/SequentialRecordingsTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/RepetitionsTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/PlaneSegmentation.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/IntracellularStimuliTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/IntracellularResponsesTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/IntracellularElectrodesTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/FrequencyBandsTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/ExperimentalConditionsTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+types/+core/ElectrodesTable.m Adds post-set hooks to sync schema-defined columns into colnames on direct assignment.
+tests/+unit/dynamicTableTest.m Updates/extends tests for duplicate colnames, direct-assignment syncing, and export-time drift detection.
+file/isDynamicTableDescendant.m New generator helper to detect DynamicTable inheritance.
+file/getPropertyHooks.m New generator hook registry for custom validator/setter behavior.
+file/fillValidators.m Injects additional validator lines via generator hooks.
+file/fillSetters.m Adds optional post-set hook generation via generator hooks.
+file/fillCustomConstraint.m Adds custom constraint generation for DynamicTable to re-check config.
+file/fillConstructor.m Uses new file.isDynamicTableDescendant helper when injecting DynamicTable validation into constructors.
+file/fillClass.m Passes type/namespace into fillSetters for hook-aware setter generation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +8 to +16
colnames = cleanColumnNames(colnames);
if ischar(colnames)
colnames = {colnames};
end

assert(iscellstr(colnames), ...
'NWB:DynamicTable:InvalidColumnNames', ...
'Column names must be a cell array of character vectors.');

Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_colnames accepts MATLAB string arrays because types.util.checkDtype(...,'char',val) allows isstring(val), but validateColnames later asserts iscellstr(colnames)/ischar(colnames) and will therefore reject string input. Consider normalizing string (and cell arrays of strings) to cellstr early (e.g., colnames = cellstr(colnames) when isstring(colnames)), so the validator is consistent with checkDtype and common MATLAB calling patterns.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +15
if ischar(colnames)
colnames = {colnames};
end

assert(iscellstr(colnames), ...
'NWB:DynamicTable:InvalidColumnNames', ...
'Column names must be a cell array of character vectors.');
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateColnames will also fail for an N-by-M char array (multiple column names stored as padded rows), because it wraps any char input into a 1-element cell and then requires iscellstr. If colnames can be a char matrix (e.g., from some HDF5 string array reads or user input), consider converting multi-row char arrays via cellstr(colnames) before validating uniqueness.

Copilot uses AI. Check for mistakes.
Comment on lines +21 to +37
detectedColumnNames = getDetectedColumnNames(DynamicTable);
if isempty(DynamicTable.colnames)
assert(isempty(getDetectedColumnNames(DynamicTable)), ...
assert(isempty(detectedColumnNames), ...
'NWB:DynamicTable:CheckConfig:ColumnNamesMismatch', ...
'All Vector Data/Index columns must have their name ordered in the `colnames` property.');
return;
end

% remove null characters from column names
DynamicTable.colnames = cleanColumnNames(DynamicTable.colnames);
DynamicTable.colnames = types.util.dynamictable.validateColnames(DynamicTable.colnames);
columns = DynamicTable.colnames;

missingColumnNames = setdiff(detectedColumnNames, columns, 'stable');
assert(isempty(missingColumnNames), ...
'NWB:DynamicTable:CheckConfig:ColumnNamesMismatch', ...
['All materialized DynamicTable columns must be listed in `colnames`.\n' ...
'Missing from `colnames`: %s'], ...
strjoin(missingColumnNames, ', '));
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignoreList is documented as excluding columns from checks, but the new missingColumnNames assertion runs before ignoreList is applied. This means ignored columns that exist on the object but are intentionally omitted from colnames will still trigger ColumnNamesMismatch. Consider removing ignoreList entries from detectedColumnNames (or from missingColumnNames) before asserting.

Copilot uses AI. Check for mistakes.
|| isa(value, 'types.core.VectorData');
isVectorIndex = isa(value, 'types.hdmf_common.VectorIndex') ...
|| isa(value, 'types.core.VectorIndex');
tf = ~isempty(value) && isVectorData && ~isVectorIndex;
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isMaterializedColumn currently treats any non-empty VectorData object as “materialized”, regardless of whether its .data is empty / zero-length (the previous implementation in getDetectedColumnNames explicitly skipped empty Vector.data for DataStub, DataPipe, and in-memory arrays). If the intent is to validate only columns with actual row data, reintroduce the data-emptiness check (or rename the helper/message to reflect the new semantics).

Suggested change
tf = ~isempty(value) && isVectorData && ~isVectorIndex;
hasData = false;
if ~isempty(value) && isVectorData && ~isVectorIndex
hasData = ~isempty(value.data);
end
tf = ~isempty(value) && isVectorData && ~isVectorIndex && hasData;

Copilot uses AI. Check for mistakes.
Comment on lines +21 to +33
try
timeIntervals = types.core.TimeIntervals(...
'description', 'test error', ...
'start_time', types.hdmf_common.VectorData( ...
'description', 'start time column', ...
'data', (1:5)' ...
));
testCase.verifyTrue(contains(timeIntervals.colnames, 'start_time'), ...
'Expected colnames to include start_time')
catch
testCase.verifyFail(['Failed to create an empty TimeIntervals ' ...
'table with start_time column.'])
end
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test uses a broad try/catch and then verifyFail on any exception, which loses the exception identifier/message (making failures harder to diagnose). Consider replacing with testCase.verifyWarningFree(@() ... ) (or verifyError with a specific identifier if an error is expected), then separately asserting colnames contains start_time.

Suggested change
try
timeIntervals = types.core.TimeIntervals(...
'description', 'test error', ...
'start_time', types.hdmf_common.VectorData( ...
'description', 'start time column', ...
'data', (1:5)' ...
));
testCase.verifyTrue(contains(timeIntervals.colnames, 'start_time'), ...
'Expected colnames to include start_time')
catch
testCase.verifyFail(['Failed to create an empty TimeIntervals ' ...
'table with start_time column.'])
end
timeIntervals = testCase.verifyWarningFree(@() types.core.TimeIntervals(...
'description', 'test error', ...
'start_time', types.hdmf_common.VectorData( ...
'description', 'start time column', ...
'data', (1:5)' ...
)));
testCase.verifyTrue(contains(timeIntervals.colnames, 'start_time'), ...
'Expected colnames to include start_time')

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants