itk::SWCMeshIO stores SWC sample/parent/type identifiers as float (SampleIdentifierType, TypeIdentifierType, ParentIdentifierType in Modules/IO/IOMeshSWC/include/itkSWCMeshIO.h). SWC identifiers are integers by specification; representing them as float causes precision loss and is used directly as std::unordered_map keys for topology lookup. This is a design/API change deferred from the parser-hardening work, tracked here separately because it alters the exposed point-data component type and the Python wrapping.
Problem detail
SampleIdentifierType = float and ParentIdentifierType = float are used as keys in m_SampleIdentifierToPointIndex (std::unordered_map<float, IdentifierType>). Integer sample ids above 2^24 are not exactly representable in float, so two distinct ids just above 2^24 collapse to the same key, mismapping parent->point connectivity in ReadCells.
- Root detection uses an exact float comparison
parentIdentifier != -1; while -1 is exactly representable, the broader pattern of exact float equality on identifiers is fragile.
- Machine-generated tracings routinely use large monotonic ids, so this is reachable with real data, not just adversarial input.
The header already carries commented-out integer alternatives:
// using SampleIdentifierType = int16_t;
using SampleIdentifierType = float; // "For Python wrapping"
suggesting the float choice was a wrapping workaround, not a deliberate data-model decision.
Why this is not a drop-in fix
Changing the identifier types to an integer type is an API and wrapping change, not a localized bug fix:
ReadPointData exposes these containers as mesh point data; m_PointPixelComponentType is set to FLOAT. Switching the type changes the component type seen by consumers reading SampleIdentifier/ParentIdentifier point data.
Modules/IO/IOMeshSWC/wrapping/itkSWCMeshIO.wrap and the Python interface would need corresponding updates.
- Round-trip and baseline expectations may shift.
It therefore needs a deliberate data-model decision (which integer width) plus wrapping review, rather than riding along with the defensive parser-hardening changes.
Proposed direction
- Choose an integer identifier type wide enough for real SWC ids (e.g.
IdentifierType / uint32_t / int64_t) for SampleIdentifierType / ParentIdentifierType (and likely TypeIdentifierType, which is a small enum-like code).
- Keep
RadiusType = double.
- Update the unordered_map key types,
ReadPointData component type, and the .wrap accordingly.
- Add a regression test reading an SWC file with ids above 2^24 and asserting connectivity is preserved.
itk::SWCMeshIOstores SWC sample/parent/type identifiers asfloat(SampleIdentifierType,TypeIdentifierType,ParentIdentifierTypeinModules/IO/IOMeshSWC/include/itkSWCMeshIO.h). SWC identifiers are integers by specification; representing them asfloatcauses precision loss and is used directly asstd::unordered_mapkeys for topology lookup. This is a design/API change deferred from the parser-hardening work, tracked here separately because it alters the exposed point-data component type and the Python wrapping.Problem detail
SampleIdentifierType = floatandParentIdentifierType = floatare used as keys inm_SampleIdentifierToPointIndex(std::unordered_map<float, IdentifierType>). Integer sample ids above 2^24 are not exactly representable infloat, so two distinct ids just above 2^24 collapse to the same key, mismapping parent->point connectivity inReadCells.parentIdentifier != -1; while-1is exactly representable, the broader pattern of exact float equality on identifiers is fragile.The header already carries commented-out integer alternatives:
suggesting the float choice was a wrapping workaround, not a deliberate data-model decision.
Why this is not a drop-in fix
Changing the identifier types to an integer type is an API and wrapping change, not a localized bug fix:
ReadPointDataexposes these containers as mesh point data;m_PointPixelComponentTypeis set toFLOAT. Switching the type changes the component type seen by consumers readingSampleIdentifier/ParentIdentifierpoint data.Modules/IO/IOMeshSWC/wrapping/itkSWCMeshIO.wrapand the Python interface would need corresponding updates.It therefore needs a deliberate data-model decision (which integer width) plus wrapping review, rather than riding along with the defensive parser-hardening changes.
Proposed direction
IdentifierType/uint32_t/int64_t) forSampleIdentifierType/ParentIdentifierType(and likelyTypeIdentifierType, which is a small enum-like code).RadiusType = double.ReadPointDatacomponent type, and the.wrapaccordingly.