Commit fb4b7dc

refactor(iterators): NDIterator now iterates lazily — no materialized copy
Previous rewrite (commit 87f90a2b) backed NDIterator<TOut> with an eagerly-materialized NDArray buffer: it ran NpyIter.Copy at construction time into a contiguous TOut-typed buffer and walked that buffer on MoveNext. Simple, but allocated O(size * sizeof(TOut)) up front even for callers that read one element and walk away, or abandon iteration early.

This commit drops the materialization. MoveNext now reads each element lazily from the source layout:

- Same-type, contiguous, offset == 0:
  Direct `*((TOut*)addr + cursor++)`. One pointer increment per call, no coordinate arithmetic, no branch. Matches the legacy contiguous fast path.

- Same-type, strided / sliced / broadcast / offset != 0:
  Walks offsets with ValueOffsetIncrementor (or ValueOffsetIncrementorAutoresetting when AutoReset is set). The incrementor updates one coordinate per call, amortized O(1), with occasional O(ndim) carry-propagation for wrap-around. Same algorithm the legacy code used for its Matrix/Tensor sliced paths.

- Cross-type (source dtype != TOut):
  Offset-walks the source at its native dtype, reads a TSrc element, and passes it through `Converts.FindConverter<TSrc, TOut>()` before returning TOut. One switch at construction dispatches to a typed BuildCastingMoveNext<TSrc>() helper; the per-element hot path is then a `TSrc v = *(...)` read followed by a `conv(v)` delegate call, matching the legacy cast-iterator performance profile.

For consistency with the legacy path, MoveNextReference throws when a cast is involved: you can't hand out a stable ref to a converted value.

AutoReset is implemented inline (`if (cursor >= size) cursor = 0` in the contig path, ValueOffsetIncrementorAutoresetting in the strided path) rather than via modulo-per-call, so the steady-state cost is a single predictable branch per MoveNext.

Memory: iteration now costs O(1) for contig, O(ndim) for the incrementor's Index[] and internal state on strided. No full-array allocation regardless of source size.

Test impact: 6,748 / 6,748 passing on net8.0 + net10.0 with the CI filter (TestCategory!=OpenBugs&TestCategory!=HighMemory). Smoke tests covering contig / strided / transposed / cross-type / auto-reset / Reset / foreach round-trip all match expected element sequences.
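
As a rough illustration of the strided path described in the message, here is a minimal sketch of an odometer-style offset walker. `OffsetWalker` and its members are hypothetical names invented for this note, not NumSharp's `ValueOffsetIncrementor`; the sketch assumes strides are measured in elements (a stride of 0 marks a broadcast axis) and that traversal is row-major.

```csharp
using System;

// Hypothetical stand-in for an offset incrementor; not NumSharp's implementation.
public sealed class OffsetWalker
{
    private readonly long[] _dims;
    private readonly long[] _strides;   // in elements; 0 marks a broadcast axis
    private readonly long[] _index;     // current coordinate: the only O(ndim) state
    private readonly long _baseOffset;
    private readonly long _size;
    private long _offset;
    private long _remaining;

    public OffsetWalker(long[] dims, long[] strides, long baseOffset = 0)
    {
        if (dims.Length != strides.Length)
            throw new ArgumentException("dims and strides must have the same rank.");
        _dims = (long[])dims.Clone();
        _strides = (long[])strides.Clone();
        _index = new long[dims.Length];
        _baseOffset = baseOffset;
        _size = 1;
        foreach (var d in dims) _size *= d;
        Reset();
    }

    public bool HasNext => _remaining > 0;

    public void Reset()
    {
        Array.Clear(_index, 0, _index.Length);
        _offset = _baseOffset;
        _remaining = _size;
    }

    // Returns the current element offset, then advances in row-major order.
    public long Next()
    {
        if (_remaining <= 0) throw new InvalidOperationException("No more elements.");
        long current = _offset;
        _remaining--;
        for (int axis = _dims.Length - 1; axis >= 0; axis--)
        {
            _index[axis]++;
            _offset += _strides[axis];
            if (_index[axis] < _dims[axis]) break;   // common case: no carry
            // Carry: rewind this axis and continue with the next outer axis.
            _offset -= _strides[axis] * _dims[axis];
            _index[axis] = 0;
        }
        return current;
    }
}
```

Walking a 2x3 transposed view (dims {2, 3}, strides {1, 2}) yields offsets 0, 2, 4, 1, 3, 5. Only the calls that finish a row take the carry branch, which is where the amortized O(1) cost comes from.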
1 parent bb205d3 commit fb4b7dc

1 file changed

Lines changed: 169 additions & 76 deletions

src/NumSharp.Core/Backends/Iterators/NDIterator.cs

@@ -3,28 +3,34 @@
 using System.Collections.Generic;
 using System.Runtime.CompilerServices;
 using NumSharp.Backends;
-using NumSharp.Backends.Iteration;
 using NumSharp.Backends.Unmanaged;
 using NumSharp.Utilities;
 
 namespace NumSharp
 {
     /// <summary>
-    /// Legacy per-element iterator surface preserved for backward compatibility.
+    /// Lazy per-element iterator. Supports contiguous/sliced/strided/broadcast
+    /// source layouts and any source-to-TOut numeric dtype cast, without
+    /// materializing a copy of the iterated data.
     ///
-    /// Internally this is now a thin wrapper over the modern <see cref="NpyIter"/>
-    /// machinery — the iteration is pre-materialized into a flat TOut buffer via
-    /// <see cref="NpyIter.Copy(UnmanagedStorage, UnmanagedStorage)"/> so that
-    /// source layout (contiguous, sliced, broadcast, transposed) and source-to-
-    /// TOut dtype casting are both handled once up front. The resulting buffer
-    /// is then walked by the <see cref="MoveNext"/>, <see cref="HasNext"/>,
-    /// and <see cref="Reset"/> delegates.
+    /// Path selection at construction time picks the fastest MoveNext for the
+    /// concrete layout + cast combination:
     ///
-    /// Trade-off: iteration allocates O(size) memory for the materialized buffer.
-    /// In exchange, per-element MoveNext is a simple pointer index with no
-    /// delegate dispatch or coordinate arithmetic in the hot path, and the
-    /// dtype-dispatch switch that used to live in the 12 partial
-    /// <c>NDIterator.Cast.&lt;T&gt;.cs</c> files is gone entirely.
+    /// <list type="bullet">
+    /// <item>Same-type contiguous (offset = 0, no AutoReset): direct
+    /// <c>*(TOut*)(addr + cursor++)</c> — one pointer increment per call.</item>
+    /// <item>Same-type strided or offset != 0: walks offsets via
+    /// <see cref="ValueOffsetIncrementor"/> / <see cref="ValueOffsetIncrementorAutoresetting"/>,
+    /// reads <c>*(TOut*)(addr + offset)</c>.</item>
+    /// <item>Cross-type: reads the source bytes as the actual src dtype, passes
+    /// through <see cref="Converts.FindConverter{TIn, TOut}"/>, and returns
+    /// the converted TOut. MoveNextReference throws — references into a
+    /// cast value don't exist.</item>
+    /// </list>
+    ///
+    /// AutoReset on non-broadcast iteration is implemented via the incrementor's
+    /// auto-resetting wrapper (or modulo on the contig-scalar-cursor path) so
+    /// iteration cycles forever without allocating.
     /// </summary>
     public unsafe class NDIterator<TOut> : NDIterator, IEnumerable<TOut>, IDisposable
         where TOut : unmanaged
@@ -46,7 +52,7 @@ public unsafe class NDIterator<TOut> : NDIterator, IEnumerable<TOut>, IDisposabl
         /// <summary>Moves to next iteration and returns the next value. Always check <see cref="HasNext"/> first.</summary>
         public Func<TOut> MoveNext;
 
-        /// <summary>Moves to next iteration and returns a reference to the next value.</summary>
+        /// <summary>Moves to next iteration and returns a reference to the next value. Throws when iteration involves a dtype cast.</summary>
         public MoveNextReferencedDelegate<TOut> MoveNextReference;
 
         /// <summary>Returns whether there are more elements to iterate.</summary>
@@ -55,9 +61,6 @@ public unsafe class NDIterator<TOut> : NDIterator, IEnumerable<TOut>, IDisposabl
         /// <summary>Resets the internal cursor to the beginning.</summary>
         public Action Reset;
 
-        // NpyIter-materialized backing storage. Owned by this iterator and released in Dispose().
-        private NDArray _materialized;
-        private long _cursor;
         private bool _disposed;
 
         public NDIterator(IMemoryBlock block, Shape shape, Shape? broadcastedShape, bool autoReset = false)
@@ -68,12 +71,10 @@ public NDIterator(IMemoryBlock block, Shape shape, Shape? broadcastedShape, bool
             Block = block ?? throw new ArgumentNullException(nameof(block));
             Shape = shape;
             BroadcastedShape = broadcastedShape;
-            long effSize = broadcastedShape?.size ?? shape.size;
-            size = effSize;
+            size = broadcastedShape?.size ?? shape.size;
             AutoReset = (broadcastedShape.HasValue && shape.size != broadcastedShape.Value.size) || autoReset;
 
-            Materialize(block, shape, broadcastedShape);
-            SetDelegates();
+            SetDefaults();
         }
 
         public NDIterator(IArraySlice slice, Shape shape, Shape? broadcastedShape, bool autoReset = false)
@@ -85,90 +86,183 @@ public NDIterator(UnmanagedStorage storage, bool autoReset = false)
         public NDIterator(NDArray arr, bool autoReset = false)
             : this(arr?.Storage.InternalArray, arr?.Shape ?? default, null, autoReset) { }
 
-        /// <summary>
-        /// Reconfigure after construction. Any non-default <paramref name="reshape"/>
-        /// triggers a re-materialization of the backing buffer at the new shape.
-        /// </summary>
+        /// <summary>Reconfigure the iterator after construction.</summary>
         public void SetMode(bool autoreset, Shape reshape = default)
         {
             AutoReset = autoreset;
             if (!reshape.IsEmpty)
             {
                 Shape = reshape;
                 size = BroadcastedShape?.size ?? Shape.size;
-                Materialize(Block, Shape, BroadcastedShape);
-                SetDelegates();
             }
+            SetDefaults();
         }
 
-        private void Materialize(IMemoryBlock srcBlock, Shape srcShape, Shape? broadcastedShape)
+        private void SetDefaults()
         {
-            var srcSlice = srcBlock as IArraySlice
-                ?? throw new ArgumentException(
-                    $"NDIterator expected source block to implement IArraySlice; got {srcBlock.GetType()}.");
-
-            // Use CreateBroadcastedUnsafe to bypass the UnmanagedStorage ctor's
-            // "shape.size == slice.Count" check — our srcShape can carry stride=0
-            // broadcast axes whose logical size exceeds the backing slice.
-            var srcStorage = UnmanagedStorage.CreateBroadcastedUnsafe(srcSlice, srcShape);
-
-            // Destination must be freshly C-order-contiguous and writeable, even
-            // when srcShape (or broadcastedShape) carries broadcast stride=0. Drop
-            // the stride metadata by constructing the target shape from dimensions
-            // only — this gives a fresh, writeable, row-major shape.
-            var srcDims = broadcastedShape ?? srcShape;
-            var targetShape = new Shape((long[])srcDims.dimensions.Clone());
-            var targetTypeCode = InfoOf<TOut>.NPTypeCode;
-
-            // NpyIter.Copy broadcasts src -> targetShape and casts
-            // src.typecode -> TOut in one pass.
-            _materialized = new NDArray(targetTypeCode, targetShape, false);
-            NpyIter.Copy(_materialized.Storage, srcStorage);
+            var srcType = Block.TypeCode;
+            var dstType = InfoOf<TOut>.NPTypeCode;
+
+            if (srcType == dstType)
+            {
+                SetDefaults_NoCast();
+                return;
+            }
+
+            SetDefaults_WithCast(srcType);
         }
 
-        private void SetDelegates()
+        // ---------------------------------------------------------------------
+        // Same-type (no cast) — direct pointer reads. Four sub-paths depending
+        // on whether the shape is contiguous-with-zero-offset and whether
+        // AutoReset is active.
+        // ---------------------------------------------------------------------
+
+        private void SetDefaults_NoCast()
         {
-            _cursor = 0;
-            MoveNext = DefaultMoveNext;
-            HasNext = DefaultHasNext;
-            Reset = DefaultReset;
-            MoveNextReference = DefaultMoveNextReference;
+            var localBlock = Block;
+            var localShape = Shape;
+
+            if (localShape.IsContiguous && localShape.offset == 0)
+            {
+                if (AutoReset)
+                {
+                    long localSize = localShape.size;
+                    long cursor = 0;
+                    MoveNext = () =>
+                    {
+                        TOut ret = *((TOut*)localBlock.Address + cursor);
+                        cursor++;
+                        if (cursor >= localSize) cursor = 0;
+                        return ret;
+                    };
+                    MoveNextReference = () =>
+                    {
+                        ref TOut r = ref Unsafe.AsRef<TOut>((TOut*)localBlock.Address + cursor);
+                        cursor++;
+                        if (cursor >= localSize) cursor = 0;
+                        return ref r;
+                    };
+                    Reset = () => cursor = 0;
+                    HasNext = () => true;
+                }
+                else
+                {
+                    long localSize = size;
+                    long cursor = 0;
+                    MoveNext = () => *((TOut*)localBlock.Address + cursor++);
+                    MoveNextReference = () => ref Unsafe.AsRef<TOut>((TOut*)localBlock.Address + cursor++);
+                    Reset = () => cursor = 0;
+                    HasNext = () => cursor < localSize;
+                }
+                return;
+            }
+
+            // Strided / sliced / broadcast — walk offsets via the incrementor.
+            if (AutoReset)
+            {
+                var incr = new ValueOffsetIncrementorAutoresetting(localShape);
+                MoveNext = () => *((TOut*)localBlock.Address + incr.Next());
+                MoveNextReference = () => ref Unsafe.AsRef<TOut>((TOut*)localBlock.Address + incr.Next());
+                Reset = () => incr.Reset();
+                HasNext = () => true;
+            }
+            else
+            {
+                var incr = new ValueOffsetIncrementor(localShape);
+                MoveNext = () => *((TOut*)localBlock.Address + incr.Next());
+                MoveNextReference = () => ref Unsafe.AsRef<TOut>((TOut*)localBlock.Address + incr.Next());
+                Reset = () => incr.Reset();
+                HasNext = () => incr.HasNext;
+            }
         }
 
-        private TOut DefaultMoveNext()
+        // ---------------------------------------------------------------------
+        // Cross-type — same offset-walking strategy, plus a Converts.FindConverter
+        // step that turns the bytes at the source pointer into TOut. MoveNextReference
+        // is not meaningful when a conversion happens, so it throws.
+        // ---------------------------------------------------------------------
+
+        private void SetDefaults_WithCast(NPTypeCode srcType)
         {
-            if (_cursor >= size)
+            MoveNextReference = () => throw new NotSupportedException(
+                "Unable to return references during iteration when casting is involved.");
+
+            switch (srcType)
             {
-                if (AutoReset) _cursor = 0;
-                else throw new InvalidOperationException("NDIterator: no more elements.");
+                case NPTypeCode.Boolean: BuildCastingMoveNext<bool>(); break;
+                case NPTypeCode.Byte: BuildCastingMoveNext<byte>(); break;
+                case NPTypeCode.Int16: BuildCastingMoveNext<short>(); break;
+                case NPTypeCode.UInt16: BuildCastingMoveNext<ushort>(); break;
+                case NPTypeCode.Int32: BuildCastingMoveNext<int>(); break;
+                case NPTypeCode.UInt32: BuildCastingMoveNext<uint>(); break;
+                case NPTypeCode.Int64: BuildCastingMoveNext<long>(); break;
+                case NPTypeCode.UInt64: BuildCastingMoveNext<ulong>(); break;
+                case NPTypeCode.Char: BuildCastingMoveNext<char>(); break;
+                case NPTypeCode.Single: BuildCastingMoveNext<float>(); break;
+                case NPTypeCode.Double: BuildCastingMoveNext<double>(); break;
+                case NPTypeCode.Decimal: BuildCastingMoveNext<decimal>(); break;
+                default: throw new NotSupportedException($"NDIterator: source dtype {srcType} not supported.");
             }
-            return *((TOut*)_materialized.Address + _cursor++);
         }
 
-        private bool DefaultHasNext() => AutoReset || _cursor < size;
+        private void BuildCastingMoveNext<TSrc>() where TSrc : unmanaged
+        {
+            var conv = Converts.FindConverter<TSrc, TOut>();
+            var localBlock = Block;
+            var localShape = Shape;
 
-        private void DefaultReset() => _cursor = 0;
+            if (localShape.IsContiguous && localShape.offset == 0)
+            {
+                if (AutoReset)
+                {
+                    long localSize = localShape.size;
+                    long cursor = 0;
+                    MoveNext = () =>
+                    {
+                        TSrc v = *((TSrc*)localBlock.Address + cursor);
+                        cursor++;
+                        if (cursor >= localSize) cursor = 0;
+                        return conv(v);
+                    };
+                    Reset = () => cursor = 0;
+                    HasNext = () => true;
+                }
+                else
+                {
+                    long localSize = size;
+                    long cursor = 0;
+                    MoveNext = () => conv(*((TSrc*)localBlock.Address + cursor++));
+                    Reset = () => cursor = 0;
+                    HasNext = () => cursor < localSize;
+                }
+                return;
+            }
 
-        private ref TOut DefaultMoveNextReference()
-        {
-            if (_cursor >= size)
+            if (AutoReset)
             {
-                if (AutoReset) _cursor = 0;
-                else throw new InvalidOperationException("NDIterator: no more elements.");
+                var incr = new ValueOffsetIncrementorAutoresetting(localShape);
+                MoveNext = () => conv(*((TSrc*)localBlock.Address + incr.Next()));
+                Reset = () => incr.Reset();
+                HasNext = () => true;
+            }
+            else
+            {
+                var incr = new ValueOffsetIncrementor(localShape);
+                MoveNext = () => conv(*((TSrc*)localBlock.Address + incr.Next()));
+                Reset = () => incr.Reset();
+                HasNext = () => incr.HasNext;
             }
-            return ref Unsafe.AsRef<TOut>((TOut*)_materialized.Address + _cursor++);
         }
 
         public IEnumerator<TOut> GetEnumerator()
         {
-            long n = size;
-            for (long i = 0; i < n; i++)
-                yield return ReadAt(i);
+            var next = MoveNext;
+            var hasNext = HasNext;
+            while (hasNext())
+                yield return next();
         }
 
-        [MethodImpl(MethodImplOptions.AggressiveInlining)]
-        private TOut ReadAt(long i) => *((TOut*)_materialized.Address + i);
-
         IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
 
         public void Dispose()
@@ -178,7 +272,6 @@ public void Dispose()
             Reset = null;
             HasNext = null;
             MoveNextReference = null;
-            _materialized = null;
             _disposed = true;
         }
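
The construction-time path selection in this diff can also be sketched without pointers. The `LazyCursor` type below is a hypothetical stand-in over a managed array (none of its names are NumSharp API) and makes two assumptions: the converter is supplied by the caller rather than looked up, and the source is contiguous. It shows the two ideas the commit relies on: the `MoveNext` delegate is chosen once in the constructor, and auto-reset uses a compare-and-rewind branch instead of a modulo on every call.

```csharp
using System;

// Hypothetical illustration of construction-time path selection; not NumSharp API.
public sealed class LazyCursor<TSrc, TOut>
{
    public readonly Func<TOut> MoveNext;
    public readonly Func<bool> HasNext;
    public readonly Action Reset;

    public LazyCursor(TSrc[] source, Func<TSrc, TOut> convert, bool autoReset)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (convert == null) throw new ArgumentNullException(nameof(convert));

        long size = source.Length;
        long cursor = 0;   // captured by the closures; no element buffer is allocated

        if (autoReset)
        {
            if (size == 0)
                throw new ArgumentException("Cannot auto-reset over an empty source.");
            MoveNext = () =>
            {
                TSrc v = source[cursor];
                cursor++;
                if (cursor >= size) cursor = 0;   // one branch instead of a modulo per call
                return convert(v);
            };
            HasNext = () => true;                 // cycles forever
        }
        else
        {
            // Callers are expected to check HasNext first, as with NDIterator.
            MoveNext = () => convert(source[cursor++]);
            HasNext = () => cursor < size;
        }

        Reset = () => cursor = 0;
    }
}
```

For example, `new LazyCursor<int, double>(new[] { 1, 2, 3 }, v => v * 0.5, autoReset: true)` returns 0.5, 1, 1.5 and then wraps back to 0.5 on the fourth call.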
