Skip to content

Commit e048152

Browse files
Merge pull request #17 from thomasisensee/linear-stencil
Linear and cubic interpolation stencils
2 parents 672abf0 + 440c03c commit e048152

11 files changed

Lines changed: 569 additions & 121 deletions

File tree

README.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -108,6 +108,20 @@ available before `cmake --build build`.
108108
When `ndtbl_ENABLE_MMAP=ON`, the C++ `read_group()` path uses read-only memory
109109
mapping on supported POSIX platforms.
110110

111+
## 📐 Interpolation
112+
113+
The standard C++ lookup path uses multilinear interpolation through explicit
114+
`evaluate_all_linear()` and `Grid::prepare_linear()` calls. This path uses
115+
`2^Dim` table points per query and keeps the hot path allocation-free.
116+
117+
The C++ API also exposes experimental local tensor-product cubic interpolation
118+
through explicit `evaluate_all_cubic()` and `Grid::prepare_cubic()` calls. Cubic
119+
interpolation uses `4^Dim` table points, can be much more expensive in high
120+
dimensions, and may overshoot smooth-looking table data enough to produce
121+
unwanted values. Bounds handling is independent of interpolation order:
122+
queries outside the table domain can either clamp or throw according to the
123+
selected `bounds_policy`.
124+
111125
## 🐍 Python Package
112126

113127
The repository also ships a separate Python package in `python/ndtbl/`.

benchmarks/README.md

Lines changed: 11 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ the next precomputed query modulo the ring size.
2828

2929
## Measured Operations
3030

31-
`bench_prepare` measures only `Grid::prepare(query)`. This isolates axis
31+
`bench_prepare` measures only `Grid::prepare_linear(query)`. This isolates axis
3232
bracketing and interpolation-stencil construction. It is registered once per
3333
dimension and axis layout because field count does not affect stencil
3434
preparation.
@@ -39,15 +39,21 @@ stencil and isolates interpolation over all fields. It is registered with `2`,
3939
`4`, and `8` fields.
4040

4141
`bench_typed_combined` measures
42-
`FieldGroup::evaluate_all_into(query, results)`. This is the typed end-to-end
43-
path from query coordinates to interpolated field values. It is registered with
44-
`2`, `4`, and `8` fields.
42+
`FieldGroup::evaluate_all_linear_into(query, results)`. This is the typed
43+
end-to-end path from query coordinates to interpolated field values. It is
44+
registered with `2`, `4`, and `8` fields.
4545

4646
`bench_runtime_combined` measures
47-
`RuntimeFieldGroup::evaluate_all_into(query, results)`. This covers the
47+
`RuntimeFieldGroup::evaluate_all_linear_into(query, results)`. This covers the
4848
runtime-erased path, including its wrapper dispatch and scratch-buffer copy. It
4949
is registered with `2`, `4`, and `8` fields.
5050

51+
`bench_typed_cubic_combined` measures one focused cubic case:
52+
`FieldGroup::evaluate_all_cubic_into(query, results)` for a 4D uniform table
53+
with `4` fields. Cubic interpolation uses `4^Dim` table points, so a 4D case
54+
already exercises the important 256-point stencil cost. The suite intentionally
55+
does not expand cubic across every dimension, axis layout, or field count.
56+
5157
## Build And Run
5258

5359
Configure and build the benchmark target:

benchmarks/lookup_benchmarks.cpp

Lines changed: 48 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -53,7 +53,7 @@ struct LookupContext
5353
ndtbl::FieldGroup<double, Dim> group;
5454
ndtbl::RuntimeFieldGroup<Dim> runtime_group;
5555
std::vector<std::array<double, Dim>> queries;
56-
ndtbl::PreparedQuery<Dim> prepared;
56+
ndtbl::LinearStencil<Dim> prepared;
5757

5858
LookupContext(const ndtbl::Grid<Dim>& grid_in,
5959
const ndtbl::FieldGroup<double, Dim>& group_in,
@@ -62,7 +62,7 @@ struct LookupContext
6262
, group(group_in)
6363
, runtime_group(group)
6464
, queries(queries_in)
65-
, prepared(grid.prepare(queries.front()))
65+
, prepared(grid.prepare_linear(queries.front()))
6666
{
6767
}
6868
};
@@ -287,7 +287,7 @@ context(std::size_t extent, ndtbl::axis_kind axis_kind, std::size_t field_count)
287287
/**
288288
* @brief Benchmark axis bracketing and interpolation-stencil preparation.
289289
*
290-
* Measures `Grid::prepare(query)` only.
290+
* Measures `Grid::prepare_linear(query)` only.
291291
*
292292
* @tparam Dim Benchmark dimensionality.
293293
* @param state Google Benchmark state.
@@ -305,7 +305,8 @@ bench_prepare(benchmark::State& state,
305305
std::size_t query = 0;
306306

307307
for (auto _ : state) {
308-
ndtbl::PreparedQuery<Dim> prepared = data.grid.prepare(data.queries[query]);
308+
ndtbl::LinearStencil<Dim> prepared =
309+
data.grid.prepare_linear(data.queries[query]);
309310
benchmark::DoNotOptimize(prepared);
310311
query = (query + 1) % data.queries.size();
311312
}
@@ -342,8 +343,8 @@ bench_prepared_evaluate(benchmark::State& state,
342343
/**
343344
* @brief Benchmark typed end-to-end lookup from query coordinates.
344345
*
345-
* Measures `FieldGroup::evaluate_all_into(query, results)`, including stencil
346-
* preparation and interpolation.
346+
* Measures `FieldGroup::evaluate_all_linear_into(query, results)`, including
347+
* stencil preparation and interpolation.
347348
*
348349
* @tparam Dim Benchmark dimensionality.
349350
* @param state Google Benchmark state.
@@ -363,7 +364,38 @@ bench_typed_combined(benchmark::State& state,
363364
std::size_t query = 0;
364365

365366
for (auto _ : state) {
366-
data.group.evaluate_all_into(data.queries[query], results.data());
367+
data.group.evaluate_all_linear_into(data.queries[query], results.data());
368+
benchmark::DoNotOptimize(results.data());
369+
benchmark::ClobberMemory();
370+
query = (query + 1) % data.queries.size();
371+
}
372+
}
373+
374+
/**
375+
* @brief Benchmark typed end-to-end cubic lookup from query coordinates.
376+
*
377+
* Measures `FieldGroup::evaluate_all_cubic_into(query, results)`, including
378+
* cubic stencil preparation and interpolation.
379+
*
380+
* @tparam Dim Benchmark dimensionality.
381+
* @param state Google Benchmark state.
382+
* @param extent Number of support points per axis.
383+
* @param axis_kind Axis representation to benchmark.
384+
* @param field_count Number of fields to evaluate at each lookup.
385+
*/
386+
template<std::size_t Dim>
387+
void
388+
bench_typed_cubic_combined(benchmark::State& state,
389+
std::size_t extent,
390+
ndtbl::axis_kind axis_kind,
391+
std::size_t field_count)
392+
{
393+
const LookupContext<Dim>& data = context<Dim>(extent, axis_kind, field_count);
394+
std::vector<double> results(data.group.field_count(), 0.0);
395+
std::size_t query = 0;
396+
397+
for (auto _ : state) {
398+
data.group.evaluate_all_cubic_into(data.queries[query], results.data());
367399
benchmark::DoNotOptimize(results.data());
368400
benchmark::ClobberMemory();
369401
query = (query + 1) % data.queries.size();
@@ -373,8 +405,8 @@ bench_typed_combined(benchmark::State& state,
373405
/**
374406
* @brief Benchmark runtime-erased end-to-end lookup from query coordinates.
375407
*
376-
* Measures `RuntimeFieldGroup::evaluate_all_into(query, results)`, including
377-
* wrapper dispatch and scratch-buffer handling.
408+
* Measures `RuntimeFieldGroup::evaluate_all_linear_into(query, results)`,
409+
* including wrapper dispatch and scratch-buffer handling.
378410
*
379411
* @tparam Dim Benchmark dimensionality.
380412
* @param state Google Benchmark state.
@@ -394,7 +426,8 @@ bench_runtime_combined(benchmark::State& state,
394426
std::size_t query = 0;
395427

396428
for (auto _ : state) {
397-
data.runtime_group.evaluate_all_into(data.queries[query], results.data());
429+
data.runtime_group.evaluate_all_linear_into(data.queries[query],
430+
results.data());
398431
benchmark::DoNotOptimize(results.data());
399432
benchmark::ClobberMemory();
400433
query = (query + 1) % data.queries.size();
@@ -440,6 +473,11 @@ NDTBL_REGISTER_LOOKUP_BENCHMARKS(4,
440473
default_extent<4>(),
441474
ndtbl::axis_kind::uniform,
442475
d4_uniform);
476+
BENCHMARK_CAPTURE(bench_typed_cubic_combined<4>,
477+
d4_uniform_fields_4_cubic_combined,
478+
default_extent<4>(),
479+
ndtbl::axis_kind::uniform,
480+
4);
443481
NDTBL_REGISTER_LOOKUP_BENCHMARKS(4,
444482
default_extent<4>(),
445483
ndtbl::axis_kind::explicit_coordinates,

codecov.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -8,11 +8,11 @@ coverage:
88
status:
99
project:
1010
cpp:
11-
target: auto
11+
target: 90
1212
flags:
1313
- cpp
1414
python:
15-
target: auto
15+
target: 90
1616
flags:
1717
- python
1818

doc/cppapi.rst

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,13 @@ API reference
1212
.. doxygenclass:: ndtbl::Grid
1313
:members:
1414

15-
.. doxygenclass:: ndtbl::PreparedQuery
15+
.. doxygenclass:: ndtbl::TensorStencil
1616
:members:
1717

18+
.. doxygentypedef:: ndtbl::LinearStencil
19+
20+
.. doxygentypedef:: ndtbl::CubicStencil
21+
1822
.. doxygenclass:: ndtbl::FieldGroup
1923
:members:
2024

include/ndtbl/field_group.hpp

Lines changed: 69 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -18,10 +18,9 @@ namespace ndtbl {
1818
* memory.
1919
*
2020
* The storage layout is point-major in row-major grid order:
21-
* `point0.field0, point0.field1, ..., point1.field0, ...`
22-
* where the last grid axis varies fastest before stepping to the next field
23-
* tuple.
24-
* so that one prepared interpolation query can accumulate all fields together.
21+
* `point0.field0, point0.field1, ..., point1.field0, ...` where the last grid
22+
* axis varies fastest before stepping to the next field tuple. One prepared
23+
* interpolation stencil can accumulate all fields together.
2524
*/
2625
template<class Value, std::size_t Dim>
2726
class FieldGroup
@@ -137,34 +136,37 @@ class FieldGroup
137136
* @brief Evaluate all fields using a previously prepared interpolation
138137
* stencil.
139138
*
140-
* @param prepared Prepared query to reuse across fields.
139+
* @tparam Stencil Fixed-size interpolation stencil type.
140+
* @param stencil Prepared stencil to reuse across fields.
141141
* @return Interpolated field values in storage order.
142-
* @see Grid::prepare
143-
* @see evaluate_all(const std::array<double, Dim>&)
142+
* @see Grid::prepare_linear
143+
* @see Grid::prepare_cubic
144+
* @see evaluate_all_linear(const std::array<double, Dim>&)
144145
*/
145-
std::vector<Value> evaluate_all(const PreparedQuery<Dim>& prepared) const
146+
template<class Stencil>
147+
std::vector<Value> evaluate_all(const Stencil& stencil) const
146148
{
147149
std::vector<Value> results(field_count(), Value(0));
148-
evaluate_all_into(prepared, results.data());
150+
evaluate_all_into(stencil, results.data());
149151
return results;
150152
}
151153

152154
/**
153155
* @brief Evaluate all fields using a previously prepared interpolation
154156
* stencil into caller-provided storage.
155157
*
156-
* @param prepared Prepared query to reuse across fields.
158+
* @tparam Stencil Fixed-size interpolation stencil type.
159+
* @param stencil Prepared stencil to reuse across fields.
157160
* @param results Output buffer with space for `field_count()` values.
158-
* @see evaluate_all(const PreparedQuery<Dim>&)
161+
* @see evaluate_all(const Stencil&)
159162
*/
160-
void evaluate_all_into(const PreparedQuery<Dim>& prepared,
161-
Value* results) const
163+
template<class Stencil>
164+
void evaluate_all_into(const Stencil& stencil, Value* results) const
162165
{
163166
std::fill(results, results + field_count(), Value(0));
164-
for (std::size_t corner = 0; corner < PreparedQuery<Dim>::corners;
165-
++corner) {
166-
const double weight = prepared.weight(corner);
167-
const std::size_t base = prepared.point_index(corner) * field_count();
167+
for (std::size_t point = 0; point < Stencil::points; ++point) {
168+
const double weight = stencil.weight(point);
169+
const std::size_t base = stencil.point_index(point) * field_count();
168170
for (std::size_t field = 0; field < field_count(); ++field) {
169171
results[field] +=
170172
static_cast<Value>(weight * interleaved_values_[base + field]);
@@ -173,34 +175,73 @@ class FieldGroup
173175
}
174176

175177
/**
176-
* @brief Evaluate all fields directly from query coordinates.
178+
* @brief Evaluate all fields directly from query coordinates using
179+
* multilinear interpolation.
177180
*
178181
* @param coordinates Query coordinates in grid axis order.
179182
* @return Interpolated field values in storage order.
180183
* @param policy Bounds handling behavior for out-of-domain coordinates.
181-
* @see evaluate_all(const PreparedQuery<Dim>&)
184+
* @see evaluate_all(const Stencil&)
182185
*/
183-
std::vector<Value> evaluate_all(
186+
std::vector<Value> evaluate_all_linear(
184187
const std::array<double, Dim>& coordinates,
185188
bounds_policy policy = bounds_policy::clamp) const
186189
{
187-
return evaluate_all(grid_.prepare(coordinates, policy));
190+
return evaluate_all(grid_.prepare_linear(coordinates, policy));
188191
}
189192

190193
/**
191-
* @brief Evaluate all fields directly from query coordinates into
192-
* caller-provided storage.
194+
* @brief Evaluate all fields directly from query coordinates using
195+
* multilinear interpolation into caller-provided storage.
193196
*
194197
* @param coordinates Query coordinates in grid axis order.
195198
* @param results Output buffer with space for `field_count()` values.
196199
* @param policy Bounds handling behavior for out-of-domain coordinates.
197-
* @see evaluate_all_into(const PreparedQuery<Dim>&, Value*)
200+
* @see evaluate_all_into(const Stencil&, Value*)
198201
*/
199-
void evaluate_all_into(const std::array<double, Dim>& coordinates,
200-
Value* results,
201-
bounds_policy policy = bounds_policy::clamp) const
202+
void evaluate_all_linear_into(
203+
const std::array<double, Dim>& coordinates,
204+
Value* results,
205+
bounds_policy policy = bounds_policy::clamp) const
206+
{
207+
evaluate_all_into(grid_.prepare_linear(coordinates, policy), results);
208+
}
209+
210+
/**
211+
* @brief Evaluate all fields directly from query coordinates using local
212+
* cubic interpolation.
213+
*
214+
* Cubic interpolation uses four support points per axis and is therefore
215+
* intended for experiments where the additional cost and possible overshoot
216+
* are acceptable.
217+
*
218+
* @param coordinates Query coordinates in grid axis order.
219+
* @param policy Bounds handling behavior for out-of-domain coordinates.
220+
* @return Cubically interpolated field values in storage order.
221+
* @see Grid::prepare_cubic
222+
*/
223+
std::vector<Value> evaluate_all_cubic(
224+
const std::array<double, Dim>& coordinates,
225+
bounds_policy policy = bounds_policy::clamp) const
226+
{
227+
return evaluate_all(grid_.prepare_cubic(coordinates, policy));
228+
}
229+
230+
/**
231+
* @brief Evaluate all fields directly from query coordinates using local
232+
* cubic interpolation into caller-provided storage.
233+
*
234+
* @param coordinates Query coordinates in grid axis order.
235+
* @param results Output buffer with space for `field_count()` values.
236+
* @param policy Bounds handling behavior for out-of-domain coordinates.
237+
* @see evaluate_all_cubic
238+
*/
239+
void evaluate_all_cubic_into(
240+
const std::array<double, Dim>& coordinates,
241+
Value* results,
242+
bounds_policy policy = bounds_policy::clamp) const
202243
{
203-
evaluate_all_into(grid_.prepare(coordinates, policy), results);
244+
evaluate_all_into(grid_.prepare_cubic(coordinates, policy), results);
204245
}
205246

206247
private:

0 commit comments

Comments
 (0)