
Fix resampling of gapped recordings.#4499

Open
grahamfindlay wants to merge 2 commits into SpikeInterface:main from grahamfindlay:feature/gap_tolerant_resampling

Conversation

@grahamfindlay
Contributor

This handles an edge case that I should have thought of in #4429: when a recording containing gaps is resampled, interpolation of timestamps and data should never span a gap. Both should be done section by section, where a "section" is a gapless (contiguously sampled) stretch of data. (I chose the term "section" to avoid confusion with neo/spikeinterface "blocks" or "segments", which are not required to be gapless.)

This PR does that, and introduces a new gap_tolerance_ms parameter to resample() that controls the behavior:

sp.resample(recording, resample_rate, gap_tolerance_ms=None, ...)

I tried to make this analogous to, and consistent with, the gap_tolerance_ms parameter in neo, discussed [here](NeuralEnsemble/python-neo#1773) by @h-mayorquin and @samuelgarcia and implemented in this PR. The behavior for each value of gap_tolerance_ms is:

  • None (default): If timestamp gaps are detected in the parent time
    vector, a ValueError is raised with a report of the gaps (count, sizes,
    positions).
  • <value> (e.g., 1.0): Opt in to section-wise resampling. Gaps
    larger than the threshold trigger section splitting; smaller gaps are
    treated as continuous.
  • 0.0: Strict mode. Any gap >= 1.5 sample periods triggers a split.
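A minimal sketch of the threshold semantics described above (illustrative only; `gap_threshold_s` is not a function in the PR, just a way to restate the three cases):

```python
def gap_threshold_s(gap_tolerance_ms, sampling_frequency):
    # Illustrative, not the PR's actual code: the effective threshold, in
    # seconds, above which an inter-sample interval counts as a gap.
    # gap_tolerance_ms=0.0 falls back to the strict rule of 1.5 sample
    # periods; None raises instead of opting in to section-wise resampling.
    if gap_tolerance_ms is None:
        raise ValueError(
            "Gaps detected; pass gap_tolerance_ms to opt in to "
            "section-wise resampling."
        )
    return max(gap_tolerance_ms / 1000.0, 1.5 / sampling_frequency)
```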

Internally, gap detection identifies contiguous sections of samples.
Each section is resampled independently --
both timestamps and traces -- then the results are concatenated. Margins for
edge-artifact reduction are clamped to section boundaries so that FFT
processing never crosses a gap. This is the most obnoxious part: I really wanted to find a way to let scipy.signal handle the reflect padding in this case, but I could not, so I handle the padding myself, which adds a fair bit of complexity to the code.
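The clamping idea can be sketched like this (hypothetical names; `clamp_margins` is not the PR's actual helper): the requested margins shrink so that the padded chunk stays inside the current section.

```python
def clamp_margins(start_frame, end_frame, margin, section_start, section_end):
    # Illustrative sketch: shrink the left/right margins so the padded chunk
    # [start_frame - left, end_frame + right) never extends past the
    # boundaries of the current gapless section.
    left = max(0, min(margin, start_frame - section_start))
    right = max(0, min(margin, section_end - end_frame))
    return left, right
```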

When no gaps are detected (e.g., if the recording has no time vector), the existing code path executes unchanged.

I tested this on a bunch of my own gapped recordings, and it seems to work. I then had my buddy Claude ;) write a bunch of tests (included here), modeled on those recordings but using synthetic gapped data.

Because of the antialiasing filter applied during decimate(), a similar approach should technically be taken there (so that the AA filter applied to one section is not influenced by data from a neighboring section), but the scope of this PR is limited to resample(). Errors introduced by gaps in decimate() should be small and mostly negligible.

@alejoe91 alejoe91 requested a review from h-mayorquin April 9, 2026 09:53
Collaborator

@h-mayorquin h-mayorquin left a comment


This makes sense to me. I still can't fully wrap my head around exactly how resampling goes bad at the gaps, but of course we should account for them. Thanks as well for taking a look at the neo implementation regarding gaps; it is great to unify the API.

Some comments on the implementation:

  1. Pre-allocate memory instead of concatenating. In _get_traces_gapped, you know the output size upfront (end_frame - start_frame). Pre-allocating a single array and filling slices would avoid the intermediate allocations from appending to pieces and the final np.concatenate. This is the pattern we use elsewhere, e.g. in NoiseGeneratorRecording.

  2. Unify the code paths. Rather than having get_traces dispatch to _get_traces_gapped, could _get_traces_gapped just become the single get_traces? A gap-free recording is simply one section spanning the entire trace, so the gapped logic handles it naturally with one iteration of the loop. This would eliminate the duplication between the two methods.

  3. Reuse get_chunk_with_margin. The margin clamping, fetching, and reflect-padding logic in _get_traces_gapped reimplements what get_chunk_with_margin already does. The difference is that you clamp to section boundaries instead of recording boundaries, but I think this could be handled by passing the section length as the effective boundary. Worth exploring to avoid duplicating that logic.

  4. Extract gap detection into a shared utility. The gap detection and section boundary computation (~30 lines in __init__) could live as a standalone function in recording_tools.py, something like detect_sections(time_vector, sampling_frequency, gap_tolerance_ms) that returns the (n_sections, 2) boundary array. This would make the __init__ easier to read and the utility reusable if other preprocessing steps that use filters (e.g., FilterRecording) need gap awareness later. What do you think?

  5. Tests. The coverage is solid. A couple of minor notes: there's no test for integer ratio with gaps on traces (the trace-correctness tests all use non-integer ratio), and test_resample_traces_across_gap accesses private attributes (_sec_n_out) to find the split point between sections. You could derive that from the output time vector instead (find the gap in resampled timestamps), which would decouple the test from the internal representation.
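For item 1, the pre-allocation pattern might look like this sketch (names are illustrative, not the PR's actual code): compute the total output size up front, allocate once, and fill slices in place instead of appending pieces and calling np.concatenate at the end.

```python
import numpy as np

def assemble_sections(section_arrays, num_channels, dtype=np.float32):
    # Illustrative sketch of the pre-allocation pattern: one allocation for
    # the full (num_samples, num_channels) output, then in-place slice fills.
    total = sum(piece.shape[0] for piece in section_arrays)
    out = np.empty((total, num_channels), dtype=dtype)
    offset = 0
    for piece in section_arrays:
        out[offset:offset + piece.shape[0]] = piece
        offset += piece.shape[0]
    return out
```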
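For item 4, a sketch of the proposed detect_sections utility. The signature comes from the comment above; the internals (in particular the threshold rule) are an illustrative guess, not the PR's actual implementation.

```python
import numpy as np

def detect_sections(time_vector, sampling_frequency, gap_tolerance_ms):
    # Sketch: return an (n_sections, 2) array of [start, stop) sample
    # indices, one row per gapless section. A gap is any inter-sample
    # interval >= the tolerance, with a floor of 1.5 sample periods.
    threshold = max(gap_tolerance_ms / 1000.0, 1.5 / sampling_frequency)
    gap_after = np.nonzero(np.diff(time_vector) >= threshold)[0]
    starts = np.concatenate(([0], gap_after + 1))
    stops = np.concatenate((gap_after + 1, [time_vector.size]))
    return np.column_stack((starts, stops))
```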
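For item 5, deriving the split point from the output time vector instead of the private _sec_n_out attribute could look like this sketch (hypothetical helper name; the 1.5-period threshold is an assumption):

```python
import numpy as np

def split_points_from_timestamps(resampled_times, resample_rate):
    # Sketch: find section boundaries in the resampled output by locating
    # gaps in its time vector, rather than reading internal attributes.
    # Returns the index of the first sample after each gap.
    deltas = np.diff(resampled_times)
    return np.nonzero(deltas > 1.5 / resample_rate)[0] + 1
```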

All of these are strict improvements, and I think the current implementation is good to go as-is. We can track them in a follow-up issue. The only one that I feel should be done here is the pre-allocation (1), and maybe the tests (5), but I will leave that to @alejoe91 and you to decide.
