Skip to content

Changes to Ephys Schema#65

Open
judewerth wants to merge 2 commits into
dj-sciops:mainfrom
judewerth:main
Open

Changes to Ephys Schema#65
judewerth wants to merge 2 commits into
dj-sciops:mainfrom
judewerth:main

Conversation

@judewerth

Copy link
Copy Markdown
  1. map channel to electrode function
  2. Fix bug in trace extraction (currently doesn't account for file boundaries
  3. Add impedance measurement tables
  4. Add STTC tables

@MilagrosMarin

1) map channel to electrode function
2) Fix bug in trace extraction (currently doesn't account for file boundaries
3) Add impedance measurement tables
4) Add STTC tables
MilagrosMarin added a commit that referenced this pull request Mar 26, 2026
…ies, STTC, ImpedanceMeasurements)

- Move neo/quantities/elephant imports to lazy imports inside STTC.make()
  to avoid breaking the module in environments without these packages
- Fix map_channel_to_electrode: restrict ElectrodeConfig.Electrode fetch
  by probe_type (was unrestricted), add conflict detection across multiple
  configs, and deduplicate consistent mappings safely
- Fix LFP boundary trimming: replace elif with two independent if blocks
  so single-file sessions correctly trim both start and end boundaries
- Replace # REMOVE LATER comment in STTC.make() with proper explanation
- Fix ImpedanceMeasurements.make(): move EphysSessionProbe existence check
  before the fetch to prevent KeyError on empty result

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@MilagrosMarin MilagrosMarin self-requested a review March 27, 2026 01:43
@MilagrosMarin

MilagrosMarin commented Mar 27, 2026

Copy link
Copy Markdown

Potential memory crash in STTC.make()np.subtract.outer is O(n²)

The spike time latency calculation uses:

diff_matrix = np.abs(np.subtract.outer(spikes_A, spikes_B))
closest_spikes = np.min(diff_matrix, axis=1)

This allocates a full (len(spikes_A) × len(spikes_B)) matrix in memory. Checked against the current pipeline data:

  • O09 session (already in STTC): max firing rate ~70 Hz over 15 min → ~63k spikes → ~30 GB matrix for the two highest-firing units
  • MB03 session (in QualityMetrics): units at ~1,386 Hz → ~415k spikes → matrix would be in the terabyte range

The existing STTC rows only cover low-firing units (unit 0 at ~1.8 Hz) so it hasn't crashed yet, but it will as soon as higher-firing units are included.

Suggestion only: a memory-safe alternative using binary search (O(n log n) time, O(n) memory), but please test and validate against your existing results before adopting:

sorted_B = np.sort(spikes_B)
idx = np.searchsorted(sorted_B, spikes_A, side='left')
left_idx = np.clip(idx - 1, 0, len(sorted_B) - 1)
right_idx = np.clip(idx, 0, len(sorted_B) - 1)
closest_spikes = np.minimum(
    np.abs(spikes_A - sorted_B[left_idx]),
    np.abs(spikes_A - sorted_B[right_idx]),
)
spike_time_latencies = closest_spikes[closest_spikes <= dt]

@MilagrosMarin

Copy link
Copy Markdown

Minor: .astype(int) truncates spike time precision

Spike times are converted to integer milliseconds before being passed to neo.SpikeTrain:

spikes_A = (spike_times[i] * ...).astype(int)

Two concerns:

  1. Precision loss: At 30 kHz, one sample = 0.033 ms — truncation loses up to ~1 ms per spike. With dt=20ms that's ~5% error at the boundary. For a statistical metric this is minor, but worth a methods note in the manuscript.
  2. Duplicate spike times: At high firing rates (e.g. MB03 at ~1,386 Hz), two spikes within the same millisecond window map to the same integer, which could cause errors in neo.SpikeTrain (expects monotonically increasing times).

neo.SpikeTrain accepts float milliseconds, so removing .astype(int) would fix both. Suggestion only — please validate against existing results before changing.

@MilagrosMarin MilagrosMarin left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: ephys. prefix in foreign keys within ephys_no_curation.py

Two new table definitions in this PR use -> ephys.X to reference tables defined in this same file:

  • ImpedanceFile (line 518): -> ephys.EphysRawFile
  • STTC (line 1262): -> ephys.CuratedClustering

Why this is fragile: ephys is not imported in ephys_no_curation.py. The reference resolves only because the workflow's ephys.py sets linking_module=__name__ where from element_array_ephys import ephys_no_curation as ephys. Any other project activating this element without that exact alias will get a runtime resolution error.

Both tables should reference these classes directly since they live in the same module:

# ImpedanceFile (line 518)
-> EphysRawFile

# STTC (line 1262)
-> CuratedClustering

Note: STTC.make() already uses CuratedClustering.Unit & key (no prefix), confirming the inconsistency.

"""

definition = """
-> ephys.CuratedClustering

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> ephys.CuratedClustering uses ephys which is not defined in this file — it only resolves via the workflow's linking module alias (import ephys_no_curation as ephys). Since CuratedClustering is defined in this same file, use -> CuratedClustering directly. Same issue at line 518 for ImpedanceFile -> ephys.EphysRawFile → should be -> EphysRawFile.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants