
Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset

This repository accompanies Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset by Gereon Elvers, Gilad Landau, and Oiwi Parker Jones. It contains the tutorial, experiment notebooks, and supporting assets used to demonstrate non-invasive neural keyword spotting on the LibriBrain MEG corpus.

Repository Layout

  • tutorial/ – Colab-ready walkthrough that trains a keyword spotter on a 10% LibriBrain subset within one hour on a T4 GPU.
  • experiments/ – Reproducibility notebooks covering scaling, buffer length, keyword vocabulary, and result cleanup (see experiments/README.md).

Prerequisites

  1. Python: Python 3.10 or later is recommended. Use a dedicated virtual environment (e.g., python -m venv .venv && source .venv/bin/activate).
  2. PNPL library: Keyword spotting support now ships with the main PNPL package, so no fork is required. Install the latest version directly from the upstream repository:
    pip install "git+https://github.com/neural-processing-lab/pnpl.git"
  3. LibriBrain dataset: The PNPL package downloads it automatically on first use. Alternatively, download it manually here.
  4. Experiment logging (optional): Some experiment notebooks log to Neptune. Export NEPTUNE_API_TOKEN and NEPTUNE_PROJECT before running if you wish to capture metrics (see experiments/README.md).

Working with LibriBrainWord

The PNPL LibriBrainWord dataset object offers full-signal, single-keyword, and multi-keyword configurations. After installing PNPL, the following examples cover common use cases:

Full-signal word classification:

from pnpl.datasets import LibriBrainWord

dataset = LibriBrainWord(
    data_path="./data/",
    partition="train",
    tmin=0.0,
    tmax=0.8,
)

Single-keyword detection:

from pnpl.datasets import LibriBrainWord

dataset = LibriBrainWord(
    data_path="./data/",
    partition="train",
    keyword_detection="watson",
)

Multi-keyword detection:

from pnpl.datasets import LibriBrainWord

dataset = LibriBrainWord(
    data_path="./data/",
    partition="train",
    keyword_detection=["sherlock", "holmes"],
)

Notes

  • For keyword detection, the window length adapts to the longest keyword; extend it with positive_buffer / negative_buffer, or override it entirely via tmin and tmax.
  • Full signal-to-word mode keeps sensible window defaults; tmin and tmax overrides take effect only when explicitly provided.
  • Keyword-aware splits validate that requested keywords exist and fall back to the sessions with the highest keyword prevalence for the validation/test partitions.
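The window-sizing behavior described above can be illustrated with simple arithmetic. This is an assumed reading of the notes, not PNPL's actual implementation: the helper name effective_window and its buffer semantics are hypothetical.

```python
# Hypothetical sketch of the window-sizing rule from the notes: the base
# window covers the longest keyword, and the positive/negative buffers
# extend it. Not PNPL's actual code.
def effective_window(longest_keyword_s: float,
                     positive_buffer_s: float = 0.0,
                     negative_buffer_s: float = 0.0) -> float:
    return longest_keyword_s + positive_buffer_s + negative_buffer_s

# A 0.6 s longest keyword with 0.2 s buffers on each side -> 1.0 s window.
print(effective_window(0.6, 0.2, 0.2))
```

Passing explicit tmin / tmax would bypass this adaptive sizing entirely.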

Citation

If this work helps your research, please cite:

@inproceedings{elvers2025elementary,
  title = {Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset},
  author = {Elvers, Gereon and Landau, Gilad and Parker Jones, Oiwi},
  booktitle = {Data on the Brain \& Mind Workshop at NeurIPS 2025},
  year = {2025},
  url = {https://data-brain-mind.github.io/},
}

Contact

For questions or collaboration opportunities, please open an issue or contact the authors through the Neural Processing Lab.