
Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset

This repository accompanies Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset by Gereon Elvers, Gilad Landau, and Oiwi Parker Jones. It contains the tutorial, experiment notebooks, and supporting assets used to demonstrate non-invasive neural keyword spotting on the LibriBrain MEG corpus.

Repository Layout

  • tutorial/ – Colab-ready walkthrough that trains a keyword spotter on a 10% LibriBrain subset within one hour on a T4 GPU.
  • experiments/ – Reproducibility notebooks covering scaling, buffer length, keyword vocabulary, and result cleanup (see experiments/README.md).

Prerequisites

  1. Python: Python 3.10 or later is recommended. Use a dedicated virtual environment (e.g., python -m venv .venv && source .venv/bin/activate).
  2. PNPL library: Keyword spotting support now ships with the main PNPL package, so no fork is required. Install the latest version directly from the upstream repository:
    pip install "git+https://github.com/neural-processing-lab/pnpl.git"
  3. LibriBrain dataset: The PNPL package downloads it automatically on first use. Alternatively, download it manually here.
  4. Experiment logging (optional): Some experiment notebooks log to Neptune. Export NEPTUNE_API_TOKEN and NEPTUNE_PROJECT before running if you wish to capture metrics (see experiments/README.md).

Working with LibriBrainWord

The PNPL LibriBrainWord dataset object offers full-signal, single-keyword, and multi-keyword configurations. After installing PNPL, the following examples cover common use cases:

Full-signal word classification:

from pnpl.datasets import LibriBrainWord

dataset = LibriBrainWord(
    data_path="./data/",
    partition="train",
    tmin=0.0,
    tmax=0.8,
)

Single-keyword detection:

from pnpl.datasets import LibriBrainWord

dataset = LibriBrainWord(
    data_path="./data/",
    partition="train",
    keyword_detection="watson",
)

Multi-keyword detection:

from pnpl.datasets import LibriBrainWord

dataset = LibriBrainWord(
    data_path="./data/",
    partition="train",
    keyword_detection=["sherlock", "holmes"],
)

Notes

  • For keyword detection, the window length adapts to the longest keyword; extend it with positive_buffer / negative_buffer, or override it entirely via tmin and tmax.
  • Full signal-to-word mode keeps sensible window defaults; tmin and tmax overrides take effect only when explicitly provided.
  • Keyword-aware splits validate that requested keywords exist and fall back to the sessions with the highest keyword prevalence for the validation/test partitions.
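The window-sizing behavior described above can be illustrated with simple arithmetic. This is an assumed reading of the notes, not PNPL's actual implementation: the helper name effective_window and its buffer semantics are hypothetical.

```python
# Hypothetical sketch of the window-sizing rule from the notes: the base
# window covers the longest keyword, and the positive/negative buffers
# extend it. Not PNPL's actual code.
def effective_window(longest_keyword_s: float,
                     positive_buffer_s: float = 0.0,
                     negative_buffer_s: float = 0.0) -> float:
    return longest_keyword_s + positive_buffer_s + negative_buffer_s

# A 0.6 s longest keyword with 0.2 s buffers on each side -> 1.0 s window.
print(effective_window(0.6, 0.2, 0.2))
```

Passing explicit tmin / tmax would bypass this adaptive sizing entirely.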

Citation

If this work helps your research, please cite:

@inproceedings{elvers2025elementary,
  title = {Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset},
  author = {Elvers, Gereon and Landau, Gilad and Parker Jones, Oiwi},
  booktitle = {Data on the Brain \& Mind Workshop at NeurIPS 2025},
  year = {2025},
  url = {https://data-brain-mind.github.io/},
}

Contact

For questions or collaboration opportunities, please open an issue or contact the authors through the Neural Processing Lab.