
Commit 219be79

Docs: add Jupyter Book in docs/ with Pages workflow; initial content (install, quickstart, datasets, LibriBrain)
1 parent 86381da commit 219be79

11 files changed

Lines changed: 335 additions & 1 deletion

File tree

.github/workflows/docs.yml

Lines changed: 53 additions & 0 deletions
```yaml
name: Docs

on:
  push:
    branches: [ main ]
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages
  cancel-in-progress: true

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install doc deps
        run: |
          python -m pip install --upgrade pip
          pip install -r docs/requirements.txt

      - name: Build Jupyter Book
        run: |
          jupyter-book build docs/

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: docs/_build/html

  deploy:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    runs-on: ubuntu-latest
    needs: build
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4
```

README.md

Lines changed: 7 additions & 1 deletion
## Support

In case of any questions or problems, please get in touch through [our Discord server](https://discord.gg/Fqr8gJnvSh).

## Documentation

We publish documentation with Jupyter Book and GitHub Pages.

- Live site: enable “GitHub Pages → Source: GitHub Actions” in the repository settings (the first run deploys automatically).
- Build locally: `pip install -r docs/requirements.txt && jupyter-book build docs/`

docs/_config.yml

Lines changed: 34 additions & 0 deletions
```yaml
################################################################################
# Jupyter Book configuration
################################################################################

title: PNPL
author: Neural Processing Lab
logo: null
only_build_toc_files: true

repository:
  url: https://github.com/neural-processing-lab/pnpl-public
  path_to_book: docs
  branch: main

html:
  favicon: null
  use_repository_button: true
  use_issues_button: true
  use_edit_page_button: true
  extra_navbar: "PNPL public package"
  extra_footer: "© Neural Processing Lab"

execute:
  execute_notebooks: cache

parse:
  myst_enable_extensions:
    - colon_fence
    - deflist
    - dollarmath
    - html_image
    - linkify
    - substitution
```

docs/_toc.yml

Lines changed: 9 additions & 0 deletions
```yaml
format: jb-book
root: index
chapters:
  - file: install
  - file: quickstart
  - file: datasets
  - file: libribrain
  - file: development
```

docs/datasets.md

Lines changed: 32 additions & 0 deletions
---
title: Datasets
---

# Public Datasets

The `pnpl.datasets` package provides dataset classes designed for deep learning workflows (PyTorch `Dataset`).

## GroupedDataset

A utility dataset that groups multiple datasets and exposes a unified interface.

```python
from pnpl.datasets import GroupedDataset, LibriBrainSpeech, LibriBrainPhoneme
```
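Conceptually, grouping map-style datasets behaves like concatenation: indices are routed to the member dataset that owns them. A minimal sketch of that idea (hypothetical illustration, not the actual `GroupedDataset` implementation):

```python
class ConcatSketch:
    """Route a flat index into one of several map-style datasets."""

    def __init__(self, datasets):
        self.datasets = list(datasets)

    def __len__(self):
        # total length is the sum of the member lengths
        return sum(len(d) for d in self.datasets)

    def __getitem__(self, idx):
        # walk the members, subtracting each length until idx falls inside one
        for d in self.datasets:
            if idx < len(d):
                return d[idx]
            idx -= len(d)
        raise IndexError(idx)


# plain lists stand in for datasets here
g = ConcatSketch([[0, 1, 2], [10, 11]])
print(len(g), g[3])  # 5 10
```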
## HDF5Dataset (base)

`pnpl.datasets.hdf5.HDF5Dataset` is a simple base class for datasets backed by MEG signals serialized as HDF5, with standardization and slicing support.

Key features:
- windowed access `(channels, time)`
- channel-wise standardization
- optional clipping
## LibriBrain 2025

- `LibriBrainPhoneme`: phoneme classification from MEG segments.
- `LibriBrainSpeech`: speech/silence time-series labels over a window.

Both rely on a BIDS-like directory structure and can download the needed files from Hugging Face.

docs/development.md

Lines changed: 24 additions & 0 deletions
---
title: Development
---

# Development

## Running Tests

```bash
python -m venv .venv && source .venv/bin/activate
pip install -e .
pip install pytest
pytest -q
```

## Building this documentation locally

```bash
python -m venv .venv && source .venv/bin/activate
pip install -r docs/requirements.txt
jupyter-book build docs/
open docs/_build/html/index.html  # macOS; use xdg-open on Linux
```

docs/index.md

Lines changed: 26 additions & 0 deletions
---
title: PNPL
---

# PNPL

PNPL is a public Python package for loading and processing brain datasets for deep learning.

- PyPI package: `pnpl`
- Source: {repo:link}`neural-processing-lab/pnpl-public`

This site documents the public package and shows how to install and use the datasets.

```{note}
For internal/private datasets, install the overlay package `pnpl-internal` in addition to `pnpl`.
The overlay contributes extra modules under the same `pnpl.*` namespace and is documented privately.
```

## What’s inside

- A lightweight top-level namespace `pnpl` with lazily exposed symbols
- `pnpl.datasets` with public datasets and helpers
- LibriBrain 2025 datasets: phoneme- and speech-based tasks

Use the navigation to explore installation and examples.

docs/install.md

Lines changed: 27 additions & 0 deletions
---
title: Install
---

# Install

PNPL requires Python 3.10+ and installs from PyPI:

```bash
pip install pnpl
```

Core scientific dependencies include `numpy`, `pandas`, `torch`, `h5py`, `mne`, `mne_bids`, and `huggingface_hub`.

```{tip}
To use private/internal datasets as part of the same `pnpl` namespace, also install the overlay package `pnpl-internal` from your private index (or an editable checkout). The overlay depends on `pnpl` and contributes additional modules under `pnpl.*`.
```

## Development install (editable)

```bash
git clone https://github.com/neural-processing-lab/pnpl-public.git
cd pnpl-public
python -m venv .venv && source .venv/bin/activate
pip install -e .
```

docs/libribrain.md

Lines changed: 56 additions & 0 deletions
---
title: LibriBrain
---

# LibriBrain Datasets

The LibriBrain 2025 datasets provide MEG-based tasks with convenient download and caching from Hugging Face.

## Common Arguments

- `data_path`: local root where files are stored / downloaded
- `preprocessing_str`: expected preprocessing string in filenames
- `tmin`, `tmax`: window relative to the event (seconds)
- `standardize`: z-score channels using per-run stats
- `include_run_keys`: list of run keys to include (see `constants.RUN_KEYS`)
- `include_info`: include an info dict in each sample
- `download`: if `True` (the default), fetch missing files via Hugging Face
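For intuition, the `tmin`/`tmax` pair maps to a window length in samples via the sampling rate. A quick sketch (the 250 Hz rate here is a hypothetical value; the actual rate depends on the `ds` downsampling step in the preprocessing chain):

```python
def window_samples(tmin: float, tmax: float, sfreq: float) -> int:
    """Number of samples in a [tmin, tmax) window at sampling rate sfreq (Hz)."""
    return int(round((tmax - tmin) * sfreq))


# an 0.8 s window around a phoneme onset at a hypothetical 250 Hz
print(window_samples(-0.2, 0.6, 250.0))  # 200
```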
## Speech (binary time series)

```python
from pnpl.datasets import LibriBrainSpeech
from pnpl.datasets.libribrain2025 import constants

ds = LibriBrainSpeech(
    data_path="./data/LibriBrain",
    preprocessing_str="bads+headpos+sss+notch+bp+ds",
    include_run_keys=[constants.RUN_KEYS[0]],
    tmin=0.0,
    tmax=0.2,
    include_info=True,
)

print(len(ds))
```

Each item returns `(data: float32[channels, time], labels: int[time], info: dict)`.

## Phoneme (classification)

```python
from pnpl.datasets import LibriBrainPhoneme
from pnpl.datasets.libribrain2025 import constants

ds = LibriBrainPhoneme(
    data_path="./data/LibriBrain",
    preprocessing_str="bads+headpos+sss+notch+bp+ds",
    include_run_keys=[constants.RUN_KEYS[0]],
    tmin=-0.2,
    tmax=0.6,
)

print(len(ds))
```

Each item returns `(data: float32[channels, time], label_id: int64)`.

docs/quickstart.md

Lines changed: 58 additions & 0 deletions
---
title: Quickstart
---

# Quickstart

This page shows short examples that load the datasets and iterate over samples.

## LibriBrain Speech (public)

```python
from pnpl.datasets.libribrain2025 import constants
from pnpl.datasets import LibriBrainSpeech

# pick one run to keep it quick
include_run_keys = [constants.RUN_KEYS[0]]  # e.g. ('0', '1', 'Sherlock1', '1')

ds = LibriBrainSpeech(
    data_path="./data/LibriBrain",
    preprocessing_str="bads+headpos+sss+notch+bp+ds",
    include_run_keys=include_run_keys,
    tmin=0.0,
    tmax=0.2,
    standardize=True,
    include_info=True,
)

print(len(ds), "samples")
x, y, info = ds[0]
print(x.shape, y.shape, info["dataset"])  # (channels, time), (time,), "libribrain2025"
```
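The per-timestep speech/silence labels are easy to summarize, e.g. as the fraction of speech frames in a window. A minimal sketch with synthetic labels standing in for `y` (the real labels require the download above):

```python
import numpy as np

# synthetic per-timestep labels: 1 = speech, 0 = silence
labels = np.array([0, 0, 1, 1, 1, 0, 1, 0])
speech_fraction = labels.mean()
print(speech_fraction)  # 0.5
```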
## LibriBrain Phoneme (public)

```python
from pnpl.datasets.libribrain2025 import constants
from pnpl.datasets import LibriBrainPhoneme

include_run_keys = [constants.RUN_KEYS[0]]

ds = LibriBrainPhoneme(
    data_path="./data/LibriBrain",
    preprocessing_str="bads+headpos+sss+notch+bp+ds",
    include_run_keys=include_run_keys,
    tmin=-0.2,
    tmax=0.6,
    standardize=True,
)

print(len(ds), "samples")
x, y = ds[0]
print(x.shape, y.item())
```

```{note}
The first time you instantiate a dataset with `download=True` (the default), the required files are downloaded from Hugging Face and cached under `data_path`.
```
