Follow-up issue for improving the BCI2k reader, now that #13699 is merged
What the spec says (Required Parameters - Section Source):
Both SourceChGain and SourceChOffset are listed as required floatlist parameters in every BCI2000 data file:
SourceChOffset - raw data zero offset in AD units
SourceChGain - factor to convert raw AD units into µV
What a real file actually looks like (from a recorded .dat header):
Two things to notice here:
- The first token after = is the list length, not a value. 2 means "there are 2 channels, and 2 values follow". The actual per-channel values start from the second token onward.
- The unit string (muV) is embedded inside each value token: 0.1muV, for example, is not a plain float. It must be split into a numeric part (0.1) and a unit part (muV) before any arithmetic.
What the current reader does:
value = right.strip().split()[0] # grabs "2" — the list length, not a gain value
params[name] = value
This stores only the list length (2) and silently discards the actual per-channel values. As a result, SourceChGain and SourceChOffset are never applied, and the signal is returned in raw ADC counts with no warning to the user.
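To make the failure concrete, here is a minimal reproduction sketch. The header line is a made-up example in the shape described above (not copied from a real recording), and `parse_current` is a hypothetical stand-in that mimics the two lines quoted from the reader; how the real reader extracts the parameter name is an assumption.

```python
# Hypothetical header line in the shape described above (illustrative only).
line = "Source floatlist SourceChGain= 2 0.1muV 0.1muV // gain per channel"


def parse_current(line):
    """Mimic the current reader: keep only the first token after '='."""
    left, right = line.split("=", 1)
    name = left.split()[-1]  # assumed: the name is the last token before '='
    value = right.strip().split()[0]  # the list length, not a gain
    return name, value


name, value = parse_current(line)
print(name, value)  # SourceChGain 2
```

Both per-channel gains are gone after this step, so nothing downstream can apply them.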
What the fix needs to do:
Step 1 - Parse list parameters correctly:
After stripping the // comment, detect that the first token is an integer count n, then collect the next n tokens as the value list:
# strip inline comment first
rhs = right.split("//")[0].strip()
tokens = rhs.split()
# first token is list length for floatlist params
n = int(tokens[0])
raw_values = tokens[1 : n + 1] # e.g. ["0.1muV", "0.1muV"]
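The snippet above could be wrapped in a small helper that also checks the declared count against the number of tokens actually present. This is only a sketch: `parse_list_tokens` is a hypothetical name, and the choice to raise `ValueError` on a short list (rather than warn) is an assumption, not something the current reader does.

```python
def parse_list_tokens(right):
    """Return the value tokens of a list parameter, without the leading count.

    `right` is everything after the '=' on a parameter line.
    """
    rhs = right.split("//")[0].strip()  # drop the inline comment
    tokens = rhs.split()
    if not tokens:
        raise ValueError("empty parameter value")
    n = int(tokens[0])  # declared list length
    values = tokens[1 : n + 1]  # any tokens after the n values are ignored
    if len(values) != n:
        raise ValueError(f"expected {n} values, found {len(values)}")
    return values


print(parse_list_tokens(" 2 0.1muV 0.1muV // gain per channel"))
# ['0.1muV', '0.1muV']
```

Failing loudly on a truncated list seems preferable here, since silently returning fewer gains than channels would reproduce the original problem in a different form.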
Step 2 - Strip embedded unit strings and convert to volts:
The unit token (e.g. muV, mV, V) must be parsed out of each value before converting to a float. MNE stores EEG data internally in volts (V), so the final scaled value must be in V:
import re
_UNIT_SCALE = {"v": 1.0, "mv": 1e-3, "muv": 1e-6, "uv": 1e-6, "nv": 1e-9}
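def _parse_value_with_unit(token):
    """Split '0.1muV' into (0.1, 1e-6). Returns (float, scale_to_V)."""
    m = re.match(r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*([a-zA-Z]*)", token)
    if m is None:
        raise ValueError(f"Cannot parse parameter value: {token!r}")
    num = float(m.group(1))
    # Normalize the unit to lowercase with a trailing "v"; no unit means volts.
    unit = (m.group(2).lower().rstrip("v") + "v") if m.group(2) else "v"
    # Unknown units fall back to a scale of 1.0.
    scale = _UNIT_SCALE.get(unit, 1.0)
    return num, scale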
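One open design question is what to do with tokens that carry no unit or an unrecognized one. The sketch below is a stricter variant under two assumptions that are not settled in this issue: a bare number is taken to be in µV (because the spec describes SourceChGain as a factor to µV), and unknown units raise instead of silently scaling by 1.0. The names `UNIT_SCALE`, `TOKEN_RE` and `parse_value_strict` are hypothetical.

```python
import re

UNIT_SCALE = {"v": 1.0, "mv": 1e-3, "muv": 1e-6, "uv": 1e-6, "nv": 1e-9}
# Number (optional sign, decimals, exponent) followed by an optional unit.
TOKEN_RE = re.compile(r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)([a-zA-Z]*)")


def parse_value_strict(token, default_unit="muv"):
    """Return (number, scale_to_V); raise on malformed tokens or unknown units."""
    m = TOKEN_RE.fullmatch(token.strip())
    if m is None:
        raise ValueError(f"cannot parse value token: {token!r}")
    unit = m.group(2).lower() or default_unit  # assumed default: microvolts
    if unit not in UNIT_SCALE:
        raise ValueError(f"unknown unit in token: {token!r}")
    return float(m.group(1)), UNIT_SCALE[unit]


print(parse_value_strict("0.1muV"))  # (0.1, 1e-06)
print(parse_value_strict("1mV"))  # (1.0, 0.001)
```

Using `fullmatch` rather than `match` means trailing garbage such as `0.1muV?` is rejected instead of being partially parsed.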
Then apply offset and gain per channel before passing to BaseRaw:
for ch in range(n_channels):
    offset, _ = _parse_value_with_unit(offsets[ch])  # offset is dimensionless (AD units)
    gain, scale = _parse_value_with_unit(gains[ch])  # gain * scale => V per AD count
    # The zero offset is subtracted from the raw counts before the gain is applied.
    signal[ch] = (signal[ch] - offset) * gain * scale  # now in V
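As a sanity check on the arithmetic, here is a small worked example in plain Python. It assumes the zero offset is subtracted from the raw counts before the gain is applied, which is the usual reading of "raw data zero offset"; `calibrate` is a hypothetical helper and the sample values are invented.

```python
import math


def calibrate(raw_counts, offset, gain, scale_to_v):
    """Convert one channel of raw AD counts to volts."""
    return [(c - offset) * gain * scale_to_v for c in raw_counts]


# Invented samples: gain of 0.1 muV per count, zero offset of 10 counts.
volts = calibrate([1010, 10, -990], offset=10.0, gain=0.1, scale_to_v=1e-6)

# 1000 counts above the zero level at 0.1 muV per count is about 100 muV.
assert math.isclose(volts[0], 1e-4)
assert volts[1] == 0.0
assert math.isclose(volts[2], -1e-4)
```

A sample equal to the offset maps to exactly 0 V, which is a quick way to confirm the sign convention against a real recording.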
cc @larsoner