| 1 | +# msdial-eic-reader |
| 2 | + |
| 3 | +Unofficial reader for MS-DIAL aligned EIC archives (`AlignResult*.EIC.aef`). |
| 4 | + |
| 5 | +The goal is simple: inspect the chromatogram traces and integration boundaries |
| 6 | +that MS-DIAL already used, without recomputing EICs from raw files in a |
| 7 | +downstream QC dashboard. |
| 8 | + |
| 9 | +This project is not affiliated with MS-DIAL, RIKEN, or the MS-DIAL maintainers. |
| 10 | +The `.EIC.aef` format appears to be an internal binary format and may change in |
| 11 | +future MS-DIAL releases. |
| 12 | + |
| 13 | +## Why this exists |
| 14 | + |
| 15 | +MS-DIAL can show aligned chromatograms in the GUI, but downstream tools often |
| 16 | +only receive alignment tables and summary values. For QC, it is useful to see |
| 17 | +the per-sample traces behind an aligned feature: |
| 18 | + |
| 19 | +- did every replicate integrate the expected peak; |
| 20 | +- did one sample pick a smaller neighboring peak; |
| 21 | +- are the left, apex, and right markers plausible; |
| 22 | +- can an operator review a batch without reopening the full MS-DIAL GUI. |
| 23 | + |
| 24 | +Recomputing EICs outside MS-DIAL is not ideal for this use case, because the |
| 25 | +external viewer may show traces or peak boundaries that differ from what |
| 26 | +MS-DIAL actually integrated. |
| 27 | + |
| 28 | +## What it reads |
| 29 | + |
| 30 | +The reader targets MS-DIAL 5 `CSS1` EIC archives named like: |
| 31 | + |
| 32 | +```text |
| 33 | +AlignResult*.EIC.aef |
| 34 | +``` |
| 35 | + |
| 36 | +For each aligned feature and sample trace, the JSON output includes: |
| 37 | + |
| 38 | +- center values: RT, RI, m/z, drift, and chromatogram x-axis type; |
| 39 | +- sample/file ID; |
| 40 | +- peak markers: `left`, `top`, and `right`; |
| 41 | +- EIC points as `{ "x": ..., "intensity": ... }`; |
| 42 | +- original and returned point counts. |
| 43 | + |
| 44 | +Feature indices are zero-based archive indices. In observed outputs they match |
| 45 | +the row order of the paired MS-DIAL alignment table after its header, but treat |
| 46 | +that as a practical convention rather than an official guarantee. |
| 47 | + |
| 48 | +## Install from source |
| 49 | + |
| 50 | +```bash |
| 51 | +git clone https://github.com/Fraximov/msdial-eic-reader.git |
| 52 | +cd msdial-eic-reader |
| 53 | +python -m pip install . |
| 54 | +``` |
| 55 | + |
| 56 | +## Python usage |
| 57 | + |
| 58 | +```python |
| 59 | +from msdial_eic_reader import MsdialEicArchive |
| 60 | + |
| 61 | +archive = MsdialEicArchive("AlignResult-test.EIC.aef") |
| 62 | + |
| 63 | +print(archive.summary()) |
| 64 | + |
| 65 | +feature = archive.read_feature(42, max_points_per_trace=900) |
| 66 | +for peak in feature["peaks"]: |
| 67 | + print(peak["file_id"], peak["left"], peak["top"], peak["right"]) |
| 68 | +``` |
| 69 | + |
| 70 | +## Command line usage |
| 71 | + |
| 72 | +Print archive metadata: |
| 73 | + |
| 74 | +```bash |
| 75 | +msdial-eic-reader summary AlignResult-test.EIC.aef |
| 76 | +``` |
| 77 | + |
| 78 | +Read one aligned feature: |
| 79 | + |
| 80 | +```bash |
| 81 | +msdial-eic-reader feature AlignResult-test.EIC.aef 42 --max-points 900 |
| 82 | +``` |
| 83 | + |
| 84 | +Read a small window of features: |
| 85 | + |
| 86 | +```bash |
| 87 | +msdial-eic-reader window AlignResult-test.EIC.aef 40 5 --max-points 900 |
| 88 | +``` |
| 89 | + |
| 90 | +Set `--max-points 0` to return all points. The default downsamples long traces |
| 91 | +to keep browser dashboards responsive. |
| 92 | + |
| 93 | +## Rust CLI |
| 94 | + |
| 95 | +A dependency-free Rust CLI is included in `rust/` for batch or dashboard use: |
| 96 | + |
| 97 | +```bash |
| 98 | +cd rust |
| 99 | +cargo run -- --file ../AlignResult-test.EIC.aef --index 42 --max-points 900 |
| 100 | +``` |
| 101 | + |
| 102 | +It also supports window reads, which are useful when a UI slider moves through |
| 103 | +nearby features: |
| 104 | + |
| 105 | +```bash |
| 106 | +cargo run -- --file ../AlignResult-test.EIC.aef --start 40 --count 11 --max-points 900 |
| 107 | +``` |
| 108 | + |
| 109 | +## Format notes |
| 110 | + |
| 111 | +The inferred binary layout is documented in |
| 112 | +[`docs/css1-eic-aef-format.md`](docs/css1-eic-aef-format.md). |
| 113 | + |
| 114 | +Short version: |
| 115 | + |
| 116 | +```text |
| 117 | +10 bytes version string, null-padded ASCII, observed: CSS1 |
| 118 | +int32 feature count |
| 119 | +int64[] absolute offsets to aligned feature payloads |
| 120 | + |
| 121 | +feature payload: |
| 122 | +float32 center RT |
| 123 | +float32 center RI |
| 124 | +float32 center m/z |
| 125 | +float32 center drift |
| 126 | +uint8 chromatogram x-axis type |
| 127 | +int32 sample trace count |
| 128 | + |
| 129 | +sample trace: |
| 130 | +int32 file ID |
| 131 | +int32 point count |
| 132 | +float32 top/apex x position |
| 133 | +float32 left boundary x position |
| 134 | +float32 right boundary x position |
| 135 | +float32[] repeated x, intensity pairs |
| 136 | +``` |
| 137 | + |
| 138 | +All numeric values are little-endian in observed files. |
| 139 | + |
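The layout above can be read with Python's standard `struct` module. The sketch below is illustrative rather than the package's actual implementation. It assumes that the point count is the number of x/intensity pairs, that sample traces follow the feature header contiguously, and that the whole archive fits in memory. `parse_css1_archive`, `build_demo_archive`, and the `center_*` key names are hypothetical.

```python
import struct

FEATURE_FMT = "<4fBi"  # center RT, RI, m/z, drift, x-axis type, trace count
TRACE_FMT = "<ii3f"    # file ID, point count, top, left, right


def parse_css1_archive(data: bytes) -> dict:
    """Parse an in-memory archive following the inferred little-endian layout."""
    version = data[:10].rstrip(b"\x00").decode("ascii")
    (feature_count,) = struct.unpack_from("<i", data, 10)
    offsets = struct.unpack_from(f"<{feature_count}q", data, 14)

    features = []
    for offset in offsets:
        rt, ri, mz, drift, axis_type, trace_count = struct.unpack_from(
            FEATURE_FMT, data, offset
        )
        pos = offset + struct.calcsize(FEATURE_FMT)
        peaks = []
        for _ in range(trace_count):
            file_id, n_points, top, left, right = struct.unpack_from(
                TRACE_FMT, data, pos
            )
            pos += struct.calcsize(TRACE_FMT)
            # Assumption: n_points counts (x, intensity) pairs, not floats.
            flat = struct.unpack_from(f"<{2 * n_points}f", data, pos)
            pos += 8 * n_points
            points = [
                {"x": x, "intensity": y} for x, y in zip(flat[0::2], flat[1::2])
            ]
            peaks.append({"file_id": file_id, "left": left, "top": top,
                          "right": right, "points": points})
        features.append({"center_rt": rt, "center_ri": ri, "center_mz": mz,
                         "center_drift": drift, "x_axis_type": axis_type,
                         "peaks": peaks})
    return {"version": version, "features": features}


def build_demo_archive() -> bytes:
    """Build a tiny synthetic archive: one feature, one trace, three points."""
    points = [(1.0, 10.0), (1.5, 50.0), (2.0, 12.0)]
    trace = struct.pack(TRACE_FMT, 7, len(points), 1.5, 1.0, 2.0)
    trace += b"".join(struct.pack("<2f", x, y) for x, y in points)
    payload = struct.pack(FEATURE_FMT, 1.5, -1.0, 200.5, -1.0, 0, 1) + trace
    header = b"CSS1".ljust(10, b"\x00") + struct.pack("<i", 1)
    offset = len(header) + 8  # a single int64 offset entry follows the header
    return header + struct.pack("<q", offset) + payload


archive = parse_css1_archive(build_demo_archive())
print(archive["version"], archive["features"][0]["peaks"][0]["top"])
```

For real files, reading only the offset table and seeking to a single feature would avoid loading the full archive; the in-memory version is used here to keep the example short.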
| 140 | +## Limitations |
| 141 | + |
| 142 | +- This is an unofficial reader for an internal MS-DIAL file. |
| 143 | +- It has been tested only against observed MS-DIAL 5 `CSS1` archives. |
| 144 | +- It does not parse the paired alignment table; use the table to map feature |
| 145 | + indices to metabolite names, average RT, average m/z, annotations, and sample |
| 146 | + names. |
| 147 | +- It is intended for QC visualization and traceability, not for replacing |
| 148 | + MS-DIAL's integration. |
| 149 | + |
| 150 | +## Contributing |
| 151 | + |
| 152 | +Real-world fixtures are the most useful contribution, but please do not commit |
| 153 | +large raw data. Small synthetic `.EIC.aef` examples, version information, and |
| 154 | +edge cases are ideal. |
| 155 | + |
| 156 | +If the MS-DIAL project later exposes an official export or API for this data, |
| 157 | +this reader should either adapt to that API or clearly point users to it. |