Commit 85805fb

Fix sleep session identification in validation - REM sensitivity now 85.7%
Root cause: Sleep stages from Health Connect came in reverse chronological order. The validation was using sleepStages[0] as session start, which was actually the END of the most recent session. All stages were processed as one continuous session spanning multiple nights.

Fix:
- Added identifySleepSessions() to sort stages and separate by >4 hour gaps
- Each session now processed independently with fresh state
- Added debug logging for session identification

Results:
- Python reference: 81.3% REM sensitivity
- TypeScript on device: 85.7% REM sensitivity
- 9 sleep sessions correctly identified from 469 stages
1 parent bdb47c1 commit 85805fb

3 files changed

Lines changed: 554 additions & 41 deletions

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
# Session Notes: Sleep Stage Classifier Debugging (2026-01-14)

## RESULT: SUCCESS - REM Sensitivity: 85.7%

The fix worked! Classifier now achieves 85.7% REM sensitivity on device (vs 81.3% in Python reference).

## Problem

The hybrid sleep stage classifier achieves 81.3% REM sensitivity in Python but shows 0% on the Android device.

## Root Cause Found

**The TypeScript validation was not processing sleep sessions correctly:**

1. Sleep stages from Health Connect came in **reverse chronological order** (newest first)
2. The validation used `sleepStages[0].startTime` as session start - which was actually the END of the most recent session
3. All stages were processed as ONE continuous session instead of separate nights
4. This caused `minutesSinceSleepStart` to be wildly incorrect (spanning days instead of hours)
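
To make item 4 concrete, the sketch below reproduces the effect on three made-up stages from two nights. The `Stage` shape, the `minutesBetween` helper, and all timestamps are illustrative assumptions, not the Health Connect record type or data from this session.

```typescript
// Hypothetical stage shape; real Health Connect records carry more fields.
interface Stage {
  startTime: string; // ISO-8601
  endTime: string;   // ISO-8601
}

// Minutes elapsed from one ISO timestamp to another.
function minutesBetween(fromIso: string, toIso: string): number {
  return (Date.parse(toIso) - Date.parse(fromIso)) / 60000;
}

// Made-up data, delivered newest first as described in the list above.
const newestFirst: Stage[] = [
  { startTime: "2026-01-08T05:00:00Z", endTime: "2026-01-08T06:00:00Z" }, // last stage of night 2
  { startTime: "2026-01-07T23:00:00Z", endTime: "2026-01-08T00:00:00Z" }, // first stage of night 2
  { startTime: "2026-01-07T01:00:00Z", endTime: "2026-01-07T02:00:00Z" }, // a stage from night 1
];

// Buggy assumption: element 0 marks the start of the session.
const assumedStart = newestFirst[0].startTime;

// For the night-1 stage this is -1680 minutes (28 hours before the assumed
// start) instead of a value within a single night.
const buggyMinutes = minutesBetween(assumedStart, newestFirst[2].startTime);
console.log(buggyMinutes);
```

Any time-based rule (such as a minimum REM latency) evaluated against a value like this cannot behave as intended, which is consistent with the 0% sensitivity seen on device.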

## Fix Applied

Added `identifySleepSessions()` function to TypeScript that:

1. Sorts stages by startTime (ascending)
2. Groups stages into sessions separated by >4 hour gaps
3. Processes each session independently with fresh state
4. Matches the Python implementation behavior
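
The steps above can be sketched roughly as follows. This is not the code from `services/remOptimizedClassifier.ts`: the `SleepStage` shape is hypothetical, and measuring the gap from the previous stage's end to the next stage's start is an assumption (the notes only say ">4 hour gaps").

```typescript
// Hypothetical stage shape, for illustration only.
interface SleepStage {
  startTime: string; // ISO-8601
  endTime: string;   // ISO-8601
  stage: string;
}

const SESSION_GAP_MS = 4 * 60 * 60 * 1000; // ">4 hour gaps" from the notes

function identifySleepSessions(stages: SleepStage[]): SleepStage[][] {
  // Copy before sorting so the caller's array is not mutated.
  const sorted = stages
    .slice()
    .sort((a, b) => Date.parse(a.startTime) - Date.parse(b.startTime));

  const sessions: SleepStage[][] = [];
  let current: SleepStage[] = [];

  for (const stage of sorted) {
    const previous = current.length > 0 ? current[current.length - 1] : undefined;
    // Assumption: the gap runs from the previous stage's end to this stage's start.
    if (
      previous !== undefined &&
      Date.parse(stage.startTime) - Date.parse(previous.endTime) > SESSION_GAP_MS
    ) {
      sessions.push(current);
      current = [];
    }
    current.push(stage);
  }
  if (current.length > 0) sessions.push(current);
  return sessions;
}

// Made-up input, newest first: two stages from one night, one from the night before.
const exampleSessions = identifySleepSessions([
  { startTime: "2026-01-08T23:30:00Z", endTime: "2026-01-09T00:10:00Z", stage: "nrem" },
  { startTime: "2026-01-08T23:00:00Z", endTime: "2026-01-08T23:30:00Z", stage: "awake" },
  { startTime: "2026-01-07T23:00:00Z", endTime: "2026-01-08T00:00:00Z", stage: "nrem" },
]);
console.log(exampleSessions.length); // 2
```

With sessions separated this way, per-session values such as `minutesSinceSleepStart` can be measured from the first stage of each session, and any rolling state can be reset between sessions.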

## Key Algorithm Parameters (must match between Python and TypeScript)

```
CV_THRESHOLD = 0.20 // Lower CV = more stable = more likely REM
REM_CONSECUTIVE_REQUIRED = 2 // Need 2 consecutive signals to predict REM
MAX_RECENT_HR_SAMPLES = 20
MAX_RMSSD_HISTORY = 10
ULTRADIAN_CYCLE_MINUTES = 90
FIRST_REM_LATENCY = 70 min // No REM predicted before 70 minutes
```
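
As a rough illustration of how three of these parameters might interact, the sketch below gates a REM prediction on the latency and consecutive-signal rules stated in the comments above. The function name, the state shape, the strict `<` comparison, and the reset-on-miss behavior are all assumptions; the real classifier also uses a time weight and other inputs that are not modeled here.

```typescript
const CV_THRESHOLD = 0.20;
const REM_CONSECUTIVE_REQUIRED = 2;
const FIRST_REM_LATENCY_MINUTES = 70;

// Hypothetical per-session state; a fresh object would be created for every session.
interface RemGateState {
  consecutiveRemSignals: number;
}

// Returns true only once enough consecutive low-CV signals have been seen
// after the first-REM latency has elapsed.
function gateRemPrediction(
  state: RemGateState,
  cv: number,
  minutesSinceSleepStart: number,
): boolean {
  const signal =
    minutesSinceSleepStart >= FIRST_REM_LATENCY_MINUTES && cv < CV_THRESHOLD;
  // Assumption: a single non-signal resets the streak.
  state.consecutiveRemSignals = signal ? state.consecutiveRemSignals + 1 : 0;
  return state.consecutiveRemSignals >= REM_CONSECUTIVE_REQUIRED;
}

const gateState: RemGateState = { consecutiveRemSignals: 0 };
const gateResults = [
  gateRemPrediction(gateState, 0.15, 60), // before 70 minutes: false
  gateRemPrediction(gateState, 0.15, 80), // first signal: false
  gateRemPrediction(gateState, 0.15, 81), // second consecutive signal: true
  gateRemPrediction(gateState, 0.30, 82), // CV above threshold: false
];
console.log(gateResults);
```

The sketch also shows why a wrong `minutesSinceSleepStart` matters: a negative or otherwise wrong elapsed time can keep the latency check from ever passing.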

## Data Issue Fixed

Fixed malformed timestamp in `notes/raw_sleep_data.json`:

- Before: `'2026-26-01-06T20:07:00Z'` (invalid month=26)
- After: `'2026-01-06T20:07:00Z'`
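
A check along these lines could flag such entries before they reach the classifier. It is a minimal sketch: the helper names are hypothetical, and the pattern only covers the UTC `...Z` form that appears in these notes.

```typescript
// YYYY-MM-DDTHH:MM:SS, optional fractional seconds, then a literal Z.
const ISO_UTC_PATTERN = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// True when the string has the expected shape and the runtime can parse it.
function isValidUtcTimestamp(value: string): boolean {
  return ISO_UTC_PATTERN.test(value) && !isNaN(Date.parse(value));
}

// Returns the indices of entries that fail the check.
function findInvalidTimestamps(values: string[]): number[] {
  const invalid: number[] = [];
  values.forEach((value, index) => {
    if (!isValidUtcTimestamp(value)) invalid.push(index);
  });
  return invalid;
}

const invalidIndices = findInvalidTimestamps([
  "2026-01-06T20:07:00Z",    // the corrected value from above
  "2026-26-01-06T20:07:00Z", // the malformed value from above
]);
console.log(invalidIndices); // [1]
```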

## Files Modified

- `services/remOptimizedClassifier.ts` - Added `identifySleepSessions()` function, modified `runValidation()` to process sessions separately
- `notes/raw_sleep_data.json` - Fixed malformed timestamp at index 4086
- `scripts/debug_python_vs_ts.py` - Created debug script showing intermediate values

## Python Classifier Results (Reference)

```
Best parameters: CV_thresh=0.2, time_weight=0.5

Confusion Matrix:
Predicted:      Awake   NREM    REM
Actual awake:       0     56     60
Actual nrem :      12    334    398
Actual rem  :       0     37    161

Metrics:
Accuracy: 46.8%
REM Sensitivity: 81.3%
REM Specificity: 46.7%
REM Precision: 26.0%
REM F1: 39.4%
```
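
The metrics above follow directly from the confusion matrix. The sketch below recomputes them using the standard one-vs-rest definitions for the REM class; the function and type names are illustrative, not taken from the project.

```typescript
// Rows are actual stages, columns are predicted stages, both ordered [awake, nrem, rem].
const REM_INDEX = 2;

interface RemMetrics {
  accuracy: number;
  sensitivity: number;
  specificity: number;
  precision: number;
  f1: number;
}

function remMetrics(matrix: number[][]): RemMetrics {
  let total = 0;
  let correct = 0;
  let predictedRem = 0;
  let actualRem = 0;
  for (let actual = 0; actual < matrix.length; actual++) {
    for (let predicted = 0; predicted < matrix[actual].length; predicted++) {
      const count = matrix[actual][predicted];
      total += count;
      if (actual === predicted) correct += count;
      if (predicted === REM_INDEX) predictedRem += count;
      if (actual === REM_INDEX) actualRem += count;
    }
  }
  const truePositives = matrix[REM_INDEX][REM_INDEX];
  const falsePositives = predictedRem - truePositives;
  const actualNonRem = total - actualRem;
  const sensitivity = truePositives / actualRem;
  const specificity = (actualNonRem - falsePositives) / actualNonRem;
  const precision = truePositives / predictedRem;
  const f1 = (2 * precision * sensitivity) / (precision + sensitivity);
  return { accuracy: correct / total, sensitivity, specificity, precision, f1 };
}

const pythonReference = remMetrics([
  [0, 56, 60],
  [12, 334, 398],
  [0, 37, 161],
]);
const asPercent = (x: number): string => (x * 100).toFixed(1) + "%";
console.log(asPercent(pythonReference.sensitivity)); // 81.3%
console.log(asPercent(pythonReference.precision));   // 26.0%
```

The low precision next to the high sensitivity reflects the 398 NREM samples predicted as REM: the classifier catches most REM at the cost of many false positives.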

## Next Steps

1. Run classifier training on device with the fixed code
2. Verify REM sensitivity matches Python (~81%)
3. If still 0%, add debug logging to TypeScript to compare intermediate values
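
## Key Insight

REM sleep has **MORE STABLE** heart rate variability than NREM, as measured by the coefficient of variation (CV) in this dataset:

- REM CV: 0.19 (lower = more stable)
- NREM CV: 0.27 (higher = more variable)

This is counterintuitive - one might expect REM to have more variable HR due to dream activity, yet in this data REM shows the lower CV. REM is usually described as having less regular breathing and heart rate than NREM, so treat this as an empirical property of this dataset rather than an established physiological rule.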
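
For reference, the coefficient of variation is the standard deviation divided by the mean. The sketch below uses the population standard deviation, which is an assumption: the notes do not say which variant the classifier uses, or whether the CV is taken over recent HR samples or over the RMSSD history.

```typescript
// Coefficient of variation: population standard deviation divided by the mean.
// Returns NaN when there are fewer than two values or the mean is zero.
function coefficientOfVariation(values: number[]): number {
  if (values.length < 2) return NaN;
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  if (mean === 0) return NaN;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
  return Math.sqrt(variance) / mean;
}

// Made-up windows: a flat series and a more variable one.
const stableCv = coefficientOfVariation([10, 10, 10]); // 0
const variableCv = coefficientOfVariation([8, 12]);    // mean 10, std 2, so about 0.2
console.log(stableCv, variableCv);
```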

## Final Results on Device

```
VALIDATION RESULTS (Leave-one-out cross-validation)
Overall Accuracy: 41.5%
Total samples: 3223

REM DETECTION METRICS (Key for dream induction):
Sensitivity: 85.7% (true REM detected)
Specificity: 41.2% (non-REM correctly rejected)

Per-stage accuracy:
Awake: 0.2%
NREM: 41.0%
REM: 85.7%

Confusion Matrix:
Predicted:      Awake   NREM    REM
Actual Awake:       1    198    310
Actual NREM:       15    906   1289
Actual REM:         4     68    432
```

Validation logs confirmed proper session identification:

```
[Validation] Found 9 sleep sessions from 469 stages
[Validation] Session 0: 66 stages, starts 2026-01-07T06:03:00.000Z
[Validation] Session 2: 45 stages, starts 2026-01-08T06:54:30.000Z
...
[Validation] Session 8: 55 stages, starts 2026-01-14T05:27:00.000Z
[Validation] Total samples processed: 3223
[Validation] Confusion matrix: rem->rem=432, rem->nrem=68, nrem->rem=1289
```

## Git Commits

- Previous: e313954 (Sort HR samples by time in validation)
- This session: Fix session identification in validation (sort stages, process sessions separately)

Continue debugging the sleep stage classifier. The main fix (session identification in TypeScript validation) has been applied but not yet tested on device.

The device needs Health Connect permissions granted first. After that:

1. Go to Settings > Train 3-Class (REM-Optimized)
2. Wait for training to complete
3. Check if REM sensitivity is now ~81% (matching Python)

If still 0%, add `console.log` statements to `runValidation()` printing:

- Number of sessions found
- Session start times
- Sample count per session
- CV values when remScore > 0.25

Phone connected via ADB at ~/Library/Android/sdk/platform-tools/adb
