fix: validate CTD numeric fields before sending to echosounder
- Coerce temperature/salinity/pressure/depth/sound_speed to float
  before including in ctdOutput payload
- Skip corrupt values (e.g. '1.!60') instead of forwarding as strings
- Add ADCP file patterns to watch_patterns config
- README updates for CTD monitoring and config docs
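
The coercion described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `coerce_ctd_fields` is hypothetical, the field list is taken from the message above, and dropping non-finite values (`nan`, `inf`) is an added assumption.

```python
import math
from typing import Any, Dict, Mapping

# Field names come from the commit message; everything else here is illustrative.
CTD_NUMERIC_FIELDS = ("temperature", "salinity", "pressure", "depth", "sound_speed")


def coerce_ctd_fields(record: Mapping[str, Any]) -> Dict[str, float]:
    """Return only the CTD fields that parse as finite floats (hypothetical helper)."""
    clean: Dict[str, float] = {}
    for name in CTD_NUMERIC_FIELDS:
        raw = record.get(name)
        if raw is None:
            continue  # field absent from this record
        try:
            value = float(raw)
        except (TypeError, ValueError):
            # Corrupt reading such as '1.!60': skip it instead of forwarding a string.
            continue
        if math.isfinite(value):  # assumption: NaN/inf are also treated as corrupt
            clean[name] = value
    return clean
```

With this shape, a record such as `{"temperature": "12.5", "salinity": "1.!60"}` yields a payload containing only `temperature`; the corrupt salinity value is omitted rather than forwarded.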
@@ -84,20 +85,89 @@ Configuration is driven by IoT Hub module twin desired properties. All `EdgeConf

 ## Configuration

-All config flows through the `EdgeConfig` dataclass in `config.py`:
+All config flows through the `EdgeConfig` dataclass in `config.py`. In IoT Edge mode, every field is readable/writable via module twin desired properties (changes apply live). In standalone mode, fields come from environment variables and CLI args.

-| Field | Default | Description |
-|-------|---------|-------------|
-|`input_mode`|`both`|`stream`, `file`, or `both`|
-|`stream_format`|`auto`|`nmea`, `csv`, `hex`, or `auto`|
-|`stream_port`|`9100`| TCP/UDP listen port |
-|`watch_dir`|`/data/sensor`| Directory to watch for new files |
+|`watch_polling`| bool |`false`| Use polling instead of inotify (required for SMB/NFS mounts) |
+|`watch_poll_interval`| int |`2`| Seconds between polls when `watch_polling` is true |
+|`backfill_minutes`| int |`0`| On startup, queue files modified within the last N minutes. `0` skips all existing files (only new arrivals are processed). For example, `60` reprocesses the last hour of data after a restart |
+
+### Batching
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+|`batch_interval_seconds`| int |`60`| Interval between stream batch flushes (seconds) |
+|`batch_max_records`| int |`1000`| Flush the stream batch once this many records are buffered |
+
+### Metadata
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+|`campaign_id`| string |`""`| Campaign identifier, used as the blob container name and output path prefix |
+By default (`backfill_minutes: 0`), the module does **not** process existing files when it starts. It only processes files that arrive after startup via the file watcher, stream listener, or IoT Edge messages. This prevents reprocessing the entire dataset on every container restart.
+
+To backfill recent data after a restart, set `backfill_minutes` to the desired window (e.g. `60` for the last hour). Only files whose modification time falls within that window are queued. This is useful when the module was down briefly and you want to catch up on missed files.
+
+See `config.py` for implementation details.

 ## Connecting Sensors

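
The backfill rule added to the README in this hunk (queue only files whose modification time falls within the last `backfill_minutes`, and skip everything when the value is `0`) might look something like the sketch below. The function name `files_to_backfill`, the non-recursive directory scan, and the oldest-first ordering are assumptions for illustration; the real logic lives in the module's own code and may also apply `watch_patterns`.

```python
import time
from pathlib import Path
from typing import List, Optional


def files_to_backfill(
    watch_dir: str, backfill_minutes: int, now: Optional[float] = None
) -> List[Path]:
    """Hypothetical helper: pick existing files modified within the backfill window."""
    if backfill_minutes <= 0:
        return []  # default behaviour: ignore files that predate startup
    now = time.time() if now is None else now
    cutoff = now - backfill_minutes * 60
    recent = [
        p
        for p in Path(watch_dir).iterdir()  # assumption: top-level files only
        if p.is_file() and p.stat().st_mtime >= cutoff
    ]
    # Assumption: queue oldest first so files are processed in arrival order.
    return sorted(recent, key=lambda p: p.stat().st_mtime)
```

Passing `now` explicitly keeps the window calculation deterministic in tests; in normal use it defaults to the current time.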
@@ -203,18 +273,23 @@ Test data is stored in Azure Blob Storage (`sensorstream-test` container) and do

 **Output**: D2C telemetry to IoT Hub (rate-limited by `telemetry_downsample_seconds`), GeoParquet + metadata JSON to blob storage.

-**Twin**: All config fields are readable/writable via module twin. Changes are applied live without restart.
+**Twin**: All config fields are readable/writable via module twin desired properties. Changes are applied live without restart. Twin property names map 1:1 to config field names, except `Log_Level` → `log_level`.
+
+**Backfill on restart**: By default the module skips existing files. Set `backfill_minutes` in the twin to process recent files after a restart (e.g. `60` for the last hour).
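
To make the twin-to-config mapping in this hunk concrete, here is a minimal sketch of applying a desired-properties patch to a config object. It is not the module's implementation: `EdgeConfigSketch` is a hypothetical stand-in carrying only a few of the fields named in the README, the `log_level` default is an assumption, and silently ignoring unknown keys is a design choice made for the example.

```python
from dataclasses import dataclass, fields, replace
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class EdgeConfigSketch:
    """Illustrative subset of config fields; not the real EdgeConfig."""
    input_mode: str = "both"
    stream_port: int = 9100
    backfill_minutes: int = 0
    log_level: str = "INFO"  # assumption: the README excerpt gives no default


# The one documented exception to the 1:1 name mapping.
TWIN_NAME_OVERRIDES = {"Log_Level": "log_level"}


def apply_desired(config: EdgeConfigSketch, desired: Mapping[str, Any]) -> EdgeConfigSketch:
    """Return a new config with recognised desired properties applied."""
    known = {f.name for f in fields(config)}
    updates: Dict[str, Any] = {}
    for key, value in desired.items():
        if key.startswith("$"):
            continue  # twin bookkeeping such as $version is not a config field
        name = TWIN_NAME_OVERRIDES.get(key, key)
        if name in known:
            updates[name] = value  # unknown keys are ignored in this sketch
    return replace(config, **updates)
```

For example, a patch of `{"Log_Level": "DEBUG", "backfill_minutes": 60}` produces a config with `log_level` set to `DEBUG` and a one-hour backfill window, leaving the other fields at their previous values.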