Skip to content

fix: /healthz and /readyz wait for initial device connection (SOFIE-4325)#1730

Merged
jstarpl merged 2 commits into
Sofie-Automation:mainfrom
nrkno:contrib/mos-gw-init
Jun 2, 2026
Merged

fix: /healthz and /readyz wait for initial device connection (SOFIE-4325)#1730
jstarpl merged 2 commits into
Sofie-Automation:mainfrom
nrkno:contrib/mos-gw-init

Conversation

@jstarpl

@jstarpl jstarpl commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

About the Contributor

This pull request is posted on behalf of the NRK.

Type of Contribution

This is a:

Bug fix

Current Behavior

The Kubernetes /healthz and /readyz don't return an OK signal until devices configured at boot-up have connected.

New Behavior

The device connection is detached from boot-up, since it's not really necessary for the boot-up process. This allows the instance to stay up instead of being restarted.

Testing

  • I have added one or more unit tests for this PR
  • I have updated the relevant unit tests
  • No unit test changes are needed for this PR

Affected areas

This PR affects the MOS Gateway bootup process.

Time Frame

Not urgent, but we would like to get this merged into the in-development release.

Other Information

Status

  • PR is ready to be reviewed.
  • The functionality has been tested by the author.
  • Relevant unit tests has been added / updated.
  • Relevant documentation (code comments, system documentation) has been added / updated.

@jstarpl jstarpl added 🐛🔨 bugfix This is a fix for an issue Contribution from NRK Contributions sponsored by NRK (nrk.no) labels Apr 27, 2026
@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5dc93ad1-0d08-4261-87da-8a824c52927a

📥 Commits

Reviewing files that changed from the base of the PR and between 474e827 and 594ed23.

📒 Files selected for processing (1)
  • packages/mos-gateway/src/mosHandler.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/mos-gateway/src/mosHandler.ts

Walkthrough

MosHandler startup now awaits MOS connection, registers coreHandler.onConnected, sets up observers, and schedules a debounced device refresh via triggerUpdateDevices(). _coreHandler is made definite and the debounce timer field renamed to _triggerUpdateDevicesTimeout.

Changes

Initialization & debounce refactor

Layer / File(s) Summary
Fields and init guard
packages/mos-gateway/src/mosHandler.ts
_coreHandler becomes a definite CoreHandler field and the debounce timer field is renamed to _triggerUpdateDevicesTimeout. The runtime check in init that threw when coreHandler was undefined is removed.
Startup sequencing and debounced refresh
packages/mos-gateway/src/mosHandler.ts
init now awaits _initMosConnection(), registers coreHandler.onConnected, calls setupObservers(), and schedules a debounced refresh via triggerUpdateDevices() instead of awaiting _updateDevices() immediately. _deviceOptionsChanged now calls triggerUpdateDevices(); the helper clears existing timeouts and schedules _updateDevices() with error logging.
sequenceDiagram
  participant MosHandler
  participant MOSConnection
  participant CoreHandler
  participant UpdateDevices

  MosHandler->>MOSConnection: _initMosConnection()
  MOSConnection-->>MosHandler: connected
  MosHandler->>CoreHandler: register onConnected callback
  MosHandler->>MosHandler: setupObservers()
  MosHandler->>MosHandler: triggerUpdateDevices() (debounced)
  MosHandler->>UpdateDevices: _updateDevices() (scheduled after debounce)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • nytamin
  • rjmunro
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main bug fix: health check endpoints waiting for device connection, matching the core change in MosHandler initialization.
Description check ✅ Passed The description is well-related to the changeset, explaining the bug (health checks blocking until devices connect), the fix (decoupling device connection from bootup), and affected areas (MOS Gateway bootup).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Apr 27, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
packages/mos-gateway/src/mosHandler.ts 0.00% 12 Missing ⚠️

📢 Thoughts on this report? Let us know!

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/mos-gateway/src/mosHandler.ts (2)

83-83: Non-null assertion makes existing _coreHandler undefined-checks dead code.

With _coreHandler!: CoreHandler the field can no longer be typed undefined, so the if (!this._coreHandler) throw … branches at lines 156-158, 294, 453-455, and 599 are unreachable per the type system. They're harmless, but worth removing in a follow-up to keep the runtime invariant aligned with the type. Not blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/mos-gateway/src/mosHandler.ts` at line 83, The field _coreHandler is
declared with a non-null assertion but code contains runtime undefined checks,
so make the type reflect that by changing the declaration from "_coreHandler!:
CoreHandler" to "private _coreHandler?: CoreHandler" so existing guards (the if
(!this._coreHandler) throw ...) remain meaningful; update any call sites that
currently rely on non-null typing to either keep the guards or use explicit
non-null assertions where you truly know it is initialized (e.g., inside methods
that run after initialization) so the type and runtime behavior stay aligned.

209-218: Clear _triggerUpdateDevicesTimeout on dispose().

The deferred update is now scheduled via setTimeout and stored on the instance, but dispose() (lines 141-148) never clears it. The _disposed check inside _updateDevices() does prevent work from happening, but the pending timer still keeps the event loop alive and holds a closure reference to this. Cheap to fix:

♻️ Proposed cleanup
 	async dispose(): Promise<void> {
 		this._disposed = true
+		if (this._triggerUpdateDevicesTimeout) {
+			clearTimeout(this._triggerUpdateDevicesTimeout)
+			this._triggerUpdateDevicesTimeout = null
+		}
 		if (this.mos) {
 			return this.mos.dispose()
 		} else {
 			return Promise.resolve()
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/mos-gateway/src/mosHandler.ts` around lines 209 - 218, The instance
schedules a deferred timer in triggerUpdateDevices (stores it in
_triggerUpdateDevicesTimeout) but dispose() does not clear it, leaving the timer
alive and holding a reference to this; update dispose() to
clearTimeout(this._triggerUpdateDevicesTimeout) and then null/undefined the
property (or otherwise reset it) so any pending timer is cancelled, and ensure
triggerUpdateDevices still overwrites the field when scheduling a new timer;
reference functions/fields: triggerUpdateDevices, _triggerUpdateDevicesTimeout,
dispose, and _updateDevices.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/mos-gateway/src/mosHandler.ts`:
- Around line 132-139: The ordering regression causes setupObservers() to run
before _initMosConnection() and can NPE when _deviceOptionsChanged() calls
this.mos.setDebug; fix by guarding _deviceOptionsChanged() to tolerate a missing
mos (e.g. early-return or defer calling this.mos.setDebug if this.mos is
undefined) or by moving setupObservers() to run only after
_initMosConnection()/_updateDevices() has initialized this.mos; update the code
paths in setupObservers, triggerUpdateDevices, _deviceOptionsChanged,
_updateDevices and _initMosConnection so observers never call this.mos methods
when this.mos is undefined (prefer adding a simple if (!this.mos) return or
queue the change until after _initMosConnection completes).

---

Nitpick comments:
In `@packages/mos-gateway/src/mosHandler.ts`:
- Line 83: The field _coreHandler is declared with a non-null assertion but code
contains runtime undefined checks, so make the type reflect that by changing the
declaration from "_coreHandler!: CoreHandler" to "private _coreHandler?:
CoreHandler" so existing guards (the if (!this._coreHandler) throw ...) remain
meaningful; update any call sites that currently rely on non-null typing to
either keep the guards or use explicit non-null assertions where you truly know
it is initialized (e.g., inside methods that run after initialization) so the
type and runtime behavior stay aligned.
- Around line 209-218: The instance schedules a deferred timer in
triggerUpdateDevices (stores it in _triggerUpdateDevicesTimeout) but dispose()
does not clear it, leaving the timer alive and holding a reference to this;
update dispose() to clearTimeout(this._triggerUpdateDevicesTimeout) and then
null/undefined the property (or otherwise reset it) so any pending timer is
cancelled, and ensure triggerUpdateDevices still overwrites the field when
scheduling a new timer; reference functions/fields: triggerUpdateDevices,
_triggerUpdateDevicesTimeout, dispose, and _updateDevices.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7ad57a02-ca62-4d6c-bc26-bb7fb90c58f3

📥 Commits

Reviewing files that changed from the base of the PR and between bb2a27d and d126baa.

📒 Files selected for processing (1)
  • packages/mos-gateway/src/mosHandler.ts

Comment thread packages/mos-gateway/src/mosHandler.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/mos-gateway/src/mosHandler.ts`:
- Around line 140-141: init() currently returns before the first
_updateDevices() runs and swallows _initMosConnection() failures, so
MosHandler.create() can appear successful while MOS is broken; update init() to
await the initial device update by returning/awaiting
this.triggerUpdateDevices() (or directly await this._updateDevices()) so init()
only resolves after the first successful update, and change the
_initMosConnection() error handling to rethrow or propagate errors instead of
only logging (so callers of init()/MosHandler.create() receive the failure).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1af185af-b884-4d8c-9a2b-ae15055f4973

📥 Commits

Reviewing files that changed from the base of the PR and between d126baa and de1fece.

📒 Files selected for processing (1)
  • packages/mos-gateway/src/mosHandler.ts

Comment thread packages/mos-gateway/src/mosHandler.ts
@jstarpl jstarpl requested a review from nytamin May 14, 2026 12:39
@jstarpl jstarpl force-pushed the contrib/mos-gw-init branch from 474e827 to 594ed23 Compare June 2, 2026 08:38
@jstarpl jstarpl merged commit ab40675 into Sofie-Automation:main Jun 2, 2026
23 of 24 checks passed
@jstarpl jstarpl deleted the contrib/mos-gw-init branch June 2, 2026 09:01
@PeterC89

PeterC89 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

@jstarpl Not sure if this is an issue with our system but the MOS Gateway connections now show as disconnected all the time in the Sofie UI.
They are up and working and the health / readiness checks pass, any ideas?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🐛🔨 bugfix This is a fix for an issue Contribution from NRK Contributions sponsored by NRK (nrk.no)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants