Skip to content

feat: wire manifest readinessProbe to remote session controller#1135

Merged
SalimKayal merged 9 commits into
mainfrom
salimkayal-feat-allow-readiness-probe-on-http
Jun 17, 2026
Merged

feat: wire manifest readinessProbe to remote session controller#1135
SalimKayal merged 9 commits into
mainfrom
salimkayal-feat-allow-readiness-probe-on-http

Conversation

@SalimKayal

@SalimKayal SalimKayal commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Summary

The AmaltheaSession CRD already exposes spec.session.readinessProbe, spec.session.port, and spec.session.urlPath. These fields were previously only honored for local sessions (through Kubernetes container probes). This PR forwards the same values into the remote session controller sidecar so its /ready HTTP endpoint reports readiness based on whether the actual remote session is reachable.

Motivation and Context

For remote sessions managed via FirecREST / RunAI / Slurm, the controller pod was always reporting ready immediately, even if the IDE on the remote compute node had not started yet. We need the Kubernetes readiness state to reflect real user-facing session availability so that ingress and service routing only start once the session is actually usable.

Changes

  • Injects RSC_SESSION_PORT, RSC_SESSION_URL_PATH, and RSC_READINESS_PROBE_TYPE as environment variables into the remote session controller container (sessionContainerRemote).
  • The sidecar's /ready handler now interprets these values:
    • none — immediately returns 200 OK.
    • tcp — attempts a TCP dial to 127.0.0.1:<RSC_SESSION_PORT> (the local wstunnel endpoint); returns 200 on success, 503 otherwise.
    • http — performs an HTTP GET to 127.0.0.1:<RSC_SESSION_PORT><RSC_SESSION_URL_PATH>; returns 200 on 2xx/3xx responses, 503 otherwise.
  • Fully backward compatible: an empty or unknown probe type, as well as port 0, fall back to the previous behavior of returning 200 OK.

Tests

  • Added TestConfigNewFlags covering CLI flags, environment variables, and validation of the readiness probe type.
  • Added TestReadyEndpoint covering all probe modes and edge cases: open/closed TCP ports, HTTP 200/302/500, connection refused, and port-0 fallback.

Compatibility

  • No CRD changes required — the fields already exist in amaltheasession_types.go.
  • Existing remote sessions without these values continue to behave exactly as before.

Notes

@SalimKayal SalimKayal requested review from leafty and sambuc June 10, 2026 14:10
@SalimKayal SalimKayal requested review from a team and olevski as code owners June 10, 2026 14:10
@SalimKayal SalimKayal force-pushed the salimkayal-feat-allow-readiness-probe-on-http branch 2 times, most recently from 589f4a5 to c09c795 Compare June 11, 2026 14:17

@olevski olevski left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Salim. Just a few minor things. You can ignore the note about finding a free port if you wish.

Comment thread internal/remote/server/server.go
Comment thread internal/remote/server/server.go
Comment on lines +56 to +72
{
name: "interactive tcp open port returns 200",
cfg: func() config.RemoteSessionControllerConfig {
c := baseCfg
c.ReadinessProbeType = string(amaltheadevv1alpha1.TCP)
c.SessionPort = 18000
return c
}(),
setupBackend: func() func() {
ln, err := net.Listen("tcp", "127.0.0.1:18000")
if err != nil {
panic(err)
}
return func() { _ = ln.Close() }
},
wantStatus: http.StatusOK,
},

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a way to not hardcode the port to run the test server on.

func getFreePortOrDie() int {
	var a *net.TCPAddr
	var err error
	if a, err = net.ResolveTCPAddr("tcp", "localhost:0"); err == nil {
		var l *net.TCPListener
		if l, err = net.ListenTCP("tcp", a); err == nil {
			defer func() {
				err := l.Close()
				if err != nil {
					panic(err)
				}
			}()
			return l.Addr().(*net.TCPAddr).Port
		}
		panic(err)
	}
	panic(err)
}

From https://github.com/SwissDataScienceCenter/renku-frontend-buildpacks/blob/main/tests/e2e/utils.go#L162C1-L179C2

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, added

Comment thread internal/remote/config/config.go Outdated
return err
}

cmd.Flags().String(readinessProbeTypeFlag, "", "readiness probe type: none, tcp, or http")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the default "" and not "none"?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

Comment thread internal/remote/server/server.go Outdated
Comment on lines +123 to +125
if cfg.SessionPort == 0 {
return c.NoContent(http.StatusOK)
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not happen in my opinion: using a port value of 0 is invalid. The default values should map to amaltheadevv1alpha1.None.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Comment thread internal/remote/server/server.go Outdated
Comment on lines +120 to +121
case string(amaltheadevv1alpha1.None):
return c.NoContent(http.StatusOK)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we also have a default case, you can remove the None case.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

cfg.ReadinessProbeType != string(amaltheadevv1alpha1.TCP) &&
cfg.ReadinessProbeType != string(amaltheadevv1alpha1.HTTP) {
return fmt.Errorf("invalid readiness probe type: %s", cfg.ReadinessProbeType)
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to not update the configuration to have none as the default? This way you can remove the defensive code in the endpoint implementation.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added default as none

@SalimKayal SalimKayal force-pushed the salimkayal-feat-allow-readiness-probe-on-http branch from 05e26bf to c5b20d9 Compare June 16, 2026 10:59
@SalimKayal SalimKayal requested a review from olevski June 16, 2026 14:24
SalimKayal and others added 8 commits June 17, 2026 11:57
Scope: Core feature — the RSC /ready endpoint becomes aware of the remote session's serving state.

- api/v1alpha1/amaltheasession_children.go: inject RSC_SESSION_PORT, RSC_SESSION_URL_PATH, and RSC_READINESS_PROBE_TYPE into the RSC container
- internal/remote/config/config.go: add SessionPort, SessionURLPath, ReadinessProbeType to RemoteSessionControllerConfig; register flags/env vars
- internal/remote/server/server.go: rework /ready handler with switch on ReadinessProbeType (none/tcp/http/default)
- internal/remote/server/server_test.go: add TestReadyEndpoint with 11 test cases
Scope: Prevent invalid probe types from being silently accepted; test the new config surface.

- internal/remote/config/config.go: add ReadinessProbeType enum validation in Validate() (reject values other than "", none, tcp, http)
- internal/remote/config/config_test.go: add TestConfigNewFlags covering defaults, CLI flags, env vars (RSC_READINESS_PROBE_TYPE, RSC_SESSION_PORT, RSC_SESSION_URL_PATH), and validation errors
Co-authored-by: Tasko Olevski <16360283+olevski@users.noreply.github.com>
@SalimKayal SalimKayal force-pushed the salimkayal-feat-allow-readiness-probe-on-http branch from 3f6ecab to a107f62 Compare June 17, 2026 09:57
Comment thread internal/remote/server/server.go Outdated
Comment thread internal/remote/server/server.go Outdated
Co-authored-by: Flora Thiebaut <flora@leafty.dev>
@SalimKayal SalimKayal force-pushed the salimkayal-feat-allow-readiness-probe-on-http branch from 6eafe8d to 8658bbb Compare June 17, 2026 10:34
@SalimKayal SalimKayal dismissed olevski’s stale review June 17, 2026 11:53

are the changes ok ?

@SalimKayal SalimKayal merged commit 1488d77 into main Jun 17, 2026
8 checks passed
@SalimKayal SalimKayal deleted the salimkayal-feat-allow-readiness-probe-on-http branch June 17, 2026 11:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants