feat: wire manifest readinessProbe to remote session controller#1135
Conversation
589f4a5 to
c09c795
Compare
olevski
left a comment
There was a problem hiding this comment.
Thanks Salim. Just a few minor things. You can ignore the note about finding a free port if you wish.
| { | ||
| name: "interactive tcp open port returns 200", | ||
| cfg: func() config.RemoteSessionControllerConfig { | ||
| c := baseCfg | ||
| c.ReadinessProbeType = string(amaltheadevv1alpha1.TCP) | ||
| c.SessionPort = 18000 | ||
| return c | ||
| }(), | ||
| setupBackend: func() func() { | ||
| ln, err := net.Listen("tcp", "127.0.0.1:18000") | ||
| if err != nil { | ||
| panic(err) | ||
| } | ||
| return func() { _ = ln.Close() } | ||
| }, | ||
| wantStatus: http.StatusOK, | ||
| }, |
There was a problem hiding this comment.
This is a way to not hardcode the port to run the test server on.
func getFreePortOrDie() int {
var a *net.TCPAddr
var err error
if a, err = net.ResolveTCPAddr("tcp", "localhost:0"); err == nil {
var l *net.TCPListener
if l, err = net.ListenTCP("tcp", a); err == nil {
defer func() {
err := l.Close()
if err != nil {
panic(err)
}
}()
return l.Addr().(*net.TCPAddr).Port
}
panic(err)
}
panic(err)
}
| return err | ||
| } | ||
|
|
||
| cmd.Flags().String(readinessProbeTypeFlag, "", "readiness probe type: none, tcp, or http") |
There was a problem hiding this comment.
Why is the default "" and not "none"?
| if cfg.SessionPort == 0 { | ||
| return c.NoContent(http.StatusOK) | ||
| } |
There was a problem hiding this comment.
This should not happen in my opinion: using a port value of 0 is invalid. The default values should map to amaltheadevv1alpha1.None.
| case string(amaltheadevv1alpha1.None): | ||
| return c.NoContent(http.StatusOK) |
There was a problem hiding this comment.
Since we also have a default case, you can remove the None case.
| cfg.ReadinessProbeType != string(amaltheadevv1alpha1.TCP) && | ||
| cfg.ReadinessProbeType != string(amaltheadevv1alpha1.HTTP) { | ||
| return fmt.Errorf("invalid readiness probe type: %s", cfg.ReadinessProbeType) | ||
| } |
There was a problem hiding this comment.
Is there any reason to not update the configuration to have none as the default? This way you can remove the defensive code in the endpoint implementation.
There was a problem hiding this comment.
added default as none
05e26bf to
c5b20d9
Compare
Scope: Core feature — the RSC /ready endpoint becomes aware of the remote session's serving state. - api/v1alpha1/amaltheasession_children.go: inject RSC_SESSION_PORT, RSC_SESSION_URL_PATH, and RSC_READINESS_PROBE_TYPE into the RSC container - internal/remote/config/config.go: add SessionPort, SessionURLPath, ReadinessProbeType to RemoteSessionControllerConfig; register flags/env vars - internal/remote/server/server.go: rework /ready handler with switch on ReadinessProbeType (none/tcp/http/default) - internal/remote/server/server_test.go: add TestReadyEndpoint with 11 test cases
Scope: Prevent invalid probe types from being silently accepted; test the new config surface. - internal/remote/config/config.go: add ReadinessProbeType enum validation in Validate() (reject values other than "", none, tcp, http) - internal/remote/config/config_test.go: add TestConfigNewFlags covering defaults, CLI flags, env vars (RSC_READINESS_PROBE_TYPE, RSC_SESSION_PORT, RSC_SESSION_URL_PATH), and validation errors
Co-authored-by: Tasko Olevski <16360283+olevski@users.noreply.github.com>
3f6ecab to
a107f62
Compare
Co-authored-by: Flora Thiebaut <flora@leafty.dev>
6eafe8d to
8658bbb
Compare
Summary
The AmaltheaSession CRD already exposes spec.session.readinessProbe, spec.session.port, and spec.session.urlPath. These fields were previously only honored for local sessions (through Kubernetes container probes). This PR forwards the same values into the remote session controller sidecar so its /ready HTTP endpoint reports readiness based on whether the actual remote session is reachable.
Motivation and Context
For remote sessions managed via FirecREST / RunAI / Slurm, the controller pod was always reporting ready immediately, even if the IDE on the remote compute node had not started yet. We need the Kubernetes readiness state to reflect real user-facing session availability so that ingress and service routing only start once the session is actually usable.
Changes
Tests
Compatibility
Notes