Skip to content

Commit 15c9553

Browse files
committed
fix(proxy): wait for in-flight agent-to-upstream replies before closing srcChan
When sshHandleChannel processes a session, two independent forwarders run in parallel: agent-to-upstream requests (e.g. exec, env, pty-req) go through sshForwardChannelRequests(srcReqs, dstChan), and upstream-to-agent requests (e.g. exit-status) go the other way. The agent-to-upstream loop is NOT awaited before sluice closes srcChan. A fast upstream that, in response to exec, immediately replies + writes data + sends exit-status + closes lets sluice's wait on the three upstream-to-agent goroutines complete and close srcChan while the agent-to-upstream forwarder is still mid-flight: it has received the upstream's CHANNEL_REQUEST_SUCCESS reply via dstChan.SendRequest but has not yet called req.Reply on the agent side. The agent then receives SSH_MSG_CHANNEL_CLOSE before its session.SendRequest("exec", true, ...) sees the SUCCESS message on ch.msg. gossh closes ch.msg on SSH_MSG_CHANNEL_CLOSE, which causes the blocked SendRequest to return io.EOF (channel.go:603-606). session.Output("cmd") surfaces this as "exec command via SSH: EOF" even though the upstream replied successfully. The fix wraps each iteration of the agent-to-upstream forwarder in a sync.WaitGroup. sshHandleChannel waits for the WaitGroup to drain after the three upstream-side signals, then closes srcChan. Any in-flight exec reply has been fully written to srcChan before the close arrives. The race was deterministically reproducible on the e2e-linux push-event runner since commit d27b05e moved srcChan.CloseWrite() out of the data goroutine, narrowing the window between the upstream's reply landing and sluice's close. Eight successive main-branch e2e runs passed before d27b05e; the next two (after PR #38 and PR #40 merges) both failed at TestCredential_SSHInjection with the EOF symptom.
1 parent 55092f4 commit 15c9553

1 file changed

Lines changed: 60 additions & 5 deletions

File tree

internal/proxy/ssh.go

Lines changed: 60 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ import (
1111
"net"
1212
"os"
1313
"path/filepath"
14+
"sync"
1415

1516
"github.com/nemirovsky/sluice/internal/vault"
1617
"golang.org/x/crypto/ssh"
@@ -263,8 +264,24 @@ func sshHandleChannel(newChan ssh.NewChannel, dst ssh.Conn) {
263264
// from upstream finish, each signaling on this channel.
264265
upstreamDone := make(chan struct{}, 3)
265266

266-
// Forward per-channel requests bidirectionally.
267-
go sshForwardChannelRequests(srcReqs, dstChan)
267+
// Track agent-to-upstream requests that are mid-flight. Each request
268+
// the agent sends has to be forwarded to upstream, awaited for a
269+
// reply (when WantReply is true), and replied to on the agent side
270+
// before sluice may close srcChan. Without this barrier, a fast
271+
// upstream that replies + writes data + sends exit-status + closes
272+
// in one burst lets sluice drain all three upstream-to-agent
273+
// goroutines and close srcChan while this forwarder is still
274+
// mid-reply for the original exec request. The agent then observes
275+
// SSH_MSG_CHANNEL_CLOSE before its SendRequest("exec", true, ...)
276+
// receives a SUCCESS/FAILURE on ch.msg, and the gossh client
277+
// surfaces the closed ch.msg as io.EOF.
278+
var inFlight sync.WaitGroup
279+
280+
// Forward per-channel requests bidirectionally. The agent-to-upstream
281+
// loop wraps each request in inFlight so sluice's pre-close barrier
282+
// can drain them. The upstream-to-agent loop signals upstreamDone
283+
// when dstReqs closes.
284+
go sshForwardChannelRequestsTracked(srcReqs, dstChan, &inFlight)
268285
go func() {
269286
sshForwardChannelRequests(dstReqs, srcChan)
270287
upstreamDone <- struct{}{}
@@ -304,10 +321,20 @@ func sshHandleChannel(newChan ssh.NewChannel, dst ssh.Conn) {
304321
<-upstreamDone
305322
<-upstreamDone
306323

324+
// Also drain any agent-to-upstream request that is mid-flight. A
325+
// pending WantReply=true request is waiting on dst.SendRequest to
326+
// return, after which it still has to call req.Reply on the agent
327+
// side. Closing srcChan before that reply is written would let the
328+
// agent see channel-close before the SUCCESS/FAILURE message on
329+
// ch.msg, which gossh surfaces as io.EOF from
330+
// session.SendRequest("exec", true, ...).
331+
inFlight.Wait()
332+
307333
// Now that exit-status has been forwarded (the dstReqs goroutine
308-
// has finished), signal stdout EOF to the agent and close the
309-
// channel. The agent's session.Wait() now sees the documented
310-
// order: data, exit-status, EOF, close.
334+
// has finished) and every pending agent-side reply has been
335+
// written, signal stdout EOF to the agent and close the channel.
336+
// The agent's session.Wait() now sees the documented order:
337+
// data, exit-status, EOF, close.
311338
_ = srcChan.CloseWrite()
312339
_ = srcChan.Close()
313340
_ = dstChan.Close()
@@ -328,3 +355,31 @@ func sshForwardChannelRequests(reqs <-chan *ssh.Request, dst ssh.Channel) {
328355
}
329356
}
330357
}
358+
359+
// sshForwardChannelRequestsTracked is the agent-to-upstream variant of
360+
// sshForwardChannelRequests. Each request is wrapped in the inFlight
361+
// WaitGroup so sshHandleChannel's pre-close barrier can wait for the
362+
// reply on the agent direction (req.Reply on srcChan) to be written
363+
// before sluice closes srcChan. Otherwise an agent that called
364+
// session.SendRequest("exec", WantReply=true, ...) can observe
365+
// SSH_MSG_CHANNEL_CLOSE before its ch.msg receives the
366+
// CHANNEL_REQUEST_SUCCESS reply — gossh surfaces a closed ch.msg as
367+
// io.EOF, and `session.Output("cmd")` fails with EOF even though the
368+
// upstream replied successfully.
369+
func sshForwardChannelRequestsTracked(reqs <-chan *ssh.Request, dst ssh.Channel, inFlight *sync.WaitGroup) {
370+
for req := range reqs {
371+
inFlight.Add(1)
372+
ok, err := dst.SendRequest(req.Type, req.WantReply, req.Payload)
373+
if err != nil {
374+
if req.WantReply {
375+
_ = req.Reply(false, nil)
376+
}
377+
inFlight.Done()
378+
continue
379+
}
380+
if req.WantReply {
381+
_ = req.Reply(ok, nil)
382+
}
383+
inFlight.Done()
384+
}
385+
}

0 commit comments

Comments
 (0)