Fix HTTP test port race and self-request origin#3570
Conversation
2b379f9 to
0c53025
Compare
Signed-off-by: Zhiwei Liang <zhiwei.liang@zliang.me>
0c53025 to
6b9e838
Compare
itowlson
left a comment
There was a problem hiding this comment.
I'll defer to @rylev on the test stuff, but I'd like to understand how often we are seeing port race failures - I don't think I've ever seen one myself. Unless you're being actively affected by them, I would tend to be conservative with changes.
I can see the issue with self-requests though. So we probably need to fix that and if the rest of it comes along for the ride well fair enough.
| .local_addr()? | ||
| .port()) | ||
| fn parse_serving_port(output: &str) -> Option<u16> { | ||
| parse_plain_serving_port(output).or_else(|| parse_json_serving_port(output)) |
There was a problem hiding this comment.
Wait, don't we know whether it's printing JSON or plain text?
There was a problem hiding this comment.
There was a test I added in the previous PR for testing JSON output, and all other tests are plain text. I can update the test code to force every test to output JSON, which is easier to handle. How does that sound?
There was a problem hiding this comment.
I was thinking more that we would be able to know when calling parse_serving_port whether the output was set to JSON. But maybe that info is buried in some Command arg that appears only in a callback somewhere and is inaccessible. It's fine to leave this as is.
|
@itowlson Thanks for the feedback! Actually, I saw more than once impact of this issue, sometimes it's very implicit, shown as 404 not found error, when a test code hits another test spin instance which doesn't have the expected path. |
Interesting. If it's happening that often, I'm surprised we don't get a lot more flakes in CI - I mean, yes, we do get flakes, but if they show up in manual testing then I'd expect them to absolutely wreck CI, which runs about a gazillion times on every PR. But of course we'd want it to be correct anyway! |
Signed-off-by: Zhiwei Liang <zhiwei.liang@zliang.me>
6b9e838 to
c4617b7
Compare
Actually, I saw that from one CI run in my another PR and started investigating into it. I didn't catch it locally |
Summary
Fixes flaky HTTP integration test startup caused by a test harness port race.
The harness previously selected a port by binding
localhost:0, reading the assigned port, dropping that listener, and later launchingspin up --listen 127.0.0.1:{port}. In parallel test runs, another process could claim that port before this Spin instance bound it. The old readiness check only proved that something accepted TCP connections on the port, so a test could accidentally send requests to the wrong Spin app and see a spurious404.Change
Production code
Test framework
--listen 127.0.0.1:0, then parse the actual bound port from this Spin process's startup output.Why the server change is needed
After the test harness switched to
127.0.0.1:0,HttpServer::listen_addrcould remain127.0.0.1:0even though the actual listener was bound to an assigned port. Self requests must use the real bound address, not the configured:0address. The server now storeslistener.local_addr()after binding and uses that address when configuringSelfRequestOrigin.For in-process tests,
InProcessSpinbuilds a server and driveshandle()directly without callingserve(), so no socket address is recorded. That path falls back to the configured listen address.