Skip to content

Commit b6a099f

Browse files
authored
test: Harden browser contract-test adapter against unanswered-command hangs (#1399)
## Summary The browser SDK contract-test **adapter** (`startAdapter.ts`) forwarded each command to the browser entity over WebSocket and waited **forever** for a reply. If the browser ever missed one reply, that REST request — and therefore the harness — hung indefinitely (even capability/status queries returned `000`), with no recovery path. This permanently wedged the service, which is especially painful when reusing one service across many runs (e.g. flake hunting). This PR makes the adapter resilient: - **Timeout on `send()`** (`ADAPTER_REQUEST_TIMEOUT_MS`, default **30s**). A stalled command now fails that single request with **HTTP 500** so the harness fails that one test and continues, instead of wedging the whole chain. - **Reject in-flight commands on WebSocket close**, so no REST request is left waiting on a dead socket. - **Error handler on REST `listen()`** so a silent bind failure (e.g. the port still being held → "WS up but REST never bound") is logged instead of disappearing. Changes are confined to test tooling; no SDK runtime code is touched. ## Verification 5 consecutive full suites against one reused adapter/browser stayed green — `ec=0, fail=0`, adapter responsive at ~1 ms throughout. The timeout path turns a previously-permanent hang into a single failed request the harness can recover from. ## Notes - An earlier revision of this branch also had the entity destroy leftover clients on `getCapabilities`. **That was removed**: `getCapabilities` is not a safe "new run" signal because multiple harnesses can drive one entity concurrently, so it could wipe another run's active clients. The lingering-final-client cleanup needs a different, run-aware approach and is out of scope here. - `packages/sdk/browser/example-fdv2/src/app.ts` has unrelated local debugging tweaks and is intentionally not part of this branch. ## Test coverage This tooling package has no unit-test harness; validation was integration-based (above). <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Low Risk** > Only internal contract-test adapter tooling is modified; no production SDK or auth/data paths are affected. > > **Overview** > The contract-test **adapter** (`startAdapter`) no longer waits indefinitely for browser-entity WebSocket replies. **`send()`** now uses a configurable timeout (`requestTimeoutMs` / `ADAPTER_REQUEST_TIMEOUT_MS`, default **30s**) and rejects with a clear error; **`waiters`** are scoped **per WebSocket** and include timers plus **reject on socket close**, so a dropped connection does not leave unrelated REST calls stuck. > > Async REST routes are wrapped with a **`guard`** that maps failures (timeout, closed socket, etc.) to **HTTP 500** instead of unhandled rejections that could hang the harness. The REST **`listen()`** server also gets an **`error`** handler so bind failures (e.g. `EADDRINUSE`) are logged. > > Changes are limited to internal contract-test tooling; SDK runtime is unchanged. > > <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit d62dc65. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup> <!-- /CURSOR_SUMMARY -->
1 parent bce024c commit b6a099f

1 file changed

Lines changed: 123 additions & 51 deletions

File tree

packages/tooling/contract-test-utils/src/adapter/startAdapter.ts

Lines changed: 123 additions & 51 deletions
Original file line numberDiff line numberDiff line change
@@ -10,42 +10,97 @@ import { WebSocketServer } from 'ws';
1010
export interface AdapterOptions {
1111
restPort?: number;
1212
wsPort?: number;
13+
/**
14+
* How long (ms) the adapter waits for the browser entity to answer a command
15+
* before failing the REST request. Prevents a single unanswered command from
16+
* wedging the adapter (and therefore the test harness) indefinitely.
17+
* Override with ADAPTER_REQUEST_TIMEOUT_MS. Defaults to 30s.
18+
*/
19+
requestTimeoutMs?: number;
20+
}
21+
22+
/**
23+
* Tracks an outstanding command sent to the browser entity, awaiting its
24+
* response over the WebSocket.
25+
*/
26+
interface Waiter {
27+
resolve: (data: any) => void;
28+
reject: (err: Error) => void;
29+
timer: ReturnType<typeof setTimeout>;
1330
}
1431

1532
let server: http.Server | undefined;
1633

1734
export function startAdapter(options?: AdapterOptions) {
1835
const restPort = options?.restPort ?? 8000;
1936
const wsPort = options?.wsPort ?? 8001;
37+
const requestTimeoutMs =
38+
options?.requestTimeoutMs ?? Number(process.env.ADAPTER_REQUEST_TIMEOUT_MS ?? 30000);
2039

2140
const wss = new WebSocketServer({ port: wsPort });
22-
const waiters: Record<string, (data: unknown) => void> = {};
2341

2442
console.log('Running contract test harness adapter.');
2543
wss.on('connection', async (ws) => {
2644
ws.on('error', console.error);
2745

46+
// Outstanding commands awaiting a response over THIS socket. Scoped per
47+
// connection (not shared across the adapter) so that closing one socket
48+
// only fails its own in-flight commands and never rejects waiters that
49+
// belong to another, e.g. reconnecting, connection.
50+
const waiters: Record<string, Waiter> = {};
51+
2852
ws.on('message', (stringData: string) => {
2953
const data = JSON.parse(stringData);
30-
if (Object.prototype.hasOwnProperty.call(waiters, data.reqId)) {
31-
waiters[data.reqId](data);
54+
const waiter = waiters[data.reqId];
55+
if (waiter) {
56+
clearTimeout(waiter.timer);
3257
delete waiters[data.reqId];
58+
waiter.resolve(data);
3359
} else {
3460
console.error('Did not find outstanding request', data.reqId);
3561
}
3662
});
3763

38-
// eslint-disable-next-line @typescript-eslint/no-explicit-any
39-
const send = (data: { [key: string]: unknown; reqId: string }): Promise<any> => {
40-
let resolver: (data: unknown) => void;
41-
const waiter = new Promise((resolve) => {
42-
resolver = resolve;
64+
// When this socket drops, fail its outstanding commands instead of leaving
65+
// those REST requests (and the harness) hanging forever.
66+
ws.on('close', () => {
67+
Object.keys(waiters).forEach((reqId) => {
68+
const waiter = waiters[reqId];
69+
clearTimeout(waiter.timer);
70+
delete waiters[reqId];
71+
waiter.reject(new Error('WebSocket connection to entity closed before a response was received.'));
4372
});
44-
// @ts-expect-error The body of the above assignment runs sequentially.
45-
waiters[data.reqId] = resolver;
46-
ws.send(JSON.stringify(data));
47-
return waiter;
48-
};
73+
});
74+
75+
const send = (data: { [key: string]: unknown; reqId: string }): Promise<any> =>
76+
new Promise((resolve, reject) => {
77+
const timer = setTimeout(() => {
78+
delete waiters[data.reqId];
79+
reject(
80+
new Error(
81+
`Adapter timed out after ${requestTimeoutMs}ms awaiting entity response to '${data.command}' (reqId ${data.reqId}).`,
82+
),
83+
);
84+
}, requestTimeoutMs);
85+
waiters[data.reqId] = { resolve, reject, timer };
86+
ws.send(JSON.stringify(data));
87+
});
88+
89+
// Wrap async route handlers so an unanswered/timed-out command produces a
90+
// 500 for that single request rather than an unhandled rejection that
91+
// leaves the request — and the harness — hanging.
92+
const guard =
93+
(
94+
handler: (req: express.Request, res: express.Response) => Promise<void>,
95+
): express.RequestHandler =>
96+
(req, res) => {
97+
handler(req, res).catch((err) => {
98+
console.error('Adapter request failed:', err);
99+
if (!res.headersSent) {
100+
res.status(500).send(String(err));
101+
}
102+
});
103+
};
49104

50105
if (server) {
51106
await util.promisify(server.close).call(server);
@@ -63,54 +118,71 @@ export function startAdapter(options?: AdapterOptions) {
63118
);
64119
app.use(bodyParser.json());
65120

66-
app.get('/', async (_req, res) => {
67-
const commandResult = await send({ command: 'getCapabilities', reqId: randomUUID() });
68-
res.header('Content-Type', 'application/json');
69-
res.json(commandResult);
70-
});
121+
app.get(
122+
'/',
123+
guard(async (_req, res) => {
124+
const commandResult = await send({ command: 'getCapabilities', reqId: randomUUID() });
125+
res.header('Content-Type', 'application/json');
126+
res.json(commandResult);
127+
}),
128+
);
71129

72130
app.delete('/', () => {
73131
process.exit();
74132
});
75133

76-
app.post('/', async (req, res) => {
77-
const commandResult = await send({
78-
command: 'createClient',
79-
body: req.body,
80-
reqId: randomUUID(),
81-
});
82-
if (commandResult.resourceUrl) {
83-
res.set('Location', commandResult.resourceUrl);
84-
}
85-
if (commandResult.status) {
86-
res.status(commandResult.status);
87-
}
88-
res.send();
89-
});
134+
app.post(
135+
'/',
136+
guard(async (req, res) => {
137+
const commandResult = await send({
138+
command: 'createClient',
139+
body: req.body,
140+
reqId: randomUUID(),
141+
});
142+
if (commandResult.resourceUrl) {
143+
res.set('Location', commandResult.resourceUrl);
144+
}
145+
if (commandResult.status) {
146+
res.status(commandResult.status);
147+
}
148+
res.send();
149+
}),
150+
);
90151

91-
app.post('/clients/:id', async (req, res) => {
92-
const commandResult = await send({
93-
command: 'runCommand',
94-
id: req.params.id,
95-
body: req.body,
96-
reqId: randomUUID(),
97-
});
98-
if (commandResult.status) {
99-
res.status(commandResult.status);
100-
}
101-
if (commandResult.body) {
102-
res.write(JSON.stringify(commandResult.body));
103-
}
104-
res.send();
105-
});
152+
app.post(
153+
'/clients/:id',
154+
guard(async (req, res) => {
155+
const commandResult = await send({
156+
command: 'runCommand',
157+
id: req.params.id,
158+
body: req.body,
159+
reqId: randomUUID(),
160+
});
161+
if (commandResult.status) {
162+
res.status(commandResult.status);
163+
}
164+
if (commandResult.body) {
165+
res.write(JSON.stringify(commandResult.body));
166+
}
167+
res.send();
168+
}),
169+
);
106170

107-
app.delete('/clients/:id', async (req, res) => {
108-
await send({ command: 'deleteClient', id: req.params.id, reqId: randomUUID() });
109-
res.send();
110-
});
171+
app.delete(
172+
'/clients/:id',
173+
guard(async (req, res) => {
174+
await send({ command: 'deleteClient', id: req.params.id, reqId: randomUUID() });
175+
res.send();
176+
}),
177+
);
111178

112179
server = app.listen(restPort, () => {
113180
console.log('Listening on port %d', restPort);
114181
});
182+
// Surface bind failures (e.g. EADDRINUSE) instead of silently never
183+
// logging "Listening on port".
184+
server.on('error', (err) => {
185+
console.error(`REST server error on port ${restPort}:`, err);
186+
});
115187
});
116188
}

0 commit comments

Comments
 (0)