feat: allow multipart file uploads to be consumed as streams#4424
feat: allow multipart file uploads to be consumed as streams#4424Copilot wants to merge 11 commits into
Conversation
Review or Edit in CodeSandboxOpen the branch in Web Editor • VS Code • Insiders |
| @@ -1,3 +1,5 @@ | |||
| import { Readable } from 'node:stream'; | |||
| import Busboy, { BusboyFileStream } from '@fastify/busboy'; | |||
There was a problem hiding this comment.
Replaced the @fastify/busboy + node:stream implementation with a self-contained WHATWG-streams-based multipart parser that uses only standard APIs (ReadableStream, TextDecoder, TextEncoder, Uint8Array). It works in Node.js, Cloudflare Workers, Deno, and browsers alike. The @fastify/busboy dependency has been removed from package.json. Commit: 040cdf1.
💻 Website PreviewThe latest changes are available as preview in: https://59469410.envelop.pages.dev |
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
💻 Website PreviewThe latest changes are available as preview in: https://f4241516.envelop.pages.dev |
Fixed. The CI failure was a TypeScript 5.9 error — Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
| } | ||
| }); | ||
|
|
||
| describe('file uploads (streaming mode)', () => { |
There was a problem hiding this comment.
@copilot make sure it is streamed not buffered first then consumed. So files should be lazily loaded right? So let's test it
💻 Website PreviewThe latest changes are available as preview in: https://6ff8a251.envelop.pages.dev |
💻 Website PreviewThe latest changes are available as preview in: https://pr-4424.graphql-yoga.pages.dev |
…tion Agent-Logs-Url: https://github.com/graphql-hive/graphql-yoga/sessions/c66673b2-a216-4b18-8262-5016670052e5 Co-authored-by: ardatan <20847995+ardatan@users.noreply.github.com>
| - **Only available in Node.js.** In edge runtimes the standard `request.formData()` path is used | ||
| regardless of this option. | ||
|
|
||
| ## Configuring Multipart Request Processing (only for Node.js) |
There was a problem hiding this comment.
@copilot Implement this in the parser too so we don't have this node.js specific and have this part of our new parser. Update the configuraiton api etc for that
✅ Yoga Benchmark Results |
|
hey @ardatan , I saw you started working on this. In the meantime I implemented with help of Claude a GatewayPlugin and a YogaPlugin, that allows for streaming. In case you are interested, I can share the code. |
|
Sure @phal0r ! I'd love to see it! |
…st, fix dead assignments Agent-Logs-Url: https://github.com/graphql-hive/graphql-yoga/sessions/6298cf19-6020-4a6c-a7dc-61e237ef133f
💻 Website PreviewThe latest changes are available as preview in: https://e8454ba2.envelop.pages.dev |
💻 Website PreviewThe latest changes are available as preview in: https://pr-4424.graphql-yoga.pages.dev |
…; align error message format Agent-Logs-Url: https://github.com/graphql-hive/graphql-yoga/sessions/a8adc1de-80f2-43ac-8de0-42de2415eaf1
💻 Website PreviewThe latest changes are available as preview in: https://ba9b7d0a.envelop.pages.dev |
💻 Website PreviewThe latest changes are available as preview in: https://pr-4424.graphql-yoga.pages.dev |
|
Sure thing @ardatan ! Some additional notes: I implemented one GatewayPlugin for the hive-gateway and one YogaPlugin for the yoga-graphql server, although I know that the gateway uses yoga under the hood. The implementations are different due to different contexts: For the gateway I assume, that all subgraphs run in the same cluster/network/infra, so usually network is stable, fast and reliable. Thus the stream is just piped through from incoming network to outgoing network. For yoga I choose to write the file to disk and create a read stream from the temporarily stored file, because usually one wants to send this to some object storage with unreliable network over internet. Of course this is only an inspiration for a final solution, but would love to see those implemented and remove my custom plugins :) This is the one for Yoga: https://gist.github.com/phal0r/b25878bd83e10b576529f59f6bbb16fc this is the one for the Gateway: https://gist.github.com/phal0r/507e7cd3b67230b2fd7fbc14efff9da0 I am running both apps with Bun and it works well there. I tested with files > 1GB and container limits of 1GB to ensure it really works. I also created some end2end tests with real file uploads through the network to rule out any inconsistencies. Along the way were other problems, like the minio js client offering streaming, but being buggy (does not cater for backpressure from the network and thus still loaded the whole in a memory buffer :D ) At the moment this system runs reliable in my tests with big uploads. |
done = trueassignments at lines 347 and 351 inpost-multipart.tsformData()code path, removestream?: booleanoption)limitsoption toMultipartOptionsand enforce them in the streaming parser (fileSize, files, fieldSize)limitsconfigurationincremental-delivery.spec.tsto use newmultipart.limitsAPI instead offetchAPI: createFetch({ formDataLimits })File size limit exceeded: N bytes)