Analyzing a heap dump, I've found that something is holding a reference to a promise for every file in the archive until the entire extraction completes. I'm streaming the file through from an HTTP multipart form upload, using the Parse() function and its associated events and streams. My test file is about 10GB and contains about 500,000 files.

The lib is doing a terrific job: extraction is flawless and memory usage is great. As advertised, it never requires holding the entire archive, or even any single file within the archive, in memory, and even with this issue I can keep my entire node process at around 200-300MB or less while streaming several of these 10GB files concurrently. Once the file is completely extracted, the 'finish' event is emitted and everything gets garbage collected. So it's not technically a memory leak, just not ideal.

But our use case will in fact sometimes include extractions this large, with hundreds of thousands or millions of files in a single archive, and we're running it in containers where we need to keep memory limited, so this is the one show stopper right now. I've created a fork and am looking into it, but glancing through the code I don't see anything obvious like promises being added to an array or map.
I've tested this with none of my own code in the pipeline, just calling autodrain() on each entry, and I still see the issue, so I believe the problem is in the lib and not in my implementation.
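For anyone who wants to reproduce this, a stripped-down test of that kind could be sketched roughly as follows. This is only an illustration, not code from the lib: the helper name `drainAndSampleHeap` and the `sampleEvery` parameter are made up, and the only assumptions are the ones described above, namely that the parser emits 'entry' and 'finish' events and that each entry has an autodrain() method.

```javascript
'use strict';

// Hypothetical helper (not part of the lib): discards every entry and
// records heap usage as the entry count grows.
// `parser` is assumed to be the stream returned by Parse(), emitting
// 'entry' for each file and 'finish' when extraction completes.
function drainAndSampleHeap(parser, sampleEvery = 10000) {
  return new Promise((resolve, reject) => {
    let entries = 0;
    const samples = [];

    parser.on('entry', (entry) => {
      // No user code touches the data; the entry is simply thrown away.
      entry.autodrain();
      entries += 1;

      // Take a heap reading every `sampleEvery` entries.
      if (entries % sampleEvery === 0) {
        samples.push({ entries, heapUsed: process.memoryUsage().heapUsed });
      }
    });

    parser.on('finish', () => resolve({ entries, samples }));
    parser.on('error', reject);
  });
}
```

If the samples show heapUsed climbing roughly in step with the entry count and then dropping after 'finish', that would match what the heap dump shows. Individual heapUsed readings are noisy because they depend on when the garbage collector last ran, so the overall trend across many samples matters more than any single value.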