Memory leak while reading large files #1857

@dimitribergerilg

Hello!

Description

When reading a large file (1.3 GB) with flow-php/parquet ^0.24.0, PHP aborts with a fatal error because the memory limit (512MB) is exhausted.

The same code works with flow-php/parquet version 0.7.4.


Steps to Reproduce

  1. Implement the following getReader function:
public function getReader(FileModel $fileModel): mixed
{
    $reader = new Reader();
    try {
        return $reader->read($fileModel->getTmpFileName());
    } catch (Exception $e) {
        throw new FileException(
            sprintf(
                'Can\'t access the temporary file %s %s %s',
                $fileModel->getTmpFileName(),
                $fileModel->getOriginalFileName(),
                $e->getMessage()
            )
        );
    }
}
  2. Try to iterate over the file content:
$fileResource = $this->getReader($fileModel);
foreach ($fileResource->values(["col1", "col2"]) as $row) {
    dump($row);
    exit;
}
  3. Run it against a 1.3 GB file.

Expected Behavior

The file should be read row by row without exceeding the PHP memory limit.


Actual Behavior

Execution fails after some time with a PHP fatal error (512MB memory limit exhausted) before entering the foreach loop.
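
To narrow down which call exhausts the memory, each step could be wrapped in a small measuring helper. The following is a minimal sketch that uses only PHP built-ins. `measureMemory` is a hypothetical helper name (it is not part of flow-php), and a plain generator stands in for the parquet reader so that the snippet runs on its own.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of flow-php): runs $step and logs how much
 * the allocated memory changed while it ran, plus the peak usage so far.
 */
function measureMemory(string $label, callable $step): mixed
{
    $before = memory_get_usage(true);
    $result = $step();
    $delta = memory_get_usage(true) - $before;

    // error_log() writes to stderr on the CLI unless configured otherwise.
    error_log(sprintf(
        '%s: %+.1f MB (peak %.1f MB)',
        $label,
        $delta / 1048576,
        memory_get_peak_usage(true) / 1048576
    ));

    return $result;
}

// Stand-in for the reader: a generator yields rows lazily, so creating it
// runs none of its body; the body only starts on the first iteration.
$rows = measureMemory('create iterator', function (): Generator {
    foreach ([1, 2, 3] as $i) {
        yield ['col1' => $i, 'col2' => $i * 2];
    }
});

// Fetching the first row is what actually starts the generator.
$first = measureMemory('fetch first row', fn (): array => $rows->current());
```

In the reported case, the steps to wrap would be the `$reader->read(...)` call, the `values(["col1", "col2"])` call, and the first iteration. A memory-limit fatal error aborts the script, so the last label that was logged marks the last step that completed.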


Additional Attempts

I also tried using readStream:

return $reader->readStream(
    NativeLocalSourceStream::open(
        new Path($fileModel->getTmpFileName())
    )
);

But this does not work with:

"flow-php/etl": "^0.24.0",
"flow-php/parquet": "^0.24.0"

It only works with:

"flow-php/parquet": "0.7.4"

Environment

  • PHP version: 8.4
  • OS: Debian Bookworm (Docker)
  • Memory limit: 512MB
  • flow-php/etl: ^0.24.0
  • flow-php/parquet: ^0.24.0
