This repository was archived by the owner on Dec 15, 2022. It is now read-only.

Does not handle searching for multiline regexes #5

@benogle

PathSearcher runs the regex on each line, not on the text as a whole. So something like [a-z]+\n[0-9]+ will not be matched.
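To make the limitation concrete, here is a minimal sketch of the difference between per-line and whole-text matching. The helpers `searchPerLine` and `searchWholeText` are hypothetical names made up for illustration; they are not PathSearcher's actual API, only an approximation of the behavior described above.

```javascript
// Hypothetical helper: run the pattern against each line separately,
// which is roughly what the issue says PathSearcher does.
function searchPerLine(text, pattern) {
  const matches = [];
  for (const line of text.split('\n')) {
    const re = new RegExp(pattern, 'g');
    let m;
    while ((m = re.exec(line)) !== null) {
      matches.push(m[0]);
      if (m[0] === '') re.lastIndex++; // avoid looping forever on empty matches
    }
  }
  return matches;
}

// Hypothetical helper: run the pattern against the text as a whole.
function searchWholeText(text, pattern) {
  return text.match(new RegExp(pattern, 'g')) || [];
}

const sample = 'abc\n123\nxyz';
const pattern = '[a-z]+\\n[0-9]+';

console.log(searchPerLine(sample, pattern));   // no line contains '\n', so nothing matches
console.log(searchWholeText(sample, pattern)); // finds 'abc\n123'
```

Because splitting on newlines removes the very character the pattern needs, the per-line search can never see a match that spans two lines.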

This was a design decision for efficiency. We only need to hold each line in memory, not the whole file. Right now the file reader reads only 10k at a time and searches the lines in each chunk.

I'm not sure how to handle multiline regexes efficiently. How would it handle a 100MB file? I'm opening this for discussion. As I build out the PathReplacer, it will have the same limitation.

One approach could be to:

  1. set buffer = ''
  2. read 10k chunk, append it onto buffer
  3. run the regex on buffer
  4. if match: buffer = buffer.slice(match.end)
  5. goto 2

But on files with no match, it will read the entire file into memory.
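The buffering approach could be sketched roughly as follows. This is only an illustration under stated assumptions: `searchChunks` is a hypothetical function, the chunks are passed in as an array of strings rather than read from a file, and the sketch adds one step the list above does not have (a match that touches the end of the buffer is held back until more data arrives, since it might continue into the next chunk). It also reports the peak buffer size so the memory concern is visible.

```javascript
// Hypothetical sketch of the buffered search; not the library's implementation.
function searchChunks(chunks, pattern) {
  const matches = [];
  let buffer = '';
  let peakBufferLength = 0;

  const scan = (isLast) => {
    const re = new RegExp(pattern, 'g');
    let consumed = 0;
    let m;
    while ((m = re.exec(buffer)) !== null) {
      const end = m.index + m[0].length;
      // Assumption added here: a match ending exactly at the buffer's end
      // may be incomplete, so wait for the next chunk before accepting it.
      if (!isLast && end === buffer.length) break;
      matches.push(m[0]);
      consumed = end;
      if (m[0] === '') re.lastIndex++; // avoid looping forever on empty matches
    }
    // Step 4: drop everything up to the end of the last accepted match.
    buffer = buffer.slice(consumed);
  };

  for (const chunk of chunks) {
    buffer += chunk; // step 2: append the chunk onto the buffer
    peakBufferLength = Math.max(peakBufferLength, buffer.length);
    scan(false);     // step 3: run the regex on the buffer
  }
  scan(true); // end of input: accept matches that touch the end

  return { matches, peakBufferLength };
}

const pattern = '[a-z]+\\n[0-9]+';
const withMatches = searchChunks(['xyz abc\n12', '34 def\n', '5 tail'], pattern);
const withoutMatches = searchChunks(['aaaa ', 'bbbb ', 'cccc '], pattern);

console.log(withMatches.matches);             // ['abc\n1234', 'def\n5']
console.log(withoutMatches.peakBufferLength); // 15, the full input length
```

In the second call nothing ever matches, so the buffer is never trimmed and grows to the size of the whole input, which is the problem noted above. Trimming the buffer also discards context, so patterns that rely on anchors or lookbehind may behave differently than they would on the full text.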
