Can ResumableParser return whether the last partial_value container is completed or not? #1041

Description

@kou

This is not a critical feature request.

I'm trying to migrate https://github.com/groonga/groonga-command-parser/ from the json-stream gem to the json gem, using JSON::ResumableParser.

groonga-command-parser parses records to be loaded into Groonga (a full text search engine with a column store). The records use the following format:

[
{"column_name1": "record1's column_value1", "column_name2": 2, ...},
{"column_name1": "record2's column_value1", "column_name2": 2, ...},
...
]

groonga-command-parser wants to process each record as soon as it is complete. For example, it wants to process the first record as soon as the following chunk has been parsed:

[
{"column_name1": "record1's column_value1", "column_name2": 2, ...},

We can do this with the following script:

require "json"

parser = JSON::ResumableParser.new
parser << <<-CHUNK
[
{"column1": "value1"},
CHUNK
parser.parse
p parser.partial_value[0] # => {"column1" => "value1"}

But this approach doesn't work when the chunk ends in the middle of a record:

[
{"column_name1": "record1's column_value1",

In that case, we can't tell whether the record is complete or not:

require "json"

parser = JSON::ResumableParser.new
parser << <<-CHUNK
[
{"column1": "value1",
CHUNK
parser.parse
p parser.partial_value[0] # => {"column1" => "value1"} # This is not completed yet

groonga-command-parser works around this by always ignoring the last record in partial_value, because it may still be incomplete:

require "json"

parser = JSON::ResumableParser.new
parser << <<-CHUNK
[
{"column1": "value1"},
CHUNK
parser.parse
p parser.partial_value[0..-2] # => []

parser << <<-CHUNK
{"column1": "value2"},
CHUNK
parser.parse
p parser.partial_value[0..-2] # => [{"column1" => "value1"}] # 1st record
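
The workaround above re-reads the completed records from the start on every call. A minimal sketch of how a caller could hand each record over only once is below. `CompletedRecordTracker` and its `finished:` keyword are hypothetical names used for illustration and are not part of the json gem. The sketch only assumes that `partial_value` is an Array whose last element may still be incomplete, as in the examples above:

```ruby
# Hypothetical helper (not part of the json gem): remembers how many
# records were already handed out and returns only the new ones.
class CompletedRecordTracker
  def initialize
    @emitted = 0 # number of records already returned to the caller
  end

  # partial: the Array taken from parser.partial_value (assumed shape).
  # finished: pass true once the whole document has been parsed, so the
  #           last record is no longer held back.
  def newly_completed(partial, finished: false)
    return [] unless partial.is_a?(Array)

    # Until the document is finished, the last element may be incomplete,
    # so only elements before it are treated as completed.
    limit = finished ? partial.size : [partial.size - 1, 0].max
    return [] if limit <= @emitted

    records = partial[@emitted...limit]
    @emitted = limit
    records
  end
end
```

With this helper, the caller would pass `parser.partial_value` after each `parser.parse` and process whatever comes back. The last record is still delayed until the next record starts or the document ends, so this does not remove the need for the requested feature.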

It would be useful if we could detect whether the last record in this example is complete or not. But this is not a critical feature request because there is a workaround for this case.