This repository was archived by the owner on May 19, 2022. It is now read-only.

Some thoughts on standard #35

@StreetStrider

Hello. You have quite a nice library. I personally like approaches where you limit the complexity of the library and/or your formats to keep things simple yet still useful. So, big respect to you for the robust KISS approach. I have been thinking about simple greppable formats as well, and I want to share some feedback; maybe it will be useful for you.

The two most important things to keep in mind when designing a greppable format are: 1. simplify context, i.e. make your grammar less contextual; 2. use symbols which do not have special meaning in PCRE.

Tags and fields

From this perspective, the current tag format, [a|b|c], has some downsides. It is contextual: the first tag is preceded by an opening bracket, while the remaining tags are preceded by a different symbol (|). Both [ and | are special symbols in PCRE.

If I were designing this, I would do it like this:

  1. Collect all tags (a tag must not contain a space or a # character).
  2. Sort them.
  3. Prepend each with #.
  4. Join them with a space.
  5. Append a trailing space.

(like #a #b #c, with a trailing space)
Now it is quite easy to filter by a single tag: use #a\b, or #a followed by a space, since every tag ends with a space. It is also easy to filter by a subset. Say you want to filter by c and b: sort them and build a regex like #b .*#c (again with a trailing space).
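
The tag steps above can be sketched roughly as follows. This is only an illustration in Python, not this library's API: the helper names format_tags and tags_regex are made up for the example, and Python's re module stands in for PCRE (the patterns used here behave the same in both).

```python
import re

def format_tags(tags):
    """Render tags as '#a #b #c ': sorted, '#'-prefixed, each followed by a space."""
    for tag in tags:
        if " " in tag or "#" in tag:
            raise ValueError(f"tag must not contain a space or '#': {tag!r}")
    return "".join(f"#{tag} " for tag in sorted(tags))

def tags_regex(wanted):
    """Build a pattern matching lines that carry every tag in `wanted`.

    Sorting the wanted tags the same way the writer does lets a plain
    left-to-right '.*' join work, with no lookaheads needed.
    """
    return ".*".join(f"#{re.escape(tag)} " for tag in sorted(wanted))

# Hypothetical log line: tag prefix followed by a free-form message.
line = format_tags(["c", "a", "b"]) + "server started"

print(repr(tags_regex(["c", "b"])))               # '#b .*#c '
print(bool(re.search(tags_regex(["c", "b"]), line)))  # True
print(bool(re.search(tags_regex(["d"]), line)))       # False
```

The trailing space after every tag is what keeps #a from matching a longer tag such as #ab.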

The same applies to fields:

  1. Collect all fields (no space or @ in it), join key and value with a space.
  2. Sort by key (kinda, see JSON below).
  3. Prepend each with @.
  4. Join them with a space.
  5. Append a trailing space.

Searching for a field would be similar. Subsetting is also possible.
A value can be extracted with @b ([^ ]+)
If a value contains @ or a space, then I think it may be URL-encoded (simple in Node/Web, and alphanumeric values stay readable).
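
A rough sketch of the field scheme, under the same caveats as before: format_fields and extract_field are hypothetical names, Python is used only for illustration, and percent-encoding via urllib.parse.quote is one possible reading of "urlencoded".

```python
import re
from urllib.parse import quote, unquote

def format_fields(fields):
    """Render fields as '@key value ' pairs, sorted by key."""
    parts = []
    for key in sorted(fields):
        if " " in key or "@" in key:
            raise ValueError(f"key must not contain a space or '@': {key!r}")
        # safe="" percent-encodes '@', spaces and '%'; alphanumerics stay readable.
        value = quote(str(fields[key]), safe="")
        parts.append(f"@{key} {value} ")
    return "".join(parts)

def extract_field(line, key):
    """Return the decoded value of `key`, or None if the field is absent."""
    match = re.search(f"@{re.escape(key)} ([^ ]+)", line)
    return unquote(match.group(1)) if match else None

line = format_fields({"user": "bob@example.com", "id": 42, "msg": "hello world"})

print(repr(line))                    # '@id 42 @msg hello%20world @user bob%40example.com '
print(extract_field(line, "msg"))    # hello world
print(extract_field(line, "user"))   # bob@example.com
```

Because @ and spaces never appear unencoded inside a value, the pattern @key ([^ ]+) cannot be confused by the content of another field.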

Timestamps

I kinda agree that timestamps clutter the output. In most cases the order of logs is crucial for debugging, while timestamps are not. If they are still beneficial, I think they can be placed at the end of the line (not cluttering it too much, but still easily greppable).

JSON

I have no good ideas for JSON, but I think there could be two restrictions, plus an optional third:

  1. Allow only one JSON field, forbid multiple.
  2. Always put this single JSON field last (ignoring sorting). This is a compromise.
  3. Maybe give it a fixed name? (So grepping would be like @json (.*)$).
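
To make the JSON compromise concrete, here is a minimal sketch, again in Python and with made-up helper names (format_line, extract_json). It assumes plain field values contain no space or @, and that the JSON field is the very last thing on the line.

```python
import json
import re

def format_line(fields, payload=None):
    """Sorted plain fields first, then a single '@json' field at the very end."""
    head = "".join(f"@{key} {fields[key]} " for key in sorted(fields))
    if payload is None:
        return head
    # The JSON text may contain spaces and '@', which is why it must come last.
    return head + "@json " + json.dumps(payload)

def extract_json(line):
    """Parse everything after the first '@json ' up to the end of the line."""
    match = re.search(r"@json (.*)$", line)
    return json.loads(match.group(1)) if match else None

line = format_line({"b": "2", "a": "1"}, {"user": {"name": "Ann Lee"}})

print(line)                # @a 1 @b 2 @json {"user": {"name": "Ann Lee"}}
print(extract_json(line))  # {'user': {'name': 'Ann Lee'}}
```

Allowing only one such field is what makes the greedy (.*)$ capture safe: there is nothing after it that could be swallowed by mistake.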
