Hello. You have quite a nice library. I personally like approaches that limit the complexity of a library and/or its formats to keep things simple yet still useful, so big respect to you for the robust KISS approach. I have been thinking about simple greppable formats as well and want to share some feedback; maybe it will be useful for you.
The two most important things to keep in mind when designing a greppable format are: 1. simplify context, i.e. make your grammar less contextual; 2. use symbols that do not have a special meaning in PCRE.
Tags and fields
From this perspective, the current tag format, [a|b|c], has some downsides. It has context (the opening bracket), and the tags in the tail have a leading symbol that differs from the leading symbol of the first tag. Both [ and | are special symbols in PCRE.
If I were designing this, I would do it like so:
- Collect all tags (a tag must not contain a space or a # character).
- Sort them.
- Prepend each with #.
- Join them with a space.
- Append a trailing space.
(like "#a #b #c ", including the trailing space)
Now it is quite easy to filter by a single tag (#a\b, or "#a " with a trailing space, which works because every tag is followed by a space). It is also easy to filter by a subset. Say you want to filter by c and b: sort them and build a regex like "#b .*#c " (again with the trailing space).
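To make the tag proposal concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the library's code). The function names format_tags and subset_pattern are hypothetical, and removing duplicate tags is my own assumption.

```python
import re


def format_tags(tags):
    """Render tags as '#a #b #c ': sorted, '#'-prefixed, each followed by a space."""
    for tag in tags:
        if " " in tag or "#" in tag:
            raise ValueError(f"tag must not contain a space or '#': {tag!r}")
    # set() drops duplicates (an assumption, not part of the proposal above).
    return "".join(f"#{tag} " for tag in sorted(set(tags)))


def subset_pattern(tags):
    """Build a regex that matches lines carrying every tag in `tags`."""
    # Sorting mirrors the order used by format_tags, so a single
    # left-to-right match is enough. re.escape protects tags that
    # happen to contain regex metacharacters such as '.'.
    return ".*".join(f"#{re.escape(tag)} " for tag in sorted(set(tags)))


line = format_tags(["c", "a", "b"]) + "server started"
has_a = re.search("#a ", line) is not None
has_b_and_c = re.search(subset_pattern(["c", "b"]), line) is not None
```

Because every tag ends with a space, searching for "#a " cannot accidentally match a longer tag such as #ab.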
The same applies to fields:
- Collect all fields (no space or @ in them), join key and value with a space.
- Sort by key (kinda, see JSON below).
- Prepend each with @.
- Join them with a space.
- Append a trailing space.
Searching for a field would be similar. Subsetting is also possible.
Extracting a value can be done with @b ([^ ]+)
If a value contains @ or a space, then I think it could be URL-encoded (simple in Node/Web, and alphanumeric values stay readable).
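A rough sketch of the field side, again in Python purely for illustration. The names encode, format_fields and extract are hypothetical, and percent-encoding keys as well as values is my assumption; urllib.parse.quote stands in for whatever URL-encoding helper the runtime offers.

```python
import re
from urllib.parse import quote, unquote


def encode(text):
    # safe="" percent-encodes everything except letters, digits and "_.-~",
    # so a space or '@' can never appear in the encoded output.
    return quote(str(text), safe="")


def format_fields(fields):
    """Render a dict as '@key value @key value ', sorted by key."""
    return "".join(
        f"@{encode(key)} {encode(value)} " for key, value in sorted(fields.items())
    )


def extract(line, key):
    """Return the decoded value of `key`, or None if the field is absent."""
    match = re.search(f"@{re.escape(encode(key))} ([^ ]+)", line)
    return unquote(match.group(1)) if match else None


line = format_fields({"user": "a@b.c", "msg": "hello world"})
user = extract(line, "user")
```

Plain alphanumeric values pass through unchanged, so the common case stays readable; only values with a space, @ or other special characters get percent-encoded.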
Timestamps
I kinda agree that a timestamp clutters the output. In most cases the order of the logs is crucial for debugging, while the timestamps are not. If they are still beneficial, I think they can be placed at the end of the line (not cluttering too much, but still easily greppable).
JSON
I have no good ideas for JSON, but I think there could be a few restrictions:
- Allow only one JSON field, forbid multiple.
- Always put this single JSON field last (ignoring sorting); this is a compromise.
- Maybe give it a fixed name? (so grepping would be like @json (.*)$).
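As a sketch of how the single trailing JSON field might look, assuming the fixed name json suggested above (format_json_field and extract_json are hypothetical helper names):

```python
import json
import re


def format_json_field(payload):
    """Render the single JSON field as '@json {...}', meant to end the line."""
    # Compact separators keep the line short. json.dumps escapes newlines
    # inside strings, so the result always stays on one line.
    return "@json " + json.dumps(payload, separators=(",", ":"))


def extract_json(line):
    """Parse the trailing JSON field, or return None if there is none."""
    match = re.search(r"@json (.*)$", line)
    return json.loads(match.group(1)) if match else None


line = "#a @user bob " + format_json_field({"id": 1, "tags": ["x y"]})
payload = extract_json(line)
```

Since the JSON field is last, it may contain spaces and @ without URL-encoding: everything from "@json " to the end of the line belongs to it.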