Commit fb6ad77

Merge pull request #13 from vectordotdev/feat/split-json-by-year
Split issues JSON into year files, build-db supports directories
2 parents 6631330 + f4d3226 commit fb6ad77

41 files changed

Lines changed: 673417 additions & 673311 deletions

README.md

Lines changed: 2 additions & 8 deletions
@@ -11,6 +11,7 @@ Tools for extracting data from GitHub, storing it in a local SQLite database, qu
 src/ # Rust source (single binary: github-tools)
 scripts/util/ # Python: plot.py (charts), json_to_csv.py (utility)
 data/ # Committed snapshots: JSON inputs and PNG charts
+  {owner}_{repo}/issues/ # Issues/PRs JSON split by year (2024.json, 2025.json, ...)
   images/ # Committed PNG charts (promoted from out/images/)
 out/ # Gitignored — all generated and local-only files
   historical/ # Raw JSON fetched from GitHub API
@@ -77,14 +78,7 @@ Run `github-tools <COMMAND> --help` for full argument details.
 github-tools fetch-all --env-file vector.env --env-file vrl.env
 ```
 
-Writes to `out/historical/`. Promote to `data/` to commit as a snapshot:
-
-```shell
-cp out/historical/issues/vectordotdev_vector_issues.json data/
-cp out/historical/issues/vectordotdev_vrl_issues.json data/
-cp out/historical/discussions/vectordotdev_vector_discussions.json data/
-cp out/historical/discussions/vectordotdev_vrl_discussions.json data/
-```
+Writes to `out/historical/`. The fetched JSON must be split by year and promoted to `data/` to commit as a snapshot. Issues/PRs are stored in `data/{owner}_{repo}/issues/{year}.json`.
 
 ## 2. Generate DB, summaries, and charts
 
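The updated README says the fetched JSON must be split by year before it is promoted to `data/`, but the visible part of this commit does not show how. A minimal Python sketch of one way to do it follows. It assumes each record carries an ISO 8601 `createdAt` field, as the discussion records in this commit do; the helper names `split_by_year`, `write_year_files`, and `load_issues` are hypothetical and are not part of `github-tools`.

```python
import json
from collections import defaultdict
from pathlib import Path


def split_by_year(items, key="createdAt"):
    """Group records by the four-digit year that starts an ISO 8601 timestamp.

    Assumption: every record has `key` set to a string such as
    "2020-12-10T02:52:48Z".
    """
    by_year = defaultdict(list)
    for item in items:
        by_year[item[key][:4]].append(item)
    return dict(by_year)


def write_year_files(items, out_dir):
    """Write one {year}.json per year under out_dir and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for year, group in sorted(split_by_year(items).items()):
        path = out_dir / f"{year}.json"
        path.write_text(json.dumps(group, indent=2) + "\n", encoding="utf-8")
        paths.append(path)
    return paths


def load_issues(path):
    """Load records from a single JSON file or from a directory of year files."""
    path = Path(path)
    files = sorted(path.glob("*.json")) if path.is_dir() else [path]
    items = []
    for file in files:
        items.extend(json.loads(file.read_text(encoding="utf-8")))
    return items
```

Under these assumptions, `write_year_files(load_issues("out/historical/issues/vectordotdev_vector_issues.json"), "data/vectordotdev_vector/issues")` would produce the `{year}.json` layout the README describes, and `load_issues` on that directory would read the records back in year order.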
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
+[
+  {
+    "number": 5472,
+    "title": "Welcome to Vector Discussions!",
+    "bodyText": "Welcome to the Vector community\nWelcome to Vector's Github Discussions! We are trying Discussions to help organize support for Vector. Please use the following guidelines to get the most out of the Vector community:\nCommunity is very important to Vector, we value feedback and take pride in Vector's user experience. Engaging with users is an important feedback loop for the project; please feel free to share everything (good and bad). Thank you!\nEdited: See Vector support policy in: https://vector.dev/community/",
+    "url": "https://github.com/vectordotdev/vector/discussions/5472",
+    "createdAt": "2020-12-10T02:52:48Z",
+    "updatedAt": "2026-01-26T17:50:04Z",
+    "closedAt": null,
+    "closed": false,
+    "stateReason": null,
+    "isAnswered": null,
+    "locked": false,
+    "author": {
+      "login": "binarylogic"
+    },
+    "category": {
+      "name": "Announcements"
+    },
+    "comments": {
+      "totalCount": 0
+    },
+    "upvoteCount": 1
+  },
+  {
+    "number": 13834,
+    "title": "Split logs by docker container label",
+    "bodyText": "There are several running docker containers with labels.\nAll logs records go through common transforms and then I need to split stream based on label.\nBut I can't find right condition for swimlanes transform\nvector.toml\n[sources.in]\n type = \"docker\"\n include_labels = [\"com.atuko.log_shipping=junk\", \"com.atuko.log_shipping=elastic\"]\n\n[transforms.json]\n type = \"json_parser\"\n inputs = [\"in\"]\n drop_invalid = true\n\n[transforms.remove_tokens]\n type = \"lua\"\n inputs = [\"json\"]\n version = \"2\"\n\n hooks.process = \"\"\"\n function (event, emit)\n if event.log.request ~= nil then\n r = event.log.request:gsub('X%-System%-Token:.*', '')\n r = r:gsub('Authorization:.*', '')\n event.log.request = r\n end\n\n emit(event)\n end\n \"\"\"\n\n[transforms.split]\n type = \"swimlanes\"\n inputs = [\"remove_tokens\"]\n\n # Lanes\n [transforms.split.lanes.files]\n type = \"check_fields\"\n \"com.atuko.log_shipping.eq\" = \"junk\"\n\n [transforms.split.lanes.elastic]\n type = \"check_fields\"\n \"com.atuko.log_shipping.eq\" = \"elastic\"\n\n[sinks.http]\n type = \"http\"\n inputs = [\"split.files\"]\n uri = \"10.1.1.1:80\"\n encoding.codec = \"ndjson\"\n\n[sinks.elastic]\n type = \"elasticsearch\"\n inputs = [\"split.elastic\"]\n host = \"10.1.1.1:9000\"\n index = \"vector-%F\"\n\nvector.log\nJul 23 07:52:51.315 INFO vector: Vector is starting. version=\"0.10.0\" git_version=\"v0.9.0-266-g5e5d806\" released=\"Thu, 25 Jun 2020 14:43:00 +0000\" arch=\"x86_64\"\nJul 23 07:52:51.346 TRACE source{name=in type=docker}: vector::sources::docker: Found already running container id=2e6e4e80c1566e3c526bab2f2dabb60d2e539b4a165ea94e4ce4c030b0760057 names=[\"/target-actions\"]\nJul 23 07:52:51.348 INFO vector::sources::docker: Started listening logs on docker container id=2e6e4e80c1566e3c526bab2f2dabb60d2e539b4a165ea94e4ce4c030b0760057\nJul 23 07:53:08.330 TRACE vector::sources::docker: Received one event. event=Log(LogEvent { fields: {\"container_created_at\": Timestamp(2020-07-23T07:49:25.591113769Z), \"container_id\": Bytes(b\"2e6e4e80c1566e3c526bab2f2dabb60d2e539b4a165ea94e4ce4c030b0760057\"), \"container_name\": Bytes(b\"target-actions\"), \"image\": Bytes(b\"target-actions:debug\"), \"label\": Map({\"com\": Map({\"atuko\": Map({\"log_shipping\": Bytes(b\"junk\")})}), \"logs_enabled\": Bytes(b\"true\")}), \"message\": Bytes(b\"{\\\"level\\\":\\\"info\\\",\\\"host\\\":\\\"d4\\\",\\\"service\\\":\\\"target-actions\\\",\\\"event_type\\\":\\\"debug\\\",\\\"response\\\":\\\"HTTP/1.1 204 No Content\\\",\\\"response_code\\\":204,\\\"time\\\":\\\"2020-07-23T10:53:08+03:00\\\",\\\"message\\\":\\\"ok\\\"}\"), \"source_type\": Bytes(b\"docker\"), \"stream\": Bytes(b\"stderr\"), \"timestamp\": Timestamp(2020-07-23T07:53:08.325606575Z)} })",
+    "url": "https://github.com/vectordotdev/vector/discussions/13834",
+    "createdAt": "2020-07-23T08:20:01Z",
+    "updatedAt": "2022-08-16T11:13:33Z",
+    "closedAt": null,
+    "closed": false,
+    "stateReason": null,
+    "isAnswered": true,
+    "locked": false,
+    "author": {
+      "login": "andrew4fr"
+    },
+    "category": {
+      "name": "Q&A"
+    },
+    "comments": {
+      "totalCount": 2
+    },
+    "upvoteCount": 1
+  }
+]
