feat: Export events to JSON Lines #451
Changes from 3 commits
a8cffb6
1d9b9a8
8ed6a89
43f2eb1
9281d14
b011a66
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,99 @@ | ||||||||
| import 'pg-query-stream' | ||||||||
| import dotenv from 'dotenv' | ||||||||
| dotenv.config() | ||||||||
|
|
||||||||
| import fs from 'fs' | ||||||||
| import knex from 'knex' | ||||||||
| import path from 'path' | ||||||||
| import { pipeline } from 'stream/promises' | ||||||||
| import { Transform } from 'stream' | ||||||||
|
|
||||||||
| const getDbConfig = () => ({ | ||||||||
| client: 'pg', | ||||||||
| connection: process.env.DB_URI || { | ||||||||
| host: process.env.DB_HOST ?? 'localhost', | ||||||||
| port: Number(process.env.DB_PORT ?? 5432), | ||||||||
| user: process.env.DB_USER ?? 'postgres', | ||||||||
| password: process.env.DB_PASSWORD ?? 'postgres', | ||||||||
| database: process.env.DB_NAME ?? 'nostream', | ||||||||
| }, | ||||||||
| }) | ||||||||
|
|
||||||||
| async function exportEvents(): Promise<void> { | ||||||||
| const filename = process.argv[2] || 'events.jsonl' | ||||||||
| const outputPath = path.resolve(filename) | ||||||||
| const db = knex(getDbConfig()) | ||||||||
|
|
||||||||
|
Comment on lines +22 to +43
|
||||||||
| try { | ||||||||
| const [{ count }] = await db('events') | ||||||||
| .whereNull('deleted_at') | ||||||||
| .count('* as count') | ||||||||
| const total = Number(count) | ||||||||
|
|
||||||||
| if (total === 0) { | ||||||||
| console.log('No events to export.') | ||||||||
| return | ||||||||
| } | ||||||||
|
|
||||||||
| console.log(`Exporting ${total} events to ${outputPath}`) | ||||||||
|
|
||||||||
| const output = fs.createWriteStream(outputPath) | ||||||||
| let exported = 0 | ||||||||
|
|
||||||||
| const trx = await db.transaction(null, { isolationLevel: 'repeatable read' }) | ||||||||
|
||||||||
| const trx = await db.transaction(null, { isolationLevel: 'repeatable read' }) | |
| const trx = await db.transaction(null, { isolationLevel: 'read committed' }) |
Copilot
AI
Apr 18, 2026
orderBy('event_created_at', 'asc') does not guarantee deterministic ordering when multiple events share the same event_created_at value, so repeated exports can legitimately produce different line orders. If stable output is desired, add a secondary tie-breaker (e.g. event_id or the PK id) to the ORDER BY.
| .orderBy('event_created_at', 'asc') | |
| .orderBy('event_created_at', 'asc') | |
| .orderBy('event_id', 'asc') |
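To illustrate why the tie-breaker matters, here is a minimal in-memory sketch of the same two-key ordering. The `EventRow` shape and the `compareEvents` and `sortEvents` helpers are hypothetical and not part of this PR; the sketch also assumes `event_id` can be compared as a plain string, which may not match the actual column type.

```typescript
// Hypothetical row shape for illustration; the real events table has more columns.
interface EventRow {
  event_id: string
  event_created_at: number
}

// Order by event_created_at first, then by event_id to break ties,
// mirroring ORDER BY event_created_at ASC, event_id ASC.
function compareEvents(a: EventRow, b: EventRow): number {
  if (a.event_created_at !== b.event_created_at) {
    return a.event_created_at - b.event_created_at
  }
  if (a.event_id < b.event_id) return -1
  if (a.event_id > b.event_id) return 1
  return 0
}

// Returns a sorted copy so the caller's array is left untouched.
function sortEvents(rows: EventRow[]): EventRow[] {
  return [...rows].sort(compareEvents)
}
```

With the secondary key in place, any permutation of the same rows sorts to the same sequence, so repeated exports would emit lines in the same order as long as `event_id` is unique.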
getDbConfig() duplicates the repo’s Knex configuration logic and doesn’t honor several DB_* settings the relay supports (e.g. pool sizing / acquire timeout). It also introduces default host/user/password/db values, which can make the script silently export from an unexpected database when env vars are missing. Consider reusing src/database/client.ts (or factoring out a shared config helper) so the export command uses the same connection behavior as the relay and fails fast when required env vars aren’t set.
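One possible shape for the fail-fast behavior is sketched below. This is not the contents of src/database/client.ts; the `requireEnv` and `getConnection` helpers and the `DbConnection` type are hypothetical, and treating all five `DB_*` variables as required when `DB_URI` is absent is an assumption made for illustration.

```typescript
type Env = Record<string, string | undefined>

// Hypothetical connection shape, matching the fields used in the diff above.
interface DbConnection {
  host: string
  port: number
  user: string
  password: string
  database: string
}

// Throws instead of falling back to a default when a variable is unset or empty.
function requireEnv(env: Env, name: string): string {
  const value = env[name]
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Assumption: DB_URI takes precedence; otherwise every DB_* value must be set.
function getConnection(env: Env): string | DbConnection {
  const uri = env['DB_URI']
  if (uri) {
    return uri
  }
  const port = Number(requireEnv(env, 'DB_PORT'))
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error('DB_PORT must be a positive integer')
  }
  return {
    host: requireEnv(env, 'DB_HOST'),
    port,
    user: requireEnv(env, 'DB_USER'),
    password: requireEnv(env, 'DB_PASSWORD'),
    database: requireEnv(env, 'DB_NAME'),
  }
}
```

The script could then call `getConnection(process.env)` and pass the result as the Knex `connection` value, so a missing variable aborts the export before any query runs instead of silently pointing at a local default database.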