Commit 310b9f7

Updated README and some cleanup
1 parent 37f4a3a commit 310b9f7

2 files changed

Lines changed: 137 additions & 54 deletions

README.md

Lines changed: 72 additions & 11 deletions
@@ -13,6 +13,12 @@ This tool simplifies the process of exporting workload data for analysis, includ
 5. Zone configurations
 6. System settings
 
+## Prerequisites
+
+- CockroachDB cluster (v21.1 or later recommended)
+- Network access to the CockroachDB cluster
+- User with appropriate read permissions on system tables
+
 ## Installation
 
 ### From Binary Releases
@@ -23,7 +29,7 @@ Download the appropriate binary for your platform from the [releases page](https
 
 ```bash
 # Clone the repository
-git clone https://github.com/yourusername/workload-exporter.git
+git clone https://github.com/cockroachlabs/workload-exporter.git
 cd workload-exporter
 
 # Build the binary
@@ -46,8 +52,9 @@ workload-exporter export \
 
 - `--connection-url`, `-c`: Connection string for CockroachDB (required)
 - `--output-file`, `-o`: Output zip file name (default: "workload-export.zip")
-- `--start`, `-s`: start time (default: current time - 6 hours)
-- `--end`, `-e`: End time (default: current time + 1 hour)
+- `--start`, `-s`: Start time in RFC3339 format (default: current time - 6 hours)
+- `--end`, `-e`: End time in RFC3339 format (default: current time + 1 hour)
+- `--debug`: Enable debug logging output
 
 ## Examples
 
@@ -65,27 +72,58 @@ workload-exporter export -c "postgresql://user:password@source-host:26257/?sslmo
 Export for a specific time period:
 
 ```bash
-# Export a specific time window
-workload-exporter export -c "postgresql://user:password@host:26257/?sslmode=verify-full"
--s '2025-04-18T13:25:00Z'
--e '2025-04-18T20:25:00Z'
+# Export a specific time window (times must be in RFC3339 format)
+workload-exporter export \
+-c "postgresql://user:password@host:26257/?sslmode=verify-full" \
+-s "2025-04-18T13:25:00Z" \
+-e "2025-04-18T20:25:00Z"
 ```
 
 ### Custom output file
 
 Export using a custom file:
 
 ```bash
-workload-exporter export -c "postgresql://user:password@host:26257/?sslmode=verify-full"
--o 'my-export.zip'
+workload-exporter export \
+-c "postgresql://user:password@host:26257/?sslmode=verify-full" \
+-o "my-export.zip"
+```
+
+### Enable debug logging
+
+Export with verbose debug output:
+
+```bash
+workload-exporter export \
+-c "postgresql://user:password@host:26257/?sslmode=verify-full" \
+--debug
 ```
 
 ## File Format
 
 The export zip file contains:
 
-- `metadata.json`: Information about the export, including databases, tables, and configuration
-- One file per exported table in the format `[database].[table]`
+### Metadata
+- `metadata.json`: Export metadata including:
+  - Cluster version, ID, name, and organization
+  - SQL statistics aggregation and flush intervals
+  - Export configuration (connection string with password redacted, time range, output file)
+  - Timestamp of export
+
+### Statistics Data (CSV format with headers)
+- `crdb_internal.statement_statistics.csv`: Statement execution statistics
+- `crdb_internal.transaction_statistics.csv`: Transaction execution statistics
+- `crdb_internal.transaction_contention_events.csv`: Lock contention events
+- `crdb_internal.gossip_nodes.csv`: Cluster node information
+
+**Note:** Statistics tables are filtered by the specified time range using their timestamp columns.
+
+### Database Schemas
+- `[database_name].schema.txt`: CREATE statements for all tables in each user database
+- One file per database (excludes system databases: `system`, `crdb_internal`, `postgres`)
+
+### Configuration
+- `zone_configurations.txt`: All zone configuration SQL statements from the cluster
 
 ## Building from Source
 
@@ -100,6 +138,29 @@ go mod tidy
 go build -o workload-exporter
 ```
 
+## Troubleshooting
+
+### Connection Issues
+Ensure your connection string includes the proper SSL mode and authentication credentials:
+```bash
+postgresql://user:password@host:26257/database?sslmode=verify-full
+```
+
+### Time Format Errors
+Start and end times must be in RFC3339 format:
+- Correct: `2025-04-18T13:25:00Z`
+- Correct: `2025-04-18T13:25:00-05:00`
+- Incorrect: `2025-04-18 13:25:00`
+
+### Permission Errors
+The database user must have read access to:
+- `crdb_internal` tables
+- System settings (for cluster metadata)
+- All user databases (for schema export)
+
+### Empty Exports
+If the time range doesn't contain any data, the CSV files will only contain headers. Adjust your `--start` and `--end` flags to capture the desired time period.
+
 ## License
 
 [MIT License](LICENSE)

pkg/export/exporter.go

Lines changed: 65 additions & 43 deletions
@@ -42,6 +42,9 @@ type Metadata struct {
 	Timestamp                   time.Time     `json:"timestamp"`
 	ExportConfig                Config        `json:"export_config"`
 	ClusterVersion              string        `json:"cluster_version"`
+	ClusterId                   string        `json:"cluster_id"`
+	ClusterName                 string        `json:"cluster_name"`
+	Organization                string        `json:"organization"`
 	SqlStatsAggregationInterval time.Duration `json:"sql.stats.aggregation.interval"`
 	SqlStatsFlushInterval       time.Duration `json:"sql.stats.flush.interval"`
 }
@@ -63,7 +66,7 @@ func NewExporter(config Config) (*Exporter, error) {
 	ctx := context.Background()
 	cleanConnStr, err := cleanConnectionString(config.ConnectionString)
 	if err != nil {
-		return nil, fmt.Errorf("failed to clean connection string %w", err)
+		return nil, fmt.Errorf("failed to clean connection string: %w", err)
 	}
 
 	logrus.Infof("connecting to cluster at '%s'", cleanConnStr)
@@ -89,7 +92,7 @@ func (exporter *Exporter) Export() error {
 	defer func(path string) {
 		err := os.RemoveAll(path)
 		if err != nil {
-			logrus.Debugf("failed to remove temp directory: %w", err)
+			logrus.WithError(err).Debug("failed to remove temp directory")
 		}
 	}(tempDir)
 
@@ -99,6 +102,21 @@ func (exporter *Exporter) Export() error {
 		return fmt.Errorf("failed to get cluster version: %w", err)
 	}
 
+	clusterId, err := exporter.clusterId()
+	if err != nil {
+		return fmt.Errorf("failed to get cluster id: %w", err)
+	}
+
+	clusterName, err := exporter.clusterName()
+	if err != nil {
+		return fmt.Errorf("failed to get cluster name: %w", err)
+	}
+
+	organization, err := exporter.organization()
+	if err != nil {
+		return fmt.Errorf("failed to get organization: %w", err)
+	}
+
 	agg, err := exporter.sqlStatsAggregationInterval()
 	if err != nil {
 		return fmt.Errorf("failed to get aggregation interval: %w", err)
@@ -118,6 +136,9 @@ func (exporter *Exporter) Export() error {
 			TimeRange: exporter.Config.TimeRange,
 		},
 		ClusterVersion:              clusterVersion,
+		ClusterId:                   clusterId,
+		ClusterName:                 clusterName,
+		Organization:                organization,
 		SqlStatsAggregationInterval: agg,
 		SqlStatsFlushInterval:       flush,
 	}
@@ -181,6 +202,30 @@ func (exporter *Exporter) clusterVersion() (string, error) {
 
 }
 
+func (exporter *Exporter) clusterId() (string, error) {
+	r := exporter.Db.QueryRow(context.Background(), "SELECT crdb_internal.cluster_id()")
+	var clusterId string
+	err := r.Scan(&clusterId)
+	return clusterId, err
+
+}
+
+func (exporter *Exporter) clusterName() (string, error) {
+	r := exporter.Db.QueryRow(context.Background(), "SELECT crdb_internal.cluster_name()")
+	var name string
+	err := r.Scan(&name)
+	return name, err
+
+}
+
+func (exporter *Exporter) organization() (string, error) {
+	r := exporter.Db.QueryRow(context.Background(), "SHOW CLUSTER SETTING cluster.organization")
+	var organization string
+	err := r.Scan(&organization)
+	return organization, err
+
+}
+
 // sql.stats.aggregation.interval
 // sql.stats.flush.interval
 func (exporter *Exporter) sqlStatsAggregationInterval() (time.Duration, error) {
@@ -211,23 +256,12 @@ func (exporter *Exporter) exportAllZoneConfigurations(ctx context.Context, tempD
 
 	dataFile := filepath.Join(tempDir, "zone_configurations.txt")
 
-	// Create output file
-	file, err := os.Create(dataFile)
-	if err != nil {
-		return err
-	}
-	defer func(file *os.File) {
-		err := file.Close()
-		if err != nil {
-			logrus.Errorf("failed to close file: %s", err)
-		}
-	}(file)
-
 	rows, err := exporter.Db.Query(ctx, "with z AS (SHOW ALL ZONE CONFIGURATIONS) SELECT raw_config_sql FROM z WHERE raw_config_sql IS NOT NULL")
 
 	if err != nil {
 		return fmt.Errorf("failed to query z configurations: %w", err)
 	}
+	defer rows.Close()
 
 	var configs []string
 	for rows.Next() {
@@ -252,18 +286,6 @@ func (exporter *Exporter) exportCreateStatements(ctx context.Context, db string,
 	filename := fmt.Sprintf("%s.schema.txt", db)
 	dataFile := filepath.Join(tempDir, filename)
 
-	// Create output file
-	file, err := os.Create(dataFile)
-	if err != nil {
-		return err
-	}
-	defer func(file *os.File) {
-		err := file.Close()
-		if err != nil {
-			logrus.Errorf("failed to close file: %s", err)
-		}
-	}(file)
-
 	creates, err := exporter.createStatements(db)
 	if err != nil {
 		return err
@@ -281,7 +303,7 @@ func (exporter *Exporter) createStatements(db string) ([]string, error) {
 
 	var creates []string
 
-	_, err := exporter.Db.Exec(context.Background(), fmt.Sprintf("USE \"%s\"", db))
+	_, err := exporter.Db.Exec(context.Background(), fmt.Sprintf("USE %s", pgx.Identifier{db}.Sanitize()))
 	if err != nil {
 		return creates, err
 	}
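
Review note on the switch from `"USE \"%s\""` to `pgx.Identifier{db}.Sanitize()`: the old form breaks as soon as a database name contains a double quote. The sketch below approximates the quoting rule with the standard library only, so it runs without the pgx module. `quoteIdentifier` is a hypothetical helper written for this note, not the pgx implementation, and the names passed to it are examples.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdentifier is a hypothetical, stdlib-only approximation of SQL
// identifier quoting: each part is wrapped in double quotes, embedded double
// quotes are doubled, NUL bytes are dropped, and parts are joined with dots.
func quoteIdentifier(parts ...string) string {
	quoted := make([]string, len(parts))
	for i, p := range parts {
		p = strings.ReplaceAll(p, "\x00", "")
		quoted[i] = `"` + strings.ReplaceAll(p, `"`, `""`) + `"`
	}
	return strings.Join(quoted, ".")
}

func main() {
	// A two-part name, as in the SELECT and COPY statements of the diff.
	fmt.Println(quoteIdentifier("crdb_internal", "statement_statistics"))

	// A name with an embedded quote stays a single identifier.
	fmt.Printf("USE %s\n", quoteIdentifier(`my"db`))
}
```

With the naive `"USE \"%s\""` formatting, the second name would have closed the quoted identifier early. Doubling the quote keeps the whole value inside one identifier.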
@@ -291,6 +313,7 @@ func (exporter *Exporter) createStatements(db string) ([]string, error) {
 	if err != nil {
 		return creates, err
 	}
+	defer rows.Close()
 
 	for rows.Next() {
 		var create string
@@ -313,6 +336,7 @@ func (exporter *Exporter) userDatabases() ([]string, error) {
 	if err != nil {
 		return nil, err
 	}
+	defer rows.Close()
 
 	var db string
 	for rows.Next() {
@@ -339,12 +363,15 @@ func (exporter *Exporter) exportTable(ctx context.Context, dir string, table Tab
 	defer func(file *os.File) {
 		err := file.Close()
 		if err != nil {
-			logrus.Errorf("failed to close file: %w", err)
+			logrus.WithError(err).Debug("failed to close file")
 		}
 	}(file)
 
 	// Get column names
-	rows, err := exporter.Db.Query(ctx, fmt.Sprintf("SELECT * FROM %s.%s LIMIT 0", table.Database, table.Name))
+	rows, err := exporter.Db.Query(ctx,
+		fmt.Sprintf("SELECT * FROM %s.%s LIMIT 0", pgx.Identifier{table.Database}.Sanitize(),
+
+			pgx.Identifier{table.Name}.Sanitize()))
 	if err != nil {
 		return err
 	}
@@ -367,12 +394,14 @@ func (exporter *Exporter) exportTable(ctx context.Context, dir string, table Tab
 	var where string
 	if table.TimeColumn != "" {
 		where = fmt.Sprintf("WHERE %s BETWEEN '%s' and '%s'",
-			table.TimeColumn,
+			pgx.Identifier{table.TimeColumn}.Sanitize(),
 			startTime(exporter.Config.TimeRange.Start).Format("2006-01-02 15:04:05"), // offset for aggregation interval -- TODO
 			endTime(exporter.Config.TimeRange.End).Format("2006-01-02 15:04:05"),
 		)
 	}
-	copyQuery := fmt.Sprintf("COPY (SELECT * FROM %s.%s %s) TO STDOUT WITH CSV", table.Database, table.Name, where)
+	copyQuery := fmt.Sprintf(
+		"COPY (SELECT * FROM %s.%s %s) TO STDOUT WITH CSV",
+		pgx.Identifier{table.Database}.Sanitize(), pgx.Identifier{table.Name}.Sanitize(), where)
 	logrus.Info(copyQuery)
 	_, err = exporter.Db.PgConn().CopyTo(ctx, file, copyQuery)
 	if err != nil {
@@ -390,15 +419,15 @@ func (exporter *Exporter) createZipFile(sourceDir string) error {
 	defer func(zipFile *os.File) {
 		err := zipFile.Close()
 		if err != nil {
-			logrus.Debugf("failed to close zip file: %w", err)
+			logrus.WithError(err).Debug("failed to close zip file")
 		}
 	}(zipFile)
 
 	zipWriter := zip.NewWriter(zipFile)
 	defer func(zipWriter *zip.Writer) {
 		err := zipWriter.Close()
 		if err != nil {
-			logrus.Debugf("failed to close zip writer: %w", err)
+			logrus.WithError(err).Debug("failed to close zip writer")
 		}
 	}(zipWriter)
 
@@ -416,7 +445,7 @@ func (exporter *Exporter) createZipFile(sourceDir string) error {
 			return err
 		}
 
-		zipFile, err := zipWriter.Create(relPath)
+		zf, err := zipWriter.Create(relPath)
 		if err != nil {
 			return err
 		}
@@ -428,11 +457,11 @@ func (exporter *Exporter) createZipFile(sourceDir string) error {
 		defer func(file *os.File) {
 			err := file.Close()
 			if err != nil {
-				logrus.Debugf("failed to close zip file: %w", err)
+				logrus.WithError(err).Debug("failed to close zip file")
 			}
 		}(file)
 
-		_, err = io.Copy(zipFile, file)
+		_, err = io.Copy(zf, file)
 		return err
 	})
 
@@ -448,13 +477,6 @@ func endTime(t time.Time) time.Time {
 }
 
 func cleanConnectionString(connStr string) (string, error) {
-	/*
-		if !strings.HasPrefix(connStr, "postgresql://") {
-			return "", fmt.Errorf("invalid connection string: must start with postgresql://")
-		}
-
-	*/
-
 	// Parse the connection string as a URL
 	u, err := url.Parse(connStr)
 	if err != nil {
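
Review note: the rest of `cleanConnectionString` is cut off in this hunk. The README says the exported configuration carries the connection string with its password redacted, so one way such a function could work is sketched below. `redactPassword` and the `redacted` placeholder are assumptions made for illustration, not the repository's code.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactPassword is a hypothetical sketch: it parses the connection string
// as a URL and replaces any password with a fixed placeholder.
func redactPassword(connStr string) (string, error) {
	u, err := url.Parse(connStr)
	if err != nil {
		return "", fmt.Errorf("failed to parse connection string: %w", err)
	}
	if u.User != nil {
		if _, hasPassword := u.User.Password(); hasPassword {
			u.User = url.UserPassword(u.User.Username(), "redacted")
		}
	}
	return u.String(), nil
}

func main() {
	out, err := redactPassword("postgresql://user:password@host:26257/?sslmode=verify-full")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // postgresql://user:redacted@host:26257/?sslmode=verify-full
}
```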