#================================================================================================
# TelemetryFlow Deployment — ENVIRONMENT CONFIGURATION
#================================================================================================
#
# Sources:
# - Backend (NestJS API)
# - Frontend (Vue 3 / Vite)
# - Docker Compose (Infrastructure)
#
# SECURITY NOTICE:
# This file contains DEFAULT VALUES suitable for DEVELOPMENT ONLY.
# DO NOT use these values in production environments.
#
# Before deploying to production:
# 1. Generate secure secrets: pnpm run generate:secrets
# 2. Change all default passwords and credentials
# 3. Configure CORS_ORIGIN with specific trusted domains
# 4. Review the Production Security Guide: docs/security/guides/PRODUCTION_SECURITY_GUIDE.md
# 5. Run security tests: npm run test:security
#
# Quick Start: README.md
# Quick Reference: docs/security/SECURITY_QUICK_REFERENCE.md
# Complete Guide: docs/security/SECURITY_INDEX.md
# Documentation: docs/
#
#================================================================================================
#================================================================================================
# [1] APPLICATION CONFIGURATION
#================================================================================================
APP_NAME=TelemetryFlow
NODE_ENV=development
# Backend API Port (NestJS)
PORT=3000
# Frontend Dev Server Port (Vite) — only used in local dev, not Docker
# Production: Frontend is built and served by the backend
FRONTEND_PORT=3101
# Log level: debug | info | warn | error
LOG_LEVEL=info
# Timezone
TZ=UTC
# CORS Configuration
# Development: * allows all origins (localhost, 127.0.0.1, etc.)
# Production: MUST specify comma-separated list of trusted origins (no wildcards)
# Examples:
# Development: CORS_ORIGIN=*
# Production: CORS_ORIGIN=https://app.telemetryflow.id,https://dashboard.telemetryflow.id
# Staging: CORS_ORIGIN=https://staging.telemetryflow.id
# Security Note: Wildcard (*) will trigger warnings in production mode
CORS_ORIGIN=*
# Domain Configuration
MAIN_DOMAIN=telemetryflow.id
DEMO_DOMAIN=demo.telemetryflow.id
# Docker Image Version
VERSION=1.4.0
#================================================================================================
# [2] FRONTEND — APPLICATION SETTINGS
#================================================================================================
# Base URL for the application
TELEMETRYFLOW_BASE_URL=/
# App title & code
TELEMETRYFLOW_APP_TITLE="TELEMETRYFLOW"
TELEMETRYFLOW_APP_CODE="TFO-Viz"
# Enable mock data for development (set to 'true' to skip real collector API calls)
# Development: true | Production: false
TELEMETRYFLOW_USE_MOCK=false
# Default refresh interval (ms)
TELEMETRYFLOW_REFRESH_INTERVAL=5000
# Enable frontend caching
TELEMETRYFLOW_ENABLE_CACHE=true
# Cache TTL in milliseconds (default: 300000 = 5 minutes)
TELEMETRYFLOW_CACHE_TTL=300000
#================================================================================================
# [3] FRONTEND — API ENDPOINTS
#================================================================================================
# TFO-Collector API endpoint (HTTP) — for telemetry data
# Development: http://localhost:4318 | Production: http://tfo-collector:4318
TELEMETRYFLOW_API_URL=http://localhost:3000
# TFO-Core IAM API endpoint (HTTP) — for authentication and IAM
# Development: http://localhost:3000 | Production: http://tfo-core:3000
TELEMETRYFLOW_IAM_API_URL=http://localhost:3000
# TFO-Collector gRPC endpoint
# Development: http://localhost:4317 | Production: http://tfo-collector:4317
TELEMETRYFLOW_GRPC_URL=http://localhost:4317
# WebSocket endpoint for streaming
# Development: ws://localhost:4319 | Production: ws://tfo-collector:4319
TELEMETRYFLOW_WS_URL=ws://localhost:4319
# OTLP endpoint (HTTP)
# Development: http://localhost:4318 | Production: http://tfo-collector:4318
TELEMETRYFLOW_OTLP_ENDPOINT=http://localhost:4318
# Frontend datatable configuration
DATATABLE_MAX_ROWS=5000
# Soft limit: default rows per fetch per organization (capped by TELEMETRYFLOW_LIMIT_DATA_MAX).
# Applies to card stat panels, trends, datatables, graphs, etc., after interval time filtering.
TELEMETRYFLOW_LIMIT_DATA=5000
# Hard limit: absolute max rows per fetch (overrides TELEMETRYFLOW_LIMIT_DATA)
TELEMETRYFLOW_LIMIT_DATA_MAX=100000
# Scan limit: max rows ClickHouse reads for aggregation queries (patterns, stats, top-errors)
# These queries scan ALL logs in the time window but return a small result set, so scan
# limit must be >> result limit. Default: TELEMETRYFLOW_LIMIT_DATA_MAX x 50
TELEMETRYFLOW_SCAN_LIMIT_DATA=5000000
# GitHub repository URL (used in frontend Help/About)
GITHUB_REPOSITORY_URL=https://github.com/telemetryflow/telemetryflow-platform
#================================================================================================
# [4] FRONTEND — AUTHENTICATION & SSO
#================================================================================================
# API Key ID (prefix: tfk_)
# Development: tfk_dev_key | Production: generate a proper key
TELEMETRYFLOW_API_KEY_ID=
# API Key Secret (prefix: tfs_)
# Development: tfs_dev_secret | Production: generate a proper secret
TELEMETRYFLOW_API_KEY_SECRET=
# API Key (sent via header for frontend → backend authentication)
# Development: leave empty | Production: set to match backend API key
TELEMETRYFLOW_API_KEY_HEADER=x-api-key
# Login credentials (admin dashboard)
# SECURITY: Generate a strong password for production!
# openssl rand -base64 24
TELEMETRYFLOW_VIZ_USERNAME=demo.telemetryflow
TELEMETRYFLOW_VIZ_EMAIL=demo.telemetryflow@telemetryflow.id
TELEMETRYFLOW_VIZ_PASSWORD=
# SSO Authentication Providers
# Enable SSO login buttons (set to 'true' to show)
# Development: partial SSO | Production: all SSO providers enabled
TELEMETRYFLOW_SSO_GOOGLE=false
TELEMETRYFLOW_SSO_MICROSOFT=false
TELEMETRYFLOW_SSO_APPLE=false
TELEMETRYFLOW_SSO_SLACK=false
TELEMETRYFLOW_SSO_COGNITO=false
TELEMETRYFLOW_SSO_GITHUB=false
# reCAPTCHA v3 (Status Page subscribe protection)
# Get keys from: https://www.google.com/recaptcha/admin
# Leave empty to disable reCAPTCHA verification (development mode)
TELEMETRYFLOW_RECAPTCHA_SITE_KEY=
RECAPTCHA_SECRET_KEY=
RECAPTCHA_MIN_SCORE=0.5
#================================================================================================
# [5] FRONTEND — WHITE LABEL / BRANDING
#================================================================================================
# Brand Information
TELEMETRYFLOW_BRAND_NAME="TelemetryFlow Observability"
TELEMETRYFLOW_BRAND_TAGLINE="Community Enterprise Observability Platform (CEOP)"
# Domain (used for placeholder URLs and example emails in forms)
TELEMETRYFLOW_DOMAIN=demo.telemetryflow.id
# GitHub/Source URL (used for Help link in navigation)
TELEMETRYFLOW_GITHUB_URL=https://github.com/telemetryflow/overview
# Logo Paths (relative to src/assets or absolute URL)
TELEMETRYFLOW_LOGO_LIGHT=src/assets/tfo-logo-light.svg
TELEMETRYFLOW_LOGO_DARK=src/assets/tfo-logo-dark.svg
TELEMETRYFLOW_LOGO_ICON=src/assets/favicon.svg
TELEMETRYFLOW_LOGO_WIDTH=180px
TELEMETRYFLOW_LOGO_HEIGHT=auto
# Copyright Information
TELEMETRYFLOW_COPYRIGHT_COMPANY=TelemetryFlow
TELEMETRYFLOW_COPYRIGHT_YEAR=2026
# TELEMETRYFLOW_COPYRIGHT_TEXT=Custom copyright text (overrides default)
TELEMETRYFLOW_SHOW_POWERED_BY=false
# Footer Links
TELEMETRYFLOW_LINK_WEBSITE=https://telemetryflow.id
TELEMETRYFLOW_LINK_DOCS=https://docs.telemetryflow.id
TELEMETRYFLOW_LINK_SUPPORT=https://support.telemetryflow.id
# TELEMETRYFLOW_LINK_PRIVACY=https://telemetryflow.id/privacy
# TELEMETRYFLOW_LINK_TERMS=https://telemetryflow.id/terms
# Theme Colors (optional, hex format; quote the value so the leading # is not read as a comment)
# TELEMETRYFLOW_THEME_PRIMARY_COLOR="#6366f1"
# TELEMETRYFLOW_THEME_ACCENT_COLOR="#8b5cf6"
#================================================================================================
# [6] BACKEND — LOGGING CONFIGURATION
#================================================================================================
# TelemetryFlow supports two logging modes:
# 1. Native NestJS Logger (default) — Simple console logging
# 2. Winston Logger — Advanced logging with multiple transports
#
# Winston provides:
# - OpenTelemetry trace correlation (traceId, spanId)
# - Multiple transports (Console, File, Loki, FluentBit, OpenSearch, ClickHouse)
# - Structured JSON logging
# - Log aggregation and search capabilities
#
# Choose logging implementation:
# nestjs — Use native NestJS Logger (simple, console only)
# winston — Use Winston Logger with OpenTelemetry integration
#
# Default: nestjs (for backward compatibility)
# Recommended: winston (for production with observability)
# Documentation: docs/WINSTON_LOGGER.md | config/README.md
LOGGER_TYPE=winston
# Console Transport (always enabled in development)
# Pretty print for development, JSON for production
LOG_PRETTY_PRINT=false
# Log format: json | pretty
LOG_FORMAT=json
# Colorize console output (development only)
LOG_COLORIZE=false
# Paths to skip logging (health checks, metrics scraping)
LOG_SKIP_PATHS=/health,/metrics,/healthz,/ready
#------------------------------------------------------------------------------------------------
# OpenTelemetry Log Transport
#------------------------------------------------------------------------------------------------
# Enable OTEL log export to collector
OTEL_LOGS_ENABLED=true
#------------------------------------------------------------------------------------------------
# File Transport (Daily Rotation)
#------------------------------------------------------------------------------------------------
# File transport writes logs to rotating files for production environments
# Requires: npm install winston-daily-rotate-file (installed as optional dependency)
#
# Features:
# - Daily log rotation (configurable pattern)
# - Automatic compression of old logs (gzip)
# - Size-based rotation (maxSize)
# - Automatic cleanup (maxFiles)
#
# Enable file transport (default: enabled in production, disabled in development)
LOG_FILE_ENABLED=false
# Directory for log files (relative to app root or absolute path)
LOG_FILE_DIRNAME=logs
# Filename pattern (%DATE% is replaced with date)
LOG_FILE_FILENAME=app-%DATE%.log
# Date pattern for file rotation (default: YYYY-MM-DD = daily)
LOG_FILE_DATE_PATTERN=YYYY-MM-DD
# Compress old log files with gzip
LOG_FILE_ZIPPED=true
# Maximum file size before rotation (e.g., '20m', '100k', '1g')
LOG_FILE_MAX_SIZE=20m
# Maximum days/count to keep logs (e.g., '14d' = 14 days, '10' = 10 files)
LOG_FILE_MAX_FILES=14d
# Use JSON format for log files (recommended for log aggregation)
LOG_FILE_JSON=true
#------------------------------------------------------------------------------------------------
# Grafana Loki Integration (Log Aggregation)
#------------------------------------------------------------------------------------------------
# Loki provides log aggregation with LogQL querying
# Requires: docker-compose --profile monitoring (includes loki service)
LOKI_ENABLED=false
# Loki server endpoint
LOKI_HOST=http://loki:3100
# Additional labels for log streams
LOKI_LABELS_APP=telemetryflow
LOKI_LABELS_ENV=staging
#------------------------------------------------------------------------------------------------
# FluentBit Integration (Log Forwarding)
#------------------------------------------------------------------------------------------------
# FluentBit provides lightweight log forwarding to multiple destinations
# Requires: docker-compose --profile monitoring (includes fluentbit service)
FLUENTBIT_ENABLED=false
# FluentBit server host
FLUENTBIT_HOST=fluentbit
# FluentBit Forward protocol port
FLUENTBIT_PORT=24224
# Log tag for FluentBit routing
FLUENTBIT_TAG=telemetryflow.logs
#------------------------------------------------------------------------------------------------
# OpenSearch Integration (Full-Text Log Search)
#------------------------------------------------------------------------------------------------
# OpenSearch provides full-text search and log analytics
# Requires: docker-compose --profile monitoring (includes opensearch service)
OPENSEARCH_ENABLED=false
# OpenSearch server endpoint
OPENSEARCH_NODE=http://opensearch:9200
# SECURITY: Change default credentials for production!
# Generate strong credentials: openssl rand -base64 24
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=
# Index name prefix (daily indices: telemetryflow-logs-YYYY.MM.DD)
OPENSEARCH_INDEX=telemetryflow-logs
#------------------------------------------------------------------------------------------------
# ClickHouse Log Transport (High-Performance Log Analytics)
#------------------------------------------------------------------------------------------------
# ClickHouse provides high-performance log storage and analytics
# Leverages existing ClickHouse instance used for telemetry data
CLICKHOUSE_LOGS_ENABLED=false
# Table name for application logs
CLICKHOUSE_LOGS_TABLE=logs
# Flush interval in milliseconds (aligned with prod code path — flush every 5s)
CLICKHOUSE_FLUSH_INTERVAL=5000
# Buffer size — flush when N logs accumulated (default: 100)
CLICKHOUSE_BUFFER_LIMIT=2000
# ClickHouse client transport timeout (seconds) — safety net for long-running queries
CLICKHOUSE_REQUEST_TIMEOUT=120
#================================================================================================
# [7] BACKEND — OPENTELEMETRY CONFIGURATION
#================================================================================================
# Enable/Disable OpenTelemetry instrumentation
OTEL_ENABLED=true
# Service identity for OTEL traces and metrics
OTEL_SERVICE_NAME=telemetryflow-platform
SERVICE_VERSION=1.4.0
SERVICE_NAMESPACE=telemetryflow
SERVICE_TEAM=platform
# OTEL Collector Endpoint (SDK Exporter target)
# Via Collector: http://tfo-collector.telemetryflow.svc.cluster.local:4318 (TFO-Collector — spanmetrics, exemplars, servicegraph, correlation)
# Self-ingest: http://localhost:3000 (direct to backend — basic traces/logs/metrics only, no exemplars)
# Docker: http://tfo-collector:4318
# Flow: NestJS OTEL SDK → Collector:4318 → Backend:3100/v1/{traces,logs,metrics} → ClickHouse
# OTEL_EXPORTER_OTLP_ENDPOINT=http://tfo-collector.telemetryflow.svc.cluster.local:4318
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
# OTEL Collector Endpoint (Internal K8s service)
# OTEL_COLLECTOR_ENDPOINT=http://tfo-collector.telemetryflow.svc.cluster.local:4318
OTEL_COLLECTOR_ENDPOINT=http://tfo-collector:4318
# TFO Backend OTLP endpoint — where TFO-Collector forwards traces/logs for ClickHouse storage
# Development: http://localhost:3000 | K8s: http://tfo-backend.telemetryflow.svc.cluster.local:3100
# TELEMETRYFLOW_BACKEND_OTLP_ENDPOINT=http://tfo-backend.telemetryflow.svc.cluster.local:3100
TELEMETRYFLOW_BACKEND_OTLP_ENDPOINT=http://localhost:3000
# OTEL Collector Health Check URL (port 13133 = health_check extension)
# Development: http://localhost:13133 | Docker: http://tfo-collector:13133
OTEL_COLLECTOR_HEALTH_URL=http://localhost:13133
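# Docker sketch, assuming the tfo-collector service name used in the comments above
# (uncomment to replace the localhost values when the backend runs in a container):
# OTEL_EXPORTER_OTLP_ENDPOINT=http://tfo-collector:4318
# OTEL_COLLECTOR_HEALTH_URL=http://tfo-collector:13133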
#================================================================================================
# [8] BACKEND — TELEMETRY DATA CONFIGURATION
#================================================================================================
# Data Retention (in days)
# How long telemetry data is kept before automatic cleanup
TELEMETRY_METRICS_RETENTION_DAYS=15
TELEMETRY_LOGS_RETENTION_DAYS=15
TELEMETRY_TRACES_RETENTION_DAYS=15
# Aggregation Configuration
# Enable/disable automatic data aggregation
TELEMETRY_AGGREGATION_ENABLED=true
# Aggregation intervals (comma-separated: 1m, 5m, 15m, 1h, 1d)
TELEMETRY_AGGREGATION_INTERVALS=1m,5m,15m,1h,1d
# Batch Processing Configuration
# Number of records to process in a single batch
TELEMETRY_BATCH_SIZE=2000
# Interval in milliseconds between batch flushes (aligned with prod code path)
TELEMETRY_FLUSH_INTERVAL=5000
# Real-time Streaming Types
# Reduced from metrics,logs,traces — lower streaming volume for staging
TELEMETRY_REALTIME_TYPES=metrics
# Agent Lifecycle Configuration
# Seconds between heartbeats
AGENT_HEARTBEAT_INTERVAL=30
# Minutes before marking inactive
AGENT_INACTIVE_THRESHOLD=10
# Days before deleting inactive agents
AGENT_CLEANUP_DAYS=30
# Uptime Checker
# 30s — staging doesn't need 10s granularity
UPTIME_CHECKER_INTERVAL_MS=30000
#================================================================================================
# [9] BACKEND — POSTGRESQL CONFIGURATION
#================================================================================================
# SECURITY: Change default database credentials for production
# Development: Default postgres user acceptable for local testing
# Production: Create dedicated user with restricted permissions, strong password
#
# Config managed in config/postgresql/postgresql.conf
# Note: Docker containers get POSTGRES_HOST=postgres override from docker-compose.yml
# POSTGRES_HOST=postgresql.telemetryflow.svc.cluster.local
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=telemetryflow_db
# NOTE: .env.example uses POSTGRES_USER, .env.poc uses POSTGRES_USERNAME
# Include both for compatibility — your app should read the one it expects
POSTGRES_USER=tfo_admin
POSTGRES_USERNAME=tfo_admin
# SECURITY: Generate a strong password for production!
# openssl rand -base64 24
POSTGRES_PASSWORD=
# TypeORM / Database aliases (K8s deployment convention)
# DATABASE_HOST=postgresql.telemetryflow.svc.cluster.local
DATABASE_HOST=localhost
DATABASE_PORT=5432
DATABASE_NAME=telemetryflow_db
DATABASE_USERNAME=tfo_admin
DATABASE_SSL=false
DATABASE_POOL_SIZE=10
DATABASE_LOGGING=false
#================================================================================================
# [10] BACKEND — CLICKHOUSE CONFIGURATION
#================================================================================================
# SECURITY: Change default database credentials for production
#
# IMPORTANT: ClickHouse configuration is managed in config/clickhouse/
# - config.xml: Server settings (memory, compression, logging)
# - users.xml: User accounts and access control
# See config/clickhouse/README.md for details.
#
# Security:
# Development: Default credentials acceptable for local testing
# Production:
# - Create dedicated users in config/clickhouse/users.xml
# - Use SHA256 password hashing (not plain text)
# - Separate users for: application, OTLP ingestion, read-only access
# - Restrict network access to specific IP ranges
#
# Note: Docker containers get CLICKHOUSE_HOST override from docker-compose.yml
#
# Performance Settings (configured in config/clickhouse/config.xml):
# - Max Memory Per Query: 10GB
# - Max Concurrent Queries: 100
# - Compression: LZ4 (optimized for telemetry data)
# - TTL: 90 days (metrics), 30 days (logs), 14 days (traces)
#
# Development: localhost | K8s: clickhouse.telemetryflow.svc.cluster.local
# CLICKHOUSE_HOST=clickhouse.telemetryflow.svc.cluster.local
CLICKHOUSE_HOST=localhost
CLICKHOUSE_PORT=8123
CLICKHOUSE_HTTP_PORT=8123
CLICKHOUSE_NATIVE_PORT=9000
CLICKHOUSE_DB=telemetryflow_db
CLICKHOUSE_USER=tfo_admin
# SECURITY: Generate a strong password for production!
# openssl rand -base64 24
CLICKHOUSE_PASSWORD=
#================================================================================================
# [11] BACKEND — REDIS CONFIGURATION
#================================================================================================
# Redis is used for: L2 Cache, BullMQ Queues, Sessions
# Eviction policy: noeviction (CRITICAL for BullMQ to prevent job data loss)
# See config/redis/redis.conf and config/redis/README.md for details
#
# Connection Settings:
# Development: localhost or Docker container IP
# Production: Redis cluster endpoint or managed Redis service
# Note: Docker containers get REDIS_HOST override from docker-compose.yml
# REDIS_HOST=redis-master.telemetryflow.svc.cluster.local
REDIS_HOST=localhost
REDIS_PORT=6379
# Security:
# Development: Leave password empty for local testing
# Production: MUST set a strong password
REDIS_PASSWORD=
# Memory Management:
# Default: 512mb (suitable for development)
# Production: Increase based on workload (e.g., 1gb, 2gb, 4gb)
# When memory limit is reached with 'noeviction', Redis returns errors instead of evicting data
REDIS_MAX_MEMORY=1024mb
# Redis Database Separation (0-15 available)
# Each database is isolated and can store different types of data
REDIS_DB=0
REDIS_SESSION_DB=0
REDIS_CACHE_DB=0
REDIS_QUEUE_DB=1
#================================================================================================
# [12] BACKEND — NATS CONFIGURATION (Event Streaming)
#================================================================================================
# NATS is used for real-time event streaming alongside BullMQ
# BullMQ: Reliable processing with retries
# NATS: Real-time broadcasting (<1ms latency)
#
# Connection URL:
# Development: nats://localhost:4222
# Production: nats://nats:4222 (Docker) or nats://your-nats-server:4222
#
# Features:
# - JetStream enabled for optional persistence
# - Monitoring available at http://0.0.0.0:8222
# - Used for: real-time dashboards, alert broadcasts, live telemetry
#
# Documentation: docs-saas/implementation-ddd/MESSAGING-ARCHITECTURE.md
# Note: Docker containers get NATS_URL override from docker-compose.yml
# NATS_URL=nats://nats.telemetryflow.svc.cluster.local:4222
NATS_URL=nats://localhost:4222
NATS_CLUSTER_ID=telemetryflow-cluster
#================================================================================================
# [13] BACKEND — CACHE CONFIGURATION (Multi-Level L1/L2)
#================================================================================================
# L1 Cache (In-Memory)
CACHE_L1_ENABLED=true
CACHE_L1_TTL=60
CACHE_L1_MAX_SIZE=5000
# L2 Cache (Redis)
CACHE_L2_ENABLED=true
CACHE_L2_TTL=3600
CACHE_KEY_PREFIX=tf:cache:
# Cache Feature Flags
CACHE_ENABLED=true
#================================================================================================
# [14] BACKEND — BULLMQ QUEUE CONFIGURATION
#================================================================================================
# Enable/Disable BullMQ queue system
# Development: true | Set to false to temporarily disable
ENABLE_BULLMQ=true
#------------------------------------------------------------------------------------------------
# Queue Worker Concurrency
#------------------------------------------------------------------------------------------------
# Number of concurrent workers per queue type (reduced for staging)
QUEUE_OTLP_CONCURRENCY=5
QUEUE_EVENTS_CONCURRENCY=3
QUEUE_TELEMETRY_CONCURRENCY=5
QUEUE_ALERTS_CONCURRENCY=2
QUEUE_NOTIFICATIONS_CONCURRENCY=1
QUEUE_AGGREGATION_CONCURRENCY=1
QUEUE_CLEANUP_CONCURRENCY=1
#------------------------------------------------------------------------------------------------
# Queue Rate Limits (jobs per second)
#------------------------------------------------------------------------------------------------
QUEUE_OTLP_RATE_LIMIT=1000
#------------------------------------------------------------------------------------------------
# Queue Retry Configuration
#------------------------------------------------------------------------------------------------
# Maximum retry attempts (default: 3)
QUEUE_MAX_ATTEMPTS=5
# Initial backoff delay in ms (default: 1000)
QUEUE_BACKOFF_DELAY=2000
#------------------------------------------------------------------------------------------------
# Queue Job Cleanup
#------------------------------------------------------------------------------------------------
# Keep last N completed/failed jobs
QUEUE_KEEP_COMPLETED=1000
QUEUE_KEEP_FAILED=5000
# Keep completed/failed jobs for N seconds (86400 = 1 day, 604800 = 7 days)
QUEUE_COMPLETED_AGE=86400
QUEUE_FAILED_AGE=604800
#------------------------------------------------------------------------------------------------
# Queue Feature Flags
#------------------------------------------------------------------------------------------------
QUEUE_ENABLED=true
#================================================================================================
# [15] BACKEND — TFO-AGENT CONFIGURATION
#================================================================================================
# TFO-Agent: Host-level metrics collector (CPU, memory, disk, network)
# Replaces Prometheus for infrastructure monitoring
TFO_AGENT_VERSION=1.2.0
TFO_AGENT_HOSTNAME=tfo-container
TFO_AGENT_HEARTBEAT_INTERVAL=60s
TFO_AGENT_METRICS_INTERVAL=30s
TFO_COLLECTOR_VERSION=1.2.1
#================================================================================================
# [16] BACKEND — API KEYS, JWT & SESSION
#================================================================================================
# SECURITY CRITICAL: These secrets MUST be changed in production
#
# Requirements:
# - Minimum length: 32 characters
# - Use cryptographically secure random strings
# - NEVER commit production secrets to version control
# - Different secrets for each environment (dev, staging, prod)
#
# Generate secure secrets:
# pnpm run generate:secrets
# node -e "console.log(require('crypto').randomBytes(32).toString('base64'))"
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
# JWT Secret (Token signing) — SECURITY: Generate for production!
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
JWT_SECRET=
# JWT Refresh Secret (Refresh token signing) — SECURITY: Generate for production!
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
JWT_REFRESH_SECRET=
# JWT Token Expiry
JWT_ACCESS_EXPIRY=24h
JWT_REFRESH_EXPIRY=30d
JWT_EXPIRES_IN=24h
# Session Secret (Cookie signing) — SECURITY: Generate for production! (different from JWT)
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
SESSION_SECRET=
# General Encryption Key — SECURITY: Generate for production!
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
ENCRYPTION_KEY=
# MFA Encryption Key — Used to encrypt TOTP secrets stored in database
# SECURITY: Generate for production! If key is lost, users will need to re-enroll MFA
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
MFA_ENCRYPTION_KEY=
# LLM Encryption Key (AES-256-GCM for encrypting stored LLM API keys) — SECURITY: Generate for production!
# openssl rand -base64 32 | tr -d '/+=' | cut -c -32
LLM_ENCRYPTION_KEY=
#================================================================================================
# [17] BACKEND — EMAIL / SMTP CONFIGURATION
#================================================================================================
# Email notification system for security alerts and account notifications
#
# Enable/Disable Features:
# SMTP_ENABLED=false (Development: Logs to console instead of sending)
# SMTP_ENABLED=true (Production: Sends actual emails via SMTP)
#
# SMTP Configuration Examples:
# Gmail: host=smtp.gmail.com, port=587, secure=false
# SendGrid: host=smtp.sendgrid.net, port=587, secure=false
# AWS SES: host=email-smtp.region.amazonaws.com, port=587, secure=false
# Mailgun: host=smtp.mailgun.org, port=587, secure=false
# MailHog: host=localhost, port=1025, secure=false (development)
SMTP_ENABLED=true
SMTP_HOST=smtp.telemetryflow.id
SMTP_PORT=587
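# Port 587 normally uses STARTTLS, so secure=false as in the examples above (implicit TLS is usually port 465)
SMTP_SECURE=false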
SMTP_USER=
SMTP_PASSWORD=
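# Local development sketch, assuming MailHog is listening on localhost:1025 as in the example above
# (uncomment to replace the SMTP_HOST, SMTP_PORT, and SMTP_SECURE values):
# SMTP_HOST=localhost
# SMTP_PORT=1025
# SMTP_SECURE=false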
# Sender configuration (combined format)
SMTP_FROM="TelemetryFlow Platform <noreply@telemetryflow.id>"
# Sender configuration (separate fields — alternative to SMTP_FROM)
SMTP_FROM_ADDRESS=noreply@telemetryflow.id
SMTP_FROM_NAME="TelemetryFlow Platform"
SMTP_REPLY_TO=support@telemetryflow.id
# Application URL (used in email templates for links)
# APP_URL=https://demo.telemetryflow.id
APP_URL=http://localhost:3000
#================================================================================================
# [18] BACKEND — RATE LIMITING CONFIGURATION
#================================================================================================
# Default rate limit for authenticated endpoints: THROTTLE_LIMIT requests per THROTTLE_TTL window (ms; 60000 = 1 minute)
THROTTLE_TTL=60000
THROTTLE_LIMIT=500
# Ingestion rate limit for OTEL collector endpoints: THROTTLE_INGESTION_LIMIT requests per THROTTLE_INGESTION_TTL window (ms)
THROTTLE_INGESTION_TTL=60000
THROTTLE_INGESTION_LIMIT=5000
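# Worked example (assuming the TTL values are in milliseconds, so 60000 ms = 60 s):
# 500 requests / 60 s = about 8.3 requests per second (authenticated endpoints)
# 5000 requests / 60 s = about 83.3 requests per second (ingestion endpoints)
# To allow roughly twice the sustained rate, double the LIMIT value and keep the TTL unchanged.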
#================================================================================================
# [19] DOCKER-COMPOSE CONFIGURATION
#================================================================================================
# Usage:
# Core services only: docker-compose --profile core up -d
# Core + monitoring: docker-compose --profile core --profile monitoring up -d
# All services: docker-compose --profile all up -d
# Development (all services): docker-compose --profile dev --profile monitoring up -d
# Monitoring only: docker-compose --profile monitoring up -d
#
# Profiles:
# - core : postgres, clickhouse, redis, nats, backend, frontend
# - monitoring : OTEL collector, TFO-Agent, Loki, OpenSearch, FluentBit
# - tools/dev : PgAdmin, Portainer
# - all : All services
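#
# Preview (a sketch, assuming a Compose version with profile support): list the services a
# profile selection would start, without starting them:
# docker-compose --profile core config --services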
#------------------------------------------------------------------------------------------------
# Volume Configuration
#------------------------------------------------------------------------------------------------
VOLUMES_BASE_PATH=/opt/data/docker/telemetryflow-platform
#------------------------------------------------------------------------------------------------
# Data Paths (for bind mounts)
#------------------------------------------------------------------------------------------------
# Docker manages named volumes automatically; these paths are only needed for external bind mounts
# Core Services
DATA_POSTGRESQL=/opt/data/docker/telemetryflow-platform/postgresql
DATA_CLICKHOUSE=/opt/data/docker/telemetryflow-platform/clickhouse
DATA_REDIS=/opt/data/docker/telemetryflow-platform/redis
DATA_NATS=/opt/data/docker/telemetryflow-platform/nats
# Monitoring Services (Profile: monitoring)
DATA_TFO_AGENT=/opt/data/docker/telemetryflow-platform/tfo-agent
# Dev Tools (Profile: dev/tools)
DATA_PORTAINER=/opt/data/docker/telemetryflow-platform/portainer
#================================================================================================
# [20] SERVICE VERSIONS
#================================================================================================
# Application
NODE_VERSION=22
# Infrastructure
POSTGRES_VERSION=16-alpine
CLICKHOUSE_VERSION=latest
REDIS_VERSION=7-alpine
NATS_VERSION=2-alpine
# Monitoring (Profile: monitoring)
OTEL_VERSION=latest
# Logging Infrastructure (Profile: monitoring)
LOKI_VERSION=2.9.0
OPENSEARCH_VERSION=2.11.0
OPENSEARCH_DASHBOARDS_VERSION=2.11.0
FLUENTBIT_VERSION=2.2
# Dev Tools (Profile: dev/tools)
PGADMIN_VERSION=latest
PORTAINER_VERSION=latest
#================================================================================================
# [21] CONTAINER NAMES
#================================================================================================
# Core Services
CONTAINER_BACKEND=telemetryflow_platform_backend
CONTAINER_FRONTEND=telemetryflow_platform_frontend
CONTAINER_POSTGRES=telemetryflow_platform_postgres
CONTAINER_CLICKHOUSE=telemetryflow_platform_clickhouse
CONTAINER_REDIS=telemetryflow_platform_redis
CONTAINER_NATS=telemetryflow_platform_nats
# Monitoring Services (Profile: monitoring)
CONTAINER_OTEL=telemetryflow_platform_tfo_collector
CONTAINER_TFO_AGENT=telemetryflow_platform_tfo_agent
# Logging Infrastructure (Profile: monitoring)
CONTAINER_LOKI=telemetryflow_loki
CONTAINER_OPENSEARCH=telemetryflow_opensearch
CONTAINER_OPENSEARCH_DASHBOARDS=telemetryflow_opensearch_dashboards
CONTAINER_FLUENTBIT=telemetryflow_fluentbit
# Dev Tools (Profile: dev/tools)
CONTAINER_PGADMIN=telemetryflow_pgadmin
CONTAINER_PORTAINER=telemetryflow_platform_portainer
#================================================================================================
# [22] PORT MAPPINGS
#================================================================================================
# Core Services
PORT_BACKEND=3000
PORT_FRONTEND=8080
PORT_POSTGRES=5432
PORT_CLICKHOUSE_HTTP=8123
PORT_CLICKHOUSE_NATIVE=9000
PORT_CLICKHOUSE_METRICS=9363
PORT_REDIS=6379
PORT_NATS_CLIENT=4222
PORT_NATS_MONITOR=8222
# Monitoring Services (Profile: monitoring)
PORT_OTEL_GRPC=4317
PORT_OTEL_HTTP=4318
PORT_OTEL_METRICS=8889
PORT_OTEL_HEALTH=13133
PORT_OTEL_ZPAGES=55679
PORT_TFO_AGENT=9191
# Dev Tools (Profile: dev/tools)
PORT_PORTAINER=9100
PORT_PORTAINER_HTTPS=9443
#================================================================================================
# [23] STATIC IP ADDRESSES (Docker internal networking)
#================================================================================================
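# Network Subnet: 172.151.0.0/16
# Note: These IPs are used for internal Docker networking only.
# 172.151.0.0/16 is outside the RFC 1918 private ranges (172.16.0.0/12 covers 172.16.x.x-172.31.x.x),
# so containers on this network cannot reach public hosts that use these addresses.
# Services can still communicate using DNS names (preferred method).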
# Core Services
CONTAINER_IP_BACKEND=172.151.151.10
CONTAINER_IP_FRONTEND=172.151.151.15
CONTAINER_IP_POSTGRES=172.151.151.20
CONTAINER_IP_CLICKHOUSE=172.151.151.40
CONTAINER_IP_REDIS=172.151.151.50
CONTAINER_IP_NATS=172.151.151.55
# Monitoring Services (Profile: monitoring)
CONTAINER_IP_OTEL=172.151.151.30
CONTAINER_IP_TFO_AGENT=172.151.151.35
# Dev Tools (Profile: dev/tools)
CONTAINER_IP_PORTAINER=172.151.151.5
#================================================================================================
# [24] DEV TOOLS — PGADMIN CONFIGURATION (Profile: dev/tools)
#================================================================================================
# SECURITY: Generate a strong password for production!
# openssl rand -base64 24
PGADMIN_EMAIL=admin@telemetryflow.id
PGADMIN_PASSWORD=
#================================================================================================
# [25] KUBERNETES / DEPLOYMENT CONFIGURATION (OPTIONAL)
#================================================================================================
CLUSTER_NAME=telemetryflow-staging
K8S_NAMESPACE=telemetryflow
K8S_POD_NAME=local-pod
K8S_NODE_NAME=local-node
HOSTNAME=localhost
# TFO Kubernetes Cluster Registration ID
# Obtain by: POST /api/v2/monitoring/kubernetes/clusters → copy the returned {id}
# Set the same UUID in TFO Agent config: collectors.kubernetes.cluster_id
TELEMETRYFLOW_K8S_CLUSTER_ID=
#================================================================================================
# [26] DEFAULT DEMO CONFIGURATION (OTEL Collector)
#================================================================================================
# Default region, workspace, and tenant for internal OTEL collector
# These values are automatically applied to ingested data that doesn't specify them
# Region code (e.g., APS3 = ap-southeast-3 / Asia Pacific - Jakarta)
DEFAULT_REGION_CODE=APS3
# Workspace code
DEFAULT_WORKSPACE_CODE=TELEMETRYFLOW-STAGING
# Tenant code
DEFAULT_TENANT_CODE=TELEMETRYFLOW
# Default Organization ID for alert rule evaluation
# This is used as fallback when no enabled alert rules are found
# Query your database to get the correct organization ID:
# SELECT id FROM organizations WHERE name = 'Your Organization Name';
# Leave empty to use 'default' as fallback
DEFAULT_ORGANIZATION_ID=
#================================================================================================
# PRODUCTION DEPLOYMENT SECURITY CHECKLIST
#================================================================================================
#
# Before deploying to production, ensure ALL items are checked:
#
# [ ] JWT_SECRET changed to 32+ character random string
# [ ] JWT_REFRESH_SECRET changed to 32+ character random string
# [ ] SESSION_SECRET changed to 32+ character random string (different from the JWT secrets)
# [ ] ENCRYPTION_KEY changed to 32+ character random string
# [ ] MFA_ENCRYPTION_KEY changed to 32+ character random string
# [ ] LLM_ENCRYPTION_KEY changed to 32+ character random string
# [ ] CORS_ORIGIN set to specific trusted domains (no wildcard)
# [ ] POSTGRES_PASSWORD changed from default
# [ ] CLICKHOUSE_PASSWORD changed from default
# [ ] REDIS_PASSWORD set to a strong password
# [ ] OPENSEARCH_PASSWORD changed from default
# [ ] PGADMIN_PASSWORD changed from default
# [ ] TELEMETRYFLOW_API_KEY_ID / TELEMETRYFLOW_API_KEY_SECRET aligned to system key #1 (DevOpsCorner)
# [ ] TELEMETRYFLOW_VIZ_PASSWORD changed from default
# [ ] All database users have restricted permissions (least privilege)
# [ ] HTTPS/TLS enabled at reverse proxy (nginx/traefik)
# [ ] Security headers validated (run: npm run test:security)
# [ ] SMTP_ENABLED=true with production SMTP server
# [ ] TELEMETRYFLOW_USE_MOCK=false
# [ ] NODE_ENV=production
# [ ] Environment variables stored securely (secrets manager, not in git)
# [ ] Backup and disaster recovery plan in place
# [ ] Monitoring and alerting configured
#
# Generate Secure Secrets:
# pnpm run generate:secrets
#
# Test Security Configuration:
# npm run test:security
#
# For complete deployment guide, see:
# - docs/security/guides/PRODUCTION_SECURITY_GUIDE.md
# - docs/security/SECURITY_QUICK_REFERENCE.md
#
#================================================================================================