Add db and vital metrics for clock and deployment_updater
This change exports database connection pool metrics and vital metrics
for the `cloud_controller_clock` and `cc_deployment_updater` processes via Prometheus.
This required some bigger changes:
- Metrics Infrastructure:
  - Add `ExecutionContext` class to manage process type identification and rake context setup across all CAPI processes
  - Add `StandaloneMetricsWebserver` for non-API processes (clock, deployment_updater, cc-worker) with automatic TLS configuration
  - Refactor `PrometheusUpdater` to register metrics based on execution context, avoiding unnecessary metric registration
  - Consolidate `PrometheusUpdater` instances (remove separate `cc_worker_prometheus_updater`)
- Clock & Deployment Updater Configuration:
  - Add `publish_metrics` config option to enable/disable metrics publishing
  - Add `prometheus_port` config option (default: 9394 for clock, 9395 for deployment_updater)
  - Initialize metrics webserver and periodic vital updates when metrics publishing is enabled
- Metric Improvements:
  - Refactor `PeriodicUpdater` to support configurable task lists (enables vitals-only updates for clock, deployment_updater, cc-worker)
- Code Organization:
  - Rename `MetricsWebserver` to `ApiMetricsWebserver` for clarity
  - Update all rake tasks to set execution context via `ExecutionContext` API
  - Update DB connection metrics initialization to use `ExecutionContext`
  - Update existing specs to work with new execution context
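
To illustrate how an execution context could drive a configurable task list, here is a minimal sketch. Everything in it is an assumption made for illustration: `ExecutionContextSketch`, the `vitals_only` flag, and the task names are hypothetical and are not taken from the actual `ExecutionContext` or `PeriodicUpdater` code. Only the process types and the vitals-only behaviour for clock, deployment_updater, and cc-worker come from the description above.

```ruby
# Hypothetical sketch; names and structure are assumptions, not the CAPI implementation.
module ExecutionContextSketch
  # Process types named in the commit message. The vitals_only flag is an
  # assumption used here to model "vitals-only updates" for non-API processes.
  PROCESS_TYPES = {
    api: { vitals_only: false },
    cc_worker: { vitals_only: true },
    clock: { vitals_only: true },
    deployment_updater: { vitals_only: true }
  }.freeze

  class << self
    attr_reader :current

    # Records which kind of process is running; rejects unknown types.
    def set(process_type)
      raise ArgumentError, "unknown process type: #{process_type}" unless PROCESS_TYPES.key?(process_type)

      @current = process_type
    end

    def vitals_only?
      PROCESS_TYPES.fetch(current)[:vitals_only]
    end
  end
end

# Hypothetical task names; the real updater's task list is not shown in this commit view.
ALL_TASKS = %i[update_vitals update_user_count update_task_stats].freeze

# Picks the periodic tasks for the current context.
def periodic_tasks_for_current_context
  ExecutionContextSketch.vitals_only? ? %i[update_vitals] : ALL_TASKS
end

ExecutionContextSketch.set(:clock)
puts periodic_tasks_for_current_context.inspect
```

Keeping the process-type table in one place means a rake task only has to declare its context once, and code that registers metrics or schedules updates can branch on that single value.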
# only add the metrics for api and cc-worker processes. Otherwise e.g. rake db:migrate would also initialize metric updaters, which need additional config
 { type: :gauge, name: :cc_running_tasks_memory_bytes, docstring: 'Total memory consumed by running tasks', aggregation: :most_recent },
 { type: :gauge, name: :cc_users_total, docstring: 'Number of users', aggregation: :most_recent },
@@ -53,7 +48,8 @@ def self.allow_pid_label

 DB_CONNECTION_POOL_METRICS = [
   { type: :gauge, name: :cc_acquired_db_connections_total, labels: %i[process_type], docstring: 'Number of acquired DB connections' },
-  { type: :histogram, name: :cc_db_connection_hold_duration_seconds, docstring: 'The time threads were holding DB connections', buckets: CONNECTION_DURATION_BUCKETS },
+  { type: :histogram, name: :cc_db_connection_hold_duration_seconds, docstring: 'The time threads were holding DB connections',
+    buckets: CONNECTION_DURATION_BUCKETS },
   # cc_connection_pool_timeouts_total must be a gauge metric, because otherwise we cannot match them with processes
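
The definition hashes in this hunk lend themselves to table-driven registration. The sketch below shows one way that could look. `SketchRegistry` is a hypothetical stand-in written for this example, not the prometheus-client registry and not the actual `PrometheusUpdater` code, and the bucket values are assumed because `CONNECTION_DURATION_BUCKETS` is not shown in the diff.

```ruby
# Hypothetical stand-in for a metrics registry; not the prometheus-client API.
class SketchRegistry
  attr_reader :metrics

  def initialize
    @metrics = {}
  end

  # Stores one metric definition; extra keys such as :buckets or :aggregation
  # are collected into options.
  def register(type:, name:, docstring:, labels: [], **options)
    raise ArgumentError, "#{name} already registered" if @metrics.key?(name)

    @metrics[name] = { type: type, docstring: docstring, labels: labels, options: options }
  end
end

# Assumed bucket boundaries in seconds; the real constant is not visible in this diff.
CONNECTION_DURATION_BUCKETS = [0.001, 0.01, 0.1, 1, 10].freeze

DB_CONNECTION_POOL_METRICS = [
  { type: :gauge, name: :cc_acquired_db_connections_total, labels: %i[process_type], docstring: 'Number of acquired DB connections' },
  { type: :histogram, name: :cc_db_connection_hold_duration_seconds, docstring: 'The time threads were holding DB connections',
    buckets: CONNECTION_DURATION_BUCKETS }
].freeze

registry = SketchRegistry.new
# Each hash is splatted into keyword arguments, so adding a metric means adding one table row.
DB_CONNECTION_POOL_METRICS.each { |definition| registry.register(**definition) }
puts registry.metrics.keys.inspect
```

Because each metric is plain data, a list such as `DB_CONNECTION_POOL_METRICS` can be registered or skipped as a unit depending on the process type.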