Reference for all configurable knobs in the GKG server. All four modes (Webserver, Indexer, DispatchIndexing, HealthCheck) share the same AppConfig struct and loading mechanism.
Config is loaded in layers, each overriding the previous:
- Configuration file: Example `config/default.yaml`
- Secrets: Files in `/etc/secrets/` (Kubernetes secret mounts)
- Environment variables: Prefixed with `GKG_`, using `__` as a separator for nested keys and `,` for lists
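As a rough illustration of the layering, the sketch below merges a file layer, a secrets layer, and `GKG_`-prefixed environment variables in that order. It is not the server's loader: `env_to_path`, `deep_merge`, and `load_layers` are hypothetical names, values are kept as strings, and list splitting on `,` is left out.

```python
import copy

PREFIX = "GKG_"


def env_to_path(name):
    """Map GKG_NATS__URL to ["nats", "url"]; "__" separates nested keys."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name} lacks the {PREFIX} prefix")
    return [part.lower() for part in name[len(PREFIX):].split("__")]


def deep_merge(base, override):
    """Return a new dict in which keys from override win, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_layers(file_cfg, secrets_cfg, environ):
    """Apply the layers in order: file, then secrets, then environment."""
    cfg = copy.deepcopy(deep_merge(file_cfg, secrets_cfg))
    for name, raw in environ.items():
        if not name.startswith(PREFIX):
            continue  # unrelated variables are ignored
        node = cfg
        path = env_to_path(name)
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = raw  # assumption: no type coercion in this sketch
    return cfg
```

With `GKG_NATS__URL` set, `load_layers` replaces `nats.url` from the file layer and leaves every other key as the earlier layers defined it.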
Environment variable examples:

```shell
GKG_NATS__URL=nats://gkg-nats:4222
GKG_GRAPH__DATABASE=gkg-sandbox
GKG_ENGINE__MAX_CONCURRENT_WORKERS=16
```

The binary (gkg-server) runs in one of four modes via --mode:

| Mode | Purpose | Key config sections |
|---|---|---|
| `Webserver` | HTTP/gRPC query server | bind_address, grpc_bind_address, grpc, tls, query, graph, gitlab |
| `Indexer` | Consumes NATS messages and runs indexing handlers | nats, engine, graph, datalake, gitlab, schedule, schema |
| `DispatchIndexing` | Runs the scheduler loop that publishes indexing requests | nats, graph, datalake, schedule, schema |
| `HealthCheck` | K8s readiness/liveness probes | health_check, graph, datalake |

All modes share the same configuration structure.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `nats.url` | `GKG_NATS__URL` | `localhost:4222` | Broker address |
| `nats.username` | `GKG_NATS__USERNAME` | None | Auth username |
| `nats.password` | `GKG_NATS__PASSWORD` | None | Auth password |
| `nats.tls_ca_cert_path` | `GKG_NATS__TLS_CA_CERT_PATH` | None | CA cert (PEM) for TLS. Setting any TLS path enables TLS. |
| `nats.tls_cert_path` | `GKG_NATS__TLS_CERT_PATH` | None | Client cert (PEM) for mTLS. Must pair with tls_key_path. |
| `nats.tls_key_path` | `GKG_NATS__TLS_KEY_PATH` | None | Client key (PEM) for mTLS. Must pair with tls_cert_path. |
| `nats.connection_timeout_secs` | | `10` | Connection timeout (seconds) |
| `nats.request_timeout_secs` | | `5` | Request timeout (seconds) |

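The TLS rules in the table (any TLS path enables TLS, and the client cert and key must be set together) can be expressed as a small check. This is a sketch of the rule only; whether the server rejects an unpaired cert or key at startup is an assumption, and `nats_tls_enabled` is a hypothetical helper.

```python
from typing import Optional


def nats_tls_enabled(ca_cert: Optional[str],
                     cert: Optional[str],
                     key: Optional[str]) -> bool:
    """Return True when any TLS path is set; raise if cert and key are unpaired."""
    if (cert is None) != (key is None):
        # Assumption: an unpaired client cert or key is treated as a config error.
        raise ValueError("tls_cert_path and tls_key_path must be set together")
    return any(path is not None for path in (ca_cert, cert, key))
```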

| Config path | Env var | Default | Description |
|---|---|---|---|
| `nats.consumer_name` | `GKG_NATS__CONSUMER_NAME` | None | Durable consumer name. None = ephemeral (lost on disconnect). Set in production for persistence across restarts. |
| `nats.ack_wait_secs` | | `300` | Seconds before an unacked message is redelivered |
| `nats.max_deliver` | | `5` | Max redelivery attempts. None = unlimited. |
| `nats.batch_size` | | `10` | Messages fetched per batch |
| `nats.subscription_buffer_size` | | `100` | Internal channel buffer between fetch loop and handler |
| `nats.fetch_expires_secs` | | `5` | Server-side timeout for batch fetch (clamped to min 1s) |


| Config path | Env var | Default | Description |
|---|---|---|---|
| `nats.auto_create_streams` | `GKG_NATS__AUTO_CREATE_STREAMS` | `true` | Create streams on startup |
| `nats.stream_replicas` | `GKG_NATS__STREAM_REPLICAS` | `1` | Replicas per stream. Use 3 in production for fault tolerance. |
| `nats.stream_max_age_secs` | `GKG_NATS__STREAM_MAX_AGE_SECS` | None | Max message age before deletion |
| `nats.stream_max_bytes` | `GKG_NATS__STREAM_MAX_BYTES` | None | Max stream size in bytes |
| `nats.stream_max_messages` | `GKG_NATS__STREAM_MAX_MESSAGES` | None | Max messages per stream |

The GKG_INDEXER stream is created with:
- Retention: `WorkQueue` (messages deleted after ack)
- Discard: `New` with `discard_new_per_subject: true`
- Max messages per subject: `1` (deduplication: rejects publishes while a handler hasn't acked)
- Storage: File
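The deduplication effect of these stream settings can be pictured with a toy in-memory model. It does not talk to NATS; `PerSubjectWorkQueue` and the subject names used with it are hypothetical and only mimic the three rules listed above.

```python
class PerSubjectWorkQueue:
    """Toy model of a work-queue stream that holds one message per subject."""

    def __init__(self, max_per_subject=1):
        self.max_per_subject = max_per_subject
        self.pending = {}  # subject -> list of unacked payloads

    def publish(self, subject, payload):
        """Return False when the subject is full (discard policy: New)."""
        queue = self.pending.setdefault(subject, [])
        if len(queue) >= self.max_per_subject:
            return False  # the new message is refused, the old one stays
        queue.append(payload)
        return True

    def ack(self, subject):
        """WorkQueue retention: an acked message is deleted from the stream."""
        queue = self.pending.get(subject)
        if queue:
            queue.pop(0)
```

A second publish on the same subject is refused until the first message is acked, while other subjects are unaffected.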
Two separate ClickHouse connections are required: one for the datalake (Siphon-replicated tables) and one for the graph (indexed property graph).

| Config path | Env var | Default | Description |
|---|---|---|---|
| `datalake.url` | `GKG_DATALAKE__URL` | `http://127.0.0.1:8123` | HTTP endpoint |
| `datalake.database` | `GKG_DATALAKE__DATABASE` | `default` | Database name |
| `datalake.username` | `GKG_DATALAKE__USERNAME` | `default` | Auth user |
| `datalake.password` | `GKG_DATALAKE__PASSWORD` | None | Auth password |
| `datalake.query_settings` | | `{}` | ClickHouse session settings (e.g., max_rows_to_read) |


| Config path | Env var | Default | Description |
|---|---|---|---|
| `graph.url` | `GKG_GRAPH__URL` | `http://127.0.0.1:8123` | HTTP endpoint |
| `graph.database` | `GKG_GRAPH__DATABASE` | `default` | Database name |
| `graph.username` | `GKG_GRAPH__USERNAME` | `default` | Auth user |
| `graph.password` | `GKG_GRAPH__PASSWORD` | None | Auth password |
| `graph.query_settings` | | `{}` | ClickHouse session settings |


| Config path | Default | Description |
|---|---|---|
| `graph.profiling.enabled` | `false` | Enable query profiling |
| `graph.profiling.explain` | `false` | Collect EXPLAIN output |
| `graph.profiling.query_log` | `false` | Log to system.query_log |
| `graph.profiling.processors` | `false` | Collect processor stats |
| `graph.profiling.instance_health` | `false` | Check instance health |

The worker pool limits how many messages are processed concurrently. It uses a two-level semaphore: a global limit and optional per-group limits.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `engine.max_concurrent_workers` | `GKG_ENGINE__MAX_CONCURRENT_WORKERS` | `16` | Global concurrency cap |
| `engine.concurrency_groups` | | `{}` | Named group limits |

Each handler can declare a concurrency_group. When a message arrives, the handler acquires the group semaphore first, then the global semaphore. Both are released after processing.
This prevents one handler type from monopolizing all workers. For example, with 16 global workers, you can cap SDLC at 12 and code at 4:

```yaml
engine:
  max_concurrent_workers: 16
  concurrency_groups:
    sdlc: 12
    code: 4
```

Each NATS subscription has retry and concurrency settings under engine.topics.<name>:

| Config path | Default | Description |
|---|---|---|
| `topics.<name>.concurrency_group` | None | Which group semaphore to use |
| `topics.<name>.max_attempts` | None | Total attempts (1 = no retry, 5 = 4 retries) |
| `topics.<name>.retry_interval_secs` | None | Delay between retries (NATS nack delay) |

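The two-level semaphore can be sketched with standard-library primitives. This is a thread-based stand-in rather than the server's implementation; `WorkerPool` and `slot` are hypothetical names, and the acquisition order (group first, then global) follows the description above.

```python
import threading
from contextlib import contextmanager


class WorkerPool:
    """Global concurrency cap plus optional per-group caps."""

    def __init__(self, max_concurrent_workers, concurrency_groups):
        self._global = threading.BoundedSemaphore(max_concurrent_workers)
        self._groups = {
            name: threading.BoundedSemaphore(limit)
            for name, limit in concurrency_groups.items()
        }

    @contextmanager
    def slot(self, group=None):
        """Hold one worker slot for the duration of the with-block."""
        group_sem = self._groups.get(group) if group else None
        if group_sem is not None:
            group_sem.acquire()   # group semaphore first
        self._global.acquire()    # then the global semaphore
        try:
            yield
        finally:
            # Both are released after processing.
            self._global.release()
            if group_sem is not None:
                group_sem.release()
```

With `WorkerPool(16, {"sdlc": 12, "code": 4})`, a fifth concurrent `code` handler blocks on the group semaphore while `sdlc` handlers can still take free global slots.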
From config/default.yaml:

```yaml
engine:
  max_concurrent_workers: 16
  concurrency_groups:
    sdlc: 12
    code: 4
  topics:
    global-handler:
      concurrency_group: sdlc
      max_attempts: 1
      retry_interval_secs: 60
    namespace-handler:
      concurrency_group: sdlc
      max_attempts: 1
      retry_interval_secs: 60
    code-indexing-task:
      concurrency_group: code
      max_attempts: 5
      retry_interval_secs: 60
      dead_letter_on_exhaustion: true
    namespace-deletion:
      concurrency_group: code
      max_attempts: 1
```

| Config path | Default | Description |
|---|---|---|
| `engine.handlers.entity-handler.datalake_batch_size` | 1,000,000 | Rows per datalake extraction query |
| `engine.handlers.entity-handler.batch_size_overrides.<Entity>` | None | Per-entity override for datalake batch size |
| `engine.handlers.entity-handler.partition_overrides.<Entity>` | None | Number of partitions for initial load parallelism |

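The precedence between the handler-wide batch size and a per-entity override might look like the sketch below. It assumes an override simply replaces the default for that entity; `batch_size_for` and the entity name `MergeRequest` are hypothetical.

```python
DEFAULT_DATALAKE_BATCH_SIZE = 1_000_000  # default from the table above


def batch_size_for(entity, handler_cfg):
    """Return the per-entity override if present, else the handler-wide value."""
    overrides = handler_cfg.get("batch_size_overrides") or {}
    fallback = handler_cfg.get("datalake_batch_size", DEFAULT_DATALAKE_BATCH_SIZE)
    return overrides.get(entity, fallback)
```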

| Config path | Default | Description |
|---|---|---|
| `engine.handlers.code-indexing-task.pipeline.max_file_size_bytes` | 5,000,000 | Largest source file the v2 pipeline will parse |
| `engine.handlers.code-indexing-task.pipeline.max_files` | 1,000,000 | Maximum language-supported source files accepted for one pipeline run |
| `engine.handlers.code-indexing-task.pipeline.worker_threads` | `0` | Rayon workers per language; 0 uses Rayon default |
| `engine.handlers.code-indexing-task.pipeline.max_concurrent_languages` | `0` | Concurrent language pipelines; 0 uses the pipeline default |


| Topic | max_attempts | DLQ | Rationale |
|---|---|---|---|
| `global-handler` (SDLC) | 1 | No | Re-dispatched every cycle. No need to retry. |
| `namespace-handler` (SDLC) | 1 | No | Re-dispatched every cycle. No need to retry. |
| `code-indexing-task` | 5 | Yes | Event-driven. Won't be re-dispatched. Must retry and DLQ. |
| `namespace-deletion` | 1 | No | Re-dispatched on next scheduler cycle. |

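The retry policy in the table can be read as a small decision function. The sketch below is an assumption about how `max_attempts` and `dead_letter_on_exhaustion` interact, based only on the descriptions in this reference; `TopicConfig`, `on_failure`, and the action names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TopicConfig:
    max_attempts: int
    retry_interval_secs: int = 60
    dead_letter_on_exhaustion: bool = False


def on_failure(attempt, topic):
    """Decide what happens after the given 1-based attempt has failed."""
    if attempt < topic.max_attempts:
        return "retry"  # nack with a delay of retry_interval_secs
    # Attempts are exhausted: dead-letter only if the topic asks for it.
    return "dead_letter" if topic.dead_letter_on_exhaustion else "give_up"
```

With `max_attempts: 5` the first four failures are retried and the fifth is dead-lettered; with `max_attempts: 1` the first failure already ends processing, which is acceptable for topics that the scheduler re-dispatches.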
Scheduled tasks run in DispatchIndexing mode. Each task has a 6-field cron expression (seconds, minutes, hours, day-of-month, month, day-of-week). Tasks without a cron expression fall back to a 60-second interval.
Distributed locking via NATS KV ensures only one dispatcher instance runs each task per interval.

| Task | Config path | Default cron | Description |
|---|---|---|---|
| Global dispatch | `schedule.tasks.global.cron` | `0 */1 * * * *` (every minute) | Publishes GlobalIndexingRequest |
| Namespace dispatch | `schedule.tasks.namespace.cron` | `0 */1 * * * *` (every minute) | Publishes per-namespace requests |
| Code task dispatch | `schedule.tasks.code-indexing-task.cron` | `0 */1 * * * *` (every minute) | Consumes Siphon CDC push events |
| Code backfill | `schedule.tasks.namespace-code-backfill.cron` | `0 */1 * * * *` (every minute) | Backfills newly enabled namespaces |
| Table cleanup | `schedule.tasks.table-cleanup.cron` | `0 0 3 * * *` (daily 03:00 UTC) | Runs OPTIMIZE TABLE ... FINAL CLEANUP |
| Namespace deletion | `schedule.tasks.namespace-deletion.cron` | `0 0 3 * * *` (daily 03:00 UTC) | Schedules and executes namespace deletions |
| Migration completion | `schedule.tasks.migration-completion.cron` | `0 */1 * * * *` (every minute) | Detects completed schema migrations |

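Because the expressions have six fields rather than the five of classic cron, it can help to label each position. The sketch below only splits and names the fields and applies the 60-second fallback; it does not evaluate schedules, and `describe_schedule` is a hypothetical helper.

```python
CRON_FIELDS = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")
FALLBACK_INTERVAL_SECS = 60


def describe_schedule(cron):
    """Label the six cron fields, or report the fallback interval when unset."""
    if cron is None:
        return {"interval_secs": FALLBACK_INTERVAL_SECS}
    parts = cron.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 6 fields, got {len(parts)}: {cron!r}")
    return dict(zip(CRON_FIELDS, parts))
```

For `0 0 3 * * *` this yields seconds `0`, minutes `0`, hours `3`, which matches the daily 03:00 schedule in the table.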

| Config path | Default | Description |
|---|---|---|
| `schedule.tasks.code-indexing-task.events_stream_name` | `siphon_stream_main_db` | NATS stream for Siphon CDC events |
| `schedule.tasks.code-indexing-task.batch_size` | `100` | CDC events to process per cycle |


| Config path | Default | Description |
|---|---|---|
| `schedule.tasks.namespace-code-backfill.events_stream_name` | `siphon_stream_main_db` | NATS stream for namespace events |
| `schedule.tasks.namespace-code-backfill.batch_size` | `100` | Events to process per cycle |

Required for code indexing (repository archive download) and authorization.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `gitlab.base_url` | `GKG_GITLAB__BASE_URL` | None | GitLab instance URL |
| `gitlab.jwt.signing_key` | `GKG_GITLAB__JWT__SIGNING_KEY` | None | JWT signing key (for creating tokens) |
| `gitlab.jwt.verifying_key` | `GKG_GITLAB__JWT__VERIFYING_KEY` | (required) | JWT verification key |
| `gitlab.resolve_host` | `GKG_GITLAB__RESOLVE_HOST` | None | Override DNS resolution for GitLab |


| Config path | Env var | Default | Description |
|---|---|---|---|
| `metrics.log_level` | `GKG_METRICS__LOG_LEVEL` | None | Rust log filter string |

Example: `info,gkg_server=debug,gkg_indexer=trace`
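To show how such a filter string breaks down, the sketch below splits it into a default level and per-target levels. It handles only the simple `level` and `target=level` forms used in the example; real Rust log filters accept more syntax, and `parse_log_filter` is a hypothetical helper rather than anything the server exposes.

```python
def parse_log_filter(spec):
    """Split "info,a=debug" into ("info", {"a": "debug"})."""
    default_level = None
    targets = {}
    for directive in spec.split(","):
        directive = directive.strip()
        if not directive:
            continue
        if "=" in directive:
            target, level = directive.split("=", 1)
            targets[target] = level
        else:
            default_level = directive  # a bare level applies to all targets
    return default_level, targets
```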

| Config path | Default | Description |
|---|---|---|
| `metrics.otel.enabled` | `false` | Enable OTEL tracing |
| `metrics.otel.endpoint` | `http://localhost:4317` | OTEL gRPC collector endpoint |


| Config path | Default | Description |
|---|---|---|
| `metrics.prometheus.enabled` | `false` | Expose /metrics endpoint |
| `metrics.prometheus.port` | `9394` | Prometheus scrape port |

These settings are used by the Webserver mode.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `bind_address` | `GKG_BIND_ADDRESS` | `127.0.0.1:4200` | HTTP server bind address |
| `grpc_bind_address` | `GKG_GRPC_BIND_ADDRESS` | `127.0.0.1:50054` | gRPC server bind address |
| `jwt_clock_skew_secs` | `GKG_JWT_CLOCK_SKEW_SECS` | `60` | Allowed JWT clock skew in seconds |
| `health_check_url` | `GKG_HEALTH_CHECK_URL` | None | Optional health check URL |


| Config path | Env var | Default | Description |
|---|---|---|---|
| `tls.cert_path` | `GKG_TLS__CERT_PATH` | None | TLS certificate path (PEM) |
| `tls.key_path` | `GKG_TLS__KEY_PATH` | None | TLS private key path (PEM) |


| Config path | Default | Description |
|---|---|---|
| `grpc.keepalive_interval_secs` | `20` | HTTP/2 keepalive ping interval |
| `grpc.keepalive_timeout_secs` | `20` | Keepalive ping timeout |
| `grpc.tcp_keepalive_secs` | `60` | TCP keepalive interval |
| `grpc.connection_window_size` | `2097152` (2 MiB) | HTTP/2 connection flow control window |
| `grpc.stream_window_size` | `1048576` (1 MiB) | HTTP/2 stream flow control window |
| `grpc.concurrency_limit` | `256` | Max concurrent requests |
| `grpc.max_connection_age_secs` | `300` (5 min) | Max connection age (for L4 ILB rebalancing) |
| `grpc.max_connection_age_grace_secs` | `30` | Graceful drain window after max_connection_age_secs fires. Must be non-zero to avoid a tonic 0.14.5 panic (hyperium/tonic#2522). |
| `grpc.stream_timeout_secs` | `60` | Stream timeout |
| `grpc.max_header_list_size_bytes` | `65536` (64 KiB) | HTTP/2 SETTINGS_MAX_HEADER_LIST_SIZE advertised to clients. tonic/hyper default of 16 KiB is too small for GitLab JWTs carrying many traversal IDs. |

Supports default settings and per-query-type overrides (e.g. aggregation, traversal, search):

```yaml
query:
  default:
    max_execution_time: 30
    max_memory_usage: 1073741824
    use_query_cache: false
    query_cache_ttl: 60
  aggregation:
    max_execution_time: 60
```

| Config path | Default | Description |
|---|---|---|
| `query.default.max_execution_time` | `30` | ClickHouse max_execution_time in seconds |
| `query.default.max_memory_usage` | unset | ClickHouse max_memory_usage in bytes |
| `query.default.max_bytes_to_read` | unset | ClickHouse max_bytes_to_read in bytes |
| `query.default.max_rows_to_read` | unset | ClickHouse max_rows_to_read |
| `query.default.max_rows_in_set` | unset | ClickHouse max_rows_in_set (IN subquery cap) |
| `query.default.use_query_cache` | `false` | Enable ClickHouse query cache |
| `query.default.query_cache_ttl` | `60` | Query cache TTL in seconds |

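One way to picture per-query-type overrides is a shallow merge of the `default` section with the section for the query type. The sketch assumes that an override section inherits every field it does not set, which this reference implies but does not state; `settings_for` is a hypothetical helper.

```python
def settings_for(query_cfg, query_type):
    """Return the default settings with the per-type overrides applied on top."""
    merged = dict(query_cfg.get("default", {}))
    merged.update(query_cfg.get(query_type, {}))
    return merged
```

Applied to the example configuration above, `aggregation` queries get `max_execution_time: 60` and keep the remaining `default` values, while a type without its own section falls back to `default` entirely.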

| Config path | Default | Description |
|---|---|---|
| `schema.max_retained_versions` | `2` | Number of schema version table-sets to retain (min 2) |

Controls Snowplow product-analytics event emission. Events carry orbit_common and orbit_query contexts (consumer-owned, defined in gkg-analytics). Disabled by default; Helm enables it for .com and Dedicated.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `analytics.enabled` | `GKG_ANALYTICS__ENABLED` | `false` | Enable Snowplow analytics event emission |
| `analytics.collector_url` | `GKG_ANALYTICS__COLLECTOR_URL` | `""` | Snowplow collector endpoint (e.g. https://snowplowprd.trx.gitlab.net) |
| `analytics.deployment.type` | `GKG_ANALYTICS__DEPLOYMENT__TYPE` | `self_managed` | com, dedicated, or self_managed |
| `analytics.deployment.environment` | `GKG_ANALYTICS__DEPLOYMENT__ENVIRONMENT` | `development` | development, staging, or production |

Example for the .com staging cluster:

```yaml
analytics:
  enabled: true
  collector_url: "https://snowplowprd.trx.gitlab.net"
  deployment:
    type: com
    environment: staging
```

Controls Snowplow billing-event emission and the CDot quota gate that enforces GitLab credit limits on metered Orbit queries.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `billing.enabled` | `GKG_BILLING__ENABLED` | `false` | Enable Snowplow billing-event emission |
| `billing.collector_url` | `GKG_BILLING__COLLECTOR_URL` | `""` | Snowplow collector endpoint |

When enabled, every metered Orbit query (mcp, rest source types) is checked against CDot before execution. Requests from namespaces with exhausted credits are rejected with codes.ResourceExhausted.

| Config path | Env var | Default | Description |
|---|---|---|---|
| `billing.quota.enabled` | `GKG_BILLING__QUOTA__ENABLED` | `false` | Enable the CDot quota gate |
| `billing.quota.customers_dot_url` | `GKG_BILLING__QUOTA__CUSTOMERS_DOT_URL` | `""` | CDot base URL (e.g. https://customers.gitlab.com) |
| `billing.quota.request_timeout_ms` | `GKG_BILLING__QUOTA__REQUEST_TIMEOUT_MS` | `1000` | CDot request timeout in milliseconds |
| `billing.quota.api_user` | `GKG_BILLING__QUOTA__API_USER` | None | CDot admin email. Mounted from /etc/secrets/billing/quota/api_user. |
| `billing.quota.api_token` | `GKG_BILLING__QUOTA__API_TOKEN` | None | CDot admin token. Mounted from /etc/secrets/billing/quota/api_token. |

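The control flow of the quota gate can be sketched as follows. The CDot request itself is replaced by a caller-supplied `has_credits` function, and behavior on CDot timeouts or errors is deliberately left out because this reference does not describe it; `enforce_quota` and the local `ResourceExhausted` exception are hypothetical stand-ins.

```python
METERED_SOURCE_TYPES = {"mcp", "rest"}


class ResourceExhausted(Exception):
    """Stand-in for the ResourceExhausted status returned to the caller."""


def enforce_quota(source_type, namespace_id, quota_enabled, has_credits):
    """Raise ResourceExhausted when a metered query comes from an exhausted namespace."""
    if not quota_enabled or source_type not in METERED_SOURCE_TYPES:
        return  # gate disabled or query not metered: nothing to check
    if not has_credits(namespace_id):  # stand-in for the CDot lookup
        raise ResourceExhausted(f"namespace {namespace_id} has no credits left")
```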

| Config path | Default | Description |
|---|---|---|
| `health_check.bind_address` | `0.0.0.0:4201` | HealthCheck mode bind address |
| `indexer_health_bind_address` | `0.0.0.0:4202` | Health check server address for Indexer mode |
| `dispatcher_health_bind_address` | `0.0.0.0:4203` | Health check server address for DispatchIndexing mode |

Increase global and group concurrency:

```yaml
engine:
  max_concurrent_workers: 32
  concurrency_groups:
    sdlc: 24
    code: 8
```

Increase SDLC batch sizes for large namespaces:

```yaml
engine:
  handlers:
    entity-handler:
      datalake_batch_size: 5000000
```

Increase ack wait for slow handlers:

```shell
GKG_NATS__ACK_WAIT_SECS=600  # 10 minutes instead of default 5
```

Increase the code dispatch batch size:

```yaml
schedule:
  tasks:
    code-indexing-task:
      batch_size: 500
```

NATS settings for a production deployment:

```shell
GKG_NATS__CONSUMER_NAME=gkg-indexer   # Durable consumer (survives restarts)
GKG_NATS__STREAM_REPLICAS=3           # Fault tolerance
GKG_NATS__AUTO_CREATE_STREAMS=true    # Auto-create on startup
```

In production, GKG is deployed via the gkg-helm-charts. Most configuration is set through Helm values rather than raw YAML or environment variables.

| Helm value | Application config | Description |
|---|---|---|
| `nats.url` | `nats.url` | NATS broker address |
| `nats.consumerName` | `nats.consumer_name` | Durable consumer name |
| `clickhouse.datalake.host` | `datalake.url` | Datalake ClickHouse host |
| `clickhouse.datalake.database` | `datalake.database` | Datalake database name |
| `clickhouse.graph.host` | `graph.url` | Graph ClickHouse host |
| `clickhouse.graph.database` | `graph.database` | Graph database name |
| `gitlab.baseUrl` | `gitlab.base_url` | GitLab instance URL |
| `indexer.logLevel` | `metrics.log_level` | Indexer log level |
| `secrets.existingSecret` | (secret mounts) | Kubernetes secret with credentials |

```shell
helm upgrade gkg gkg-helm-charts/gkg \
  --set nats.consumerName=gkg-indexer \
  --set clickhouse.graph.database=gkg-production
```

For complex overrides, use a values file:

```shell
helm upgrade gkg gkg-helm-charts/gkg -f custom-values.yaml
```

```shell
# List pods and follow indexer and dispatcher logs
kubectl -n gkg get pods
kubectl -n gkg logs deployment/gkg-indexer -f
kubectl -n gkg logs deployment/gkg-dispatcher -f
# Filter logs for a specific project
kubectl -n gkg logs deployment/gkg-indexer -f | grep 'project_id=<id>'
```

```shell
# Show the GKG_ environment variables set on the indexer
kubectl -n gkg exec deployment/gkg-indexer -- env | grep GKG_
```

Spin up a nats-box pod to run NATS commands inside the cluster:

```shell
kubectl -n gkg run nats-box --image=natsio/nats-box:latest --restart=Never -- sleep infinity
kubectl -n gkg exec -it nats-box -- sh
```

From inside nats-box:

```shell
# Check stream health
nats -s nats://gkg-nats:4222 stream ls
nats -s nats://gkg-nats:4222 stream info GKG_INDEXER
# Inspect consumers
nats -s nats://gkg-nats:4222 consumer ls GKG_INDEXER
# Check dead letter queue
nats -s nats://gkg-nats:4222 stream info GKG_DEAD_LETTERS
# Purge a stuck subject
nats -s nats://gkg-nats:4222 stream purge GKG_INDEXER \
  --subject='sdlc.namespace.indexing.requested.<org>.<ns>'
# Inspect KV locks
nats -s nats://gkg-nats:4222 kv ls indexing_locks
```

Clean up when done:

```shell
kubectl -n gkg delete pod nats-box
```

```yaml
nats:
  url: nats://gkg-nats:4222
  consumer_name: gkg-indexer
  ack_wait_secs: 300
  auto_create_streams: true
  stream_replicas: 3
datalake:
  url: http://clickhouse:8123
  database: gitlab_clickhouse_main_production
  username: default
graph:
  url: http://clickhouse:8123
  database: gkg-sandbox
  username: default
gitlab:
  base_url: https://gitlab.example.com
engine:
  max_concurrent_workers: 16
  concurrency_groups:
    sdlc: 12
    code: 4
  topics:
    global-handler:
      concurrency_group: sdlc
      max_attempts: 1
      retry_interval_secs: 60
    namespace-handler:
      concurrency_group: sdlc
      max_attempts: 1
      retry_interval_secs: 60
    code-indexing-task:
      concurrency_group: code
      max_attempts: 5
      retry_interval_secs: 60
      dead_letter_on_exhaustion: true
    namespace-deletion:
      concurrency_group: code
      max_attempts: 1
  handlers:
    entity-handler:
      datalake_batch_size: 1000000
    code-indexing-task:
      pipeline:
        max_file_size_bytes: 5000000
        max_files: 1000000
        worker_threads: 0
        max_concurrent_languages: 0
schedule:
  tasks:
    table-cleanup:
      cron: "0 0 3 * * *"
    namespace-deletion:
      cron: "0 0 3 * * *"
    migration-completion:
      cron: "0 */1 * * * *"
metrics:
  log_level: info,gkg_server=debug
  prometheus:
    enabled: true
    port: 9394
```