
Commit 994b951

Merge pull request #5 from dnks0/feature/dbx-proxy
added readme, config rendering, unit-tests
2 parents 7331f06 + 2960a85 commit 994b951

19 files changed

Lines changed: 761 additions & 20 deletions

README.md

Lines changed: 72 additions & 0 deletions
## dbx-proxy

`dbx-proxy` is a lightweight **HAProxy-based load balancer** that enables **private network connectivity** from **Databricks Serverless** compute to **resources in your own VPC/VNet** (for example: databases, applications, etc.).

### What problem it solves

Many enterprise resources live in private networks and are not reachable from serverless compute by default. `dbx-proxy` provides a controlled entry point for [private connectivity to resources in your VPC/VNet](https://docs.databricks.com/aws/en/security/network/serverless-network-security/pl-to-internal-network).

### What you get

- **Forwarding of L4 & L7 network traffic** based on your configuration
  - L4 (TCP): forwarding of plain TCP traffic, e.g. for databases
  - L7 (HTTP): forwarding of HTTP(S) traffic with **SNI-based routing**, e.g. for applications/APIs
- **Terraform module** ready to use (currently **AWS only**)
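The two forwarding modes can be sketched side by side in a single `dbx_proxy_listener` value (a hedged sketch: the names, domains, and IPs below are purely illustrative assumptions, not module defaults):

```hcl
dbx_proxy_listener = [
  # L4 (TCP): forward plain TCP traffic, e.g. a database on port 5432
  {
    name = "postgres-5432"
    mode = "tcp"
    port = 5432
    routes = [{
      name         = "postgres"
      domains      = ["postgres.internal"]
      destinations = [{ name = "pg-1", host = "10.0.1.10", port = 5432 }]
    }]
  },
  # L7 (HTTP): SNI-based routing of HTTPS traffic on port 443
  {
    name = "https-443"
    mode = "http"
    port = 443
    routes = [{
      name         = "app"
      domains      = ["app.internal"]
      destinations = [{ name = "app-1", host = "10.0.2.20", port = 443 }]
    }]
  }
]
```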
### Deployment (Terraform) / How to use

`dbx-proxy` essentially provides Steps 1 and 2 when following the official Databricks documentation for private connectivity to resources in your own networks:

- [(AWS) Configure private connectivity to resources in your VPC](https://docs.databricks.com/aws/en/security/network/serverless-network-security/pl-to-internal-network)

Include the module in your Terraform stack:
```hcl
module "dbx_proxy" {

  source = "github.com/dnks0/dbx-proxy//terraform/aws?ref=v0.1.0"

  # aws config
  region = "eu-central-1"
  tags   = {}
  ...

  # dbx-proxy config
  dbx_proxy_image_version = "0.1.0"
  dbx_proxy_health_port   = 8080

  # Example: forward HTTPS traffic on port 443 to a private target in your VPC
  dbx_proxy_listener = [
    {
      name = "http-443"
      mode = "http"
      port = 443
      routes = [
        {
          name    = "example"
          domains = ["example.internal"]
          destinations = [
            { name = "example-server-1", host = "10.0.1.10", port = 443 },
          ]
        }
      ]
    }
  ]
}
```

More details about the Terraform module and configurations can be found [here](terraform/README.md).

You will still need to configure the Databricks-side objects (NCC and private endpoint rules) and accept the connection on your endpoint service.

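For orientation, the Databricks-side wiring might look roughly like the following sketch using the Databricks Terraform provider (hedged: resource and argument names should be verified against your provider version, and the NCC name, region, and domain are illustrative assumptions):

```hcl
# Hypothetical sketch of the Databricks-side objects: an NCC plus a
# private endpoint rule pointing at the endpoint service created by dbx-proxy.
resource "databricks_mws_network_connectivity_config" "this" {
  provider = databricks.account
  name     = "ncc-dbx-proxy"
  region   = "eu-central-1"
}

resource "databricks_mws_ncc_private_endpoint_rule" "dbx_proxy" {
  provider                       = databricks.account
  network_connectivity_config_id = databricks_mws_network_connectivity_config.this.network_connectivity_config_id
  endpoint_service               = module.dbx_proxy.vpc_endpoint_service_name
  domain_names                   = ["example.internal"]
}
```

After this, the connection still needs to be accepted on the AWS endpoint service (unless auto-accept is enabled).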
### Troubleshooting

To validate that the proxy is up and reachable, run the following from a serverless notebook:

```bash
%sh

curl -sS -w '\nHTTP %{http_code}\n' http://<ncc-endpoint-rule-domain>:8080/status
```

### Limitations / Trade-Offs

Before going to production, please review the following [limitations & trade-offs](terraform/README.md#limitations--tradeoffs-of-the-current-implementation).

docker/proxy.Dockerfile

Lines changed: 1 addition & 1 deletion
```diff
@@ -12,7 +12,7 @@ RUN apt-get update --fix-missing \
     && rm -rf /var/lib/apt/lists/*

 COPY ./docker/proxy.entrypoint.sh /dbx-proxy/entrypoint.sh
-COPY ./docker/default.cfg /dbx-proxy/etc/default.cfg
+COPY ./docker/default.cfg /dbx-proxy/conf/default.cfg

 RUN mkdir -p /dbx-proxy/conf /dbx-proxy/run /dbx-proxy/log /dbx-proxy/etc \
     && chown -R haproxy:dbx-proxy /dbx-proxy
```

docker/proxy.entrypoint.sh

Lines changed: 5 additions & 1 deletion
```diff
@@ -1,7 +1,11 @@
 set -eu

 PID="/dbx-proxy/run/dbx-proxy.pid"
-CONFIG="/dbx-proxy/etc/default.cfg"
+CONFIG="/dbx-proxy/conf/default.cfg"
+
+if [ -f "/dbx-proxy/conf/dbx-proxy.cfg" ]; then
+    CONFIG="/dbx-proxy/conf/dbx-proxy.cfg"
+fi

 log() {
     # 2025-12-20 13:30:01,123 | LEVEL | message
```

terraform/README.md

Lines changed: 231 additions & 0 deletions
## Terraform module: `dbx-proxy` (multi-cloud)

This repository provides a **Terraform module** for deploying `dbx-proxy` across multiple clouds.

- **AWS**: implemented today (`terraform/aws`)
- **Azure**: planned (not implemented yet)

`dbx-proxy` is commonly used as the customer-side component for Databricks Serverless **private connectivity to resources in your VPC/VNet**. In the Databricks AWS guide, this corresponds to provisioning the **internal NLB frontend** (and the endpoint service). See [Configure private connectivity to resources in your VPC](https://docs.databricks.com/aws/en/security/network/serverless-network-security/pl-to-internal-network).

---

### AWS architecture (what gets deployed)

**Always deployed (AWS implementation):**
- **Compute**: EC2 Launch Template + Auto Scaling Group (ASG) running `dbx-proxy`
- **Load balancing**: internal **Network Load Balancer (NLB)**
- **Private connectivity**: **VPC Endpoint Service (PrivateLink)** backed by the NLB
- **Networking & security**: Security Group, IAM Role + Instance Profile

**Conditionally deployed:**
- VPC, private subnets
- Optional IGW + public subnet + NAT gateway (internet connectivity is needed to pull images, etc.)
- Route tables + associations

---

### Architecture diagram (AWS)

(to be added)

---
### Quick start (AWS)

In your existing Terraform stack, add:

```hcl
module "dbx_proxy" {
  source = "github.com/dnks0/dbx-proxy//terraform/aws?ref=v0.1.0"

  # AWS config
  region = "eu-central-1"
  tags   = {}
  ...

  # dbx-proxy config
  dbx_proxy_image_version = "0.1.0"
  dbx_proxy_health_port   = 8080
  dbx_proxy_listener      = []
}
```

Then run:

```bash
terraform init
terraform apply
```

After apply, use the output `vpc_endpoint_service_name` when creating Databricks private endpoint rules (see the Databricks guide linked above).
Also, make sure to add a domain of your choice as a private endpoint rule on your NCC that you can use for [troubleshooting](../README.md#troubleshooting) purposes.
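When the module is nested inside a larger stack, the service name can be surfaced as a root-level output so it is easy to hand over to whoever configures the Databricks side (a minimal sketch; the output name is an illustrative choice):

```hcl
# Expose the endpoint service name needed for Databricks private endpoint rules.
output "dbx_proxy_endpoint_service_name" {
  value = module.dbx_proxy.vpc_endpoint_service_name
}
```

Retrieve it after apply with `terraform output -raw dbx_proxy_endpoint_service_name`.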
---

### Configuration variables

This module separates **common variables** (used by the config renderer) from **cloud-specific variables** (provisioning details for a specific cloud).

#### Common variables (all clouds)

These variables define what the proxy should do (listeners, health port, image tag).

| Variable | Type | Default | Description |
|---|---:|---:|---|
| `dbx_proxy_image_version` | `string` | `"0.1.0"` | Docker image tag/version of `dbx-proxy` to deploy. |
| `dbx_proxy_health_port` | `number` | `8080` | Health port exposed by `dbx-proxy` (HTTP `GET /status`). Also used for NLB target group health checks. |
| `dbx_proxy_listener` | `list(object)` | `[]` | Listener configuration (ports/modes/routes/destinations). See **Listener configuration** below. |

#### AWS-specific variables (`terraform/aws`)

| Variable | Type | Default | Description |
|---|---:|---:|---|
| `region` | `string` | (required) | AWS region to deploy to. |
| `prefix` | `string` | `null` | Optional naming prefix. A randomized suffix is always appended to avoid collisions. |
| `tags` | `map(string)` | `{}` | Extra tags applied to AWS resources (also used as provider default tags). |
| `instance_type` | `string` | `"t3.medium"` | EC2 instance type for proxy nodes. |
| `vpc_id` | `string` | `null` | Existing VPC ID. If `null`, the module bootstraps a VPC. |
| `subnet_ids` | `list(string)` | `[]` | Existing private subnet IDs for the NLB + ASG. If empty, subnets are created. |
| `vpc_cidr` | `string` | `"10.0.0.0/16"` | VPC CIDR (only used when creating a VPC). |
| `subnet_cidrs` | `list(string)` | `["10.0.1.0/24","10.0.2.0/24"]` | Private subnet CIDRs (only used when creating subnets). |
| `enable_nat_gateway` | `bool` | `true` | Whether to create NAT (and related IGW/public subnet) for outbound internet access (only when creating networking). |
| `public_subnet_cidr` | `string` | `"10.0.0.0/24"` | Public subnet CIDR for the NAT gateway (only used when creating networking). |

---
### Outputs (AWS)

- `nlb_arn`: ARN of the internal NLB
- `vpc_endpoint_service_name`: **input** for Databricks private endpoint rules
- `vpc_endpoint_service_arn`: ARN of the endpoint service
- `nlb_dns_name`: internal NLB DNS name
- `nlb_zone_id`: Route53 hosted zone ID for NLB aliases
- `autoscaling_group_name`: ASG name
- `security_group_id`: Security group ID attached to the proxy instances
- `target_group_arns`: listener target groups keyed by listener name

---
### Listener configuration (deep dive)

`dbx_proxy_listener` is a list of listener objects:

```hcl
dbx_proxy_listener = [
  {
    name = string
    mode = string # "tcp" or "http"
    port = number
    routes = [
      {
        name    = string
        domains = list(string)
        destinations = [
          { name = string, host = string, port = number }
        ]
      }
    ]
  }
]
```

#### Listener fields

- **`name`**: stable identifier (used for naming resources like target groups)
- **`mode`**:
  - `"tcp"`: L4 forwarding
  - `"http"`: L7 (HTTP) behavior in the proxy configuration; the AWS NLB still uses TCP listeners
- **`port`**: frontend port exposed by the NLB and `dbx-proxy`
- **`routes`**: list of routing rules (domains + destinations)

#### Route fields

- **`name`**: route identifier
- **`domains`**: list of domains that should match this route (used for SNI/host-based routing depending on mode)
  - Databricks private endpoint rules have limits (for example a maximum domain count per rule). Follow Databricks' documented constraints.
- **`destinations`**: list of upstream targets (`host` + `port`)
#### Common patterns

**1) TCP database traffic (e.g. Postgres, L4):**

```hcl
dbx_proxy_listener = [
  {
    name = "postgres-5432"
    mode = "tcp"
    port = 5432
    routes = [
      {
        name    = "postgres"
        domains = ["postgres.database.domain"]
        destinations = [
          { name = "postgres-1", host = "10.0.1.10", port = 5432 },
        ]
      }
    ]
  }
]
```

**2) HTTPS with SNI-based forwarding (multiple backends):**

```hcl
dbx_proxy_listener = [
  {
    name = "https-443"
    mode = "http"
    port = 443
    routes = [
      {
        name    = "app-a"
        domains = ["app-a.application.domain"]
        destinations = [
          { name = "app-a-1", host = "10.0.2.20", port = 443 },
        ]
      },
      {
        name    = "app-b"
        domains = ["app-b.application.domain"]
        destinations = [
          { name = "app-b-1", host = "app-b-server-1.app-b.application.domain", port = 443 },
        ]
      }
    ]
  }
]
```
#### Health checks

- The `dbx-proxy` health endpoint is `GET /status` on `dbx_proxy_health_port` (default `8080`).
- AWS NLB target groups use the health port for health checks.
- The AWS implementation also creates an **optional NLB listener** on `dbx_proxy_health_port` so the health endpoint can be reached through the NLB/PrivateLink (unless the health port is already used as a normal listener port).

---

### Limitations & tradeoffs of the current implementation

This module is intentionally minimal right now. The following limitations are important for production planning:

- **Single instance / no horizontal scaling by default**
  - The AWS ASG is configured as `min=desired=max=1`, so you get **one EC2 instance** running `dbx-proxy`.
  - **Mitigation**: increase `max_size` / `desired_capacity` (requires module changes today) and consider multi-AZ designs.

- **Planned downtime during updates**
  - The ASG uses `instance_refresh` with `min_healthy_percentage = 0` to ensure launch template updates roll out even with a single instance.
  - This implies **downtime during replacement** on apply (terminate -> relaunch).
  - **Mitigation**: run at least 2 instances and set `min_healthy_percentage` accordingly (requires module changes today).
  - On changes to `dbx_proxy_listener`, `terraform apply` updates the EC2 launch template `user_data`, and the ASG replaces the instance (short downtime) so the new config is applied via cloud-init.

- **Outbound internet dependency (when bootstrapping)**
  - If you let the module create networking and keep `enable_nat_gateway = true`, instances use NAT for outbound access.
  - Cloud-init installs Docker and downloads the Docker Compose plugin from GitHub, so **egress to the internet is required** (or you must customize the bootstrap/AMI).

- **AWS-only**
  - The module is structured for multi-cloud, but **only AWS is implemented** right now.

- **Databricks serverless private connectivity constraints apply**
  - Databricks enforces limits around NCCs, private endpoints, and private endpoint rules (including limits on the number of domain names per rule).
  - Treat these as **external constraints** that influence how you model `dbx_proxy_listener`.
  - Reference: [Configure private connectivity to resources in your VPC](https://docs.databricks.com/aws/en/security/network/serverless-network-security/pl-to-internal-network).

terraform/aws/local.tf

Lines changed: 2 additions & 2 deletions
```diff
@@ -20,10 +20,10 @@ locals {
   cloud_config = {
     write_files = [
       {
-        path = "/dbx-proxy/conf/listener.yaml"
+        path = "/dbx-proxy/conf/dbx-proxy.cfg"
         owner = "root:root"
         permissions = "0644"
-        content = module.common.dbx_proxy_listener
+        content = module.common.dbx_proxy_cfg
       },
       {
         path = "/dbx-proxy/docker-compose.yaml"
```

terraform/common/main.tf

Lines changed: 3 additions & 2 deletions
```diff
@@ -1,6 +1,7 @@
 locals {
-  dbx_proxy_listener_yaml = yamlencode({
-    listeners = var.dbx_proxy_listener
+  dbx_proxy_cfg = templatefile("${path.module}/templates/dbx-proxy.cfg.tpl", {
+    dbx_proxy_health_port = var.dbx_proxy_health_port
+    dbx_proxy_listener = var.dbx_proxy_listener
   })

   docker_compose_yaml = templatefile("${path.module}/templates/docker-compose.yaml.tpl", {
```

terraform/common/outputs.tf

Lines changed: 3 additions & 3 deletions
```diff
@@ -1,6 +1,6 @@
-output "dbx_proxy_listener" {
-  description = "Rendered listener.yaml content."
-  value = local.dbx_proxy_listener_yaml
+output "dbx_proxy_cfg" {
+  description = "Rendered dbx-proxy config (dbx-proxy.cfg) derived from dbx_proxy_listener."
+  value = local.dbx_proxy_cfg
 }

 output "docker_compose" {
```
