Commit fee0f15

Merge pull request #15 from dnks0/feature/azure-terraform-support
* added terraform module for Azure
* updated README's
* aligned AWS TF module behavior to match Azure
2 parents b706d72 + a0e1a9c commit fee0f15

41 files changed

Lines changed: 1352 additions & 228 deletions

README.md

Lines changed: 9 additions & 8 deletions
@@ -19,22 +19,23 @@ Connectivity to your custom resources can be configured via a dedicated Private

 - **Forwarding of L4 & L7 network traffic** based on your configuration
   - L4 (TCP): forwarding of plain TCP traffic, e.g. for databases
-  - L7 (HTTP) forwarding of HTTP(s) traffic with **SNI-based routing**, e.g. for applications/APIS
-- **Terraform module** ready to use (currently **AWS only**)
+  - L7 (HTTP) forwarding of HTTP(s) traffic with **SNI-based routing**, e.g. for applications/APIs
+- **Terraform module** ready to use for **AWS and Azure**
 - No TLS termination, only passthrough!


 ### High availability

-`dbx-proxy` is placed behind an AWS Network Load Balancer, which spreads connections across the instances in the Auto Scaling Group. Availability depends on how many instances you run and whether your subnets span multiple AZs. See the Terraform module details for configuration and behavior: [High availability (AWS)](terraform/README.md#high-availability-aws).
+High availability depends on how many instances you run and whether your deployment spans multiple availability zones in your cloud. See the Terraform module details for configuration and behavior: [Terraform module documentation](terraform/README.md).

 ### Deployment (Terraform) / How to use

-`dbx-proxy` essentially provides Steps 1 and 2 when following the official Databricks documentation for private connectivity to resources in your own networks:
+`dbx-proxy` provides the customer-side components (Steps 1 and 2) when following the official Databricks documentation for private connectivity to resources in your own networks:
 - [(AWS) Configure private connectivity to resources in your VPC](https://docs.databricks.com/aws/en/security/network/serverless-network-security/pl-to-internal-network)
+- [(Azure) Configure private connectivity to resources in your VNet](https://learn.microsoft.com/en-us/azure/databricks/security/network/serverless-network-security/pl-to-internal-network)


-Include the module in your Terraform stack:
+Include the module in your Terraform stack (example for AWS):
 ```hcl
 module "dbx_proxy" {

@@ -69,15 +70,15 @@ module "dbx_proxy" {
 }
 ```

-More details about the Terraform module and configurations can be found [here](terraform/README.md).
+More details about the Terraform module (including Azure) can be found [here](terraform/README.md).

 You will still need to configure the Databricks-side objects like NCC, private endpoint rules and accept the connection on your endpoint-service.

-By default the module runs in `deployment_mode = "bootstrap"` and can create networking, NLB + endpoint service. If you already have a VPC/subnets, keep `deployment_mode = "bootstrap"` and provide `vpc_id` and `subnet_ids`. If you already have an NLB as well, set `deployment_mode = "proxy-only"` and provide `vpc_id`, `subnet_ids`, and `nlb_arn` (see Terraform docs for details).
+By default the module runs in `deployment_mode = "bootstrap"` and can create networking, an internal load balancer (NLB/SLB), and a private endpoint service (Private Link). If you already have networking, keep `deployment_mode = "bootstrap"` and provide the network IDs. If you already have a load balancer as well, set `deployment_mode = "proxy-only"` and provide the load balancer ID/ARN (see Terraform docs for details).

 ### Troubleshooting

-To validate that the proxy is up and reachable, run the following from a serverless notebook:
+To validate that the proxy is up and reachable, run the following from, e.g., a serverless notebook:

 ```bash
 %sh

terraform/README.md

Lines changed: 90 additions & 170 deletions
Large diffs are not rendered by default.

terraform/aws/README.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+## AWS Terraform module: `dbx-proxy`
+
+This module deploys `dbx-proxy` on AWS, using an internal Network Load Balancer (NLB) and a VPC Endpoint Service (PrivateLink) for Databricks Serverless private connectivity.
+
+For common concepts (listener config, deployment modes, overall limitations), see the global module documentation in `terraform/README.md`.
+
+#### Architecture
+
+![AWS dbx-proxy architecture](../../resources/img/aws-architecture.png)
+
+This module provisions a private Network-Load-Balancer with target groups, an endpoint service for Private Link communication from Databricks serverless, and an autoscaling-group of `dbx-proxy` instances inside your VPC.
+In bootstrap-mode, the default subnets are created across availability-zones. The autoscaling-group automatically tries to balance instances across subnets and therefore availability-zones to achieve robustness.
+In proxy-only mode, it is your responsibility to configure subnets accordingly.
+Optional bootstrap networking creates the VPC, subnets, and NAT/IGW when not provided.
+
+---
+
+### Quick start
+
+In your existing Terraform stack, add:
+
+```hcl
+module "dbx_proxy" {
+  source = "github.com/dnks0/dbx-proxy//terraform/aws?ref=v<release>"
+
+  # AWS config
+  region = "eu-central-1"
+  tags = {}
+
+  # dbx-proxy config
+  dbx_proxy_image_version = "<release>"
+  dbx_proxy_health_port = 8080
+  dbx_proxy_listener = []
+}
+```
+
+Make sure to replace `<release>` with the actual release version!
+
+Then run:
+
+```bash
+terraform init
+terraform apply
+```
+
+After apply, use the output `load_balancer.vpc_endpoint_service_name` when creating Databricks private endpoint rules in your NCC. Also, add a domain of your choice as a private endpoint rule on your NCC that you can use for troubleshooting.
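
As a sketch of how that output might be surfaced from the calling stack, the snippet below re-exports it as a root-level output. It assumes the module is instantiated under the label `dbx_proxy`, as in the quick start; the output name `dbx_proxy_endpoint_service_name` is a hypothetical choice and not something the module itself defines.

```hcl
# Hypothetical root-level output; the name is illustrative only.
# Assumes the module block is labelled "dbx_proxy" as in the quick start.
output "dbx_proxy_endpoint_service_name" {
  description = "VPC endpoint service name to reference in the Databricks NCC private endpoint rule."
  value       = module.dbx_proxy.load_balancer.vpc_endpoint_service_name
}
```

With such an output in place, `terraform output dbx_proxy_endpoint_service_name` prints the value after `terraform apply`.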
+
+---
+
+### AWS-specific variables
+
+| Variable | Type | Default | Description |
+|---|---:|---:|---|
+| `region` | `string` | (required) | AWS region to deploy to. |
+| `vpc_id` | `string` | `null` | Existing VPC ID. Required for `proxy-only` mode. If `null`, a VPC can be bootstrapped in `bootstrap` mode. |
+| `subnet_ids` | `list(string)` | `[]` | Existing private subnet IDs for the NLB + ASG. Required for `proxy-only` mode. If empty, subnets can be created in `bootstrap` mode. |
+| `vpc_cidr` | `string` | `"10.0.0.0/16"` | VPC CIDR (only used when creating a VPC in `bootstrap`). |
+| `subnet_cidrs` | `list(string)` | `["10.0.1.0/24", "10.0.2.0/24"]` | Private subnet CIDRs (only used when creating subnets in `bootstrap` mode). |
+| `nat_subnet_cidr` | `string` | `"10.0.0.0/24"` | Public subnet CIDR for the NAT gateway (only used when creating networking in `bootstrap` mode). |
+| `nlb_arn` | `string` | `null` | Existing NLB ARN to attach listeners/target groups to in `proxy-only` mode. |
+
+Common variables are documented in `terraform/README.md`.
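
For illustration, a `proxy-only` deployment that reuses an existing VPC, subnets, and NLB could look roughly like the following. The variable names come from the table above, and `deployment_mode` is one of the common variables. The VPC ID, subnet IDs, and NLB ARN are placeholders (assumptions), and `<release>` still has to be replaced with an actual release version.

```hcl
module "dbx_proxy" {
  source = "github.com/dnks0/dbx-proxy//terraform/aws?ref=v<release>"

  region          = "eu-central-1"
  tags            = {}
  deployment_mode = "proxy-only"

  # Placeholder identifiers: substitute the IDs/ARN of your existing resources.
  vpc_id     = "vpc-0123456789abcdef0"
  subnet_ids = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]
  nlb_arn    = "arn:aws:elasticloadbalancing:eu-central-1:123456789012:loadbalancer/net/example-nlb/0123456789abcdef"

  dbx_proxy_image_version = "<release>"
  dbx_proxy_health_port   = 8080
  dbx_proxy_listener      = []
}
```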
+
+---
+
+### Outputs
+
+- `networking`: object with
+  - `vpc_id`
+  - `vpc_cidr`
+  - `subnet_ids`
+  - `subnet_cidrs`
+  - `nat_gateway_id`
+  - `nat_subnet_id`
+  - `nat_subnet_cidr`
+  - `internet_gateway_id`
+
+- `load_balancer`: object with
+  - `nlb_arn`
+  - `nlb_dns_name`
+  - `nlb_target_group_arns`
+  - `nlb_security_group_ids`
+  - `vpc_endpoint_service_arn`
+  - `vpc_endpoint_service_name`
+
+- `proxy`: object with
+  - `iam_role_name`
+  - `iam_role_arn`
+  - `instance_profile_name`
+  - `instance_profile_arn`
+  - `security_group_id`
+  - `autoscaling_group_name`
+  - `launch_template_name`
+  - `dbx_proxy_cfg`
+
+---
+### Notes for AWS users
+
+- Multi availability-zone resilience can be achieved by providing subnets across multiple availability-zones. By default, the autoscaling-group tries to spread dbx-proxy instances evenly across subnets. In `proxy-only` mode, you are responsible for configuring subnets accordingly. In `bootstrap` mode, default subnets are created across multiple availability-zones in the selected region.
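
As a sketch of providing subnets across multiple availability zones, the example below keeps `bootstrap` mode but passes an existing VPC and two existing subnets. In that case the module still creates the NLB and endpoint service and only skips networking, since `local.bootstrap_networking` is true only when `vpc_id` is `null` and `subnet_ids` is empty. The IDs are placeholders, and the `min_capacity`/`max_capacity` values are assumptions chosen for illustration: both variables are referenced in `main.tf`, but their defaults are not shown in this commit view.

```hcl
module "dbx_proxy" {
  source = "github.com/dnks0/dbx-proxy//terraform/aws?ref=v<release>"

  region          = "eu-central-1"
  tags            = {}
  deployment_mode = "bootstrap"

  # Placeholder IDs: one existing private subnet per availability zone.
  vpc_id     = "vpc-0123456789abcdef0"
  subnet_ids = ["subnet-0aaaaaaaaaaaaaaaa", "subnet-0bbbbbbbbbbbbbbbb"]

  # Assumed values: enough instances for at least one per subnet/AZ.
  min_capacity = 2
  max_capacity = 4

  dbx_proxy_image_version = "<release>"
  dbx_proxy_health_port   = 8080
  dbx_proxy_listener      = []
}
```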

terraform/aws/local.tf

Lines changed: 3 additions & 3 deletions
@@ -12,9 +12,9 @@ locals {
   bootstrap_networking = var.deployment_mode == "bootstrap" && (var.vpc_id == null && length(var.subnet_ids) == 0)
   bootstrap_load_balancer = var.deployment_mode == "bootstrap"

-  vpc_id = module.networking.vpc_id
-  subnet_ids = module.networking.subnet_ids
-  subnet_cidrs = module.networking.subnet_cidrs
+  vpc_id = module.networking.vpc_id
+  subnet_ids = module.networking.subnet_ids
+  subnet_cidrs = module.networking.subnet_cidrs

   nlb_target_group_arns = module.load_balancer.nlb_target_group_arns

terraform/aws/main.tf

Lines changed: 27 additions & 27 deletions
@@ -7,19 +7,19 @@ resource "random_string" "this" {
 module "networking" {
   source = "./modules/networking"

-  bootstrap_networking = local.bootstrap_networking
+  bootstrap_networking = local.bootstrap_networking

-  prefix = local.prefix
-  tags = local.tags
+  prefix = local.prefix
+  tags = local.tags

-  vpc_id = var.vpc_id
-  vpc_cidr = var.vpc_cidr
+  vpc_id = var.vpc_id
+  vpc_cidr = var.vpc_cidr

-  subnet_ids = var.subnet_ids
-  subnet_cidrs = var.subnet_cidrs
+  subnet_ids = var.subnet_ids
+  subnet_cidrs = var.subnet_cidrs

-  nat_subnet_cidr = var.nat_subnet_cidr
-  enable_nat_gateway = var.enable_nat_gateway
+  nat_subnet_cidr = var.nat_subnet_cidr
+  enable_nat_gateway = var.enable_nat_gateway

 }

@@ -28,35 +28,35 @@ module "load_balancer" {

   bootstrap_load_balancer = local.bootstrap_load_balancer

-  prefix = local.prefix
-  region = var.region
-  tags = local.tags
+  prefix = local.prefix
+  region = var.region
+  tags = local.tags

-  nlb_arn = var.nlb_arn
+  nlb_arn = var.nlb_arn

-  vpc_id = local.vpc_id
-  subnet_ids = local.subnet_ids
-  subnet_cidrs = local.subnet_cidrs
+  vpc_id = local.vpc_id
+  subnet_ids = local.subnet_ids
+  subnet_cidrs = local.subnet_cidrs

-  dbx_proxy_health_port = var.dbx_proxy_health_port
-  dbx_proxy_listener = var.dbx_proxy_listener
+  dbx_proxy_health_port = var.dbx_proxy_health_port
+  dbx_proxy_listener = var.dbx_proxy_listener

 }

 module "proxy" {
   source = "./modules/proxy"

-  prefix = local.prefix
-  tags = local.tags
+  prefix = local.prefix
+  tags = local.tags

-  vpc_id = local.vpc_id
-  subnet_ids = local.subnet_ids
-  subnet_cidrs = local.subnet_cidrs
+  vpc_id = local.vpc_id
+  subnet_ids = local.subnet_ids
+  subnet_cidrs = local.subnet_cidrs

-  instance_type = var.instance_type
-  min_capacity = var.min_capacity
-  max_capacity = var.max_capacity
-  nlb_target_group_arns = local.nlb_target_group_arns
+  instance_type = var.instance_type
+  min_capacity = var.min_capacity
+  max_capacity = var.max_capacity
+  nlb_target_group_arns = local.nlb_target_group_arns

   dbx_proxy_image_version = var.dbx_proxy_image_version
   dbx_proxy_health_port = var.dbx_proxy_health_port

terraform/aws/modules/load-balancer/local.tf

Lines changed: 0 additions & 1 deletion
@@ -2,7 +2,6 @@ locals {

   nlb_arn = var.bootstrap_load_balancer ? aws_lb.this[0].arn : data.aws_lb.this[0].arn
   nlb_dns_name = var.bootstrap_load_balancer ? aws_lb.this[0].dns_name : data.aws_lb.this[0].dns_name
-  nlb_zone_id = var.bootstrap_load_balancer ? aws_lb.this[0].zone_id : data.aws_lb.this[0].zone_id

   nlb_security_group_ids = var.bootstrap_load_balancer ? aws_lb.this[0].security_groups : data.aws_lb.this[0].security_groups

terraform/aws/modules/load-balancer/nlb.tf

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ resource "aws_lb" "this" {
   security_groups = [aws_security_group.this[0].id]

   enable_cross_zone_load_balancing = true
-  enable_deletion_protection = false
+  enable_deletion_protection = false

   # PrivateLink traffic bypasses NLB SG ingress when this is off. We use off
   # because the service owner cannot restrict ingress by endpoint SG/CIDR without
Lines changed: 3 additions & 3 deletions
@@ -1,7 +1,7 @@
 locals {

-  vpc_id = var.bootstrap_networking ? aws_vpc.this[0].id : var.vpc_id
-  subnet_ids = length(var.subnet_ids) > 0 ? var.subnet_ids : [for s in aws_subnet.this : s.id]
-  subnet_cidrs = length(var.subnet_ids) > 0 ? [for s in values(data.aws_subnet.this) : s.cidr_block] : [for s in aws_subnet.this : s.cidr_block]
+  vpc_id = var.bootstrap_networking ? aws_vpc.this[0].id : var.vpc_id
+  subnet_ids = length(var.subnet_ids) > 0 ? var.subnet_ids : [for s in aws_subnet.this : s.id]
+  subnet_cidrs = length(var.subnet_ids) > 0 ? [for s in values(data.aws_subnet.this) : s.cidr_block] : [for s in aws_subnet.this : s.cidr_block]

 }

terraform/aws/modules/networking/network.tf

Lines changed: 2 additions & 1 deletion
@@ -1,5 +1,6 @@
 resource "aws_vpc" "this" {
-  count = var.bootstrap_networking ? 1 : 0
+  count = var.bootstrap_networking ? 1 : 0
+
   cidr_block = var.vpc_cidr
   enable_dns_hostnames = true
   enable_dns_support = true
