Commit 4fb75d4

Docs: Split out how-to for Proxy
Signed-off-by: Kate Andrews <keandrews@gmail.com>
1 parent cc7ab21 commit 4fb75d4

1 file changed

Lines changed: 255 additions & 0 deletions

docs/how-to.md

@@ -0,0 +1,255 @@
1+
# CipherStash Proxy how-to guide
2+
3+
This page contains how-to documentation for installing, configuring, and running CipherStash Proxy.
4+
5+
## Table of contents
6+
7+
- [CipherStash Proxy how-to guide](#cipherstash-proxy-how-to-guide)
8+
- [Installing Proxy](#installing-proxy)
9+
- [Configuring Proxy](#configuring-proxy)
10+
- [Configuring Proxy with environment variables](#configuring-proxy-with-environment-variables)
11+
- [Configuring Proxy with a TOML file](#configuring-proxy-with-a-toml-file)
12+
- [Running Proxy locally](#running-proxy-locally)
13+
- [Setting up the database schema](#setting-up-the-database-schema)
14+
- [Creating columns with the right types](#creating-columns-with-the-right-types)
15+
- [Encrypting data in an existing database](#encrypting-data-in-an-existing-database)
16+
17+
## Installing Proxy
18+
19+
CipherStash Proxy is available as a [container image](https://hub.docker.com/r/cipherstash/proxy) on Docker Hub that can be deployed locally, in CI/CD, and in production.
20+
21+
The easiest way to start using CipherStash Proxy with your application is by adding a container to your application's `docker-compose.yml`.
22+
The following is an example of what adding CipherStash Proxy to your app's `docker-compose.yml` might look like:
23+
24+
```yaml
services:
  app:
    # Your application container config
  db:
    # Your Postgres container config
  proxy:
    image: cipherstash/proxy:latest
    container_name: proxy
    ports:
      - 6432:6432
      - 9930:9930
    environment:
      # Hostname of the Postgres database server connections will be proxied to
      - CS_DATABASE__HOST=${CS_DATABASE__HOST}
      # Port of the Postgres database server connections will be proxied to
      - CS_DATABASE__PORT=${CS_DATABASE__PORT}
      # Username of the Postgres database server connections will be proxied to
      - CS_DATABASE__USERNAME=${CS_DATABASE__USERNAME}
      # Password of the Postgres database server connections will be proxied to
      - CS_DATABASE__PASSWORD=${CS_DATABASE__PASSWORD}
      # The database name on the Postgres database server connections will be proxied to
      - CS_DATABASE__NAME=${CS_DATABASE__NAME}
      # The CipherStash workspace ID for making requests for encryption keys
      - CS_WORKSPACE_ID=${CS_WORKSPACE_ID}
      # The CipherStash client access key for making requests for encryption keys
      - CS_CLIENT_ACCESS_KEY=${CS_CLIENT_ACCESS_KEY}
      # The CipherStash dataset ID for generating and retrieving encryption keys
      - CS_DEFAULT_KEYSET_ID=${CS_DEFAULT_KEYSET_ID}
      # The CipherStash client ID used to programmatically access a dataset
      - CS_CLIENT_ID=${CS_CLIENT_ID}
      # The CipherStash client key used to programmatically access a dataset
      - CS_CLIENT_KEY=${CS_CLIENT_KEY}
      # Toggle Prometheus exporter for CipherStash Proxy operations
      - CS_PROMETHEUS__ENABLED=${CS_PROMETHEUS__ENABLED:-true}
59+
```
60+
61+
62+
For a fully working example, go to [`docker-compose.yml`](./docker-compose.yml).
63+
Follow the steps in [Getting started](../README.md#getting-started) to see it in action.
64+
65+
Once you have set up a `docker-compose.yml`, start the Proxy container:
66+
67+
```bash
docker compose up
69+
```
70+
71+
Connect your PostgreSQL client to Proxy on TCP port 6432.
72+
Point Prometheus at TCP port 9930 to [scrape metrics](reference.md#prometheus-metrics).
73+
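As a minimal sketch of what "connect to Proxy" means for an application, the helper below builds a PostgreSQL connection URI that targets the Proxy port rather than the database itself. The function name `proxy_dsn`, the `localhost` default, and the placeholder credentials are assumptions for illustration; use whatever host and credentials apply to your deployment.

```python
from urllib.parse import quote

PROXY_PORT = 6432  # port published for Proxy in the compose example


def proxy_dsn(username: str, password: str, database: str,
              host: str = "localhost", port: int = PROXY_PORT) -> str:
    """Build a PostgreSQL connection URI that points at Proxy, not at Postgres."""
    # Percent-encode each part so characters like '@' or ':' stay unambiguous.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    name = quote(database, safe="")
    return f"postgresql://{user}:{secret}@{host}:{port}/{name}"


# Placeholder values, not real credentials.
print(proxy_dsn("cipherstash", "p@ssword", "cipherstash"))
# postgresql://cipherstash:p%40ssword@localhost:6432/cipherstash
```

Most PostgreSQL drivers and tools accept a URI in this form, so switching an application from the database to Proxy is usually a matter of changing the host and port.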
74+
## Configuring Proxy
75+
76+
To run, CipherStash Proxy needs to know:
77+
78+
- What port to run on
79+
- How to connect to the target PostgreSQL database
80+
- Secrets to authenticate to CipherStash
81+
82+
There are two ways to configure Proxy:
83+
84+
- [With environment variables that Proxy looks up on startup](#configuring-proxy-with-environment-variables)
85+
- [With a TOML file that Proxy reads on startup](#configuring-proxy-with-a-toml-file)
86+
87+
Proxy decides where to load its configuration from as follows:
88+
89+
1. If `cipherstash-proxy.toml` is present in the current working directory, Proxy will read its config from that file
90+
1. If `cipherstash-proxy.toml` is not present, Proxy will look up environment variables to configure itself
91+
1. If **both** `cipherstash-proxy.toml` and environment variables are present, Proxy will use `cipherstash-proxy.toml` as the base configuration, and override it with any environment variables that are set
92+
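The precedence rules above can be illustrated with a short sketch. This is not Proxy's implementation: `effective_config` is a hypothetical helper, and the pairing of environment variables with TOML keys in `ENV_TO_TOML` is an assumption inferred from the matching names in this guide.

```python
import copy

# Assumed pairing of environment variables with TOML sections and keys.
# Inferred from the names used in this guide; confirm against the reference docs.
ENV_TO_TOML = {
    "CS_DATABASE__NAME": ("database", "name"),
    "CS_DATABASE__USERNAME": ("database", "username"),
    "CS_DATABASE__PASSWORD": ("database", "password"),
    "CS_WORKSPACE_ID": ("auth", "workspace_id"),
    "CS_CLIENT_ACCESS_KEY": ("auth", "client_access_key"),
    "CS_DEFAULT_KEYSET_ID": ("encrypt", "default_keyset_id"),
    "CS_CLIENT_ID": ("encrypt", "client_id"),
    "CS_CLIENT_KEY": ("encrypt", "client_key"),
}


def effective_config(toml_config, env):
    """Use the TOML config (if any) as the base, then layer set env vars on top."""
    merged = copy.deepcopy(toml_config) if toml_config else {}
    for var, (section, key) in ENV_TO_TOML.items():
        if var in env:
            merged.setdefault(section, {})[key] = env[var]
    return merged


file_config = {"database": {"name": "from_file", "username": "from_file"}}
env = {"CS_DATABASE__NAME": "from_env"}
print(effective_config(file_config, env))
# {'database': {'name': 'from_env', 'username': 'from_file'}}
```

The environment value wins for `name`, while `username` keeps the value from the file because no matching variable is set.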
93+
See [Proxy config options](reference.md#proxy-config-options) for all the available options.
94+
95+
### Configuring Proxy with environment variables
96+
97+
If you are configuring Proxy with environment variables, these are the minimum environment variables required to run Proxy:
98+
99+
```bash
CS_DATABASE__NAME
CS_DATABASE__USERNAME
CS_DATABASE__PASSWORD
CS_WORKSPACE_ID
CS_CLIENT_ACCESS_KEY
CS_DEFAULT_KEYSET_ID
CS_CLIENT_ID
CS_CLIENT_KEY
108+
```
109+
110+
Read the full list of environment variables and what they do in the [reference documentation](reference.md#proxy-config-options).
111+
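If you start Proxy from a script or CI job, a preflight check can catch a missing variable before the container starts. The sketch below is one way to do that. `missing_env_vars` is a hypothetical helper, and treating an empty value the same as an unset one is an assumption, not documented Proxy behavior.

```python
import os

# The minimum variables listed in this guide.
REQUIRED_ENV_VARS = (
    "CS_DATABASE__NAME",
    "CS_DATABASE__USERNAME",
    "CS_DATABASE__PASSWORD",
    "CS_WORKSPACE_ID",
    "CS_CLIENT_ACCESS_KEY",
    "CS_DEFAULT_KEYSET_ID",
    "CS_CLIENT_ID",
    "CS_CLIENT_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty, in list order."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```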
112+
### Configuring Proxy with a TOML file
113+
114+
If you are configuring Proxy with a `cipherstash-proxy.toml` file, these are the minimum values required to run Proxy:
115+
116+
```toml
[database]
name = "cipherstash"
username = "cipherstash"
password = "password"

[auth]
workspace_id = "cipherstash-workspace-id"
client_access_key = "cipherstash-client-access-key"

[encrypt]
default_keyset_id = "cipherstash-default-keyset-id"
client_id = "cipherstash-client-id"
client_key = "cipherstash-client-key"
130+
```
131+
132+
Read the full list of configuration options and what they do in the [reference documentation](reference.md#proxy-config-options).
133+
134+
## Running Proxy locally
135+
136+
TODO: Add instructions for running Proxy locally
137+
138+
## Setting up the database schema
139+
140+
Under the hood, Proxy uses [CipherStash Encrypt Query Language](https://github.com/cipherstash/encrypt-query-language/) to index and search encrypted data.
141+
142+
When you start the Proxy container, you can install EQL by setting the `CS_DATABASE__INSTALL_EQL` environment variable:
143+
144+
```bash
CS_DATABASE__INSTALL_EQL=true
146+
```
147+
148+
This will install the version of EQL bundled with the Proxy container.
149+
The version of EQL bundled with the Proxy container is tested to work with that version of Proxy.
150+
151+
If you are following the [getting started](../README.md#getting-started) guide, EQL is automatically installed for you.
152+
You can also install EQL by running [the installation script](https://github.com/cipherstash/encrypt-query-language/releases) as a database migration in your application.
153+
154+
Once you have installed EQL, you can see what version is installed by querying the database:
155+
156+
```sql
SELECT cs_eql_version();
158+
```
159+
160+
This will output the version of EQL installed.
161+
162+
### Creating columns with the right types
163+
164+
In your existing PostgreSQL database, you store your data in tables and columns.
165+
Those columns have types like `integer`, `text`, `timestamp`, and `boolean`.
166+
When storing encrypted data in PostgreSQL with Proxy, you use a special column type called `cs_encrypted_v1`, which is [provided by EQL](#setting-up-the-database-schema).
167+
`cs_encrypted_v1` is a container column type that can hold any type of encrypted data you want to store or search, whether that is numbers (`int`, `small_int`, `big_int`), text (`text`), dates and times (`date`), or booleans (`boolean`).
168+
169+
Create a table with an encrypted column for `email`:
170+
171+
```sql
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email cs_encrypted_v1
);
176+
```
177+
178+
This creates a `users` table with two columns:
179+
180+
- `id`, an autoincrementing integer column that is the primary key for the record
181+
- `email`, a `cs_encrypted_v1` column
182+
183+
There are important differences between the plaintext columns you've traditionally used in PostgreSQL and encrypted columns with CipherStash Proxy:
184+
185+
- **Plaintext columns can be searched even if they don't have an index**, albeit with the performance cost of a full table scan.
186+
- **Encrypted columns cannot be searched without an encrypted index**, and the encrypted indexes you define determine what kind of searches you can do on encrypted data.
187+
188+
In the previous step we created a table with an encrypted column, but without any encrypted indexes.
189+
190+
Now you can add an encrypted index for that encrypted column:
191+
192+
```sql
SELECT cs_add_index_v1(
    'users',
    'email',
    'unique',
    'text'
);
199+
```
200+
201+
This statement adds a `unique` index for the `email` column in the `users` table, which has an underlying data type of `text`.
202+
203+
`unique` indexes are used to find records by exact value in columns whose values are unique, for example with the `=` operator.
204+
205+
There are two other types of encrypted indexes you can use on `text` data:
206+
207+
```sql
SELECT cs_add_index_v1(
    'users',
    'email',
    'match',
    'text'
);

SELECT cs_add_index_v1(
    'users',
    'email',
    'ore',
    'text'
);
221+
```
222+
223+
The first SQL statement adds a `match` index, which is used for partial matches with `LIKE`.
224+
The second SQL statement adds an `ore` index, which is used for ordering with `ORDER BY`.
225+
226+
Now that the indexes have been added, you must activate them:
227+
228+
```sql
SELECT cs_encrypt_v1();
SELECT cs_activate_v1();
231+
```
232+
233+
This loads and activates the encrypted indexes.
234+
235+
You must run the `cs_encrypt_v1()` and `cs_activate_v1()` functions after any modifications to the encrypted indexes.
236+
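If you generate schema migrations from code, the steps in this section can be assembled in one place so the activation calls are never forgotten. The sketch below only builds SQL strings from the function calls shown in this guide. The helper names `sql_literal` and `index_migration` are hypothetical, and the accepted index types are limited to the three named here.

```python
TEXT_INDEX_TYPES = ("unique", "match", "ore")  # index types named in this guide


def sql_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def index_migration(table, column, index_types, cast_type="text"):
    """Return SQL statements that add each index, then load and activate them."""
    statements = []
    for index_type in index_types:
        if index_type not in TEXT_INDEX_TYPES:
            raise ValueError(f"unsupported index type: {index_type!r}")
        args = ", ".join(
            sql_literal(v) for v in (table, column, index_type, cast_type)
        )
        statements.append(f"SELECT cs_add_index_v1({args});")
    # Run these after any modification to the encrypted indexes.
    statements += ["SELECT cs_encrypt_v1();", "SELECT cs_activate_v1();"]
    return statements


print("\n".join(index_migration("users", "email", ["unique", "match"])))
# SELECT cs_add_index_v1('users', 'email', 'unique', 'text');
# SELECT cs_add_index_v1('users', 'email', 'match', 'text');
# SELECT cs_encrypt_v1();
# SELECT cs_activate_v1();
```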
237+
> [!IMPORTANT]
238+
> Adding, updating, or deleting encrypted indexes on columns that already contain encrypted data will not re-index that data. To use the new indexes, you must `SELECT` the data out of the column, and `UPDATE` it again.
239+
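For a small table, the `SELECT` then `UPDATE` step described in the note above might look like the following sketch. It assumes `conn` is a Python DB-API (PEP 249) connection opened against Proxy, that the table has a key column named `id`, and that table and column names come from trusted code because they are interpolated into the SQL. `rewrite_column` is a hypothetical helper, and the placeholder style depends on your driver (`%s` for psycopg, `?` for `sqlite3`).

```python
def rewrite_column(conn, table, column, key_column="id", placeholder="%s"):
    """Read every value of `column` and write it back unchanged.

    Returns the number of rows rewritten. Identifiers are interpolated
    directly, so pass only trusted table and column names.
    """
    cur = conn.cursor()
    cur.execute(f"SELECT {key_column}, {column} FROM {table}")
    rows = cur.fetchall()  # loads the whole column into memory
    update = (
        f"UPDATE {table} SET {column} = {placeholder} "
        f"WHERE {key_column} = {placeholder}"
    )
    for key, value in rows:
        cur.execute(update, (value, key))
    conn.commit()
    return len(rows)
```

Because this reads the whole column into memory and issues one `UPDATE` per row, it only suits small tables. For larger datasets, the `encrypt` tool described later in this guide is the better fit.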
240+
To learn how to use encrypted indexes for other encrypted data types like `text`, `int`, `boolean`, `date`, and `jsonb`, see the [EQL documentation](https://github.com/cipherstash/encrypt-query-language/blob/main/docs/reference/INDEX.md).
241+
242+
When deploying CipherStash Proxy into production environments with real data, we recommend that you apply these database schema changes with the normal tools and process you use for making changes to your database schema.
243+
244+
To see more examples of how to modify your database schema, check out [the example schema](./getting-started/schema-example.sql) from [Getting started](../README.md#getting-started).
245+
246+
## Encrypting data in an existing database
247+
248+
CipherStash Proxy includes an `encrypt` tool – a CLI application to encrypt existing data, or to apply index changes after changes to the encryption configuration of a protected database.
249+
See the [`encrypt` tool guide](encrypt-tool.md) for info about using the `encrypt` tool.
250+
251+
---
252+
253+
### Didn't find what you wanted?
254+
255+
[Click here to let us know what was missing from our docs.](https://github.com/cipherstash/proxy/issues/new?template=docs-feedback.yml&title=[Docs:]%20Feedback%20on%20how-to.md)
