Commit 043da3c

aarthy-dk and claude committed
docs: simplify README install section to point at docs site
The dk-installer now offers Docker and pip install modes, both fully documented at docs.datakitchen.io. Replace the long install instructions in the README, which had drifted out of sync with the installer and the docs, with a brief pointer to the install pages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 01ad413 commit 043da3c

1 file changed

Lines changed: 4 additions & 191 deletions

README.md

@@ -29,199 +29,12 @@ A <b>single place to manage Data Quality</b> across data sets, locations, and te
 <img alt="DataKitchen Open Source Data Quality TestGen Features - Single Place" src="https://datakitchen.io/wp-content/uploads/2024/07/Screenshot-dataops-testgen-centralize.png" width="70%">
 </p>

-## Installation with dk-installer (recommended)
+## Installation

-The [dk-installer](https://github.com/DataKitchen/data-observability-installer/?tab=readme-ov-file#install-the-testgen-application) program installs DataOps Data Quality TestGen as a [Docker Compose](https://docs.docker.com/compose/) application. This is the recommended mode of installation as Docker encapsulates and isolates the application from other software on your machine and does not require you to manage Python dependencies.
+The [dk-installer](https://github.com/DataKitchen/data-observability-installer/?tab=readme-ov-file#install-the-testgen-application) program installs TestGen in either Docker or pip mode. For complete instructions, see the documentation:

-### Install the prerequisite software
-
-| Software | Tested Versions | Command to check version |
-|-------------------------|-------------------------|-------------------------------|
-| [Python](https://www.python.org/downloads/) <br/>- Most Linux and macOS systems have Python pre-installed. <br/>- On Windows machines, you will need to download and install it. <br/> Why Python? To run the installer. | 3.9, 3.10, 3.11, 3.12 | `python3 --version` |
-| [Docker](https://docs.docker.com/get-docker/) <br/>[Docker Compose](https://docs.docker.com/compose/install/) <br/> Why Docker? Docker lets you try TestGen without affecting your local software environment. All the dependencies TestGen needs are isolated in its own container, so installation is easy and insulated. | 26.1, 27.5, 28.0 <br/> 2.32, 2.33, 2.34 | `docker -v` <br/> `docker compose version` |
-
-### Download the installer
-
-On Unix-based operating systems, use the following command to download it to the current directory. We recommend creating a new, empty directory.
-
-```shell
-curl -o dk-installer.py 'https://raw.githubusercontent.com/DataKitchen/data-observability-installer/main/dk-installer.py'
-```
-
-* Alternatively, you can manually download the [`dk-installer.py`](https://github.com/DataKitchen/data-observability-installer/blob/main/dk-installer.py) file from the [data-observability-installer](https://github.com/DataKitchen/data-observability-installer) repository.
-* All commands listed below should be run from the folder containing this file.
-* For usage help and command options, run `python3 dk-installer.py --help` or `python3 dk-installer.py <command> --help`.
-
-### Install the TestGen application
-
-The installation downloads the latest Docker images for TestGen and deploys a new Docker Compose application. The process may take 5~10 minutes depending on your machine and network connection.
-
-```shell
-python3 dk-installer.py tg install
-```
-
-The `--port` option may be used to set a custom localhost port for the application (default: 8501).
-
-To enable SSL for HTTPS support, use the `--ssl-cert-file` and `--ssl-key-file` options to specify local file paths to your SSL certificate and key files.
-
-Once the installation completes, verify that you can login to the UI with the URL and credentials provided in the output.
-
-### Optional: Run the TestGen demo setup
-
-The [Data Observability quickstart](https://docs.datakitchen.io/tutorials/quickstart-demo/) walks you through DataOps Data Quality TestGen capabilities to demonstrate how it covers critical use cases for data and analytic teams.
-
-```shell
-python3 dk-installer.py tg run-demo
-```
-
-In the TestGen UI, you will see that new data profiling and test results have been generated.
-
-## Installation with pip
-
-As an alternative to the Docker Compose [installation with dk-installer (recommended)](#installation-with-dk-installer-recommended), DataOps Data Quality TestGen can also be installed as a Python package via [pip](https://pip.pypa.io/en/stable/). This mode of installation uses the [dataops-testgen](https://pypi.org/project/dataops-testgen/) package published to PyPI, and it requires a PostgreSQL instance to be provisioned for the application database.
-
-### Install the prerequisite software
-
-| Software | Tested Versions | Command to check version |
-|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|------------------------------|
-| [Python](https://www.python.org/downloads/) <br/>- Most Linux and macOS systems have Python pre-installed. <br/>- On Windows machines, you will need to download and install it. | 3.11, 3.12, 3.13 | `python3 --version` |
-| [PostgreSQL](https://www.postgresql.org/download/) | 14.1, 15.8, 16.4 | `psql --version`|
-
-### Install the TestGen package
-
-We recommend using a Python virtual environment to avoid any dependency conflicts with other applications installed on your machine. The [venv](https://docs.python.org/3/library/venv.html#creating-virtual-environments) module, which is part of the Python standard library, or other third-party tools, like [virtualenv](https://virtualenv.pypa.io/en/latest/) or [conda](https://docs.conda.io/en/latest/), can be used.
-
-Create and activate a virtual environment with a TestGen-compatible version of Python (`>=3.11`). The steps may vary based on your operating system and Python installation - the [Python packaging user guide](https://packaging.python.org/en/latest/tutorials/installing-packages/) is a useful reference.
-
-_On Linux/Mac_
-```shell
-python3 -m venv venv
-source venv/bin/activate
-```
-
-_On Windows_
-```powershell
-py -3.13 -m venv venv
-venv\Scripts\activate
-```
-
-Within the virtual environment, install the TestGen package with pip.
-```shell
-pip install dataops-testgen
-```
-
-Verify that the [_testgen_ command line](https://docs.datakitchen.io/testgen/cli-reference/) works.
-```shell
-testgen --help
-```
-
-### Set up the application database in PostgresSQL
-
-Create a `local.env` file with the following environment variables, replacing the `<value>` placeholders with appropriate values. Refer to the [TestGen Configuration](docs/configuration.md) document for more details, defaults, and other supported configuration.
-```shell
-# Connection parameters for the PostgreSQL server
-export TG_METADATA_DB_HOST=<postgres_hostname>
-export TG_METADATA_DB_PORT=<postgres_port>
-
-# Connection credentials for the PostgreSQL server
-# This role must have privileges to create roles, users, database and schema so that the application database can be initialized
-export TG_METADATA_DB_USER=<postgres_username>
-export TG_METADATA_DB_PASSWORD=<postgres_password>
-
-# Set a password and arbitrary string (the "salt") to be used for encrypting secrets in the application database
-export TG_DECRYPT_PASSWORD=<encryption_password>
-export TG_DECRYPT_SALT=<encryption_salt>
-
-# Set credentials for the default admin user to be created for TestGen
-export TESTGEN_USERNAME=<username>
-export TESTGEN_PASSWORD=<password>
-
-# Set an arbitrary base64-encoded string to be used for signing authentication tokens
-export TG_JWT_HASHING_KEY=<base64_key>
-
-# Set an accessible path for storing application logs
-export TESTGEN_LOG_FILE_PATH=<path_for_logs>
-```
-
-Source the file to apply the environment variables. For the Windows equivalent, refer to [this guide](https://bennett4.medium.com/windows-alternative-to-source-env-for-setting-environment-variables-606be2a6d3e1).
-```shell
-source local.env
-```
-
-Make sure the PostgreSQL database server is up and running. Initialize the application database for TestGen.
-```shell
-testgen setup-system-db --yes
-```
-
-### Run the application modules
-
-Run the following command to start TestGen. It will open the browser at [http://localhost:8501](http://localhost:8501).
-
-```shell
-testgen run-app
-```
-
-Verify that you can login to the UI with the `TESTGEN_USERNAME` and `TESTGEN_PASSWORD` values that you configured in the environment variables.
-
-### Optional: Run the TestGen demo setup
-
-The [Data Observability quickstart](https://docs.datakitchen.io/tutorials/quickstart-demo/) walks you through DataOps Data Quality TestGen capabilities to demonstrate how it covers critical use cases for data and analytic teams.
-
-```shell
-testgen quick-start
-```
-
-In the TestGen UI, you will see that new data profiling and test results have been generated.
-
-## Useful Commands
-
-The [dk-installer](https://github.com/DataKitchen/data-observability-installer/?tab=readme-ov-file#install-the-testgen-application) and [docker compose CLI](https://docs.docker.com/compose/reference/) can be used to operate the TestGen application installed using dk-installer. All commands must be run in the same folder that contains the `dk-installer.py` and `docker-compose.yml` files used by the installation.
-
-### Remove demo data
-
-After completing the quickstart, you can remove the demo data from the application with the following command.
-
-```shell
-python3 dk-installer.py tg delete-demo
-```
-
-### Upgrade to latest version
-
-New releases of TestGen are announced on the `#releases` channel on [Data Observability Slack](https://data-observability-slack.datakitchen.io/join), and release notes can be found on the [DataKitchen documentation portal](https://docs.datakitchen.io/testgen/release-notes/). Use the following command to upgrade to the latest released version.
-
-```shell
-python3 dk-installer.py tg upgrade
-```
-
-### Uninstall the application
-
-The following command uninstalls the Docker Compose application and removes all data, containers, and images related to TestGen from your machine.
-
-```shell
-python3 dk-installer.py tg delete
-```
-
-### Access the _testgen_ CLI
-
-The [_testgen_ command line](https://docs.datakitchen.io/testgen/cli-reference/) can be accessed within the running container.
-
-```shell
-docker compose exec engine bash
-```
-
-Use `exit` to return to the regular terminal.
-
-### Stop the application
-
-```shell
-docker compose down
-```
-
-### Restart the application
-
-```shell
-docker compose up -d
-```
+* [Install on Mac/Linux](https://docs.datakitchen.io/testgen/get-started/install-on-mac-linux/)
+* [Install on Windows](https://docs.datakitchen.io/testgen/get-started/install-on-windows/)

 ## What Next?
