README.md
Lines changed: 4 additions & 191 deletions
@@ -29,199 +29,12 @@ A <b>single place to manage Data Quality</b> across data sets, locations, and te
 <img alt="DataKitchen Open Source Data Quality TestGen Features - Single Place" src="https://datakitchen.io/wp-content/uploads/2024/07/Screenshot-dataops-testgen-centralize.png" width="70%">
 </p>
 
-## Installation with dk-installer (recommended)
+## Installation
 
-The [dk-installer](https://github.com/DataKitchen/data-observability-installer/?tab=readme-ov-file#install-the-testgen-application) program installs DataOps Data Quality TestGen as a [Docker Compose](https://docs.docker.com/compose/) application. This is the recommended mode of installation as Docker encapsulates and isolates the application from other software on your machine and does not require you to manage Python dependencies.
+The [dk-installer](https://github.com/DataKitchen/data-observability-installer/?tab=readme-ov-file#install-the-testgen-application) program installs TestGen in either Docker or pip mode. For complete instructions, see the documentation:
 
-### Install the prerequisite software
-
-| Software | Tested Versions | Command to check version |
-| --- | --- | --- |
-| [Python](https://www.python.org/downloads/) <br/>- Most Linux and macOS systems have Python pre-installed. <br/>- On Windows machines, you will need to download and install it. <br/> Why Python? To run the installer. | 3.9, 3.10, 3.11, 3.12 | `python3 --version` |
-| [Docker](https://docs.docker.com/get-docker/) <br/>[Docker Compose](https://docs.docker.com/compose/install/) <br/> Why Docker? Docker lets you try TestGen without affecting your local software environment. All the dependencies TestGen needs are isolated in its own container, so installation is easy and insulated. | 26.1, 27.5, 28.0 <br/> 2.32, 2.33, 2.34 | `docker -v` <br/> `docker compose version` |
-
-### Download the installer
-
-On Unix-based operating systems, use the following command to download it to the current directory. We recommend creating a new, empty directory.
-* Alternatively, you can manually download the [`dk-installer.py`](https://github.com/DataKitchen/data-observability-installer/blob/main/dk-installer.py) file from the [data-observability-installer](https://github.com/DataKitchen/data-observability-installer) repository.
-* All commands listed below should be run from the folder containing this file.
-* For usage help and command options, run `python3 dk-installer.py --help` or `python3 dk-installer.py <command> --help`.
-
-### Install the TestGen application
-
-The installation downloads the latest Docker images for TestGen and deploys a new Docker Compose application. The process may take 5~10 minutes depending on your machine and network connection.
-
-```shell
-python3 dk-installer.py tg install
-```
-
-The `--port` option may be used to set a custom localhost port for the application (default: 8501).
-
-To enable SSL for HTTPS support, use the `--ssl-cert-file` and `--ssl-key-file` options to specify local file paths to your SSL certificate and key files.
-
-Once the installation completes, verify that you can login to the UI with the URL and credentials provided in the output.
-
-### Optional: Run the TestGen demo setup
-
-The [Data Observability quickstart](https://docs.datakitchen.io/tutorials/quickstart-demo/) walks you through DataOps Data Quality TestGen capabilities to demonstrate how it covers critical use cases for data and analytic teams.
-
-```shell
-python3 dk-installer.py tg run-demo
-```
-
-In the TestGen UI, you will see that new data profiling and test results have been generated.
-
-## Installation with pip
-
-As an alternative to the Docker Compose [installation with dk-installer (recommended)](#installation-with-dk-installer-recommended), DataOps Data Quality TestGen can also be installed as a Python package via [pip](https://pip.pypa.io/en/stable/). This mode of installation uses the [dataops-testgen](https://pypi.org/project/dataops-testgen/) package published to PyPI, and it requires a PostgreSQL instance to be provisioned for the application database.
-
-### Install the prerequisite software
-
-| Software | Tested Versions | Command to check version |
-| --- | --- | --- |
-| [Python](https://www.python.org/downloads/) <br/>- Most Linux and macOS systems have Python pre-installed. <br/>- On Windows machines, you will need to download and install it. | 3.11, 3.12, 3.13 | `python3 --version` |
-We recommend using a Python virtual environment to avoid any dependency conflicts with other applications installed on your machine. The [venv](https://docs.python.org/3/library/venv.html#creating-virtual-environments) module, which is part of the Python standard library, or other third-party tools, like [virtualenv](https://virtualenv.pypa.io/en/latest/) or [conda](https://docs.conda.io/en/latest/), can be used.
-
-Create and activate a virtual environment with a TestGen-compatible version of Python (`>=3.11`). The steps may vary based on your operating system and Python installation - the [Python packaging user guide](https://packaging.python.org/en/latest/tutorials/installing-packages/) is a useful reference.
-
-_On Linux/Mac_
-```shell
-python3 -m venv venv
-source venv/bin/activate
-```
-
-_On Windows_
-```powershell
-py -3.13 -m venv venv
-venv\Scripts\activate
-```
-
-Within the virtual environment, install the TestGen package with pip.
-```shell
-pip install dataops-testgen
-```
-
-Verify that the [_testgen_ command line](https://docs.datakitchen.io/testgen/cli-reference/) works.
-```shell
-testgen --help
-```
-
-### Set up the application database in PostgresSQL
-
-Create a `local.env` file with the following environment variables, replacing the `<value>` placeholders with appropriate values. Refer to the [TestGen Configuration](docs/configuration.md) document for more details, defaults, and other supported configuration.
-```shell
-# Connection parameters for the PostgreSQL server
-export TG_METADATA_DB_HOST=<postgres_hostname>
-export TG_METADATA_DB_PORT=<postgres_port>
-
-# Connection credentials for the PostgreSQL server
-# This role must have privileges to create roles, users, database and schema so that the application database can be initialized
-# Set a password and arbitrary string (the "salt") to be used for encrypting secrets in the application database
-export TG_DECRYPT_PASSWORD=<encryption_password>
-export TG_DECRYPT_SALT=<encryption_salt>
-
-# Set credentials for the default admin user to be created for TestGen
-export TESTGEN_USERNAME=<username>
-export TESTGEN_PASSWORD=<password>
-
-# Set an arbitrary base64-encoded string to be used for signing authentication tokens
-export TG_JWT_HASHING_KEY=<base64_key>
-
-# Set an accessible path for storing application logs
-export TESTGEN_LOG_FILE_PATH=<path_for_logs>
-```
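
The removed `local.env` template asks for an arbitrary salt and an arbitrary base64-encoded signing key but does not say how to produce them. One possible way is sketched below in Python. The helper names `make_base64_key` and `make_salt` and the byte lengths (32 and 16) are assumptions made for illustration; the README does not specify a required length.

```python
import base64
import secrets


def make_base64_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure random data, base64-encoded.

    The default of 32 bytes is an assumption, not a documented TestGen requirement.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def make_salt(num_bytes: int = 16) -> str:
    """Return a random hex string that can serve as an arbitrary salt value."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed lines into local.env in place of the placeholders.
    print(f"export TG_JWT_HASHING_KEY={make_base64_key()}")
    print(f"export TG_DECRYPT_SALT={make_salt()}")
```

Both helpers rely on the standard-library `secrets` module, which draws from the operating system's secure random source, so the values are suitable as secrets rather than merely unique.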
-
-Source the file to apply the environment variables. For the Windows equivalent, refer to [this guide](https://bennett4.medium.com/windows-alternative-to-source-env-for-setting-environment-variables-606be2a6d3e1).
-```shell
-source local.env
-```
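
Where `source` is not available, the same `export NAME=value` lines could be applied from Python instead. The following is a minimal sketch, not part of TestGen: `parse_env_text` and `load_env_file` are hypothetical helpers, and the parser assumes plain `export NAME=value` lines with no shell expansion or multi-line values.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple `export NAME=value` lines; comments and blank lines are skipped."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore the line
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[name.strip()] = value
    return values


def load_env_file(path: str = "local.env") -> dict[str, str]:
    """Read the file and copy its variables into this process's environment."""
    values = parse_env_text(Path(path).read_text(encoding="utf-8"))
    os.environ.update(values)
    return values
```

Variables set this way are visible only to the current Python process and to child processes it starts, so the `testgen` commands would have to be launched from that same process (for example with `subprocess.run`) to pick them up.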
-
-Make sure the PostgreSQL database server is up and running. Initialize the application database for TestGen.
-```shell
-testgen setup-system-db --yes
-```
-
-### Run the application modules
-
-Run the following command to start TestGen. It will open the browser at [http://localhost:8501](http://localhost:8501).
-
-```shell
-testgen run-app
-```
-
-Verify that you can login to the UI with the `TESTGEN_USERNAME` and `TESTGEN_PASSWORD` values that you configured in the environment variables.
-
-### Optional: Run the TestGen demo setup
-
-The [Data Observability quickstart](https://docs.datakitchen.io/tutorials/quickstart-demo/) walks you through DataOps Data Quality TestGen capabilities to demonstrate how it covers critical use cases for data and analytic teams.
-
-```shell
-testgen quick-start
-```
-
-In the TestGen UI, you will see that new data profiling and test results have been generated.
-
-## Useful Commands
-
-The [dk-installer](https://github.com/DataKitchen/data-observability-installer/?tab=readme-ov-file#install-the-testgen-application) and [docker compose CLI](https://docs.docker.com/compose/reference/) can be used to operate the TestGen application installed using dk-installer. All commands must be run in the same folder that contains the `dk-installer.py` and `docker-compose.yml` files used by the installation.
-
-### Remove demo data
-
-After completing the quickstart, you can remove the demo data from the application with the following command.
-
-```shell
-python3 dk-installer.py tg delete-demo
-```
-
-### Upgrade to latest version
-
-New releases of TestGen are announced on the `#releases` channel on [Data Observability Slack](https://data-observability-slack.datakitchen.io/join), and release notes can be found on the [DataKitchen documentation portal](https://docs.datakitchen.io/testgen/release-notes/). Use the following command to upgrade to the latest released version.
-
-```shell
-python3 dk-installer.py tg upgrade
-```
-
-### Uninstall the application
-
-The following command uninstalls the Docker Compose application and removes all data, containers, and images related to TestGen from your machine.
-
-```shell
-python3 dk-installer.py tg delete
-```
-
-### Access the _testgen_ CLI
-
-The [_testgen_ command line](https://docs.datakitchen.io/testgen/cli-reference/) can be accessed within the running container.
-
-```shell
-docker compose exec engine bash
-```
-
-Use `exit` to return to the regular terminal.
-
-### Stop the application
-
-```shell
-docker compose down
-```
-
-### Restart the application
-
-```shell
-docker compose up -d
-```
+* [Install on Mac/Linux](https://docs.datakitchen.io/testgen/get-started/install-on-mac-linux/)
+* [Install on Windows](https://docs.datakitchen.io/testgen/get-started/install-on-windows/)