Commit c443b41

Add AI README to reflect Data Engineer Server features and structure
1 parent ff0a739 commit c443b41

1 file changed

Lines changed: 234 additions & 37 deletions

README.md

@@ -1,51 +1,248 @@
1+
# Data Engineer Server
12

2-
# Data Science Server
3+
![Docker](https://img.shields.io/badge/Docker-Compose-blue?logo=docker)
4+
![Ubuntu](https://img.shields.io/badge/Ubuntu-24.04-E95420?logo=ubuntu&logoColor=white)
5+
![Python](https://img.shields.io/badge/Python-3.11%2B-3776AB?logo=python&logoColor=white)
6+
![Status](https://img.shields.io/badge/status-active-success)
37

4-
_The list of shell scripts for configuration Data Science server on Ubuntu Server 24.04_.
8+
- [Repository structure](#repository-structure)
9+
- [Features](#features)
10+
- [Prerequisites](#prerequisites)
11+
- [Quick start](#quick-start)
12+
- [Servers](#servers)
13+
- [Clients](#clients)
14+
- [Configuration](#configuration)
15+
- [Environment variables](#environment-variables)
16+
- [Ports](#ports)
17+
- [Troubleshooting](#troubleshooting)
18+
- [Development helpers (Ubuntu 24.04)](#development-helpers-ubuntu-2404)
19+
- [Notes on reverse proxy (Traefik)](#notes-on-reverse-proxy-traefik)
20+
- [Contributing](#contributing)
521

6-
## Software / Frameworks
722

8-
Installing software/frameworks:
23+
Local-first data engineering and AI/DS workspace with dockerized services, helper scripts, and Python clients.
924

10-
- [x] ML/DL frameworks:
11-
- [x] Tensorflow (with GPU support)
12-
- [x] Keras (with GPU support)
13-
- [x] LightGBM (with GPU support)
14-
- [x] H2O Open
15-
- [x] R CRAN
16-
- [x] with pre-installed basic R-packages
17-
- [x] RStudio Server
18-
- [x] with Azure Database Service connector
19-
- [x] JupyterLab
20-
- [x] .NET Core SDK
21-
- [x] Docker
22-
- [x] Git configure
25+
- AI agents (Ollama + Open WebUI)
26+
- Object storage (MinIO) with event bus (RabbitMQ)
27+
- Centralized structured logging (Seq)
28+
- RStudio Server for R analytics
29+
- Ubuntu provisioning notes and developer setup scripts
2330

24-
## Preparation
2531

26-
```sh
27-
git clone https://github.com/codez0mb1e/cloud-deep-learning-server.git
32+
## Repository structure
2833

29-
cd cloud-deep-learning-server/src
30-
mkdir logs
3134
```
35+
|--.
36+
| -- ai-agents/ # Local AI stack (Ollama + Open WebUI) and model bootstrap
37+
| -- development/ # One-off scripts to set up dev tools on Ubuntu
38+
| -- minio/ # MinIO object storage: server compose + Python client
39+
| -- pipelines/ # AutoML pipeline docs and diagrams (conceptual)
40+
| -- rstudio-server/ # RStudio Server via Docker Compose (+ R frameworks notes)
41+
| -- seq/ # Seq logging: server compose + Python client
42+
| -- ubuntu-os/ # Ubuntu 24.04 tips, packages, disks/network, users
43+
```
44+
45+
46+
## Features
47+
48+
- Local AI stack with Ollama and Open WebUI for chat and coding assistance
49+
- RStudio Server for R analytics and data science
50+
- MinIO S3-compatible object storage with optional RabbitMQ notifications
51+
- Seq centralized logging with structured logs
52+
- Minimal Python clients for MinIO and Seq
53+
- Clear, scriptable startup via Docker Compose and small helper scripts
54+
55+
56+
## Prerequisites
57+
58+
- Linux (tested on Ubuntu Server 24.04)
59+
- Docker Engine and Docker Compose plugin
60+
- Optional: Python 3.11+ for the MinIO/Seq client examples
61+
62+
63+
## Quick start
64+
65+
### Servers
66+
67+
#### 1. AI agents: Ollama + Open WebUI
68+
69+
Folder: `ai-agents/` — see more in that folder's README.
70+
71+
```bash
72+
cd ai-agents
73+
74+
# Start services (Ollama + Open WebUI; model-downloader will fetch baseline models)
75+
docker compose up -d
76+
77+
# Watch model downloads
78+
docker compose logs model-downloader -f
79+
80+
# Open the chat UI
81+
# http://localhost:3000
82+
```
83+
84+
Services:
85+
- Ollama: http://localhost:11434
86+
- Open WebUI: http://localhost:3000
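
To confirm that the model downloads finished, you can ask Ollama which models it has pulled. The sketch below is a minimal, standard-library-only example and is not part of this repository: the function names `model_names` and `list_models` are hypothetical, and it assumes Ollama answers on the default URL listed above and exposes its `/api/tags` endpoint.

```python
import json
import urllib.request


def model_names(payload: dict) -> list[str]:
    """Extract the model names from an Ollama /api/tags response body."""
    return sorted(model["name"] for model in payload.get("models", []))


def list_models(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> list[str]:
    """Ask a running Ollama instance which models are available locally.

    Assumption: Ollama is reachable at base_url (the default from this README).
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))
```

Calling `list_models()` while the stack is up returns the pulled model tags; an empty list suggests the `model-downloader` service is still working.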
87+
88+
89+
#### 2. RStudio Server
90+
91+
Folder: `rstudio-server/server/`
92+
93+
```bash
94+
cd rstudio-server/server
95+
96+
# Set a password for the rstudio user
97+
echo "RSTUDIO_PASSWORD=<your-password>" > .env
98+
99+
# Start RStudio Server
100+
docker compose up -d
101+
102+
# Open the IDE
103+
# http://localhost:8787 (username: rstudio, password: from .env)
104+
```
105+
106+
Notes:
107+
- Port is bound to 127.0.0.1:8787 by default.
108+
- The compose file mounts `/home/${USER}/` into the container at `/home/rstudio/`.
109+
- Extra tips and R package guidance live in `rstudio-server/ds-frameworks/README.md`.
110+
111+
112+
#### 3. MinIO object storage (+ RabbitMQ for notifications)
113+
114+
Folder: `minio/server/`
115+
116+
```bash
117+
cd minio/server
118+
119+
# Required secrets
120+
cat > .env << 'EOF'
121+
MINIO_ROOT_PASSWORD=<strong-password>
122+
RABBITMQ_ROOT_PASSWORD=<strong-password>
123+
# Optional, only used if you run behind Traefik
124+
# PRIMARY_DOMAIN=example.com
125+
EOF
126+
127+
# Create network/volumes and start services
128+
bash ./run.sh
129+
130+
# MinIO S3 API: http://localhost:9000
131+
# RabbitMQ UI: http://localhost:15672 (user: admin, pass: from .env)
132+
```
133+
134+
Notes:
135+
- The MinIO Console runs on port 9001 inside the container. It's exposed via Traefik labels if you have a proxy configured; no direct host port is published here.
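
The containers may need a few seconds after `run.sh` returns before MinIO accepts requests. A hedged sketch of a readiness poll follows; it assumes the S3 API is published on `http://localhost:9000` as shown above and uses MinIO's liveness endpoint `/minio/health/live`. The helper names `minio_is_live` and `wait_until` are illustrative and do not exist in this repository.

```python
import http.client
import time
import urllib.request


def minio_is_live(base_url: str = "http://localhost:9000", timeout: float = 2.0) -> bool:
    """Return True if MinIO's liveness endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/minio/health/live", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False


def wait_until(check, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call check() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause between polls, but not after the last one
    return False
```

For example, `wait_until(minio_is_live)` polls for up to about 30 seconds and returns `False` if the server never came up.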
136+
137+
138+
#### 4. Seq centralized logging
139+
140+
Folder: `seq/server/`
141+
142+
```bash
143+
cd seq/server
144+
145+
# Start (creates network/volume and launches the container)
146+
bash ./run.sh
147+
148+
# Access
149+
# This compose file is set up for a reverse proxy via Traefik (labels only).
150+
# To reach the UI, publish ports or configure Traefik with PRIMARY_DOMAIN.
151+
```
152+
153+
154+
### Clients
155+
156+
#### MinIO client
157+
158+
Folder: `minio/client/`
159+
160+
```bash
161+
cd minio/client
162+
python -m venv .venv && source .venv/bin/activate
163+
pip install -r requirements.txt
164+
```
165+
166+
See `minio/client/clients.py` for Pandas/Polars put/get helpers.
167+
168+
169+
#### Seq logger client
170+
171+
Folder: `seq/client/`
172+
173+
```bash
174+
cd seq/client
175+
python -m venv .venv && source .venv/bin/activate
176+
pip install -r requirements.txt
177+
```
178+
179+
Configure `seq/client/config.yml` with your Seq endpoint and API key, then wire a logger using `LoggerFactory` in `seq_logger.py`.
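
For orientation, this is roughly what a structured event looks like on the wire. The sketch does not use or reproduce `LoggerFactory`; it is a standard-library illustration that assumes Seq's raw ingestion endpoint (`/api/events/raw?clef`) and the `X-Seq-ApiKey` header, and the function names `clef_event` and `send_to_seq` are hypothetical. Check the endpoint against your Seq version before relying on it.

```python
import json
import urllib.request
from datetime import datetime, timezone


def clef_event(message: str, level: str = "Information", **properties) -> str:
    """Render one log event as a CLEF line (a single JSON object)."""
    event = {
        "@t": datetime.now(timezone.utc).isoformat(),  # event timestamp
        "@l": level,                                   # log level
        "@m": message,                                 # rendered message
        **properties,                                  # structured properties
    }
    return json.dumps(event)


def send_to_seq(line: str, server_url: str, api_key: str = "") -> int:
    """POST one CLEF line to Seq and return the HTTP status code.

    Assumption: server_url and api_key are the values you put in config.yml.
    """
    request = urllib.request.Request(
        f"{server_url.rstrip('/')}/api/events/raw?clef",
        data=line.encode("utf-8"),
        headers={"Content-Type": "application/vnd.serilog.clef"},
        method="POST",
    )
    if api_key:
        request.add_header("X-Seq-ApiKey", api_key)
    with urllib.request.urlopen(request, timeout=5) as resp:
        return resp.status
```

Extra keyword arguments, as in `clef_event("Load finished", rows=1200)`, become searchable properties in Seq, which is the main benefit of structured logging over plain text lines.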
180+
181+
182+
## Configuration
183+
184+
Most services read configuration from `.env` files or from environment entries defined inline in their compose files. Keep secrets out of version control.
185+
186+
### Environment variables
187+
188+
- AI Agents (`ai-agents/.env`)
189+
- `WEBUI_SECRET_KEY` — optional secret for Open WebUI (set if enabling auth)
190+
- RStudio (`rstudio-server/server/.env`)
191+
- `RSTUDIO_PASSWORD` — password for the `rstudio` user
192+
- MinIO/RabbitMQ (`minio/server/.env`)
193+
- `MINIO_ROOT_PASSWORD` — MinIO root password
194+
- `RABBITMQ_ROOT_PASSWORD` — RabbitMQ admin password
195+
- `PRIMARY_DOMAIN` — optional, used by Traefik labels
196+
- Seq client (`seq/client/config.yml`)
197+
- `logger_settings.seq.server_url`, `api_key` — endpoint and API key
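
A quick pre-flight check can catch a missing or placeholder secret before `docker compose` does. The following is a minimal sketch, assuming simple `KEY=value` lines as in the examples above; `read_env` and `missing_keys` are hypothetical helpers, and treating values that start with `<` as unset is only a heuristic for placeholders such as `<strong-password>`.

```python
from pathlib import Path


def read_env(path) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(path, required) -> list[str]:
    """Return the required keys that are absent, empty, or still placeholders."""
    env = read_env(path)
    return [key for key in required if not env.get(key) or env[key].startswith("<")]
```

For the MinIO stack, `missing_keys("minio/server/.env", ["MINIO_ROOT_PASSWORD", "RABBITMQ_ROOT_PASSWORD"])` should return an empty list before you run `run.sh`.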
198+
199+
200+
## Ports
201+
202+
- Open WebUI: 3000 (host)
203+
- Ollama: 11434 (host)
204+
- RStudio Server: 8787 (bound to 127.0.0.1)
205+
- MinIO S3 API: 9000 (host)
206+
- RabbitMQ: 5672 (AMQP), 15672 (management UI)
207+
- Seq: not exposed by default (Traefik labels included; add ports or a proxy)
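
To see at a glance which of these ports are accepting connections on the local machine, a small TCP probe is enough. This is an illustrative sketch, not a script shipped in the repository: `DEFAULT_PORTS` simply restates the list above, and `is_listening` and `port_report` are hypothetical names.

```python
import socket

# Host ports taken from the list above; adjust if you changed the compose mappings.
DEFAULT_PORTS = {
    "Open WebUI": 3000,
    "Ollama": 11434,
    "RStudio Server": 8787,
    "MinIO S3 API": 9000,
    "RabbitMQ AMQP": 5672,
    "RabbitMQ management UI": 15672,
}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def port_report(ports=DEFAULT_PORTS) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port) for name, port in ports.items()}
```

A port that reports `True` before you start a stack is already taken by another process, which is the situation described under Troubleshooting.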
208+
209+
210+
## Troubleshooting
211+
212+
- Port already in use
213+
- Check with `lsof -i :PORT`, then stop the conflicting process or change the port mapping in the compose file.
214+
- Docker permission denied
215+
- Add your user to the `docker` group: `sudo usermod -aG docker $USER && newgrp docker`.
216+
- Models downloading slowly (AI Agents)
217+
- Watch `model-downloader` logs; ensure adequate bandwidth and disk space.
218+
- RStudio login issues
219+
- Ensure `.env` has `RSTUDIO_PASSWORD` and the service is reachable on 127.0.0.1:8787.
220+
- MinIO/RabbitMQ not starting
221+
- Verify `.env` secrets and that the `backend` Docker network exists (created by the run script).
222+
223+
## Development helpers (Ubuntu 24.04)
224+
225+
Folder: `development/` — curated scripts for setting up a workstation/server.
226+
227+
- `install_docker.sh` — Docker Engine + Compose
228+
- `install_conda.sh` — Miniconda
229+
- `uv_and_ruff.sh` — Python packaging (uv) and linting (ruff)
230+
- `install_dotnet_tools.sh` — .NET SDK
231+
- `install_azure_tools.sh` — Azure CLI/tools
232+
- `git_configure.sh` — Git username/email and quality-of-life settings
233+
- plus others for CI/CD, system design, and optional tools
234+
235+
General OS tips live in `ubuntu-os/README.md` (packages, disks, network, users, and more).
236+
32237

33-
## Installation
238+
## Notes on reverse proxy (Traefik)
34239

35-
1. sh [install_core.sh](/src/install_core.sh) &>logs/install_core.log
36-
2. sh [install_docker.sh](/src/install_docker.sh) &>logs/install_docker.log
37-
3. sh [git_configure.sh](/src/git_configure.sh) &>logs/git_configure.log <sup>1</sup>
38-
4. sh [install_dotnet_tools.sh](/src/install_dotnet_tools.sh) &>logs/install_dotnet_core.log
39-
5. sh [install_ds_python.sh](/src/install_ds_python.sh) > log/install_ds_python.log
40-
6. sh [install_deep_learning.sh](/src/install_deep_learning.sh) &>logs/install_deep_learning.log
41-
7. sh [install_r_env.sh](/src/install_r_env.sh) &>logs/install_r.log
42-
8. pt [install_r_packages.R](/src/install_r_packages.R) &>logs/install_r_packages.log <sup>1</sup>
43-
9. sh [install_lightgbm.sh](/src/install_lightgbm.sh) &>logs/install_lightgbm.log <sup>1</sup>
240+
Some compose files include labels for Traefik and refer to `PRIMARY_DOMAIN` in a `.env` file. If you're not running a reverse proxy, use the published ports listed above; for services that publish none by default (such as Seq), add explicit `ports:` mappings to the compose file.
44241

45-
<sup>1</sup> Install under RStudio user
46242

47-
### Tests
243+
## Contributing
48244

49-
1. [Keras installation tests](/tests/keras_install_tests.R)
50-
1. [LightGBM installation test](/tests/lightgbm_install_tests.R)
51-
1. [Jupyter Notebook installation tests](/tests/hello_jupyter.ipynb)
245+
Contributions are welcome! If you spot an issue or have an improvement:
246+
- Open an issue describing the problem or proposal
247+
- For changes, fork the repo and open a PR with a concise description and testing notes
248+
- Keep changes focused and documented; prefer small, reviewable PRs
