Commit 2e5a555

update docs to PhiloLogic5
1 parent 439e78e commit 2e5a555

7 files changed

Lines changed: 344 additions & 77 deletions

Dockerfile

Lines changed: 11 additions & 13 deletions
@@ -2,29 +2,27 @@ FROM ubuntu:24.04
22

33
ENV DEBIAN_FRONTEND=noninteractive
44

5-
# Install Python 3.12
6-
RUN apt-get update && apt-get install -y python3 python3-venv python3-dev curl python3-pip
7-
8-
# Install dependencies (no Apache needed — Gunicorn serves directly)
9-
RUN apt-get update && apt-get upgrade -y && \
10-
apt-get install -y --no-install-recommends libxml2-dev libxslt-dev zlib1g-dev libgdbm-dev liblz4-tool brotli ripgrep gcc make wget sudo && \
11-
apt-get clean && rm -rf /var/lib/apt
12-
13-
# Install PhiloLogic (nvm and Node.js are installed by install.sh)
5+
# Install system dependencies
6+
RUN apt-get update && \
7+
apt-get install -y --no-install-recommends \
8+
libxml2-dev libxslt-dev zlib1g-dev \
9+
liblz4-tool ripgrep curl sudo && \
10+
apt-get clean && rm -rf /var/lib/apt/lists/*
11+
12+
# Install PhiloLogic (uv, nvm, Node.js and Python are installed by install.sh)
1413
COPY . /PhiloLogic5
1514
WORKDIR /PhiloLogic5
16-
# Delete the tests directory to reduce image size
1715
RUN rm -rf tests
1816
RUN ./install.sh && mkdir -p /var/www/html/philologic
1917

2018
# Configure global variables
2119
RUN sed -i 's/database_root = None/database_root = "\/var\/www\/html\/philologic\/"/' /etc/philologic/philologic5.cfg && \
2220
sed -i 's/url_root = None/url_root = "http:\/\/localhost\/philologic\/"/' /etc/philologic/philologic5.cfg
2321

24-
COPY docker_apache_restart.sh /autostart.sh
25-
RUN chmod +x /autostart.sh
22+
COPY docker_entrypoint.sh /docker_entrypoint.sh
23+
RUN chmod +x /docker_entrypoint.sh
2624

2725
WORKDIR /
2826

2927
EXPOSE 8000
30-
ENTRYPOINT ["/autostart.sh"]
28+
ENTRYPOINT ["/docker_entrypoint.sh"]
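
The two `sed` commands in the "Configure global variables" step replace the `None` placeholders in `/etc/philologic/philologic5.cfg`. A rough Python equivalent is sketched below. The helper name `set_cfg_value` and the sample text are illustrative only and are not part of PhiloLogic5; unlike the `sed` expressions, the pattern here is anchored to a whole line.

```python
import re


def set_cfg_value(cfg_text: str, key: str, value: str) -> str:
    """Replace a `key = None` placeholder with a quoted string value.

    Hypothetical helper: it approximates the sed substitutions in the
    Dockerfile and returns the text unchanged if no placeholder is found.
    """
    pattern = rf"^{re.escape(key)} = None$"
    replacement = f'{key} = "{value}"'
    # A function replacement keeps backslashes in `value` from being
    # interpreted as regex group references.
    return re.sub(pattern, lambda _match: replacement, cfg_text, flags=re.MULTILINE)


sample = "database_root = None\nurl_root = None\n"
sample = set_cfg_value(sample, "database_root", "/var/www/html/philologic/")
sample = set_cfg_value(sample, "url_root", "http://localhost/philologic/")
print(sample)
```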

docs/index.md

Lines changed: 3 additions & 3 deletions
@@ -17,6 +17,6 @@ description, you can refer to [our blog](<http://artfl.blogspot.com>).
1717

1818
### IMPORTANT
1919

20-
- PhiloLogic5 will only work on Unix-based systems (Linux, \*BSD) though MacOS is not supported and guaranteed to work.
21-
- PhiloLogic5 will only run on the Apache Webserver
22-
- PhiloLogic5 has only been tested on Python 3.10 and up
20+
- PhiloLogic5 runs on Linux (Ubuntu, Debian, RHEL, etc.). macOS is not officially supported.
21+
- PhiloLogic5 uses Gunicorn as its WSGI server, with Apache or Nginx as a reverse proxy.
22+
- Python 3.11 or higher is required.
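
The Python requirement above can be checked from a script. This is a minimal sketch; the function name `meets_minimum` is an assumption made for illustration.

```python
import sys

MINIMUM_PYTHON = (3, 11)  # taken from the requirement stated above


def meets_minimum(version=None, minimum=MINIMUM_PYTHON) -> bool:
    """Return True if `version` (default: the running interpreter) is new enough."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor) so that patch releases do not matter.
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    print("Python OK" if meets_minimum() else "Python 3.11 or higher is required")
```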

docs/installation.md

Lines changed: 206 additions & 35 deletions
@@ -2,57 +2,228 @@
22
title: Installation
33
---
44

5-
Installing PhiloLogic consists of two steps:
5+
## Overview
66

7-
1. Run the install.sh script which installs PhiloLogic5 in `/var/lib/philologic5/`
8-
2. Set up a directory in your web server to serve databases from
9-
3. Edit /etc/philologic/philologic5.cfg according to your machine
7+
PhiloLogic5 runs as a Gunicorn WSGI application behind a reverse proxy (Apache or Nginx). The installer handles Python, Node.js, and all Python dependencies automatically via [uv](https://docs.astral.sh/uv/).
108

11-
You can find more detailed installation instructions for specific OSes here:
9+
Installation steps:
1210

13-
- [RedHat (and CentOS)](specific_installations/redhat_installation.md)
14-
- [Ubuntu](specific_installations/ubuntu_installation.md)
11+
1. Install system dependencies
12+
2. Run `install.sh`
13+
3. Configure `/etc/philologic/philologic5.cfg`
14+
4. Set up your web server as a reverse proxy
15+
5. Enable and start the Gunicorn service
1516

16-
### Downloading
17+
## System Requirements
1718

18-
IMPORTANT: Do not install from the master branch on github: this is the development branch and is in no way garanteed to be stable
19+
- Linux (Ubuntu 22.04+, Debian 12+, RHEL 9+, or similar)
20+
- Python 3.11+ (the installer downloads its own Python via uv by default)
21+
- Root/sudo access for installing to `/var/lib/philologic5/`
1922

20-
You can find a copy of the latest version of PhiloLogic5 [here](../../../releases/).
23+
## System Dependencies
2124

22-
### Prerequisites
25+
### Ubuntu / Debian
2326

24-
- Apache Webserver
25-
- Python 3.10 and up
26-
- LZ4
27-
- Brotli (for Apache compression)
28-
- Ripgrep
27+
```bash
28+
sudo apt-get update
29+
sudo apt-get install -y \
30+
libxml2-dev libxslt-dev zlib1g-dev \
31+
liblz4-tool ripgrep curl
32+
```
2933

30-
### Installing
34+
### RHEL / CentOS / Fedora
3135

32-
Installing PhiloLogic's libraries requires administrator privileges.
33-
Just run the install.sh in the top level directory of the PhiloLogic4 you downloaded to install PhiloLogic and its dependencies:
36+
```bash
37+
sudo dnf install -y \
38+
libxml2-devel libxslt-devel zlib-devel \
39+
lz4 ripgrep curl
40+
```
3441

35-
`./install.sh`
42+
**Notes:**
43+
- `libxml2-dev`/`libxslt-dev`/`zlib1g-dev`: required for building `lxml` (XML parsing)
44+
- `liblz4-tool`/`lz4`: used at database load time for compressing word indexes
45+
- `ripgrep`: used at database load time for filtering parser output
46+
- `curl`: used by the installer to download [uv](https://docs.astral.sh/uv/) and [nvm](https://github.com/nvm-sh/nvm)
47+
- The installer downloads its own Python via uv, so system Python packages are not required
3648

37-
You can specify a different version of Python with the `-p` flag followed by the python executable to use, e.g.:
38-
`./install.sh -p python3.12`
49+
## Installing PhiloLogic
3950

40-
### <a name="global-config"></a>Global Configuration
51+
Clone or download the repository, then run the install script:
4152

42-
The installer creates a file in `/etc/philologic/philologic5.cfg` which contains several important global variables:
53+
```bash
54+
git clone https://github.com/ARTFL-Project/PhiloLogic5.git
55+
cd PhiloLogic5
56+
sudo ./install.sh
57+
```
4358

44-
- `database_root` defines the filesytem path to the root web directory for your PhiloLogic install such as `/var/www/html/philologic`. Make sure your user or group has full write permissions to that directory.
45-
- `url_root` defines the URL path to the same root directory for your philologic install, such as http://localhost/philologic/
59+
### Installer Options
4660

47-
### Setting up PhiloLogic Web Application
61+
| Flag | Description |
62+
|------|-------------|
63+
| `-p VERSION` | Python version to use (default: `3.12`) |
64+
| `-t` | Install transformer support (includes spacy-transformers with CUDA) |
4865

49-
Each new PhiloLogic database you load, containing one or more files, will be served
50-
by a its own dedicated copy of PhiloLogic web application.
51-
By convention, this database and web app reside together in a directory
52-
accessible via an HTTP server configured to run Python CGI scripts.
66+
Examples:
5367

54-
Make sure you configure the `/etc/philologic/philologic5.cfg` appropriately.
68+
```bash
69+
# Use Python 3.13
70+
sudo ./install.sh -p 3.13
5571

56-
Configuring your web server is outside of the scope of this document; but the web install
57-
does come with a preconfigured .htaccess file that allows you to run the Web App.
58-
Therefore, you need to make sure your server is configured to allow htaccess files.
72+
# Install with transformer support
73+
sudo ./install.sh -t
74+
```
75+
76+
### What the Installer Does
77+
78+
The installer:
79+
80+
1. Installs [uv](https://docs.astral.sh/uv/) (if not already present)
81+
2. Downloads the specified Python version via uv
82+
3. Creates a virtual environment at `/var/lib/philologic5/philologic_env/`
83+
4. Installs [nvm](https://github.com/nvm-sh/nvm) and Node.js 22 (for building the web app)
84+
5. Builds and installs the PhiloLogic Python package with all dependencies (numpy, numba, lmdb, spacy, etc.)
85+
6. Installs Gunicorn and Falcon
86+
7. Copies the web application to `/var/lib/philologic5/web_app/`
87+
8. Installs the `philoload5` command to `/usr/local/bin/`
88+
9. Creates the global config at `/etc/philologic/philologic5.cfg` (if it doesn't exist)
89+
10. Installs a systemd service file for Gunicorn
90+
91+
### Installation Layout
92+
93+
```
94+
/var/lib/philologic5/
95+
├── philologic_env/ # Python virtual environment
96+
├── web_app/ # Web application (Falcon + JS frontend)
97+
│ ├── app.py # WSGI entry point
98+
│ ├── gunicorn.conf.py # Gunicorn configuration
99+
│ └── scripts/ # API endpoint scripts
100+
├── nvm/ # Node.js (used at load time for building the frontend)
101+
├── bin/
102+
│ └── philoload5 # Database loading command
103+
└── numba_cache/ # JIT compilation cache
104+
105+
/etc/philologic/
106+
└── philologic5.cfg # Global configuration
107+
108+
/usr/local/bin/
109+
└── philoload5 # Symlink to loader script
110+
```
111+
112+
## Global Configuration
113+
114+
Edit `/etc/philologic/philologic5.cfg` to set two required paths:
115+
116+
```python
117+
# Filesystem path where databases will be stored
118+
database_root = "/var/www/html/philologic5/"
119+
120+
# URL root matching the database_root location
121+
url_root = "http://localhost/philologic5/"
122+
```
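
The snippet above uses Python assignment syntax. Assuming the whole file consists of simple `name = literal` assignments (an assumption, not something this document confirms), it can be read without executing it. The helper names `read_cfg` and `unset_required` are hypothetical.

```python
import ast


def read_cfg(text: str) -> dict:
    """Collect top-level `name = literal` assignments from config text.

    Assumes plain Python literals, as in the snippet above; any other
    statement is skipped rather than executed.
    """
    values = {}
    for node in ast.parse(text).body:
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        ):
            try:
                values[node.targets[0].id] = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                continue  # right-hand side is not a literal; ignore it
    return values


def unset_required(cfg: dict, required=("database_root", "url_root")) -> list:
    """Return the required keys that are missing or still None."""
    return [key for key in required if cfg.get(key) is None]


example = '''
database_root = "/var/www/html/philologic5/"
url_root = None
'''
print(unset_required(read_cfg(example)))
```

Parsing with `ast` instead of `exec` means a malformed or unexpected config line cannot run code during the check.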
123+
124+
Make sure the `database_root` directory exists and is writable by your user:
125+
126+
```bash
127+
sudo mkdir -p /var/www/html/philologic5
128+
sudo chown -R $USER:$USER /var/www/html/philologic5
129+
```
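
To confirm the result of the commands above, a short check can be run as the user who will build databases. This is a sketch; `check_database_root` is a hypothetical helper, and the path in the `__main__` block is the example value used in this document.

```python
import os


def check_database_root(path: str) -> list:
    """Return a list of human-readable problems with `path` (empty if none)."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
    elif not os.access(path, os.W_OK | os.X_OK):
        # Creating entries in a directory needs both write and search permission.
        problems.append(f"{path} is not writable by the current user")
    return problems


if __name__ == "__main__":
    for problem in check_database_root("/var/www/html/philologic5"):
        print(problem)
```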
130+
131+
## Web Server Configuration
132+
133+
PhiloLogic5 is served by Gunicorn, which listens on a Unix socket. A reverse proxy (Apache or Nginx) is needed to forward HTTP requests to Gunicorn.
134+
135+
### Starting Gunicorn
136+
137+
```bash
138+
sudo systemctl enable philologic5-gunicorn
139+
sudo systemctl start philologic5-gunicorn
140+
```
141+
142+
Check status:
143+
144+
```bash
145+
sudo systemctl status philologic5-gunicorn
146+
journalctl -u philologic5-gunicorn -f # follow logs
147+
```
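
Besides `systemctl status`, Gunicorn can be probed directly over its Unix socket before any reverse proxy is configured. The sketch below assumes the socket path used in the proxy examples in this document (`/var/run/philologic/gunicorn.sock`); the class and function names are illustrative. Which status code the application returns for a given path is not specified here, so any HTTP response should be read only as "Gunicorn is accepting connections".

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection variant that connects to a Unix domain socket."""

    def __init__(self, socket_path: str, timeout: float = 10):
        # The host name is only used for the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def probe(socket_path: str, url_path: str = "/philologic5/") -> int:
    """Send one GET request over the socket and return the HTTP status code."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", url_path)
        return conn.getresponse().status
    finally:
        conn.close()


if __name__ == "__main__":
    print(probe("/var/run/philologic/gunicorn.sock"))
```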
148+
149+
### Apache
150+
151+
Enable the required modules:
152+
153+
```bash
154+
sudo a2enmod proxy proxy_http
155+
sudo systemctl restart apache2
156+
```
157+
158+
Add to your `<VirtualHost>` block:
159+
160+
```apache
161+
ProxyTimeout 300
162+
<Location "/philologic5">
163+
ProxyPass unix:/var/run/philologic/gunicorn.sock|http://localhost/philologic5 flushpackets=on
164+
ProxyPassReverse unix:/var/run/philologic/gunicorn.sock|http://localhost/philologic5
165+
SetEnv no-gzip 1
166+
SetEnv force-no-buffering 1
167+
</Location>
168+
```
169+
170+
### Nginx
171+
172+
Add to your `server` block:
173+
174+
```nginx
175+
location /philologic5/ {
176+
proxy_pass http://unix:/var/run/philologic/gunicorn.sock;
177+
proxy_set_header Host $host;
178+
proxy_set_header X-Real-IP $remote_addr;
179+
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
180+
proxy_set_header X-Forwarded-Proto $scheme;
181+
proxy_read_timeout 300s;
182+
proxy_buffering off;
183+
}
184+
```
185+
186+
Adjust the URL prefix (`/philologic5`) to match your `url_root` setting in `/etc/philologic/philologic5.cfg`.
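
The prefix used in the proxy configuration is the path component of `url_root`. A small sketch for deriving it follows; `proxy_prefix` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def proxy_prefix(url_root: str) -> str:
    """Return the path part of `url_root` without a trailing slash."""
    path = urlparse(url_root).path.rstrip("/")
    return path or "/"  # a bare host maps to the site root


print(proxy_prefix("http://localhost/philologic5/"))  # prints /philologic5
```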
187+
188+
## Docker
189+
190+
A `Dockerfile` is included for containerized deployment:
191+
192+
```bash
193+
docker build -t philologic5 .
194+
docker run -p 8000:8000 -v /path/to/databases:/var/www/html/philologic philologic5
195+
```
196+
197+
In the container, Gunicorn binds directly to port 8000 (no reverse proxy needed inside the container).
198+
199+
## Tuning Gunicorn
200+
201+
The default configuration is in `/var/lib/philologic5/web_app/gunicorn.conf.py`. Key settings:
202+
203+
| Setting | Default | Description |
204+
|---------|---------|-------------|
205+
| `workers` | `min(cpu_count, 4)` | Number of worker processes |
206+
| `threads` | `4` | Threads per worker |
207+
| `timeout` | `300` | Request timeout (seconds) |
208+
| `max_requests` | `1000` | Requests before worker recycling |
209+
| `preload_app` | `True` | Preload app for memory efficiency |
210+
211+
The installer preserves any customizations to `gunicorn.conf.py` across reinstalls.
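
For orientation, the defaults in the table above could be written in a `gunicorn.conf.py` as follows. This is a sketch that mirrors the table only; the file shipped with PhiloLogic5 may be organised differently and may contain further settings, such as the bind address.

```python
# Sketch of a gunicorn.conf.py using the defaults listed in the table above.
# Only the values come from the table; the layout is an assumption.
import multiprocessing

workers = min(multiprocessing.cpu_count(), 4)  # number of worker processes
threads = 4                                    # threads per worker
timeout = 300                                  # request timeout in seconds
max_requests = 1000                            # recycle a worker after this many requests
preload_app = True                             # load the application before forking workers
```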
212+
213+
## Upgrading
214+
215+
To upgrade an existing installation, pull the latest code and rerun the installer:
216+
217+
```bash
218+
cd PhiloLogic5
219+
git pull
220+
sudo ./install.sh
221+
```
222+
223+
The installer removes and recreates `/var/lib/philologic5/` but preserves your `gunicorn.conf.py` customizations and `/etc/philologic/philologic5.cfg`. Existing databases are not affected because they live under `database_root`.
224+
225+
After upgrading, restart Gunicorn:
226+
227+
```bash
228+
sudo systemctl restart philologic5-gunicorn
229+
```
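
The upgrade notes state that the installer keeps `gunicorn.conf.py` and `philologic5.cfg`. Copying both files aside before running the installer is an optional extra precaution, not a step this documentation requires. The sketch below uses a hypothetical helper, `backup_files`, and a backup directory of your choosing.

```python
import shutil
from pathlib import Path

# Paths named in the upgrade notes above.
FILES_TO_KEEP = [
    "/var/lib/philologic5/web_app/gunicorn.conf.py",
    "/etc/philologic/philologic5.cfg",
]


def backup_files(files, backup_dir):
    """Copy each existing file into `backup_dir` and return the copies made."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for name in files:
        source = Path(name)
        if source.is_file():
            # copy2 also preserves timestamps and permission bits.
            copies.append(Path(shutil.copy2(source, backup_dir / source.name)))
    return copies
```

Calling `backup_files(FILES_TO_KEEP, "philologic5_backup")` before `sudo ./install.sh` leaves a copy of both files in the current directory; files that do not exist are skipped.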
Lines changed: 63 additions & 7 deletions
@@ -1,13 +1,69 @@
11
---
2-
title: Installing PhiloLogic on RedHat (and CentOS)
2+
title: Installing PhiloLogic5 on RedHat (and CentOS)
33
---
44

5+
Tested on RHEL 9 and CentOS Stream 9.
56

6-
* Run install script
7+
### 1. Install System Dependencies
78

8-
`./install.sh`
9+
```bash
10+
sudo dnf install -y \
11+
libxml2-devel libxslt-devel zlib-devel \
12+
lz4 ripgrep curl
13+
```
914

10-
* Configure Apache
11-
* Make sure your prefered webspace allows full override for htaccess files: `AllowOverride All`
12-
* Make sure the correct permissions are set on the folder dedicated to PhiloLogic databases,
13-
i.e. write access for the user/group that will be building databases.
15+
### 2. Run the Installer
16+
17+
```bash
18+
cd PhiloLogic5
19+
sudo ./install.sh
20+
```
21+
22+
### 3. Configure PhiloLogic
23+
24+
Edit `/etc/philologic/philologic5.cfg`:
25+
26+
```python
27+
database_root = "/var/www/html/philologic5/"
28+
url_root = "http://localhost/philologic5/"
29+
```
30+
31+
Create the database directory:
32+
33+
```bash
34+
sudo mkdir -p /var/www/html/philologic5
35+
sudo chown -R $USER:$USER /var/www/html/philologic5
36+
```
37+
38+
### 4. Start Gunicorn
39+
40+
```bash
41+
sudo systemctl enable philologic5-gunicorn
42+
sudo systemctl start philologic5-gunicorn
43+
```
44+
45+
### 5. Configure Apache as Reverse Proxy
46+
47+
```bash
48+
sudo dnf install -y httpd mod_proxy_html
49+
```
50+
51+
Add to `/etc/httpd/conf.d/philologic5.conf`:
52+
53+
```apache
54+
ProxyTimeout 300
55+
<Location "/philologic5">
56+
ProxyPass unix:/var/run/philologic/gunicorn.sock|http://localhost/philologic5 flushpackets=on
57+
ProxyPassReverse unix:/var/run/philologic/gunicorn.sock|http://localhost/philologic5
58+
SetEnv no-gzip 1
59+
SetEnv force-no-buffering 1
60+
</Location>
61+
```
62+
63+
Restart Apache:
64+
65+
```bash
66+
sudo systemctl restart httpd
67+
```
68+
69+
Make sure the correct permissions are set on the database directory — the user building databases needs write access.
