Skip to content

Feature/production containers#87

Merged
XanderVertegaal merged 45 commits into
developfrom
feature/production-containers
Apr 20, 2026
Merged

Feature/production containers#87
XanderVertegaal merged 45 commits into
developfrom
feature/production-containers

Conversation

@XanderVertegaal

@XanderVertegaal XanderVertegaal commented Mar 27, 2026

Copy link
Copy Markdown
Contributor

This PR adds production-ready containers to our Docker Compose setup, inspired by ParsePort and Docker's own guide for containerizing an Angular application. As discussed, it also adds an Nginx container through which all communications are funneled.

Clarifications to some of the decisions I made have been added to the code (where they are relevant), and I am happy to provide context and further explanation where desired. There are a few things I would like you to consider with special attention.

  • Instead of psycopg2, the backend uses psycopg2-binary. See my comment at requirements.in for more details.
  • settings.py has la-backend in ALLOWED_HOSTS and 'localhost' in CSRF_TRUSTED_ORIGINS. Since this is run as part of a Docker network and the backend should now no longer be contacted directly, I think this should be safe, but I could be wrong.
  • glue.py is no longer used. I think this is fine, since the backend is no longer serving the frontend static files and this setup intends to use settings.py file in backend/langpro_annotator.

If you agree, the following needs to be done (as part of separate issues/PRs, not here):

  • Hide the language select menu in the frontend until (if at all) we implement proper internationalization.
  • Add Angular and Nginx container log output to /logs folder.

I have tried to deploy this on the test server but I think it requires initializing the langpro-container and LangPro submodules, which I was unable to do. (It asks for my Git credentials, which are not accepted.)

Other containers can use this to refer to whatever container variant (dev or prod) turns out to be running.
Not necessary, but less unwieldy than 'langpro-annotator-langpro-backend-prod'.
This will be replaced in the near future.
Comment thread backend/Dockerfile
# Copy project files.
COPY . .

CMD python manage.py check && \

@XanderVertegaal XanderVertegaal Mar 27, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this to the compose file so this Dockerfile can be used for both prod (with Gunicorn) and dev (Django dev server). The dev container does not need Gunicorn, of course, but that is a small price to pay for having one container that serves both profiles.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of using the same image for development and production. In development, I want all images to derive from buildpack-deps in order to save disk space. In production, it is preferable to go for leaner images. If I have to start using those leaner images in development, I will end up with lots of disjoint images on disk that are small in themselves but that effectively make me pay multiple times for software that I tend to use in nearly every project.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken. I definitely support the idea of splitting them up.

Comment thread backend/requirements.in
django-livereload-server
django-revproxy>=0.10.0
psycopg2
psycopg2-binary

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using psycopg2, the container tries to build its own instance from scratch. The slim container does not have the tools for this but suggests using the psycop2-binary package, which seems to work well and is much faster. I am not what the difference is between installing the binary or building it in the container. Please review this carefully. If we need the non-binary package, we should probably revert to the non-slim container.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.psycopg.org/docs/install.html#psycopg-vs-psycopg-binary

If we can make it work with the source package, that is probably better. The slimmer the image, the more tools you have to install yourself in the Dockerfile, generally.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I'll get it to work with the source package! 👍

Comment thread frontend/Dockerfile Outdated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This file is not really deleted, but renamed to Dockerfile.dev and subsequently modified.)

Comment thread docker-compose.yml
services:
nginx:
container_name: la-nginx
restart: unless-stopped

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use restart: unless-stopped for prod services, so we can actually stop them if we want to and e.g. read the logs. Dev services get restart: no.

Comment thread docker-compose.yml Outdated
volumes:
- ./:/usr/src/app
- ./backend:/usr/src/app
- ./logs/django:/usr/src/app/logs

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This puts the Gunicorn logs in a directory logs/django outside of the containers so we can read them even on the production server.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe provide the host directory through an environment as well so we can respect the $WEBROOT/logs convention.

Comment thread docker-compose.yml Outdated
- ./frontend:/usr/src/app
- frontend-node-modules:/usr/src/app/node_modules
- frontend-angular-cache:/usr/src/app/.angular
command: ng serve --host 0.0.0.0 --disable-host-check --poll 200

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The poll flag is added to enable live reloading.

The container will automatically create a DB with the provided config (host, user, password).
Comment thread backend/Dockerfile
@XanderVertegaal XanderVertegaal added the enhancement New feature or request label Mar 27, 2026
@XanderVertegaal XanderVertegaal marked this pull request as ready for review March 27, 2026 16:33

@jgonggrijp jgonggrijp left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very fond of the approach taken and I have some questions and concerns. That being said, having a way to deploy is currrently more important than having an approach that I like. I suggest that you stick with your current approach for now and only adjust it enough to make it work. We can improve on aspects later, deciding whether to change something on a case by case basis.

  • settings.py has la-backend in ALLOWED_HOSTS and 'localhost' in CSRF_TRUSTED_ORIGINS. Since this is run as part of a Docker network and the backend should now no longer be contacted directly, I think this should be safe, but I could be wrong.

This bit is a bit tricky. You used proxy_pass in the nginx config, which I think is a forward proxy. If I recall correctly, the target of a forward proxy still sees the hostname of the client, which would mean that your current setup wouldn't work in deployment where the hostname of the client isn't localhost. You could go with a reverse proxy instead, or use the deployment module to prepare an environment file that includes the hostname.

  • glue.py is no longer used. I think this is fine, since the backend is no longer serving the frontend static files and this setup intends to use settings.py file in backend/langpro_annotator.

This could also potentially go wrong. Currently, backend/langpro_annotator/index.py:index is injecting the CSRF cookie in the index page. Without that mechanism, the frontend is going to need another mechanism to obtain that cookie.

If you agree, the following needs to be done (as part of separate issues/PRs, not here):

  • Hide the language select menu in the frontend until (if at all) we implement proper internationalization.

  • Add Angular and Nginx container log output to /logs folder.

I agree.

I have tried to deploy this on the test server but I think it requires initializing the langpro-container and LangPro submodules, which I was unable to do. (It asks for my Git credentials, which are not accepted.)

You have to add the dhlabdevelopers account to those repos. See our internal documentation on deployment.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your wish to have one settings.py for development and production, but production settings usually differ more from the development settings than you've accounted for here. You can also see this on the langpro-annotator-prod-containers branch that you created in the deployment repo.

You can a best of both worlds: inject a separate settings.py in deployment. In the injected settings, import * from the settings in the source code. This way, you only have to add/override keys that differ from the development defaults.

Comment thread backend/Dockerfile
Comment thread backend/Dockerfile
# Copy project files.
COPY . .

CMD python manage.py check && \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of using the same image for development and production. In development, I want all images to derive from buildpack-deps in order to save disk space. In production, it is preferable to go for leaner images. If I have to start using those leaner images in development, I will end up with lots of disjoint images on disk that are small in themselves but that effectively make me pay multiple times for software that I tend to use in nearly every project.

Comment thread backend/requirements.in
django-livereload-server
django-revproxy>=0.10.0
psycopg2
psycopg2-binary

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.psycopg.org/docs/install.html#psycopg-vs-psycopg-binary

If we can make it work with the source package, that is probably better. The slimmer the image, the more tools you have to install yourself in the Dockerfile, generally.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better to supply this information through the deployment module. That can be postponed through an issue ticket, as far as I'm concerned.

Comment thread frontend/Dockerfile.dev
WORKDIR /usr/src/app

# Copy package.json and yarn.lock.
COPY package.json yarn.lock ./

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Comment thread frontend/Dockerfile.prod

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well documented.

Comment thread docker-compose.yml
restart: always
volumes:
- postgres-data:/var/lib/postgresql/data
- ./backend/create_db.sql:/docker-entrypoint-initdb.d/langpro.sql

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this line?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The referenced SQL file creates a new DB, but the postgres image does this for you, so it felt superfluous.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The postgres image creates a root user and a default database. The SQL file creates a non-default database and a lesser-privileged user specifically for the application.

https://hub.docker.com/_/postgres#initialization-scripts

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But don't we already create a non-default database and user with the env vars in the service definition? Also, the script has hardcoded (and checked in) username + password + database name, so we'd have to change those or make them dynamic.

        environment:
            - POSTGRES_USER=$POSTGRES_USER
            - POSTGRES_PASSWORD=$POSTGRES_PASSWORD
            - POSTGRES_DB=$POSTGRES_DB

Comment thread docker-compose.yml Outdated
volumes:
- ./:/usr/src/app
- ./backend:/usr/src/app
- ./logs/django:/usr/src/app/logs

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe provide the host directory through an environment as well so we can respect the $WEBROOT/logs convention.

Comment thread docker-compose.yml

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder whether the overlap between dev and prod is sufficient to justify profiles. I have a suspicion it would be simpler if we just had separate compose files for dev and prod, with a third compose file for the common services (nginx and postgres) that the other two include.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was quite torn between these two options but went for profiles in the end since I am most comfortable with them. I will keep them for now but create an issue for a refactor to separate compose files.

@XanderVertegaal XanderVertegaal merged commit c6a54de into develop Apr 20, 2026
1 check passed
@XanderVertegaal XanderVertegaal deleted the feature/production-containers branch April 20, 2026 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants