Commit 6e1ebbb

v0.4 init: File encryption, integrity report, deletion protection, job monitoring (#187)
* open-core setup, adding enterprise package
* enterprise: Audit log API, UI
* Audit-log docs
* feat: Integrity report, allowing users to verify the integrity of archived emails and their attachments.
  - When an email is archived, Open Archiver calculates a unique cryptographic signature (a SHA256 hash) for the email's raw `.eml` file and for each of its attachments. These signatures are stored in the database alongside the email's metadata.
  - The integrity check feature recalculates these signatures for the stored files and compares them to the original signatures stored in the database. This lets you verify that the content of your archived emails has not been altered, corrupted, or tampered with since the moment they were archived.
  - Add docs for the integrity report
* Update `docker-compose.yml` to use a bind mount for Open Archiver data. Fix API rate-limiter warning about trust proxy
* File encryption support
* Scope attachment deduplication to ingestion source

  Previously, attachment deduplication was handled globally by enforcing a unique constraint on the content hash (contentHashSha256) in the `attachments` table. This caused an issue where an attachment from one ingestion source would be incorrectly linked if the same attachment was processed by a different source. This commit refactors the deduplication logic to be scoped on a per-ingestion-source basis.

  Changes:
  - **Schema:** The `attachments` table schema has been updated to include a nullable `ingestionSourceId` column. A composite unique index has been added on `(ingestionSourceId, contentHashSha256)` to enforce per-source uniqueness. The `ingestionSourceId` is nullable to ensure backward compatibility with existing databases.
  - **Ingestion Logic:** The `IngestionService` has been updated to provide the `ingestionSourceId` when inserting attachment records. The `onConflictDoUpdate` clause now targets the new composite key, ensuring that attachments are only considered duplicates if they have the same hash and originate from the same ingestion source.
* Add option to disable deletions

  This commit introduces a new feature that allows admins to disable the deletion of emails and ingestion sources for the entire instance. This is a critical feature for compliance and data retention, as it prevents accidental or unauthorized deletions.

  Changes:
  - **Configuration**: Added an `ENABLE_DELETION` environment variable. If this variable is not set to `true`, all deletion operations will be disabled.
  - **Deletion Guard**: A centralized `checkDeletionEnabled` guard has been implemented to enforce this setting at both the controller and service levels.
  - **Documentation**: The installation guide has been updated to include the new `ENABLE_DELETION` environment variable and its behavior.
  - **Refactor**: The `IngestionService`'s `create` method was refactored to remove unnecessary calls to the `delete` method, simplifying the code.
* Adding position for menu items
* feat(docker): Fix CORS errors

  This commit fixes CORS errors when running the app in Docker by introducing the `APP_URL` environment variable. A CORS policy is set up for the backend to allow only the origin given by `APP_URL`.

  Key changes include:
  - New `APP_URL` and `ORIGIN` environment variables have been added to properly configure CORS and the SvelteKit adapter, making the application's public URL easily configurable.
  - Dockerfiles are updated to copy the entrypoint script, Drizzle config, and migration files into the final image.
  - Documentation and example files (`.env.example`, `docker-compose.yml`) have been updated to reflect these changes.
* feat(attachments): De-duplicate attachment content by content hash

  This commit refactors attachment handling to allow multiple emails within the same ingestion source to reference attachments with identical content (same hash).

  Changes:
  - The unique index on the `attachments` table has been changed to a non-unique index to permit duplicate hash/source pairs.
  - The ingestion logic is updated to first check for an existing attachment with the same hash and source. If found, it reuses the existing record; otherwise, it creates a new one. This maintains storage de-duplication.
  - The email deletion logic now removes the email-attachment link before checking whether the attachment record and its corresponding file can be safely deleted.
* Not filtering out Trash folder
* feat(backend): Add BullMQ dashboard for job monitoring

  This commit introduces a web-based UI for monitoring and managing background jobs using BullMQ.

  Key changes:
  - A new `/api/v1/jobs` endpoint is created, serving the Bull Board dashboard. Access is restricted to authenticated administrators.
  - All BullMQ queue definitions (`ingestion`, `indexing`, `sync-scheduler`) have been centralized into a new `packages/backend/src/jobs/queues.ts` file.
  - Workers and services now import queue instances from this central file, improving code organization and removing redundant queue instantiations.
* Add `ALL_INCLUSIVE_ARCHIVE` environment variable to disable junk filtering
* Using BSL license
* frontend: Responsive design for menu bar, pagination
* License service/module
* Remove demoMode logic
* Formatting code
* Remove enterprise packages
* Fix package.json in packages
* Search page responsive fix

---------

Co-authored-by: Wayne <5291640+ringoinca@users.noreply.github.com>
1 parent 1e048fd commit 6e1ebbb
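
The integrity report described in the commit message comes down to recomputing a SHA-256 digest and comparing it with the one stored at archive time. The following is a minimal sketch of that idea, not the project's implementation: the `StoredFile` shape and the function names are hypothetical, and only Node's built-in `crypto` module is used.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape: the digest saved when the file was archived.
interface StoredFile {
  name: string;
  storedSha256: string; // lowercase hex digest recorded at archive time
}

// Compute the SHA-256 digest of a buffer as lowercase hex.
function sha256Hex(content: Buffer): string {
  return createHash("sha256").update(content).digest("hex");
}

// Recompute the digest of the stored bytes and compare it with the saved one.
function verifyIntegrity(file: StoredFile, content: Buffer): boolean {
  return sha256Hex(content) === file.storedSha256;
}
```

A report would run `verifyIntegrity` once for the raw `.eml` file and once per attachment, then list any entry for which it returns `false`.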

129 files changed
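
The deletion guard from the commit message can be pictured as a single function that every delete path calls first. This sketch borrows the `checkDeletionEnabled` name from the commit message, but its signature, the error type, and the `deleteEmail` caller are assumptions made for illustration.

```typescript
// Hypothetical error type so callers can map the failure to an HTTP status.
class DeletionDisabledError extends Error {
  constructor() {
    super("Deletion is disabled for this instance.");
    this.name = "DeletionDisabledError";
  }
}

// Deletion is allowed only when ENABLE_DELETION is exactly the string "true".
function isDeletionEnabled(env: Record<string, string | undefined>): boolean {
  return env.ENABLE_DELETION === "true";
}

// Assumed signature: throws unless deletion has been explicitly enabled.
function checkDeletionEnabled(env: Record<string, string | undefined>): void {
  if (!isDeletionEnabled(env)) {
    throw new DeletionDisabledError();
  }
}

// Hypothetical service method: the guard runs before any destructive work.
function deleteEmail(env: Record<string, string | undefined>, id: string): string {
  checkDeletionEnabled(env);
  return `deleted ${id}`;
}
```

Treating anything other than the literal `true` as disabled matches the behaviour stated in the commit message: an unset or misspelled value fails closed.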

Lines changed: 9853 additions & 2747 deletions
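
The final attachment de-duplication behaviour described in the commit message (look up an existing record by hash and ingestion source, reuse it if present, otherwise create one) can be sketched with an in-memory map. The `AttachmentStore` class is hypothetical and stands in for the database; only the column names `ingestionSourceId` and `contentHashSha256` come from the commit message.

```typescript
interface Attachment {
  id: number;
  ingestionSourceId: string;
  contentHashSha256: string;
}

// Hypothetical in-memory stand-in for the attachments table.
class AttachmentStore {
  private byKey = new Map<string, Attachment>();
  private nextId = 1;

  // Reuse the record for (source, hash) if it exists; otherwise create it.
  findOrCreate(ingestionSourceId: string, contentHashSha256: string): Attachment {
    const key = `${ingestionSourceId}:${contentHashSha256}`;
    const existing = this.byKey.get(key);
    if (existing) {
      return existing;
    }
    const created: Attachment = { id: this.nextId++, ingestionSourceId, contentHashSha256 };
    this.byKey.set(key, created);
    return created;
  }
}
```

Because the key includes the ingestion source, identical content arriving from two sources yields two records, while repeated content within one source is stored once.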

.env.example

Lines changed: 20 additions & 3 deletions
@@ -4,8 +4,15 @@
 NODE_ENV=development
 PORT_BACKEND=4000
 PORT_FRONTEND=3000
+# The public-facing URL of your application. This is used by the backend to configure CORS.
+APP_URL=http://localhost:3000
+# This is used by the SvelteKit Node adapter to determine the server's public-facing URL.
+# It should always be set to the value of APP_URL.
+ORIGIN=$APP_URL
 # The frequency of continuous email syncing. Default is every minutes, but you can change it to another value based on your needs.
 SYNC_FREQUENCY='* * * * *'
+# Set to 'true' to include Junk and Trash folders in the email archive. Defaults to false.
+ALL_INCLUSIVE_ARCHIVE=false

 # --- Docker Compose Service Configuration ---
 # These variables are used by docker-compose.yml to configure the services. Leave them unchanged if you use Docker services for Postgresql, Valkey (Redis) and Meilisearch. If you decide to use your own instances of these services, you can substitute them with your own connection credentials.
@@ -40,7 +47,9 @@ BODY_SIZE_LIMIT=100M
 # --- Local Storage Settings ---
 # The path inside the container where files will be stored.
 # This is mapped to a Docker volume for persistence.
-# This is only used if STORAGE_TYPE is 'local'.
+# This is not an optional variable, it is where the Open Archiver service stores application data. Set this even if you are using S3 storage.
+# Make sure the user that runs the Open Archiver service has read and write access to this path.
+# Important: It is recommended to create this path manually before installation, otherwise you may face permission and ownership problems.
 STORAGE_LOCAL_ROOT_PATH=/var/data/open-archiver

 # --- S3-Compatible Storage Settings ---
@@ -53,8 +62,18 @@ STORAGE_S3_REGION=
 # Set to 'true' for MinIO and other non-AWS S3 services
 STORAGE_S3_FORCE_PATH_STYLE=false

+# --- Storage Encryption ---
+# IMPORTANT: Generate a secure, random 32-byte hex string for this key.
+# You can use `openssl rand -hex 32` to generate a key.
+# This key is used for AES-256 encryption of files at rest.
+# This is an optional variable, if not set, files will not be encrypted.
+STORAGE_ENCRYPTION_KEY=
+
 # --- Security & Authentication ---

+# Enable or disable deletion of emails and ingestion sources. Defaults to false.
+ENABLE_DELETION=false
+
 # Rate Limiting
 # The window in milliseconds for which API requests are checked. Defaults to 60000 (1 minute).
 RATE_LIMIT_WINDOW_MS=60000
@@ -77,5 +96,3 @@ ENCRYPTION_KEY=
 # Apache Tika Integration
 # ONLY active if TIKA_URL is set
 TIKA_URL=http://tika:9998
-
-
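
To make the `STORAGE_ENCRYPTION_KEY` comments concrete, here is a hedged sketch of how a 32-byte hex key could drive AES-256 encryption at rest. The diff only says "AES-256"; the GCM mode, the `iv | tag | ciphertext` layout, and all function names below are assumptions for illustration, not the project's actual scheme.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// A key from `openssl rand -hex 32` is 64 hex characters, i.e. 32 bytes.
function parseKey(hex: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error("STORAGE_ENCRYPTION_KEY must be 64 hex characters (32 bytes).");
  }
  return Buffer.from(hex, "hex");
}

// Assumed layout: 12-byte IV, 16-byte auth tag, then the ciphertext.
function encryptFile(plain: Buffer, key: Buffer): Buffer {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plain), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Reverse of encryptFile; final() throws if the data or tag was modified.
function decryptFile(blob: Buffer, key: Buffer): Buffer {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const body = blob.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]);
}
```

An authenticated mode is used in this sketch so that a corrupted or tampered file fails to decrypt instead of yielding garbage; when the variable is unset, the diff states that files are stored unencrypted.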

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 4 additions & 3 deletions
@@ -4,17 +4,17 @@ about: Create a report to help us improve
 title: ''
 labels: bug
 assignees: ''
-
 ---

 **Describe the bug**
 A clear and concise description of what the bug is.

 **To Reproduce**
 Steps to reproduce the behavior:
+
 1. Go to '...'
 2. Click on '....'
-5. See error
+3. See error

 **Expected behavior**
 A clear and concise description of what you expected to happen.
@@ -23,7 +23,8 @@ A clear and concise description of what you expected to happen.
 If applicable, add screenshots to help explain your problem.

 **System:**
-- Open Archiver Version:
+
+- Open Archiver Version:

 **Relevant logs:**
 Any relevant logs (Redact sensitive information)

.github/ISSUE_TEMPLATE/feature_request.md

Lines changed: 1 addition & 2 deletions
@@ -4,11 +4,10 @@ about: Suggest an idea for this project
 title: ''
 labels: enhancement
 assignees: ''
-
 ---

 **Is your feature request related to a problem? Please describe.**
-A clear and concise description of what the problem is.
+A clear and concise description of what the problem is.

 **Describe the solution you'd like**
 A clear and concise description of what you want to happen.

.github/workflows/docker-deployment.yml

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ jobs:
         uses: docker/build-push-action@v6
         with:
           context: .
-          file: ./docker/Dockerfile
+          file: ./apps/open-archiver/Dockerfile
           platforms: linux/amd64,linux/arm64
           push: true
           tags: logiclabshq/open-archiver:${{ steps.sha.outputs.sha }}

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -24,3 +24,7 @@ pnpm-debug.log
 # Vitepress
 docs/.vitepress/dist
 docs/.vitepress/cache
+
+
+# TS
+**/tsconfig.tsbuildinfo

LICENSE

Lines changed: 70 additions & 70 deletions
@@ -200,23 +200,23 @@ You may convey a work based on the Program, or the modifications to
 produce it from the Program, in the form of source code under the
 terms of section 4, provided that you also meet all of these conditions:

-- **a)** The work must carry prominent notices stating that you modified
-it, and giving a relevant date.
-- **b)** The work must carry prominent notices stating that it is
-released under this License and any conditions added under section 7.
-This requirement modifies the requirement in section 4 to
-“keep intact all notices”.
-- **c)** You must license the entire work, as a whole, under this
-License to anyone who comes into possession of a copy. This
-License will therefore apply, along with any applicable section 7
-additional terms, to the whole of the work, and all its parts,
-regardless of how they are packaged. This License gives no
-permission to license the work in any other way, but it does not
-invalidate such permission if you have separately received it.
-- **d)** If the work has interactive user interfaces, each must display
-Appropriate Legal Notices; however, if the Program has interactive
-interfaces that do not display Appropriate Legal Notices, your
-work need not make them do so.
+- **a)** The work must carry prominent notices stating that you modified
+it, and giving a relevant date.
+- **b)** The work must carry prominent notices stating that it is
+released under this License and any conditions added under section 7.
+This requirement modifies the requirement in section 4 to
+“keep intact all notices”.
+- **c)** You must license the entire work, as a whole, under this
+License to anyone who comes into possession of a copy. This
+License will therefore apply, along with any applicable section 7
+additional terms, to the whole of the work, and all its parts,
+regardless of how they are packaged. This License gives no
+permission to license the work in any other way, but it does not
+invalidate such permission if you have separately received it.
+- **d)** If the work has interactive user interfaces, each must display
+Appropriate Legal Notices; however, if the Program has interactive
+interfaces that do not display Appropriate Legal Notices, your
+work need not make them do so.

 A compilation of a covered work with other separate and independent
 works, which are not by their nature extensions of the covered work,
@@ -235,42 +235,42 @@ of sections 4 and 5, provided that you also convey the
 machine-readable Corresponding Source under the terms of this License,
 in one of these ways:

-- **a)** Convey the object code in, or embodied in, a physical product
-(including a physical distribution medium), accompanied by the
-Corresponding Source fixed on a durable physical medium
-customarily used for software interchange.
-- **b)** Convey the object code in, or embodied in, a physical product
-(including a physical distribution medium), accompanied by a
-written offer, valid for at least three years and valid for as
-long as you offer spare parts or customer support for that product
-model, to give anyone who possesses the object code either **(1)** a
-copy of the Corresponding Source for all the software in the
-product that is covered by this License, on a durable physical
-medium customarily used for software interchange, for a price no
-more than your reasonable cost of physically performing this
-conveying of source, or **(2)** access to copy the
-Corresponding Source from a network server at no charge.
-- **c)** Convey individual copies of the object code with a copy of the
-written offer to provide the Corresponding Source. This
-alternative is allowed only occasionally and noncommercially, and
-only if you received the object code with such an offer, in accord
-with subsection 6b.
-- **d)** Convey the object code by offering access from a designated
-place (gratis or for a charge), and offer equivalent access to the
-Corresponding Source in the same way through the same place at no
-further charge. You need not require recipients to copy the
-Corresponding Source along with the object code. If the place to
-copy the object code is a network server, the Corresponding Source
-may be on a different server (operated by you or a third party)
-that supports equivalent copying facilities, provided you maintain
-clear directions next to the object code saying where to find the
-Corresponding Source. Regardless of what server hosts the
-Corresponding Source, you remain obligated to ensure that it is
-available for as long as needed to satisfy these requirements.
-- **e)** Convey the object code using peer-to-peer transmission, provided
-you inform other peers where the object code and Corresponding
-Source of the work are being offered to the general public at no
-charge under subsection 6d.
+- **a)** Convey the object code in, or embodied in, a physical product
+(including a physical distribution medium), accompanied by the
+Corresponding Source fixed on a durable physical medium
+customarily used for software interchange.
+- **b)** Convey the object code in, or embodied in, a physical product
+(including a physical distribution medium), accompanied by a
+written offer, valid for at least three years and valid for as
+long as you offer spare parts or customer support for that product
+model, to give anyone who possesses the object code either **(1)** a
+copy of the Corresponding Source for all the software in the
+product that is covered by this License, on a durable physical
+medium customarily used for software interchange, for a price no
+more than your reasonable cost of physically performing this
+conveying of source, or **(2)** access to copy the
+Corresponding Source from a network server at no charge.
+- **c)** Convey individual copies of the object code with a copy of the
+written offer to provide the Corresponding Source. This
+alternative is allowed only occasionally and noncommercially, and
+only if you received the object code with such an offer, in accord
+with subsection 6b.
+- **d)** Convey the object code by offering access from a designated
+place (gratis or for a charge), and offer equivalent access to the
+Corresponding Source in the same way through the same place at no
+further charge. You need not require recipients to copy the
+Corresponding Source along with the object code. If the place to
+copy the object code is a network server, the Corresponding Source
+may be on a different server (operated by you or a third party)
+that supports equivalent copying facilities, provided you maintain
+clear directions next to the object code saying where to find the
+Corresponding Source. Regardless of what server hosts the
+Corresponding Source, you remain obligated to ensure that it is
+available for as long as needed to satisfy these requirements.
+- **e)** Convey the object code using peer-to-peer transmission, provided
+you inform other peers where the object code and Corresponding
+Source of the work are being offered to the general public at no
+charge under subsection 6d.

 A separable portion of the object code, whose source code is excluded
 from the Corresponding Source as a System Library, need not be
@@ -344,23 +344,23 @@ Notwithstanding any other provision of this License, for material you
 add to a covered work, you may (if authorized by the copyright holders of
 that material) supplement the terms of this License with terms:

-- **a)** Disclaiming warranty or limiting liability differently from the
-terms of sections 15 and 16 of this License; or
-- **b)** Requiring preservation of specified reasonable legal notices or
-author attributions in that material or in the Appropriate Legal
-Notices displayed by works containing it; or
-- **c)** Prohibiting misrepresentation of the origin of that material, or
-requiring that modified versions of such material be marked in
-reasonable ways as different from the original version; or
-- **d)** Limiting the use for publicity purposes of names of licensors or
-authors of the material; or
-- **e)** Declining to grant rights under trademark law for use of some
-trade names, trademarks, or service marks; or
-- **f)** Requiring indemnification of licensors and authors of that
-material by anyone who conveys the material (or modified versions of
-it) with contractual assumptions of liability to the recipient, for
-any liability that these contractual assumptions directly impose on
-those licensors and authors.
+- **a)** Disclaiming warranty or limiting liability differently from the
+terms of sections 15 and 16 of this License; or
+- **b)** Requiring preservation of specified reasonable legal notices or
+author attributions in that material or in the Appropriate Legal
+Notices displayed by works containing it; or
+- **c)** Prohibiting misrepresentation of the origin of that material, or
+requiring that modified versions of such material be marked in
+reasonable ways as different from the original version; or
+- **d)** Limiting the use for publicity purposes of names of licensors or
+authors of the material; or
+- **e)** Declining to grant rights under trademark law for use of some
+trade names, trademarks, or service marks; or
+- **f)** Requiring indemnification of licensors and authors of that
+material by anyone who conveys the material (or modified versions of
+it) with contractual assumptions of liability to the recipient, for
+any liability that these contractual assumptions directly impose on
+those licensors and authors.

 All other non-permissive additional terms are considered “further
 restrictions” within the meaning of section 10. If the Program as you
Lines changed: 9 additions & 8 deletions
@@ -1,4 +1,4 @@
-# Dockerfile for Open Archiver
+# Dockerfile for the OSS version of Open Archiver

 ARG BASE_IMAGE=node:22-alpine

@@ -15,32 +15,33 @@ COPY package.json pnpm-workspace.yaml pnpm-lock.yaml* ./
 COPY packages/backend/package.json ./packages/backend/
 COPY packages/frontend/package.json ./packages/frontend/
 COPY packages/types/package.json ./packages/types/
+COPY apps/open-archiver/package.json ./apps/open-archiver/

 # 1. Build Stage: Install all dependencies and build the project
 FROM base AS build
 COPY packages/frontend/svelte.config.js ./packages/frontend/

-# Install all dependencies. Use --shamefully-hoist to create a flat node_modules structure
+# Install all dependencies.
 ENV PNPM_HOME="/pnpm"
 RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
     pnpm install --shamefully-hoist --frozen-lockfile --prod=false

 # Copy the rest of the source code
 COPY . .

-# Build all packages.
-RUN pnpm build
+# Build the OSS packages.
+RUN pnpm build:oss

 # 2. Production Stage: Install only production dependencies and copy built artifacts
 FROM base AS production

-
 # Copy built application from build stage
 COPY --from=build /app/packages/backend/dist ./packages/backend/dist
-COPY --from=build /app/packages/frontend/build ./packages/frontend/build
-COPY --from=build /app/packages/types/dist ./packages/types/dist
 COPY --from=build /app/packages/backend/drizzle.config.ts ./packages/backend/drizzle.config.ts
 COPY --from=build /app/packages/backend/src/database/migrations ./packages/backend/src/database/migrations
+COPY --from=build /app/packages/frontend/build ./packages/frontend/build
+COPY --from=build /app/packages/types/dist ./packages/types/dist
+COPY --from=build /app/apps/open-archiver/dist ./apps/open-archiver/dist

 # Copy the entrypoint script and make it executable
 COPY docker/docker-entrypoint.sh /usr/local/bin/
@@ -53,4 +54,4 @@ EXPOSE 3000
 ENTRYPOINT ["docker-entrypoint.sh"]

 # Start the application
-CMD ["pnpm", "docker-start"]
+CMD ["pnpm", "docker-start:oss"]

apps/open-archiver/index.ts

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+import { createServer, logger } from '@open-archiver/backend';
+import * as dotenv from 'dotenv';
+
+dotenv.config();
+
+async function start() {
+    // --- Environment Variable Validation ---
+    const { PORT_BACKEND } = process.env;
+
+    if (!PORT_BACKEND) {
+        throw new Error('Missing required environment variables for the backend: PORT_BACKEND.');
+    }
+    // Create the server instance (passing no modules for the default OSS version)
+    const app = await createServer([]);
+
+    app.listen(PORT_BACKEND, () => {
+        logger.info({}, `✅ Open Archiver (OSS) running on port ${PORT_BACKEND}`);
+    });
+}
+
+start().catch((error) => {
+    logger.error({ error }, 'Failed to start the server:', error);
+    process.exit(1);
+});
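
The entry point above checks a single variable inline. If more required variables are added later, the same check could be factored into a small helper. The sketch below is one possible shape under that assumption; `requireEnv` and `parsePort` are hypothetical names that do not appear in the commit.

```typescript
type Env = Record<string, string | undefined>;

// Return the named variables, or throw one error listing every missing name.
function requireEnv(env: Env, names: string[]): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}.`);
  }
  const values: Record<string, string> = {};
  for (const name of names) {
    values[name] = env[name] as string;
  }
  return values;
}

// Convert a port string to a number, rejecting anything outside 1..65535.
function parsePort(value: string): number {
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${value}`);
  }
  return port;
}
```

Usage would look like `parsePort(requireEnv(process.env, ["PORT_BACKEND"]).PORT_BACKEND)`, which fails fast at startup instead of when the server first tries to bind.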

apps/open-archiver/package.json

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+{
+    "name": "open-archiver-app",
+    "version": "1.0.0",
+    "private": true,
+    "scripts": {
+        "dev": "ts-node-dev --respawn --transpile-only index.ts",
+        "build": "tsc",
+        "start": "node dist/index.js"
+    },
+    "dependencies": {
+        "@open-archiver/backend": "workspace:*",
+        "dotenv": "^17.2.0"
+    },
+    "devDependencies": {
+        "@types/dotenv": "^8.2.3",
+        "ts-node-dev": "^2.0.0"
+    }
+}

apps/open-archiver/tsconfig.json

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+{
+    "extends": "../../tsconfig.base.json",
+    "compilerOptions": {
+        "outDir": "dist"
+    },
+    "include": ["./**/*.ts"],
+    "references": [{ "path": "../../packages/backend" }]
+}
