Commit 5c5ee7f

MrChengLen and claude committed
docs: add DPA template + move sprint-s1 history out of public repo
Two findings from the post-merge 4-eyes audit (3 expert agents: PM, Business, Security). Both are surgical, with no code changes.

1. NEW docs/dpa-template.md (Article 28 GDPR skeleton)

The Compliance-Edition pitch in /enterprise and COMMERCIAL-LICENSE.md promises a "DPA template" included in every tier. Until now, no such file lived in the repo: a procurement reviewer asking "send the DPA template" got nothing back. Business-Agent flagged this as a HIGH liability and trust gap before the first customer.

The new docs/dpa-template.md is a 12-section Art. 28 GDPR skeleton with bracketed placeholders that get filled in during the pilot conversation: parties, subject matter, nature of processing, data subjects + data categories, audit log + integrity attestation, sub-processors (cross-links existing docs/sub-processors.md), TOMs (cross-links security/threat-model/patch-policy/incident-response/release-signing), the controller's instructions and rights, breach notification (72 h), return/deletion at end, liability, governing law (Hamburg). The finalisation note describes the contact path (legal@filemorph.io) and turnaround.

This is published as a template, not a binding contract, so a reviewer can read the substance before requesting a signed instance. The wording matches the existing /enterprise template language ("DPA template in pilot conversation"), so the public claim is now backed by a public artefact.

2. REMOVE docs/sprint-s1-technology-first.md (moved to docs-internal/)

PM-Agent and Business-Agent both flagged this as sprint-internal history unsuitable for the public repo. The file lists S1 commit SHAs with internal rationale ("our 2 GB output cap would have OOM-killed a small-RAM server", "we audited as one batch before push"). That is useful when reviewing historical engineering decisions, but not for self-hosters or compliance reviewers. Self-host onboarding lives in README + docs/installation.md + docs/self-hosting.md, not in this sprint recap.

The file is preserved locally under docs-internal/sprint-history/ (gitignored, intentionally outside the public repo). Per the security audit's recommendation, there is no force-push or filter-repo: the historical content remains retrievable via git log on this commit, which matches the project's transparency posture (git history is a feature, not a liability, for a Compliance-Edition product whose customers audit provenance).

What this PR does NOT touch (deferred):

- .github/workflows/notify-ops.yml: moving requires a coordinated change in MrChengLen/filemorph-ops (set up reverse polling first), then delete here; otherwise a deploy gap opens. Tracked for a follow-up PR that includes both sides.
- enterprise.html / COMMERCIAL-LICENSE.md claims-audit: three promise/reality wording fixes need legal review before commit (Business-Agent recommendation). Tracked for PR-Audit-Claims-1.
- .env.example sectioning + Phase 2/3 of the post-audit plan: separate PRs to keep diffs reviewable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 8665ca2 commit 5c5ee7f

4 files changed

Lines changed: 208 additions & 68 deletions

.githooks/pre-commit

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ FAIL=0
 # locale/ catalogs are mechanically extracted from the impressum/privacy/terms templates above —
 # they cannot avoid carrying the same address/email strings. Treating them as public is consistent
 # with the source templates being public.
-ALLOW_RE='^(app/templates/(impressum|privacy|terms)\.html|COMMERCIAL-LICENSE\.md|docs/gdpr-account-deletion-design\.md|docs/api-usage-guide\.md|docs/self-hosting\.md|docs-internal/.*|\.githooks/.*|\.github/workflows/scope-guard\.yml|CHANGELOG\.md|locale/.*\.(po|pot|mo))$'
+ALLOW_RE='^(app/templates/(impressum|privacy|terms)\.html|COMMERCIAL-LICENSE\.md|docs/gdpr-account-deletion-design\.md|docs/api-usage-guide\.md|docs/self-hosting\.md|docs/dpa-template\.md|docs-internal/.*|\.githooks/.*|\.github/workflows/scope-guard\.yml|CHANGELOG\.md|locale/.*\.(po|pot|mo))$'

 # Personal/operational identifiers that should never land in public code.
 PATTERNS='lennart\.seidel@icloud\.com|lennart@filemorph\.io|Reetwerder|21029 Hamburg'

.githooks/pre-push

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ set -e
 ZERO=0000000000000000000000000000000000000000

 # Same patterns as pre-commit — keep in sync.
-ALLOW_RE='^(app/templates/(impressum|privacy|terms)\.html|COMMERCIAL-LICENSE\.md|docs/gdpr-account-deletion-design\.md|docs/api-usage-guide\.md|docs/self-hosting\.md|docs-internal/.*|\.githooks/.*|\.github/workflows/scope-guard\.yml|CHANGELOG\.md|locale/.*\.(po|pot|mo))$'
+ALLOW_RE='^(app/templates/(impressum|privacy|terms)\.html|COMMERCIAL-LICENSE\.md|docs/gdpr-account-deletion-design\.md|docs/api-usage-guide\.md|docs/self-hosting\.md|docs/dpa-template\.md|docs-internal/.*|\.githooks/.*|\.github/workflows/scope-guard\.yml|CHANGELOG\.md|locale/.*\.(po|pot|mo))$'
 PATTERNS='lennart\.seidel@icloud\.com|lennart@filemorph\.io|Reetwerder|21029 Hamburg'
 OPS_PATTERNS='/opt/filemorph(/|$|[[:space:]])|/var/log/filemorph|/home/deploy([[:space:]]|/)|Hetzner CX|HETZNER_HOST|HETZNER_SSH_USER|HETZNER_SSH_KEY|OPS_REPO_DISPATCH_PAT|GHCR_PAT|appleboy/ssh-action'
 SECRET_ASSIGN='(JWT_SECRET|SMTP_PASSWORD|STRIPE_SECRET_KEY|STRIPE_WEBHOOK_SECRET|DATABASE_URL|API_KEY|POSTGRES_PASSWORD|GHCR_PAT|OPS_REPO_DISPATCH_PAT|HETZNER_SSH_KEY)[[:space:]]*=[[:space:]]*[^[:space:]$]'
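
Both hooks now carry the same `ALLOW_RE`, extended by `docs/dpa-template\.md`. A quick way to sanity-check the new entry is to run the pattern against a few paths. The sketch below copies the updated pattern verbatim from this commit; the helper name `is_allowlisted` and the sample paths are illustrative only, and it assumes (the hook bodies are not shown in this diff) that the hooks match the pattern against repository-relative paths.

```python
import re

# Allowlist copied verbatim from the updated hooks in this commit,
# split across lines only for readability.
ALLOW_RE = re.compile(
    r'^(app/templates/(impressum|privacy|terms)\.html|COMMERCIAL-LICENSE\.md'
    r'|docs/gdpr-account-deletion-design\.md|docs/api-usage-guide\.md'
    r'|docs/self-hosting\.md|docs/dpa-template\.md|docs-internal/.*'
    r'|\.githooks/.*|\.github/workflows/scope-guard\.yml|CHANGELOG\.md'
    r'|locale/.*\.(po|pot|mo))$'
)


def is_allowlisted(path: str) -> bool:
    """Hypothetical helper: True if the path is exempt from the identifier scan."""
    return ALLOW_RE.match(path) is not None


if __name__ == "__main__":
    for sample in (
        "docs/dpa-template.md",      # newly allowlisted by this commit
        "docs/installation.md",      # not on the list
        "docs/dpa-template.md.bak",  # anchored by $, so a suffix does not slip through
    ):
        print(sample, is_allowlisted(sample))
```

Because the pattern is anchored with `^` and `$`, only the exact file is exempted; a renamed or copied DPA template would still be scanned.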

docs/dpa-template.md

Lines changed: 206 additions & 0 deletions
@@ -0,0 +1,206 @@
# Data Processing Agreement (DPA) — Template

**Status:** Skeleton template, finalised individually in pilot conversations.
**Last reviewed:** 2026-05-08

This document is the starting point for a Data Processing Agreement (DPA)
under Article 28 GDPR between a FileMorph Compliance-Edition customer
(*controller*) and the FileMorph operator (*processor*). It is published
in the open-source repository so a procurement reviewer can read the
substance before requesting a binding contract.

The text below is **not a binding contract** as-is. The final DPA is
drafted in the pilot conversation with each customer, with the bracketed
placeholders filled in from the concrete deployment context (instance
location, scope of processing, named contact, etc.). When you are ready
to finalise, contact `legal@filemorph.io`.

For the public sub-processor list referenced in §6 below, see
[`docs/sub-processors.md`](sub-processors.md).

---

## 1. Parties

**Controller** (the customer):
- Legal name: `[CUSTOMER LEGAL NAME]`
- Address: `[CUSTOMER ADDRESS]`
- Authorised signatory: `[NAME, ROLE]`

**Processor** (the FileMorph operator):
- Legal name: Lennart Seidel
- Address: Reetwerder 25b, 21029 Hamburg, Germany
- Contact: `legal@filemorph.io`

The processor is the operator of the FileMorph Compliance-Edition
deployment named in §3 (the "Service"). For self-hosted deployments
operated entirely on the controller's own infrastructure, the controller
is also the operator and this template does not apply — there is no
processor relationship.

## 2. Subject matter and duration

The processor processes personal data on behalf of the controller for
the sole purpose of operating the Service. Processing begins on
`[EFFECTIVE DATE]` and continues for the term of the underlying service
agreement, ending no later than thirty (30) days after termination
(during which residual processing for deletion or export is permitted).

## 3. Nature and purpose of processing

The Service performs file conversion, compression, and integrity-attested
output generation for files uploaded by the controller's authorised
users. Processing operations include:

- Receiving uploaded files via HTTPS
- Running format conversion / compression in transient memory and
  ephemeral filesystem locations
- Returning the converted output and a SHA-256 integrity header
- Writing structured logs (no file content; only metadata: tier, format
  pair, byte counts, duration, success flag)
- Recording audit events for actions affecting accounts or entitlements
  (registration, login, key creation, deletion, billing changes)

The Service does **not** perform any analytics, profiling, advertising,
or data sale.

## 4. Categories of data subjects and personal data

**Data subjects:**
- The controller's employees, agents, or contractors who use the
  Service (the "users")
- Any natural persons whose personal data appears in files the users
  upload — categories not known to the processor in advance

**Personal data:**
- User identifiers: email address (registration), bcrypt password hash,
  IP address (request logs only, rotated within 30 days), session JWT
  identifiers
- File contents during processing — deleted from memory and disk
  immediately after the converted output is returned (typical
  retention: seconds; absolute upper bound: 10 minutes via startup
  sweep, see `app/main.py`)
- Audit-event records (see §5 below) — retained per the controller's
  configured retention policy
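
The 10-minute upper bound on leftover file contents can be pictured as an age-based sweep over the temporary working directory. The following is a minimal sketch under stated assumptions, not the code in `app/main.py` (which is not part of this diff): the function name `sweep_stale_files`, the constant `MAX_AGE_SECONDS`, and the use of file modification time as the age signal are all hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import List, Optional

# The 10-minute bound stated in section 4; the constant name is an assumption.
MAX_AGE_SECONDS = 10 * 60


def sweep_stale_files(directory: Path, now: Optional[float] = None) -> List[Path]:
    """Delete regular files older than MAX_AGE_SECONDS and return the removed paths."""
    now = time.time() if now is None else now
    removed = []
    for entry in directory.iterdir():
        # Age is judged by modification time; subdirectories are left alone.
        if entry.is_file() and now - entry.stat().st_mtime > MAX_AGE_SECONDS:
            entry.unlink()
            removed.append(entry)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        (work / "fresh.bin").write_bytes(b"upload still being converted")
        stale = work / "stale.bin"
        stale.write_bytes(b"left over from an interrupted request")
        eleven_minutes_ago = time.time() - 11 * 60
        os.utime(stale, (eleven_minutes_ago, eleven_minutes_ago))
        print([p.name for p in sweep_stale_files(work)])
```

A sweep like this only backstops the normal path, where files are deleted as soon as the converted output is returned.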

## 5. Audit log and integrity attestation

Every Compliance-Edition deployment writes a tamper-evident audit log
(SHA-256 hash chain, see `app/core/audit.py` and Migration 005). Each
entry contains:

- Event type, timestamp, actor identifier, actor IP, payload digest
- Hash of the previous event (chain integrity)

The audit log is the controller's record of processing activities under
Article 30 GDPR. The retention period defaults to `[RETENTION DAYS]`
and is configurable via the `AUDIT_RETENTION_DAYS` environment variable.
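
To make "tamper-evident" concrete, here is a minimal sketch of a SHA-256 hash chain over entries with the fields listed above. It is an illustration under assumptions, not the implementation in `app/core/audit.py`: the field names, the JSON canonicalisation, the all-zero `GENESIS_HASH`, and the helper names `append_event` and `verify_chain` are hypothetical.

```python
import hashlib
import json

# Assumed placeholder for the predecessor of the very first entry.
GENESIS_HASH = "0" * 64


def entry_hash(entry: dict) -> str:
    """Hash a canonical JSON encoding so the same entry always yields the same digest."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_event(log: list, event_type: str, timestamp: str,
                 actor_id: str, actor_ip: str, payload: bytes) -> dict:
    """Append an entry that commits to the hash of the previous entry."""
    body = {
        "event_type": event_type,
        "timestamp": timestamp,
        "actor_id": actor_id,
        "actor_ip": actor_ip,
        "payload_digest": hashlib.sha256(payload).hexdigest(),
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    record = dict(body, hash=entry_hash(body))
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Recompute every hash and link; any edited or removed entry breaks the chain."""
    prev = GENESIS_HASH
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if body["prev_hash"] != prev or entry_hash(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_event(log, "registration", "2026-05-08T10:00:00Z", "user-1", "203.0.113.7", b"{}")
    append_event(log, "key_creation", "2026-05-08T10:05:00Z", "user-1", "203.0.113.7", b"{}")
    print(verify_chain(log))
    log[0]["actor_ip"] = "198.51.100.9"  # simulate tampering
    print(verify_chain(log))
```

Because each entry commits to its predecessor's hash, changing one historical entry invalidates that entry and every link after it, which is what lets an auditor detect edits without trusting the operator.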

Each converted output carries an `X-Output-SHA256` response header so
the controller can independently verify integrity.
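
On the controller's side, the check amounts to hashing the received bytes and comparing the result with the header. A minimal sketch, assuming the header carries a hex-encoded SHA-256 digest of the response body (the encoding is not specified in this template) and using a hypothetical helper name `output_matches_header`:

```python
import hashlib
import hmac


def output_matches_header(body: bytes, header_value: str) -> bool:
    """Compare the SHA-256 of the received body with the X-Output-SHA256 header value.

    Assumes a hex-encoded digest; surrounding whitespace and letter case are ignored.
    """
    expected = header_value.strip().lower().encode("utf-8")
    actual = hashlib.sha256(body).hexdigest().encode("utf-8")
    # Constant-time comparison; not strictly required here, but a harmless habit.
    return hmac.compare_digest(actual, expected)


if __name__ == "__main__":
    converted = b"example converted output"
    header = hashlib.sha256(converted).hexdigest()  # stands in for the server's header
    print(output_matches_header(converted, header))
    print(output_matches_header(converted + b"!", header))
```

In practice `body` would be the downloaded file and `header_value` the header read from the HTTP response of whichever client library the controller uses.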

## 6. Sub-processors

The processor uses the sub-processors listed in
[`docs/sub-processors.md`](sub-processors.md). The default list applies
unless the controller and processor agree in writing to a reduced
scope at finalisation.

The processor will inform the controller of any intended additions or
replacements at least thirty (30) days in advance. The controller may
object on reasonable grounds; in such case the parties will negotiate
in good faith, and absent agreement either party may terminate the
service agreement.

## 7. Technical and organisational measures (TOM)

The processor implements the measures documented in:
- [`docs/security-overview.md`](security-overview.md)
- [`docs/threat-model.md`](threat-model.md)
- [`docs/patch-policy.md`](patch-policy.md)
- [`docs/incident-response.md`](incident-response.md)
- [`docs/release-signing.md`](release-signing.md)

These cover: encryption in transit (TLS 1.2+, HSTS), at-rest scope (no
persistent file storage by design), access control (timing-safe API key
validation, JWT-bound roles, admin role with database recheck per
request), key management, software-supply-chain hardening (cosign-signed
images, signed Git tags, CycloneDX SBOM), and incident-response
timelines.

A summary of the measures is appended at finalisation as
"Annex II — Technical and Organisational Measures" tailored to the
specific deployment.

## 8. Controller's instructions and rights

The processor processes personal data only on documented instructions
from the controller, including with regard to transfers to third
countries. The instructions are this DPA and any subsequent written
instructions from the named contact in §1.

The controller has the right to:

- Receive on request, in a commonly used machine-readable format, all
  personal data processed on its behalf, for example to answer
  data-portability requests under Art. 20 GDPR (assistance under
  Art. 28(3)(e) GDPR)
- Audit the processor's compliance with this DPA, on reasonable notice
  and at the controller's expense, no more than once per twelve months
  unless an incident has been reported
- Demand erasure of all personal data after termination, except where
  Union or Member-State law requires retention (notably: tax-relevant
  records under §257 HGB / §147 AO, ten-year period)

## 9. Personal data breach notification

If the processor becomes aware of a personal data breach affecting the
controller's data, the processor will notify the controller without
undue delay and in any case **within 72 hours** of becoming aware. The
notification will include: nature of the breach, categories and
approximate number of data subjects and records concerned, likely
consequences, measures taken or proposed.

The processor's incident-response procedure (see
[`docs/incident-response.md`](incident-response.md)) governs the
internal handling of the breach.

## 10. Return or deletion at end of provision

Upon termination of the underlying service agreement, the processor
will, at the controller's choice:

- Return all personal data in a commonly used machine-readable format
  within thirty (30) days, or
- Delete all personal data and certify the deletion in writing

Records that the processor is legally obliged to retain (tax records,
fraud-prevention records under §257 HGB / §147 AO) are retained for
the statutory period and deleted thereafter without further request.

## 11. Liability and limitations

Liability for breach of this DPA is governed by the underlying service
agreement. Each party is liable for its own infringements of the GDPR
in accordance with Articles 82–84 GDPR. Joint and several liability
under Art. 82(4) GDPR is not excluded.

## 12. Governing law

This DPA is governed by the laws of the Federal Republic of Germany.
Place of jurisdiction is Hamburg, Germany.

---

## How to finalise

1. Review the bracketed placeholders in §1 and §2 and fill them with
   the deployment context.
2. Replace `[RETENTION DAYS]` in §5 with the configured value.
3. Append "Annex II — Technical and Organisational Measures" with the
   measures specific to the deployment (instance location, network
   segmentation, on-call, penetration-test status).
4. Both parties sign the final PDF, either on a printed copy or with a
   qualified electronic signature; the FileMorph operator's
   counter-signature is provided from `legal@filemorph.io` after the
   pilot conversation closes.

Send the completed draft to `legal@filemorph.io`. Turnaround is
typically two business days.

docs/sprint-s1-technology-first.md

Lines changed: 0 additions & 66 deletions
This file was deleted.