Skip to content

✨(rfc5322) switch back to Python's stdlib for email composition#657

Closed
sylvinus wants to merge 1 commit into
mainfrom
stdlib_composer
Closed

✨(rfc5322) switch back to Python's stdlib for email composition#657
sylvinus wants to merge 1 commit into
mainfrom
stdlib_composer

Conversation

@sylvinus
Copy link
Copy Markdown
Member

@sylvinus sylvinus commented May 5, 2026

Latest fixes in the stdlib make it a more solid alternative for strict composition than Flanker. We keep Flanker for now for lenient inbound parsing.

Summary by CodeRabbit

  • Chores

    • Updated Python runtime to 3.14.4 across services; CI/dev tooling tweaks and new intensive fuzz target and test-run forwarding.
  • New Features

    • Improved email composition with stricter RFC5322 compliance and safer header/attachment handling.
    • Dual-entry parsing model for distinct inbound (lenient) and outbound (strict) flows.
  • Documentation

    • Expanded email docs with architecture, security notes, examples.
  • Tests

    • Added extensive unit, regression and fuzz tests; adjusted tests to tolerate MIME encoding/line-folding quirks.
  • Bug Fixes

    • Fixed formatting/assertion brittleness around CRLF and quoted-printable line folding.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

Warning

Rate limit exceeded

@sylvinus has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 40 minutes and 28 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 65611afd-dc20-417f-87d4-2cf1db1eed6e

📥 Commits

Reviewing files that changed from the base of the PR and between 2916f9c and f3e9b67.

⛔ Files ignored due to path filters (3)
  • src/backend/uv.lock is excluded by !**/*.lock
  • src/mta-in/uv.lock is excluded by !**/*.lock
  • src/mta-out/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • Makefile
  • bin/pytest
  • src/backend/Dockerfile
  • src/backend/core/factories.py
  • src/backend/core/mda/rfc5322/README.md
  • src/backend/core/mda/rfc5322/composer.py
  • src/backend/core/services/importer/pst.py
  • src/backend/core/tests/api/test_inbound_widget.py
  • src/backend/core/tests/api/test_send_message_signature.py
  • src/backend/core/tests/mda/test_outbound.py
  • src/backend/core/tests/mda/test_rfc5322_composer.py
  • src/backend/core/tests/mda/test_rfc5322_composer_fuzz.py
  • src/backend/pyproject.toml
  • src/mta-in/Dockerfile
  • src/mta-in/pyproject.toml
  • src/mta-out/Dockerfile
  • src/mta-out/pyproject.toml
📝 Walkthrough

Walkthrough

This PR upgrades Python to 3.14.4 across images and pyproject metadata, replaces the outbound RFC5322 composer with a stdlib-based implementation (header/date sanitization, attachment/CID handling, keep_bcc), expands unit and Hypothesis fuzz tests, and adds Ruff tooling config.

Changes

Infrastructure & Build Updates

Layer / File(s) Summary
Base Image & Python Version
src/backend/Dockerfile, src/mta-in/Dockerfile, src/mta-out/Dockerfile
Python base image updated from 3.14.3 to 3.14.4 in Dockerfiles; backend uv base digest updated and PYDIR/python install updated to 3.14.4.
Project Metadata
src/backend/pyproject.toml, src/mta-in/pyproject.toml, src/mta-out/pyproject.toml
Classifiers updated to Python 3.14; requires-python tightened to >=3.14.4,<4.0 across services. Backend pyproject.toml adds a Ruff configuration block.
Makefile / Test Runner
Makefile, bin/pytest
New fuzz-back-intensive target added; bin/pytest now forwards FUZZ_EXAMPLES into container env when set.

RFC5322 Email Composition Architecture Refactor

Layer / File(s) Summary
Architecture & Documentation
src/backend/core/mda/rfc5322/README.md
README rewritten to present dual-entry design: stdlib-based composer for strict RFC5322 outbound and Flanker-based parser for lenient inbound parsing; documents JMAP shape and usage examples.
Core Primitives & Validation
src/backend/core/mda/rfc5322/composer.py
Introduces stdlib MIMEPart usage, SMTP policy clone, reserved-header filtering, address formatting, header sanitization, Message-ID and date normalization helpers, and header-name validation.
Attachment & Body Construction
src/backend/core/mda/rfc5322/composer.py
Attachment part creation with base64 decoding, content-type parsing, filename/CID sanitization, inline vs regular partitioning, body helpers for text/plain/text/html, and multipart nesting (alternative/related/mixed).
Compose & Serialize
src/backend/core/mda/rfc5322/composer.py
create_multipart_message and compose_email serialize MIME tree to RFC5322 bytes with BytesGenerator under SMTP policy, support safe prepend_headers, preserve Bcc when requested (keep_bcc=True), and wrap stdlib errors as EmailComposeError.
Reply/Forward Builders
src/backend/core/mda/rfc5322/composer.py
Shared embedding helper for replies/forwards, resilient original-date formatting, quoted text/HTML generation, and conditional threading header emission only for validated Message-IDs.
Composer Tests
src/backend/core/tests/mda/test_rfc5322_composer.py
Expanded tests: address-formatting, long-line constraints, Flanker-regression suite, security/hardening tests (T1–T8), reply/threading parametrized cases, and payload-decoding adjustments.
Fuzz Tests
src/backend/core/tests/mda/test_rfc5322_composer_fuzz.py
New Hypothesis-based fuzzing module covering compose paths, invariants (CRLF, line length, headers), Bcc handling, determinism, and no state leak.
API / Outbound Test Adjustments
src/backend/core/tests/api/*, src/backend/core/tests/mda/test_outbound.py
Tests updated to normalize/unfold quoted-printable line breaks (remove =\r\n) and strip trailing CRLF before assertions to be robust to MIME encoding line folding.
Importer & Factories
src/backend/core/services/importer/pst.py, src/backend/core/factories.py
PST reconstruct_eml now calls compose_email(..., keep_bcc=True); MessageFactory mime_id updated to include a test domain suffix.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • suitenumerique/messages#556: Related through shared build/tooling surface changes (Dockerfiles, pyproject.toml, uv-based packaging updates).
  • suitenumerique/messages#656: Related through RFC5322 composition/parsing test modifications and Flanker-related handling.
  • suitenumerique/messages#599: Related via integration points that consume parse/compose entry points (SubmitRawEmailView, outbound.prepare_outbound_message).

Suggested reviewers

  • sdemagny
  • jbpenrath
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: switching email composition from Flanker to Python's stdlib. It directly aligns with the PR objectives and core file modifications.
Docstring Coverage ✅ Passed Docstring coverage is 92.06% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backend/core/mda/rfc5322/composer.py`:
- Around line 196-200: The _split_content_type function can yield a subtype that
still contains parameters (e.g. "jpeg; name=foo"), so change its parsing to drop
any parameters and trim whitespace: split the incoming content_type on the first
"/" to get maintype and raw_subtype, then split raw_subtype on ";" and take the
first token, strip both maintype and subtype of whitespace, and fall back to
"application","octet-stream" if either is empty; update _split_content_type
accordingly so callers (e.g. set_content) never receive parameterized subtype
strings.
- Around line 372-385: The current partitioning of attachments uses a costly
dict-membership check ([a for a in attachments if a not in inline_attachments])
which relies on dict equality and is O(n*m); replace it with a single-pass
partition: iterate once over attachments, check if a.get("disposition") ==
"inline" and a.get("cid") and append to inline_attachments or
regular_attachments accordingly so inline_attachments and regular_attachments
are built in one pass (this affects the variables attachments,
inline_attachments, regular_attachments used before calling
_wrap_with_inline_images and _wrap_with_attachments).
- Around line 86-112: The call to .replace("Z", "+00:00") in _normalize_date is
redundant for datetime.fromisoformat on Python ≥3.11; remove the .replace(...)
so the code uses datetime.datetime.fromisoformat(date) directly, keeping the
subsequent tzinfo checks and timezone.make_aware(parsed, datetime.timezone.utc)
behavior intact; update the first try block in _normalize_date that currently
assigns parsed = datetime.datetime.fromisoformat(date.replace("Z", "+00:00")) to
parsed = datetime.datetime.fromisoformat(date) and leave the
parsedate_to_datetime fallback and other logic unchanged.
- Around line 418-431: The current clear-and-rebuild of msg when prepend_headers
is set relies on policy round-tripping and is fragile; instead, modify the
prepend_headers handling in composer.py to insert sanitized headers directly
into the message's internal header list (msg._headers) at the front to avoid
re-parsing by the policy: build a list of (k, _sanitize_header_value(str(v)))
for each prepend header and splice it into msg._headers at index 0 so existing
headers remain unchanged, then continue to use BytesGenerator(...,
policy=_POLICY).flatten(msg); if you prefer not to touch internals, add a short
inline comment next to the prepend_headers block explaining the current
dependency on header_store_parse behavior (mentioning _POLICY) so future
maintainers understand the implicit contract.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6a2ce586-2ca2-4953-8304-5775e3b7b42f

📥 Commits

Reviewing files that changed from the base of the PR and between d0de96c and c3d9c3c.

⛔ Files ignored due to path filters (3)
  • src/backend/uv.lock is excluded by !**/*.lock
  • src/mta-in/uv.lock is excluded by !**/*.lock
  • src/mta-out/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • src/backend/Dockerfile
  • src/backend/core/mda/rfc5322/README.md
  • src/backend/core/mda/rfc5322/composer.py
  • src/backend/core/tests/api/test_inbound_widget.py
  • src/backend/core/tests/api/test_send_message_signature.py
  • src/backend/core/tests/mda/test_outbound.py
  • src/backend/core/tests/mda/test_rfc5322_composer.py
  • src/backend/pyproject.toml
  • src/mta-in/Dockerfile
  • src/mta-in/pyproject.toml
  • src/mta-out/Dockerfile
  • src/mta-out/pyproject.toml

Comment thread src/backend/core/mda/rfc5322/composer.py
Comment thread src/backend/core/mda/rfc5322/composer.py
Comment thread src/backend/core/mda/rfc5322/composer.py Outdated
Comment thread src/backend/core/mda/rfc5322/composer.py
Comment thread src/backend/core/mda/rfc5322/composer.py Fixed
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
src/backend/core/api/serializers.py (1)

1719-1725: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix Python 3 multi-except syntax.

Line 1723 uses invalid Python 2-style except ValueError, TypeError: syntax. This prevents src/backend/core/api/serializers.py from importing and breaks all API endpoints that depend on this serializer.

Suggested fix
         for tag_id in tags:
             # Validate UUID format
             try:
                 tag_uuid = uuid.UUID(str(tag_id))
-            except ValueError, TypeError:
+            except (ValueError, TypeError):
                 invalid_tags.append(tag_id)
                 continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/core/api/serializers.py` around lines 1719 - 1725, The except
clause in the tag validation loop uses Python 2 syntax and must be updated; in
the loop iterating "for tag_id in tags" (where tag_uuid = uuid.UUID(str(tag_id))
and invalid_tags is appended on failure) replace the invalid "except ValueError,
TypeError:" with a modern multi-except grouping "except (ValueError,
TypeError):" so both exceptions are caught correctly and the serializer (in
src/backend/core/api/serializers.py around the tag validation block) can import
under Python 3.
src/backend/core/admin.py (1)

43-48: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix Python 3 syntax error in exception handler on line 47.

The except clause uses Python 2 comma syntax (except A, B:) which is invalid in Python 3. This will cause a SyntaxError when Django imports the admin module, breaking the application startup.

Suggested fix
     def format_value(self, value):
         if isinstance(value, str):
             try:
                 value = json.dumps(json.loads(value), indent=2, ensure_ascii=False)
-            except json.JSONDecodeError, TypeError:
+            except (json.JSONDecodeError, TypeError):
                 pass
         return value
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/core/admin.py` around lines 43 - 48, The except clause in
format_value is using Python 2 comma syntax and must be updated to Python 3
tuple-style exception handling; change the handler in the format_value method
(which currently reads the invalid "except json.JSONDecodeError, TypeError:") to
catch both exceptions using a tuple (i.e. except (json.JSONDecodeError,
TypeError):) so JSON decoding errors and TypeError are properly ignored without
causing a SyntaxError.
src/backend/core/api/permissions.py (1)

397-402: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix the exception tuple syntax—Python 3 requires parentheses here.

Line 401 uses Python 2-style except A, B: syntax. Python 3.11 raises SyntaxError: multiple exception types must be parenthesized, preventing src/backend/core/api/permissions.py from importing at all. All permission checks relying on this module will fail.

Suggested fix
         try:
             target_mailbox = models.Mailbox.objects.select_related("domain").get(
                 pk=mailbox_id_from_url
             )
-        except models.Mailbox.DoesNotExist, ValueError:  # ValueError for invalid UUID
+        except (models.Mailbox.DoesNotExist, ValueError):  # ValueError for invalid UUID
             return False
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/core/api/permissions.py` around lines 397 - 402, The except
clause on the Mailbox lookup uses Python 2 tuple syntax and must be
parenthesized; update the exception handler around the target_mailbox lookup in
the permission check (the try block that calls
models.Mailbox.objects.select_related("domain").get(pk=mailbox_id_from_url)) to
use a parenthesized tuple like except (models.Mailbox.DoesNotExist, ValueError):
and return False as before so the module imports cleanly under Python 3.
src/backend/core/tests/mda/test_rfc5322_composer.py (2)

1534-1555: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

test_create_reply_with_empty_original is duplicated verbatim from TestReplyGeneration.

The identical test already exists at lines 1311–1332. Having two copies in different classes silently diverge if the first is maintained and the second is not. Remove the duplicate from TestErrorHandling or replace it with a narrower error-specific variant.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/core/tests/mda/test_rfc5322_composer.py` around lines 1534 -
1555, Remove the duplicated test method test_create_reply_with_empty_original
from the TestErrorHandling test class (it already exists in TestReplyGeneration)
to avoid divergence; either delete the duplicate or replace it with a focused
error-specific variant that exercises error handling behavior (e.g., asserting
that missing original fields are handled gracefully by create_reply_message) and
keep references to create_reply_message, TestReplyGeneration, and
TestErrorHandling so the reviewer can find and modify the correct location.

636-641: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

get_payload() without decode=True is fragile and inconsistent with all other payload assertions in this file.

For content that gets a quoted-printable or base64 transfer encoding, get_payload() returns the encoded form, not the original text, so the in assertion would silently fail. Every other payload check in this file uses get_payload(decode=True).decode(charset or "utf-8") — apply the same pattern here.

💡 Proposed fix for both occurrences
-        assert "This email has no subject." in msg.get_payload()
+        payload = msg.get_payload(decode=True).decode(msg.get_content_charset() or "utf-8")
+        assert "This email has no subject." in payload
-        assert "Minimal email." in msg.get_payload()
+        payload = msg.get_payload(decode=True).decode(msg.get_content_charset() or "utf-8")
+        assert "Minimal email." in payload

Also applies to: 656-659

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/core/tests/mda/test_rfc5322_composer.py` around lines 636 - 641,
Replace the fragile use of msg.get_payload() with the same decoded pattern used
elsewhere: call msg.get_payload(decode=True) and decode it with the message
charset (e.g. payload =
msg.get_payload(decode=True).decode(msg.get_content_charset() or "utf-8")) and
then assert the substring is in that decoded payload; update both occurrences
that use msg.get_payload() (the assertions around the no-subject email and the
later block at the other occurrence).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backend/core/api/viewsets/image_proxy.py`:
- Line 130: The except clause uses Python 2 syntax ("except TypeError,
ValueError:") which is invalid in Python 3; change it to a valid Python 3 form
such as "except (TypeError, ValueError) as e:" (or separate except blocks) in
the image proxy error-handling block so the module can import; update any
subsequent references to the exception variable to use the new name (e) or
adjust logic if you opt for separate except TypeError: and except ValueError:
handlers.

In `@src/backend/core/api/viewsets/submit.py`:
- Line 67: Replace the invalid Python 2-style except clause that reads "except
models.Mailbox.DoesNotExist, ValueError, DjangoValidationError:" with the Python
3-compatible tuple syntax "except (models.Mailbox.DoesNotExist, ValueError,
DjangoValidationError):" so the exception handler in the submit viewset (the
block catching models.Mailbox.DoesNotExist and DjangoValidationError) parses
correctly; keep the existing handler logic unchanged.

In `@src/backend/core/entitlements/backends/deploycenter.py`:
- Line 70: Replace the Python2-style exception clause "except
requests.RequestException, ValueError:" with valid Python 3 syntax by grouping
the exception types and optionally binding the exception instance; e.g. change
to "except (requests.RequestException, ValueError) as err:" or split into two
except blocks ("except requests.RequestException as err:" and "except ValueError
as err:") in the function where this clause appears so the module can parse
under Python 3; update any references to the caught exception variable to use
the chosen name (e.g., err).

In `@src/backend/core/mda/outbound_direct.py`:
- Line 42: Replace the invalid Python 2 style exception clause with a Python 3
tuple or separate excepts: change the line using "except dns.resolver.NXDOMAIN,
dns.resolver.YXDOMAIN:" to "except (dns.resolver.NXDOMAIN,
dns.resolver.YXDOMAIN):" (or use two separate except blocks) and, if you need
the exception object, add "as e" (e.g. "except (dns.resolver.NXDOMAIN,
dns.resolver.YXDOMAIN) as e:"). This change should be made in the try/except
handling around DNS resolution in outbound_direct.py where dns.resolver.NXDOMAIN
and dns.resolver.YXDOMAIN are currently caught.

In `@src/backend/core/mda/rfc5322/composer.py`:
- Around line 86-88: The _HEADER_INJECTION_CHARS literal contains invisible
Unicode characters which are ambiguous in diffs; replace them with explicit
escape sequences and update the translation table construction: define
_HEADER_INJECTION_CHARS using visible escapes like
"\r\n\v\f\x00\x85\u2028\u2029" (including U+0085 NEL and U+2028/U+2029 as \u
escapes) and then rebuild _HEADER_INJECTION_TABLE = str.maketrans("", "",
_HEADER_INJECTION_CHARS) so the code and docstring clearly show which characters
are stripped (refer to symbols _HEADER_INJECTION_CHARS and
_HEADER_INJECTION_TABLE).
- Around line 100-101: _sanitize_header_value currently returns non-str inputs
unchanged, violating its -> str signature; update _sanitize_header_value to
always return a str by handling bytes (decode to str with a safe error handler)
and otherwise coercing non-str values with str(value), then perform the existing
sanitization/whitespace trimming before returning; ensure the function signature
and callers (e.g., where jmap_data["messageId"] is passed) receive a string.
- Around line 142-147: The logger.warning call in composer.py currently logs the
raw date value (`date`) which may contain PII; update the `logger.warning`
invocation to avoid printing `date` and only log the type (keep
`type(date).__name__`) and a generic message such as "Could not parse date (type
%s); falling back to current UTC" so the date value is not included; locate the
offending `logger.warning` call (references to `date` and `type(date).__name__`)
and remove the `%r`/`date` argument from the format and args.
- Around line 123-139: The except clauses use Python 2 syntax and must be
updated to Python 3 tuple-style exceptions: replace "except ValueError, OSError,
OverflowError:" with "except (ValueError, OSError, OverflowError):", replace
"except ValueError, TypeError:" with "except (ValueError, TypeError):", and
replace "except ValueError, TypeError, IndexError:" with "except (ValueError,
TypeError, IndexError):" in the date-parsing code that calls
datetime.datetime.fromisoformat, parsedate_to_datetime, and timezone.make_aware
so the module can import under Python 3.
- Around line 559-564: _in _embed_original_message the call to
parsedate_to_datetime(orig_date) is not guarded and can raise on malformed
RFC2822 dates; wrap the parse in a try/except (catch ValueError and IndexError,
or a broad Exception if you prefer parity with _normalize_date) and on exception
fall back to using orig_date as the formatted string (i.e., set date_str =
orig_date), or alternatively call _normalize_date(orig_date) instead of directly
calling parsedate_to_datetime to reuse the existing safe parsing logic.

In `@src/backend/core/mda/rfc5322/parser.py`:
- Line 71: The except statement in src/backend/core/mda/rfc5322/parser.py uses
Python 2 syntax "except LookupError, UnicodeDecodeError:" which is invalid in
Python 3; change it to the multi-exception tuple form "except (LookupError,
UnicodeDecodeError):" (or "except (LookupError, UnicodeDecodeError) as e:" if
the exception object is needed) to restore correct Python 3 exception handling
for the parser module.

In `@src/backend/core/models.py`:
- Around line 393-396: The exception handlers use Python2 syntax and the role
lookup is using an index-based access that never raises
MailDomainAccess.DoesNotExist; update each occurrence (e.g., the block using
self.accesses.filter(user=user).values("role")[0]["role"]) to use
.values_list("role", flat=True).first() to get the role safely and remove the
dead DoesNotExist handling, and replace any Python2-style except
MailDomainAccess.DoesNotExist, IndexError: with Python3 tuple-style or separate
excepts if truly needed (or delete the except entirely after switching to
.first()); search for occurrences around the symbols
self.accesses.filter(...).values(...) and MailDomainAccess.DoesNotExist and
apply the same change at all listed locations.

In `@src/backend/core/services/dns/providers/scaleway.py`:
- Line 197: The except clause in
src/backend/core/services/dns/providers/scaleway.py uses Python 2 comma syntax
("except ValueError, KeyError:") which is invalid in Python 3; change it to use
tuple syntax ("except (ValueError, KeyError):") at the same except block (the
exception handling in the Scaleway provider code) so both ValueError and
KeyError are caught correctly and the block remains syntactically valid.

In `@src/backend/core/services/importer/pst.py`:
- Around line 579-582: Replace the Python2-style comma-separated except clause
that wraps the num_recipients parse with valid Python3 exception syntax and
avoid the redundant catching of Exception; change the invalid "except
AttributeError, TypeError, ValueError, Exception:" to a Python3 tuple form
catching the specific errors like "except (AttributeError, TypeError,
ValueError):" and return result, and apply the same fix to the other identical
clause in this PST importer module (the one around parsing
message.number_of_recipients) so the file imports under Python3.

In `@src/backend/core/signals.py`:
- Line 291: The except clause in src/backend/core/signals.py (the except
ValueError, AttributeError: on line 291) uses Python 2 syntax; change it to
Python 3 form by catching both exceptions with a tuple: use except (ValueError,
AttributeError): and, if the block expects the exception object, use except
(ValueError, AttributeError) as e: and update any references to the exception
variable accordingly.

In `@src/backend/core/tests/mda/test_rfc5322_composer.py`:
- Around line 1703-1721: The test
test_t2_cid_with_unicode_line_separator_is_sanitized embeds ambiguous literal
line-separator characters; change the attachment cid value to use explicit
escapes (e.g. "id\u2028split" or "id\u2029split" as appropriate) instead of
literal U+2028/U+2029, and update the two assertions to explicitly check for
"\u2028" and "\u2029" (assert "\u2028" not in cid and assert "\u2029" not in
cid) so the intent and sanitizer behavior are unambiguous; locate these changes
around the jmap attachment cid and the checks that read the variable cid in
test_t2_cid_with_unicode_line_separator_is_sanitized.

In `@src/backend/messages/settings.py`:
- Line 40: The except clause in settings.py currently uses Python 2 syntax
("except FileNotFoundError, KeyError:") which causes a SyntaxError at import;
change it to a valid Python 3 form by combining exceptions with a tuple or using
separate except blocks (i.e., replace the offending clause with "except
(FileNotFoundError, KeyError):" or handle FileNotFoundError and KeyError in two
distinct except handlers) where the try/except around the settings file
read/parse occurs so the module can import cleanly.

---

Outside diff comments:
In `@src/backend/core/admin.py`:
- Around line 43-48: The except clause in format_value is using Python 2 comma
syntax and must be updated to Python 3 tuple-style exception handling; change
the handler in the format_value method (which currently reads the invalid
"except json.JSONDecodeError, TypeError:") to catch both exceptions using a
tuple (i.e. except (json.JSONDecodeError, TypeError):) so JSON decoding errors
and TypeError are properly ignored without causing a SyntaxError.

In `@src/backend/core/api/permissions.py`:
- Around line 397-402: The except clause on the Mailbox lookup uses Python 2
tuple syntax and must be parenthesized; update the exception handler around the
target_mailbox lookup in the permission check (the try block that calls
models.Mailbox.objects.select_related("domain").get(pk=mailbox_id_from_url)) to
use a parenthesized tuple like except (models.Mailbox.DoesNotExist, ValueError):
and return False as before so the module imports cleanly under Python 3.

In `@src/backend/core/api/serializers.py`:
- Around line 1719-1725: The except clause in the tag validation loop uses
Python 2 syntax and must be updated; in the loop iterating "for tag_id in tags"
(where tag_uuid = uuid.UUID(str(tag_id)) and invalid_tags is appended on
failure) replace the invalid "except ValueError, TypeError:" with a modern
multi-except grouping "except (ValueError, TypeError):" so both exceptions are
caught correctly and the serializer (in src/backend/core/api/serializers.py
around the tag validation block) can import under Python 3.

In `@src/backend/core/tests/mda/test_rfc5322_composer.py`:
- Around line 1534-1555: Remove the duplicated test method
test_create_reply_with_empty_original from the TestErrorHandling test class (it
already exists in TestReplyGeneration) to avoid divergence; either delete the
duplicate or replace it with a focused error-specific variant that exercises
error handling behavior (e.g., asserting that missing original fields are
handled gracefully by create_reply_message) and keep references to
create_reply_message, TestReplyGeneration, and TestErrorHandling so the reviewer
can find and modify the correct location.
- Around line 636-641: Replace the fragile use of msg.get_payload() with the
same decoded pattern used elsewhere: call msg.get_payload(decode=True) and
decode it with the message charset (e.g. payload =
msg.get_payload(decode=True).decode(msg.get_content_charset() or "utf-8")) and
then assert the substring is in that decoded payload; update both occurrences
that use msg.get_payload() (the assertions around the no-subject email and the
later block at the other occurrence).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 20332ad2-dc35-4a5f-9217-615760461a09

📥 Commits

Reviewing files that changed from the base of the PR and between c3d9c3c and 0a9580e.

📒 Files selected for processing (15)
  • src/backend/core/admin.py
  • src/backend/core/api/permissions.py
  • src/backend/core/api/serializers.py
  • src/backend/core/api/viewsets/image_proxy.py
  • src/backend/core/api/viewsets/submit.py
  • src/backend/core/entitlements/backends/deploycenter.py
  • src/backend/core/mda/outbound_direct.py
  • src/backend/core/mda/rfc5322/composer.py
  • src/backend/core/mda/rfc5322/parser.py
  • src/backend/core/models.py
  • src/backend/core/services/dns/providers/scaleway.py
  • src/backend/core/services/importer/pst.py
  • src/backend/core/signals.py
  • src/backend/core/tests/mda/test_rfc5322_composer.py
  • src/backend/messages/settings.py

Comment thread src/backend/core/api/viewsets/image_proxy.py Outdated
Comment thread src/backend/core/api/viewsets/submit.py Outdated
Comment thread src/backend/core/entitlements/backends/deploycenter.py Outdated
Comment thread src/backend/core/mda/outbound_direct.py Outdated
Comment thread src/backend/core/mda/rfc5322/composer.py Outdated
Comment thread src/backend/core/services/dns/providers/scaleway.py Outdated
Comment thread src/backend/core/services/importer/pst.py
Comment thread src/backend/core/signals.py Outdated
Comment thread src/backend/core/tests/mda/test_rfc5322_composer.py Outdated
Comment thread src/backend/messages/settings.py Outdated
@sylvinus sylvinus force-pushed the stdlib_composer branch from dfbaf97 to afb8374 Compare May 6, 2026 20:20
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backend/core/mda/rfc5322/composer.py`:
- Around line 187-196: The Message-ID/In-Reply-To header wrapping is incorrect
because it only adds brackets when both sides are missing; update the code that
sets message_part["Message-ID"] and message_part["In-Reply-To"] to use a
per-side normalizer instead of the current conjunction check: extract a helper
(e.g., _ensure_angle_brackets) that prepends '<' if missing and appends '>' if
missing, run values through _sanitize_header_value then _ensure_angle_brackets
(or call the existing _normalize_cid which can delegate to that helper), and
remove the duplicate faulty logic in create_reply_message so both locations
share the same correct helper.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6d4fd06b-bc4f-41e8-81ef-78face0bae22

📥 Commits

Reviewing files that changed from the base of the PR and between 0a9580e and afb8374.

📒 Files selected for processing (3)
  • src/backend/core/mda/rfc5322/composer.py
  • src/backend/core/tests/mda/test_rfc5322_composer.py
  • src/backend/pyproject.toml

Comment thread src/backend/core/mda/rfc5322/composer.py Outdated
Comment thread src/backend/core/mda/rfc5322/composer.py Fixed
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/backend/core/mda/rfc5322/composer.py (1)

842-850: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop logging the malformed Message-ID value verbatim.

orig_message_id comes from inbound mail and can contain mailbox identifiers. Logging it with %r leaks that user-controlled value into backend logs for a path that is already handled safely by dropping the header.

Minimal fix
         except EmailComposeError:
             logger.warning(
-                "Dropping malformed inbound Message-ID %r from reply threading",
-                orig_message_id,
+                "Dropping malformed inbound Message-ID from reply threading"
             )

As per coding guidelines, "Do not log sensitive information (tokens, passwords, financial/health data, PII)".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backend/core/mda/rfc5322/composer.py` around lines 842 - 850, The warning
logs the raw inbound header value (orig_message_id) which may contain sensitive
mailbox identifiers; update the exception handler around _validate_msg_id (in
the In-Reply-To handling) to stop interpolating orig_message_id into
logger.warning and instead emit a generic message (e.g., "Dropping malformed
inbound Message-ID from reply threading") or include only a non-sensitive
indicator such as the header name or a boolean flag; modify the logger.warning
call in the except EmailComposeError block to remove the %r argument and message
interpolation while keeping the contextual text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backend/core/mda/rfc5322/composer.py`:
- Around line 375-407: The current attachment creation block in composer.py
returns None on base64 decode or MIME part creation errors causing
_wrap_with_inline_images/_wrap_with_attachments to drop attachments silently;
instead, surface the failure by raising a clear exception (e.g. ValueError or a
custom AttachmentError) with contextual details (include filename, content_id,
and the original error) when decoding or MIMEPart construction fails, so the
caller can abort composition rather than filtering out the part silently; update
the except handlers around base64.b64decode and MIMEPart.set_content to raise
that exception and ensure callers of this helper (notably
_wrap_with_inline_images and _wrap_with_attachments) propagate or handle the
exception to fail composition.

---

Duplicate comments:
In `@src/backend/core/mda/rfc5322/composer.py`:
- Around line 842-850: The warning logs the raw inbound header value
(orig_message_id) which may contain sensitive mailbox identifiers; update the
exception handler around _validate_msg_id (in the In-Reply-To handling) to stop
interpolating orig_message_id into logger.warning and instead emit a generic
message (e.g., "Dropping malformed inbound Message-ID from reply threading") or
include only a non-sensitive indicator such as the header name or a boolean
flag; modify the logger.warning call in the except EmailComposeError block to
remove the %r argument and message interpolation while keeping the contextual
text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 88159047-8e71-4630-ad84-8462e097de49

📥 Commits

Reviewing files that changed from the base of the PR and between afb8374 and 2916f9c.

📒 Files selected for processing (7)
  • Makefile
  • bin/pytest
  • src/backend/core/factories.py
  • src/backend/core/mda/rfc5322/composer.py
  • src/backend/core/services/importer/pst.py
  • src/backend/core/tests/mda/test_rfc5322_composer.py
  • src/backend/core/tests/mda/test_rfc5322_composer_fuzz.py

Comment thread src/backend/core/mda/rfc5322/composer.py
Latest fixes in the stdlib make it a more solid alternative for strict composition than Flanker. We keep Flanker for now for lenient inbound parsing. We add stronger tests and fuzzing to validate we didn't regress.
@sylvinus sylvinus force-pushed the stdlib_composer branch from 2916f9c to f3e9b67 Compare May 7, 2026 12:47
@sylvinus
Copy link
Copy Markdown
Member Author

Merged as 6eab1af

@sylvinus sylvinus closed this May 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants