MSC4456: Harms taxonomy (#4456)
Conversation
Implementation requirements:
- 2+ MSCs in different areas using this. For example, search redirection and reporting.
Technically met by the following two MSCs, but I'd like to see each further along in the spec process before considering them true implementations of this MSC:
> * `m.spam.fraud` - Fraud/Phishing
> * `m.spam.impersonation` - Impersonation
> * `m.spam.election_interference` - Election Interference
> * `m.spam.flooding` - Flooding
It may be worthwhile to have a separate `m.misinformation` category, especially to include other forms of synthetic media/deepfakes besides `m.adult.deepfake` (not all deepfakes are inherently sexual, so there could be e.g. `m.misinformation.deepfake`). "Election interference" feels like it wouldn't always be a subcategory of spam.
Having them be independent also means you can classify `m.misinformation.fraud` alongside `m.spam` where they come in a list (sketched further below).
Additionally, if the UX was designed to align with these categories, it wouldn't make sense for a user to view these under "spam," IMHO.
Looking at Bluesky (which this list is partly inspired by), they label "spam" as "Misleading - spam or other inauthentic behaviour or deception". We might want to adopt similar labeling for the "Spam" category we have here.
They also consider deepfakes to be primarily adult content. I suspect that if a user was reporting a deepfake that wasn't easily classified as adult content then they'd use "impersonation" or "other misleading content" (to use the Bluesky terms).
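To make the earlier point about independent top-level categories concrete, here is a minimal sketch of a report body carrying more than one harm code. The `org.example.msc4456.harms` field name is hypothetical (the current `POST /_matrix/client/v3/rooms/{roomId}/report/{eventId}` body only defines `reason` and `score`), so this illustrates the data shape rather than anything this MSC specifies:

```python
import json

# Hypothetical report body: the "org.example.msc4456.harms" field name is
# invented for this sketch; only "reason" and "score" exist in the current
# client-server reporting API.
def build_report_body(harm_codes: list[str], reason: str) -> str:
    body = {
        "reason": reason,                         # free-text fallback, as today
        "org.example.msc4456.harms": harm_codes,  # hypothetical list of codes
    }
    return json.dumps(body, indent=2)

# Independent top-level categories let one report carry both labels:
print(build_report_body(
    ["m.misinformation.deepfake", "m.spam.flooding"],
    "AI-generated video being mass-posted across rooms",
))
```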
> # MSC4456: Harms taxonomy
I am very dubious that we should be baking a taxonomy like this into the Matrix spec, because:
- The spec is already huge
- Harms can be very subjective and will encourage bikeshedding or bloat. For example, where is lèse-majesté on the list?
- In practice we'll always need an 'other' fallback with a natural language explanation anyway - why not use natural language all along?
- Why do we care about semantic codes here at all?
- I suspect that we're going to see more and more LLM-based moderation functionality in future, which will be quite happy to process natural language reasons rather than trying to create a set of reason enumerations
At the least, I'd expect the reasons to sit in an external registry somewhere to avoid bloating the spec.
To be fair, many online reporting platforms have a flow specifically for selecting harms information, which I believe to be the point of the MSC:
*[screenshot: the harm-selection step of a typical online reporting flow]*
It feels reasonable to want to put a list of some common harms in to make the interface better and potentially aid in tooling without the intrinsic requirement for LLMs :D
I do agree that it might be better outside of the spec, but the question for me is where this would get defined, because it feels very necessary.
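As a rough illustration of the tooling point, a client could drive its report dialog directly from a published taxonomy. Everything below except the `m.spam.*` identifiers quoted from the diff is an assumption for this sketch (the labels and the `m.other` fallback are invented):

```python
# Sketch of a taxonomy-driven report dialog. The code-to-label mapping is
# invented for illustration; only the m.spam.* identifiers appear in the
# quoted diff, and m.other is a hypothetical fallback.
HARM_LABELS = {
    "m.spam.fraud": "Fraud or phishing",
    "m.spam.impersonation": "Impersonation",
    "m.spam.election_interference": "Election interference",
    "m.spam.flooding": "Flooding / repeated unwanted messages",
    "m.other": "Something else (describe below)",
}

def top_level(code: str) -> str:
    """Return the top-level category of a harm code, e.g. 'm.spam'."""
    return ".".join(code.split(".")[:2])

def render_picker() -> None:
    """Print a grouped selection menu a client might show when reporting."""
    by_category: dict[str, list[str]] = {}
    for code in HARM_LABELS:
        by_category.setdefault(top_level(code), []).append(code)
    for category, codes in by_category.items():
        print(category)
        for code in codes:
            print(f"  [ ] {HARM_LABELS[code]}  ({code})")

render_picker()
```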
> - The spec is already huge

Building a better way to communicate necessarily involves a bunch of detail!

> - Harms can be very subjective and will encourage bikeshedding or bloat.

That is a risk. Defining a baseline taxonomy rather than a comprehensive one may help to mitigate that risk.

> - In practice we'll always need an 'other' fallback with a natural language explanation anyway - why not use natural language all along?
> - Why do we care about semantic codes here at all?

With semantic codes:

- Clients can more easily build better reporting flows, offering tailored advice based on the type of harm the user has experienced (e.g. direction to helplines, law enforcement, how to keep themselves safe).
- Servers and communities can use the codes to communicate why enforcement action was taken against a user or piece of content (a requirement in many safety laws).
- Safety teams can use user-provided codes to triage reports more effectively, both with human teams and by routing to the most cost/time-effective automated flows (a rough routing sketch follows below).
- When all servers in a Matrix federation share a common taxonomy of harms, it simplifies sharing details of those harms over federation.
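A hedged sketch of that triage/routing point follows. The queue names, escalation rules, and the non-`m.spam` prefixes are invented for illustration rather than identifiers this MSC defines:

```python
# Hypothetical triage routing: map harm-code prefixes to review queues.
# Queue names, escalation rules, and the non-m.spam prefixes are invented.
ROUTING = {
    "m.child_safety": "escalate_to_specialist_team",
    "m.self_harm": "escalate_to_specialist_team",
    "m.spam": "automated_spam_pipeline",
    "m.adult": "human_review_queue",
}
DEFAULT_QUEUE = "human_review_queue"

def route_report(harm_code: str) -> str:
    """Pick a review queue from the most specific matching code prefix."""
    prefix = harm_code
    while prefix:
        if prefix in ROUTING:
            return ROUTING[prefix]
        prefix, _, _ = prefix.rpartition(".")
    return DEFAULT_QUEUE

assert route_report("m.spam.flooding") == "automated_spam_pipeline"
assert route_report("m.unrecognised.harm") == DEFAULT_QUEUE
```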
> - I suspect that we're going to see more and more LLM-based moderation functionality in future, which will be quite happy to process natural language reasons rather than trying to create a set of reason enumerations

Using semantic reasons enables the use of more cost-effective & faster single-purpose models rather than slower, more expensive general models, and enables routing to appropriate humans for review.

> At the least, I'd expect the reasons to sit in an external registry somewhere to avoid bloating the spec.

An alternative: reference an external standard, as we do with RFCs elsewhere in the spec. Unfortunately, there doesn't appear to be a suitable standard that provides this taxonomy at present, but this could be something to explore and then replace this proposal down the line. The DTSP framework (ISO/IEC 25389) is an example of nascent work in safety standardisation, but it doesn't do what we need here. The spec could also offer appendices for this type of content if there are concerns. I think the spec should contain appropriate guidance on building safe servers and clients, so I'd be comfortable with us including it in the body of the spec.
The closest existing references I can find are AT Proto's `com.atproto.moderation.defs` and `tools.ozone.report.defs` models. Obviously, these definitions are highly targeted at AT Proto's use cases, but the parallels in this MSC should be fairly evident as well :)
We may benefit from just copying AT Proto's definitions directly, or working with them to create an external standard that works for both of us. This MSC currently suggests we do something similar to what AT Proto did: create an appendix/definition that exists within their world and refer to it as needed.
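To illustrate what copying or aligning with AT Proto's definitions could look like, here is a rough translation sketch. The AT Proto identifiers are report reasons from `com.atproto.moderation.defs`; the correspondence to Matrix-style codes, and every code on the right other than `m.spam`, is guesswork for this sketch rather than anything agreed between the projects:

```python
# Illustrative mapping from AT Proto report reasons to Matrix-style harm codes.
# The right-hand codes other than m.spam are hypothetical placeholders.
ATPROTO_TO_MATRIX = {
    "com.atproto.moderation.defs#reasonSpam": "m.spam",
    "com.atproto.moderation.defs#reasonMisleading": "m.misinformation",
    "com.atproto.moderation.defs#reasonSexual": "m.adult",
    "com.atproto.moderation.defs#reasonViolation": "m.other",
    "com.atproto.moderation.defs#reasonOther": "m.other",
}

def translate(atproto_reason: str) -> str:
    """Translate an AT Proto reason into a Matrix-style harm code."""
    return ATPROTO_TO_MATRIX.get(atproto_reason, "m.other")

print(translate("com.atproto.moderation.defs#reasonSpam"))  # m.spam
```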
The next closest I can find for an existing reference is the European Commission's Transparency Database API, which describes content in two ways: a Category and a Category Specification. By nature of being backed by the Digital Services Act (DSA), it's highly focused on that particular regulatory environment - it does not easily apply to other environments such as the UK, US, Australia, or Canada (despite these places copying most of each other's work in law creation).
Related work comes from the World Economic Forum (WEF), which attempts to describe harms in ways that users can understand, but is hardly a "harm identifier" list. Their report can be found here.
The Trust & Safety Professional Association (TSPA) attempts to list the types of abuse, but also doesn't define machine-friendly identifiers for those abuse types. It may be possible to ask them to create a machine-friendly taxonomy for their list, though I expect it'll be too broad for our purposes in Matrix.
IFTAS is primarily used by ActivityPub, but is not a standards organization. They do however provide definitions for 3 types of harmful content (and how to deal with it): by actor, behaviour, or content. Like TSPA, we might be able to ask them to consider a machine-friendly specification for these types of harms. Being associated with ActivityPub might make them more applicable to Matrix too.
If we really don't want to host the list as Matrix, we can probably look to the W3C Data Privacy Vocabularies and Controls Community Group (DPVCG) to establish a set of identifiers. The DPVCG might not take on the work because not all harms are privacy related.
OASIS might be able to help create a standard external to Matrix as well, though their primary output locations are the ISO and IEC. It may be faster/easier/different to go through a local national body instead, like the Standards Council of Canada (SCC). The DTSP Safe Framework Specification is hosted as ISO/IEC 25389 (as Jim mentions), so it's plausible that we could get a similar harms taxonomy specification there too.
DTSP might also be able to help create a standard to reference.
Warning
Content Warning: This proposal discusses and identifies harmful content, but does not attempt
to describe the harm posed in detail. This includes identifiers for child safety, sexual abuse,
self-harm, and other types of harm a user may encounter on the open internet.
This proposal was split out from MSC4387.