219 changes: 219 additions & 0 deletions proposals/4457-generic-reporting-api.md
Member Author

Implementation requirements:

  • Client (using new API)
  • Server (offering new API)

@@ -0,0 +1,219 @@
# MSC4457: Generic reporting API

*This MSC is part of “Reporting v2” - a project led by the Foundation’s T&S team to improve communication
and effectiveness of reports on Matrix.*

Matrix’s [existing module](https://spec.matrix.org/v1.18/client-server-api/#reporting-content) for
reporting content lets users report other users, rooms, and events. These reports
are supported by 3 separate APIs, each with similar-but-different semantics. Those API endpoints
additionally do not support reporting media items, server names, complaints about the system itself,
or appeals to past moderation action.

There are other issues with the reporting APIs, such as the vast majority of reports ending up in a
database for other tooling to pull from. This proposal does *not* solve those other concerns, but
does outline what future (or existing, in some cases) MSCs might do to help solve these problems.
Collectively, the series of MSCs to fix reporting is called “Reporting v2”.

This MSC consolidates the 3 report endpoints into a single generic endpoint. The refactoring also leaves
room for future MSCs to extend that endpoint, though those extensions are not specified here.


## Proposal

The following endpoints are deprecated in favour of a new single endpoint:

* [`POST /_matrix/client/v3/rooms/{roomId}/report`](https://spec.matrix.org/v1.18/client-server-api/#post_matrixclientv3roomsroomidreport)
* [`POST /_matrix/client/v3/rooms/{roomId}/report/{eventId}`](https://spec.matrix.org/v1.18/client-server-api/#post_matrixclientv3roomsroomidreporteventid)
* [`POST /_matrix/client/v3/users/{userId}/report`](https://spec.matrix.org/v1.18/client-server-api/#post_matrixclientv3usersuseridreport)

The new single endpoint is defined to cover a broader range of reportable entities:

```
POST /_matrix/client/v1/safety/report/{txnId}
Authorization: <normal Client-Server API authentication>
Content-Type: application/json

{
  "type": "complaint", // future scope: "appeal" and possibly other types
  // ... other fields per `type`
}
```

Currently, only the `complaint` type is specified. A future MSC will add an `appeal` type. Other types
may be added by other MSCs.

The `txnId` is a [Transaction Identifier](https://spec.matrix.org/v1.18/client-server-api/#transaction-identifiers).
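
As a rough client-side sketch of the request shape above: the helper below only assembles the path, headers, and body, and sends nothing. The function name `build_report_request` is hypothetical, and the `Bearer` header is an assumption standing in for "normal Client-Server API authentication".

```python
import json
import uuid
from urllib.parse import quote

# Stable path from this proposal.
REPORT_PATH = "/_matrix/client/v1/safety/report/{txn_id}"


def build_report_request(access_token, report_type, fields, txn_id=None):
    """Return (path, headers, body) for a report; nothing is sent."""
    # Use a fresh transaction ID per logical report; pass the same one when retrying.
    txn_id = txn_id or uuid.uuid4().hex
    # Percent-encode the transaction ID so it is safe as a single path segment.
    path = REPORT_PATH.format(txn_id=quote(txn_id, safe=""))
    headers = {
        "Authorization": f"Bearer {access_token}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    # `type` sits at the top level, next to the type-specific fields.
    body = json.dumps({**fields, "type": report_type})
    return path, headers, body
```

Keeping the transaction ID under the caller's control means a retry after a network failure can reuse the same `txnId` rather than filing a second report.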

If the `type` is `complaint`, the following *additional* fields are present on the request body at
the top level:

```jsonc
{
  // The *primary* identifier (user ID, etc) the complaint is regarding. Currently can be one of the following:
  // * A user ID
  // * An event ID
  // * A room ID
  // * A room alias (noting that room ID reports are more reliable because aliases can drift between rooms)
  // * A server name (prefixed with "server:" to distinguish it from a namespaced ID below)
  // * An MXC URI (media URI)
  // * The string "m.system" to denote a complaint regarding the reporting system itself
  // * A common namespaced identifier (https://spec.matrix.org/v1.18/appendices/#common-namespaced-identifier-grammar)
  //
  // The above identifiers are structured so the server can identify each one individually. For example, the server
  // knows it's dealing with a user report if `regarding` starts with `@`.
  //
  // REQUIRED.
  "regarding": "<identifier>",

  // The type of harm being reported in this complaint. Currently, the available harms are defined
  // by MSC4456: https://github.com/matrix-org/matrix-spec-proposals/pull/4456
  // A future MSC is expected to advertise which custom harms (if any) the server supports.
  //
  // Note: This is the reporter's opinion and is not necessarily fact.
  //
  // REQUIRED.
  "harm": "<harm identifier>",

  // The text the user supplied to support this complaint. The input field presented to the user SHOULD ask them
  // to *briefly* describe the content or harm caused.
  //
  // Cannot exceed 1024 bytes (before trimming whitespace).
  //
  // REQUIRED (cannot be an empty string, after trimming whitespace).
  "description": "This user is spamming",
}
```

For some harms it doesn't make sense to require a description - the subject is enough to make a decision. Requiring it is likely to cause some portion of reports to have content like "."

Having a content of "." in some proposals where it isn't needed is imo better than people not providing a description where it would be necessary.

Member Author

We're currently thinking that description is probably going to become required for some types of harm and optional for others. Real world data shows most reports are filed with unhelpful reasons - asking the user to pick the type of harm is infinitely more helpful than the reasons currently filed.

Some examples of reasons we've seen (based on clicking random reports in our system):

  • "Reporting user @someone:example.org" - No indication for why they're being reported. The event referenced is their membership event in a room.
  • "spam" - The event sometimes is obviously spam, but other times it's things like "hello" or a github URL.
  • "embargo" - Presumably this is either a translation issue or the word means something to someone else.
  • "SPAM" - A few users have pressed extra keys to make it all caps instead.

Of the ~50 reports I opened, 3 had actionable reasons beyond "spam" and 1 used a slur to describe the user they were reporting. The remainder had one of the above 4 reasons.
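
The identifier forms listed for `regarding` can be told apart by their leading characters. The following is a minimal sketch of how a server might do that, not a normative algorithm: the function name and the returned labels are hypothetical, and the regular expression is an approximation of the common namespaced identifier grammar (lowercase start, then `a-z`, `0-9`, `_`, `-`, `.`, up to 255 characters).

```python
import re

# Approximation of the common namespaced identifier grammar (assumption).
_NAMESPACED = re.compile(r"[a-z][a-z0-9_.\-]{0,254}")

# Sigils used by Matrix identifiers; the labels on the right are hypothetical.
_SIGILS = {"@": "user", "$": "event", "!": "room_id", "#": "room_alias"}


def classify_regarding(regarding: str) -> str:
    """Return a label for the kind of entity a complaint is regarding."""
    # Check the literal first: "m.system" would also match the namespaced grammar.
    if regarding == "m.system":
        return "system"
    if regarding.startswith("mxc://"):
        return "media"
    # The "server:" prefix cannot collide with a namespaced ID, which has no colon.
    if regarding.startswith("server:"):
        return "server"
    if regarding[:1] in _SIGILS:
        return _SIGILS[regarding[:1]]
    if _NAMESPACED.fullmatch(regarding):
        return "namespaced"
    raise ValueError(f"unrecognised identifier: {regarding!r}")
```

This sketch only inspects the shape of the identifier. Whether the entity exists, or whether the caller can see it, is a separate check.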

*Author's note*: The choice to use `regarding` is deliberate for a bit of natural language: a user is
**reporting** a **complaint** **regarding** something/someone which caused **harm**. `description`
replaces what used to be called `reason` because the word "reason" feels wrong in this context. The
reporter is adding information (a "description"), not justifying their actions.

I think this is slightly clunky, although not too important.

  • Reporting an Appeal (or other type of object) doesn't make as much sense
  • The user is making a complaint with a subject and a description, and the moderator looks at the subject of the complaint, not the regarding of the complaint.

Member Author

hmm, yea, the example breaks down quite a bit with appeals. The important distinction I'm trying to make is it's not a reason for the concern, it's a description of the concern. It's subtle, but "reason" implies that the report is being made in error up front while "description" asks for more information.

*Note:* Future MSCs are expected to add more fields, like where/who to send the report to (community
moderators, server admin, remote server, etc).

*Note*: The `regarding` field doesn't cover the case where a reporter wishes to say "all of Alice's
conduct in {this} room is bad". It's expected that when a future MSC introduces attachments that the
client can send a report with a *primary identifier* of `@alice:example.org` then append specific
events or room IDs in a followup. Clients can work around this for now by manually adding the room
ID or event IDs to the `description` alongside the user's own text.
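
The workaround described above might look like the following on the client side. This is only a sketch: the `Related:` suffix format and the function name are invented for illustration, and the 1024-byte check mirrors the limit on `description`.

```python
MAX_DESCRIPTION_BYTES = 1024  # limit on `description` from this proposal


def description_with_context(user_text: str, related_ids: list) -> str:
    """Append room/event IDs to the user's own text (hypothetical format)."""
    text = user_text.strip()
    if related_ids:
        # Keep the user's words first so moderators read them before the IDs.
        text = f"{text}\n\nRelated: {' '.join(related_ids)}"
    # The appended IDs count against the same byte budget as the user's text.
    if len(text.encode("utf-8")) > MAX_DESCRIPTION_BYTES:
        raise ValueError("description exceeds 1024 bytes")
    return text
```

Because the IDs share the byte budget with the user's text, a client may want to warn the user before they type a description that leaves no room for them.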

If the report was successful, the server responds with:

```jsonc
{
  // An opaque identifier (https://spec.matrix.org/v1.18/appendices/#opaque-identifiers) to uniquely
  // represent this report in the homeserver's system. Currently has no use.
  //
  // REQUIRED.
  "report_id": "<opaque>"
}
```

*Note:* Future MSCs will use the report ID to allow appending further evidence of harm (screenshots,
events, etc) and likely support communication to the reporter regarding their report. Clients can
discard the report ID for now.

If the report was *not* successful, the server responds with one of the following [error codes](https://spec.matrix.org/v1.18/client-server-api/#standard-error-response):

* `429 M_LIMIT_EXCEEDED` - Rate limited (“too many reports too quickly”).
* `400 M_BAD_JSON` - A `REQUIRED` field is missing, especially for the `type`, or a field is too long/short.
* `400 M_INVALID_PARAM` - The `type` is unknown to the server. (Note: servers MUST support at least
`complaint`; future MSCs are expected to add further required types.)
* `404 M_NOT_FOUND` - The server cannot process the report because the reported identifier does not
exist (or is unknown). Servers MUST NOT return this error for `m.system` complaints.
* `403 M_FORBIDDEN` - The caller cannot file this report. For example, the server has banned the
caller from filing reports or the server has determined that the caller does not have visibility
on the reported object/entity (ie: can’t see the event they’re complaining about, or can’t appeal
someone else’s ban).
Comment on lines +125 to +128
Member Author

This should clarify that self-reports are valid, even outside of appeals. We don't currently see such reports in the wild because clients disallow it, but self-reporting is a way for some people to reach out to get help.

During implementation I think we should experiment with providing self-report as an option and measure how effective it is.



*Note:* Servers MAY use `404 M_NOT_FOUND` and `403 M_FORBIDDEN`. The remaining error codes MUST only
be returned in applicable cases. This optionality is to allow servers to customize their operations
to their individual regulatory requirements and needs. For example, a server might choose to validate
that a user can see an event, but also choose *not* to return an error if they can’t (instead, it’d
get flagged internally on the report).
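
A client could map the error responses listed above to a next step along these lines. The sketch is illustrative only: the function name and the action labels it returns are hypothetical, and a real client would also surface the server's human-readable `error` text.

```python
# (HTTP status, errcode) pairs from this proposal mapped to hypothetical client actions.
_ACTIONS = {
    (429, "M_LIMIT_EXCEEDED"): "retry_later",
    (400, "M_BAD_JSON"): "fix_request",
    (400, "M_INVALID_PARAM"): "unsupported_type",
    (404, "M_NOT_FOUND"): "unknown_target",
    (403, "M_FORBIDDEN"): "not_allowed",
}


def classify_report_failure(status: int, errcode: str) -> str:
    """Return a coarse action label for a failed report request."""
    # Anything outside the documented set is treated as unexpected rather than guessed at.
    return _ACTIONS.get((status, errcode), "unexpected")
```

Since servers are not required to return `404` or `403`, a client should not treat a `200 OK` as proof that the reported entity exists or was visible to the reporter.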
Comment on lines +131 to +135
Member Author

"MUST only" is trying to carry a requirement for a future compliance suite: a server which always returns a 404 for local users, even if they exist, is non-compliant.


The new report endpoint is available to [guests](https://spec.matrix.org/v1.18/client-server-api/#guest-access).

How the server processes the report remains an implementation detail. A future MSC will clarify where
exactly a report should be routed to. Implementations SHOULD NOT assume that reports will only go to
a single place in the future. This is important for implementations which provide the endpoint as
separate software from the bulk of the homeserver, as otherwise that software might assume that it’s
only going to have to populate a single destination queue.
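
One way to avoid baking in a single destination is to treat the destination as a list from the start. The sketch below is purely illustrative, since report processing is left as an implementation detail: the class, its method, and the destination names used with it are all hypothetical.

```python
from collections import defaultdict


class ReportRouter:
    """Fans a stored report out to any number of named destination queues."""

    def __init__(self):
        # Destination name -> report IDs waiting there, in arrival order.
        self.queues = defaultdict(list)

    def route(self, report_id, destinations):
        # Deliver to every requested destination rather than one fixed queue.
        for name in destinations:
            self.queues[name].append(report_id)
```

With this shape, a later routing MSC would change only how the `destinations` list is computed, not how reports are stored or delivered.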
Comment on lines +140 to +143
Member Author

"Implementations SHOULD NOT assume..." might make more sense as future MSCs are written. I'm mostly trying to say "don't treat this endpoint as only reporting to server admins - it'll have routing information later, and your software will need to be capable of sending communications as needed"



## Future considerations

Future proposals are expected to expand this API's capabilities in the following ways:

* Being able to route reports to community moderators or other servers.
  * Possibly based on the harm itself ("you cannot refuse to send this to your server admin, but you
    can optionally send it to the mods too").
  * Possibly based on choice too ("do you want to send this to `remote.example.org` too? Should we
    tell them you sent the report?").
* Adding attachments (screenshots, events, additional info) and amendments.
* Closing/cancelling reports (and generally the open/closed status of a report).
* Communication frameworks, like who to send updates to. Possibly including a "reply to" address to
direct safety teams at a more responsive user ID, for example.
* Appeals with an actions database/ID. For example: "appeal {$this} ban".
* Possibly adding evidence to someone else's report, similar to a character witness statement.
  * This may be better handled by a new `information` report type.
Comment on lines +160 to +161
Member Author

this may further be best to just leave to the backend. Individual users can submit distinct complaints that are deduplicated/aggregated on the backend.

* Endpoints to support a UX that can prevent a report from being submitted if it would be rejected.
For example, ensuring required fields and information are present.
* Anything not listed above :)


## Potential issues

**TODO**: This section needs completing before proposing FCP.

Implementation and unstable usage of this MSC is expected to populate this section. Currently, expected
risks include:

* Not having a generic enough API to support appeals.
* Clients implementing something which makes it harder to add routing later.
* Returning a report ID only to discard it is a bit strange.
* Internal handling of reports may be difficult.
* Limiting to authenticated users (including guests) may prove to be an issue.

  • A user filling out a report only to hit a forbidden or rate limited error code, wasting the work / losing the data

Member Author

We'll need to prove this during implementation, but we're expecting that only users who intentionally work around a compliant client's reporting flow will encounter these errors. For example, hammering the report endpoint with curl or force-opening a dialog on an event where the button to report isn't shown.

In other words: users who see forbidden or rate limit errors should already be expecting that outcome.


## Alternatives

No significant alternatives.


## Security considerations

* To avoid pages upon pages of text, `description` is limited to 1024 bytes. Bytes rather than characters
were chosen to avoid the "is an emoji 1 character or 2" question. The specific number of 1024 was chosen
arbitrarily: we don't want to limit users such that they can only provide minimal information, but we
also don't want to support the Bee Movie script. 1024 was specifically chosen over 512 to permit system
complaints to have more detail.

* Users can spam the reporting system by flooding it with reports. This is mitigated by rate limiting.

* Existence of a user or a user's ability to see an event can be hidden by returning 200 OK.
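
To make the bytes-versus-characters point concrete, here is a small sketch of the `description` checks. It assumes the limit is measured on the UTF-8 encoding, which is an assumption of this example, and the function name is hypothetical.

```python
MAX_DESCRIPTION_BYTES = 1024


def description_is_valid(description: str) -> bool:
    """Apply the length and emptiness rules for a complaint's `description`."""
    # The byte limit applies before whitespace is trimmed.
    if len(description.encode("utf-8")) > MAX_DESCRIPTION_BYTES:
        return False
    # After trimming, something must remain.
    return description.strip() != ""
```

Measured this way, a description made only of 4-byte emoji runs out of room after 256 of them, while plain ASCII text gets the full 1024 characters.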


## Unstable prefix

While this proposal is not considered stable, implementations should use `/_matrix/client/unstable/org.matrix.msc4457/safety/report/{txnId}`
in place of `/_matrix/client/v1/safety/report/{txnId}`.

Support for the unstable endpoint is advertised as a [`/versions`](https://spec.matrix.org/v1.18/client-server-api/#get_matrixclientversions)
unstable feature flag: `org.matrix.msc4457_report_api`.

Implementations which support the unstable endpoint SHOULD continue to do so for at least 6 months
after this proposal is in a tagged release of the specification. This is to ensure that users on outdated
clients continue seeing modern reporting flows.

There is no `/versions` flag for the period in which the endpoint is stable but unreleased. Implementations
can continue using the unstable endpoint and switch to the stable one once a server advertises support
for the spec version that includes it.
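
A client might choose between the two paths based on the `/versions` response roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and `stable_since` stands in for whichever spec release eventually includes this proposal, which is not known yet.

```python
from urllib.parse import quote

STABLE = "/_matrix/client/v1/safety/report/{}"
UNSTABLE = "/_matrix/client/unstable/org.matrix.msc4457/safety/report/{}"
FLAG = "org.matrix.msc4457_report_api"


def pick_report_path(versions_response, txn_id, stable_since=None):
    """Return the report path to use, or None if the server offers neither."""
    txn = quote(txn_id, safe="")
    # Prefer the stable endpoint once the server lists the (not yet known) spec version.
    if stable_since is not None and stable_since in versions_response.get("versions", []):
        return STABLE.format(txn)
    # Otherwise fall back to the unstable endpoint if the feature flag is set.
    if versions_response.get("unstable_features", {}).get(FLAG, False):
        return UNSTABLE.format(txn)
    return None
```

Returning `None` leaves the caller free to fall back to the deprecated per-entity report endpoints on servers that support neither path.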


## Dependencies

* [MSC4456: Harms taxonomy](https://github.com/matrix-org/matrix-spec-proposals/pull/4456) - Used to
populate the `harm` field on a complaint report.