
Draft: MSC3215: Aristotle - Moderation in all things #3215

Draft
wants to merge 12 commits into `old_master`
134 changes: 97 additions & 37 deletions proposals/3215-towards-decentralized-moderation.md
@@ -86,7 +86,18 @@ can be invited to moderation rooms act upon abuse reports:
}
```

- Each room MAY have state events `m.room.moderator_of`. A room that has a state event `m.room.moderation.
- Each room MAY have state events `m.room.moderator_of`.

```jsonc
{
  "state_key": "m.room.moderation.moderator_of", // A bot used to forward reports to this room.
  "type": "m.room.moderation.moderator_of",
  "content": {
    "user_id": XXX, // The bot in charge of forwarding reports to this room.
  }
  // ... usual fields
}
```

```jsonc
{
  // ... (remainder of this example collapsed in the diff view)
}
```

@@ -131,16 +142,17 @@ with content
| room_id | **Required** id of the room in which the event took place. |
| moderated_by_id | **Required** id of the moderation room, as taken from `m.room.moderated_by`. |
| nature | **Required** The nature of the event, see below. |
| reporter | **Required** The user reporting the event. |
| comment | Optional. String. A freeform description of the reason for sending this abuse report. |
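
For concreteness, here is a minimal sketch of a filled-in report `content` (every identifier below is illustrative; the `event_id` field is assumed from the portion of the proposal above this hunk, and `nature` takes one of the values listed below):

```typescript
// Illustrative m.abuse.report content; all ids are made up for the example.
const report = {
  event_id: "$143273582443PhrSn:example.org", // assumed field: the event being reported
  room_id: "!community:example.org",          // room in which the event took place
  moderated_by_id: "!moderation:example.org", // taken from m.room.moderated_by
  nature: "m.abuse.nature.spam",              // see the enum below
  reporter: "@alice:example.org",             // the user sending the report
  comment: "Same advertising link posted in three rooms.", // optional freeform text
};
```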

`nature` is an enum:

- `m.abuse.disagreement`: disagree with other user;
- `m.abuse.toxic`: toxic behavior, including insults, unsolicited invites;
- `m.abuse.illegal`: illegal behavior, including child pornography, death threats,...;
- `m.abuse.spam`: commercial spam, propaganda, ... whether from a bot or a human user;
- `m.abuse.room`: report the entire room, e.g. for voluntarily hosting behavior that violates server ToS;
- `m.abuse.other`: doesn't fit in any category above.
- `m.abuse.nature.disagreement`: disagree with other user;
- `m.abuse.nature.toxic`: toxic behavior, including insults, unsolicited invites;
- `m.abuse.nature.illegal`: illegal behavior, including child pornography, death threats,...;
- `m.abuse.nature.spam`: commercial spam, propaganda, ... whether from a bot or a human user;
- `m.abuse.nature.room`: report the entire room, e.g. for voluntarily hosting behavior that violates server ToS;
**Member:**

I'm not sure what the moderators of a room should do with a report that the contents of the room violates the ToS of a server. Should it be up to the room moderators to ACL the server, or up to the homeserver to pull its users out of the room?

If a user from a homeserver with a very restricted ToS happens to join your public room, it probably shouldn't be up to the room moderators to deal with that.

**Author:**

Yes, that's where the current abuse report API comes in play. I'll clarify this.

- `m.abuse.nature.other`: doesn't fit in any category above.

**Reviewer:**

Hard-coding a list seems destined to fail. Maybe the list of forbidden content should be in the room somewhere? For example what if adult content isn't allowed? Or discussion of drugs? I think it makes sense for the moderators to make this list. Especially "a Client may give a user the opportunity to think a little about whether the behavior they report truly is abuse" is very difficult when this spec-provided list may not be aligned in any way with what is actually allowed in the room.

**Author (@Yoric, May 31, 2021):**

I love the idea of making the list extensible. It definitely makes sense.

On the other hand, if we think about tooling, I believe that having a standardized list is the way to go because:

  1. it makes internationalization possible;
  2. it makes it easier to write bots or other tools to display abuse reports in a human-readable manner for non-technical moderators;
  3. it makes it easier to customize clients to handle abuse-specific cases.

Additionally:

  • if we do not have a default list of abuse natures, we increase the difficulty of setting up new rooms;
  • my personal experience with e.g. Reddit or Twitter suggests that having a list that is too long makes it harder to pick one abuse nature;
  • if we allow full customization of the list, an evil moderator running illegal activities could probably take advantage of this to hide the real "report this room" action behind a fake one, and use it to deanonymize abuse reporters who believe that they are reporting the entire room to a homeserver administrator.

What do you think about the following?

  1. having a (possibly long) list of standardized abuse natures, initialized in this MSC and extended in future MSCs;
  2. having a mechanism that will let the moderation room and/or the community room compose a list of abuse natures from both standard and non-standard values (the latter will require the room to also specify some internationalization) and that will let the client use natures from this list;
  3. having a mechanism that will specify a default list of standardized abuse natures when creating a new community (or perhaps moderation?) room;
  4. thinking of (UX-based?) counter-measures to avoid the fake "report this room" button.

While points 2+ don't seem very complicated at first, I feel that they deserve their own MSC. I can rephrase the current MSC to leave room for them.

**Reviewer:**

> it makes internationalization possible

This is a good point. Maybe we can have "well known" types of abuse that can be translated by clients. For example "nature": ["m.abuse.nature.toxic", "custom.Sent a message that wasn't the phrase \"Cat.\""]. This way the well-known ones can be auto-translated but the room doesn't need to use all of the well-known types and can add their own. Of course making a consistent UX across well-known and custom categories may be hard.

On the other hand this probably isn't much of an issue. The moderators that create these rules will enter them in the language(s) that their community uses. Just like the moderators will need to translate the ToS at signup, set the room topic or similar.
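
Concretely, a client could render such a mixed list along the lines of this sketch (helper names and label strings are hypothetical, not from the proposal):

```typescript
// Hypothetical rendering of mixed well-known/custom natures.
const WELL_KNOWN_LABELS: Record<string, string> = {
  // In a real client these labels would come from the i18n layer.
  "m.abuse.nature.toxic": "Toxic behaviour",
  "m.abuse.nature.spam": "Spam",
};

function renderNature(nature: string): string {
  if (nature in WELL_KNOWN_LABELS) return WELL_KNOWN_LABELS[nature]; // auto-translated
  if (nature.startsWith("custom.")) return nature.slice("custom.".length); // shown verbatim
  return nature; // unknown value: fall back to the raw identifier
}
```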

> it makes it easier to write bots or other tools to display abuse reports in a human-readable manner for non-technical moderators;

I'm not sure that this provides a huge benefit. I think the reason would usually just be shown. If there is any filtering this is probably customized by the mod team anyways. I would be interested in any examples where this would be helpful.

> it makes it easier to customize clients to handle abuse-specific cases.

What do you have in mind? I can't think of an example here.

> if we do not have a default list of abuse natures, we increase the difficulty of setting up new rooms;

I think this is a client issue. The client can provide a "quick list" of categories if they think it will be useful. They can also save it to the user account or pull a server default list if required. I don't think the spec is the best place to store a helpful list of abuse types to be honest.

> What do you think about the following?

I think this makes sense. I'm still not sure how much value the well-known list provides but I don't think they hurt much if they are optional to use and you can add custom ones. I agree that future MSCs can extend the list and add mechanisms to help clients suggest good defaults.

**Author:**

> > it makes internationalization possible
>
> This is a good point. Maybe we can have "well known" types of abuse that can be translated by clients. For example "nature": ["m.abuse.nature.toxic", "custom.Sent a message that wasn't the phrase \"Cat.\""]. This way the well-known ones can be auto-translated but the room doesn't need to use all of the well-known types and can add their own. Of course making a consistent UX across well-known and custom categories may be hard.
>
> On the other hand this probably isn't much of an issue. The moderators that create these rules will enter them in the language(s) that their community uses. Just like the moderators will need to translate the ToS at signup, set the room topic or similar.

Agreed with both points, but I still feel that this deserves its own MSC :)

> > it makes it easier to write bots or other tools to display abuse reports in a human-readable manner for non-technical moderators;
>
> I'm not sure that this provides a huge benefit. I think the reason would usually just be shown. If there is any filtering this is probably customized by the mod team anyways. I would be interested in any examples where this would be helpful.

It's basically the same thing as translation, just for moderators.

> > it makes it easier to customize clients to handle abuse-specific cases.
>
> What do you have in mind? I can't think of an example here.

Actually, I can't think of a good example, either (I was thinking for instance of using the report button to report calls for help in case of suicide threats, but it's not very convincing).

On the other hand, having a standard list would be very useful for bots/tools. We can imagine a bot lurking both in the Moderation Room and in the Community Room and that watches specifically for `m.abuse.nature.spam` reports. Whenever it receives one in the Moderation Room, it checks in the Community Room whether the message truly looks like spam, using whatever heuristics are at hand, and may either take decisions such as auto-kicking or ping a moderator with a human-readable message to suggest kicking the offender. Similarly, a sufficiently smart bot could use `libnsfw` to discard or deprioritize `m.abuse.nature.porn` reports that don't seem to be porn, etc.

In a very different scenario, we can imagine a bot that files reports on GitLab. In certain cases, the bot should file the abuse report with as much context as it can possibly find - if the bot is present in the Community Room, it can attach the content of messages, copy links to images, etc. Except if the abuse is, say, `m.abuse.nature.gore` or `m.abuse.nature.rape` or ... the moderators probably don't want to see the image.
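
For instance, the dispatch in such a bot could look something like the following hypothetical sketch (none of these handlers are specified by this MSC):

```typescript
// Hypothetical nature-keyed dispatch for a moderation bot.
interface AbuseReport {
  event_id: string;
  room_id: string;
  nature: string;
  reporter: string;
  comment?: string;
}

function dispatch(report: AbuseReport): void {
  switch (report.nature) {
    case "m.abuse.nature.spam":
      // Re-check the reported event with whatever anti-spam heuristics are at
      // hand, then auto-kick or ping a human moderator with a suggestion.
      break;
    case "m.abuse.nature.porn":
      // Run an image classifier first and deprioritize reports that don't match.
      break;
    case "m.abuse.nature.gore":
      // File the report externally *without* attaching the media.
      break;
    default:
      // Forward verbatim, with as much context as the bot can see.
      break;
  }
}
```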

> > if we do not have a default list of abuse natures, we increase the difficulty of setting up new rooms;
>
> I think this is a client issue. The client can provide a "quick list" of categories if they think it will be useful. They can also save it to the user account or pull a server default list if required. I don't think the spec is the best place to store a helpful list of abuse types to be honest.

Agreed. Although I believe that we need a few entries to bootstrap testing.

> > What do you think about the following?
>
> I think this makes sense. I'm still not sure how much value the well-known list provides but I don't think they hurt much if they are optional to use and you can add custom ones. I agree that future MSCs can extend the list and add mechanisms to help clients suggest good defaults.

Are we in agreement that this MSC can start with a short list and that the customization mechanism can wait for a followup MSC?

**Reviewer:**

> It's basically the same thing as translation

One downside of translation is that it is basically putting words in the moderators' mouths. Every client and every translation will interpret the boundaries of each nature differently. For a mod team it seems desirable to know exactly what categories you provide and exactly what they mean.

> On the other hand, having a standard list would be very useful for bots/tools

I don't find these examples too convincing, as they can easily be set up when configuring the bot. Even with a custom list the bot is still receiving predictable natures, as the list is picked by the moderators. I guess it makes a bot slightly easier to use across mod rooms or policies, but I still think it is a minor benefit at best.

> Agreed. Although I believe that we need a few entries to bootstrap testing.

What type of testing do you think you need? I'm confused by this comment.

> Are we in agreement that this MSC can start with a short list and that the customization mechanism can wait for a followup MSC?

I don't think I agree. It sounds like this is a breaking change. I would rather get the customization mechanism specified from the outset so that clients don't need to be changed when it gets added. At the least we would need to explicitly reserve some format for the custom messages.

**Author:**

> > It's basically the same thing as translation
>
> One downside of translation is that it is basically putting words in the moderators' mouths. Every client and every translation will interpret the boundaries of each nature differently. For a mod team it seems desirable to know exactly what categories you provide and exactly what they mean.

I don't think we can escape translation for the end-user.

> > On the other hand, having a standard list would be very useful for bots/tools
>
> I don't find these examples too convincing, as they can easily be set up when configuring the bot. Even with a custom list the bot is still receiving predictable natures, as the list is picked by the moderators. I guess it makes a bot slightly easier to use across mod rooms or policies, but I still think it is a minor benefit at best.

I disagree on this point. Making natures non-standard by default feels like a footgun to me.

> > Agreed. Although I believe that we need a few entries to bootstrap testing.
>
> What type of testing do you think you need? I'm confused by this comment.

Once the MSC feels stable enough, I'm planning to prototype this MSC (probably as part of develop.element.io and matrix.org) to gather feedback from actual moderators and end-users. That's what I meant by "testing".

> > Are we in agreement that this MSC can start with a short list and that the customization mechanism can wait for a followup MSC?
>
> I don't think I agree. It sounds like this is a breaking change. I would rather get the customization mechanism specified from the outset so that clients don't need to be changed when it gets added. At the least we would need to explicitly reserve some format for the custom messages.

I don't think this is a breaking change if we specify that clients that do not support customization may fall back to the standard list.

**Reviewer:**

> I don't think we can escape translation for the end-user.

I'm not convinced. If we step away from creating a single moderation policy for the whole world it becomes very reasonable that the acceptable content policy is written in just the languages that the moderators operate. I'm not sure how effective a mod can be if they don't speak the same language as the user anyways.

> I disagree on this point. Making natures non-standard by default feels like a footgun to me.

What is the danger you see in this?

> I don't think this is a breaking change if we specify that clients that do not support customization may fall back to the standard list.

I meant that clients will still need to know how to display these custom messages. Otherwise these users will no longer be able to report abuse.

**Author:**

> What is the danger you see in this?

Any case in which we'd need two tools to communicate with each other. I don't have specific examples yet, but I'd be really surprised if it didn't show up quickly.

> I meant that clients will still need to know how to display these custom messages. Otherwise these users will no longer be able to report abuse.

Agreed. I still believe that this can be done in a followup MSC, though.

**@erkinalp (May 30, 2021):**

Suggested change
- `m.abuse.nature.other`: doesn't fit in any category above.
- `m.abuse.nature.other`: abuse that doesn't fit in any category above.
- `m.filter.copying`: automated screening to account for laws like the Directive on Copyright
in the Digital Single Market.

Signed-off-by: Erkin Alp Güney <erkinalp9035@gmail.com>


We expect that this enum will be amended by further MSCs.

@@ -161,25 +173,19 @@ This proposal does not specify behavior when `m.room.moderated_by` is not set or

Users should not need to join the moderation room to be able to send `m.abuse.report` messages to it, as it would
let them snoop on reports from other users. Rather, we introduce a built-in bot as part of this specification: the
Routing Bot. This Routing Bot is part of the server and has access to privileged information such as room membership.
Routing Bot.

1. When the Routing Bot is invited to a room, it always accepts invites.
**Contributor:**

(I'm writing this regardless of the status of the MSC in case it gets picked up again later by someone else, even if that's in another form.)

It would be really useful for the client to give the room a distinct type. Currently in Mjolnir (which has a partial implementation of the routing bot) this behaviour is problematic, as it clashes with the `acceptInvitesFromSpace` behaviour and also `protectAllJoinedRooms`. matrix-org/mjolnir#475

**Reviewer:**

I wouldn't be happy with a solution that requires a bot on my homeserver joining all rooms it's invited into. This seems too abusable. I want my server only participating in rooms that my users explicitly joined.

2. When the Routing Bot receives a message other than `m.abuse.report`, it ignores the message.
**Member:**

To be clear, this is an event with type `m.abuse.report`, rather than an `m.room.message` event with `"msgtype": "m.abuse.report"`, correct?

**Author:**

Right, it's a message event with type `m.abuse.report`.

3. When the Routing Bot receives a message _M_ with type `m.abuse.report` from Alice:
- If the Routing Bot is not a member of _M_.`moderated_by_id`, reject the message.
- If Alice is not a member of _M_.`room_id`, reject the message.
- If _M_.`reporter` is not Alice, reject the message.
- If room _M_.`moderated_by_id` does not contain a state event `m.room.moderation.moderator_of.XXX`, where `XXX`
  is _M_.`room_id`
  - Reject the message.
- Otherwise
  - Call _S_ the above state event
  - If _S_ does not have type `m.room.moderation.moderator_of`, reject the message.
  - If _S_ is missing field `user_id`, reject the message.
  - If _S_.`user_id` is not the id of the Routing Bot, reject the message.
  - If event _M_.`event_id` did not take place in room _M_.`room_id`, reject the message.
  - If Alice could not witness event _M_.`event_id`, reject the message.
  - Copy the message to room _M_.

is _M_.`room_id`, reject the message. Otherwise, call _S_ this state event.
- If _S_ does not have type `m.room.moderation.moderator_of`, reject the message.
- If _S_ is missing field `user_id`, reject the message.
- If _S_.`user_id` is not the id of the Routing Bot, reject the message.
- Copy the `content` of _M_ as a new `m.abuse.report` message in room _M_.`moderated_by_id`.
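
A minimal sketch of these checks, assuming an abstract client interface (the method names below are illustrative, not a real SDK API):

```typescript
// Sketch of the Routing Bot handling the content M of an incoming m.abuse.report.
interface BotClient {
  userId: string;
  isJoined(roomId: string): Promise<boolean>;
  // Looks up a state event by (type, state key); resolves to its content or null.
  getStateEvent(roomId: string, type: string, stateKey: string): Promise<any | null>;
  sendEvent(roomId: string, type: string, content: object): Promise<void>;
}

async function onAbuseReport(client: BotClient, sender: string, M: any): Promise<void> {
  if (!(await client.isJoined(M.moderated_by_id))) return; // bot not in the moderation room
  if (M.reporter !== sender) return;                       // reporter must be the sender
  // The proposal keys the m.room.moderation.moderator_of state event by the reported room id.
  const S = await client.getStateEvent(
    M.moderated_by_id, "m.room.moderation.moderator_of", M.room_id);
  if (S === null || !S.user_id) return;    // missing state event or user_id field
  if (S.user_id !== client.userId) return; // this bot is not the designated forwarder
  // Copy the content of M into the moderation room as a new m.abuse.report.
  await client.sendEvent(M.moderated_by_id, "m.abuse.report", M);
}
```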

### Possible Moderation Bot behavior

@@ -195,7 +201,7 @@ A possible setup would involve two Moderation Bots, both members of a moderation

## Security considerations

### Routing
### Routing, 1

This proposal introduces a (very limited) mechanism that lets users send (some) events to a room without being part of that
room. There is the possibility that this mechanism could be abused.
@@ -206,34 +212,88 @@ of the moderation room.
However, it is possible that it can become a vector for attacks if combined with a bot that processes said structured data messages,
e.g. a Classifier Bot and/or a Ban Bot.


### Routing, 2

The Routing Bot does not have access to privileged information. In particular, it CANNOT check whether:
- Alice is a member of _M_.`room_id`.
- Event _M_.`event_id` took place in room _M_.`room_id`.
- Alice could witness event _M_.`event_id`.

This means that **it is possible to send bogus abuse reports**, as is already the case with the current Abuse Report API.

This is probably something that SHOULD BE FIXED before merging this spec.

### Revealing oneself

If an end-user doesn't understand the difference between `abuse.room` and other kinds of abuse report, there is the possibility
If an end-user doesn't understand the difference between `m.abuse.nature.room` and other kinds of abuse report, there is the possibility
that this end-user may end up revealing themselves by sending a report against a moderator or against the room to the very
moderators of that room.

### Snooping administrators
The author believes that this is a problem that can and should be solved by UX.

### Snooping administrators (user homeserver)

Consider the following case:

- homeserver compromised.org is administered by an evil administrator Marvin;
- user @alice:compromised.org is a moderator of room _R_ with moderation room _MR_;
- user @alice:compromised.org is a member of Community Room _CR_;
- user @alice:compromised.org posts an abuse report against @bob:somewhere.org as a DM to the Routing Bot;
- Marvin can witness that @alice:compromised.org has sent a message to the Routing Bot
but cannot witness the contents of the message (assuming encryption);
**Member:**

Hmm, building encryption support into a bot that's part of the homeserver may be tricky...

**Author (@Yoric, May 25, 2021):**

Actually, it doesn't really have to be part of the homeserver. Truly, I'd probably prefer it if it wasn't, because it would make all the retry logic easier to write without complicating the homeserver.

- as @alice:compromised.org is a member of _CR_, Marvin can witness when @bob:somewhere.org is kicked/banned,
even if _CR_ is encrypted;
- Marvin can deduce that @alice:compromised.org has denounced @bob:somewhere.org.

This is BAD. However, this is better than the current situation, in which Marvin can directly
read the report posted by @alice:compromised.org using the reporting API.

### Snooping administrators (moderator homeserver)

Consider the following case:

- homeserver compromised.org is administered by an evil administrator Marvin;
- user @alice:compromised.org is a moderator of room _CR_ with moderation room _MR_;
- user @bob:innocent.org is a member of room _R_;
- @bob:innocent.org posts an abuse report _AR_ to _MR_;
- Marvin may gain access to the metadata on report _AR_, including:
  - the fact that @bob:innocent.org has reported something to room _MR_;
- Marvin also has access to the metadata on _R_, including:
  - the fact that _MR_ is the moderation room for _R_;
  - the fact that @charlie:toxic.org was just banned from _R_;
- Marvin may deduce that @bob:innocent.org has reported @charlie:toxic.org.
- @bob:innocent.org posts an abuse report as a DM to the Routing Bot;
- Marvin does not witness this;
- Marvin sees that the Routing Bot posts a message to _MR_ but the metadata does not
contain any data on @bob:innocent.org;
- if the room is encrypted, Marvin cannot determine that @bob:innocent.org has posted
an abuse report.

This is GOOD.

It is not clear how bad this is.
### Interfering administrator (moderator homeserver), 1

Consider the following case:

- homeserver compromised.org is administered by an evil administrator Marvin;
- user @alice:compromised.org joins a moderation room _MR_;
- Marvin can impersonate @alice:compromised.org and set `m.room.moderation.moderator_of`
to point to a malicious bot EvilBot;
- when @alice:compromised.org becomes moderator for room _CR_ and sets _MR_ as moderation
room, EvilBot becomes the Routing Bot;
- every abuse report in room _CR_ is deanonymized by EvilBot.

This is BAD. This may suggest that the Routing Bot mechanism may be a bad idea.

### Interfering administrator (moderator homeserver), 2

Consider the following case:

- homeserver compromised.org is administered by an evil administrator Marvin;
- user @alice:compromised.org is a moderator of room _CR_ with moderation room _MR_;
- Marvin can impersonate @alice:compromised.org and set `m.room.moderation.moderated_by`
to point to a moderation room under its control;
- every abuse report in room _CR_ is deanonymized by Marvin.

It is also not clear how to decrease the risk.
This is BAD. This actually suggests that the problem goes beyond the Routing Bot.

### Snooping bots

As bots are invited to moderation rooms, a compromised bot has access to all moderation data for that room.
As bots are invited to moderation rooms, a compromised bot (whether it's Routing Bot,
Classifier Bot or Ban Bot) has access to all moderation data for that room.

## Alternatives

@@ -257,8 +317,8 @@ a higher risk and result in code that is harder to test and trust.

During experimentation

- `m.room.moderated_by` will be prefixed `org.matrix.msc3215.room.moderated_by`;
- `m.room.moderator_of` will be prefixed `org.matrix.msc3215.room.moderator_of`;
- `m.room.moderation.moderated_by` will be prefixed `org.matrix.msc3215.room.moderation.moderated_by`;
- `m.room.moderation.moderator_of` will be prefixed `org.matrix.msc3215.room.moderation.moderator_of`;
- `m.abuse.report` will be prefixed `org.matrix.msc3215.abuse.report`;
- `abuse.*` will be prefixed `org.matrix.msc3215.abuse.nature.*`.
- `m.abuse.nature.*` will be prefixed `org.matrix.msc3215.abuse.nature.*`.
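
During that period, an implementation might keep the mapping in one place, along the lines of this sketch:

```typescript
// Sketch: stable-to-unstable identifier mapping for the experimentation period.
const UNSTABLE_NAMES: Record<string, string> = {
  "m.room.moderation.moderated_by": "org.matrix.msc3215.room.moderation.moderated_by",
  "m.room.moderation.moderator_of": "org.matrix.msc3215.room.moderation.moderator_of",
  "m.abuse.report": "org.matrix.msc3215.abuse.report",
};

// Natures are rewritten by prefix: m.abuse.nature.* -> org.matrix.msc3215.abuse.nature.*
function unstableName(stable: string): string {
  if (stable in UNSTABLE_NAMES) return UNSTABLE_NAMES[stable];
  return stable.replace(/^m\.abuse\.nature\./, "org.matrix.msc3215.abuse.nature.");
}
```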