handle encryption per user instead of per device #886

huguesdk · 2021-08-19T21:22:45Z

disclaimer: i am not an encryption expert. there are possibly big mistakes in this proposal, due to my lack of in-depth understanding of encryption mechanisms. maybe this proposal doesn’t make sense for security reasons.

as far as i understand how e2ee in matrix works, encryption keys depend on the devices (sessions) of each user in a room. this means that each time a user adds (and removes?) a device, encryption keys must change. i think that this causes many problems. i have been using matrix heavily for more than 3 years now, and here are the problems that i encountered (even recently), which are mainly caused by the fact that e2ee is handled per device.

current issues

encryption is brittle and breaks too easily

this is the main issue.

countless times, people i use matrix with have told me that they cannot decrypt messages. for some, it happens on some messages from time to time. for others (recently, for all the people i know who were on ios), all new messages were suddenly non-decryptable.

there were people who just gave up on matrix completely because it just didn’t work for them (despite trying several times). the messages could never be decrypted.

for my part, i had some problems too, but thanks to the fact that i use many devices, i was always able to eventually decrypt the messages. however, one time i had to ask a friend to send me his keys. what happened was that my homeserver was offline for a short period of time (less than an hour), but during that time, a friend created a matrix account on another server and joined a room i was in. because his server could not reach mine, it had no access to my device list, so none of his messages could be decrypted by me (but could by others).

device verification is complex

when a user logs in with a new device, they will have to verify it, or it will be marked as not trusted, and this will be visible to other users. the device verification process works well most of the time (by scanning a qr code or comparing emojis), but is nevertheless a complex technical process, and it sometimes fails. several people i know had to reset their cross-signing state and start over. also, this process must be implemented by every client that wants to support end-to-end encryption.

the device list can be empty

if a user logs out of all of their devices, their device list will be empty. this means that all messages that are sent to them while their device list is empty will never be decryptable by them. this is, i think, a flaw in the protocol design.

encryption keys take too much space

after more than 3 years of matrix usage, i now have 7283 encryption keys, taking up 4.2 MiB in json format. this is simply too much for about 30 rooms and less than 50 people. if the encryption system does not change, this will continue to grow over the years, and i will need to keep them to be able to read older messages. as more and more people join matrix, this will grow faster and faster. is this really what we want?

device lists are a privacy issue

when you know the mxid of a user, you can access their device list (without asking for permission), which contains a human-readable description of each device (“app.element.io (firefox, ubuntu)”). this can possibly be a privacy issue, as it gives information about what kind of devices the user has and which clients they are using.

improvements?

over the years, e2ee handling has already improved. 2 years ago, people had to manually verify all devices of all users on all their devices. thanks to the hard work of the element team, this is now a thing of the past (which surely nobody misses ☺). thanks to cross-signing, verifying a user now means verifying only one thing, regardless of the number of devices. this is already much better, but to me it still feels like a big workaround, which makes the whole system even more complicated.

what if we handled end-to-end encryption per user instead of per device? i’ve always felt that having devices show up in the protocol was too low-level and strange. what if there was no such thing as a device in the matrix protocol? this would be much simpler.

there could be only one key per user that would be used to create the megolm session. this would mean that a session would change only when a user joins or leaves a room, which happens much less often than a change in devices. having the session change less often would decrease the chance that messages could not be decrypted.
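to make the difference concrete, here is a rough python sketch that counts how many encrypted copies of a new session key a sender would have to hand out under each model. the room contents and function names are made up for illustration, and real key sharing (over olm) involves much more than counting.

```python
# Hypothetical room: user ID -> device IDs (all values are made up).
room_devices = {
    "@alice:example.org": ["PHONE", "LAPTOP", "WEB"],
    "@bob:example.org": ["PHONE"],
    "@carol:example.org": ["TABLET", "DESKTOP"],
}


def shares_per_device(devices_by_user):
    """Per-device model: every device except the sending one gets its own copy."""
    return sum(len(devices) for devices in devices_by_user.values()) - 1


def shares_per_user(devices_by_user):
    """Per-user model (the proposal): every user except the sender gets one copy."""
    return len(devices_by_user) - 1


print(shares_per_device(room_devices))  # 5
print(shares_per_user(room_devices))    # 2
```

in this toy room, any added or removed device changes the first number, while the second only changes when a user joins or leaves.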

this user key could be cached by the servers the user communicates with. this would avoid problems in case their homeserver is unreachable for some time.

what about security?

surely, the current system is more secure, as devices could be individually deleted or marked as not trusted. but do we really need this? isn’t all this working against us, causing more problems than it solves? with per-user e2ee, if a user thinks that one of their devices could be compromised, they could change their password and key, which would cause all of their devices to be logged out, then they could simply log in again on the devices they still control. this is similar to how most online services work.

how to transition to this?

this change is of course a breaking change, but if i understand correctly, it could be handled progressively by using a new room version.

uhoreg · 2021-08-20T00:47:43Z

Some of the issues that you raise are being addressed in different ways.

encryption is brittle and breaks too easily

This is mostly due to implementation issues rather than a protocol issue. Of course, it could be argued that a simpler protocol would be less prone to implementation bugs. But the current system is much more secure than using a single per-user key, so it is better to fix the implementation issues.

However, we are also looking into switching to Messaging Layer Security (MLS), an upcoming IETF standard for encryption for instant messaging, which has improvements over our current system. We'll have to see how well it performs.

device verification is complex

Element is looking into improving the experience when something goes wrong with verification, so that users can recover from failures more easily.

the device list can be empty

This will be fixed by dehydrated devices

encryption keys take too much space

Whenever you encrypt a message for multiple users, you generate a symmetric key for that message, encrypt the message using that key, and send the key to each user. Otherwise, you need to encrypt the message separately for each user, which would take up more space. So every encryption system will still need to have encryption keys somewhere.
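As a back-of-the-envelope sketch of that trade-off in Python: the sizes below are illustrative assumptions (a 32-byte symmetric key, no padding or ciphertext overhead), not measurements of Matrix, and the function names are made up.

```python
KEY_SIZE = 32  # bytes; assumed size of one wrapped symmetric key


def bytes_with_shared_key(message_size, recipients):
    """One ciphertext, plus one encrypted copy of the key per recipient."""
    return message_size + recipients * KEY_SIZE


def bytes_encrypting_separately(message_size, recipients):
    """The whole message is encrypted once per recipient."""
    return message_size * recipients


# A 1000-byte message sent to 50 recipients:
print(bytes_with_shared_key(1000, 50))        # 2600
print(bytes_encrypting_separately(1000, 50))  # 50000
```

Reusing the same symmetric key for many messages shrinks the first figure further, since the key only has to be sent once.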

If clients are using the key backup feature, they don't have to store all the keys; they could forget keys for old messages and just retrieve them when needed. I don't think any clients currently do that, but it would be quite doable.

device lists are a privacy issue

This could be fixed by making device names private. Device names were necessary before cross-signing, to help identify the device that you were verifying, and they can still be helpful sometimes for debugging, but we will likely drop them at some point.

kevincox · 2021-08-28T14:03:17Z

I also agree with this. The fact that I have added a new device or how many devices I have shouldn't need to be shared. If I add a new device and cross-sign it that should be my business. If I add a new device and don't cross-sign then it can't receive new messages. Remote users should only need to know if I rotate my "master" key.

Whenever you encrypt a message for multiple users, you generate a symmetric key for that message, encrypt the message using that key, and send the key to each user...

This can be mitigated by generating a shared room key that is used for many messages. However, this key needs to be rotated at least every time someone leaves the group. It also needs to be rotated on joins if history visibility is set to "since joining". It is still a good idea to rotate the shared key often, and the limit of that is rotating every message or every couple of messages, which becomes equivalent to the Matrix case. IIUC this is how MLS works.
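A minimal Python sketch of that rotation rule, assuming a simplified membership change and using the `joined` history visibility value to stand in for "since joining". The function and its arguments are hypothetical and not taken from any Matrix SDK.

```python
def must_rotate_room_key(membership_change, history_visibility):
    """Return True if the shared room key has to be replaced.

    membership_change: "join" or "leave" (simplified for this sketch)
    history_visibility: e.g. "shared" or "joined"
    """
    if membership_change == "leave":
        # Someone who left must not be able to read future messages.
        return True
    if membership_change == "join":
        # A newcomer must not read the past if history starts at their join.
        return history_visibility == "joined"
    return False


print(must_rotate_room_key("leave", "shared"))  # True
print(must_rotate_room_key("join", "joined"))   # True
print(must_rotate_room_key("join", "shared"))   # False
```

Periodic rotation on top of this is a separate policy decision.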

huguesdk · 2021-09-03T21:18:39Z

thanks @uhoreg for your detailed answer!

encryption is brittle and breaks too easily

This is mostly due to implementation issues rather than a protocol issue. Of course, it could be argued that a simpler protocol would be less prone to implementation bugs. But the current system is much more secure than using a single per-user key, so it is better to fix the implementation issues.

a more complex protocol will always lead to more implementation issues. each new server and client that will be written will probably have very similar issues. i understand that more secure = better, but more complex ≠ better. what if we made the protocol even more secure by invalidating all keys every 24 hours? ☺ surely clients would be able to eventually handle this nicely. do we really want client implementations to be responsible for “fixing” the (over)complexity of the protocol?

However, we are also looking into switching to Messaging Layer Security (MLS), an upcoming IETF standard for encryption for instant messaging, which has improvements over our current system. We'll have to see how well it performs.

to me, at first look, it seems that this standard is targeted at message passing protocols, like xmpp and its derivatives. indeed, in these protocols, different devices communicate together, and only one device (the “active” one) will receive the message (or at least it was like this the last time i used xmpp with multiple devices).

but matrix is very different. it is based on conversation history synchronization. and this happens at the server level, not at the device level. so why should devices exist at all in the protocol? devices only query the homeserver to read messages, similar to how imap mail clients fetch mail.

of course, individual devices might exist because of end-to-end encryption, because we want to truly identify each “end”. to me, each “end” should be a person, not a device. there is little point in being able to decrypt a message on one device but not on another (as unfortunately still often happens because of implementation issues). if each user has the knowledge (keys) to decrypt their messages stored on the server, it is “end to end” encryption.

device verification is complex

Element is looking into improving the experience when something goes wrong with verification, so that users can recover from failures more easily.

simpler = better 😉

the device list can be empty

This will be fixed by dehydrated devices

i’ve read about it before writing this. indeed, it addresses the problem, and i appreciate the effort that is made. but wait… think about it for a moment. do we really want this? do you think it is an elegant solution? what about a phantom email client needed by an smtp server in order to work? (i know the comparison doesn’t work because there is no e2ee, but it’s just to illustrate the idea.) what would you think of such a solution? sorry, but it feels like a workaround due to a design issue.

encryption keys take too much space

Whenever you encrypt a message for multiple users, you generate a symmetric key for that message, encrypt the message using that key, and send the key to each user. Otherwise, you need to encrypt the message separately for each user, which would take up more space. So every encryption system will still need to have encryption keys somewhere.

does this mean that new keys must be stored for each new message? this has never been clear to me. i thought (and hoped) that keys were only changed when the users+devices configuration of a room changed, and otherwise new keys were computed using the ratchet, and so didn’t need to be stored. sorry if this sounds stupid, i’m not an encryption expert, and did not yet find an easy-to-understand explanation of the encryption mechanisms.
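to illustrate the ratchet idea behind this question, here is a minimal python sketch of a generic hash ratchet. it is not megolm’s actual construction; the function names and the "advance"/"message" labels are made up. the point is only that storing one initial chain key is enough to recompute the key for any later message index.

```python
import hashlib
import hmac


def advance(chain_key: bytes) -> bytes:
    """Step the ratchet forward; the previous chain key cannot be recovered."""
    return hmac.new(chain_key, b"advance", hashlib.sha256).digest()


def message_key(chain_key: bytes) -> bytes:
    """Derive the key used for one message from the current chain key."""
    return hmac.new(chain_key, b"message", hashlib.sha256).digest()


def key_for_index(initial_chain_key: bytes, index: int) -> bytes:
    """Recompute the key for message number `index` from the stored start key."""
    chain_key = initial_chain_key
    for _ in range(index):
        chain_key = advance(chain_key)
    return message_key(chain_key)


start = hashlib.sha256(b"example session secret").digest()
print(key_for_index(start, 0) != key_for_index(start, 1))  # True
```

a new stored key is then only needed when a whole new session is started, not for every message within a session.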

If clients are using the key backup feature, they don't have to store all the keys; they could forget keys for old messages and just retrieve them when needed. I don't think any clients currently do that, but it would be quite doable.

key backup is a good example of a security/user-friendliness tradeoff. but isn’t it funny to have a complex per-device e2ee and then store the (encrypted) keys on the server? ☺ i understand that even with per-user keys we would need some sort of key backup, so it would not solve the issue, but maybe it would look a little less paradoxical.

device lists are a privacy issue

This could be fixed by making device names private. Device names were necessary before cross-signing, to help identify the device that you were verifying, and they can still be helpful sometimes for debugging, but we will likely drop them at some point.

good to hear that this is being addressed.

another point that i forgot to write about related to device identifiers, is that each message contains the identifier of the device that was used to send it. to me it is a privacy issue, as people can know from which device (even with private names) the message was sent, giving extra (unneeded) information.

sorry if i sound quite harsh. i do not mean to offend anyone. i know how hard people are working to advance matrix. many thanks to them. i hope my criticism can help us move forward toward a better solution.

bwindels · 2021-09-15T21:18:39Z

does this mean that new keys must be stored for each new message? this has never been clear to me. i thought (and hoped) that keys were only changed when the users+devices configuration of a room changed, and otherwise new keys were computed using the ratchet, and so didn’t need to be stored. sorry if this sounds stupid, i’m not an encryption expert, and did not yet find an easy-to-understand explanation of the encryption mechanisms.

Megolm keys are usually rotated every 100 messages or every week, whichever comes first. Some clients (like Element Web) also create a new key every time you close and reopen the client and then send a message in a room. So not exactly for every message, but they do get rotated.
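A minimal Python sketch of that rotation check, using the two thresholds mentioned above as constants. The function name and arguments are hypothetical; as far as I know, rooms can override these values through the `rotation_period_msgs` and `rotation_period_ms` fields of the `m.room.encryption` event.

```python
ROTATION_PERIOD_MSGS = 100                     # rotate after this many messages
ROTATION_PERIOD_MS = 7 * 24 * 60 * 60 * 1000   # or after one week, in milliseconds


def session_needs_rotation(messages_sent, created_at_ms, now_ms):
    """Return True once either limit is reached, whichever comes first."""
    too_many_messages = messages_sent >= ROTATION_PERIOD_MSGS
    too_old = now_ms - created_at_ms >= ROTATION_PERIOD_MS
    return too_many_messages or too_old


one_day_ms = 24 * 60 * 60 * 1000
print(session_needs_rotation(100, 0, one_day_ms))     # True: message limit reached
print(session_needs_rotation(3, 0, 8 * one_day_ms))   # True: older than a week
print(session_needs_rotation(3, 0, one_day_ms))       # False
```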

https://matrix.org/docs/guides/end-to-end-encryption-implementation-guide is probably the best explanation of the basics of megolm that we have so far, although it is missing recent additions like dehydration and fallback keys.

bwindels · 2021-09-15T21:30:33Z

another point that i forgot to write about related to device identifiers, is that each message contains the identifier of the device that was used to send it. to me it is a privacy issue, as people can know from which device (even with private names) the message was sent, giving extra (unneeded) information.

Matrix is one of the few (I can't think of any other?) protocols that supports end-to-end encryption AND multiple sessions per user. Megolm keys are shared over olm, which is a secure communication channel between two devices. Megolm keys are referenced by encrypted messages, so even if we didn't include the device id in the message itself, clients can still track which olm session they received a megolm key over, and thus which device a message came from. I might be wrong, but I'm not sure there is a way around this while supporting e2ee with multiple devices per user.
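A rough Python sketch of that tracking, with made-up names and a plain dict standing in for a client's key store. It only shows the bookkeeping: once a megolm key arrives over an olm channel from a known device, any message referencing that megolm session can be attributed to that device.

```python
# megolm session ID -> (user ID, device ID) the key was received from over olm
session_origin = {}


def on_room_key_received(session_id, sender_user, sender_device):
    """Remember which device shared this megolm session with us."""
    session_origin[session_id] = (sender_user, sender_device)


def sending_device(encrypted_event):
    """Look up the origin of a message using only its megolm session reference."""
    return session_origin.get(encrypted_event["session_id"])


on_room_key_received("session-abc", "@alice:example.org", "LAPTOP")

# The event carries no device ID, only the session it was encrypted with.
event = {"session_id": "session-abc", "ciphertext": "..."}
print(sending_device(event))  # ('@alice:example.org', 'LAPTOP')
```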

Other services proxy messages through the main device (your phone) and can get away with only supporting one device like that. In practice, this does sort of obfuscate which device sent the message, but I think the downsides are bigger.

kevincox · 2021-09-22T22:19:30Z

I'm not sure there is a way around this while supporting e2ee with multiple devices per user.

Of course there is. But it becomes harder to do forward secrecy and other features that we love.

One option would be to just generate a symmetric "room key" which is shared to all participants. It is then rotated when membership changes (or periodically). In this case it is harder to do forward secrecy and you lose authenticity.

You could also generate an account key which is shared across all devices. Of course now you need to worry about "kicking out" a lost/stolen device. This can be done by rotating the account key.

I think if we did want to remove devices from the protocol we would probably need something like an account key. This would need to be regularly rotated and synced between all devices. Likely you would still have device keys, but they would only be used "within" an account (likely just for distributing the freshly rotated account key). With this, all of the devices would look the same to remote parties, as they have effectively the same state. Of course there are lots of side channels to look out for, so I wouldn't want to say that you can perfectly hide your devices. But at least it isn't as in-your-face.

Other services proxy messages through the main device

Yes, let's definitely not do that.

uhoreg · 2021-09-23T14:02:58Z

One option would be to just generate a symmetric "room key" which is shared to all participants.

MLS essentially does this. But this is orthogonal to the question of whether encryption is done per-user or per-device, as even if you have a room key, you are still left with the issue of how to distribute the (information needed to rotate the) new room key. "Just generat[ing] a symmetric 'room key' which is shared to all participants" is not trivial if you want to do it securely and efficiently.

Anyways, the current status is that:

we are currently looking at MLS to replace the existing system
we do not have time to look at other things at the moment
if someone else wants to look at something else and propose something concrete, then we could consider it

richvdh · 2024-09-16T21:59:53Z

I don't really understand what this is proposing. How would we actually, securely, get the keys to the recipient devices?

I'm going to go ahead and close this, as it doesn't seem particularly practical.

huguesdk added the "enhancement" label (a suggestion for a relatively simple improvement to the protocol) Aug 19, 2021

uhoreg added the "A-E2EE" label (issues about end-to-end encryption) Aug 19, 2021

huguesdk mentioned this issue Jan 4, 2022

messages sent while offline cannot be decrypted element-hq/element-android#4853

Open

richvdh transferred this issue from matrix-org/matrix-spec-proposals Mar 2, 2022

richvdh closed this as completed Sep 16, 2024
