Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Detect unknown remote devices and mark cache as stale #6776

Merged
merged 5 commits into from
Jan 28, 2020

Conversation

erikjohnston
Copy link
Member

When we receive to device messages or encrypted events from an unknown device we should log and mark the cache as stale. Currently we don't actually do anything about it, but that will come in a future PR.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks somewhat racy. How do we know that the device update isn't on its way, or something? Or maybe that doesn't matter?

local_messages[user_id] = messages_by_device

if message_type == "m.room_key_request":
# Get the sending device IDs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

factor out the body of this if statement to a separate function? it's getting a bit hard to follow.

@erikjohnston
Copy link
Member Author

This is very much a "err on the side of caution" style thing, and assumes resyncing is always a safe operation to do.

I don't think we're likely to see messages from a new device before we get the update unless something has gone quite wibbly, so hopefully this is rare enough that resyncing in that case is the right thing to do.

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, though I bet you a doughnut that we'll find a whole bunch of conditions that cause this.

@erikjohnston erikjohnston merged commit e17a110 into develop Jan 28, 2020
erikjohnston added a commit that referenced this pull request Feb 4, 2020
We were looking at the wrong event type (`m.room.encryption` vs
`m.room.encrypted`).

Also fixup the duplicate `EvenTypes` entries.

Introduced in #6776.
erikjohnston added a commit that referenced this pull request Feb 4, 2020
We were looking at the wrong event type (`m.room.encryption` vs
`m.room.encrypted`).

Also fixup the duplicate `EvenTypes` entries.

Introduced in #6776.
@erikjohnston erikjohnston deleted the erikj/detect_stale_device_lists branch February 5, 2020 17:35
babolivier added a commit that referenced this pull request Feb 12, 2020
Synapse 1.10.0 (2020-02-12)
===========================

**WARNING to client developers**: As of this release Synapse validates `client_secret` parameters in the Client-Server API as per the spec. See [\#6766](#6766) for details.

Updates to the Docker image
---------------------------

- Update the docker images to Alpine Linux 3.11. ([\#6897](#6897))

Synapse 1.10.0rc5 (2020-02-11)
==============================

Bugfixes
--------

- Fix the filtering introduced in 1.10.0rc3 to also apply to the state blocks returned by `/sync`. ([\#6884](#6884))

Synapse 1.10.0rc4 (2020-02-11)
==============================

This release candidate was built incorrectly and is superceded by 1.10.0rc5.

Synapse 1.10.0rc3 (2020-02-10)
==============================

Features
--------

- Filter out `m.room.aliases` from the CS API to mitigate abuse while a better solution is specced. ([\#6878](#6878))

Internal Changes
----------------

- Fix continuous integration failures with old versions of `pip`, which were introduced by a release of the `zipp` library. ([\#6880](#6880))

Synapse 1.10.0rc2 (2020-02-06)
==============================

Bugfixes
--------

- Fix an issue with cross-signing where device signatures were not sent to remote servers. ([\#6844](#6844))
- Fix to the unknown remote device detection which was introduced in 1.10.rc1. ([\#6848](#6848))

Internal Changes
----------------

- Detect unexpected sender keys on remote encrypted events and resync device lists. ([\#6850](#6850))

Synapse 1.10.0rc1 (2020-01-31)
==============================

Features
--------

- Add experimental support for updated authorization rules for aliases events, from [MSC2260](matrix-org/matrix-spec-proposals#2260). ([\#6787](#6787), [\#6790](#6790), [\#6794](#6794))

Bugfixes
--------

- Warn if postgres database has a non-C locale, as that can cause issues when upgrading locales (e.g. due to upgrading OS). ([\#6734](#6734))
- Minor fixes to `PUT /_synapse/admin/v2/users` admin api. ([\#6761](#6761))
- Validate `client_secret` parameter using the regex provided by the Client-Server API, temporarily allowing `:` characters for older clients. The `:` character will be removed in a future release. ([\#6767](#6767))
- Fix persisting redaction events that have been redacted (or otherwise don't have a redacts key). ([\#6771](#6771))
- Fix outbound federation request metrics. ([\#6795](#6795))
- Fix bug where querying a remote user's device keys that weren't cached resulted in only returning a single device. ([\#6796](#6796))
- Fix race in federation sender worker that delayed sending of device updates. ([\#6799](#6799), [\#6800](#6800))
- Fix bug where Synapse didn't invalidate cache of remote users' devices when Synapse left a room. ([\#6801](#6801))
- Fix waking up other workers when remote server is detected to have come back online. ([\#6811](#6811))

Improved Documentation
----------------------

- Clarify documentation related to `user_dir` and `federation_reader` workers. ([\#6775](#6775))

Internal Changes
----------------

- Record room versions in the `rooms` table. ([\#6729](#6729), [\#6788](#6788), [\#6810](#6810))
- Propagate cache invalidates from workers to other workers. ([\#6748](#6748))
- Remove some unnecessary admin handler abstraction methods. ([\#6751](#6751))
- Add some debugging for media storage providers. ([\#6757](#6757))
- Detect unknown remote devices and mark cache as stale. ([\#6776](#6776), [\#6819](#6819))
- Attempt to resync remote users' devices when detected as stale. ([\#6786](#6786))
- Delete current state from the database when server leaves a room. ([\#6792](#6792))
- When a client asks for a remote user's device keys check if the local cache for that user has been marked as potentially stale. ([\#6797](#6797))
- Add background update to clean out left rooms from current state. ([\#6802](#6802), [\#6816](#6816))
- Refactoring work in preparation for changing the event redaction algorithm. ([\#6803](#6803), [\#6805](#6805), [\#6806](#6806), [\#6807](#6807), [\#6820](#6820))
babolivier added a commit that referenced this pull request May 21, 2020
When a call to `user_device_resync` fails, we don't currently mark the remote user's device list as out of sync, nor do we retry to sync it.

#6776 introduced some code infrastructure to mark device lists as stale/out of sync.

This commit uses that code infrastructure to mark device lists as out of sync if processing an incoming device list update makes the device handler realise that the device list is out of sync, but we can't resync right now.

It also adds a looping call to retry all failed resync every 30s. This shouldn't cause too much spam in the logs as this commit also removes the "Failed to handle device list update for..." warning logs when catching `NotRetryingDestination`.

Fixes #7418
phil-flex pushed a commit to phil-flex/synapse that referenced this pull request Jun 16, 2020
When a call to `user_device_resync` fails, we don't currently mark the remote user's device list as out of sync, nor do we retry to sync it.

matrix-org#6776 introduced some code infrastructure to mark device lists as stale/out of sync.

This commit uses that code infrastructure to mark device lists as out of sync if processing an incoming device list update makes the device handler realise that the device list is out of sync, but we can't resync right now.

It also adds a looping call to retry all failed resync every 30s. This shouldn't cause too much spam in the logs as this commit also removes the "Failed to handle device list update for..." warning logs when catching `NotRetryingDestination`.

Fixes matrix-org#7418
babolivier pushed a commit that referenced this pull request Sep 1, 2021
* commit 'e17a11066':
  Detect unknown remote devices and mark cache as stale (#6776)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants