Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

cross-signing signature was not sent out over federation #14251

Closed
richvdh opened this issue Oct 20, 2022 · 8 comments
Closed

cross-signing signature was not sent out over federation #14251

richvdh opened this issue Oct 20, 2022 · 8 comments
Assignees
Labels
A-Device-List-Tracking Telling clients about other devices. Often related to E2EE. O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@richvdh
Copy link
Member

richvdh commented Oct 20, 2022

@clokep cross-signed his new device KIRDUXDNFX:

2022-10-19 15:06:52,790 - synapse.access.http.8080 - 460 - INFO - POST-2669640 - 207.172.67.131 - 8080 - {@clokep:matrix.org} Processed request: 0.310sec/0.117sec (0.006sec, 0.005sec) (0.010sec/0.054sec/4) 15B 200 "POST /_matrix/client/unstable/keys/signatures/upload HTTP/1.1" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) ElementNightly/2022101901 Chrome/104.0.5112.124 Electron/20.3.2 Safari/537.36" [0 dbevts]

This does not appear to have been received at various federated servers, including element.io and sw1v.org.

This added a cross-signing signature, visible on matrix.org:

matrix=> select * from e2e_cross_signing_signatures where user_id='@clokep:matrix.org' and target_device_id='KIRDUXDNFX';
      user_id       |                       key_id                        |   target_user_id   | target_device_id |                                       signature                                        
--------------------+-----------------------------------------------------+--------------------+------------------+----------------------------------------------------------------------------------------
 @clokep:matrix.org | ed25519:9vBJWMPmxp3Lhzcu/tFKhglm57yL8yKeYtyrgIuQLzI | @clokep:matrix.org | KIRDUXDNFX       | vkJ6iY31o9QFwIvb2YotDrQuMkIT5j8ViQU6vfmOzZV0uxtleYZEgCInjPNs94Kxlxos8/ouOr6mROrRcmDFBA

This looks to correspond to

2022-10-19 15:06:52,920 - synapse.handlers.device - 730 - INFO - _handle_new_device_update_async-29398 - Sending device list update notif for '@clokep:matrix.org' to: {'riot.ovh', 'half-shot.uk', 'uhoreg.ca', 'lant.uk', 'element.io', 'robin.town', 'connecteu.rs', 'flipdot.org', 'ergaster.org', 't2l.io', 'vector.modular.im'}

... however, I can't see any outgoing device list updates on the federation sender. (There were device list updates at 15:05:24 and 15:05:57, but those precede the cross-signature).

@richvdh
Copy link
Member Author

richvdh commented Oct 20, 2022

Rows are deleted from device_lists_outbound_pokes after they are successfully sent, but we can get an idea of what would have been there for element.io from rows for unreachable servers:

matrix=> select to_timestamp(ts/1000),* from device_lists_outbound_pokes where stream_id >= 5499801286 and user_id='@clokep:matrix.org' and device_id='KIRDUXDNFX' limit 1;
      to_timestamp      |  destination   | stream_id  |      user_id       | device_id  | sent |      ts       | opentracing_context 
------------------------+----------------+------------+--------------------+------------+------+---------------+---------------------
 2022-10-19 16:06:53+01 | redacted.com   | 5499801305 | @clokep:matrix.org | KIRDUXDNFX | f    | 1666192013105 | {}
(1 row)

The federation sender appears to skip that stream id:

synapse@grindylow:~/var$ tail -c +514000000 federation_sender2.log-20221019 | grep 'device update to element.io'
2022-10-19 15:06:36,492 - synapse.8631_debug - 440 - DEBUG - federation_transaction_transmission_loop-349328- - device update to element.io for m.device_list_update from 5499787008 to 5499787245: {'user_id': '@igulek:matrix.org', 'device
2022-10-19 15:06:59,303 - synapse.8631_debug - 440 - DEBUG - federation_transaction_transmission_loop-349807- - device update to element.io for m.signing_key_update from 5499791496 to 5499792454: {'master_key': {'signatures': {'@magmagse
2022-10-19 15:06:59,304 - synapse.8631_debug - 440 - DEBUG - federation_transaction_transmission_loop-349807- - device update to element.io for org.matrix.signing_key_update from 5499791496 to 5499792454: {'master_key': {'signatures': {'
2022-10-19 15:07:18,252 - synapse.8631_debug - 440 - DEBUG - federation_transaction_transmission_loop-350217- - device update to element.io for m.device_list_update from 5499810373 to 5499810374: {'user_id': '@kadev2:matrix.org', 'device

... which I think means either (a) no poke was ever written to device_lists_outbound_pokes for element.io, or (b) _device_list_federation_stream_cache.has_entity_changed('element.io', 5499792454) returned False despite there being pokes in the db.

@richvdh richvdh added A-Device-List-Tracking Telling clients about other devices. Often related to E2EE. S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Oct 20, 2022
@richvdh richvdh self-assigned this Oct 21, 2022
@uhoreg
Copy link
Member

uhoreg commented Oct 21, 2022

This does not appear to have been received at various federated servers, including element.io and sw1v.org.

FWIW, it seems to have been received at uhoreg.ca, so it worked for some destinations and not for others.

richvdh added a commit that referenced this issue Oct 21, 2022
Some frantic clutching at straws to debug
#14251.
@Fizzadar
Copy link
Contributor

Regarding the stream change cache - there is a race with federation there, see #14158 (comment) and logging PR #14239, we're seeing pretty regular logs indicating that stream cache is behind and has potential to return invalid results. Just pushed one possible fix in #14269.

@richvdh
Copy link
Member Author

richvdh commented Oct 24, 2022

Interesting, thanks for the heads-up!

@richvdh
Copy link
Member Author

richvdh commented Jan 9, 2023

Seems like this might have been fixed by #14723?

@clokep
Copy link
Member

clokep commented Jan 9, 2023

Seems like this might have been fixed by #14723?

I hope so? I'm not sure how we would prove that right or wrong though. Should we close this unless we see it happening again?

@richvdh
Copy link
Member Author

richvdh commented Jan 10, 2023

Yes, if it's at all plausible it's been fixed, let's close it.

Thanks @Fizzadar for your work here - it's super valuable!

@richvdh richvdh closed this as completed Jan 10, 2023
@clokep
Copy link
Member

clokep commented Jan 10, 2023

Yes, if it's at all plausible it's been fixed, let's close it.

It is also plausible that the fixes to the StreamChangeCache would have fixed this (#14435, specifically).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Device-List-Tracking Telling clients about other devices. Often related to E2EE. O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
Projects
None yet
Development

No branches or pull requests

4 participants