Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple uploads resulting in CANCELED #25756

Open
Cyberes opened this issue Jul 11, 2023 · 13 comments
Open

Multiple uploads resulting in CANCELED #25756

Cyberes opened this issue Jul 11, 2023 · 13 comments
Labels
A-File-Upload Attachments and file uploads T-Other Questions, user support, anything else X-Needs-Investigation

Comments

@Cyberes
Copy link

Cyberes commented Jul 11, 2023

Steps to reproduce

I am attempting to upload two video files (each about 30 MB) to a room at the same time. This always fails with a generic error message popup (The file '[filename]' failed to upload.). There is no error in the console and the network of the debug window shows the fetch request as (canceled). Synapse logs don't show any errors and neither does Openresty. Uploading the files one at a time works, but uploading them together always fails.

We've investigated the Synapse side of things and everything seems to be working -> matrix-org/synapse#15841

  • My upload limit in Synapse and Openresty (reverse proxy) is 40M. This has been an issue for a while and I have not discovered the root cause yet.
  • My client is desktop and web Element on Ubuntu.
  • I have a media worker with an S3 storage provider setup (see config below).
media_store_path:                                 "/mnt/matrix-storage/media"

media_storage_providers:
- module:                                         s3_storage_provider.S3StorageProviderBackend
  store_local:                                    True
  store_remote:                                   True
  store_synchronous:                              True
  config:
      bucket:                                     matrix-synapse
      endpoint_url:                               'https://xxx.r2.cloudflarestorage.com/matrix-synapse'
      access_key_id:                              'xxx'
      secret_access_key:                          'xxx'

Outcome

What did you expect?

Files are uploaded.

What happened instead?

Element said the upload failed and the request in the network console showed (CANCELED).

Operating system

Ubuntu

Application version

1.11.34

How did you install the app?

Repository

Homeserver

matrix.evulid.cc

Will you send logs?

Yes, but I didn't find any relevant logs during my investigation.

@t3chguy
Copy link
Member

t3chguy commented Jul 11, 2023

Please try again with the network tab open and inspect the underlying failure. Ctrl Shift I > Network > then try

@t3chguy t3chguy added the X-Needs-Info This issue is blocked awaiting information from the reporter label Jul 11, 2023
@Cyberes
Copy link
Author

Cyberes commented Jul 11, 2023

I checked the debugger and saw that I am triggering the timeout at https://github.com/matrix-org/matrix-js-sdk/blob/acbcb4658a5d5903dfd557e3e115241d0a6f38bb/src/http-api/index.ts#L102-L111. The upload only takes a few seconds but the bar sits at 100% for a bit before the timeout kills it.

Element Desktop HAR.
https://github.com/matrix-org/synapse/files/11996114/vector.fixed.har.txt

You can read through the linked Synapse issue to get up to speed.

@t3chguy
Copy link
Member

t3chguy commented Jul 11, 2023

That implies that there was no XHR progress for 30 seconds, https://github.com/matrix-org/matrix-js-sdk/blob/acbcb4658a5d5903dfd557e3e115241d0a6f38bb/src/http-api/index.ts#L102C24-L102C24 resets the cancellation timeout on any progress. I would argue that no progress on the upload in 30s then something is up with the connection.

@t3chguy t3chguy added T-Other Questions, user support, anything else and removed X-Needs-Info This issue is blocked awaiting information from the reporter T-Defect labels Jul 11, 2023
@Cyberes
Copy link
Author

Cyberes commented Jul 11, 2023

I saw that the uploaded file appears in the media store which means that Synapse received the file correctly. Is there something with the communication between Synapse and Element around upload completion that would lead to this canceled issue?

@t3chguy
Copy link
Member

t3chguy commented Jul 11, 2023

Yes seems like either Synapse or your reverse proxy or any proxies inbetween hang for >30s at the end of the upload not sending any progress or closing the connection & sending a status code.

@Cyberes
Copy link
Author

Cyberes commented Jul 11, 2023

Okay, I'll do some more digging into my setup.

@Cyberes Cyberes closed this as completed Jul 11, 2023
@MadLittleMods MadLittleMods added the A-File-Upload Attachments and file uploads label Jul 11, 2023
@Cyberes Cyberes reopened this Jul 13, 2023
@t3chguy t3chguy added the X-Needs-Info This issue is blocked awaiting information from the reporter label Jul 13, 2023
@Cyberes
Copy link
Author

Cyberes commented Jul 13, 2023

Okay, I have some new logs to support my position that this is an Element issue. Here's the final part of the log where the client cancels the upload because it hits its internal timeout:

Log
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2078802
SSL_read: 7168
SSL_read: 3072
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2089042
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2099282
SSL_read: 3072
SSL_read: 7168
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2109522
SSL_read: 9216
SSL_read: 1024
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2119762
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2130002
SSL_read: 5120
SSL_read: 5120
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2140242
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2150482
SSL_read: 1024
SSL_read: 9216
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2160722
SSL_read: 7168
SSL_read: 3072
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2170962
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2181202
SSL_read: 3072
SSL_read: 7168
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2191442
SSL_read: 9216
SSL_read: 1024
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2201682
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2211922
SSL_read: 5120
SSL_read: 5120
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2222162
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2232402
SSL_read: 1024
SSL_read: 9216
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2242642
SSL_read: 7168
SSL_read: 3072
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2252882
SSL_read: 10240
delete posted event 00005645A6C2E150
post event 00005645A6C2E150
SSL_read: avail:0
http client request body recv 10240
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 10240 file: 0, size: 0
http write client request body, bufs 00005645A6C71848
write: 43, 00005645A6DDBD90, 10240, 2263122
SSL_read: 3072
SSL_read: 0
SSL_get_error: 6
peer shutdown SSL cleanly
http client request body recv 3072
http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 3072 file: 0, size: 0
http client request body rest 30108589
http client request body recv 0
client prematurely closed connection, client: 10.0.0.9, server: , request: "POST /_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Video.mp4 HTTP/1.1", host: "matrix.evulid.cc"
http finalize request: 400, "/_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Video.mp4" a:1, c:1
http terminate request count:1
http terminate cleanup count:1 blk:0
http posted request: "/_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Video.mp4"
http terminate handler count:1
http request count:1 blk:0
http close request
http log handler
run cleanup: 00005645A6C71858
file cleanup: fd:43
free: 00005645A6DDBD90
free: 00005645A6D706B0, unused: 1
free: 00005645A6C70870, unused: 0
free: 00005645A6E3F6F0, unused: 2808
close http connection: 42
SSL_shutdown: 1
delete posted event 00005645A6C2E150
reusable connection: 0
free: 00005645A6C702B0
free: 00005645A6DE0D90, unused: 40

Here are the important bits:

The log shows that the client prematurely closed the connection:

client prematurely closed connection, client: 10.0.0.9, server: , request: "POST /_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Video.mp4 HTTP/1.1", host: "matrix.evulid.cc"

We already know this is happening, so nothing new here.

But, take a look at this line right before the client prematurely closed connection:

http client request body rest 30108589
http client request body recv 0

The server is still in the process of receiving the request body when the client closes the connection. If I understand Nginx debug logging, recv 0 means the server is expecting to receive more data, but didn't.

@t3chguy t3chguy added X-Needs-Investigation and removed X-Needs-Info This issue is blocked awaiting information from the reporter labels Jul 13, 2023
@t3chguy
Copy link
Member

t3chguy commented Jul 13, 2023

@Cyberes while you have the logging enabled could you compare it to a working 1-file upload?

@Cyberes
Copy link
Author

Cyberes commented Jul 13, 2023

Here you go. It's the complete process, all 3000 lines. (Hopefully I removed all my personal information)
debug.txt
The file was a 940 kB image. The main POST starts on line 1984.

@Cyberes
Copy link
Author

Cyberes commented Jul 13, 2023

There is one line that shows

recv() failed (104: Connection reset by peer) while sending to client, client: 10.0.0.9, server: , request: "POST /_matrix/media/r0/upload?filename=IMG_6512.jpg HTTP/1.1", upstream: "http://127.0.0.1:8114/_matrix/media/r0/upload?filename=IMG_6512.jpg", host: "matrix.evulid.cc"

Which I don't think matters since Nginx sent 200 OK moments before: http finalize request: 0, "/_matrix/media/r0/upload?filename=IMG_6512.jpg" a:1, c:1

Could there be something weird going on where the client isn't cleanly closing the connection and its only causing issues with large file uploads?


Here is the end of the failed 35 MB video upload:

http body new buf t:1 f:0 00005645A6DDBD90, pos 00005645A6DDBD90, size: 3072 file: 0, size: 0
http client request body rest 30108589
http client request body recv 0
client prematurely closed connection, client: 10.0.0.9, server: , request: "POST /_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Porn.mp4 HTTP/1.1", host: "matrix.evulid.cc"
http finalize request: 400, "/_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Porn.mp4" a:1, c:1
http terminate request count:1
http terminate cleanup count:1 blk:0
http posted request: "/_matrix/media/r0/upload?filename=DCS%2520Fan%2520Boys%2520Rejoice%2520as%2520Ukraine%2520Releases%2520Fresh%2520Jet%2520Porn.mp4"
http terminate handler count:1
http request count:1 blk:0
http close request
http log handler
run cleanup: 00005645A6C71858
file cleanup: fd:43
free: 00005645A6DDBD90
free: 00005645A6D706B0, unused: 1
free: 00005645A6C70870, unused: 0
free: 00005645A6E3F6F0, unused: 2808
close http connection: 42
SSL_shutdown: 1
delete posted event 00005645A6C2E150
reusable connection: 0
free: 00005645A6C702B0
free: 00005645A6DE0D90, unused: 40

Vs. the end of the small image file:

http copy filter: 0 "/_matrix/media/r0/upload?filename=IMG_6512.jpg"
http finalize request: 0, "/_matrix/media/r0/upload?filename=IMG_6512.jpg" a:1, c:1
SSL_shutdown: 1
reusable connection: 1
event timer add: 44: 5000:1212692033
http lingering close handler
recv: eof:0, avail:-1
recv: fd:44 -1 of 4096
recv() not ready (11: Resource temporarily unavailable)
lingering read: -2
event timer: 44, old: 1212692033, new: 1212692033
http lingering close handler
recv: eof:1, avail:-1
recv: fd:44 31 of 4096
lingering read: 31
recv: eof:1, avail:0
recv: fd:44 -1 of 4096
recv() failed (104: Connection reset by peer) while sending to client, client: 10.0.0.9, server: , request: "POST /_matrix/media/r0/upload?filename=IMG_6512.jpg HTTP/1.1", upstream: "http://127.0.0.1:8114/_matrix/media/r0/upload?filename=IMG_6512.jpg", host: "matrix.evulid.cc"
lingering read: -1
http request count:1 blk:0
http close request
http log handler
free: 00005645A6DF56D0
free: 0000000000000000
free: 00005645A6D94710
free: 00005645A6DDBD90
free: 00005645A6D9D070, unused: 1
free: 00005645A6E1DA80, unused: 0
free: 00005645A6E3F6F0, unused: 0
free: 00005645A6C772F0, unused: 2579
close http connection: 44
SSL_shutdown: 1
event timer del: 44: 1212692033
reusable connection: 0
free: 00005645A6EEF640
free: 00005645A6E4EAB0
free: 00005645A6C65FA0, unused: 24
free: 00005645A6D79D60, unused: 400

@Cyberes
Copy link
Author

Cyberes commented Oct 18, 2023

@t3chguy I know its been a few months, but I'm still having this issue on version 1.94.0.

@t3chguy
Copy link
Member

t3chguy commented Oct 18, 2023

@t3chguy I know its been a few months, but I'm still having this issue on version 1.94.0.

That isn't an Element Web/Desktop version.

@Cyberes
Copy link
Author

Cyberes commented Oct 18, 2023

I'm referring to the Synapse version. Anyway, I disabled Cloudflare for matrix.evulid.cc and I was able to upload a few dozen files without any (CANCELED) error. I then switched CF back on and got the error again.

I've never encountered an issue like this with CF. Do you know of any conflicts that may be occurring between Element -> Cloudflare -> Synapse?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-File-Upload Attachments and file uploads T-Other Questions, user support, anything else X-Needs-Investigation
Projects
None yet
Development

No branches or pull requests

3 participants