Panic/SIGSEGV in unwind cleanup #66781

Closed
sfackler opened this issue Nov 26, 2019 · 6 comments
Labels
A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@sfackler
Member

sfackler commented Nov 26, 2019

We had a Rust (1.39.0 stable on x86_64-unknown-linux-gnu) service running on a server that started to exhaust its resources - in particular, mmap calls were returning ENOMEM. While in this state, the standard library's unwinding glue started encountering some issues:

  1. Unwrapping a None value:
...
   5: rust_begin_unwind
             at src/libstd/panicking.rs:307
   6: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
   7: core::panicking::panic
             at src/libcore/panicking.rs:49
   8: core::option::Option<T>::unwrap
             at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libcore/macros.rs:12
      panic_unwind::imp::cleanup
             at src/libpanic_unwind/gcc.rs:91
      __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:83
   9: std::panicking::try
             at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/panicking.rs:271
      std::panic::catch_unwind
             at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/panic.rs:394
...

That corresponds to this unwrap here: https://github.com/rust-lang/rust/blob/1.39.0/src/libpanic_unwind/gcc.rs#L91.

  2. A SIGSEGV inside of __rust_maybe_catch_panic:
Signal: 11 Code: 1
0x0055d61629b70a - __rust_maybe_catch_panic (0x0055d61629b6f0) + 0x1a
0x0055d615da0838 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h72262d2d5e42fdc2 (0x0055d615da07c0) + 0x78
0x0055d6162839af - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h483711add4ba2330 (0x0055d616283970) + 0x3f
0x0055d61629aae0 - std::sys::unix::thread::Thread::new::thread_start::h7c2a7f9b68fe4bba (0x0055d61629aa50) + 0x90
0x007f6290088e65 - start_thread (0x007f6290088da0) + 0xc5
0x007f628e8d188d - ???? 

My guess is that the inability to mmap somehow put the unwinder in a bad place?

@sfackler sfackler changed the title Panic in unwind cleanup Panic/SIGSEGV in unwind cleanup Nov 26, 2019
@jonas-schievink jonas-schievink added A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows C-bug Category: This is a bug. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 26, 2019
@nagisa
Member

nagisa commented Nov 28, 2019

@sfackler do you have a coredump of the process? Would be interesting to see what the pointer/argument values are…

Either way I think the only feasible way to investigate this issue would be for us to look into doing some syscall fault injection fuzzing kind of thing.

@sfackler
Member Author

Unfortunately, I was not able to grab any cores :(

Yeah, if my hypothesis about mmap failures is correct, you could maybe LD_PRELOAD in a mmap that can be triggered to fail or something.

@Amanieu
Member

Amanieu commented Dec 8, 2019

Case 1 can be caused if a foreign (C/C++) exception is caught by a catch_panic. This is now fixed by #65646.

I'm not sure about case 2, but it might be a similar issue.
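For readers following along, here is a minimal sketch of what catch_unwind does with an ordinary Rust panic, which is the case the unwinding glue expects. The helper name `run_guarded` is hypothetical and not part of std. It only illustrates the Rust-panic path: a foreign C/C++ exception is not a Rust panic and carries no Rust payload, which, per the comment above, is the situation that could trigger case 1 on 1.39.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical helper: runs `f` and turns a Rust panic into an `Err`
/// holding the panic message, instead of letting it unwind further.
fn run_guarded<F: FnOnce() -> i32>(f: F) -> Result<i32, String> {
    panic::catch_unwind(AssertUnwindSafe(f)).map_err(|payload| {
        // A Rust panic payload is a `Box<dyn Any + Send>`. For `panic!`
        // with a literal it is a `&'static str`; with format arguments
        // it is a `String`.
        if let Some(s) = payload.downcast_ref::<&str>() {
            s.to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "non-string panic payload".to_string()
        }
    })
}

fn main() {
    // Normal return: the value passes through untouched.
    assert_eq!(run_guarded(|| 7), Ok(7));

    // Rust panic: caught here and reported as an error value.
    let caught = run_guarded(|| -> i32 { panic!("boom") });
    assert_eq!(caught, Err("boom".to_string()));
}
```

The default panic hook still prints the panic message to stderr before catch_unwind returns; the sketch leaves that behaviour as is.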

@sfackler
Member Author

sfackler commented Dec 8, 2019

Ah! That could explain it. The server does talk to RocksDB via some slightly sloppy FFI bindings that don't catch C++ exceptions, and I wouldn't be surprised if some error in rocks throws.

@Amanieu
Member

Amanieu commented Dec 8, 2019

In your particular case I suspect it's just C++ throwing std::bad_alloc when new fails.

@Amanieu
Member

Amanieu commented Dec 8, 2019

I think we can close this issue unless the problem is reproduced with 1.40+, which includes #65646.

@Amanieu Amanieu closed this as completed Dec 8, 2019