
Since last nightly, winit doesn't create windows any more #43102

Closed

est31 opened this issue Jul 7, 2017 · 13 comments

est31 commented Jul 7, 2017

Doing the following:


$ git clone https://github.com/tomaka/winit
$ cd winit
$ cargo run --example window

gives me a window on stable, beta, and 1.20.0-nightly (3610a70ce 2017-07-05).
On 1.20.0-nightly (696412de7 2017-07-06) I get no window.

List of the commits.

Platform is Linux/X11.

cc @tomaka


est31 commented Jul 7, 2017

Obviously this also affects glium examples.


est31 commented Jul 7, 2017

Using rust-bisect, I've bisected the regression to #42816 (cd72f2e). cc @alexcrichton @nikomatsakis.


est31 commented Jul 7, 2017

Minimized example using only the x11_dl crate (x11-dl = "2.8" in Cargo.toml):

extern crate x11_dl;

pub use x11_dl::xlib::Xlib;
pub use x11_dl::xcursor::Xcursor;
use x11_dl::error::OpenError;

fn foo() -> Result<Xlib, OpenError> {
    // opening the libraries
    println!("new 1");
    let xlib = try!(Xlib::open());
    println!("new 2");
    let _xcursor = try!(Xcursor::open());

    Ok(xlib)
}

fn main() {
    foo().unwrap();
    println!("Good!");
}

On the old nightly it prints "new 1", then "new 2", then "Good!". On the new nightly it prints "new 1" and then segfaults.


rukai commented Jul 7, 2017

The issue doesn't occur when running in a separate thread:

extern crate x11_dl;

pub use x11_dl::xlib::Xlib;
pub use x11_dl::xcursor::Xcursor;
use x11_dl::error::OpenError;
use std::thread;
use std::time::Duration;

fn foo() -> Result<Xlib, OpenError> {
    // opening the libraries
    println!("new 1");
    let xlib = try!(Xlib::open());
    println!("new 2");
    let _xcursor = try!(Xcursor::open());

    Ok(xlib)
}

fn main() {
    thread::spawn(|| {
        foo().unwrap();
    });
    thread::sleep(Duration::new(999,0));
}


est31 commented Jul 7, 2017

Hmm, so locally applying #43072 on top of commit cd72f2e didn't help: I still get the segfault with a locally compiled rustc.


est31 commented Jul 7, 2017

The location at which the stack overflow occurs seems to be here.

Full backtrace:

#0  0x00005555555c52b0 in compiler_builtins::probestack::__rust_probestack ()
    at src/rustc/compiler_builtins_shim/../../libcompiler_builtins/src/probestack.rs:55
#1  0x000055555555dbbe in lazy_static::lazy::{{impl}}::get::{{closure}}<[(&str, usize); 767],fn() -> [(&str, usize); 767]>
    () at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/lazy_static-0.2.8/src/lazy.rs:22
#2  0x000055555555d82a in std::sync::once::{{impl}}::call_once::{{closure}}<closure> ()
    at ~/src/rust/src/libstd/sync/once.rs:227
#3  0x0000555555580d6d in std::sync::once::Once::call_inner (
    self=0x5555557f51d0 <<x11_dl::xlib::Xlib::init::SYMS as core::ops::deref::Deref>::deref::__stability::LAZY+8>, 
    ignore_poisoning=<error reading variable: access outside bounds of object referenced via synthetic pointer>, 
    init=...) at src/libstd/sync/once.rs:307
#4  0x000055555555d6cc in std::sync::once::Once::call_once<closure> (
    self=0x5555557f51d0 <<x11_dl::xlib::Xlib::init::SYMS as core::ops::deref::Deref>::deref::__stability::LAZY+8>, f=...)
    at ~/src/rust/src/libstd/sync/once.rs:227
#5  0x000055555557dc00 in lazy_static::lazy::Lazy<[(&str, usize); 767]>::get<[(&str, usize); 767],fn() -> [(&str, usize); 767]> (self=<optimized out>)
    at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/lazy_static-0.2.8/src/lazy.rs:22
#6  x11_dl::xlib::{{impl}}::init::{{impl}}::deref::__stability () at <__lazy_static_internal macros>:20
#7  x11_dl::xlib::{{impl}}::init::{{impl}}::deref (self=0x5555555c9743 <str.eC>) at <__lazy_static_internal macros>:21
#8  0x000055555557d532 in x11_dl::xlib::Xlib::init (self=0x7ffffffef940)
    at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/x11-dl-2.14.0/src/link.rs:48
#9  0x000055555557d99d in x11_dl::xlib::Xlib::open ()
    at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/x11-dl-2.14.0/src/link.rs:61
#10 0x000055555555b548 in c::foo () at src/main.rs:10
#11 0x000055555555b9ed in c::main () at src/main.rs:18


tomaka commented Jul 7, 2017

The problem is probably that a [(&str, usize); 767] is created on the stack here: https://github.com/rust-lang-nursery/lazy-static.rs/blob/63f333bf7795fdae2627041d3d2a782b50f6ddd5/src/lazy.rs#L23
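
A minimal, hypothetical reduction of that pattern (not the actual lazy_static internals): the initializer builds the whole 767-entry array by value, so some frame in the call chain needs roughly 767 * 24 ≈ 18 KB at once. That's several pages, which is exactly the kind of frame that now gets a probe call in its prologue.

// Sketch only: a function whose frame must hold all 767 entries.
fn build_syms() -> [(&'static str, usize); 767] {
    // ~18 KB on x86_64: large enough that the compiler emits a
    // stack probe before the array is written.
    [("symbol", 0usize); 767]
}

fn main() {
    let syms = build_syms();
    println!("{} entries", syms.len());
}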

alexcrichton commented

OK, I believe this is going to be fixed by rust-lang/compiler-builtins#176. Once that lands I'll open a PR to pull it into rust-lang/rust.

After some debugging with @est31, @cuviper also recommended that I cc @bwhacks here, as I'm told you're working on the kernel side of some related changes. @bwhacks, you may be quite interested in this patch!

First, some background may help. Everyone seems super interested in big stacks nowadays, in lots of different ways! A few days ago we landed a change which switched to using stack probes on all x86/x86_64 platforms. This uses upstream LLVM's support for emitting a call to a function as part of a function's prologue, and the intention is to touch each page of a stack frame to ensure they're all mapped. Basically we just want to guarantee that, for all threads on all platforms, if you have a guard page you'll hit it.
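
Conceptually the probe does something like the following (a sketch only: the real __rust_probestack is hand-written assembly in compiler-builtins, and the page size and pointer details here are assumptions):

const PAGE_SIZE: usize = 4096;

// Touch one byte in every page of a frame growing down from
// stack_top, so each page is either faulted in by the kernel or
// hits the guard page before the function body runs.
unsafe fn probe_stack(stack_top: *mut u8, frame_size: usize) {
    let mut off = PAGE_SIZE;
    while off < frame_size {
        // Each write lands at most one page below the previous one,
        // so the guard page can never be skipped over.
        std::ptr::write_volatile(stack_top.offset(-(off as isize)), 0);
        off += PAGE_SIZE;
    }
}

fn main() {
    // Demonstrate on a heap buffer standing in for a stack region.
    let mut buf = vec![0u8; 5 * PAGE_SIZE];
    let top = unsafe { buf.as_mut_ptr().offset(buf.len() as isize) };
    unsafe { probe_stack(top, buf.len()) };
    println!("probed {} pages", buf.len() / PAGE_SIZE);
}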

The actual stack probing function is in this file (note that this link is current as of the time of this writing); we define it locally, and LLVM controls the ABI of how it's called (register-wise and whatnot).

Unfortunately, though, it looks like the addition of stack probes has caused this issue as a regression, as well as #43110. After debugging this with @est31 we noticed that the segfault in this issue (with the program in the comment above) was suspicious in that it was faulting just below the end of the stack. In other words, the segfault was happening just below the lowest address of the mapped stack, but within a page of it.

I'm personally quite unfamiliar with the implementation of stack growth in the kernel, but the investigation on #43052 clued me in to at least some changes happening on the kernel side recently. Based on this I stabbed in the dark with rust-lang/compiler-builtins@f9f6bd0, and it ended up fixing the regression for @est31 (fixing this issue).

So with all that said, I think this may be another case where the recent changes in the kernel related to stack growth have accidentally broken programs. My guess is that the kernel previously grew the stack for any fault within a certain range, but nowadays it also requires that the faulting page be above the stack pointer. Our __rust_probestack function originally faulted addresses below the stack pointer; the fix was to update the stack pointer as we go along, which appeared to appease the kernel and trigger the normal stack-growth logic.
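
A hypothetical model of that condition (addresses are plain integers, no real memory is touched, and the strict "fault must be at or above the stack pointer" rule is the guess from above, not the kernel's actual code):

const PAGE: usize = 4096;

// Guessed rule: a fault only grows the stack if it lands at or
// above the current stack pointer.
fn grows_stack(fault_addr: usize, sp: usize) -> bool {
    fault_addr >= sp
}

// Old probing order: sp stays put while pages below it are touched.
fn old_probe(sp: usize, frame: usize) -> bool {
    (1..=frame / PAGE).all(|i| grows_stack(sp - i * PAGE, sp))
}

// Fixed probing order: move sp down one page, then touch at sp.
fn new_probe(mut sp: usize, frame: usize) -> bool {
    for _ in 0..frame / PAGE {
        sp -= PAGE;
        if !grows_stack(sp, sp) {
            return false;
        }
    }
    true
}

fn main() {
    let (sp, frame) = (1 << 30, 5 * PAGE); // a ~20 KB frame
    println!("old order grows stack: {}", old_probe(sp, frame)); // false
    println!("new order grows stack: {}", new_probe(sp, frame)); // true
}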

Note that this isn't really a problem for us. The __rust_probestack function was only very recently added and never made its way into a release of Rust itself. We'll have this fixed shortly, and when stack probes are stabilized we'll have the new behavior regardless. In other words, on our end there's no great urgency to "fix" anything in the kernel, but it seemed like you would want to be aware of this!


cuviper commented Jul 7, 2017

One more detail: we were seeing this even after removing Rust's custom guard page, so it seems not to be an issue with the new padding. It looks more like the kernel is putting new constraints on stack growth below the actual stack pointer. Maybe it's enforcing no access farther below %sp than the x86_64 red zone (128 bytes)?


cuviper commented Jul 7, 2017

Hmm, the x86 page fault handler checks %sp before expanding the stack:

	/*
	 * Accessing the stack below %sp is always a bug.
	 * The large cushion allows instructions like enter
	 * and pusha to work. ("enter $65535, $31" pushes
	 * 32 pointers and then decrements %sp by 65535.)
	 */
	if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) {
		bad_area(regs, error_code, address);
		return;
	}

So "always a bug" but allows some cushion anyway -- yet it seems like we aren't getting that allowance. This code has not been changed in many years, nor does it appear to be changed in the Fedora kernel I'm running.


est31 commented Jul 7, 2017

Many thanks @alexcrichton for looking into this issue!

fintelia commented

This issue is made worse by the fact that cargo run doesn't display segmentation fault information. Failing silently is problematic in situations where no output is interpreted as success. Though even printing "segmentation fault" isn't ideal, given that a more precise error would be "stack overflow".

briansmith commented

There is no red zone on Windows, and (IIRC) you must always move the stack pointer below an address before you access anything at that address.
