armv5te hello world fails to run on qemu and pi3 #46822

Closed
malbarbo opened this issue Dec 18, 2017 · 19 comments

Comments

@malbarbo
Contributor

malbarbo commented Dec 18, 2017

A hello world binary compiled with xargo and Rust nightly-2017-11-16 runs as expected, both in QEMU and on a Raspberry Pi 3. Compiled with rustc nightly-2017-11-17, it segfaults.

Here is a docker file that can be used to reproduce the problem:

FROM ubuntu:17.10

RUN apt-get update
RUN apt-get install \
    qemu-user \
    curl ca-certificates \
    make file \
    gcc libc6-dev \
    gcc-arm-linux-gnueabi libc6-dev-armel-cross \
    -y --no-install-recommends
# change to 2017-11-17 to fail
RUN curl https://sh.rustup.rs -sSf | sh -s -- -y --default-toolchain nightly-2017-11-16 
ENV PATH=$PATH:/root/.cargo/bin/
RUN rustup component add rust-src
RUN cargo install xargo

ENV USER=root
RUN cargo new hello --bin

RUN mkdir hello/.cargo/
RUN echo "[target.armv5te-unknown-linux-gnueabi]\nlinker = \"arm-linux-gnueabi-gcc\"" > hello/.cargo/config

RUN echo "[target.armv5te-unknown-linux-gnueabi.dependencies.std]\nfeatures = [\"force_alloc_system\"] " > hello/Xargo.toml

RUN echo "\n[profile.release]\npanic = \"abort\"" >> hello/Cargo.toml

ENV CFLAGS_armv5te_unknown_linux_gnueabi="-march=armv5te -mfloat-abi=soft" \
    CC_armv5te_unknown_linux_gnueabi=arm-linux-gnueabi-gcc

RUN cd hello && xargo build --release --target armv5te-unknown-linux-gnueabi

RUN QEMU_STRACE=1 qemu-arm -L /usr/arm-linux-gnueabi hello/target/armv5te-unknown-linux-gnueabi/release/hello

According to @Dushistov, the crash happens in __sync_val_compare_and_swap_4, but I get a different result.

Edit: removed the stack trace; it was not helping and it took too much space.

@Dushistov
Contributor

@malbarbo

According to @Dushistov, the crash happens in __sync_val_compare_and_swap_4
And here is a strace running in raspberry pi 3:

But strace only shows system calls; it doesn't show where the crash happens. Or did I miss something?

If you have no gdb on the board (as in my case), you can run ulimit -c unlimited
and then inspect the core.SOME_PID files generated by the crash, using the gdb from the cross toolchain on your PC.

@malbarbo
Contributor Author

@Dushistov You are right. I didn't mean to imply that strace shows where the crash happens, sorry.

@malbarbo
Contributor Author

Running in gdb I get the following backtrace:

#0  0x00000000 in ?? ()
#1  0x0040f800 in __sync_fetch_and_add_4 ()
#2  0x004047dc in std::io::stdio::stdout::hf90edfc5dc4b8a28 ()
#3  0x00405028 in std::io::stdio::_print::h73be5c1a0e336538 ()
#4  0x00401dc0 in hello::main () at src/main.rs:2
#5  0x004087bc in __rust_maybe_catch_panic ()
#6  0x00408230 in std::rt::lang_start::h62f49e8260dc865f ()
#7  0x00401e28 in main ()

Here is the disassembly of __sync_fetch_and_add_4:

Dump of assembler code for function __sync_fetch_and_add_4:
   0x0000f7e0 <+0>:     push    {r4, r5, r6, lr}
   0x0000f7e4 <+4>:     mov     r4, r1
   0x0000f7e8 <+8>:     mov     r2, r0
   0x0000f7ec <+12>:    ldr     r5, [r2]
   0x0000f7f0 <+16>:    ldr     r6, [pc, #24]   ; 0xf810 <__sync_fetch_and_add_4+48>
   0x0000f7f4 <+20>:    add     r1, r5, r4
   0x0000f7f8 <+24>:    mov     r0, r5
   0x0000f7fc <+28>:    blx     r0
   0x0000f800 <+32>:    cmp     r0, #0
   0x0000f804 <+36>:    bne     0xf7ec <__sync_fetch_and_add_4+12>
   0x0000f808 <+40>:    mov     r0, r5
   0x0000f80c <+44>:    pop     {r4, r5, r6, pc}
   0x0000f810 <+48>:                    ; <UNDEFINED> instruction: 0xffff0fc0

Here is the disassembly for the working version:

Dump of assembler code for function __sync_fetch_and_add_4:
   0x0000fc18 <+0>:     push    {r4, r5, r6, r7, r8, lr}
   0x0000fc1c <+4>:     mov     r5, r0
   0x0000fc20 <+8>:     mov     r7, r1
   0x0000fc24 <+12>:    ldr     r6, [pc, #40]   ; 0xfc54 <__sync_fetch_and_add_4+60>
   0x0000fc28 <+16>:    ldr     r4, [r5]
   0x0000fc2c <+20>:    mov     r2, r5
   0x0000fc30 <+24>:    add     r1, r4, r7
   0x0000fc34 <+28>:    mov     r0, r4
   0x0000fc38 <+32>:    mov     lr, pc
   0x0000fc3c <+36>:    bx      r6
   0x0000fc40 <+40>:    cmp     r0, #0
   0x0000fc44 <+44>:    bne     0xfc28 <__sync_fetch_and_add_4+16>
   0x0000fc48 <+48>:    mov     r0, r4
   0x0000fc4c <+52>:    pop     {r4, r5, r6, r7, r8, lr}
   0x0000fc50 <+56>:    bx      lr
   0x0000fc54 <+60>:                    ; <UNDEFINED> instruction: 0xffff0fc0

Maybe @jamesmunns can help with this?

@jamesmunns
Member

Hey @malbarbo, a bit of quick background information:

The armv5te architecture is a little unusual for ARM: the CPU has no native atomic instructions, which on its own rules out Arc<_> and anything that relies on it (a good portion of std). Linux provides a "workaround" in the form of OS-level helper routines, exposed in the 0xffffxxxx address space, that offer a slower but usable set of atomic operations. There is some background information on my PR, as well as this kernel.org document.

I unfortunately am not aware whether non-armv5te kernels contain this "feature", but if not, these crashes would make sense to me, since the kernel is not stepping in to catch these shims.

I also do not know what the issue is here between the different versions of Rust.

I am currently running your dockerfile with the 2017-11-17 version of Rust, and I will extract the binary and run on some actual armv5te hardware, and see if I can repro for you.

I don't think this helps much, but I'll re-read this later when I have more time, and I am happy to explain anything I can. Feel free to re-ping me or ask questions.
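
To make the helper mechanism described above concrete, here is a minimal sketch of how a fetch-and-add can be built on a compare-and-exchange helper. It mirrors the retry loop visible in the working disassembly, but it is not the libgcc or compiler-builtins source. The names kuser_cmpxchg_model and fetch_and_add_4 are hypothetical, and the helper is emulated with a host atomic (an assumption, so the sketch runs on any machine) instead of jumping to 0xffff0fc0.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Stand-in for the kernel helper reached through 0xffff0fc0.
/// Contract modeled here: returns 0 if `*ptr` was changed from `oldval`
/// to `newval`, non-zero if no exchange happened.
/// Assumption: emulated with a host atomic so the sketch runs anywhere.
fn kuser_cmpxchg_model(oldval: u32, newval: u32, ptr: &AtomicU32) -> i32 {
    match ptr.compare_exchange(oldval, newval, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(_) => 0,
        Err(_) => 1,
    }
}

/// Hypothetical equivalent of `__sync_fetch_and_add_4`: retry until the
/// compare-and-exchange succeeds, then return the value seen before the add.
fn fetch_and_add_4(ptr: &AtomicU32, val: u32) -> u32 {
    loop {
        let old = ptr.load(Ordering::Relaxed);
        let new = old.wrapping_add(val);
        if kuser_cmpxchg_model(old, new, ptr) == 0 {
            return old;
        }
        // Another thread changed the value in between: reload and try again.
    }
}

fn main() {
    let counter = AtomicU32::new(40);
    let previous = fetch_and_add_4(&counter, 2);
    println!("previous = {}, now = {}", previous, counter.load(Ordering::SeqCst));
}
```

The point of the sketch is that the call must go through the helper's address; the broken build instead branched to r0, which held the old value.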

@parched
Contributor

parched commented Dec 19, 2017

Looks like the problem is that the previous (working) code came from libgcc, while the new code comes from rust-lang/compiler-builtins#115.

cc @Amanieu

@jamesmunns
Member

I can confirm the -17 version also fails on real armv5te hardware:

# uname -a
Linux 4.4.24 #1 Fri Nov 24 13:37:38 UTC 2017 armv5tejl GNU/Linux
# ls -hal ./hello
-rwxr-xr-x    1 root     root      129.2K Dec 19 18:02 ./hello
# ./hello
Illegal instruction
#

My ARM assembly is a bit weak, so I may not be able to help @Amanieu, but I am happy to test any potential changes on my side (and with my hardware).

@malbarbo
Contributor Author

Thanks @jamesmunns and @parched for taking a look at this.

@Amanieu
Member

Amanieu commented Dec 19, 2017

@jamesmunns Could you try running the program in gdb and disassembling the function in which the crash occurs?

(gdb) run
(gdb) backtrace
(gdb) disas

@parched
Contributor

parched commented Dec 19, 2017

0x0000f7fc <+28>:    blx     r0

This can't be right, can it? r0 is one of the arguments to that call? I wonder if LLVM is mucking up the register allocation for the inline assembly.

@Amanieu
Member

Amanieu commented Dec 20, 2017

@parched The blx instruction should be supported on all ARMv5 systems. Are you sure your hardware isn't ARMv4 by any chance? You are getting the same Illegal instruction crash as @jamesmunns, right?

@Amanieu
Member

Amanieu commented Dec 20, 2017

@parched Actually, you are right, there does seem to be something wrong with the register allocation. I'll look into it.

@Amanieu
Member

Amanieu commented Dec 20, 2017

See rust-lang/compiler-builtins#218

@jamesmunns
Member

Hey @Amanieu, I don't currently have a build for my device with gdb enabled (I'm not actively working on that device at the moment). Do you still need this, or should I wait for the next nightly to test the changes introduced by rust-lang/compiler-builtins#218?

Let me know. If you still need it, I will set up a new build and image for my device with gdb, etc.

@Amanieu
Member

Amanieu commented Dec 20, 2017

@jamesmunns You should wait for the next nightly. The previous code was definitely broken.

@malbarbo
Contributor Author

@Amanieu Thanks for working on this. I made a PR updating the compiler_builtins crate.

kennytm added a commit to kennytm/rust that referenced this issue Dec 21, 2017
@green-s
Contributor

green-s commented Dec 22, 2017

Should the latest nightly have fixed this (250b492 2017-12-21)? I'm still getting a segfault.

@arielb1
Contributor

arielb1 commented Dec 22, 2017

@green-s
Contributor

green-s commented Dec 23, 2017

Tested the new nightly (5165ee9) in QEMU and on-device. It now successfully prints hello world but segfaults afterwards.

(gdb) run
Starting program: /home/sam/hello-world
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
Hello, world!

Program received signal SIGSEGV, Segmentation fault.
0x00410644 in __sync_val_compare_and_swap_4 ()
(gdb) backtrace
#0  0x00410644 in __sync_val_compare_and_swap_4 ()
#1  0x00407ec4 in std::rt::lang_start::hc79ba98377dc1008 ()
#2  0xb6e1c2cc in __libc_start_main (main=0xbefffa74, argc=-1225486336, argv=0xb6e1c2cc <__libc_start_main+280>, init=<optimized out>,
    fini=0x410b48 <__libc_csu_fini>, rtld_fini=0xb6fdfc60 <_dl_fini>, stack_end=0xbefffa74) at libc-start.c:287
#3  0x00401d1c in _start ()
(gdb) disas
Dump of assembler code for function __sync_val_compare_and_swap_4:
   0x00410634 <+0>:     push    {r4, r5, r6, lr}
   0x00410638 <+4>:     mov     r4, r2
   0x0041063c <+8>:     mov     r6, r1
   0x00410640 <+12>:    mov     r5, r0
=> 0x00410644 <+16>:    ldr     r0, [r4]
   0x00410648 <+20>:    cmp     r0, r5
   0x0041064c <+24>:    popne   {r4, r5, r6, pc}
   0x00410650 <+28>:    ldr     r3, [pc, #28]   ; 0x410674 <__sync_val_compare_and_swap_4+64>
   0x00410654 <+32>:    mov     r0, r5
   0x00410658 <+36>:    mov     r1, r6
   0x0041065c <+40>:    mov     r2, r4
   0x00410660 <+44>:    blx     r3
   0x00410664 <+48>:    cmp     r0, #0
   0x00410668 <+52>:    bne     0x410644 <__sync_val_compare_and_swap_4+16>
   0x0041066c <+56>:    mov     r0, r5
   0x00410670 <+60>:    pop     {r4, r5, r6, pc}
   0x00410674 <+64>:                    ; <UNDEFINED> instruction: 0xffff0fc0
End of assembler dump.
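
Reading the dump above: the faulting load goes through r4, which was copied from r2, the function's third argument. GCC documents this builtin as __sync_val_compare_and_swap(ptr, oldval, newval), so under the ARM calling convention the pointer would arrive in r0. One possible reading, not confirmed in this thread, is that the routine followed the kernel helper's (oldval, newval, ptr) order instead. The sketch below only illustrates the two orders. The function names are hypothetical and the helper is emulated with a host atomic.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Model of the kernel helper's argument order: (oldval, newval, ptr).
/// Returns 0 if the exchange happened, non-zero otherwise.
/// Assumption: emulated with a host atomic instead of the real helper.
fn kuser_cmpxchg_model(oldval: u32, newval: u32, ptr: &AtomicU32) -> i32 {
    match ptr.compare_exchange(oldval, newval, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(_) => 0,
        Err(_) => 1,
    }
}

/// GCC-style argument order: (ptr, oldval, newval).
/// Returns the value that was stored in `*ptr` before the operation.
fn sync_val_compare_and_swap_4(ptr: &AtomicU32, oldval: u32, newval: u32) -> u32 {
    loop {
        let current = ptr.load(Ordering::SeqCst);
        if current != oldval {
            // Comparison failed: report what is there, leave it untouched.
            return current;
        }
        // Note the reordering when forwarding to the helper.
        if kuser_cmpxchg_model(oldval, newval, ptr) == 0 {
            return oldval;
        }
    }
}

fn main() {
    let cell = AtomicU32::new(1);
    let seen = sync_val_compare_and_swap_4(&cell, 1, 7);
    println!("seen = {}, now = {}", seen, cell.load(Ordering::SeqCst));
}
```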

@Amanieu
Member

Amanieu commented Dec 23, 2017

bors added a commit to rust-lang/compiler-builtins that referenced this issue Dec 23, 2017
bors added a commit that referenced this issue Dec 25, 2017
bors added a commit that referenced this issue Dec 27, 2017
Add dist builder for armv5te-unknown-linux-gnueabi (again)

The dist builder was first added in #46498 and later removed in #46498 because of #46822.

#46822 seems to be fixed now (@green-s and I have [tested](#46498 (comment)) it).