Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nucleo_f103rb: samples/posix/eventfd/ failed since "retargetable locking" addition #38601

Closed
erwango opened this issue Sep 16, 2021 · 6 comments · Fixed by #38625
Closed

nucleo_f103rb: samples/posix/eventfd/ failed since "retargetable locking" addition #38601

erwango opened this issue Sep 16, 2021 · 6 comments · Fixed by #38625
Assignees
Labels
bug The issue is a bug, or the PR is fixing a bug platform: STM32 ST Micro STM32 priority: medium Medium impact/importance bug Regression Something, which was working, does not anymore
Milestone

Comments

@erwango
Copy link
Member

erwango commented Sep 16, 2021

Describe the bug
Sample samples/posix/eventfd is failed since 43b8002

To Reproduce
twister --ninja --runtime-artifact-cleanup -s samples/posix/eventfd/sample.posix.eventfd -p nucleo_f103rb --device-testing --device-serial /dev/ttyACM0

Expected behavior
Test should pass

Additional input
EDIT: According to bench history, it is possible that nucleo_l073rz was also impacted by same commit on same test

@erwango erwango added bug The issue is a bug, or the PR is fixing a bug platform: STM32 ST Micro STM32 priority: medium Medium impact/importance bug Regression Something, which was working, does not anymore labels Sep 16, 2021
@erwango
Copy link
Member Author

erwango commented Sep 16, 2021

@stephanosio Would you mind having a look?
I'm available to do some tests if needed.

@stephanosio stephanosio added this to the v2.7.0 milestone Sep 16, 2021
@stephanosio
Copy link
Member

@erwango Could you post the console output? What exactly is the failure message?

@erwango
Copy link
Member Author

erwango commented Sep 16, 2021

@stephanosio There is no console output. We're getting a hard fault at the first printf occurence:
Here is an abstract

Breakpoint 4, 0x08008224 in _vfprintf_r ()
(gdb) si
Info : halted: PC: 0x0800a000
halted: PC: 0x0800a000
0x0800a000 in _localeconv_r ()
[....]
k_spin_unlock (key=..., l=0x20001c81) at /local/mcu/zephyrproject/zephyr/kernel/mutex.c:120
120			k_spin_unlock(&lock, key);
(gdb) 
arch_irq_unlock (key=0) at /local/mcu/zephyrproject/zephyr/include/arch/arm/aarch32/asm_inline_gcc.h:95
95		__asm__ volatile(
(gdb) 
Info : halted: PC: 0x080062a4
halted: PC: 0x080062a4
Info : halted: PC: 0x080062a8
halted: PC: 0x080062a8
z_impl_k_mutex_lock (mutex=0x20000f5c <__lock___sinit_recursive_mutex>, timeout=...)
    at /local/mcu/zephyrproject/zephyr/kernel/mutex.c:124
124			return 0;
(gdb) 
Info : halted: PC: 0x080062aa
halted: PC: 0x080062aa
Info : halted: PC: 0x080062ac
halted: PC: 0x080062ac
Info : halted: PC: 0x08009c20
halted: PC: 0x08009c20
0x08009c20 in __sinit ()
(gdb) 
Single stepping until exit from function __sinit,
which has no line number information.
Info : halted: PC: 0x08009c24
halted: PC: 0x08009c24
[...]
Info : halted: PC: 0x0800c6c8
halted: PC: 0x0800c6c8
__retarget_lock_init_recursive (lock=0x200001b0 <impure_data+432>)
    at /local/mcu/zephyrproject/zephyr/lib/libc/newlib/libc-hooks.c:374
374		*lock = malloc(sizeof(struct k_mutex));
(gdb) 
Info : halted: PC: 0x0800c6ca
halted: PC: 0x0800c6ca
Info : halted: PC: 0x0800c6cc
halted: PC: 0x0800c6cc
Info : halted: PC: 0x0800c6ce
halted: PC: 0x0800c6ce
Info : halted: PC: 0x0800720c
halted: PC: 0x0800720c
Info : halted: PC: 0x0800c6d4
halted: PC: 0x0800c6d4
708		return z_impl_k_mutex_init(mutex);
(gdb) 
Info : halted: PC: 0x0800c6d8
halted: PC: 0x0800c6d8
Info : halted: PC: 0x0800f860
halted: PC: 0x0800f860
z_impl_k_mutex_init (mutex=0x20004870) at /local/mcu/zephyrproject/zephyr/kernel/mutex.c:50
50	{
(gdb) 
Info : halted: PC: 0x0800f862
halted: PC: 0x0800f862
51		mutex->owner = NULL;
(gdb) 
Info : halted: PC: 0x0800f864
halted: PC: 0x0800f864
Info : halted: PC: 0x0800f868
halted: PC: 0x0800f868
Info : halted: PC: 0x0800f86c
halted: PC: 0x0800f86c
60		return 0;
(gdb) 
Info : halted: PC: 0x08009c7c
halted: PC: 0x08009c7c
0x08009c7c in __sinit ()
(gdb) 
Single stepping until exit from function __sinit,
which has no line number information.
Info : halted: PC: 0x08009c80
halted: PC: 0x08009c80
[..]
Info : halted: PC: 0x0800c6c8
halted: PC: 0x0800c6c8
__retarget_lock_init_recursive (lock=0x20000218 <impure_data+536>)
    at /local/mcu/zephyrproject/zephyr/lib/libc/newlib/libc-hooks.c:374
374		*lock = malloc(sizeof(struct k_mutex));
(gdb) 
Info : halted: PC: 0x0800c6ca
halted: PC: 0x0800c6ca
[...]
Info : halted: PC: 0x0800c6d4
halted: PC: 0x0800c6d4
708		return z_impl_k_mutex_init(mutex);
(gdb) 
Info : halted: PC: 0x0800c6d8
halted: PC: 0x0800c6d8
Info : halted: PC: 0x0800f860
halted: PC: 0x0800f860
z_impl_k_mutex_init (mutex=0x0) at /local/mcu/zephyrproject/zephyr/kernel/mutex.c:50
50	{
(gdb) 
Info : halted: PC: 0x0800f862
halted: PC: 0x0800f862
51		mutex->owner = NULL;
(gdb) 
Info : halted: PC: 0x0800f864
halted: PC: 0x0800f864
Info : halted: PC: 0x08003360
halted: PC: 0x08003360
z_arm_usage_fault () at /local/mcu/zephyrproject/zephyr/arch/arm/core/aarch32/cortex_m/fault_s.S:80
80		mrs r0, MSP

@stephanosio
Copy link
Member

@erwango Could you set CONFIG_ASSERT=y and check if you see any failures?

z_impl_k_mutex_init (mutex=0x0) at /local/mcu/zephyrproject/zephyr/kernel/mutex.c:50

Looking at the debug log, it looks like the malloc in __retarget_lock_init_recursive failed due to insufficient SRAM.

If you enable ASSERT, you will likely see the following messages:

  • "memory space available for newlib heap is less than the minimum required size specified by
    CONFIG_NEWLIB_LIBC_MIN_REQUIRED_HEAP_SIZE"
  • "recursive lock allocation failed"

This is because there is very little leftover SRAM, which the newlib uses for its heap, for nucleo_l073rz with this particular sample:

Memory region         Used Size  Region Size  %age Used
           FLASH:       73248 B       128 KB     55.88%
            SRAM:       18528 B        20 KB     90.47%
        IDT_LIST:          0 GB         2 KB      0.00%

@stephanosio
Copy link
Member

From #38044 (comment):

Normally, for any statically allocated regions, twister will skip the test if there is an overflow; but here, we implicitly make use of the leftover area without the linker's knowledge, so there is no linker error and twister ends up treating this as a failure at runtime.

Due to how Zephyr-to-newlib integration is implemented, there is no easy way for the twister to filter out/skip these boards/tests -- this will be fixed eventually in the next release as noted in the comment referenced above.

For now, we should identify all the failing sample-board combos (mostly the boards with very small SRAM and the samples with very high SRAM usage) and add platform_exclude to them manually. If there happen to be too many though, we can alternatively consider setting the min_ram in the testcase.yaml or sample.yaml so that twister does not run these tests on the devices with little SRAM.

@erwango
Copy link
Member Author

erwango commented Sep 17, 2021

Thanks for the help @stephanosio

If you enable ASSERT, you will likely see the following messages:

Unfortunately, we don't get messages even with asserts enabled and even in debugger we don't reach __assert_func.

For now, we should identify all the failing sample-board combos [...]

To save some time for others, I'll use min_ram: 32.

erwango added a commit to erwango/zephyr that referenced this issue Sep 17, 2021
This test makes use of dynamic RAM allocation done on the leftover
SRAM. Similarly to test/posix/eventfd, limit this test execution on
platforms with at least 32K of RAM available.

Fixes zephyrproject-rtos#38601

Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org>
cfriedt pushed a commit that referenced this issue Sep 20, 2021
This test makes use of dynamic RAM allocation done on the leftover
SRAM. Similarly to test/posix/eventfd, limit this test execution on
platforms with at least 32K of RAM available.

Fixes #38601

Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org>
Rushybrook pushed a commit to Rushybrook/zephyr that referenced this issue Oct 21, 2021
This test makes use of dynamic RAM allocation done on the leftover
SRAM. Similarly to test/posix/eventfd, limit this test execution on
platforms with at least 32K of RAM available.

Fixes zephyrproject-rtos#38601

Signed-off-by: Erwan Gouriou <erwan.gouriou@linaro.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The issue is a bug, or the PR is fixing a bug platform: STM32 ST Micro STM32 priority: medium Medium impact/importance bug Regression Something, which was working, does not anymore
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants