Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory delay free kasan error fix and memory note fix #13898

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

XuNeo
Copy link
Contributor

@XuNeo XuNeo commented Oct 8, 2024

Summary

  1. Hold lock when access mm struct field from sched_note_heap.
  2. Fix kasan reported error when delay free feature is enabled. See comments in code.
  /* If delay free is enabled, a memory node will be freed twice.
   * The first time is to add the node to the delay list, and the second
   * time is to actually free the node. Therefore, we only colorize the
   * memory node the first time, when `delay` is set to true.
   */

3. Remove CONFIG_MM_MAP_COUNT_MAX option. User should take care of it. It causes ltp_interfaces_mmap_24_1 test to fail, remove this patch for now.

Impact

Bug fixes should have no impact on existing projects.

Testing

Local tested with qemu.

Configure

tools/configure.sh mps3-an547:nsh

Change CONFIG_MM_REGIONS = 1 to reduce the heap size, in order to make kasan faster.

Enable

CONFIG_MM_KASAN=y
CONFIG_MM_FREE_DELAYCOUNT_MAX=16
CONFIG_MM_FILL_ALLOCATIONS=y

Run

 qemu-system-arm -M mps3-an547 -m 2G -nographic -kernel nuttx -gdb tcp::1127 -serial pty

Result

  • Before the fix
kasan_report: kasan detected a write access error, address at 0x1019f08,size is 1, return address: 0xf4f7                                     │start () at armv8-m/arm_vectors.c:66
kasan_show_memory: Shadow bytes around the buggy address:                                                                                     │66        asm volatile ("mov lr, #0\n\t"
kasan_show_memory:   0x1019eb0: 8c 2a f8 01 9d 89 22 ca e0 e9 90 40 d1 d3 15 0a                                                               │(gdb) c
kasan_show_memory:   0x1019ec0: 41 f2 70 95 43 f1 ea aa 89 a6 e9 fb 5a 4e fc 62                                                               │Continuing.
kasan_show_memory:   0x1019ed0: fd 86 8c c0 a5 8c dc 24 e7 4a 1a c9 e9 64 bd 52                                                               │[Inferior 1 (process 1) exited normally]
kasan_show_memory:   0x1019ee0: 70 97 68 55 71 dc a1 de 1f 68 d2 9a e5 12 78 7b                                                               │(gdb) c
kasan_show_memory:   0x1019ef0: 83 ab c2 51 32 06 2b 97 65 76 26 e2 2c 06 25 72                                                               │The program is not being run.
kasan_show_memory:   0x1019f00: aa aa aa aa 61 03 00 00[c0 97 01 01 55 55 55 55                                                               │(gdb) tar rem:1127
kasan_show_memory:   0x1019f10: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55                                                               │`/home/neo/projects/nuttx/nuttx/nuttx' has changed; re-reading symbols.
kasan_show_memory:   0x1019f20: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55                                                               │Remote debugging using :1127
kasan_show_memory:   0x1019f30: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55                                                               │
kasan_show_memory:   0x1019f40: 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55                                                               │Program received signal SIGTRAP, Trace/breakpoint trap.
dump_assert_info: Current Version: NuttX  12.7.0-RC0 1e556412c0d-dirty Oct  8 2024 14:36:37 arm                                               │start () at armv8-m/arm_vectors.c:66
dump_assert_info: Assertion failed panic: at file: kasan/hook.c:187 task: memstress process: memstress 0x3e56
  • After the fix
    memstress runs without kasan error.

@github-actions github-actions bot added Area: Memory Management Memory Management issues Size: S The size of the change in this PR is small labels Oct 8, 2024
@nuttxpr
Copy link

nuttxpr commented Oct 8, 2024

[Experimental Bot, please feedback here]

NuttX PR Requirements Review

This PR generally meets the NuttX requirements, but has some minor issues:

Strengths:

  • Summary: Provides a clear and concise explanation of the changes made. Each point addresses a specific issue, and the code comment excerpt is helpful.
  • Impact: Clearly outlines the limited impact of the changes. Removing CONFIG_MM_MAP_COUNT_MAX is acknowledged as a potential impact point, which is good.
  • Testing: Demonstrates testing effort by specifying the target, configuration, and steps to reproduce the issue. The inclusion of "before" and "after" logs is very helpful.

Areas for Improvement:

  • Testing - Build Host(s): While you mention "Local tested with qemu," please provide more details about your build host environment (OS, CPU architecture, compiler version).
  • Testing - Logs: The provided logs are helpful, but could be more focused. Instead of large log dumps, consider highlighting the specific lines demonstrating the issue before the fix and the successful execution after the fix.
  • Testing - Breadth: While testing with qemu is a good starting point, consider adding tests on real hardware if possible. This helps ensure the fix generalizes well.

Conciseness:

The PR description is concise and to the point.

Overall:

This PR demonstrates a good understanding of the NuttX requirements. Addressing the testing details will further strengthen the submission.

@XuNeo
Copy link
Contributor Author

XuNeo commented Oct 8, 2024

sim:citest reports multiple ltp tests failed in CI.
But it can pass locally.

E.g.

nsh> ltp_interfaces_mmap_24_1
Total mapped size is 1048576 bytes
Test PASSED
nsh>

@GUIDINGLI
Copy link
Contributor

rebase you PR, and try again

mm/mm_heap/mm_free.c Show resolved Hide resolved
mm/mm_heap/mm_free.c Show resolved Hide resolved
mm/tlsf/mm_tlsf.c Show resolved Hide resolved
mm/tlsf/mm_tlsf.c Show resolved Hide resolved
@XuNeo XuNeo force-pushed the mm-fix-up branch 3 times, most recently from f3865de to 2f70e6c Compare October 14, 2024 02:34
@github-actions github-actions bot added Size: M The size of the change in this PR is medium and removed Size: S The size of the change in this PR is small labels Oct 14, 2024
@github-actions github-actions bot added Size: S The size of the change in this PR is small and removed Size: M The size of the change in this PR is medium labels Oct 14, 2024
@XuNeo XuNeo force-pushed the mm-fix-up branch 2 times, most recently from 3696370 to 74408bf Compare October 15, 2024 11:53
XuNeo and others added 3 commits October 16, 2024 00:39
Signed-off-by: xuxingliang <xuxingliang@xiaomi.com>
Signed-off-by: Xu Xingliang <xuxingliang@xiaomi.com>
mm_heap/mm_malloc.c: In function 'mm_malloc':
mm_heap/mm_malloc.c:281:33: warning: excess elements in struct initializer
  281 |         MM_BACKTRACE_ALLOC_PID, 0, ULONG_MAX
      |                                 ^
mm_heap/mm_malloc.c:281:33: note: (near initialization for 'dump')
mm_heap/mm_malloc.c:281:36: warning: excess elements in struct initializer
  281 |         MM_BACKTRACE_ALLOC_PID, 0, ULONG_MAX
      |                                    ^~~~~~~~~
mm_heap/mm_malloc.c:281:36: note: (near initialization for 'dump')

Signed-off-by: hanqiyuan <hanqiyuan@xiaomi.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Memory Management Memory Management issues Size: S The size of the change in this PR is small
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants