
Unity's spinlocks are bullying the Garbage Collector, and Refunding is not helping the situation. #21

Closed
Lisias opened this issue Jun 23, 2021 · 12 comments
Assignees
Labels
Not My Fault Not my Fault! I'm innocent! :) task Not a problem, but something I must do and should not forget about
Milestone

Comments

@Lisias
Contributor

Lisias commented Jun 23, 2021

Launch this craft with Recall installed.

See hell breaking loose over the poor Garbage Collector's head.

(attached image: bully-scholarship-edition-thumb04)

(There's also a memory leak happening in 3rd-party code, completely unrelated to Recall)

@Lisias Lisias changed the title Possible Memory Leak involving KSP-Recall somehow. Refunding is bullying the Garbage Collector. A lot! Jun 23, 2021
@Lisias Lisias self-assigned this Jun 23, 2021
@Lisias Lisias added the bug Something isn't working label Jun 23, 2021
@Lisias
Contributor Author

Lisias commented Jun 23, 2021

Process Sample (macOS) of the 261 in progress:

Sample of KSP.txt

Note the many threads at 100% CPU in semaphore_wait_trap (they did spinlocks on a bytecode virtual machine? really???), and the GC thread also being chewed at 100% in GC_header_cache_miss and GC_generic_malloc.

@Lisias Lisias added task Not a problem, but something I must do and should not forget about and removed bug Something isn't working labels Jun 23, 2021
@Lisias
Contributor Author

Lisias commented Jun 23, 2021

This issue is not a bug on Refunding. The inherent problem is a terrible decision made by Unity to use spinlocks instead of mutexes in memory-related critical sections of code.

This also explains a lot of different problems KSP has been suffering from for a long time.

See these posts on the forum:

@Lisias Lisias added this to the 0.2.0.1 milestone Jun 23, 2021
@Lisias Lisias changed the title Refunding is bullying the Garbage Collector. A lot! Unity's spinlocks are bullying the Garbage Collector, and Refunding is not helping on the situation. Jun 23, 2021
@Lisias
Contributor Author

Lisias commented Jun 23, 2021

Renaming the issue to properly address the situation.

KSP-Recall's Refunding is, indeed, abusing the GC a bit - but this is not what's driving the process into a deadlock. At worst, it would cause some stuttering while launching the craft.

But since fixing Unity is beyond the scope of KSP-Recall, I will have to tackle this somehow in Refunding, perhaps hindering Refunding itself a bit (it will depend on whether I manage to keep the critical code inside FixedUpdate, or have to move it to Update and risk it not being called).

@mgalyean

Is the spinlock at the mono level and perhaps tunable via mono environment variables? Ideally there would be a way to tell it to use mutexes instead of spinlocks for the GC at least, but this is very unlikely to be doable. There are a lot of GC env var options but I don't recall seeing anything specific to how the locking works skimming the vars

@Lisias
Contributor Author

Lisias commented Jun 23, 2021

Is the spinlock at the mono level and perhaps tunable via mono environment variables? Ideally there would be a way to tell it to use mutexes instead of spinlocks for the GC at least, but this is very unlikely to be doable. There are a lot of GC env var options but I don't recall seeing anything specific to how the locking works skimming the vars

Nope, they are in the C code:

    1279 Thread_2729384: Finalizer
    + 1279 thread_start  (in libsystem_pthread.dylib) + 13  [0x7fff73d0040d]
    +   1279 _pthread_start  (in libsystem_pthread.dylib) + 66  [0x7fff73d04249]
    +     1279 _pthread_body  (in libsystem_pthread.dylib) + 126  [0x7fff73d012eb]
    +       1279 GC_start_routine  (in libmonobdwgc-2.0.dylib) + 28  [0x115be9987]
    +         1279 GC_inner_start_routine  (in libmonobdwgc-2.0.dylib) + 90  [0x115be99f2]
    +           1279 start_wrapper  (in libmonobdwgc-2.0.dylib) + 652  [0x115b75b1b]
    +             1279 finalizer_thread  (in libmonobdwgc-2.0.dylib) + 671  [0x115baee76]
    +               1279 semaphore_wait_trap  (in libsystem_kernel.dylib) + 10  [0x7fff73c42256]

https://github.com/Unity-Technologies/bdwgc/tree/6e8db07c8d2e45fe60016a95c343912e038e5e9d

But even if we had a way to limit this using some configuration option, we would still have to co-exist with the rest of the Operating System.

The only (and I really mean only) place where spinlocks are safe to use is on embedded systems, where you have FULL control of every piece of the hardware and so can define in advance how many threads the whole system will have.

@mgalyean

So the only option would be a purpose made c(++?) library on an OS specific basis with a spinlock API that actually did mutexes under the hood...and the KSP env would have to be setup so mono could only see that flavor of the library. And if the library is compiled into the runtime then there is nothing that can be done without binary editing/redirecting in the runtime.

@Lisias
Contributor Author

Lisias commented Jun 23, 2021

So the only option would be a purpose made c(++?) library on an OS specific basis with a spinlock API that actually did mutexes under the hood...and the KSP env would have to be setup so mono could only see that flavor of the library. And if the library is compiled into the runtime then there is nothing that can be done without binary editing/redirecting in the runtime.

One size does not fit all.

You need a specialised library for every OS, exploiting its strengths and avoiding its weaknesses.

Lisias added a commit that referenced this issue Jun 27, 2021
…bage collecting (due the destroyance of the unneeded resources).

Works around #21 .
@Lisias
Contributor Author

Lisias commented Jun 27, 2021

I reworked the code to prevent removing a Refunding resource once it's created. It solved the deadlocks; the process gets only 13 to 15% of CPU load (an i7 with 8 threads -- max CPU load is 800%).

However, the memory continued to go through the roof. With that 1,000-part craft from the OP, my machine just can't launch it - the process gets to 32GB of allocated memory, with 30GB of compressed memory and about 2GB of "Real Memory".

Technically, this solved the issue: by avoiding destroying the Resource, I prevented the need for GC and, so, the spinlocks didn't fire. But the memory continued to blow through the roof. :(

Code on commit 9b74455

@Lisias
Contributor Author

Lisias commented Jun 27, 2021

The changes on commit 9b74455 do the following:

  1. Create the resource on demand.
  2. this.part.Resources.Add it.
  3. Eventually this.part.Resources.Remove it when appropriate, to prevent the fake resource from screwing up parts where some add'ons (or even KSP) don't expect any resource. Stackable parts also get the fake resource removed whether it's needed or not, as I didn't manage to support them properly.
  4. The resource is this.part.Resources.Add'ed back when there is an expectation that the craft could be recovered.

The fake resource is not destroyed anymore - it is saved in a private variable and constantly reused.

So, Refunding is not responsible for the memory leak - it's only the trigger. The memory leak is happening in the this.part.Resources.Add call!!!

IT'S A KSP BUG. Yeah, another one.

This is royally screwing up the hack - I just can't shove the fake resource into every part via Module Manager, as this would screw up add'ons that rely on some parts having NO resources. Stackable parts are just one of the problems I need to prevent.

But if I add them at runtime, lots and lots of memory are allocated for no apparent reason - 93% of the memory ends up compressed; only 7% of the process memory is really in use at any given moment. This strongly hints at a pretty nasty memory leak in the Resources.Add code.

@Lisias
Contributor Author

Lisias commented Jun 27, 2021

Flagging this as "Not my fault".

Refunding was not as efficient as possible, but the code was working fine (and given my time constraints at that time, I'm not ashamed of it).

Refunding is now as efficient as possible (besides wasting some memory, as the fake resource is now only destroyed when the part is destroyed), which kinda "fixes" the problem described in this issue.

However, the memory leak persists. It will be tackled (or not....) in issue #23.

@Lisias
Contributor Author

Lisias commented Jun 27, 2021

Ping @mgalyean .

@Lisias
Contributor Author

Lisias commented Jun 27, 2021

Closing this, as the code was successful in preventing the spinlock madness from being triggered. (The madness is still there, ready to be triggered by other innocent code, but there's only so much I can do about it.)

@Lisias Lisias closed this as completed Jun 27, 2021