cgroup: add PIDs cgroup controller support #446
@mrunalp The main change I made to #58 was to actually explicitly It should be noted that the "fix" for getting systemd to "work" is to just use the fs driver for basically everything. Unfortunately, there are deeper systemd issues here. At least it all works now. |
What problems are you saying about systemd cgroup in docker upstream exactly? I'm sure we have some systemd cgroup users and we don't see reports from them. And what's the problem about |
@hqhq Well, if you try to use the version on master and set something like |
@cyphar No I don't get any errors when setting BTW, I don't have a box with the latest kernel to test this PR out, but overall the code in this PR looks good. If there are any systemd issues with docker, I can help diagnose. |
@hqhq I'm testing this on Arch (with a custom-compiled kernel). The systemd problems I'm referring to are not related to this PR (there are a few). I'll open a separate issue (or PR) for these problems if appropriate (a SUSE customer has seen issues with systemd on SLE). |
I've rebased (which was annoying, as I just discovered the /ping @mrunalp @hqhq @vishh @crosbymichael @dqminh |
Do you have a docker branch with this integrated? |
I tried with docker. It works perfectly with fs cgroups. But seems like systemd cgroups doesn't work at all in master, so I can't test it. |
@LK4D4 Here is a fix for systemd in docker moby/moby#19149 |
@LK4D4 @mrunalp note that you'll have to re-run |
I tested definitely with this branch :) |
Good to hear. 😸 |
I tested and it works with systemd. |
LGTM |
Due to some upstream kernel changes introduced in Ultimately, we do the best we can. Anything else is either due to the fact we use Go (which isn't going to change soon) or are potential kernel bugs (gulp). NOTE: This doesn't actually affect the driver code, just the tests. |
In any case, PTAL so we can merge this soon. /ping @mrunalp @hqhq @vishh @crosbymichael @dqminh |
@cyphar Yes, I am going to test this today. Thanks! |
	t.Fatalf("expected fork() to succeed with permissive pids limit")
}

// Enforce a restrictive limit limit (shell + 6 * true + 3).
s/limit limit/limit/
I tested and it worked fine for me. Besides the nits, LGTM |
Add support for the pids cgroup controller to libcontainer, a recent feature that is available in Linux 4.3+. Unfortunately, due to the init process being written in Go, it can spawn an unknown number of threads due to blocked syscalls. This results in the init process being unable to run properly, and thus small pids.max configs won't work properly. Signed-off-by: Aleksa Sarai <asarai@suse.com>
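For readers unfamiliar with the controller, here is a minimal sketch of what setting a pids limit amounts to: writing either a number or the literal string max into the cgroup's pids.max file. This is not the code from this PR. The helper names `pidsMaxValue` and `setPidsLimit` are hypothetical, treating non-positive values as "no limit" is an assumption of the sketch, and a temporary directory stands in for a real cgroup directory.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// pidsMaxValue converts a numeric limit into the string written to pids.max.
// Assumption of this sketch: non-positive values mean "no limit", which the
// kernel spells as the literal string "max".
func pidsMaxValue(limit int64) string {
	if limit <= 0 {
		return "max"
	}
	return strconv.FormatInt(limit, 10)
}

// setPidsLimit (hypothetical helper) writes the limit into <cgroupDir>/pids.max.
func setPidsLimit(cgroupDir string, limit int64) error {
	path := filepath.Join(cgroupDir, "pids.max")
	return os.WriteFile(path, []byte(pidsMaxValue(limit)), 0o644)
}

func main() {
	// A temporary directory stands in for a real pids cgroup directory.
	dir, err := os.MkdirTemp("", "pids-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	if err := setPidsLimit(dir, 16); err != nil {
		panic(err)
	}
	data, err := os.ReadFile(filepath.Join(dir, "pids.max"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("pids.max contains %q\n", data)
}
```

On a real system the directory would be the container's path under the pids hierarchy, and the write fails if the controller is not mounted.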
Apply and Set are two separate operations, and it doesn't make sense to group the two together (especially considering that the bootstrap process is added to the cgroup as well). The only exception to this is the memory cgroup, which requires the configuration to be set before processes can join. One of the weird cases to deal with is systemd. Systemd sets some of the cgroup configuration options, but not all of them. Because memory is a special case, we need to explicitly set memory in the systemd Apply(). Otherwise, the rest can be safely re-applied in .Set() as usual. Signed-off-by: Aleksa Sarai <asarai@suse.com>
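The split described in this commit message can be sketched as two passes over the controllers: one that joins processes to each cgroup, and one that writes the limits. The `subsystem` type and the `applyAll`/`setAll` functions below are hypothetical simplifications rather than the libcontainer API, and the steps are recorded as strings instead of touching the filesystem.

```go
package main

import "fmt"

// subsystem is a hypothetical, simplified stand-in for a cgroup controller.
type subsystem struct {
	name string
	// setBeforeJoin marks controllers whose configuration must be written
	// before any process joins (memory, per the commit message).
	setBeforeJoin bool
}

// applyAll models Apply: join every cgroup, writing configuration first only
// for the controllers that require it.
func applyAll(subs []subsystem) []string {
	var steps []string
	for _, s := range subs {
		if s.setBeforeJoin {
			steps = append(steps, "set "+s.name)
		}
		steps = append(steps, "join "+s.name)
	}
	return steps
}

// setAll models Set: write the configuration for every controller.
// Re-writing memory here is assumed to be harmless in this sketch.
func setAll(subs []subsystem) []string {
	var steps []string
	for _, s := range subs {
		steps = append(steps, "set "+s.name)
	}
	return steps
}

func main() {
	subs := []subsystem{
		{name: "memory", setBeforeJoin: true},
		{name: "pids"},
		{name: "cpu"},
	}
	fmt.Println("Apply:", applyAll(subs))
	fmt.Println("Set:  ", setAll(subs))
}
```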
It is vital to loudly fail when a user attempts to set a cgroup limit (rather than using the system default). Otherwise the user will assume they have security they do not actually have. This mirrors the original Apply() (that would set cgroup configs) semantics. Signed-off-by: Aleksa Sarai <asarai@suse.com>
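A minimal sketch of the "fail loudly" rule: a missing controller is tolerated only when the user left the limit at the system default. The function `checkLimit`, the error value, and the use of 0 to mean "no limit requested" are assumptions made for illustration, not names or conventions confirmed by this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errControllerMissing is a hypothetical sentinel error for this sketch.
var errControllerMissing = errors.New("cgroup controller not available")

// checkLimit returns an error when a limit was explicitly requested for a
// controller that is not available. Assumption: limit == 0 means the user
// did not ask for a limit and the system default applies.
func checkLimit(available map[string]bool, controller string, limit int64) error {
	if limit == 0 {
		return nil // nothing requested, so a missing controller is fine
	}
	if !available[controller] {
		return fmt.Errorf("cannot enforce %s limit %d: %w", controller, limit, errControllerMissing)
	}
	return nil
}

func main() {
	available := map[string]bool{"memory": true} // pids is not mounted here

	fmt.Println("no limit requested:", checkLimit(available, "pids", 0))
	fmt.Println("limit requested:   ", checkLimit(available, "pids", 16))
}
```

Silently skipping the second case is what would leave a user believing a limit is enforced when it is not.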
Due to the fact that the init is implemented in Go (which seemingly randomly spawns new processes and loves eating memory), most cgroup configurations are required to have an arbitrary minimum dictated by the init. This confuses users and makes configuration more annoying than it should be. An example of this is pids.max, where Go spawns multiple processes that then cause init to violate the pids cgroup constraint before the container can even start. Solve this problem by setting the cgroup configurations as late as possible, to avoid hitting as many of the resources hogged by the Go init as possible. This has to be done before seccomp rules are applied, as the parent and child must synchronise in order for the parent to correctly set the configurations (and writes might be blocked by seccomp). Signed-off-by: Aleksa Sarai <asarai@suse.com>
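The ordering described in this commit message can be sketched as a handshake: the child finishes its runtime setup, the parent then sets the limits, and only afterwards does the child apply seccomp and start the user process. In the sketch below goroutines and channels stand in for the parent and child processes and their synchronisation channel. This is an analogy only; the function name `runStartup` and the event strings are hypothetical.

```go
package main

import "fmt"

// runStartup models the late-set ordering and returns the observed events.
// The channel handoffs ensure only one goroutine touches events at a time.
func runStartup() []string {
	var events []string
	ready := make(chan struct{})     // child -> parent: setup finished
	limitsSet := make(chan struct{}) // parent -> child: limits are in place
	done := make(chan struct{})      // child -> parent: user process started

	// The "child": stands in for the container init.
	go func() {
		events = append(events, "child: runtime setup")
		close(ready)
		<-limitsSet
		// Seccomp comes after the limits, since the synchronisation writes
		// might otherwise be blocked by the filter.
		events = append(events, "child: apply seccomp")
		events = append(events, "child: exec user process")
		close(done)
	}()

	// The "parent": waits for the child, then sets the limits as late as possible.
	<-ready
	events = append(events, "parent: set cgroup limits")
	close(limitsSet)
	<-done
	return events
}

func main() {
	for _, e := range runStartup() {
		fmt.Println(e)
	}
}
```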
@mrunalp Nits addressed. |
@cyphar Thanks! LGTM |
great!! |
This is a fixed up version of the now-reverted #58. The main issue with that code was that some of the memory cgroup's limits weren't being set at all if you were using systemd. Unfortunately, I seem to be unable to test the systemd portion (and the systemd test cases for memory don't actually test that the limits cause a process to fail).
All other cgroups were unaffected by this bug (as their limits were set using .Set(), and were joined either explicitly or using systemd), regardless of whether they were unsupported, partially supported or fully supported by systemd.
Some testing has revealed that the old setup didn't work either (because getSubsystemPath doesn't appear to work as expected).
/cc @mrunalp @hqhq @vishh @crosbymichael @dqminh