From 3f45bf7d57f059d751436bf907fbff825d10e7c0 Mon Sep 17 00:00:00 2001
From: "W. Trevor King"
Date: Fri, 18 Dec 2015 10:42:33 -0800
Subject: [PATCH] config-linux: Specify host mount namespace for namespace paths

Avoid trouble with situations like:

  # mount --bind /mnt/test /mnt/test
  # mount --make-rprivate /mnt/test
  # touch /mnt/test/mnt /mnt/test/user
  # mount --bind /proc/123/ns/mnt /mnt/test/mnt
  # mount --bind /proc/123/ns/user /mnt/test/user
  # nsenter --mount=/proc/123/ns/mnt --user=/proc/123/ns/user sh

which uses the required private mount for binding mount namespace
references [1,2,3].  We want to avoid:

1. Runtime opens /mnt/test/mnt as fd 3.
2. Runtime joins the mount namespace referenced by fd 3.
3. Runtime fails to open /mnt/test/user, because /mnt/test is not
   visible in the current mount namespace.

and instead get runtime authors to set up flows like:

1. Runtime opens /mnt/test/mnt as fd 3.
2. Runtime opens /mnt/test/user as fd 4.
3. Runtime joins the mount namespace referenced by fd 3.
4. Runtime joins the user namespace referenced by fd 4.

This also applies to new namespace creation.  We want to avoid:

1. Runtime clones a container process with a new mount namespace.
2c. Container process fails to open /mnt/test/user, because /mnt/test
    is not visible in the current mount namespace.

in favor of something like:

1. Runtime opens /mnt/test/user as fd 3.
2. Runtime clones a container process with a new mount namespace.
3h. Host process closes unneeded fd 3.
3c. Container process joins the user namespace referenced by fd 3.

I also define runtime and container namespaces, so we have consistent
terminology.  I prefer:

* host namespace: a namespace you are in when you invoke the runtime
* host process: the runtime process invoked by the user
* container process: the process created by a clone call in the host
  process which will eventually execute the user-configured process.

Both the host and container processes are running runtime code
(although the container process eventually transitions to
user-configured code), so I find "runtime process", "runtime
namespace", etc. to be imprecise.  However, the maintainer consensus
is for "runtime namespace" [4,5], so that's what we're going with
here.

[1]: http://karelzak.blogspot.com/2015/04/persistent-namespaces.html
[2]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4ce5d2b1a8fde84c0eebe70652cf28b9beda6b4e
[3]: http://mid.gmane.org/87haeahkzc.fsf@xmission.com
[4]: https://github.com/opencontainers/specs/pull/275#discussion_r48057211
[5]: https://github.com/opencontainers/specs/pull/275#discussion_r48324264

Signed-off-by: W. Trevor King
---
 config-linux.md |  2 +-
 glossary.md     | 10 ++++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/config-linux.md b/config-linux.md
index 5e77c113a..90e2cbbf8 100644
--- a/config-linux.md
+++ b/config-linux.md
@@ -34,7 +34,7 @@ The following parameters can be specified to setup namespaces:
     * **`uts`** the container will be able to have its own hostname and domain name
     * **`user`** the container will be able to remap user and group IDs from the host to local users and groups within the container
 
-* **`path`** *(string, optional)* - path to namespace file
+* **`path`** *(string, optional)* - path to namespace file in the [runtime mount namespace](glossary.md#runtime-namespace)
 
 If a path is specified, that particular file is used to join that type of namespace.
 Also, when a path is specified, a runtime MUST assume that the setup for that particular namespace has already been done and error out if the config specifies anything else related to that namespace.
diff --git a/glossary.md b/glossary.md
index f9d11c4f5..6b6eb5949 100644
--- a/glossary.md
+++ b/glossary.md
@@ -13,6 +13,10 @@ The [`config.json`](config.md) file in a [bundle](#bundle) which defines the int
 An environment for executing processes with configurable isolation and resource limitations.
 For example, namespaces, resource limits, and mounts are all part of the container environment.
 
+## Container namespace
+
+On Linux, a leaf in the [namespace][namespaces.7] hierarchy in which the [configured process](config.md#process-configuration) executes.
+
 ## JSON
 
 All configuration [JSON][] MUST be encoded in [UTF-8][].
@@ -22,5 +26,11 @@ All configuration [JSON][] MUST be encoded in [UTF-8][].
 An implementation of this specification.
 It reads the [configuration files](#configuration) from a [bundle](#bundle), uses that information to create a [container](#container), launches a process inside the container, and performs other [lifecycle actions](runtime.md).
 
+## Runtime namespace
+
+On Linux, a leaf in the [namespace][namespaces.7] hierarchy from which the [runtime](#runtime) process is executed.
+New container namespaces will be created as children of the runtime namespaces.
+
 [JSON]: http://json.org/
 [UTF-8]: http://www.unicode.org/versions/Unicode8.0.0/ch03.pdf
+[namespaces.7]: http://man7.org/linux/man-pages/man7/namespaces.7.html
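
As a rough illustration of the open-before-join flow from the commit message (not part of the patch itself), the following Python sketch opens every configured namespace path while the runtime mount namespace is still current, and only then joins the namespaces. The function name `join_namespaces`, the `paths` mapping, and the injected `setns` callable are hypothetical names chosen for this example; a real runtime would call setns(2) on each descriptor, which requires privileges.

```python
import os
from typing import Callable, Dict, List


def join_namespaces(paths: Dict[str, str], setns: Callable[[int], None]) -> List[str]:
    """Open every namespace path first, then join the namespaces in order.

    `paths` maps a namespace type (e.g. "mount", "user") to a path that is
    resolved in the runtime mount namespace.  `setns` is a hypothetical
    stand-in for setns(2), injected so the sketch runs without privileges.
    """
    fds: Dict[str, int] = {}
    joined: List[str] = []
    try:
        # Resolve all paths before switching any namespace, so a path such
        # as /mnt/test/user is opened while /mnt/test is still visible.
        for ns_type, path in paths.items():
            fds[ns_type] = os.open(path, os.O_RDONLY)
        # Only now join, using the already-open file descriptors.
        for ns_type, fd in fds.items():
            setns(fd)
            joined.append(ns_type)
    finally:
        # The descriptors are not needed once the joins are done (or failed).
        for fd in fds.values():
            os.close(fd)
    return joined
```

Calling `join_namespaces({"mount": "/mnt/test/mnt", "user": "/mnt/test/user"}, setns)` mirrors steps 1 to 4 of the preferred flow: both opens happen before the first join, so the second path no longer has to be reachable once the mount namespace has changed.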