Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fetch current container runtime config #686

Merged
merged 1 commit into from
Oct 10, 2024
Merged

Conversation

tariq1890
Copy link
Contributor

@tariq1890 tariq1890 commented Sep 9, 2024

Summary of changes made in this PR:

  1. In CRI-O and Containerd config engines, retrieve the container runtime config commands via the following commands:
    i. crio status config
    ii. containerd config dump

  2. When configuring CRI-O runtime, we prioritise the runtime designated as the default_runtime in the config when setting up the nvidia runtime handler as opposed to favouring the runc runtime handler at all times. This is needed as vanilla cri-o packages have crun as the default low-level-runtime. Live-swapping the low-level runtime causes the running containers in a cluster to break

  3. Add the full path of the low-level runtime to the nvidia-container-runtime.runtimes config value, so that nvidia-container-runtime binds to a low-level runtime binary that cannot be resolved in the PATH.

pkg/config/engine/api.go Outdated Show resolved Hide resolved
pkg/config/source.go Outdated Show resolved Hide resolved
pkg/config/source.go Outdated Show resolved Hide resolved
pkg/config/toml.go Outdated Show resolved Hide resolved
pkg/config/toml/source.go Outdated Show resolved Hide resolved
pkg/config/engine/api.go Outdated Show resolved Hide resolved
Comment on lines 456 to 458
toolkitRuntimeList := getNvidiaContainerRuntimeList(cfg, opts.ContainerRuntimeRuntimes.Value())
if len(toolkitRuntimeList) > 0 {
configValues["nvidia-container-runtime.runtimes"] = toolkitRuntimeList
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should no longer be required.

We will ONLY set the binaries from the runtimes. Note that if opts.ContainerRuntimeRuntimes is set we will overwrite the value we set here as part of processing the "options" config options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will ONLY set the binaries from the runtimes.

I don't think we can do that. If you look at the default containerd config toml, the BinaryName fields are empty, meaning that the PATH-resolvable low-level-runtime binary paths will be used. So we still need the default list ["docker-runc", "runc", "crun"]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we implement it so that if the slice is empty we don't update the config. Alternatively see the comment below.

Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tariq1890.

I think this has come together very nicely. I have some additional comments.

@elezar
Copy link
Member

elezar commented Oct 9, 2024

Thanks for the work on this @tariq1890.

One question that I had was whether we should update how we apply the modified config. At present we're going to generate a file that includes the full config. This may not be desirable when we are using drop-in files in crio, for example. I don't think it's a blocker for this iteration, but it would be good to put some thought into how the system will behave under these conditions.

@tariq1890 tariq1890 force-pushed the get-config-from-cmdline branch 3 times, most recently from a0ddda5 to 6266ef3 Compare October 9, 2024 19:51
@tariq1890
Copy link
Contributor Author

At present we're going to generate a file that includes the full config. This may not be desirable when we are using drop-in files in crio, for example. I don't think it's a blocker for this iteration, but it would be good to put some thought into how the system will behave under these conditions.

Agreed, one way is to modify the command line config source to only return the runtimes config block and everything under it. Although, this could easily apply to every other config source

Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patience @tariq1890.

Let's get this in!

Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>

add default runtime binary path to runtimes field of toolkit config toml

Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>

[no-relnote] Get low-level runtimes consistently

We ensure that we use the same low-level runtimes regardless
of the runtime engine being configured. This ensures consistent
behaviour.

Signed-off-by: Evan Lezar <elezar@nvidia.com>

Co-authored-by: Evan Lezar <elezar@nvidia.com>

address review comment

Signed-off-by: Tariq Ibrahim <tibrahim@nvidia.com>
Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@elezar elezar merged commit a06d838 into main Oct 10, 2024
10 checks passed
@elezar elezar deleted the get-config-from-cmdline branch October 10, 2024 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants