Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate podsetupdates to jobs #1180

Merged
merged 7 commits into from
Oct 13, 2023

Conversation

mimowo
Copy link
Contributor

@mimowo mimowo commented Oct 4, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Part of #1145

Special notes for your reviewer:

For all fields we want to fail on conflict according to the KEP: https://github.com/kubernetes-sigs/kueue/tree/main/keps/1145-additional-labels.

Adjusted unit tests which tested update of nodeSelectors, because this scenario could not happen e2e.

Does this PR introduce a user-facing change?

A mechanism for AdmissionChecks to provide labels, annotations, tolerations and node selectors to the pod templates when starting a job

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Oct 4, 2023
@netlify
Copy link

netlify bot commented Oct 4, 2023

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 85bc270
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/65293246bf1a9200086a5634

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 4, 2023
@mimowo mimowo marked this pull request as ready for review October 5, 2023 10:34
@mimowo mimowo force-pushed the propagate-labels branch 9 times, most recently from 5bfce40 to 88fb63f Compare October 6, 2023 11:22
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 6, 2023
@mimowo mimowo force-pushed the propagate-labels branch 6 times, most recently from 2b9ca75 to 66c9c7e Compare October 6, 2023 18:21
@mimowo
Copy link
Contributor Author

mimowo commented Oct 11, 2023

Also, it looks like the PR description is outdated?

Which part of the description? I looks all accurate to me.

@mimowo mimowo force-pushed the propagate-labels branch 3 times, most recently from 473d049 to 7f247f4 Compare October 11, 2023 16:58
@mimowo
Copy link
Contributor Author

mimowo commented Oct 12, 2023

Also, it looks like the PR description is outdated?

Which part of the description? I looks all accurate to me.

Adjusted the description due to the drop of the nodeSelectorOverwrite

@mimowo
Copy link
Contributor Author

mimowo commented Oct 12, 2023

I think all comments are addressed.

Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM
I left a comment for a nit.

// Merge updates or appends the replica metadata & spec fields based on PodSetInfo.
// If returns error if there is a conflict.
func Merge(meta *metav1.ObjectMeta, spec *v1.PodSpec, info PodSetInfo) error {
if err := info.Merge(PodSetInfo{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with either way. If @alculquicondor would like to take another way let us any comment here.

pkg/controller/jobframework/reconciler.go Outdated Show resolved Hide resolved
@@ -675,14 +676,27 @@ func (r *JobReconciler) getPodSetsInfoFromAdmission(ctx context.Context, w *kueu
return nil, err
}
for k, v := range flv.Spec.NodeLabels {
nodeSelector.NodeSelector[k] = v
podSetInfo.NodeSelectorOverwrite[k] = v
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any conflicts, because the node selectors coming from the job are used to filter out flavors. https://kueue.sigs.k8s.io/docs/concepts/resource_flavor/#resourceflavor-labels

In almost situations, resourceFlavors are created by batch admins and jobs are created by batch users. So I guess that conflicts could happen. Certainly, when users set the proper parameters for flvs and jobs, conflicts won't happen.

Actually, current implementation, each integration controller overwrites podSpec's nodeSelector with podSetInfo one (flvs):

info := podSetsInfo[0]
j.Spec.Template.Spec.NodeSelector = utilmaps.MergeKeepFirst(info.NodeSelector, j.Spec.Template.Spec.NodeSelector)

However, I think the previous implementation (8555fd3) looks a bit hacky.

...
	newNodeSelector := make(map[string]string)
	for k, v := range o.NodeSelector {
		if _, exists := podSetInfo.NodeSelectorOverwrite[k]; !exists {
			newNodeSelector[k] = v
		}
	}
	if err := utilmaps.HaveConflict(podSetInfo.NodeSelector, newNodeSelector); err != nil {
...

In this implementation, maybe utilmaps.HaveConflict(podSetInfo.NodeSelector, newNodeSelector) always returns nil, right?

Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice work! Thank you!
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 13, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 975543b57801ad9a50214e3490fafb8b5b8ef51a

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mimowo, tenzen-y

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 13, 2023
@k8s-ci-robot k8s-ci-robot merged commit 8375309 into kubernetes-sigs:main Oct 13, 2023
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.5 milestone Oct 13, 2023
@alculquicondor
Copy link
Contributor

/release-note-edit

A mechanism for AdmissionChecks to provide labels, annotations, tolerations and node selectors to the pod templates when starting a job

PBundyra pushed a commit to PBundyra/kueue that referenced this pull request Oct 26, 2023
* propagate podsetupdates to jobs

* Remarks

* Move test test

* Remove focus

* consistent naming for `podSetInfos`

* consistent naming

* revert unnecessary rename
@mimowo mimowo deleted the propagate-labels branch November 29, 2023 14:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants