Propagate podsetupdates to jobs #1180
Which part of the description? It all looks accurate to me.
Adjusted the description, since nodeSelectorOverwrite was dropped.
I think all comments are addressed.
Overall LGTM
I left a comment for a nit.
test/integration/controller/jobs/tfjob/tfjob_controller_test.go
// Merge updates or appends the replica metadata & spec fields based on PodSetInfo.
// It returns an error if there is a conflict.
func Merge(meta *metav1.ObjectMeta, spec *v1.PodSpec, info PodSetInfo) error {
	if err := info.Merge(PodSetInfo{
I'm OK with either way. If @alculquicondor would prefer another approach, please leave a comment here.
@@ -675,14 +676,27 @@ func (r *JobReconciler) getPodSetsInfoFromAdmission(ctx context.Context, w *kueu
 	return nil, err
 }
 for k, v := range flv.Spec.NodeLabels {
-	nodeSelector.NodeSelector[k] = v
+	podSetInfo.NodeSelectorOverwrite[k] = v
There shouldn't be any conflicts, because the node selectors coming from the job are used to filter out flavors. https://kueue.sigs.k8s.io/docs/concepts/resource_flavor/#resourceflavor-labels
In most situations, resourceFlavors are created by batch admins and jobs are created by batch users, so I guess conflicts could happen. Certainly, when users set proper parameters for both flavors and jobs, conflicts won't happen.
Actually, in the current implementation, each integration controller overwrites the podSpec's nodeSelector with the one from podSetInfo (the flavors):
kueue/pkg/controller/jobs/job/job_controller.go
Lines 227 to 228 in c3cfc18
info := podSetsInfo[0]
j.Spec.Template.Spec.NodeSelector = utilmaps.MergeKeepFirst(info.NodeSelector, j.Spec.Template.Spec.NodeSelector)
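To make the overwrite behavior described above concrete, here is a minimal sketch. `mergeKeepFirst` is a hypothetical local stand-in, not the real `utilmaps.MergeKeepFirst`; it assumes, based only on the name and the usage quoted above, that the first argument wins on a key collision. The label keys and values are made up for illustration.

```go
package main

import "fmt"

// mergeKeepFirst is a hypothetical stand-in for utilmaps.MergeKeepFirst.
// Assumption: when a key is present in both maps, the value from the
// first map is kept.
func mergeKeepFirst(first, second map[string]string) map[string]string {
	out := make(map[string]string, len(first)+len(second))
	// Copy the lower-priority map first...
	for k, v := range second {
		out[k] = v
	}
	// ...then let the higher-priority map overwrite any shared keys.
	for k, v := range first {
		out[k] = v
	}
	return out
}

func main() {
	// Illustrative values only: a selector derived from a flavor and a
	// selector set by the user on the job.
	flavor := map[string]string{"instance-type": "spot"}
	job := map[string]string{"instance-type": "on-demand", "zone": "a"}

	merged := mergeKeepFirst(flavor, job)
	fmt.Println(merged["instance-type"], merged["zone"])
}
```

Under that assumption the flavor's value silently replaces the job's value for a shared key, which is the behavior this thread contrasts with failing on conflict.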
However, I think the previous implementation (8555fd3) looks a bit hacky.
...
newNodeSelector := make(map[string]string)
for k, v := range o.NodeSelector {
	if _, exists := podSetInfo.NodeSelectorOverwrite[k]; !exists {
		newNodeSelector[k] = v
	}
}
if err := utilmaps.HaveConflict(podSetInfo.NodeSelector, newNodeSelector); err != nil {
...
In this implementation, utilmaps.HaveConflict(podSetInfo.NodeSelector, newNodeSelector) would probably always return nil, right?
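The filter-then-check pattern in question can be sketched in isolation. Both helpers below are hypothetical stand-ins: `haveConflict` assumes that `utilmaps.HaveConflict` reports an error when a key is present in both maps with different values, and `filterOverwritten` mirrors the loop quoted above. Whether the check can ever fail depends on whether `podSetInfo.NodeSelector` carries keys that are absent from `NodeSelectorOverwrite`, which the quoted snippet does not show.

```go
package main

import "fmt"

// haveConflict is a hypothetical stand-in for utilmaps.HaveConflict.
// Assumption: a conflict is a key present in both maps with different values.
func haveConflict(a, b map[string]string) error {
	for k, av := range a {
		if bv, ok := b[k]; ok && bv != av {
			return fmt.Errorf("conflict for key %q: %q vs %q", k, av, bv)
		}
	}
	return nil
}

// filterOverwritten mirrors the quoted loop: it drops every key of selector
// that also appears in overwrite.
func filterOverwritten(selector, overwrite map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range selector {
		if _, exists := overwrite[k]; !exists {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Illustrative values only.
	job := map[string]string{"zone": "a", "arch": "arm64"}
	overwrite := map[string]string{"zone": "b"}
	filtered := filterOverwritten(job, overwrite)

	// Keys covered by the overwrite map can no longer conflict.
	fmt.Println(haveConflict(map[string]string{}, filtered) == nil)
	// A key outside the overwrite map can still conflict.
	fmt.Println(haveConflict(map[string]string{"arch": "amd64"}, filtered) == nil)
}
```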
This is nice work! Thank you!
/lgtm
/approve
LGTM label has been added. Git tree hash: 975543b57801ad9a50214e3490fafb8b5b8ef51a
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mimowo, tenzen-y
/release-note-edit
* propagate podsetupdates to jobs
* Remarks
* Move test test
* Remove focus
* consistent naming for `podSetInfos`
* consistent naming
* revert unnecessary rename
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Part of #1145
Special notes for your reviewer:
For all fields we want to fail on conflict according to the KEP: https://github.com/kubernetes-sigs/kueue/tree/main/keps/1145-additional-labels.
Adjusted the unit tests that exercised updates of nodeSelectors, because that scenario cannot happen end to end.
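As a rough illustration of the fail-on-conflict rule for all fields, the sketch below merges an update into a simplified pod set description and rejects the whole update if any field disagrees. The `podSetInfo` type, its three fields, and the `mergeStrict` helper are hypothetical simplifications for this example, not the types or helpers from this PR.

```go
package main

import "fmt"

// podSetInfo is a hypothetical, simplified stand-in for the PodSetInfo type
// discussed in this PR; the real type may carry more fields.
type podSetInfo struct {
	Labels       map[string]string
	Annotations  map[string]string
	NodeSelector map[string]string
}

// mergeStrict returns the union of dst and src, or an error if a key is
// present in both with different values. Neither input is modified.
func mergeStrict(field string, dst, src map[string]string) (map[string]string, error) {
	out := make(map[string]string, len(dst)+len(src))
	for k, v := range dst {
		out[k] = v
	}
	for k, v := range src {
		if old, ok := out[k]; ok && old != v {
			return nil, fmt.Errorf("%s: conflict for key %q", field, k)
		}
		out[k] = v
	}
	return out, nil
}

// merge applies update to base. Nothing is written to base unless every
// field merges cleanly, so a failed merge leaves base untouched.
func (base *podSetInfo) merge(update podSetInfo) error {
	labels, err := mergeStrict("labels", base.Labels, update.Labels)
	if err != nil {
		return err
	}
	annotations, err := mergeStrict("annotations", base.Annotations, update.Annotations)
	if err != nil {
		return err
	}
	nodeSelector, err := mergeStrict("nodeSelector", base.NodeSelector, update.NodeSelector)
	if err != nil {
		return err
	}
	base.Labels, base.Annotations, base.NodeSelector = labels, annotations, nodeSelector
	return nil
}

func main() {
	// Illustrative values only.
	base := podSetInfo{NodeSelector: map[string]string{"zone": "a"}}
	err := base.merge(podSetInfo{NodeSelector: map[string]string{"zone": "b"}})
	fmt.Println("conflicting update rejected:", err != nil)
}
```

Computing all merged maps before assigning any of them is a design choice of this sketch, so that a rejected update never leaves the pod set half-modified.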
Does this PR introduce a user-facing change?