Add "Frozen" ClusterQueue state #134

Closed
Tracked by #222
ahg-g opened this issue Mar 19, 2022 · 9 comments

@ahg-g
Contributor

ahg-g commented Mar 19, 2022

Introduce a new "OutOfCommission/Frozen" state for ClusterQueue. A ClusterQueue can enter this state in the following cases:

  • it is referencing a non-existent ResourceFlavor.
  • it is being deleted while there are still running workloads that were previously admitted via the ClusterQueue. This will require adding a finalizer to prevent deleting the object until those workloads finish.

ClusterQueues in this state should be taken out of the cohort, and no jobs can be scheduled via them until the referenced flavors are defined.
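
As a minimal sketch of how the first case could surface in the status, assuming a single "Active" condition whose reason says why the queue is frozen (the condition and reason names below are illustrative, not necessarily Kueue's actual API):

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// activeCondition computes an illustrative "Active" condition for a ClusterQueue,
// given the flavor names it references and the set of ResourceFlavors that exist.
// A missing flavor freezes the queue (Status=False) with a dedicated reason.
func activeCondition(referenced []string, existing map[string]bool) metav1.Condition {
	for _, name := range referenced {
		if !existing[name] {
			return metav1.Condition{
				Type:    "Active",
				Status:  metav1.ConditionFalse,
				Reason:  "FlavorNotFound",
				Message: fmt.Sprintf("ResourceFlavor %q does not exist", name),
			}
		}
	}
	return metav1.Condition{
		Type:    "Active",
		Status:  metav1.ConditionTrue,
		Reason:  "Ready",
		Message: "all referenced ResourceFlavors exist",
	}
}

func main() {
	var conds []metav1.Condition
	// A ClusterQueue referencing a flavor that was never created ends up frozen.
	meta.SetStatusCondition(&conds, activeCondition([]string{"on-demand"}, map[string]bool{}))
	fmt.Println(conds[0].Status, conds[0].Reason) // False FlavorNotFound
}
```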

@ahg-g ahg-g added kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Mar 19, 2022
@ahg-g ahg-g added kind/productionization priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Apr 13, 2022
@kerthcet
Contributor

/assign
I'd like to take a look.

@kerthcet
Contributor

kerthcet commented Apr 25, 2022

ToDoList:

@ahg-g
Contributor Author

ahg-g commented Jun 24, 2022

> Firstly, I just want to make sure: "all running workloads" includes both admitted workloads and inadmissible workloads, right? So you mean all workloads corresponding to the clusterQueue should finish running before we can successfully delete a clusterQueue.

We only need to wait for running workloads to finish. We can do this as follows: 1) change the CQ status to suspended; this will prevent any new workloads from being admitted; 2) wait until existing admitted workloads finish.
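
A rough sketch of that two-step flow, assuming a cleanup finalizer on the ClusterQueue and a simplified state value (the state and finalizer names here are assumptions for illustration, not necessarily what Kueue uses):

```go
package main

import "fmt"

// Illustrative ClusterQueue states; the real names may differ.
const (
	stateActive      = "Active"
	stateTerminating = "Terminating"
)

// reconcileDeletion sketches the flow described above: once the ClusterQueue has a
// deletion timestamp, move it to Terminating so nothing new is admitted, and only
// drop the cleanup finalizer once all previously admitted workloads have finished.
func reconcileDeletion(beingDeleted bool, admittedWorkloads int, finalizers []string) (string, []string) {
	if !beingDeleted {
		return stateActive, finalizers
	}
	if admittedWorkloads > 0 {
		// Step 1: stop admitting, but keep the finalizer so the object is not removed.
		return stateTerminating, finalizers
	}
	// Step 2: the queue has drained; removing the finalizer lets the API server delete it.
	return stateTerminating, without(finalizers, "example.com/cluster-queue-cleanup")
}

// without returns a copy of the slice with every occurrence of s removed.
func without(in []string, s string) []string {
	out := make([]string, 0, len(in))
	for _, v := range in {
		if v != s {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fins := []string{"example.com/cluster-queue-cleanup"}
	state, fins := reconcileDeletion(true, 2, fins)
	fmt.Println(state, fins) // Terminating [example.com/cluster-queue-cleanup]

	state, fins = reconcileDeletion(true, 0, fins)
	fmt.Println(state, fins) // Terminating []
}
```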

> To @alculquicondor's question "Why did we need to know if a workload was assumed?": I used the number of admitted workloads to decide whether a clusterQueue can be deleted; once they have all finished, the clusterQueue will be deleted. I have two reasons:
>   • inadmissible workloads may never have a chance to run, e.g. because of insufficient resources; if so, the clusterQueue will never be deleted until we delete the workload.
>   • considering that clusterQueues are managed by administrators, who may delete a clusterQueue for special reasons like reallocating cluster resources, I don't think they would like to wait for all workloads to finish running, especially since some workloads will never run to completion. 🤔

The admin will have to delete admitted running workloads manually; I don't think we can do that on their behalf.

> "When a clusterQueue is stuck in terminating, we should forbid other workloads from being admitted." I totally agree with this; I have implemented it in a follow-up PR. Do you think we should combine them together into one PR?

I think this should be the first PR, or you can merge them together.

> Unlike the idea of changing the clusterQueue's status back to Pending, I added a new status named suspended. WDYT?

I think we need one state; I would change pending to suspended and have different reasons based on why the CQ is in that state.

> A new question: when a clusterQueue is terminating, we will not forbid creating new corresponding workloads, like we do today, right? If so, if batch users continuously create workloads, the clusterQueue will never be deleted.

It will, based on the logic I described above: we don't care about workloads that are not yet admitted. Once the CQ is actually deleted, their status will be updated to indicate that the CQ doesn't exist, and users can make a decision about what to do with them.

@alculquicondor
Contributor

Note that the Assumed state is temporary. A workload is assumed when the scheduler decides that it should fit and then it's Forgotten if the API call fails. So it's completely fine to wait for these workloads to either be admitted and finished or to get removed from the clusterqueue.

In all scenarios, all that should matter is whether the cache's ClusterQueue is empty.
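
As a toy illustration of that check, assuming a simplified in-memory cache entry per ClusterQueue (not Kueue's actual cache types), the queue can be finalized only once the cache no longer tracks any workloads, whether they were admitted or merely assumed:

```go
package main

import "fmt"

// cachedClusterQueue is a simplified stand-in for a cache entry: it tracks the
// keys of workloads currently counted against the queue (admitted or assumed).
type cachedClusterQueue struct {
	workloads map[string]struct{}
}

// empty reports whether the cached queue no longer holds any workloads. Assumed
// workloads are transient: they either become admitted or are forgotten (removed),
// so an empty set is a safe signal that deletion can proceed.
func (cq *cachedClusterQueue) empty() bool {
	return len(cq.workloads) == 0
}

func main() {
	cq := &cachedClusterQueue{workloads: map[string]struct{}{"default/job-a": {}}}
	fmt.Println(cq.empty()) // false: a workload is still tracked against the queue

	delete(cq.workloads, "default/job-a") // workload finished or was forgotten
	fmt.Println(cq.empty())               // true: safe to remove the ClusterQueue
}
```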

@kerthcet
Contributor

kerthcet commented Jun 25, 2022

The admin is aware of the status of all the workloads; when he wants to delete the clusterQueue, he should be responsible for the results (we will only wait for the admitted workloads to complete).

Conversely, if he wants to delete the clusterQueue but it gets stuck in terminating because of several unadmitted workloads that are pending for special reasons, that doesn't make sense.

Of course he can delete the unadmitted workloads manually, but if we have hundreds of unadmitted workloads, that's a struggle.

@ahg-g
Contributor Author

ahg-g commented Jun 27, 2022

> The admin is aware of the status of all the workloads; when he wants to delete the clusterQueue, he should be responsible for the results (we will only wait for the admitted workloads to complete).
> Conversely, if he wants to delete the clusterQueue but it gets stuck in terminating because of several unadmitted workloads that are pending for special reasons, that doesn't make sense.

What are you proposing? That Kueue deletes running workloads? If so, I don't think we can do that; we should be conservative when handling user workloads. Deleting the workloads is super aggressive and may not be what the admin/user wants.

> Of course he can delete the unadmitted workloads manually, but if we have hundreds of unadmitted workloads, that's a struggle.

A tool can be created for that; we plan to have a kubectl plugin for kueue, and such a feature can be added there.

@kerthcet
Contributor

> We only need to wait for running workloads to finish. We can do this as follows: 1) change the CQ status to suspended; this will prevent any new workloads from being admitted; 2) wait until existing admitted workloads finish.

My opinion is the same as yours actually.

@ahg-g
Contributor Author

ahg-g commented Jul 13, 2022

With #284 merged, I think this one is done. We still need to add the finalizers in the webhook, though; those are now tracked in separate issues.

/close

@k8s-ci-robot
Contributor

@ahg-g: Closing this issue.

In response to this:

> With #284 merged, I think this one is done. We still need to add the finalizers in the webhook, though; those are now tracked in separate issues.
>
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
