Update manage.py #586

Merged
merged 8 commits into from
Sep 9, 2021
Changes from 2 commits
30 changes: 26 additions & 4 deletions qlib/workflow/task/manage.py
@@ -47,6 +47,14 @@ class TaskManager:
The tasks manager assumes that you will only update the tasks you fetched.
MongoDB's fetch-one-and-update operation makes the data update safe.

This class can be used as a tool from the command line. Here are several examples:

.. code-block:: shell

python -m qlib.workflow.task.manage -t <pool_name> wait
python -m qlib.workflow.task.manage -t <pool_name> task_stat


.. note::

Assumption: the data in MongoDB was encoded and the data out of MongoDB was decoded
@@ -80,7 +88,7 @@ def __init__(self, task_pool: str):
task_pool: str
the name of Collection in MongoDB
"""
-self.task_pool = getattr(get_mongodb(), task_pool)
+self.task_pool: pymongo.collection.Collection = getattr(get_mongodb(), task_pool)
self.logger = get_module_logger(self.__class__.__name__)

@staticmethod
@@ -101,6 +109,19 @@ def _encode_task(self, task):
return task

def _decode_task(self, task):
"""
_decode_task is a serialization tool.

Parameters
----------
task : dict
task information

Returns
-------
bson.objectid.ObjectId
Convert dict to bson
"""
for prefix in self.ENCODE_FIELDS_PREFIX:
for k in list(task.keys()):
if k.startswith(prefix):
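
Reviewer's aside: the loop in this hunk visits every key that starts with one of `ENCODE_FIELDS_PREFIX`. Below is a minimal sketch of such an encode/decode round trip. It assumes the prefixed fields are serialized with `pickle` and that the prefixes are `def` and `res`. The visible hunk confirms neither, and the free functions `encode_task` and `decode_task` are hypothetical stand-ins for the methods.

```python
import pickle

# Assumption: the hunk does not show the real prefix list; these values are illustrative.
ENCODE_FIELDS_PREFIX = ["def", "res"]


def encode_task(task: dict) -> dict:
    """Pickle every field whose key starts with an encoded prefix (in place)."""
    for prefix in ENCODE_FIELDS_PREFIX:
        for k in list(task.keys()):
            if k.startswith(prefix):
                task[k] = pickle.dumps(task[k])
    return task


def decode_task(task: dict) -> dict:
    """Inverse of encode_task: unpickle the prefixed fields (in place)."""
    for prefix in ENCODE_FIELDS_PREFIX:
        for k in list(task.keys()):
            if k.startswith(prefix):
                task[k] = pickle.loads(task[k])
    return task


task = {"def": {"model": "lgb"}, "status": "waiting"}
stored = encode_task(dict(task))      # what would be written to the database
restored = decode_task(dict(stored))  # what a caller would get back
```

Under these assumptions the decoded value is a plain `dict` again, which is worth checking against the `Returns` section of the new docstring.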
@@ -211,6 +232,7 @@ def create_task(self, task_def_l, dry_run=False, print_nt=False) -> List[str]:
r = self.task_pool.find_one({"filter": t})
except InvalidDocument:
r = self.task_pool.find_one({"filter": self._dict_to_str(t)})
# When r is None, no stored task matches this filter, so t is a new task
if r is None:
new_tasks.append(t)
if not dry_run:
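
Reviewer's aside: the duplicate check in this hunk can be tried without a database. The sketch below uses `FakePool`, a hypothetical in-memory stand-in for the collection (it is not pymongo). The stored document layout (`def`, `filter`, `status`) is an assumption made for illustration.

```python
from typing import List, Optional


class FakePool:
    """Hypothetical in-memory stand-in for a MongoDB collection (not pymongo)."""

    def __init__(self) -> None:
        self.docs: List[dict] = []

    def find_one(self, query: dict) -> Optional[dict]:
        # Return the first document whose fields equal every key in the query.
        for doc in self.docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None

    def insert_one(self, doc: dict) -> None:
        self.docs.append(doc)


def create_task(pool: FakePool, task_def_l: List[dict], dry_run: bool = False) -> List[dict]:
    """Return the definitions not yet in the pool; insert them unless dry_run."""
    new_tasks = []
    for t in task_def_l:
        r = pool.find_one({"filter": t})
        # r is None when no stored task has this definition, so t is new
        if r is None:
            new_tasks.append(t)
            if not dry_run:
                # Assumed document layout, for illustration only
                pool.insert_one({"def": t, "filter": t, "status": "waiting"})
    return new_tasks


pool = FakePool()
first = create_task(pool, [{"a": 1}, {"a": 2}])
second = create_task(pool, [{"a": 2}, {"a": 3}])
dry = create_task(pool, [{"a": 4}], dry_run=True)
```

Here `second` contains only the definition that was not already stored, and the dry run reports `{"a": 4}` as new without inserting it.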
@@ -461,11 +483,11 @@ def run_task(

After running this method, here are 4 situations (before_status -> after_status):

-STATUS_WAITING -> STATUS_DONE: use task["def"] as `task_func` param
+STATUS_WAITING -> STATUS_DONE: use task["def"] as `task_func` param; this means the task has not been started

STATUS_WAITING -> STATUS_PART_DONE: use task["def"] as `task_func` param

-STATUS_PART_DONE -> STATUS_PART_DONE: use task["res"] as `task_func` param
+STATUS_PART_DONE -> STATUS_PART_DONE: use task["res"] as `task_func` param; this means the task has been started but not completed

STATUS_PART_DONE -> STATUS_DONE: use task["res"] as `task_func` param

@@ -496,7 +518,7 @@ def (task_def, **kwargs) -> <res which will be committed>
if task is None:
break
get_module_logger("run_task").info(task["def"])
-# when fetching `WAITING` task, use task["def"] to train
+# when fetching `WAITING` task, use task["def"] to train.
if before_status == TaskManager.STATUS_WAITING:
param = task["def"]
# when fetching `PART_DONE` task, use task["res"] to train because the middle result has been saved to task["res"]
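
Reviewer's aside: the rule described in the docstring and comments above (a `WAITING` task trains from `task["def"]`, a `PART_DONE` task resumes from `task["res"]`) can be sketched as follows. The string values of the status constants and the helpers `select_param` and `run_once` are assumptions made for illustration. They are not the `run_task` implementation.

```python
# Assumption: the string values of the status constants are illustrative.
STATUS_WAITING = "waiting"
STATUS_PART_DONE = "part_done"
STATUS_DONE = "done"


def select_param(task: dict, before_status: str):
    """Pick the `task_func` input according to the status the task was fetched in."""
    if before_status == STATUS_WAITING:
        # Not started yet: train from the task definition.
        return task["def"]
    if before_status == STATUS_PART_DONE:
        # Started but unfinished: resume from the saved intermediate result.
        return task["res"]
    raise ValueError(f"unsupported before_status: {before_status!r}")


def run_once(task: dict, before_status: str, after_status: str, task_func) -> dict:
    """Run one step and record the result under the requested after_status."""
    res = task_func(select_param(task, before_status))
    task["res"] = res
    task["status"] = after_status
    return task


task = {"def": {"n": 1}, "status": STATUS_WAITING}
# WAITING -> PART_DONE: the function receives task["def"]
run_once(task, STATUS_WAITING, STATUS_PART_DONE, lambda d: {"n": d["n"], "stage": 1})
# PART_DONE -> DONE: the function receives the saved task["res"]
run_once(task, STATUS_PART_DONE, STATUS_DONE, lambda r: {**r, "stage": r["stage"] + 1})
```

The two calls walk one task through two of the four transitions listed in the docstring. The other two differ only in the `after_status` argument.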