Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize the implementation of uri & Fix async log bug #1364

Merged
merged 8 commits into from
Nov 18, 2022

Conversation

you-n-g
Copy link
Collaborator

@you-n-g you-n-g commented Nov 17, 2022

Description

Motivation and Context

How Has This Been Tested?

  • Pass the test by running: pytest qlib/tests/test_all_pipeline.py under upper directory of qlib.
  • If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

  1. Pipeline test:
  2. Your own tests:

Types of changes

  • Fix bugs
  • Add new feature
  • Update documentation

@you-n-g
Copy link
Collaborator Author

you-n-g commented Nov 17, 2022

@ChiahungTai You are welcome to review this PR.

logger = get_module_logger("workflow")


class ExpManager:
"""
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
(The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChiahungTai You can start from these docs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"we can have multiple Experiments with different uri" => IIUC, the words are imply the design of ExpManger is suggestion the user use different uri in each experiment?

From my understanding in MLFLow, the uri is like a store backend(sql, file...). The ExpManger is a single entry to lookup and manipulate the experiments in the uri.

So the user can change different uri to retrieve the different topic of experiements.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
ExpManager is a singleton, but we can change its uri to get experiments from different uri.
I added some comments just now.
Please check if it is helpful

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your new comment is more clear than before.

qlib/workflow/expm.py Outdated Show resolved Hide resolved
@ChiahungTai
Copy link
Collaborator

Fail the test in test_all_pipeline.py.
I have add the test case for you in #1363.

@ChiahungTai
Copy link
Collaborator

Looks like the exception is the same as the test case #1363.

        except MlflowException as e:
            raise ValueError(
                "No valid experiment has been found, please make sure the input experiment name is correct."
            ) from e

@@ -370,11 +369,11 @@ def uri_context(self, uri: Text):
the temporal uri
"""
prev_uri = self.exp_manager.default_uri
C.exp_manager["kwargs"]["uri"] = uri
self.exp_manager.default_uri = uri
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the set of default_uri is not engouh. The self.exp_manager is not change the client uri if you only change default uri.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fixed now.
The _client will not be cached. No further maintenance is required for it.

if self.active_experiment is not None:
self.active_experiment.end(recorder_status)
self.active_experiment = None
# When an experiment end, we will release the current uri.
self._current_uri = None
self._set_client_uri()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pump into an exception - No argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now.

@@ -322,18 +333,17 @@ def client(self):
self._client = mlflow.tracking.MlflowClient(tracking_uri=self.uri)
return self._client
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might need to sync the returned self._client to the updated default_uri.
Or you have set the uri in every function in access the client.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed Now.

@@ -329,7 +329,7 @@ def get_local_dir(self):
def start_run(self):
# set the tracking uri
mlflow.set_tracking_uri(self.uri)
# start the run
# start the RuntimeError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the start the RuntimeError mean?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed now

_ = mlflow.tracking.MlflowClient(tracking_uri=str(self.TMP_PATH))
end = time.time()
elasped = end - start
self.assertGreater(1e-2, elasped) # it can be done in less than 10ms
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertGreater or assertLesser

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the direction. And it is more readable than before

logger = get_module_logger("workflow")


class ExpManager:
"""
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
(The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
This is the `ExpManager` class for managing experiments. The API is designed similar to mlflow.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your new comment is more clear than before.

@you-n-g
Copy link
Collaborator Author

you-n-g commented Nov 18, 2022

@ChiahungTai
Hi, I have fixed all the comments and waiting for the CI.

I checked out your test

I didn't get what you want to assert in the test.

It seems that the exception looks reasonable.
When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given.
So the default exp name is used.
There is no experiment which is active or is named with the default exp name.
So an exception of No valid experiment has been found is raised

@you-n-g you-n-g mentioned this pull request Nov 18, 2022
5 tasks
@you-n-g you-n-g changed the title Optimize the implementation of uri Optimize the implementation of uri & Fix async bug Nov 18, 2022
@you-n-g you-n-g changed the title Optimize the implementation of uri & Fix async bug Optimize the implementation of uri & Fix async log bug Nov 18, 2022
@ChiahungTai
Copy link
Collaborator

The test code is pass after assign the experiment_name and recorder_name.
But in my own project, I provide the experiment_name and recorder_name.
But the code is too long to write a simple test case.
with R.uri_context(uri=uri_path): recorder = R.get_recorder(experiment_name=EXP_NAME, recorder_name="original")
I am working on another version of cached_pipeline in my own project. I might test the new commit and try it my local first.
Also I will upload a new test if I can rewrite a simple test case.

@ChiahungTai Hi, I have fixed all the comments and waiting for the CI.

I checked out your test

I didn't get what you want to assert in the test.

It seems that the exception looks reasonable. When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given. So the default exp name is used. There is no experiment which is active or is named with the default exp name. So an exception of No valid experiment has been found is raised

@ChiahungTai
Copy link
Collaborator

Your new patch works fine in my own project.

The test code is pass after assign the experiment_name and recorder_name. But in my own project, I provide the experiment_name and recorder_name. But the code is too long to write a simple test case. with R.uri_context(uri=uri_path): recorder = R.get_recorder(experiment_name=EXP_NAME, recorder_name="original") I am working on another version of cached_pipeline in my own project. I might test the new commit and try it my local first. Also I will upload a new test if I can rewrite a simple test case.

@ChiahungTai Hi, I have fixed all the comments and waiting for the CI.
I checked out your test
I didn't get what you want to assert in the test.
It seems that the exception looks reasonable. When we want to get a recorder, Qlib tries to get an experiment first. The experiment name or id is not given. So the default exp name is used. There is no experiment which is active or is named with the default exp name. So an exception of No valid experiment has been found is raised

@ChiahungTai
Copy link
Collaborator

The patch LGTM. You can merge it after fix the pylint error.

@you-n-g you-n-g merged commit 994f893 into microsoft:main Nov 18, 2022
@you-n-g you-n-g deleted the r_optm branch November 18, 2022 05:11
@you-n-g you-n-g added the enhancement New feature or request label Dec 9, 2022
qianyun210603 pushed a commit to qianyun210603/qlib that referenced this pull request Mar 23, 2023
* Optimize the implementation of uri

* remove redundant func

* Set the right order of _set_client_uri

* Update qlib/workflow/expm.py

* Simplify client & add test.Add docs; Fix async bug

* Fix comments & pylint

* Improve README
qianyun210603 pushed a commit to qianyun210603/qlib that referenced this pull request Mar 23, 2023
* Optimize the implementation of uri

* remove redundant func

* Set the right order of _set_client_uri

* Update qlib/workflow/expm.py

* Simplify client & add test.Add docs; Fix async bug

* Fix comments & pylint

* Improve README
qianyun210603 pushed a commit to qianyun210603/qlib that referenced this pull request Mar 23, 2023
* Optimize the implementation of uri

* remove redundant func

* Set the right order of _set_client_uri

* Update qlib/workflow/expm.py

* Simplify client & add test.Add docs; Fix async bug

* Fix comments & pylint

* Improve README
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants