Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MCA param for multithread opal_progress(). #5241

Merged
merged 1 commit into from
Nov 9, 2018

Conversation

thananon
Copy link
Member

@thananon thananon commented Jun 6, 2018

This is more of a discussion than a PR. After some work into multi-thread support, all of us found the bottleneck at opal_progress() since we serialize the threads here. (The request refactoring a few years back was to handle the serialization better for oversubscribed case.)

Now let's discuss here. @hjelmn , @bosilca and I agreed that we should allow threads in opal_progress to maximize the use of multiple resources and most of the components are now thread-safe. We should push this forward. However, this will be a problem with oversubscribed case. So I think the first good step should be an MCA param.

So I drafted a PR. This is just a simple way to do it and I'm open to discussion. This does not have to be in 4.0. release.

@bosilca
Copy link
Member

bosilca commented Jun 6, 2018

The oversubscription case will not necessarily be a problem if we handle the progress according to the core where it happen and prevent threads on the same core for going into progress. This might add one additional atomic operation on the critical path, but it will give us the opportunity to keep things clean in the BTL progress.

@thananon
Copy link
Member Author

thananon commented Jun 19, 2018

So we all agreed this should happen right? I would love some input from other people.

@aravindksg
Copy link
Contributor

The basic idea of the PR to allow multi-threaded progress would be beneficial. Playing with Scalable Endpoints in OFI MTL, I see significant performance improvement with this approach.
Minor comment on the MCA parameter itself: Instead of ‘unsafe_progress’, can probably rename to something that sounds less ominous

@hjelmn
Copy link
Member

hjelmn commented Jul 23, 2018

Should help with btl/uct as well and will eventually help with btl/ugni (once I enable multi-device support for two-sided).

@thananon
Copy link
Member Author

I see performance bump on btl/ofi as well. I can start working on this as soon as I get off my internship. Assuming there is no one opposing this.

I also want to know if it's possible to make multithreaded progress a default. Should I send out an email on devel list to ask if anyone's component is still not thread safe?

@thananon
Copy link
Member Author

I discussed with @bosilca on this.

  • BTLs are all thread-safe. (with sm going away)
  • Collectives are always serialized.
  • osc/pt2pt is going away.

We might want to make the multithreaded opal_progress() a default. At least we should bring this up in our weekly meeting or maybe a discussion/decision on the upcoming face to face meeting?

@aravindksg
Copy link
Contributor

@thananon - Were you planning on merging this patch for now and leave the discussion to make multi-threaded opal_progress a default for later? (Not sure if this was already discussed in the weekly meeting?)

@thananon
Copy link
Member Author

@aravindksg There is no plan to merge this now. We have plan to bring this up in the next face to face meeting in October.

Right now I'm running test on applications and the effect that of this change to different btls. I know that this will be off by default but we would like to have community member weight in on this.

@jsquyres
Copy link
Member

@thananon
Copy link
Member Author

Per face to face discussion, we agreed on this...

  1. Create MCA parameter to allow number of threads to perform progress. The default will be 1. (I will update this PR)

  2. I will send out a guideline for components who has progress function to appropriately update their component.

  3. Sometimes later, we will change the default number to something (like num_core per rank). This will be discussed later.

@thananon
Copy link
Member Author

thananon commented Nov 1, 2018

I fixed the patch. This one creates an MCA opal_max_thread_in_progress to set the number of threads allowed in opal_progress. The default is 1, so it will maintain the default behavior. The component with multithreaded design can set this MCA to num_core_per_rank to get optimal performance.

I will send out the component guidelines out soon.

Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make num_thread_in_progress static.

@@ -121,7 +122,7 @@ int opal_register_params(void)
opal_signal_string = string;
ret = mca_base_var_register ("opal", "opal", NULL, "signal",
"Comma-delimited list of integer signal numbers to Open MPI to attempt to intercept. Upon receipt of the intercepted signal, Open MPI will display a stack trace and abort. Open MPI will *not* replace signals if handlers are already installed by the time MPI_INIT is invoked. Optionally append \":complain\" to any signal number in the comma-delimited list to make Open MPI complain if it detects another signal handler (and therefore does not insert its own).",
MCA_BASE_VAR_TYPE_STRING, NULL, 0, MCA_BASE_VAR_FLAG_SETTABLE,
MCA_BASE_VAR_TYPE_STRING, NULL, 0, MCA_BASE_VAR_FLAG_SETTABLE,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure I fixed this.

This commit added MCA param `opal_max_thread_in_progress` to set the
number of threads allowed to do opal_progress concurrently. The default
value is 1.

Component with multithreaded design can benefit from this change to
parallelize their component progress function.

Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
@thananon thananon merged commit 3dc1629 into open-mpi:master Nov 9, 2018
@thananon thananon deleted the opal_progress branch November 9, 2018 17:30
aravindksg added a commit to aravindksg/ompi that referenced this pull request Dec 17, 2018
PR open-mpi#5241 provided an MCA variable to allow multi-threaded opal_progress.
However, it allowed to update the linked list even when multiple threads was
allowed to call opal_progress. This caused a scenario when a more recent thread
could complete it's progress and fail the assert(sync ==
wait_sync_list).

Allowing to update the linked list only for the case when the number of threads
exceeds the threshold fixes the problem. It also seems to improve performance quite a
bit.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
aravindksg added a commit to aravindksg/ompi that referenced this pull request Dec 17, 2018
PR open-mpi#5241 provided an MCA variable to allow multi-threaded opal_progress.
However, it allowed to update the linked list even when multiple threads was
allowed to call opal_progress. This caused a scenario when a more recent thread
could complete it's progress and fail the assert(sync ==
wait_sync_list).

Allowing to update the linked list only for the case when the number of threads
exceeds the threshold fixes the problem.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants