Intro nonblocking: no serialization & no deadlocks #122
Comments
Amended text in PDF for reading: |
Annotated version of the pdf file above: |
I have no opinion on this change but would like to dispute both justifications:
|
@RolfRabenseifner and I talked about this and came up with alternate language:
And then lead into the existing text: "Another advantage is that one can improve performance ..." |
In fact, this is not the primary reason for non-blocking communication. It is very difficult to ensure that complex patterns of blocking communication do not deadlock, particularly if the communication pattern is not known at compile time. Non-blocking communication solves this problem, even if it doesn’t provide any overlapping of operations (except in the broadest sense, which does not fit the use in this description).
I have sometimes distinguished between non-blocking communication (ensures that programs can make progress because communication does not block) and asynchronous communication (something that can take place when other operations are also taking place).
Non-blocking communication makes it possible to express communication in a way that enables overlap. But this is not the only or even the primary purpose.
Bill
William Gropp
Director and Chief Scientist, NCSA
Thomas M. Siebel Chair in Computer Science
University of Illinois Urbana-Champaign
On Mar 7, 2019, at 8:31 AM, Jeff Squyres wrote:
@RolfRabenseifner and I talked about this and came up with alternate language:
The purpose of nonblocking communication is to permit overlapping communication with any other activity. For example, overlapping multiple send and receive operations can allow greater efficiency and can help resolve serialization and deadlock issues.
And then lead into the existing text: "Another advantage is that one can improve performance ..."
|
Minutes from the reading of this issue on March 6, 2019:
Minutes from the re-reading on March 7, 2019:
|
@wgropp : I'm not sure whether your comment supports our proposal or not. |
@wgropp The proposed text is using "overlap" in a general sense, i.e. to overlap with any other activity (perhaps computation, e.g. to hide latency; perhaps other communication, e.g. to avoid deadlock or serialisation of message transfers). It may be more usual to use "overlap" in the narrow sense of "asynchronous communication to permit concurrent computation", which leads to confusion. All: should we use a different word? |
The first sentence in the proposal is incorrect. The purpose of nonblocking is not to enable the overlap of communication and computation. While it is true that having some form of nonblocking operation is required for overlap, that is an additional benefit, not the purpose. It would be better to start with the original reason, then note that this makes overlap possible, then talk about the benefits of having asynchronous or overlapped communication. |
I almost hate to say it, but nonblocking is a semantic concept independent of MPI progress. The proposal confuses the two. |
And in response to Dan's comment about the language covering this case, that's what I meant by "in the broadest sense". My concern is that almost everyone reading the proposed text will not understand that sense, and further will come to expect asynchronous execution, which is not part of the semantics of MPI non-blocking. |
Based on a new text proposal from Bill Gropp, I updated the proposed solution above in the description of this issue. |
The PR https://github.com/mpi-forum/mpi-standard/pull/99 is now updated. The latest PDF (with change-bars, see section 3.7 on page 49) for reading in Albuquerque in Dec 2019 is here: |
Dear Dan,
Please can you update your pull request and provide a new pdf before our 2 week deadline. |
I’m strongly against adding such information to the change log. The change log should only cover substantive changes to functions or additions, and it should never be a replacement for reading the standard. The change log already has a number of inappropriate entries (e.g., advice to implementors that belongs *only* in the text of the standard), and I don’t see any value in adding notes about specific clarifications to the text, even when they are as extensive as these.
Bill
On Jan 30, 2020, at 3:16 PM, Rolf Rabenseifner wrote:
The PR mpi-forum/mpi-standard#99 is now updated.
The latest PDF (with change-bars, see section 3.7 on page 49) for reading in Albuquerque in Dec 2019 is here:
mpi-report-ticket122-2019-NOV-25.pdf (https://github.com/mpi-forum/mpi-issues/files/3887748/mpi-report-ticket122-2019-NOV-25.pdf)
Dear Dan,
The change-log is still missing. You can find the text from Albuquerque in the description of this issue:
TBD: Change-log entry so that users, trainers, and book authors can detect the change to the intro and adopt it in their software and teaching material.
Proposal for the Change-log:
Section 3.7 on page 49.
The introduction of MPI nonblocking communication was corrected to better describe correctness and performance reasons for the use of nonblocking communication.
Please can you update your pull request and provide a new pdf before our 2 week deadline.
Then a no-no-vote for the change-log and a first vote can be scheduled for Portland.
|
I have no strong opinion one way or the other w.r.t. a change-log entry for this clarification. Also, @RolfRabenseifner this is your ticket, not mine. I will support it with my vote because the wording change is valuable, but I don't have time this close to the 2-week announcement deadline to work on it. Sorry. |
Dear Bill, if MPI users and MPI implementors were the only readers of the MPI standard, you would be right. But we also have to serve book writers and the providers of MPI training and teaching and of MPI training/teaching materials. Many teach MPI nonblocking communication only as a way to overlap communication with computation. The reason for this is that
I now advertise for more than 20 years the book "Using MPI" from Gropp, Lusk, Skjellum, "On most parallel computers, moving data from one process to another takes And if you google for
If such authors try to keep their teaching material up-to-date, then I expect And therefore, this changelog entry is so important. You can never expect that they start to read the whole MPI standard again This still means that we may have to live with wrong stories for the next 10 years What do you expect, when will we see a corrected new edition of "Using MPI"? Best regards |
Updates for Portland (in PR99 and pdf):
|
There was a straw vote in Portland about whether or not to have a changelog entry: Yes: 15 |
mpi-report-issue122-NB-intro-2020-02-20-annotated.pdf |
Result of the two no-no votes this morning, Feb. 20, 2020, in Portland:
|
Final version after all reading (from Albuquerque, Dec. 2019) plus no-no-vote changes until Portland (Feb. 2020): |
This proposal did not meet ballot quorum for one no-no vote (adding a changelog entry) at the February 2020 meeting: Yes - 15
This proposal passed another no-no vote (updating the normative text) at the February 2020 meeting: Yes - 23 |
This proposal passed a first vote at the February 2020 meeting: Yes - 24 |
This passed a second vote on 2020-06-30. |
Problem
Overlapping communication with other communication (that is, sends with receives) is a significant use case of nonblocking communication (e.g. the first example code in the book "Using MPI"), mainly to prevent deadlocks and the serialization of communication. Nevertheless, the MPI standard mentions only the overlap of computation and communication in the intro of Section 3.7 Nonblocking Communication. As a result, many books, tutorials, and courses never discuss how nonblocking communication prevents serialization. Consequently, a lot of application code is inefficient, with serialization wasting a significant amount of compute power and electric energy (and producing CO2).
Proposal
Start Section 3.7 Nonblocking Communication, MPI-3.1, page 47 lines 7-11 with a new paragraph and remove outdated text.
I asked Bill Gropp at the Zurich meeting for a new intro. See his text proposal below.
The text after the struck-out text is unchanged text from MPI-3.1.
Changes to the Text
The first paragraph of Section 3.7 should be substituted by:
Section 3.7 Nonblocking Communication
Nonblocking communication is important both for reasons of correctness and performance.
A nonblocking send start call initiates the send operation, but does not complete it. The send start call can return before the message was copied out of the send buffer. A separate send complete call is needed to complete the communication, i.e., to verify that the data has been copied out of the send buffer. With suitable hardware, the transfer of data out of the sender memory may proceed concurrently with computations done at the sender after the send was initiated and before it completed. Similarly, a nonblocking receive start call initiates the receive operation, but does not complete it. The call can return before a message is stored into the receive buffer. A separate receive complete call is needed to complete the receive operation and verify that the data has been received into the receive buffer. With suitable hardware, the transfer of data into the receiver memory may proceed concurrently with computations done after the receive was initiated and before it completed. The use of nonblocking receives may also avoid system buffering and memory-to-memory copying, as information is provided early on the location of the receive buffer.
For complex communication patterns, the use of only blocking communication (without buffering) is difficult because the programmer must ensure that each send is matched with a receive in an order that avoids deadlock. For communication patterns that are determined only at run time, this is even more difficult. Using nonblocking communication avoids this problem, allowing programmers to express complex and possibly dynamic communication patterns without needing to ensure that all sends and receives are issued in an order that prevents deadlock (see Section 3.5 and the discussion of “safe” programs). Nonblocking communication also allows for the overlap of communication with different communication operations, e.g., to prevent the serialization of such operations, and for the overlap of communication with computation. Whether an implementation is able to accomplish an effective (from a performance standpoint) overlap of operations depends on the implementation itself and the system on which the implementation is running. Using nonblocking operations permits an implementation to overlap communication with computation, but does not require it to do so.
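The ordering problem in the paragraph above can be illustrated with a small model. The following Python sketch is not MPI code: it is a toy model of unbuffered (rendezvous) communication, and the helper names `blocking_completes`, `nonblocking_completes`, `ring`, and `ring_even_odd` are hypothetical. It assumes that a blocking operation finishes only when the partner rank is currently waiting in the matching operation, whereas nonblocking operations are all started before any of them is completed.

```python
from collections import Counter

def blocking_completes(programs):
    """Toy rendezvous model: a blocking send finishes only when the peer's
    *current* operation is the matching receive. True if all ranks finish."""
    pc = [0] * len(programs)  # index of the operation each rank is blocked in
    progressed = True
    while progressed:
        progressed = False
        for rank, prog in enumerate(programs):
            if pc[rank] == len(prog):
                continue  # this rank has finished its program
            kind, peer = prog[pc[rank]]
            if kind != "send" or pc[peer] == len(programs[peer]):
                continue
            if programs[peer][pc[peer]] == ("recv", rank):
                pc[rank] += 1  # send and matching receive complete together
                pc[peer] += 1
                progressed = True
    return all(pc[r] == len(p) for r, p in enumerate(programs))

def nonblocking_completes(programs):
    """All operations are started first and completed together afterwards, so
    only the existence of a matching receive per send matters, not the order."""
    sends, recvs = Counter(), Counter()
    for rank, prog in enumerate(programs):
        for kind, peer in prog:
            if kind == "send":
                sends[(rank, peer)] += 1
            else:
                recvs[(peer, rank)] += 1
    return sends == recvs

def ring(n):
    """Every rank sends to its right neighbour, then receives from its left."""
    return [[("send", (r + 1) % n), ("recv", (r - 1) % n)] for r in range(n)]

def ring_even_odd(n):
    """Hand-ordered variant: odd ranks receive first to break the cycle."""
    progs = ring(n)
    return [p if r % 2 == 0 else p[::-1] for r, p in enumerate(progs)]

print(blocking_completes(ring(4)))           # False: every rank waits in its send
print(blocking_completes(ring_even_odd(4)))  # True: ordering chosen by hand
print(nonblocking_completes(ring(4)))        # True: no ordering needed
```

In this model the naive ring deadlocks with blocking operations unless the programmer reorders the calls by hand, while the nonblocking variant completes regardless of the order in which the operations were issued. Nothing here says anything about performance or asynchronous progress; it only models the correctness argument.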
One can improve performance on many systems by overlapping communication and computation. This is especially true on systems where communication can be executed autonomously by an intelligent communication controller. Light-weight threads are one mechanism for achieving such overlap. An alternative mechanism that often leads to better performance is to use nonblocking communication.
The latest PDF (with change-bars "ticket122", see section 3.7 on page 49) for reading in Albuquerque in Dec 2019 is here: mpi-report-ticket122-2019-NOV-25.pdf
Final version after all reading (from Albuquerque, Dec. 2019) plus no-no-vote changes until Portland (Feb. 2020):
mpi-report-issue122-NB-intro-2020-02-20b-annotated.pdf
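The split between a start call and a complete call in the proposed text can also be sketched in a few lines. Again this is a toy Python model rather than MPI, and `Request`, `isend`, and `mailbox` are hypothetical names. The model deliberately defers all work to `wait()`, which is one extreme the semantics permit, to stress that only completion gives a guarantee about the send buffer: overlap is allowed but not promised.

```python
class Request:
    """Handle returned by a start call; the operation is complete only after wait()."""

    def __init__(self, action):
        self._action = action
        self.done = False

    def wait(self):
        # Complete call: after it returns, the send buffer may be reused.
        if not self.done:
            self._action()
            self.done = True

def isend(buffer, mailbox):
    """Start call: returns immediately. In this toy model no data is copied
    until wait(); a real implementation may copy earlier, but need not."""
    return Request(lambda: mailbox.append(bytes(buffer)))

mailbox = []                          # stands in for the receiver side
buf = bytearray(b"hello")
req = isend(buf, mailbox)             # send started, not completed
delivered_before_wait = len(mailbox)  # 0 in this model: nothing is guaranteed yet
req.wait()                            # send completed: data copied out of buf
buf[:] = b"XXXXX"                     # reusing the buffer is safe only now
print(delivered_before_wait, mailbox) # 0 [b'hello']
```

Modifying `buf` between `isend` and `wait()` would change what is delivered in this model, which is why the send buffer must be left alone until the complete call returns.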
Impact on Implementations
None.
Impact on Users
Directly none. In the long term, they may better understand MPI nonblocking communication and its use cases.
References
Pull request
The PR is https://github.com/mpi-forum/mpi-standard/pull/99