Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event Notification Extension #18

Merged
merged 1 commit into from
Apr 6, 2017
Merged

Event Notification Extension #18

merged 1 commit into from
Apr 6, 2017

Conversation

rhc54
Copy link
Member

@rhc54 rhc54 commented Mar 24, 2017

Signed-off-by: Ralph Castain rhc@open-mpi.org

@rhc54 rhc54 added this to the v2.0 milestone Mar 24, 2017
@rhc54
Copy link
Member Author

rhc54 commented Mar 27, 2017

@dsolt @jjhursey @abouteiller Please comment - I'd like some feedback before doing the prototype implementation

@jjhursey
Copy link
Member

I think this sounds ok.

One of the points made was the need for events to be sent between threads within a process. That seems difficult to manage since we would have to track which threadid registered the request, and enforce that only that threadid receive the notification. And multiple threads could be registered for the same event so that would need to be tracked.

That seems like a lot of bookkeeping to me, but maybe I'm thinking about it wrong.

@rhc54
Copy link
Member Author

rhc54 commented Mar 30, 2017

I agree about tracking at a thread level - probably outside where we want to go, and I should clarify the comment. My intent was only to support multiple registrations against the same code or group of codes. It would be up to the user to ensure that each thread registered a different callback function.

This was the mechanism, when combined with a new data range of "proc_local", I had envisioned for someone to "notify" a separate thread of an event such as declaration of a programming model.

RFC0018.md Outdated
None of these require modification of existing PMIx APIs, nor addition of new ones. Instead, all can be supported by adding attributes to direct the behavior of the existing event registration/notification functions.

#### Event registration extensions
The primary need here is for attributes indicating desired ordering of the event handler being registered vs other handlers that have already been registered or will subsequently be registered. In both cases, it is necessary that the user provide something to identify each handler so the relative position can later be specified - this is to be done via the existing _PMIX\_EVENT\_HDLR\_NAME_ attribute. In addition, the existing _PMIX\_EVENT\_ORDER\_PREPEND_ attribute directs PMIx to prepend the handler being registered to the front of the chain, as opposed to the default append behavior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the default behavior be to prepend? In a typical use case, the application will initialize libraries in a "more generic to more specific" order, meaning that the last registered handler is the one that is the most "specialized". So it most often should have precedence at trying to "fix the problem", before passing the hand to more generic handlers.

I understand that this additional PR greatly alleviates this concern, but it still seems more logical in that direction to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reflecting the current default behavior, but I am open to modifying that if desired. Let's discuss at the meeting and see what people think.

RFC0018.md Outdated

* PMIX\_EVENT\_HDLR\_BEFORE - put this event handler immediately before the one specified in the (char*) value. The named event handler must be in the same category (single, multi, or default) as the one being registered. An error will be returned if the named event handler is not found.

* PMIX\_EVENT\_HDLR\_AFTER - put this event handler immediately after the one specified in the (char*) value. The named event handler must be in the same category (single, multi, or default) as the one being registered. An error will be returned if the named event handler is not found.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not remember if registering twice an handler is already illegal or not. If it is not, it certainly should become now, as otherwise the before/after relation is ill defined.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The before/after relation is only defined for the specific registration request that contains it. So if you register another handler for the same code, then it must include its own relational directives. I will clarify, however, that you cannot register more than one handler with the same string identifier.

RFC0018.md Outdated
#### Event notification extensions
Two changes are proposed in this area, although only one actually impacts the standards header file by adding the following attribute:

* PMIX\_EVENT\_NO\_TERMINATION - directs that all subsequent steps in the event handler chain must not call for termination of the application. Any other operations are permitted.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logic is reverse here:
lets consider a chain A B C D(default)

We call A(status, attr={})
if A can handle the issue reported by status it changes the attribute to NO_TERMINATION.

We then call B(status, attr={NO_TERMINATION})
B cannot handle the condition (it indicates a fatal error of some sort that cannot be corrected), with these definitions, B cannot reset to terminate.

I propose the opposite logic:

  1. A(status, attr={DEFAULT_TERMINATION})
    A is happy, so it sets NO_TERMINATION
  2. B(status, attr={NO_TERMINATION})
    B is unhappy, so it sets TERMINATION
  3. C(status, attr={WANT_TERMINATION})
    C is happy or unhappy, doesn't matter, WANT_TERMINATION cannot be overridden
  4. D is default, sees WANT_TERMINATION, terminates.

Now, if all handlers are happy, we'd reach
D(status, NO_TERMINATION)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically what you are proposing is that we allow each handler to modify the collected array of results, as opposed to only adding to them? We would definitely have to define which attributes can be overwritten, but implementation would be simple and not require an API change, if that's what people want to do. Again, worth bringing up in the meeting.

@rhc54
Copy link
Member Author

rhc54 commented Mar 30, 2017

@abouteiller @jjhursey I have updated the RFC to reflect your comments - please review

Copy link
Member

@jjhursey jjhursey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I still think we need to be careful about wording around delivering an event to a specific thread. That requires tracking of thread ids and making sure we deliver the event in the context of the same thread. I'd like to make that the caller's problem to solve, not PMIx. I think the wording in here is appropriately flexible.

rhc54 pushed a commit to rhc54/openpmix that referenced this pull request Apr 2, 2017
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
rhc54 pushed a commit to rhc54/openpmix that referenced this pull request Apr 3, 2017
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Approved

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
@rhc54 rhc54 merged commit 89fbe44 into pmix:master Apr 6, 2017
@rhc54 rhc54 deleted the rfc/eventext branch April 6, 2017 20:14
rhc54 pushed a commit to rhc54/openpmix that referenced this pull request Apr 6, 2017
Return an error if someone asks to register an event handler
before/after another handler that is not yet known.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants