Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add accelerator and part frameworks to mca lists #10949

Closed
wants to merge 1 commit into from

Conversation

hppritcha
Copy link
Member

This patch adds part framework to the list of ompi frameworks opened when the core of open mpi initialized and
closed when the ref cnt on the ompi core drops back to zero.

Likewise, the accelerator framework is added to the opal frameworks list.

Add some verbiage that will hopefully show where to add new frameworks in the opal/ompi initialization procedures.

Related to #10938

Signed-off-by: Howard Pritchard howardp@lanl.gov

Copy link
Member

@bwbarrett bwbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The request changes is about the comment around the accelerator.

We don't need to address it in this PR, but I'm not sure comments are really helpful. It seems like what we really want to do is make explicit calls to mca_base_framework_open() in ompi/instance/instance.c and opal_init.c cause a compile-time error so that we never get into this situation in the first place. When I was talking about avoiding the problem in the first place, that's really the type of change I was talking about.

* The datatype convertor code has a dependency on the accelerator framework
* being initialized. */
ret = mca_base_framework_open(&opal_accelerator_base_framework, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how this change fixes anything. mca_base_framework_open_list() is literally just a for loop around the framework list calling mca_base_framework_open() on each item in the list. It has the same flag argument, so there's no change there. I don't think this was the important change.

I thought it might be a different behavior in the registered cleanup argument, but reading the code for mca_base_framework_close_list, it is just calling mca_base_framework_close(). I think the real problem is not any of the list usage stuff, but that the registered cleanup function was/is calling the finalize function directly rather than calling mca_base_framework_close, which was a pretty silly mistake on our part in the review. It kind of makes me wonder where @wckzhang found that cleanup example and if we don't have other issues lurking elsewhere.

Either way, this patch is wrong. Either the opal_finalize_register_cleanup on line 203 needs to be removed or the whole patch could become changing that to opal_finalize_register_cleanup_arg(mca_base_framework_close, opal_accelerator_base_framework). The first obviously seems simpler, but one of the two needs to happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before the patch here's the dlopening/closing activity related to accelerator framework:


calling MPI_INit
dlopnening /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_rocm
dlopnening /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_null

calling MPI_Finalize
dlclosing /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_rocm
dlclosing /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_null

calling MPI_Session_init
hit segfault in the accelerator framework selection function as there are pointers to memory which has been munmapped.

behavior with the patch

MPI_Init
dlopnening /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_rocm
dlopnening /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_null
MPI_Finalize
dlclosing /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_rocm
dlclosing /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_null
MPI_Session_init
dlopnening /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_rocm
dlopnening /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_null
MPI_Session_finalize
dlclosing /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_rocm
dlclosing /usr/projects/artab/users/hpp/ompi3/test_install/lib/openmpi/mca_accelerator_null

a framework has to be closed in opal_finalize when using enable-mca-dso otherwise a reopen of the framework is a no op owing to a check at line 173 of mca_base_framework.c but the component *.so's were unloaded as part of the MPI_Finalize. Without the patch the accelerator framework was not being closed.

Copy link
Member

@bwbarrett bwbarrett Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, like I said, the issue is not the use / non-use of the list. The issue is that the code was registering a call to the finalize call directly, rather than component close. If line 203 was opal_finalize_register_cleanup_arg(mca_base_framework_close, opal_accelerator_base_framework), the existing code would have worked just as well.

I have no problem with the list change, but in opal, that was not the root cause of the problem. Deregisters are called in reverse order, so I think what's happening is that accelerator_base_selected_component.accelerator_finalize() is called and then shortly later mca_base_framework_close(opal_accelerator_base_framework) is called, which calls accelerator_base_selected_component.accelerator_finalize() again and then closes the framework correctly. But we document that finalize will be called once per open, so we're getting lucky that the component doesn't do anything bad in that second call to finalize.

Hence, for the patch to be correct, line 203 (opal_finalize_register_cleanup(accelerator_base_selected_component.accelerator_finalize);) needs to be removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep I agree about removing the accelerator finalize call.
@wckzhang do you recall what motivated having the accelerator component finalize handled in a separate function rather than in the close of the component? Is there some cuda behavior that motivated this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought that was how we do cleanup

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hppritcha I don't understand that. mca_base_framework_close() is still the right answer; calling <framework>.finalize() directly leaves the base framework in a not-clean state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually the accelerator framework was mostly doing the right thing. As described in section IIIB of 10.1109/CLUSTER.2019.8891002 the complex interlinking of framework dependencies that existed prior to sessions was replaced with a cleanup callback framework.

@hjelmn can provide more details if needed.

What is the not-clean state that the accelerator framework is being left in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code calls:

opal_finalize_register_cleanup(opal_accelerator_base_selected_component.accelerator_finalize);

instead of the correct:

opal_finalize_register_cleanup_arg(mca_base_framework_close, opal_accelerator_base_framework);

So while the component gets cleaned up, the framework itself does not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would opal_finalize_register_cleanup_arg(mca_base_framework_close, opal_accelerator_base_framework) be needed? If it is in the list of frameworks (which it should be if you want it always opened) then it will get called automatically when the last call to finalize is made.

I plan to take a closer look at this PR and the existing code tomorrow. Will probably have more comments then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hjelmn we've gone around so long that it doesn't matter. But the current PR is wrong and needs to be fixed. Today, on line 212, it calls opal_finalize_register_cleanup(opal_accelerator_base_selected_component.accelerator_finalize);. This line just needs to be removed.

I was trying to make the point to Howard that adding the accelerator framework to the list of frameworks wasn't strictly necessary to fixing the issue, because I was pushing on how we missed this issue in the original CR. That just overcomplicated things. If the patch is extended to remove line 212, I'm happy to ship this patch. Until then, the patch is wrong, because we're calling the component's accelerator_finalize() function twice. Once from the base cleanup function that is invoked by adding the framework to the opal framework list and once explicitly on line 212.

@hppritcha
Copy link
Member Author

That's a good idea about compile time error mca_base_framework_open in the opal init file. I'll something to the PR to catch this.

@hppritcha hppritcha marked this pull request as draft October 24, 2022 20:51
@hppritcha
Copy link
Member Author

marking as draft while i test on various systems with accelerator flavor of the day

@github-actions
Copy link

Hello! The Git Commit Checker CI bot found a few problems with this PR:

7d8b1da: add accelerator and part frameworks to mca lists

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

This patch adds part framework to the list of ompi frameworks opened when the core of open mpi initialized and
closed when the ref cnt on the ompi core drops back to zero.

Likewise, the accelerator framework is added to the opal frameworks list.

Note the accelerator framework cleanup function does need to be added to opal's cleanup list.

Add some verbiage that will hopefully show where to add new frameworks in the opal/ompi initialization procedures.

Add a macro in opal_init.c that generates complilation errors if a call to mca_base_framework_open is made with
within opal_init.

Also add a cleanup function for the smcuda btl to avoid a warning message at mpi finalize.

Related to open-mpi#10938

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
@hppritcha
Copy link
Member Author

closing this PR as no longer needed.

@hppritcha hppritcha closed this Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants