Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retain info references on public info dup #12847

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

devreal
Copy link
Contributor

@devreal devreal commented Oct 5, 2024

We need to keep the references when duplicating info through MPI_Info_dup so that subsequent calls to MPI_Info_dup see the same info entries.

Test case below was provided by Neil Fortner. Is there a way to add this test to the test harness?

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define ERROR(msg) \
do { \
    printf(msg "\n"); \
    exit(1); \
} while(0)

int main(void) {
    MPI_Info info1 = MPI_INFO_NULL, info2 = MPI_INFO_NULL, info3 = MPI_INFO_NULL;
    int nkeys1, nkeys2, nkeys3;

    if (MPI_Info_create(&info1) != MPI_SUCCESS)
        ERROR("MPI_Info_create failed");

    if (MPI_Info_set(info1, "custom_key", "custom_value") != MPI_SUCCESS)
        ERROR("MPI_Info_set failed");

    if (MPI_Info_get_nkeys(info1, &nkeys1) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info1, &nkeys1) failed");

    if (MPI_Info_dup(info1, &info2) != MPI_SUCCESS)
        ERROR("MPI_Info_dup failed");

    if (MPI_Info_free(&info1) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info1) failed");

    if (MPI_Info_get_nkeys(info2, &nkeys2) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info2, &nkeys2) failed");

    if (nkeys1 != nkeys2)
        ERROR("Number of keys on duplicated MPI info object does not match that on original");

    if (MPI_Info_dup(info2, &info3) != MPI_SUCCESS)
        ERROR("MPI_Info_dup failed");

    if (MPI_Info_free(&info2) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info2) failed");

    if (MPI_Info_get_nkeys(info3, &nkeys3) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info3, &nkeys3) failed");

    if (nkeys1 != nkeys3)
        ERROR("Number of keys on double duplicated MPI info object does not match that on original");

    if (MPI_Info_free(&info3) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info3) failed");

    printf("test passed\n");

    return 0;
}

We need to keep the references when duplicating info through
MPI_Info_dup so that subsequent calls to MPI_Info_dup
see the same info entries.

Test case provided by Neil Fortner:

```C
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define ERROR(msg) \
do { \
    printf(msg "\n"); \
    exit(1); \
} while(0)

int main(void) {
    MPI_Info info1 = MPI_INFO_NULL, info2 = MPI_INFO_NULL, info3 = MPI_INFO_NULL;
    int nkeys1, nkeys2, nkeys3;

    if (MPI_Info_create(&info1) != MPI_SUCCESS)
        ERROR("MPI_Info_create failed");

    if (MPI_Info_set(info1, "custom_key", "custom_value") != MPI_SUCCESS)
        ERROR("MPI_Info_set failed");

    if (MPI_Info_get_nkeys(info1, &nkeys1) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info1, &nkeys1) failed");

    if (MPI_Info_dup(info1, &info2) != MPI_SUCCESS)
        ERROR("MPI_Info_dup failed");

    if (MPI_Info_free(&info1) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info1) failed");

    if (MPI_Info_get_nkeys(info2, &nkeys2) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info2, &nkeys2) failed");

    if (nkeys1 != nkeys2)
        ERROR("Number of keys on duplicated MPI info object does not match that on original");

    if (MPI_Info_dup(info2, &info3) != MPI_SUCCESS)
        ERROR("MPI_Info_dup failed");

    if (MPI_Info_free(&info2) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info2) failed");

    if (MPI_Info_get_nkeys(info3, &nkeys3) != MPI_SUCCESS)
        ERROR("MPI_Info_get_nkeys(info3, &nkeys3) failed");

    if (nkeys1 != nkeys3)
        ERROR("Number of keys on double duplicated MPI info object does not match that on original");

    if (MPI_Info_free(&info3) != MPI_SUCCESS)
        ERROR("MPI_Info_free(&info3) failed");

    printf("test passed\n");

    return 0;
}
```

Signed-off-by: Joseph Schuchart <joseph.schuchart@stonybrook.edu>
@edgargabriel
Copy link
Member

@devreal I plan to add running the HDF5 testsuite to the ompi CI through a github action, that should cover this testcase as well.

Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the name change for the public_only argument ? I find the old naming more clear.

You don't need to check for public, at that point in the code it doesn't matter anymore. Moreover, while trying to understand the code I don't even see that's the need for ie_referenced, as all set entries have it set. Are you seeing a case where this field is not set ?

@edgargabriel
Copy link
Member

edgargabriel commented Oct 7, 2024

I can confirm btw. that this PR fixes the issue #12742 , i.e. the HDF5 testphdf5 test is passing again

Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the code and I see now why you need to copy ie_referenced. It is only set for set/get and not by dup, so if you try to dup an info twice it will not be duplicated the second time.

@@ -69,19 +69,24 @@ OBJ_CLASS_INSTANCE(opal_info_entry_t, opal_list_item_t, info_entry_constructor,
* key-value pairs that are not internal and that had been referenced,
* either through opal_info_get or opal_info_set.
*/
static int opal_info_dup_impl(opal_info_t *info, opal_info_t **newinfo, bool public_only)
static int opal_info_dup_impl(opal_info_t *info, opal_info_t **newinfo, bool public)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static int opal_info_dup_impl(opal_info_t *info, opal_info_t **newinfo, bool public)
static int opal_info_dup_impl(opal_info_t *info, opal_info_t **newinfo, bool public_only)

{
opal_info_entry_t *iterator;

OPAL_THREAD_LOCK(info->i_lock);
OPAL_LIST_FOREACH (iterator, &info->super, opal_info_entry_t) {
/* skip keys that are internal if we didn't ask for them */
if (public_only && (iterator->ie_internal || iterator->ie_referenced == 0)) continue;
if (public && (iterator->ie_internal || iterator->ie_referenced == 0)) continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (public && (iterator->ie_internal || iterator->ie_referenced == 0)) continue;
if (public_only && (iterator->ie_internal || iterator->ie_referenced == 0)) continue;

/* create a new info entry and retain the string objects */
opal_info_entry_t *newentry = OBJ_NEW(opal_info_entry_t);
newentry->ie_key = iterator->ie_key;
OBJ_RETAIN(iterator->ie_key);
newentry->ie_value = iterator->ie_value;
if (public) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (public) {
if (public_only) {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants