Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mpich additions for Romio341 refresh #5

Conversation

markalle
Copy link

This is a batch of romio fixes from OMPI that we've contributed back to mpich main. It's from the time period after we stopped fixing things in our copy of romio and instead started fixing them over in actual mpich romio.

At the time of this PR the below fixes from us have all been accepted into mpich main, but haven't yet appeared in numbered releases like romio 3.4.1.

This is meant to be an add-on to the romio 3.4.1 refresh by @ggouaillardet
open-mpi#8740

 *  https://github.com/pmodels/mpich/pull/4943
    -- darray fix which contains a flatten fix
        73a3eba
        c4b5762
 *  https://github.com/pmodels/mpich/pull/4995
    -- write strided fix
        bbb5210
 *  https://github.com/pmodels/mpich/pull/5100
    -- build fix for -Wpedantic
        ad0e435
 *  https://github.com/pmodels/mpich/pull/5099
    -- build fix, they had let file-system=...gpfs bit rot
        e1d42af
        313289a
        83bbb82
 *  https://github.com/pmodels/mpich/pull/5150
    -- build fix, configure-related _GNU_SOURCE
        a712b56
        5a036e7
 *  https://github.com/pmodels/mpich/pull/5184
    -- build fix, continuation of _GNU_SOURCE fix
        d97c4ee

markalle and others added 10 commits April 13, 2021 17:00
…ubtype creation

> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

When Type_create_darray() uses the block()/cyclic() function to
construct a subtype for whatever dimension it's processing, it
needs to resize the resulting type before it goes into the type
construction as the input for processing the next dimension.

The same problem is in the main path's type creation, and in romio's
mirroring of it.

Gist for a testcase:
https://gist.github.com/markalle/940de93d64fd779e304ee124855b8e6a

The darray_bug_romio.c testcase creates a darray using
* 4 ranks in a 2x2 grid
* looking at the type made for rank 0
* inner dimension: 4 ints distributed block over 2 ranks with 2 items each
* outer dimension: 6 of the above distributed cyclic over 2 ranks with 2 items each

The type created by processing the first dimension should look like this
[ x x . . ]

And then distributing those for the second dimension becomes

[ x x x x ]
[ . . . . ]
[ . . . . ]
[ . . . . ]
[ x x x x ]
[ . . . . ]

Going to the MPI standard to justify why the first layout is right,
it's where the definiton of the cyclic() function has a ub_marker
of gsize*ex, eg 4 ints for that first type.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

ADIOI_Count_contiguous_blocks and ADIOI_Flatten are very similar functions,
but they diverge a lot around the edges and in degenerate cases.  In Spectrum
MPI I spent some time making them consistent with each other but
found that to be a losing battle.

So the approach I used here is to not have Count() be as definitive,
and rather let Flatten() have the last word on what the final flat->count
really is.  Eg Flatten's *curr_index is the real count.

The changes I made are

1. Fix a couple places in Flatten where *curr_index was updated out
of sync with what was actually being added to flat->indices[] and
flat->blocklens[].  That's one of the important book-keeping rules
Flatten should follow.  There were a couple places (when counts[] arrays
have 0s) where that wasn't the case (see the "nonzeroth" var in this
commit).

2. The main change I made was to reset flat->count based on
Flatten's curr_index.  This is because the divergence between the
two functions is too great to reliably fix.

3. A third change is just a safety valve, using a
flatlist_node_grow() function just in case the
Count function returns a smaller value than Flatten ends up
trying to write into the array, and related to this I
got rid of the various --(flat->count) lines, since that var now
represents the allocated size of the array, until right after
the Flatten function when it's reset to represent the data written
to the array like it used to be.

I don't think we even need ADIOI_Count_contiguous_blocks() anymore,
but I didn't remove it.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

The ADIOI_GEN_WriteStrided funcion uses data sieving on non-contiguous
types. That is, if it wants to write data at locations
    [W...X...Y...Z...]
it reads the whole buffer
    [dddddddddddddddd]
changes the locations it wants to write to
    [WdddXdddYdddZddd]
then writes the whole thing back. It uses locks to make this safe, but
the problem is this only protects against other parts of the product that
are using locks. And without this PR a peer who is simultaneously making
a simple non-contiguous write wouldn't have locked.

A testcase to demonstrate the original problem is here:
https://gist.github.com/markalle/d7da240c19e57f095c5d1b13240dae24

% mpicc -o x romio_write_timing.c
% mpirun -np 4 ./x

Note: you need to use a filesystem that uses ADIOI_GEN_WriteStrided to
hit the bug. I was using GPFS.

This commit is pulled from wkliao after discussing where to put the
new lock.  It adds locks to contiguous writes in independent write
functions when data sieving write is not disabled

Signed-off-by: Mark Allen <markalle@us.ibm.com>
…ments

> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

typecasting a (void*) due to compiler complaint when -Wpedantic is used

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

Configuring mpich with --with-file-system=nfs+ufs+gpfs it wouldn't
build because the last arg of ADIOI_OneSidedWriteAggregation()
changed, so this PR adds an & to it

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

Configuring mpich with --with-file-system=nfs+ufs+gpfs it wouldn't
run because there are a couple free() calls that reported null
pointers, so I changed a couple places to "if (x) { ADIOI_Free(x); }"

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

Romio GPFS had a runtime problem due to a .mem_ptrs field that
was uninitialized that it was potentially trying to free.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

adio.h must be included before any system headers so that config
settings are taken into account. Fixes an issue where aligned_alloc
may be used without a declaration. See similar fix in [4fcffbe695cb].

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

mpl.h must be included before any system headers so that config
settings are taken into account. Fixes an issue where aligned_alloc
may be used without a declaration.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
> Pulled in from mpich romio, branch "main".
> Their commit message is below.
>
> This is part of a batch of commits from the
> following set of PRs:
>  *  pmodels/mpich#4943
>     -- darray fix which contains a flatten fix
>         73a3eba
>         c4b5762
>  *  pmodels/mpich#4995
>     -- write strided fix
>         bbb5210
>  *  pmodels/mpich#5100
>     -- build fix for -Wpedantic
>         ad0e435
>  *  pmodels/mpich#5099
>     -- build fix, they had let file-system=...gpfs bit rot
>         e1d42af
>         313289a
>         83bbb82
>  *  pmodels/mpich#5150
>     -- build fix, configure-related _GNU_SOURCE
>         a712b56
>         5a036e7
>  *  pmodels/mpich#5184
>     -- build fix, continuation of _GNU_SOURCE fix
>         d97c4ee

There was a problem with romio's use of MPL and the include order
sometimes having system files before the config files generated
by configure.  So for example at configure time _GNU_SOURCE was
used and thus configure concluded that aligned_memory() was
already declared, but then later in romio sometimes stdlib.h or
similar sys includes were coming before mplconfig.h and maybe
even before romioconf.h.

My understanding of the design described by hzhou is that
universlaly putting the config files at the top is the best
solution to this ordering.

All over the romio code they seem to use adio.h as their top
level include, which has romioconf.h as its first include.
So in this PR I'm moved adio.h to the top of a few more files.

And I also made romioconf.h include mplconfig.h as its first
include.  Otherwise what we have for romio's use of MPL is

adio.h:
  includes romioconf.h  <-- we're relying on _GNU_SOURCE being here too
  includes many things including system files
  includes adioi.h
    includes mpl.h
      includes mplconfig.h with _GNU_SOURCE

so having _GNU_SOURCE down in mplconfig.h doesn't really
ensure it's in front of the system files unless we can
rely on it being there in romioconf.h (Maybe we can? I'm not
sure).  Anyway as long as the top level romioconf.h knows
it's consuming MPL it can put mplconfig.h at its top and then
all the rest of the romio code only needs to follow the rule
of putting adio.h first and then both romio and mpl will get
their config settings included at the top

Signed-off-by: Mark Allen <markalle@us.ibm.com>
@ggouaillardet ggouaillardet merged commit 69db1c9 into ggouaillardet:topic/romio341_refresh Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants