Format specifiers for MPI types #107

jdinan · 2018-09-20T10:25:42Z

Problem

It's not easy to perform I/O on MPI_Count, e.g. with printf or scanf.

Proposal

Similar to inttypes.h, add MPI_PRI_COUNT and MPI_SCN_COUNT format specifiers to mpi.h.
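
As a rough illustration of what the proposal would give users, the sketch below mimics the inttypes.h pattern. The names demo_count_t, DEMO_PRI_COUNT, and DEMO_SCN_COUNT are hypothetical stand-ins, not existing MPI definitions, and typedef'ing the count type to long long is an assumption made only so the example compiles without mpi.h.

```c
#include <stdio.h>

/* Hypothetical stand-ins for what mpi.h would provide. A real header
   would pick the type and the matching length modifier together. */
typedef long long demo_count_t;
#define DEMO_PRI_COUNT "lld"   /* analogous to PRId64 in inttypes.h */
#define DEMO_SCN_COUNT "lld"   /* analogous to SCNd64 in inttypes.h */

/* Format a count into buf; returns what snprintf returns. */
int demo_format_count(char *buf, size_t len, demo_count_t n)
{
    return snprintf(buf, len, "count=%" DEMO_PRI_COUNT, n);
}

/* Parse a count from text; returns 1 on success, like sscanf. */
int demo_parse_count(const char *text, demo_count_t *out)
{
    return sscanf(text, "%" DEMO_SCN_COUNT, out);
}
```

A caller would write the format string as "%" DEMO_PRI_COUNT without needing to know whether the count type is long or long long.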

Changes to the Text

TBD

Impact on Implementations

Should be limited to header files.

Impact on Users

Users don't need to figure out the format specifier based on the size and signedness of the type.

References

inttypes.h


jdinan · 2018-09-20T10:47:10Z

Comment from discussion on 9/20/2018: This also raises a question about interoperability between MPI and C library routines that operate on C standard types (e.g. printf, scanf, etc.). Being able to specify format specifiers indicates that there is a correspondence between the MPI type and a C standard type.

jdinan · 2018-09-20T10:47:51Z

@mahermanns and @dholmes-epcc-ed-ac-uk Thanks for volunteering to further discussion on this ticket.

dholmes-epcc-ed-ac-uk · 2018-09-20T15:36:11Z

That is the right question (Bill, Sept 2018)

Should MPI replace MPI_COUNT with size_t in all C API definitions, and with whatever native Fortran datatype is natural for the intended usage in each situation in all Fortran API definitions?

Should MPI replace MPI_AINT with ptrdiff_t in all C API definitions, and with whatever native Fortran datatype is natural for the intended usage in each situation (which may not exist in all versions of Fortran!) in all Fortran API definitions?

The consequences of this counter-proposal are that no such format specifiers are needed, and the arithmetic operators MPI_AINT_ADD and MPI_AINT_DIFF are no longer needed, and the Big MPI proposal is no longer needed (as currently specified), and <other benefits>.

bosilca · 2018-09-20T17:11:10Z

MPI_Aint to ptrdiff_t would be more accurate. But otherwise +1.

dholmes-epcc-ed-ac-uk · 2018-09-20T22:50:03Z

Thanks @bosilca - I knew that such a type must exist but could not think of the type name at the time I wrote the comment.

jeffhammond · 2018-09-24T05:15:55Z

@jdinan Is it really that hard? We know from MPI-3.1 Section 2.5.8 that MPI_Count must be signed, because

it must be minimally capable of encoding any value that may be stored in a variable of type int

so one should only need to verify that off_t and ptrdiff_t are the same size and then use %zd or PRId64.

In any case, I fail to see any utility in truncating words in MPI_PRI_COUNT and MPI_SCN_COUNT. Just use MPI_PRINT_COUNT and MPI_SCAN_COUNT. The result is significantly more readable and adds only 3 bytes to the size of mpi.h.
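
The manual check described in this comment could be done at compile time, roughly as follows. This is only a sketch: demo_count_t is a hypothetical stand-in for MPI_Count (typedef'd to long long purely for illustration), and the DEMO_ macros are not part of any MPI header. The explicit conversion to int64_t is what keeps PRId64 correct even if the underlying type is a different integer type of the same width.

```c
#include <inttypes.h>
#include <stdio.h>

typedef long long demo_count_t;   /* assumption: stand-in for MPI_Count */

/* Nonzero when integer type T is signed: (T)-1 becomes a large
   positive value for unsigned types. */
#define DEMO_IS_SIGNED(T) ((T)-1 < (T)0)

/* Fail the build if the assumptions behind PRId64 do not hold. */
_Static_assert(DEMO_IS_SIGNED(demo_count_t), "count type must be signed");
_Static_assert(sizeof(demo_count_t) <= sizeof(int64_t),
               "count type is wider than 64 bits");

/* Convert explicitly so the argument type matches PRId64 exactly. */
int demo_print_count(char *buf, size_t len, demo_count_t n)
{
    return snprintf(buf, len, "%" PRId64, (int64_t)n);
}
```

The static assertions turn a silent mismatch into a build failure, but the user still has to write and maintain them per type, which is the burden the proposal aims to remove.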

dholmes-epcc-ed-ac-uk · 2018-09-24T08:54:37Z

@jeffhammond given that these format specifiers only apply to the printf and scanf functions (with variants, such as vsprintf?) then we should probably include that extra F to make it 20% clearer:
MPI_PRINTF_COUNT
MPI_SCANF_COUNT

Dumb question: will these ever be different to each other? Do we need two/both?

What is the Fortran equivalent? The "I" descriptor seems old, i.e. F77 era.

jeffhammond · 2018-09-24T14:49:57Z

@dholmes-epcc-ed-ac-uk Fortran does not standardize a preprocessor so it doesn't really matter.

jdinan · 2018-09-25T19:45:17Z

These should follow the convention used in inttypes.h for print and scan format specifiers. These can be used in any of the functions in the printf and scanf family (see the link above for info on the inttypes header).

@jeffhammond Yes, it really is this hard if you want portability. In C, the standard integer type binary format is implementation defined, but the fixed width integer types must be two's complement. It is therefore possible to have two different signed integer representations and a user will not know which one should be used with MPI_Count.

jdinan · 2018-09-25T19:46:50Z

We can't use C size_t and ptrdiff_t because of heterogeneity support and language interoperability.

jeffhammond · 2018-09-25T20:43:58Z

@jdinan This should be fixed in C20/C++20.

We could also just preemptively stipulate that the MPI standard requires two's complement integers, because literally no system outside of Unisys supports anything else, and then only in the context of FPGA emulation of legacy code that can't be migrated to x86_64 (see aforementioned documents for details).

mhoemmen · 2018-09-25T22:00:34Z

@jeffhammond FYI if you want the latest version of a paper, use the wg21.link/p0907 link; it automatically resolves to the most recent submitted version. P0907 is on R3 now. Also it's been forwarded to Core, but I'm not sure of current status for C++20.

mahermanns · 2018-10-05T13:11:09Z

I think using size_t and ptrdiff_t in the API is a different discussion.

I think as MPI introduces the typedef, it should also be MPI defining the format specifier (apart from how difficult it is or whether it is possible at all).

Using the PRI abbreviation would follow the principle of least astonishment. However, as we are diverging from the original naming anyway (with the second underscore and all uppercase), it may indeed be better to expand the names to MPI_PRINT_COUNT and MPI_SCAN_COUNT (I am also not a friend of abbreviating variable names unnecessarily). Then again, naming them MPI_PRI_COUNT and MPI_SCN_COUNT may set them apart enough from other MPI constants to foster intuitive recognition.

dholmes-epcc-ed-ac-uk · 2020-10-21T17:03:04Z

@jdinan has this problem gone away? (I know that the answer has to be "no" because no changes have been made to address it, but no-one has commented on this issue since 2018, so it is obviously not particularly pressing.)

Is there still interest in doing something about this for the mpi-4.0 release? If so, the clock is ticking rapidly.

jdinan · 2021-01-05T19:43:24Z

@dholmes-epcc-ed-ac-uk No, this hasn't been fixed. This issue could be a good first proposal for any Forum members that are looking to get their feet wet introducing a new proposal to the MPI Forum.

raffenet · 2021-07-21T16:11:40Z

Just as reference, MPICH has provided these (in mpi.h) for some time.

/* FIXME: The following two definition are not defined by MPI and must not be
   included in the mpi.h file, as the MPI namespace is reserved to the MPI
   standard */
#define MPI_AINT_FMT_DEC_SPEC "%ld"
#define MPI_AINT_FMT_HEX_SPEC "%lx"

Note the actual specifiers are filled in by configure.

wesbland · 2022-07-14T23:30:13Z

I’m going to propose moving this to MPI 5.0. There’s more discussion to be had here. If someone objects and thinks we’ll be ready to read this soon, leave a comment and we can discuss bringing it back into MPI 4.1.

jdinan · 2022-09-30T20:55:33Z

To folks that have asked: no, the problem has not gone away, since users don't know to which integral C type a given MPI integer type maps. If another Forum member has cycles to pick up this issue (should be a relatively easy one), please feel free to do so.

jeffhammond · 2022-10-11T18:21:40Z

Can one just default to %llu and promote it, if it's not 64b?

jdinan · 2022-10-24T21:01:06Z

If you used the same approach with scanf, it would be difficult to detect whether the value is truncated.
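
To make the truncation concern concrete: reading through a wider type only helps if the value is range-checked before it is narrowed. The sketch below uses strtoll instead of scanf because strtoll reports overflow through errno; that substitution is an illustration choice, not something proposed in this thread. demo_count_t and the DEMO_COUNT_MIN/MAX limits are hypothetical, with int chosen as a deliberately narrow stand-in type.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

typedef int demo_count_t;            /* assumption: a narrow stand-in type */
#define DEMO_COUNT_MAX INT_MAX
#define DEMO_COUNT_MIN INT_MIN

/* Parse text into *out. Returns 0 on success, -1 if the text is not a
   number or the value does not fit in demo_count_t. */
int demo_read_count(const char *text, demo_count_t *out)
{
    char *end;
    errno = 0;
    long long wide = strtoll(text, &end, 10);
    if (end == text || errno == ERANGE)
        return -1;                   /* no digits, or overflowed long long */
    if (wide < DEMO_COUNT_MIN || wide > DEMO_COUNT_MAX)
        return -1;                   /* would be truncated when narrowed */
    *out = (demo_count_t)wide;
    return 0;
}
```

Note that the range check itself requires knowing the limits of the target type, which is the same information a standard format specifier would encode.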

jeffhammond · 2023-01-02T07:53:19Z

One could also determine the size of an integer using sizeof and whether it is signed using this.

With C++, it seems straightforward to deduce the printf formats from typeid. See below.

One can write something similar with the GNU C extension typeof, which is expected to be in C23. I assume there is a way to do it with _Generic as well, but I haven't tried.

As for scanf, I would expect binary file I/O to need to store the type information if one cannot assume 64-bit values.

#include <cstdio>       // printf
#include <iostream>
#include <string>
#include <type_traits>  // std::is_signed
#include <typeinfo>     // typeid

#include <mpi.h>

int main(void)
{
    MPI_Count  c = 5;
    MPI_Aint   a = 6;
    MPI_Offset o = 7;

    std::string ff{"C=%"+std::string{typeid(MPI_Count).name()}+"\n"};
    printf(ff.c_str(),c);

    std::string gg{"A=%"+std::string{typeid(MPI_Aint).name()}+( std::is_signed<MPI_Aint>() ? "d" : "u")+"\n"};
    printf(gg.c_str(),a);

    std::string hh{"O=%"+std::string{typeid(MPI_Offset).name()}+"\n"};
    printf(hh.c_str(),o);

    return 0;
}
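
For the _Generic route mentioned in this comment, a minimal C11 sketch might look like the following. DEMO_FMT and demo_count_t are hypothetical names, and long long is again only an assumed stand-in for MPI_Count. Unlike the typeid approach, the selection happens on the static type and does not depend on implementation-defined type names.

```c
#include <stdio.h>

/* Select a printf conversion from the static type of x (C11 _Generic).
   Only standard integer types are listed; a typedef such as MPI_Count
   would resolve to whichever of these it aliases. */
#define DEMO_FMT(x) _Generic((x),       \
    int:                "%d",           \
    unsigned int:       "%u",           \
    long:               "%ld",          \
    unsigned long:      "%lu",          \
    long long:          "%lld",         \
    unsigned long long: "%llu")

typedef long long demo_count_t;   /* assumption: stand-in for MPI_Count */

/* Format n with the conversion chosen for its type. */
int demo_format(char *buf, size_t len, demo_count_t n)
{
    return snprintf(buf, len, DEMO_FMT(n), n);
}
```

The macro yields a complete conversion, so it cannot be spliced into a longer literal the way inttypes.h macros can; that limitation is one argument for header-provided specifier macros.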

mhoemmen · 2023-01-02T22:22:58Z

@jeffhammond wrote: "With C++, it seems straightforward to deduce the printf formats from typeid. See below."

In C++23, I would use std::print, and in C++20, I would use std::format. These Solve the Problem without you needing to know the printf format specifier. If you need to support earlier C++ versions, you can use the {fmt} library. (C++20 and C++23 standardized these parts of the {fmt} library.)

If I had to use printf, Jeff's typeid-based approach works, but please note that the result of std::type_info::name() is mangled and not standard. GCC offers a demangling function ( https://gcc.gnu.org/onlinedocs/libstdc++/manual/ext_demangling.html ); other compilers probably also do that.

@jeffhammond also asked: "Can one just default to %llu and promote it, if it's not 64b?"

If it's actually a pointer, reinterpret_cast<ptrdiff_t>(p) would get you a signed integer, in which case I would use the t length modifier (%td) instead of ll.

Please don't use intmax_t (see e.g., https://thephd.dev/intmax_t-hell-c++-c ).
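
For reference, the t length modifier applies to ptrdiff_t values. The sketch below is a plain C illustration using pointer subtraction rather than the cast discussed above; demo_format_diff is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the element distance between two pointers into the same array.
   %td is the standard conversion for ptrdiff_t. */
int demo_format_diff(char *buf, size_t len, const int *hi, const int *lo)
{
    ptrdiff_t d = hi - lo;
    return snprintf(buf, len, "%td", d);
}
```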

jeffhammond · 2023-03-16T12:01:58Z

We might end up standardizing the C type of these types for the ABI, so maybe it won't be so bad in the future.

jeffhammond · 2023-04-28T13:02:08Z

I withdraw my prior objections to this proposal. We should do this, and it is especially important for MPI_Count, because it is likely going to be the wider of intptr_t and int64_t and thus it's going to be annoying for users to printf these.

jdinan assigned mahermanns and dholmes-epcc-ed-ac-uk Sep 20, 2018

dholmes-epcc-ed-ac-uk added not ready wg-large-counts Large Counts Working Group labels Sep 20, 2018

jdinan mentioned this issue Sep 21, 2018

Predefined datatypes for MPI_Count and friends #109

Closed

dholmes-epcc-ed-ac-uk mentioned this issue Sep 23, 2018

Big MPI---large-count and displacement support--collective chapter #80

Closed

wesbland added mpi <next> and removed mpi-4.0 labels Nov 18, 2020

wesbland added no-wg Discussion doesn't have a current working group and removed wg-large-counts Large Counts Working Group labels Nov 18, 2020

wesbland added wg-languages Languages Working Group mpi-4.1 For inclusion in the MPI 4.1 standard and removed no-wg Discussion doesn't have a current working group mpi <next> labels Jul 21, 2021

wesbland assigned martinruefenacht and tonyskjellum Jul 21, 2021

wesbland removed the not ready label Jul 21, 2021

wesbland assigned Wee-Free-Scot and unassigned dholmes-epcc-ed-ac-uk Jul 21, 2021

wesbland added mpi-5 For inclusion in the MPI 5.0 standard and removed mpi-4.1 For inclusion in the MPI 4.1 standard labels Jul 14, 2022

jeffhammond mentioned this issue Aug 31, 2023

MPI needs a standard ABI #751

Open
