
coll: use MPI_Aint for v-collective array parameters #5065

Merged: 1 commit into pmodels:main on Feb 20, 2021

Conversation

@hzhou (Contributor) commented Feb 12, 2021

Pull Request Description

The previous PR (#5044) switched all internal collective routines to use MPI_Aint. This PR does the same for the counts/displs arrays of the v-collectives.

Also added Python code to generate the MPI_Aint impl prototypes and to generate the code that swaps the counts arrays before calling the corresponding MPIR_Xxx routine; a sketch of the resulting shim is shown below.
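
For illustration, here is a minimal sketch of what such a generated shim could look like. The names internal_alltoallv and alltoallv_shim are hypothetical stand-ins, not MPICH's actual symbols; the real generated code lives in MPICH's binding layer and calls the MPIR_ impl routines.

#include <mpi.h>
#include <stdlib.h>

/* Stand-in for the internal MPI_Aint-based routine; the name and exact
 * signature are assumptions for illustration, not MPICH's real API. */
static int internal_alltoallv(const void *sendbuf, const MPI_Aint sendcounts[],
                              const MPI_Aint sdispls[], MPI_Datatype sendtype,
                              void *recvbuf, const MPI_Aint recvcounts[],
                              const MPI_Aint rdispls[], MPI_Datatype recvtype,
                              MPI_Comm comm)
{
    /* the real work happens inside MPICH */
    return MPI_SUCCESS;
}

/* Shape of the generated shim: promote the caller's int arrays to MPI_Aint
 * (the "counts array swap"), call the internal routine, free the copies. */
static int alltoallv_shim(const void *sendbuf, const int sendcounts[],
                          const int sdispls[], MPI_Datatype sendtype,
                          void *recvbuf, const int recvcounts[],
                          const int rdispls[], MPI_Datatype recvtype,
                          MPI_Comm comm)
{
    int n;
    MPI_Comm_size(comm, &n);

    /* one allocation for all four promoted arrays */
    MPI_Aint *tmp = malloc(4 * (size_t) n * sizeof(MPI_Aint));
    if (tmp == NULL)
        return MPI_ERR_OTHER;
    MPI_Aint *scnt = tmp, *sdsp = tmp + n;
    MPI_Aint *rcnt = tmp + 2 * n, *rdsp = tmp + 3 * n;
    for (int i = 0; i < n; i++) {
        scnt[i] = sendcounts[i];
        sdsp[i] = sdispls[i];
        rcnt[i] = recvcounts[i];
        rdsp[i] = rdispls[i];
    }

    int mpi_errno = internal_alltoallv(sendbuf, scnt, sdsp, sendtype,
                                       recvbuf, rcnt, rdsp, recvtype, comm);
    free(tmp);
    return mpi_errno;
}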

Expected Impact

Author Checklist

  • Reference appropriate issues (with "Fixes" or "See" as appropriate)
  • Remove xfail from the test suite when fixing a test
  • Commits are self-contained and do not do two things at once
  • Commit message is of the form: module: short description and follows good practice
  • Passes whitespace checkers
  • Passes warning tests
  • Passes all tests
  • Add comments such that someone without knowledge of the code could understand them
  • You or your company has a signed contributor's agreement on file with Argonne
  • For non-Argonne authors, request an explicit comment from your company's PR approval manager

@hzhou (Contributor, Author) commented Feb 12, 2021

test:mpich/ch3/most
test:mpich/ch4/most

@hzhou (Contributor, Author) commented Feb 12, 2021

test:mpich/ch3/most

@hzhou requested a review from raffenet, February 13, 2021 05:24
@raffenet (Contributor) commented Feb 19, 2021

Just to add some data here: I ran some simple experiments to measure the overhead of promoting these arrays, using the OSU benchmark for MPI_Alltoallv (4 JLSE nodes, fully subscribed at ppn=88) with the actual communication in MPICH commented out (i.e., message size is irrelevant):

Original MPICH

# OSU MPI All-to-Allv Personalized Exchange Latency Test v5.4.1
# Size       Avg Latency(us)
1                       1.15

With large count promotion

# OSU MPI All-to-Allv Personalized Exchange Latency Test v5.4.1
# Size       Avg Latency(us)
1                       2.58

Actual communication cost

For reference, the actual communication cost of MPI_Alltoallv (over TCP) leads me to believe this overhead may not be significant:

# OSU MPI All-to-Allv Personalized Exchange Latency Test v5.4.1
# Size       Avg Latency(us)
1                   35231.28
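
To make the numbers above concrete, here is a minimal self-contained sketch of the kind of timing loop such a benchmark runs. This is not the OSU source; the iteration count and structure are my own assumptions. It sends one byte per peer, matching the Size-1 rows above.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* one byte per peer, contiguous displacements */
    int *counts = malloc(size * sizeof(int));
    int *displs = malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) {
        counts[i] = 1;
        displs[i] = i;
    }
    char *sbuf = calloc(size, 1);
    char *rbuf = calloc(size, 1);

    /* time many iterations and report the average latency */
    int iters = 1000;
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int it = 0; it < iters; it++)
        MPI_Alltoallv(sbuf, counts, displs, MPI_CHAR,
                      rbuf, counts, displs, MPI_CHAR, MPI_COMM_WORLD);
    double avg_us = (MPI_Wtime() - t0) / iters * 1e6;

    if (rank == 0)
        printf("avg latency: %.2f us\n", avg_us);

    free(counts); free(displs); free(sbuf); free(rbuf);
    MPI_Finalize();
    return 0;
}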

@raffenet (Contributor) left a comment:

This is definitely pushing the boundaries of what a single commit can change, but I suppose it is unavoidable given all the APIs that depend on the array arguments.

@raffenet (Contributor) commented:
Here is communication data from Gomez w/ InfiniBand (ch4:ucx):

# OSU MPI All-to-Allv Personalized Exchange Latency Test v5.4.1
# Size       Avg Latency(us)
1                    2029.23

@hzhou (Contributor, Author) commented Feb 19, 2021

test:mpich/ch4/ofi

@hzhou merged commit 9983a3b into pmodels:main, Feb 20, 2021
@hzhou deleted the 2102_coll_v_large branch, February 20, 2021 01:02
@hzhou (Contributor, Author) commented May 29, 2021

Reference #4880
