ch4/ofi: long AM default to RDMA READ, make PIPELINE optional #4811

Merged 2 commits on Sep 29, 2020

@yfguo (Contributor) commented Sep 24, 2020

OFI now uses RDMA READ by default for long AM and falls back to PIPELINE if RDMA READ is not supported. This PR also provides a CVAR to force OFI to use PIPELINE.

Expected Impact

Author Checklist

  • Reference appropriate issues (with "Fixes" or "See" as appropriate)
  • Remove xfail from the test suite when fixing a test
  • Commits are self-contained and do not do two things at once
  • Commit message is of the form: module: short description and follows good practice
  • Passes whitespace checkers
  • Passes warning tests
  • Passes all tests
  • Add comments such that someone without knowledge of the code could understand
  • You or your company has a signed contributor's agreement on file with Argonne
  • For non-Argonne authors, request an explicit comment from your company's PR approval manager

@yfguo (Contributor Author) commented Sep 24, 2020

test:mpich/ch4/ofi

@yfguo (Contributor Author) commented Sep 25, 2020

test:mpich/ch4/ofi

@raffenet (Contributor) commented

I think we should default ch4/ofi to use RDMA READ for now.

OSU BW performance using 2 Jenkins nodes (am-only configuration):

PIPELINE
[raffenet@pmrs-centos64-240-01]~/osu-micro-benchmarks-5.6.2/mpi/pt2pt% mpiexec -n 2 -hosts pmrs-centos64-240-01.cels.anl.gov,pmrs-centos64-240-02.cels.anl.gov ./osu_bw
# OSU MPI Bandwidth Test v5.6.2
# Size      Bandwidth (MB/s)
1                       0.04
2                       0.12
4                       0.20
8                       0.59
16                      1.18
32                      2.30
64                      4.50
128                     9.32
256                    18.77
512                    52.39
1024                   74.33
2048                  140.38
4096                  251.50
8192                  481.52
16384                 418.78
32768                 578.39
65536                 709.71
131072                725.73
262144                748.25
524288                539.61
1048576               638.39
2097152               636.97
4194304               649.40
RDMA READ
[raffenet@pmrs-centos64-240-01]~/osu-micro-benchmarks-5.6.2/mpi/pt2pt% MPIR_CVAR_CH4_OFI_FORCE_RDMA_READ_AM_LONG=1 mpiexec -n 2 -hosts pmrs-centos64-240-01.cels.anl.gov,pmrs-centos64-240-02.cels.anl.gov ./osu_bw
# OSU MPI Bandwidth Test v5.6.2
# Size      Bandwidth (MB/s)
1                       0.04
2                       0.14
4                       0.29
8                       0.36
16                      1.02
32                      2.36
64                      4.73
128                     9.46
256                    18.78
512                    37.57
1024                   52.40
2048                  101.31
4096                  205.07
8192                  413.58
16384                 283.43
32768                 590.67
65536                1195.24
131072               1304.33
262144               2557.22
524288               2602.24
1048576              3015.07
2097152              3005.11
4194304              2789.09
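
For reference, the large-message gap between the two runs above can be quantified with a short script. This is a minimal sketch in Python; the bandwidth values are copied from the osu_bw output above, and the names `pipeline`, `rdma_read`, and `speedup` are illustrative only, not part of MPICH or the OSU benchmarks.

```python
# Bandwidth (MB/s) copied from the two osu_bw runs above,
# keyed by message size in bytes (64 KiB and larger only).
pipeline = {
    65536: 709.71,
    131072: 725.73,
    262144: 748.25,
    524288: 539.61,
    1048576: 638.39,
    2097152: 636.97,
    4194304: 649.40,
}
rdma_read = {
    65536: 1195.24,
    131072: 1304.33,
    262144: 2557.22,
    524288: 2602.24,
    1048576: 3015.07,
    2097152: 3005.11,
    4194304: 2789.09,
}


def speedup(size):
    """Ratio of RDMA READ bandwidth to PIPELINE bandwidth for one message size."""
    return rdma_read[size] / pipeline[size]


for size in sorted(pipeline):
    print(f"{size:>8} bytes: RDMA READ is {speedup(size):.2f}x PIPELINE")
```

With these numbers the ratio grows from about 1.7x at 64 KiB to roughly 4.3x to 4.8x from 512 KiB upward.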

@hzhou (Contributor) commented Sep 25, 2020

> I think we should default ch4/ofi to use RDMA READ for now.

I agree. However, we should check MPIDI_OFI_ENABLE_RMA.

@raffenet (Contributor) commented

> > I think we should default ch4/ofi to use RDMA READ for now.
>
> I agree. However, we should check MPIDI_OFI_ENABLE_RMA.

Yes, makes sense.

@yfguo changed the title from "ch4/ofi: add option to force RDMA READ protocol for long message" to "ch4/ofi: long AM default to RDMA READ, make PIPELINE optional" on Sep 25, 2020

@yfguo (Contributor Author) commented Sep 25, 2020

Reversed the logic. OFI now uses RDMA READ when MPIDI_OFI_ENABLE_RMA is true and PIPELINE is not being forced. Added three pt2pt tests that force PIPELINE.
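
The selection rule described in this comment can be modeled as below. This is a hedged sketch in Python, not the C code from the PR: `choose_am_long_protocol`, `enable_rma`, and `force_pipeline` are hypothetical names standing in for MPIDI_OFI_ENABLE_RMA and the new CVAR, whose actual name is not given in this thread.

```python
from enum import Enum


class AmLongProtocol(Enum):
    RDMA_READ = "rdma_read"
    PIPELINE = "pipeline"


def choose_am_long_protocol(enable_rma: bool, force_pipeline: bool = False) -> AmLongProtocol:
    """Pick the protocol for long active messages.

    enable_rma     -- stands in for MPIDI_OFI_ENABLE_RMA (assumed here to be a boolean)
    force_pipeline -- stands in for the CVAR added by this PR (hypothetical name)
    """
    if enable_rma and not force_pipeline:
        return AmLongProtocol.RDMA_READ
    # Either RMA is unavailable or PIPELINE was requested explicitly.
    return AmLongProtocol.PIPELINE
```

Under this model, forcing PIPELINE always wins, and PIPELINE is also the fallback whenever RMA is not enabled.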

@yfguo (Contributor Author) commented Sep 25, 2020

test:mpich/ch4/ofi

@raffenet (Contributor) left a comment

This looks OK to me. For the CVAR based testing, maybe we should add strict=FALSE since that is really MPICH-specific behavior being tested?

@raffenet (Contributor) commented

> This looks OK to me. For the CVAR based testing, maybe we should add strict=FALSE since that is really MPICH-specific behavior being tested?

Well, aside from the fact that it does not compile 😄 .

@yfguo (Contributor Author) commented Sep 28, 2020

test:mpich/ch4/ofi

@yfguo (Contributor Author) commented Sep 28, 2020

> > This looks OK to me. For the CVAR based testing, maybe we should add strict=FALSE since that is really MPICH-specific behavior being tested?
>
> Well, aside from the fact that it does not compile 😄 .

That is embarrassing. Fixed now.

Set OFI to use RDMA READ as the default protocol for long messages, and add
an option to force the PIPELINE protocol if needed for testing.
@yfguo (Contributor Author) commented Sep 28, 2020

test:mpich/ch4/ofi

test after rebasing to make sure it still works fine.

@yfguo merged commit 5e07e58 into pmodels:main on Sep 29, 2020
@yfguo deleted the ofi-rdmar branch on October 23, 2020 22:23

3 participants