
UCP/TAG: Move TM stuff from context to worker #2031

Merged (3 commits) on Dec 7, 2017

Conversation

@brminich (Contributor) commented Dec 1, 2017

Tag matching queues and other related info are moved to the worker object. Now tag communications cannot cross between different workers (similar to communicators in MPI), and the behavior of ucp_tag_msg_recv_nb* matches its description: these routines receive a message on a particular worker.

@jenkinsornl

Build finished.

@swx-jenkins1

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ucx-pr/3207/ for details.

@mellanox-github

Test PASSed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/5318/ for details (Mellanox internal link).

@shamisp (Contributor) commented Dec 1, 2017

This is a major change that we discussed during the F2F.

I would suggest getting in touch with the MPICH and Open MPI communities to make sure that (a) they are happy with the changes, and (b) we are not breaking anything.

@brminich (Contributor, author) commented Dec 4, 2017

It should be OK with Open MPI, where only one worker is used by the UCX PML.
@raffenet, can you please confirm that this change is OK with MPICH?

@yosefe yosefe added the Cleanup label Dec 4, 2017
No need to take the mutex in tag AM callbacks, since they can be called only
from the progress context, and ucp_worker_progress is already guarded with
locks.
@jenkinsornl

Build finished.

@swx-jenkins1

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ucx-pr/3239/ for details.

@mellanox-github

Test FAILed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/5353/ for details (Mellanox internal link).

@mellanox-github

Test FAILed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/5366/ for details (Mellanox internal link).

@brminich (Contributor, author) commented Dec 6, 2017

The failure is caused by #2027.

@brminich (Contributor, author) commented Dec 6, 2017

bot:mlx:retest

@mellanox-github

Test FAILed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/5381/ for details (Mellanox internal link).

@mellanox-github

Test FAILed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/5397/ for details (Mellanox internal link).

Conflicts:
	src/ucp/core/ucp_context.c
	src/ucp/core/ucp_context.h
	src/ucp/tag/offload.c
@jenkinsornl

Build finished.

@swx-jenkins1

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ucx-pr/3280/ for details.

@mellanox-github

Test PASSed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/5407/ for details (Mellanox internal link).

@brminich (Contributor, author) commented Dec 7, 2017

@yosefe, please check the 2nd commit. I removed the unnecessary locks from the tag AM handlers.

@yosefe yosefe merged commit 6625c6a into openucx:master Dec 7, 2017

6 participants