UCT/IB/MLX5: Move QP to RESET (instead of error) before cleaning the SRQ - v1.3.x #2428

Merged

Contributor

@yosefe yosefe commented Mar 20, 2018

Picked from #2413

Moving the QP to the error state does not guarantee that all CQEs for
in-progress receives have been generated by the time the modify_qp command
completes. To get that guarantee, the QP must be moved to the reset state.
However, calling ibv_modify_qp() goes through the mlx5 driver, which may try
to clean up those CQEs itself and corrupt the linked list. Instead, we call
ibv_cmd_modify_qp() directly, which just modifies the QP without any
cleanup actions.
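
For reviewers who want the difference between the two call paths spelled out, here is a minimal toy model in C. It is not the code in this PR and does not use libibverbs: every `toy_*` type and function is a hypothetical stand-in, and the CQ is assumed to serve a single QP. `toy_cmd_modify_qp_reset()` plays the role of the direct ibv_cmd_modify_qp() call, and `toy_driver_modify_qp_reset()` plays the role of ibv_modify_qp() going through a provider that cleans the CQ on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, NOT libibverbs types. */
enum toy_qp_state { TOY_QPS_RTS, TOY_QPS_ERR, TOY_QPS_RESET };

struct toy_cq {
    int pending_cqes;            /* CQEs the application has not consumed yet */
};

struct toy_qp {
    enum toy_qp_state state;
    struct toy_cq    *recv_cq;   /* assumed to be used by this QP only */
    int               inflight_recvs; /* SRQ receives still owned by the HW */
};

/* Models the raw command path: when it returns, every in-progress receive
 * has produced a CQE and the QP is in RESET. No other side effects. */
static int toy_cmd_modify_qp_reset(struct toy_qp *qp)
{
    assert(qp != NULL && qp->recv_cq != NULL);
    qp->recv_cq->pending_cqes += qp->inflight_recvs;
    qp->inflight_recvs         = 0;
    qp->state                  = TOY_QPS_RESET;
    return 0;
}

/* Models the provider path: the same command, followed by a driver-side
 * cleanup that drops the QP's CQEs before the application can see them. */
static int toy_driver_modify_qp_reset(struct toy_qp *qp)
{
    int ret = toy_cmd_modify_qp_reset(qp);

    if (ret == 0) {
        qp->recv_cq->pending_cqes = 0; /* driver consumed the CQEs itself */
    }
    return ret;
}

/* Models the application's own SRQ cleanup: consume every pending CQE and
 * return how many SRQ segments it was able to release. */
static int toy_app_release_srq_segments(struct toy_cq *cq)
{
    int released;

    assert(cq != NULL);
    released         = cq->pending_cqes;
    cq->pending_cqes = 0;
    return released;
}
```

In this model, a QP with three in-flight receives that is reset through `toy_cmd_modify_qp_reset()` leaves three CQEs for `toy_app_release_srq_segments()` to process, while the same QP reset through `toy_driver_modify_qp_reset()` leaves none, so the application's bookkeeping of SRQ segments no longer matches what the hardware completed.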
@yosefe yosefe changed the title UCT/IB/MLX5: Move QP to RESET (instead of error) before cleaning the SRQ UCT/IB/MLX5: Move QP to RESET (instead of error) before cleaning the SRQ - v1.3.x Mar 20, 2018
@yosefe yosefe added the Bugfix label Mar 20, 2018
@swx-jenkins1

Test PASSed.
See http://bgate.mellanox.com/jenkins/job/gh-ucx-pr/4267/ for details.

@mellanox-github
Contributor

Test PASSed.
See http://hpc-master.lab.mtl.com:8080/job/hpc-ucx-pr/6578/ for details (Mellanox internal link).

@yosefe yosefe merged commit e2e8196 into openucx:v1.3.x Mar 20, 2018
@yosefe yosefe deleted the topic/uct-rc-ml5-cleanup-qp-reset-v1.3.x branch March 20, 2018 13:33

4 participants