Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/IB/MLX5: Fasten DC support check #3301

Merged
merged 2 commits into from
Mar 8, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 54 additions & 11 deletions src/uct/ib/mlx5/ib_mlx5_dv.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,12 +26,16 @@ ucs_status_t uct_ib_mlx5dv_init_obj(uct_ib_mlx5dv_t *obj, uint64_t type)
static ucs_status_t uct_ib_mlx5_device_init(uct_ib_device_t *dev)
{
struct ibv_context *ctx = dev->ibv_context;
struct ibv_srq_init_attr srq_attr = {};
struct ibv_qp_init_attr_ex qp_attr = {};
struct mlx5dv_qp_init_attr dv_attr = {};
struct ibv_qp_attr attr = {};
ucs_status_t status = UCS_OK;
struct ibv_srq *srq;
struct ibv_pd *pd;
struct ibv_cq *cq;
struct ibv_qp *qp;
int ret;

if (!(uct_ib_device_spec(dev)->flags & UCT_IB_DEVICE_FLAG_MLX5_PRM)) {
return UCS_OK;
Expand All @@ -50,24 +54,63 @@ static ucs_status_t uct_ib_mlx5_device_init(uct_ib_device_t *dev)
goto err_cq;
}

qp_attr.send_cq = cq;
qp_attr.recv_cq = cq;
qp_attr.cap.max_send_wr = 1;
qp_attr.cap.max_send_sge = 1;
qp_attr.qp_type = IBV_QPT_DRIVER;
qp_attr.comp_mask = IBV_QP_INIT_ATTR_PD;
qp_attr.pd = pd;
srq_attr.attr.max_sge = 1;
srq_attr.attr.max_wr = 1;
srq = ibv_create_srq(pd, &srq_attr);
if (srq == NULL) {
ucs_error("ibv_create_srq() failed: %m");
status = UCS_ERR_IO_ERROR;
goto err_srq;
}

qp_attr.send_cq = cq;
qp_attr.recv_cq = cq;
qp_attr.qp_type = IBV_QPT_DRIVER;
qp_attr.comp_mask = IBV_QP_INIT_ATTR_PD;
qp_attr.pd = pd;
qp_attr.srq = srq;

dv_attr.comp_mask = MLX5DV_QP_INIT_ATTR_MASK_DC;
dv_attr.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCI;
dv_attr.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCT;
dv_attr.dc_init_attr.dct_access_key = UCT_IB_KEY;

/* create DCI qp successful means DC is supported */
/* create DCT qp successful means DC is supported */
qp = mlx5dv_create_qp(ctx, &qp_attr, &dv_attr);
if (qp) {
ibv_destroy_qp(qp);
if (qp == NULL) {
goto err_qp;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is not creating a qp enough to check DC support?
Why is it required to move it to RTR state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to RTR state triggers in kernel actual CREATE_DCT

attr.qp_state = IBV_QPS_INIT;
attr.port_num = 1;
attr.qp_access_flags = IBV_ACCESS_REMOTE_WRITE |
IBV_ACCESS_REMOTE_READ |
IBV_ACCESS_REMOTE_ATOMIC;
ret = ibv_modify_qp(qp, &attr, IBV_QP_STATE |
IBV_QP_PKEY_INDEX |
IBV_QP_PORT |
IBV_QP_ACCESS_FLAGS);
if (ret != 0) {
goto err;
}

attr.qp_state = IBV_QPS_RTR;
attr.path_mtu = IBV_MTU_256;
attr.ah_attr.port_num = 1;

ret = ibv_modify_qp(qp, &attr, IBV_QP_STATE |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe extract new function just for testing if DC works?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would entire old function.

IBV_QP_MIN_RNR_TIMER |
IBV_QP_AV |
IBV_QP_PATH_MTU);

if (ret == 0) {
dev->flags |= UCT_IB_DEVICE_FLAG_DC;
}

err:
ibv_destroy_qp(qp);
err_qp:
ibv_destroy_srq(srq);
err_srq:
ibv_destroy_cq(cq);
err_cq:
ibv_dealloc_pd(pd);
Expand Down