-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RC and iodemo fixes #71
Conversation
yosefe
commented
Oct 27, 2020
•
edited
Loading
edited
- Reserve 1 CQ credit for qp-flush NOP, to avoid disabling CQ moderation
- Fix keepalive on DevX
- Fix memory leak in IO demo
- Simplify active connection counting
@evgeny-leksikov pls take a look |
test/apps/iodemo/io_demo.cc
Outdated
@@ -671,9 +673,14 @@ class DemoServer : public P2pDemoCommon { | |||
recv_data(conn, *iov, msg->tr.sn, w); | |||
} | |||
|
|||
virtual void dispatch_connection_accepted(UcxConnection* conn) {\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls del \
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
test/apps/iodemo/ucx_wrapper.h
Outdated
bool add_connection(UcxConnection *conn); | ||
|
||
bool remove_connection(UcxConnection *conn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return value is not handled anywhere now, can be void
with new virtual func or maybe assert for debug
virtual void dispatch_connection_error(UcxConnection *conn) { | ||
LOG << "deleting connection with status " | ||
<< ucs_status_string(conn->ucx_status()); | ||
--_curr_state.active_conns; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can server be an initiator of disconnect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
|
||
virtual bool remove_connection(UcxConnection *conn); | ||
// Called when new server connection is accepted | ||
virtual void dispatch_connection_accepted(UcxConnection* conn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO better to avoid functions in base class which are relevant only for one inheritor, add_conection
looks more generic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add/remove connection was intended to be private helper to manage conn_id hash and not virtual
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
When QP is modified by DevX, the 'state' field may not be updated
Reserve a CQ entry for NOP operation during RC QP flush. Only one is needed since this is a blocking operation. This removes the need to disable TX moderation to avoid deadlock during flush due to lack of CQ resources to post a single NOP (which is needed to gen completions for unsignaled WQEs)
The client counts incorrectly the number of outstanding ops: when a connection is closed, _num_sent is decreased, but it should not be, since read requests also complete with error status and _num_completions is increased. As a result, the client sends more and more outstanding operations after every terminated connection, which causes increase in memory usage. The fix is to first close the connection, which will restore completions for read requests, and then restore the remaining credits for write requests - which will never complete on their own because they are waiting for write completion io-message.
fe6cc0f
to
e059cfb
Compare