UCT/IB/BASE: use random roce path factor to achieve high reliability. #127

Merged
merged 1 commit into from
May 12, 2021


leibin2014

UCT/IB/BASE: use random roce path factor to achieve high reliability.

When a switch on the current path has a problem, the application can disconnect and reconnect to achieve high reliability. On reconnect, a random RoCE path factor is generated so that a different UDP source port (sport) is chosen and the traffic takes a different path.
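
To make the idea concrete, here is a minimal sketch of deriving a UDP source port from a path factor and a path index, following the `roce_path_factor * path_index` expression quoted in the review below. Everything prefixed `sketch_` or `SKETCH_` is hypothetical and not taken from the UCX sources: the port base `0xC000`, the mask, the upper bound of 255, and the use of `rand()` are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constants for this sketch; not taken from the UCX sources. */
#define SKETCH_MAX_PATH_FACTOR 255u    /* assumed upper bound on the factor */
#define SKETCH_SPORT_BASE      0xC000u /* assumed base of the source-port range */
#define SKETCH_SPORT_MASK      0x3FFFu /* keeps the offset inside the range */

/* Pick a random path factor in [1, SKETCH_MAX_PATH_FACTOR].
 * The caller is expected to seed the generator with srand(). */
unsigned sketch_random_path_factor(void)
{
    return 1u + ((unsigned)rand() % SKETCH_MAX_PATH_FACTOR);
}

/* Derive a UDP source port from the path factor and the path index.
 * A different factor yields a different port for the same nonzero index,
 * which gives the network a chance to hash the flow onto another path. */
uint16_t sketch_udp_sport(unsigned path_factor, unsigned path_index)
{
    unsigned offset;

    assert(path_factor <= SKETCH_MAX_PATH_FACTOR);
    offset = (path_factor * path_index) & SKETCH_SPORT_MASK;
    return (uint16_t)(SKETCH_SPORT_BASE | offset);
}
```

In this sketch a path index of 0 always maps to the base port, since the product is 0 for any factor, so the random factor only changes the port for nonzero indices. Whether the real implementation behaves the same way is not stated in this conversation.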

@leibin2014 force-pushed the integration3 branch 2 times, most recently from 622a827 to 40521c7 (May 12, 2021 04:22)
Comment on lines 536 to 537
udp_sport = iface->config.roce_path_factor * path_index;
if (iface->config.en_random_factor) {
Owner

indent

udp_sport = iface->config.roce_path_factor * path_index;
if (iface->config.en_random_factor) {
assert(iface->config.roce_path_factor <= UCT_IB_ROCE_MAX_PATH_FACTOR);
Owner

ucs_assert

Author

@yosefe Already applied this comment:
#127 (comment)

Owner

@yosefe left a comment

@leibin2014 pls squash before merge

@leibin2014
Author

@leibin2014 pls squash before merge

@yosefe Done. Thanks!

@yosefe merged commit 74cbf08 into yosefe:integration3 May 12, 2021