-
Notifications
You must be signed in to change notification settings - Fork 8.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit TestBlockPullerBadBlocks pullblock time #2975
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply have the block puller pull the blocks asynchronously and have a select with a timeout statement that waits for either it to finish or for a timeout to expire?
Also the title is not suitable to the content. "Orderer unittest flake in deliver_test" - you are not fixing here a flake. Maybe say instead that you are bounding the maximum execution time of the test?
case <-endPullBlock: | ||
case <-time.After(10 * time.Second): | ||
// Signal PullBlock to give up on the retries & mark the test as fail | ||
close(bp.StopChannel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to close the StopChannel if you already have the bp.Close down below in line 1109?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when times out, following statement t.Fatalf("PullBlock did not complete within time")
terminates the current go routine, hence, it does not reach to line bp.Close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what? the test terminates doesn't it? Why do we care about other goroutines that are not the main goroutine if the test fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, If i have not explained it clearly here. When the PullBlock is not completed within the time, we want to mark that particular test(subtest) as failed. Hence, in the select case, signaled the blocking PullBlock & then Fail the testcase by calling t.Fatalf
We see the following output when the PullBlock failed to return/blocked:
--- FAIL: TestBlockPullerBadBlocks (60.02s)
--- FAIL: TestBlockPullerBadBlocks/nil_header (10.00s)
deliver_test.go:1106: PullBlock did not complete within time
--- FAIL: TestBlockPullerBadBlocks/nil_data (10.00s)
deliver_test.go:1106: PullBlock did not complete within time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand, but we need to make progress
Limiting the test wait time. It will help to fail fast the testcase. Signed-off-by: Parameswaran Selvam <parselva@in.ibm.com>
Type of change
Description
Testcase times out after 20 minutes as the client is not able
to connect to the server. The rootcause for the server connection
failure is not known yet.
BlockPuller provides an option to limit the retries to get the blocks
This will help here when the client failed to connect & fail the testcase
fast. This PR set the retries to 5 times.
Related issues
#2835
Signed-off-by: Parameswaran Selvam parselva@in.ibm.com