Set enable_cleanup_closed by default #1469

Merged: 3 commits into elastic:master on Mar 31, 2022

Conversation

pquentin (Member)

This is now the default in Elasticsearch's Python client, and leaving it set to false causes memory leaks in Rally for long-running benchmarks.

Closes #1466
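
For context, enable_cleanup_closed is an option on aiohttp's TCPConnector: when enabled, aiohttp aborts SSL transports whose peers never complete the closing handshake instead of letting them accumulate. A minimal sketch of opting in, assuming a plain aiohttp session rather than Rally's actual client code (the URL is a placeholder):

```python
import asyncio

import aiohttp


async def main():
    # enable_cleanup_closed=True tells aiohttp to abort SSL transports that never
    # finish their closing handshake instead of leaking them; this PR turns the
    # setting on by default in Rally.
    connector = aiohttp.TCPConnector(enable_cleanup_closed=True)
    async with aiohttp.ClientSession(connector=connector) as session:
        # Placeholder URL; Rally would talk to the benchmarked Elasticsearch cluster.
        async with session.get("http://localhost:9200") as response:
            print(response.status)


asyncio.run(main())
```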

@pquentin added the bug (Something's wrong) label on Mar 28, 2022
@pquentin added this to the 2.4.0 milestone on Mar 28, 2022
@pquentin self-assigned this on Mar 28, 2022
@pquentin (Member Author) commented Mar 28, 2022

Actually, I think this only fixes the async case, not the sync one. I'll fix it and add a unit test.

@pquentin (Member Author)

@elasticmachine run tests please

1 similar comment
@pquentin (Member Author)

@elasticmachine run tests please

Commits added:
  • This is now the default in Elasticsearch's Python client and leaving it false causes memory leaks in Rally for long-running benchmarks.
  • Timing tests are by nature flaky, especially in CI, see python-trio/trio#1851 for example.
@pquentin (Member Author)

> Actually, I think this only fixes the async case, not the sync one. I'll fix it and add a unit test.

Actually, async only is fine because enable_cleanup_closed only applies to async anyway. I added a test (see the sketch below). I also ensured that:

  • a large benchmark did not leak connections and performed the same way
  • we introduced this setting in the past so that it could be set to True in the presence of many clients (not so that it could be set to False)

I have a test currently running to make sure our nightly benchmarks also perform the same way.

I also had to fight a bit with a timing issue that failed in CI because we slept a bit more than expected. I bumped the tolerance to address that.

This is now ready for review.
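
A rough sketch of what such a unit test could look like, assuming direct construction of an aiohttp connector (this is not Rally's actual test, and _cleanup_closed_disabled is a private aiohttp attribute used here purely for illustration):

```python
import asyncio

import aiohttp


def test_connector_enables_cleanup_closed():
    async def check():
        # Build a connector with the new default and verify aiohttp recorded it.
        connector = aiohttp.TCPConnector(enable_cleanup_closed=True)
        try:
            # aiohttp stores the negated flag internally (implementation detail,
            # may differ across aiohttp versions).
            assert connector._cleanup_closed_disabled is False
        finally:
            await connector.close()

    asyncio.run(check())
```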

docs/command_line_reference.rst (outdated review comment, resolved)
@@ -5795,13 +5795,13 @@ async def test_adds_request_timings(self):
assert len(timings) == 3

assert timings[0]["operation"] == "initial-call"
assert math.isclose(timings[0]["service_time"], 0.1, rel_tol=0.05)
Contributor

Do you suppose the change in enable_cleanup_closed somehow affected the responsiveness of RequestContextHolder's APIs? I don't have an issue in general with adding tolerance to approximation-based unit tests, but I'm wondering whether we should expect anything else to happen in service_times, even for short-lived benchmarks.

For context, I haven't seen this test fail before.

Member Author

No, RequestContextHolder is purely self-contained. I think it's just that when moving to pytest I made the test stricter.
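
For reference, rel_tol=0.05 in math.isclose allows roughly a 5% relative deviation from the expected 0.1 s service time, so a slightly longer sleep still passes while a larger overshoot does not (illustrative values):

```python
import math

# Within ~5% of the expected 0.1 s: passes the bumped tolerance.
print(math.isclose(0.105, 0.1, rel_tol=0.05))  # True
# A 10% overshoot still fails the assertion.
print(math.isclose(0.11, 0.1, rel_tol=0.05))   # False
```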

@DJRickyB (Contributor) left a comment

LGTM, minus one typo and one open question

Co-authored-by: Rick Boyd <boyd.richardj@gmail.com>
@dliappis (Contributor)

Should we add a note in migrate.rst as well about this change in the default behavior?

@pquentin (Member Author)

> Should we add a note in migrate.rst as well about this change in the default behavior?

Since this is phrased as a question: I don't think so, as it's a bug fix like any other.

@pquentin (Member Author)

My nightly test using nyc_taxis was successful. This PR did not degrade performance (nor did it improve it). Thanks for the reviews, merging!

@pquentin merged commit 4762a9c into elastic:master on Mar 31, 2022
@pquentin deleted the enable-cleanup-closed branch on March 31, 2022 05:38
Labels: bug (Something's wrong)
Successfully merging this pull request may close these issues: Should we force Rally to clean up SSL connections? (#1466)
3 participants