Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HashSet::union performance #66280

Merged
merged 1 commit into from
Nov 12, 2019
Merged

Fix HashSet::union performance #66280

merged 1 commit into from
Nov 12, 2019

Conversation

stepancheg
Copy link
Contributor

Consider this example: small_set = 0..2, large_set = 0..1000.

To efficiently compute the union of these sets, we should

  • take all elements of the larger set
  • for each element of the smaller set check it is not in the larger set

This is exactly what this commit does.

This particular optimization was implemented a year ago, but the
author mistaken < and >.

Consider this example: small_set = 0..2, large_set = 0..1000.

To efficiently compute the union of these sets, we should
* take all elements of the larger set
* for each element of the smaller set check it is not in the larger set

This is exactly what this commit does.

This particular optimization was implemented a year ago, but the
author mistaken `<` and `>`.
@rust-highfive
Copy link
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 10, 2019
@alexcrichton
Copy link
Member

@bors: r+ rollup

@bors
Copy link
Contributor

bors commented Nov 11, 2019

📌 Commit 04a237b has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 11, 2019
@ssomers
Copy link
Contributor

ssomers commented Nov 12, 2019

What happened is that I measured it on free standing code with free standing benchmarks, and the code must have gotten mangled while transplanting it into libstd, and I didn't even try to transplant benchmarks. In general, there are few benchmarks in libstd/collections/hash compared to liballoc/benches/btree. I could make some but I guess it's too late to include in this PR.

JohnTitor added a commit to JohnTitor/rust that referenced this pull request Nov 12, 2019
Fix HashSet::union performance

Consider this example: small_set = 0..2, large_set = 0..1000.

To efficiently compute the union of these sets, we should
* take all elements of the larger set
* for each element of the smaller set check it is not in the larger set

This is exactly what this commit does.

This particular optimization was implemented a year ago, but the
author mistaken `<` and `>`.
bors added a commit that referenced this pull request Nov 12, 2019
Rollup of 11 pull requests

Successful merges:

 - #65965 (Clean up librustc_typeck error_codes file)
 - #66230 (remove vestigial comments referring to defunct numeric trait hierarchy)
 - #66241 (bump openssl version)
 - #66257 (Drop long-section-names linker workaround for windows-gnu)
 - #66263 (make the error message more readable)
 - #66267 (Add rustdoc doc)
 - #66276 (Move lock into CodeStats)
 - #66278 (Fix error message about exported symbols from proc-macro crates)
 - #66280 (Fix HashSet::union performance)
 - #66299 (support issue = "none" in unstable attributes )
 - #66309 (Tiny cleanup to size assertions)

Failed merges:

r? @ghost
@bors bors merged commit 04a237b into rust-lang:master Nov 12, 2019
@ssomers
Copy link
Contributor

ssomers commented Nov 13, 2019

cargo benchcmp of a few benchmarks on modestly sized sets (10 versus 100 elements) says about this change:

name                                                before ns/iter  after ns/iter  diff ns/iter   diff %  speedup
collections::hash::bench::set_difference            124             125                       1    0.81%   x 0.99
collections::hash::bench::set_intersection          122             126                       4    3.28%   x 0.97
collections::hash::bench::set_symmetric_difference  1,624           1,821                   197   12.13%   x 0.89
collections::hash::bench::set_union                 1,581           266                  -1,315  -83.18%   x 5.94

Quite a lot better than correcting the "still measurable" improvement I reported in #57043, whatever it was that I tested.

@stepancheg stepancheg deleted the union branch November 14, 2019 00:06
bors added a commit that referenced this pull request Nov 24, 2019
introduce benchmarks of HashSet operations

To avoid goofs such as corrected by #66280, I added benchmarks of binary HashSet operations.

Due to the fact x.py keeps recompiling the whole shebang (or at least a big part of it) whenever you touch the test code, and because piling up all tests in one file does not strike me as future proof, I tried moving the hash benches to the separate place they are for liballoc/collections/btree. But it turns out that, in a cleaned checkout, x.py still recompiles the whole shebang whenever you touch the test code (PS or when you add or delete any irrelevant file). So I'm not going to add more tests, and I doubt others will, and these tests have proven their point already, so this PR is kind of pointless
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants