Rally should print a warning if there are no measurement samples #246

Closed
danielmitterdorfer opened this issue Mar 2, 2017 · 4 comments
Labels
enhancement Improves the status quo :Reporting Command line reporting :Usability Makes Rally easier to use

@danielmitterdorfer
Member

Current situation

If there are no measurement samples, Rally shows empty throughput lines in the summary report. This can happen on certain hardware and for "smaller" benchmarks (e.g. geonames), and it puzzles users.

Desired situation

Rally should state clearly that there are no measurement samples and should provide appropriate guidance.
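
To illustrate the desired behaviour, here is a minimal sketch of a reporter that prints a warning instead of empty throughput lines. It is not Rally's actual reporting code: the function name `render_throughput`, the warning text, and the line layout are all assumptions made for this example.

```python
import statistics


def render_throughput(operation, samples, unit="docs/s"):
    """Return report lines for one operation.

    `render_throughput` and the wording below are hypothetical;
    they only illustrate the behaviour requested in this issue.
    """
    if not samples:
        # No measurement samples: say so explicitly instead of printing blanks.
        return [
            f"[WARNING] No throughput samples for '{operation}'. "
            "The benchmark may have finished before the warmup period ended; "
            "consider a shorter warmup or a larger data set."
        ]
    ordered = sorted(samples)
    return [
        f"Min Throughput | {operation} | {ordered[0]:g} | {unit}",
        f"Median Throughput | {operation} | {statistics.median(ordered):g} | {unit}",
        f"Max Throughput | {operation} | {ordered[-1]:g} | {unit}",
    ]


for line in render_throughput("index-append", []):
    print(line)
```

With an empty sample list the sketch emits a single warning line; with samples it emits the usual min, median, and max rows.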

@danielmitterdorfer danielmitterdorfer added :Reporting Command line reporting :Usability Makes Rally easier to use enhancement Improves the status quo labels Mar 2, 2017
@gszr

gszr commented Apr 28, 2017

Experiencing this!

Playing with Rally on 3, 6, and 9-node clusters on EC2, m3.2xlarge instances. I'm setting the number of shards to be the same as the number of nodes. Running track 'geonames', challenge 'append-no-conflicts-index-only'.

For 3-node clusters, it works; for 6 and 9, it doesn't. The report for a 3-node cluster:


|   Lap |                          Metric |    Operation |     Value |   Unit |
|------:|--------------------------------:|-------------:|----------:|-------:|
|   All |                   Indexing time |              |   48.5889 |    min |
|   All |                      Merge time |              |   13.2943 |    min |
|   All |                    Refresh time |              |   2.05545 |    min |
|   All |                      Flush time |              |  0.183817 |    min |
|   All |             Merge throttle time |              |   2.24493 |    min |
|   All |              Total Young Gen GC |              |    43.709 |      s |
|   All |                Total Old Gen GC |              |    11.372 |      s |
|   All |          Heap used for segments |              |   29.9591 |     MB |
|   All |        Heap used for doc values |              |  0.102047 |     MB |
|   All |             Heap used for terms |              |   29.1469 |     MB |
|   All |             Heap used for norms |              | 0.0784302 |     MB |
|   All |     Heap used for stored fields |              |  0.631676 |     MB |
|   All |                   Segment count |              |       103 |        |
|   All |                  Min Throughput | index-append |   44029.8 | docs/s |
|   All |               Median Throughput | index-append |     44577 | docs/s |
|   All |                  Max Throughput | index-append |   44781.7 | docs/s |
|   All |       50.0th percentile latency | index-append |    853.63 |     ms |
|   All |       90.0th percentile latency | index-append |   1025.61 |     ms |
|   All |       99.0th percentile latency | index-append |   1186.15 |     ms |
|   All |      100.0th percentile latency | index-append |   1230.55 |     ms |
|   All |  50.0th percentile service time | index-append |   853.277 |     ms |
|   All |  90.0th percentile service time | index-append |   1025.53 |     ms |
|   All |  99.0th percentile service time | index-append |   1186.15 |     ms |
|   All | 100.0th percentile service time | index-append |   1230.55 |     ms |
|   All |                      error rate | index-append |         0 |      % |
|   All |                  Min Throughput |  force-merge |  0.697659 |  ops/s |
|   All |               Median Throughput |  force-merge |  0.697659 |  ops/s |
|   All |                  Max Throughput |  force-merge |  0.697659 |  ops/s |
|   All |      100.0th percentile latency |  force-merge |   1433.35 |     ms |
|   All | 100.0th percentile service time |  force-merge |   1433.35 |     ms |
|   All |                      error rate |  force-merge |         0 |      % |

---------------------------------
[INFO] SUCCESS (took 291 seconds)
---------------------------------

The report for a 6-node cluster:

|   Lap |                          Metric |    Operation |      Value |   Unit |
|------:|--------------------------------:|-------------:|-----------:|-------:|
|   All |                   Indexing time |              |   0.472517 |    min |
|   All |                      Merge time |              |   0.143567 |    min |
|   All |                    Refresh time |              | -0.0180667 |    min |
|   All |                      Flush time |              |  0.0571833 |    min |
|   All |             Merge throttle time |              |  0.0652333 |    min |
|   All |              Total Young Gen GC |              |      44.89 |      s |
|   All |                Total Old Gen GC |              |      9.596 |      s |
|   All |          Heap used for segments |              |      30.41 |     MB |
|   All |        Heap used for doc values |              |   0.127308 |     MB |
|   All |             Heap used for terms |              |    29.5544 |     MB |
|   All |             Heap used for norms |              |  0.0866699 |     MB |
|   All |     Heap used for stored fields |              |   0.641594 |     MB |
|   All |                   Segment count |              |        117 |        |
|   All |                  Min Throughput | index-append |            | docs/s |
|   All |               Median Throughput | index-append |            | docs/s |
|   All |                  Max Throughput | index-append |            | docs/s |
|   All |                      error rate | index-append |          0 |      % |
|   All |                  Min Throughput |  force-merge |    1.63125 |  ops/s |
|   All |               Median Throughput |  force-merge |    1.63125 |  ops/s |
|   All |                  Max Throughput |  force-merge |    1.63125 |  ops/s |
|   All |      100.0th percentile latency |  force-merge |    613.016 |     ms |
|   All | 100.0th percentile service time |  force-merge |    613.016 |     ms |
|   All |                      error rate |  force-merge |          0 |      % |


---------------------------------
[INFO] SUCCESS (took 202 seconds)
---------------------------------

@gszr

gszr commented Apr 28, 2017

@danielmitterdorfer Is there a workaround for that - apart from using a bigger dataset?

@gszr

gszr commented Apr 29, 2017

For reference, reducing the warmup-time-period for the challenge seems to be a solution. That makes sense: data captured during this period is not considered for the measurement results, so the larger the value, the fewer measurements we get.
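
A small sketch of why this happens, assuming samples are simply split by their timestamp relative to the warmup period. The function `split_samples`, the sample layout, and all numbers are hypothetical; Rally's internal bookkeeping may differ.

```python
def split_samples(samples, warmup_time_period):
    """Split (seconds_since_start, throughput) tuples into warmup and measurement samples.

    Hypothetical helper: it only models the idea that samples taken during
    the warmup period are excluded from the results.
    """
    warmup = [s for s in samples if s[0] < warmup_time_period]
    measurement = [s for s in samples if s[0] >= warmup_time_period]
    return warmup, measurement


# Hypothetical run: one sample every 10 seconds, indexing finishes after 60 seconds.
samples = [(t, 40000.0) for t in range(0, 60, 10)]

_, measured_long = split_samples(samples, warmup_time_period=120)
_, measured_short = split_samples(samples, warmup_time_period=30)

# A warmup longer than the whole run leaves nothing to report.
print(len(measured_long), len(measured_short))
```

When the task finishes before the warmup period is over, every sample falls into the warmup bucket and the throughput rows stay empty.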

@danielmitterdorfer
Member Author

Hi @Salazar, yes, a workaround is to reduce the warmup time. However, I don't think it makes much sense to run such short benchmarks: you are measuring a system that is clearly still in its warmup phase (the JIT compiler is still very active, FS caches might not be warm yet, etc.), so the numbers you get are bogus and will not help you at all. Instead, you could either use a different track (see the output of esrally list tracks for the available ones) or create your own track with a sufficiently large data set.
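
As a rough way to judge whether a data set is "sufficiently large", one can estimate how much of a run would remain after warmup. This is only a back-of-the-envelope sketch: `expected_measurement_seconds` is a hypothetical helper, the document count and warmup period below are made-up inputs, and only the throughput figure is taken from the 3-node report above.

```python
def expected_measurement_seconds(doc_count, docs_per_second, warmup_time_period):
    """Estimate how many seconds of indexing remain once the warmup period is over.

    Hypothetical helper; it assumes constant throughput for the whole run.
    """
    total_seconds = doc_count / docs_per_second
    return max(0.0, total_seconds - warmup_time_period)


# Made-up inputs: 10 million documents and a 120 second warmup period.
# 44577 docs/s is the median throughput from the 3-node report.
remaining = expected_measurement_seconds(
    doc_count=10_000_000,
    docs_per_second=44577,
    warmup_time_period=120,
)
print(f"about {remaining:.0f} s of measurement time")
```

If the estimate is zero or only a few seconds, a faster cluster will index the data before the warmup ends, and a larger data set or another track is the better choice.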

@danielmitterdorfer danielmitterdorfer added this to the 0.5.x milestone May 2, 2017
@danielmitterdorfer danielmitterdorfer modified the milestones: 0.5.4, 0.5.x May 24, 2017