Rally should print a warning if there are no measurement samples #246

danielmitterdorfer · 2017-03-02T11:37:42Z

Current situation

If there are no measurement samples, Rally shows empty throughput lines in the summary report. This can happen on certain hardware and for "smaller" benchmarks (e.g. geonames), and it puzzles users.

Desired situation

Rally should make it clear that there are no measurement samples and should provide appropriate guidance.
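
To make the request concrete, here is a minimal sketch of the desired reporting behaviour. It is not Rally's actual reporting code: the function name `throughput_lines`, its parameters, and the wording of the warning are all hypothetical and chosen only for illustration.

```python
from statistics import median


def throughput_lines(operation, samples, unit="docs/s"):
    """Return summary lines for one operation (hypothetical helper, not Rally's API).

    If there are no measurement samples, return a single warning line with
    guidance instead of empty throughput values.
    """
    if not samples:
        return [
            f"[WARNING] No throughput samples for '{operation}'. "
            "The benchmark may have finished during warmup; "
            "consider a larger data set."
        ]
    return [
        f"Min Throughput {operation}: {min(samples):g} {unit}",
        f"Median Throughput {operation}: {median(samples):g} {unit}",
        f"Max Throughput {operation}: {max(samples):g} {unit}",
    ]


# An operation without samples yields one warning line instead of blanks.
for line in throughput_lines("index-append", []):
    print(line)
```

The point of the sketch is only that the empty case is handled explicitly, so the user sees a reason and a next step rather than blank cells.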

gszr · 2017-04-28T16:23:34Z

Experiencing this!

Playing with Rally on 3, 6, and 9-node clusters on EC2, m3.2xlarge instances. I'm setting the number of shards to be the same as the number of nodes. Running track 'geonames', challenge 'append-no-conflicts-index-only'.

For 3-node clusters, it works; for 6 and 9, it doesn't. The report for a 3-node cluster:


|   Lap |                          Metric |    Operation |     Value |   Unit |
|------:|--------------------------------:|-------------:|----------:|-------:|
|   All |                   Indexing time |              |   48.5889 |    min |
|   All |                      Merge time |              |   13.2943 |    min |
|   All |                    Refresh time |              |   2.05545 |    min |
|   All |                      Flush time |              |  0.183817 |    min |
|   All |             Merge throttle time |              |   2.24493 |    min |
|   All |              Total Young Gen GC |              |    43.709 |      s |
|   All |                Total Old Gen GC |              |    11.372 |      s |
|   All |          Heap used for segments |              |   29.9591 |     MB |
|   All |        Heap used for doc values |              |  0.102047 |     MB |
|   All |             Heap used for terms |              |   29.1469 |     MB |
|   All |             Heap used for norms |              | 0.0784302 |     MB |
|   All |     Heap used for stored fields |              |  0.631676 |     MB |
|   All |                   Segment count |              |       103 |        |
|   All |                  Min Throughput | index-append |   44029.8 | docs/s |
|   All |               Median Throughput | index-append |     44577 | docs/s |
|   All |                  Max Throughput | index-append |   44781.7 | docs/s |
|   All |       50.0th percentile latency | index-append |    853.63 |     ms |
|   All |       90.0th percentile latency | index-append |   1025.61 |     ms |
|   All |       99.0th percentile latency | index-append |   1186.15 |     ms |
|   All |      100.0th percentile latency | index-append |   1230.55 |     ms |
|   All |  50.0th percentile service time | index-append |   853.277 |     ms |
|   All |  90.0th percentile service time | index-append |   1025.53 |     ms |
|   All |  99.0th percentile service time | index-append |   1186.15 |     ms |
|   All | 100.0th percentile service time | index-append |   1230.55 |     ms |
|   All |                      error rate | index-append |         0 |      % |
|   All |                  Min Throughput |  force-merge |  0.697659 |  ops/s |
|   All |               Median Throughput |  force-merge |  0.697659 |  ops/s |
|   All |                  Max Throughput |  force-merge |  0.697659 |  ops/s |
|   All |      100.0th percentile latency |  force-merge |   1433.35 |     ms |
|   All | 100.0th percentile service time |  force-merge |   1433.35 |     ms |
|   All |                      error rate |  force-merge |         0 |      % |

---------------------------------
[INFO] SUCCESS (took 291 seconds)
---------------------------------

The report for a 6-node cluster:

|   Lap |                          Metric |    Operation |      Value |   Unit |
|------:|--------------------------------:|-------------:|-----------:|-------:|
|   All |                   Indexing time |              |   0.472517 |    min |
|   All |                      Merge time |              |   0.143567 |    min |
|   All |                    Refresh time |              | -0.0180667 |    min |
|   All |                      Flush time |              |  0.0571833 |    min |
|   All |             Merge throttle time |              |  0.0652333 |    min |
|   All |              Total Young Gen GC |              |      44.89 |      s |
|   All |                Total Old Gen GC |              |      9.596 |      s |
|   All |          Heap used for segments |              |      30.41 |     MB |
|   All |        Heap used for doc values |              |   0.127308 |     MB |
|   All |             Heap used for terms |              |    29.5544 |     MB |
|   All |             Heap used for norms |              |  0.0866699 |     MB |
|   All |     Heap used for stored fields |              |   0.641594 |     MB |
|   All |                   Segment count |              |        117 |        |
|   All |                  Min Throughput | index-append |            | docs/s |
|   All |               Median Throughput | index-append |            | docs/s |
|   All |                  Max Throughput | index-append |            | docs/s |
|   All |                      error rate | index-append |          0 |      % |
|   All |                  Min Throughput |  force-merge |    1.63125 |  ops/s |
|   All |               Median Throughput |  force-merge |    1.63125 |  ops/s |
|   All |                  Max Throughput |  force-merge |    1.63125 |  ops/s |
|   All |      100.0th percentile latency |  force-merge |    613.016 |     ms |
|   All | 100.0th percentile service time |  force-merge |    613.016 |     ms |
|   All |                      error rate |  force-merge |          0 |      % |


---------------------------------
[INFO] SUCCESS (took 202 seconds)
---------------------------------

gszr · 2017-04-28T18:34:34Z

@danielmitterdorfer Is there a workaround for that - apart from using a bigger dataset?

gszr · 2017-04-29T22:30:54Z

For reference, reducing the warmup-time-period for the challenge seems to be a solution. That makes sense, given that data captured during this period is not considered for the measurement results: the bigger the value, the fewer measurement samples we get.
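
The effect described above can be illustrated with a small sketch. It assumes, purely for illustration, that samples are `(seconds_since_start, value)` pairs and that everything recorded before the warmup period ends is discarded; the function `measurement_samples` and the numbers are hypothetical and do not reflect Rally's internals.

```python
def measurement_samples(samples, warmup_time_period):
    """Keep only samples taken at or after the end of the warmup period.

    samples: list of (seconds_since_start, value) tuples (assumed shape).
    warmup_time_period: warmup length in seconds.
    """
    return [(t, v) for t, v in samples if t >= warmup_time_period]


# A short benchmark: one sample every 10 seconds, finished before 90 seconds.
samples = [(t, 44000.0) for t in range(0, 90, 10)]

# With a 120-second warmup the benchmark ends before warmup does,
# so no measurement samples remain.
print(len(measurement_samples(samples, 120)))  # 0

# With a 30-second warmup some samples survive.
print(len(measurement_samples(samples, 30)))  # 6
```

This would also be consistent with the larger clusters being affected: if they index the same data set faster, the whole run can end inside the warmup window.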

danielmitterdorfer · 2017-05-02T07:33:41Z

Hi @Salazar, yes, a workaround is to reduce the warmup time. However, I don't think it makes much sense to run such short benchmarks, because you are measuring a system that is clearly still in the warmup phase (the JIT compiler is still very active, FS caches might not be warmed up enough, etc.), so you are basically measuring bogus numbers that will not help you at all. You could either use a different track (see the output of `esrally list tracks` for the available ones) or create your own track with a sufficiently large data set.

danielmitterdorfer added the :Reporting (Command line reporting), :Usability (Makes Rally easier to use), and enhancement (Improves the status quo) labels Mar 2, 2017

danielmitterdorfer added this to the 0.5.x milestone May 2, 2017

danielmitterdorfer closed this as completed in 34299d2 May 24, 2017

danielmitterdorfer modified the milestones: 0.5.4, 0.5.x May 24, 2017
