Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store duration time of task in rally-results metrics record #1220

Merged
merged 12 commits into from
Apr 19, 2021

Conversation

ebadyano
Copy link
Contributor

Closes: #1197

@ebadyano ebadyano added enhancement Improves the status quo :Metrics How metrics are stored, calculated or aggregated labels Mar 26, 2021
@ebadyano ebadyano requested a review from dliappis March 26, 2021 00:14
Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! That's very helpful. I left a couple of suggestions.

I also think that a prerequisite for this is #1198 so we have consistent time scale. Otherwise all measurements would be in milliseconds except for relative-time, which would be denoted in microseconds.

@@ -1673,6 +1675,11 @@ def total_transform_metric(self, metric_name):
def error_rate(self, task_name, operation_type):
return self.store.get_error_rate(task=task_name, operation_type=operation_type, sample_type=SampleType.Normal)

def duration(self, task_name):
values = self.store.get_raw("service_time", task_name, mapper=lambda doc: doc["relative-time"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can potentially return millions of documents (e.g. in a benchmark with thousands of clients) and we're only interested in one value. I wonder whether this would justify a specialized query method in the metrics store as it would be much cheaper only to retrieve the maximum value directly?

I also want to point out that relative-time is determined before a request is issued:

rally/esrally/driver/driver.py

Lines 1060 to 1062 in 62595bd

@property
def relative_time(self):
return self.request_start - self.task_start

So strictly speaking duration does not account for the service time of the last request. If this is a long-running request (e.g. in frozen tier benchmarks) the duration measurement could be several minutes off. This could be specifically misleading if there is only one measurement iteration as duration would report something in order of several milliseconds when the actual duration would more be like several minutes.

So if we settle on a special query method, I propose that the API allows us to retrieve one sample ordered by a criterion. E.g. we could amend the current get_one method in the metrics store:

def get_one(self, name, sample_type=None, node_name=None, task=None):

to allow sorting:

def get_one(self, name, sample_type=None, node_name=None, task=None, sort_key=None, sort_reverse=False):

Note: I've borrowed the names of the new parameters sort_key an sort_reverse from Python's sorted function for an idiomatic API.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I will use that, thank you for the suggestion!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielmitterdorfer interestingly the current implementation for get_one uses get and the just returns values[0] so it would still return all the values and then pick the first one locally.

return values[0] if values else None

Was your idea to modify the search query so that it would only return one value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should implement get_one in both subclasses (InMemoryMetricsStore and EsMetricsStore) and remove the generic implementation in the base class.

@@ -1821,6 +1830,7 @@ def add_op_metrics(self, task, operation, throughput, latency, service_time, pro
"service_time": service_time,
"processing_time": processing_time,
"error_rate": error_rate,
"duration_time": duration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that duration by itself conveys its meaning already and the _time suffix is redundant? (see also https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/time/Duration.html)

@danielmitterdorfer
Copy link
Member

I also think that a prerequisite for this is #1198 so we have consistent time scale.

I've pushed #1221 now and we should base the duration on the newly introduced field relative-time-ms in this PR here.

@ebadyano ebadyano modified the milestones: 2.1.0, 2.2.0 Mar 31, 2021
@danielmitterdorfer danielmitterdorfer modified the milestones: 2.2.0, 2.1.1 Apr 6, 2021
Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating! I left a couple more comments. Can you please also implement tests for get_one for both metrics store implementations (including tests for handling an empty search result)?

@@ -825,6 +828,29 @@ def _get(self, name, task, operation_type, sample_type, node_name, mapper):
self.logger.debug("Metrics query produced [%s] results.", result["hits"]["total"])
return [mapper(v["_source"]) for v in result["hits"]["hits"]]

def get_one(self, name, sample_type=None, node_name=None, task=None, mapper=lambda doc: doc["value"],
sort_key=None, sort_reverse=False):
order = "desc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this would be simpler?

order = "desc" if sort_reverse else "asc"

order = "desc"
if not sort_reverse:
order = "asc"
if sort_key:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you could implement this block as follows:

query = {
    "query": self._query_by_name(name, task, None, sample_type, node_name),
    "size": 1
}
if sort_key:
    query["sort"] = [{sort_key: {"order": order}}]

This avoids repetition of common elements.

self.logger.debug("Issuing get against index=[%s], query=[%s].", self._index, query)
result = self._client.search(index=self._index, body=query)
self.logger.debug("Metrics query produced [%s] results.", result["hits"]["total"])
return mapper(result["hits"]["hits"][0]["_source"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would fail with a key error if there are no hits. Can we check this and return None if there are no hits?

docs = sorted(self.docs, key=lambda k: k[sort_key], reverse=sort_reverse)
else:
docs = self.docs
for doc in docs:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to return None if there are no hits (i.e. after the for loop).

@@ -1583,6 +1585,7 @@ def test_calculate_global_stats(self):
self.assertEqual(collections.OrderedDict(
[("50_0", 200), ("100_0", 210), ("mean", 200), ("unit", "ms")]), opm["service_time"])
self.assertAlmostEqual(0.3333333333333333, opm["error_rate"])
self.assertAlmostEqual(709*1000, opm["duration"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we able to use assertEqual here as this is an integer value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opps thank you for catching

@@ -1595,6 +1598,7 @@ def test_calculate_global_stats(self):
self.assertEqual(17.2, stats.ml_processing_time[0]["median"])
self.assertEqual(36.0, stats.ml_processing_time[0]["max"])
self.assertEqual("ms", stats.ml_processing_time[0]["unit"])
self.assertAlmostEqual(600*1000, opm2["duration"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we able to use assertEqual here as this is an integer value?

@ebadyano ebadyano modified the milestones: 2.1.1, 2.2.0 Apr 15, 2021
Copy link
Member

@danielmitterdorfer danielmitterdorfer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating! I left one comment but no need for another review round. LGTM

actual_duration = self.metrics_store.get_one("service_time", task="task1", mapper=lambda doc: doc["relative-time-ms"],
sort_key="relative-time-ms", sort_reverse=True)

self.assertEqual(None, actual_duration)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use self.assertIsNone here.

@ebadyano
Copy link
Contributor Author

Thank you for the review @danielmitterdorfer

@ebadyano ebadyano merged commit 7dfdbfb into elastic:master Apr 19, 2021
@ebadyano ebadyano deleted the time branch December 16, 2022 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improves the status quo :Metrics How metrics are stored, calculated or aggregated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Store total time of task in rally-results metrics record
2 participants