Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

planner: split avg to count and sum for TableReader cop task #11926

Merged
merged 4 commits into from
Sep 2, 2019

Conversation

lzmhhh123
Copy link
Contributor

What problem does this PR solve?

Because the calculation of avg in the coprocessor is split as sum and count, and it also returns two columns. We can split avg in the planner. Now it's only effective for cop agg in TableReader.

A sub PR for support the coprocessor in tiflash.

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test

Side effects

  • Possible performance regression
  • Increased code complexity

Related changes

  • None

@lzmhhh123 lzmhhh123 added the sig/planner SIG: Planner label Aug 29, 2019
@codecov
Copy link

codecov bot commented Aug 29, 2019

Codecov Report

Merging #11926 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #11926   +/-   ##
===========================================
  Coverage   81.4033%   81.4033%           
===========================================
  Files           444        444           
  Lines         95560      95560           
===========================================
  Hits          77789      77789           
  Misses        12291      12291           
  Partials       5480       5480

@lzmhhh123
Copy link
Contributor Author

/run-all-tests

@lzmhhh123
Copy link
Contributor Author

/run-integration-common-test

1 similar comment
@lzmhhh123
Copy link
Contributor Author

/run-integration-common-test

@zz-jason
Copy link
Member

@lzmhhh123 Does it affect the execution logic on TiKV Coprocessor?

@lzmhhh123
Copy link
Contributor Author

@lzmhhh123 Does it affect the execution logic on TiKV Coprocessor?

I'm checking this situation.

@lzmhhh123
Copy link
Contributor Author

/run-all-tests

@@ -864,7 +864,7 @@ func (s *testPlanSuite) TestDAGPlanBuilderAgg(c *C) {
// Test agg + table.
{
sql: "select sum(a), avg(b + c) from t group by d",
best: "TableReader(Table(t)->HashAgg)->HashAgg",
best: "TableReader(Table(t))->Projection->HashAgg",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this one changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt confused too. So I'm investigating the reason now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it should be caused by the fact that the number of AggFuncs increases, so the cost of Aggregation is changed, and planner decides not to push this Agg down to coprocessor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to #11926 (comment), seems we should modify the cost model to decrease the impact of number of aggregate functions to make this aggregation can be pushed to tikv?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can temporarily not to modify the cost model. That's another task of cost model improvement with the consideration of aggregation type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, How about filing an issue about this improvement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

@lzmhhh123
Copy link
Contributor Author

@lzmhhh123 Does it affect the execution logic on TiKV Coprocessor?

I'm checking this situation.

It passed the integration-common-test at least.

@zz-jason
Copy link
Member

@lzmhhh123 Does it affect the execution logic on TiKV Coprocessor?

I'm checking this situation.

It passed the integration-common-test at least.

Yeah, does it mean that the execution instruction in TiKV is not changed? I'm afraid that this pr would impact the end-to-end performance of a SQL.

@lzmhhh123
Copy link
Contributor Author

@lzmhhh123 Does it affect the execution logic on TiKV Coprocessor?

I'm checking this situation.

It passed the integration-common-test at least.

Yeah, does it mean that the execution instruction in TiKV is not changed? I'm afraid that this pr would impact the end-to-end performance of a SQL.

Only a little. Because TiKV executes avg as sum and count.

@@ -864,7 +864,7 @@ func (s *testPlanSuite) TestDAGPlanBuilderAgg(c *C) {
// Test agg + table.
{
sql: "select sum(a), avg(b + c) from t group by d",
best: "TableReader(Table(t)->HashAgg)->HashAgg",
best: "TableReader(Table(t))->Projection->HashAgg",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it should be caused by the fact that the number of AggFuncs increases, so the cost of Aggregation is changed, and planner decides not to push this Agg down to coprocessor.

planner/core/task.go Show resolved Hide resolved
planner/core/task.go Show resolved Hide resolved
Copy link
Contributor

@eurekaka eurekaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eurekaka eurekaka added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 30, 2019
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zz-jason zz-jason added status/LGT2 Indicates that a PR has LGTM 2. status/can-merge Indicates a PR has been approved by a committer. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 2, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Sep 2, 2019

/run-all-tests

@sre-bot sre-bot merged commit b239f2f into pingcap:master Sep 2, 2019
@lzmhhh123 lzmhhh123 deleted the dev/split_avg_to_countSum branch September 2, 2019 08:20
lzmhhh123 added a commit to lzmhhh123/tidb that referenced this pull request Jan 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants