*: refactor cost model formulas and constants #10581

Merged
merged 3 commits into from
Aug 7, 2019

@eurekaka eurekaka commented May 23, 2019

What problem does this PR solve?

Our current cost model is too naive to pick the physical plans we prefer in some scenarios, for example:

  • some costs, such as sorting in the index lookup operator or the inner-side cost of the index join operator, are not reflected in the cost computation at all;
  • some cost computations are wrong or inaccurate because they use the wrong input row count estimation (e.g., the cost computation of the TopN operator).

Besides, cost computations are not uniform across operators: some operators consider memory cost and others do not; some consider operator parallelism and others do not.

What is changed and how it works?

This PR tries to

    1. refine the cost model to catch up with the current executor implementations;
    2. unify the dimensions considered in cost computation for all operators, i.e., CPU cost, memory cost, network cost, scan cost, and operator parallelism.
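
To make the list of dimensions concrete, here is a minimal sketch of a cost function that accounts for all of them. The type names, field names, and the way parallelism divides CPU work are assumptions made for illustration; they are not the formulas or constants this PR introduces.

```go
package main

import "fmt"

// costFactors weights each cost dimension. Names and values are
// illustrative assumptions, not TiDB's actual constants.
type costFactors struct {
	cpu, memory, network, scan float64
}

// operatorWork is a hypothetical description of one operator's work.
type operatorWork struct {
	cpuRows     float64 // rows processed on the CPU
	memRows     float64 // rows buffered in memory
	netBytes    float64 // bytes sent over the network
	scanBytes   float64 // bytes read from storage
	concurrency float64 // workers that share the CPU work
}

// totalCost sums the four dimensions; CPU work is divided among workers.
func totalCost(f costFactors, w operatorWork) float64 {
	workers := w.concurrency
	if workers < 1 {
		workers = 1 // treat missing or invalid concurrency as serial execution
	}
	cpu := w.cpuRows * f.cpu / workers
	return cpu + w.memRows*f.memory + w.netBytes*f.network + w.scanBytes*f.scan
}

func main() {
	f := costFactors{cpu: 2, memory: 0.5, network: 1, scan: 4}
	w := operatorWork{cpuRows: 100, memRows: 10, netBytes: 20, scanBytes: 5, concurrency: 4}
	fmt.Println(totalCost(f, w))
}
```

Keeping every dimension in one structure means no operator can silently skip a dimension: an operator that uses no memory simply sets that field to zero.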

Check List

Tests

  • Unit test: some UT results are updated
  • Integration test: some integration tests are updated

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change

Side effects

  • Possible performance regression

Related changes

  • Need to cherry-pick to the release branch: we may need this in release-3.0
  • Need to be included in the release note

@eurekaka eurekaka added type/enhancement The issue or PR belongs to an enhancement. sig/planner SIG: Planner labels May 23, 2019
@zhouqiang-cl
Contributor

/rebuild

@codecov

codecov bot commented May 29, 2019

Codecov Report

Merging #10581 into master will decrease coverage by 0.1794%.
The diff coverage is 96.2643%.

@@               Coverage Diff                @@
##             master     #10581        +/-   ##
================================================
- Coverage   81.4101%   81.2307%   -0.1795%     
================================================
  Files           426        426                
  Lines         92513      92028       -485     
================================================
- Hits          75315      74755       -560     
- Misses        11826      11904        +78     
+ Partials       5372       5369         -3

@zhouqiang-cl
Contributor

/bench

cmd/explaintest/r/tpch.result (outdated, resolved)
executor/builder.go (outdated, resolved)
@eurekaka
Contributor Author

/rebuild

@eurekaka
Contributor Author

/run-all-tests

@eurekaka
Contributor Author

/run-common-test tidb-test=pr/840

@eurekaka
Contributor Author

/run-all-tests tidb-test=pr/840

@eurekaka eurekaka marked this pull request as ready for review June 24, 2019 05:52
@eurekaka
Contributor Author

/run-all-tests tidb-test=pr/840

planner/core/task.go (outdated, resolved)
planner/core/task.go (resolved)
statistics/table.go (outdated, resolved)
@eurekaka eurekaka requested review from alivxxx and winoros July 2, 2019 10:41
@eurekaka eurekaka changed the title planner: refactor cost model formulas and constants *: refactor cost model formulas and constants Jul 18, 2019
Contributor

@lzmhhh123 lzmhhh123 left a comment

LGTM.

@lzmhhh123 lzmhhh123 added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 26, 2019
executor/index_lookup_join_test.go (resolved)
planner/core/cbo_test.go (resolved)
planner/core/exhaust_physical_plans.go (outdated, resolved)
planner/core/exhaust_physical_plans.go (outdated, resolved)
@eurekaka eurekaka force-pushed the cost_model branch 2 times, most recently from f949384 to 0617428 Compare August 1, 2019 12:59
@alivxxx alivxxx removed their request for review August 2, 2019 06:25
@qw4990 qw4990 removed their request for review August 5, 2019 07:38
colHist, ok := coll.Columns[col.UniqueID]
// Normally this would not happen, it is for compatibility with old version stats which
// does not include TotColSize.
if !ok || (colHist.TotColSize == 0 && (colHist.NullCount != coll.Count)) {
Member

We can calculate (colHist.TotColSize == 0 && (colHist.NullCount != coll.Count)) once outside the for loop.

Contributor Author

We need a valid colHist to evaluate this check. If we move the check outside the for loop, the code becomes pretty ugly.
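
The trade-off in this thread can be seen in a simplified sketch of the loop. The types below, the `pseudoColSize` value, and the `avgRowSize` helper are hypothetical stand-ins, not the code in statistics/table.go. The point is that the fallback check reads fields of the per-column histogram, so it is evaluated after each lookup.

```go
package main

import "fmt"

// colStats is a hypothetical, trimmed-down column histogram.
type colStats struct {
	TotColSize int64 // total size of the column; 0 in old-version stats
	NullCount  int64 // number of NULL values in the column
}

// pseudoColSize is an assumed default used when real stats are unusable.
const pseudoColSize = 8.0

// avgRowSize adds up the average size of each requested column.
func avgRowSize(cols map[int64]colStats, ids []int64, count int64) float64 {
	if count == 0 {
		return pseudoColSize * float64(len(ids))
	}
	size := 0.0
	for _, id := range ids {
		h, ok := cols[id]
		// The check depends on h, which only exists after the lookup,
		// so it is evaluated for every column inside the loop.
		if !ok || (h.TotColSize == 0 && h.NullCount != count) {
			size += pseudoColSize
			continue
		}
		size += float64(h.TotColSize) / float64(count)
	}
	return size
}

func main() {
	cols := map[int64]colStats{
		1: {TotColSize: 400, NullCount: 0}, // usable stats
		2: {TotColSize: 0, NullCount: 0},   // old-version stats without TotColSize
	}
	fmt.Println(avgRowSize(cols, []int64{1, 2, 3}, 100))
}
```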

planner/core/exhaust_physical_plans.go (outdated, resolved)
copTask := &copTask{
tablePlan: ts,
indexPlanFinished: true,
cst: scanFactor * rowSize * 1.0,
Member

How about replacing 1.0 with ts.stats.RowCount? That would be much clearer.

planner/core/task.go (resolved)
planner/core/task.go (outdated, resolved)
rCount := rTask.count()
if len(p.RightConditions) > 0 {
cpuCost += lCount * rCount * cpuFactor
rCount *= selectionFactor
Member

rCount may be incorrect when we can use an index scan on the inner-side table; in that case the scan range is decided by the correlated outer-side join key.

Contributor Author

But we cannot know the selectivity of the outer key until execution.
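
As a rough illustration of the snippet under discussion, the sketch below isolates the two steps: charging CPU cost for evaluating the right-side conditions on every pair of rows, then shrinking the right-side row count by a fixed factor. The function name and signature are hypothetical; only the two arithmetic steps come from the quoted code.

```go
package main

import "fmt"

// applyRightConditions mirrors the two steps in the quoted snippet.
// It returns the extra CPU cost and the reduced right-side row count.
// The function itself is a hypothetical wrapper written for illustration.
func applyRightConditions(lCount, rCount, cpuFactor, selectionFactor float64, hasRightConditions bool) (cpuCost, filteredRCount float64) {
	if !hasRightConditions {
		return 0, rCount // nothing to evaluate, row count unchanged
	}
	// Every left row is paired with every right row when filtering.
	cpuCost = lCount * rCount * cpuFactor
	// A fixed factor stands in for the selectivity, which is unknown at planning time.
	return cpuCost, rCount * selectionFactor
}

func main() {
	cost, rows := applyRightConditions(10, 100, 0.5, 0.5, true)
	fmt.Println(cost, rows)
}
```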

cpuCost += probeCost + (innerConcurrency+1.0)*concurrencyFactor
// Memory cost of hash tables for inner rows. The computed result is the upper bound,
// since the executor is pipelined and not all workers are always in full load.
memoryCost := innerConcurrency * (batchSize * distinctFactor) * innerCnt * memoryFactor
Member

Should we consider the average row size of each inner row?

Contributor Author

A row in memory has a different size from its representation on disk or on the network. Currently, we use a very small default memoryFactor in order to choose the fastest plan, one that fully utilizes resources. To make the cost model friendly to memory management, we do need to consider row size here. Can we leave this to a separate PR?
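
For reference, here is a small sketch of the memory-cost formula from the quoted snippet, next to one possible row-size-aware variant of the kind the reviewer asks about. The second function is purely an assumption about what such a follow-up might look like; it is not part of this PR.

```go
package main

import "fmt"

// hashTableMemoryCost reproduces the formula in the quoted snippet:
// an upper bound on the memory held by the inner workers' hash tables.
func hashTableMemoryCost(innerConcurrency, batchSize, distinctFactor, innerCnt, memoryFactor float64) float64 {
	return innerConcurrency * (batchSize * distinctFactor) * innerCnt * memoryFactor
}

// hashTableMemoryCostWithRowSize is a hypothetical variant that scales the
// bound by an average in-memory row size. It is not code from the PR.
func hashTableMemoryCostWithRowSize(innerConcurrency, batchSize, distinctFactor, innerCnt, memoryFactor, avgRowSize float64) float64 {
	return hashTableMemoryCost(innerConcurrency, batchSize, distinctFactor, innerCnt, memoryFactor) * avgRowSize
}

func main() {
	fmt.Println(hashTableMemoryCost(4, 32, 0.5, 2, 0.25))
	fmt.Println(hashTableMemoryCostWithRowSize(4, 32, 0.5, 2, 0.25, 8))
}
```

With a very small memoryFactor, as the reply describes, either variant contributes little to the total cost, so the planner still favors the fastest plan.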

planner/core/task.go (resolved)
planner/core/task.go (resolved)
@eurekaka eurekaka requested a review from zz-jason August 7, 2019 05:59
@eurekaka
Contributor Author

eurekaka commented Aug 7, 2019

/rebuild

Member

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 7, 2019
@sre-bot
Contributor

sre-bot commented Aug 7, 2019

/run-all-tests

@sre-bot
Contributor

sre-bot commented Aug 7, 2019

@eurekaka merge failed.

@eurekaka
Contributor Author

eurekaka commented Aug 7, 2019

/run-all-tests tidb-test=pr/840

@eurekaka eurekaka merged commit fe03864 into pingcap:master Aug 7, 2019
@eurekaka eurekaka deleted the cost_model branch August 7, 2019 09:57
lzmhhh123 pushed a commit to lzmhhh123/tidb that referenced this pull request Jan 19, 2020
Labels
  • sig/planner (SIG: Planner)
  • status/can-merge (Indicates a PR has been approved by a committer.)
  • status/LGT2 (Indicates that a PR has LGTM 2.)
  • type/enhancement (The issue or PR belongs to an enhancement.)
7 participants