Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: RFC for TTL tables #39264

Merged
merged 43 commits into from
Mar 17, 2023
Merged

docs: RFC for TTL tables #39264

merged 43 commits into from
Mar 17, 2023

Conversation

lcwangchao
Copy link
Collaborator

What problem does this PR solve?

Issue Number: close #39263

Problem Summary:

What is changed and how it works?

Check List

Tests

  • No code

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Nov 21, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • YangKeao
  • bb7133

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 21, 2022
@lcwangchao
Copy link
Collaborator Author

/run-check-issue-triage-complete

docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Outdated Show resolved Hide resolved
docs/design/2022-11-17-ttl-table.md Show resolved Hide resolved
lcwangchao and others added 8 commits November 22, 2022 19:17
Co-authored-by: Mattias Jonsson <mjonss@users.noreply.github.com>
Co-authored-by: Mattias Jonsson <mjonss@users.noreply.github.com>
Co-authored-by: Mattias Jonsson <mjonss@users.noreply.github.com>
Co-authored-by: Mattias Jonsson <mjonss@users.noreply.github.com>
- Range: [10m0s, 8760h0m0s]
- Default: 1h

- `tidb_ttl_job_schedule_window_start_time`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't like the time window option. Data is very likely to be deleted incomplete in each time window.

- In most times, secondary indexes will not be created to avoid the hotspot of the insert. In this scene, we can also reduce some unnecessary scans by caching the statistical information. For example, we can cache the created time of the oldest row for each region after a job finished. When the next job starts, it can check the all the regions, if the data of one region do not have any updates and its cached time is not expired, just skip that region.
- If a TTL table has some Tiflash replicas, we can the TiFlash instead of TiKV.
- In the future, we can schedule the tasks from one job to multiple nodes instead of executing them only in one node. This approach will improve the resource utilization of the cluster. It also means we can execute more tasks in concurrency at the same time that makes the scan and delete faster.
- If a table does not have any secondary index, we can do some further optimizations. One optimization is that to push down the scan and delete to TiKV side without data exchanging between TiDB and TiKV. It is somewhat like what GCWorker dose. In this way, a new coprocessor command "TTLGC" will be introduced and when a job starts, TiDB will send "TTLGC" commands to each region and TiKV will then scan and delete the expired rows (TiKV should delete expired rows in a non-transactional way).
Copy link
Contributor

@SunRunAway SunRunAway Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • scan with stale read to avoid lock contention with main traffic write
  • scan and delete with LOW_PRIORITY?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

integrate with resource control framework?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scan and delete with LOW_PRIORITY has already been implemented in the first version

Copy link
Member

@bb7133 bb7133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 17, 2023
Copy link
Member

@YangKeao YangKeao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 17, 2023
@bb7133
Copy link
Member

bb7133 commented Mar 17, 2023

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: a202c34

@ti-chi-bot ti-chi-bot added status/can-merge Indicates a PR has been approved by a committer. and removed status/can-merge Indicates a PR has been approved by a committer. labels Mar 17, 2023
@lcwangchao
Copy link
Collaborator Author

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 87c3465

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 17, 2023
@ti-chi-bot ti-chi-bot merged commit b3973b3 into pingcap:master Mar 17, 2023
@lcwangchao lcwangchao deleted the ttl_doc branch May 16, 2023 06:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFC for TTL table support