Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL_CALC_FOUND_ROWS behavior leads to data truncation #14448

Closed
wwar opened this issue Jan 11, 2020 · 8 comments · Fixed by #19506
Closed

SQL_CALC_FOUND_ROWS behavior leads to data truncation #14448

wwar opened this issue Jan 11, 2020 · 8 comments · Fixed by #19506
Labels
component/executor severity/major type/bug The issue is confirmed as a bug.

Comments

@wwar
Copy link

wwar commented Jan 11, 2020

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?

There is an open bug on SQL_CALC_FOUND_ROWS: #8235

I am not sure if it is well understood how serious this is. Here is a small test case:

CREATE TABLE t1 (a INT);
INSERT INTO t1 VALUES (1),(2),(3);
SELECT SQL_CALC_FOUND_ROWS * FROM t1 LIMIT 1;
SELECT FOUND_ROWS();
  1. What did you expect to see?

In MySQL:

mysql> SELECT FOUND_ROWS();
+--------------+
| FOUND_ROWS() |
+--------------+
|            3 |
+--------------+
1 row in set, 1 warning (0.00 sec)
  1. What did you see instead?

In TiDB:

mysql> SELECT FOUND_ROWS();
+--------------+
| FOUND_ROWS() |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)

The most common use-case for this feature is that applications show the first few results (lets say LIMIT 100 rows) and then paginate to show a count of total results. Because the TiDB behavior is to return the same number of rows as the LIMIT, the application will incorrectly think that there is only one page of results.

This query is used in Wordpress, Drupal, and all number of MySQL applications. It is deprecated in MySQL 8.0, but I suspect based on popularity will not be removed for some time.

Assuming that #8235 is not fixed, I would like to instead propose similar handling to #10065 where it is an error by default when attempting to execute this query, with an option to skip the error and downgrade to a warning.

It is not safe to just return a warning for two reasons:

  1. You can't trust that an application won't just discard the warning, and this is a potential data loss issue.
  2. MySQL 8.0 already returns a warning here (for deprecation!). Can't trust that an application will check the warning error code, and understand TiDB is warning for a more serious issue.

What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?

mysql> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v4.0.0-alpha-1334-g07e642c92
Git Commit Hash: 07e642c9230ccb7c1537b27442f1fe8433e65f8a
Git Branch: master
UTC Build Time: 2020-01-08 08:32:04
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.01 sec)
@wwar wwar added the type/bug The issue is confirmed as a bug. label Jan 11, 2020
@wwar wwar changed the title SQL_CALC_ROWS_FOUND behavior leads to data truncation SQL_CALC_FOUND_ROWS behavior leads to data truncation Jan 11, 2020
@XuHuaiyu
Copy link
Contributor

We can check from the document:

SQL_CALC_FOUND_ROWS| To guarantee compatibility with MySQL, TiDB parses this syntax, but will ignore it.

@XuHuaiyu
Copy link
Contributor

Hi,@wwar
Thank for your works on TiDB these days.
Are you doing some tests on TiDB now?

@XuHuaiyu
Copy link
Contributor

If you want, you can join the pingcap.com/tidbslack and discuss with others.

@wwar
Copy link
Author

wwar commented Jan 15, 2020

@XuHuaiyu It does parse-but-ignore, but then it populates FOUND_ROWS() :-( If not an error on the SELECT query, there should be one on the usage of the function (but I think the select query is a better fit).

My concern is that it is a silent data truncation issue, due to how the functionality is most commonly used. It is not good to bury it in the docs, because you can not rely on someone seeing it and understanding what it really means.

My usage of TiDB is currently purely hobby. It might turn professional at some point, I will try and join the slack when I get a chance :-) Thank you for the invitation.

@ghost
Copy link

ghost commented Jul 20, 2020

Verifying this still exists in master:

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a INT);
INSERT INTO t1 VALUES (1),(2),(3);
SELECT SQL_CALC_FOUND_ROWS * FROM t1 LIMIT 1;
SELECT FOUND_ROWS();

..

mysql> SELECT FOUND_ROWS();
+--------------+
| FOUND_ROWS() |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)
mysql> SELECT tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v4.0.0-beta.2-798-gd941ff5cc
Edition: Community
Git Commit Hash: d941ff5cc8b4babf9dcfdd91b66a5c53b798c122
Git Branch: master
UTC Build Time: 2020-07-18 05:54:02
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)

@ghost
Copy link

ghost commented Aug 28, 2020

Updating to match the severity of #19383 (a similarly unsafe parse but ignore issue).

@ghost ghost changed the title SQL_CALC_FOUND_ROWS behavior leads to data truncation Disable SQL_CALC_FOUND_ROWS/LOCK IN SHARE MODE by deault Aug 30, 2020
@ghost ghost changed the title Disable SQL_CALC_FOUND_ROWS/LOCK IN SHARE MODE by deault Disable SQL_CALC_FOUND_ROWS/LOCK IN SHARE MODE by default Aug 30, 2020
@ghost ghost changed the title Disable SQL_CALC_FOUND_ROWS/LOCK IN SHARE MODE by default SQL_CALC_FOUND_ROWS behavior leads to data truncation Aug 30, 2020
@sre-bot
Copy link
Contributor

sre-bot commented Sep 21, 2020

Integrity check:
component RCA symptom trigger_condition wa affect_version fix_version fields are empty
@zz-jason
Please comment /info to get template

@ti-srebot
Copy link
Contributor

ti-srebot commented Sep 21, 2020

Please edit this comment to complete the following information

Bug

Note: Make Sure that 'component', and 'severity' labels are added

1. Root Cause Analysis (RCA)

TiDB parses but ignores the flag SQL_CALC_FOUND_ROWS. This behavior is documented but can be dangerous, since it returns no errors or warnings. Given the common case of how this feature is used, it can lead to applications effectively receiving truncated result-sets from TiDB.

2. Symptom

Any query that includes the flag SQL_CALC_FOUND_ROWS.

3. All Trigger Conditions

The results are only wrong if the actual count is > the limit count, which might make it difficult to understand TiDB is wrong in simple cases.

4. Workaround (optional)

TiDB does not support this feature, so the workaround is that users should issue two queries. The first query retrieves the first N rows, the second query retrieves the count of rows.

5. Affected versions

[2.1.0,2.1.99],[3.0.0,3.0.99],[3.1.0,3.1.99],[4.0.0,4.0.99]

6. Fixed versions

unplaned

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/executor severity/major type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants