Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimizer picks wrong driving table in nested loop join #56012

Open
mzhang77 opened this issue Sep 11, 2024 · 2 comments
Open

optimizer picks wrong driving table in nested loop join #56012

mzhang77 opened this issue Sep 11, 2024 · 2 comments

Comments

@mzhang77
Copy link

mzhang77 commented Sep 11, 2024

Bug Report

1. Minimal reproduce step (Required)

mysql> show create table d\G
*************************** 1. row ***************************
       Table: d
Create Table: CREATE TABLE `d` (
  `a` bigint(20) unsigned NOT NULL,
  `object_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `b` longblob DEFAULT NULL,
  `c` int(11) unsigned NOT NULL,
  PRIMARY KEY (`object_id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci AUTO_INCREMENT=1242984075789489121 /*T![auto_id_cache] AUTO_ID_CACHE=1 */
1 row in set (0.00 sec)

mysql> show create table i\G
*************************** 1. row ***************************
       Table: i
Create Table: CREATE TABLE `i` (
  `object_id` bigint(20) unsigned NOT NULL,
  `b` bigint(20) DEFAULT NULL,
  `a` varbinary(767) DEFAULT NULL,
  `c` bigint(20) DEFAULT NULL,
  `d` varbinary(767) DEFAULT NULL,
  `e` int(11) unsigned NOT NULL,
  PRIMARY KEY (`object_id`) /*T![clustered_index] CLUSTERED */,
  KEY `a` (`a`,`object_id`),
  KEY `b` (`d`,`object_id`),
  UNIQUE KEY `c` (`c`,`b`),
  KEY `d` (`c`,`b`,`object_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
1 row in set (0.00 sec)

explain SELECT
  d.*
FROM
  i 
  LEFT JOIN d ON i.`object_id` = d.`object_id`
WHERE
  i.`a` = 0xaaa
ORDER BY
  i.`object_id`
LIMIT
  20;

2. What did you expect to see? (Required)

Execution plan should use index nested loop join on table d because there is a where predicate on an indexed column on table i.

3. What did you see instead (Required)

mysql> explain SELECT
    ->   d.*
    -> FROM
    ->   i 
    ->   LEFT JOIN d ON i.`object_id` = d.`object_id`
    -> WHERE
    ->   i.`a` = 0xaaa
    -> ORDER BY
    ->   i.`object_id`
    -> LIMIT
    ->   20;
+-----------------------------------+---------+-----------+--------------------------------+------------------------------------------------------------------------+
| id                                | estRows | task      | access object                  | operator info                                                          |
+-----------------------------------+---------+-----------+--------------------------------+------------------------------------------------------------------------+
| Projection_12                     | 0.00    | root      |                                | test.d.a, test.d.object_id, test.d.b, test.d.c                         |
| └─Limit_19                        | 0.00    | root      |                                | offset:0, count:20                                                     |
|   └─MergeJoin_56                  | 0.00    | root      |                                | left outer join, left key:test.i.object_id, right key:test.d.object_id |
|     ├─TableReader_44(Build)       | 36.00   | root      |                                | data:TableFullScan_43                                                  |
|     │ └─TableFullScan_43          | 36.00   | cop[tikv] | table:d                        | keep order:true                                                        |
|     └─Limit_35(Probe)             | 0.00    | root      |                                | offset:0, count:20                                                     |
|       └─IndexReader_42            | 0.00    | root      |                                | index:Limit_41                                                         |
|         └─Limit_41                | 0.00    | cop[tikv] |                                | offset:0, count:20                                                     |
|           └─IndexRangeScan_40     | 0.00    | cop[tikv] | table:i, index:a(a, object_id) | range:["\n\xaa","\n\xaa"], keep order:true                             |
+-----------------------------------+---------+-----------+--------------------------------+------------------------------------------------------------------------+
9 rows in set (0.01 sec)

mysql> explain SELECT /*+ inl_join(d) */
    ->   d.*
    -> FROM
    ->   i 
    ->   LEFT JOIN d ON i.`object_id` = d.`object_id`
    -> WHERE
    ->   i.`a` = 0xaaa
    -> ORDER BY
    ->   i.`object_id`
    -> LIMIT
    ->   20;
+-------------------------------------+---------+-----------+--------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                  | estRows | task      | access object                  | operator info                                                                                                                                    |
+-------------------------------------+---------+-----------+--------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_12                       | 0.00    | root      |                                | test.d.a, test.d.object_id, test.d.b, test.d.c                                                                                                   |
| └─Limit_19                          | 0.00    | root      |                                | offset:0, count:20                                                                                                                               |
|   └─IndexJoin_46                    | 0.00    | root      |                                | left outer join, inner:TableReader_43, outer key:test.i.object_id, inner key:test.d.object_id, equal cond:eq(test.i.object_id, test.d.object_id) |
|     ├─Limit_54(Build)               | 0.00    | root      |                                | offset:0, count:20                                                                                                                               |
|     │ └─IndexReader_59              | 0.00    | root      |                                | index:Limit_58                                                                                                                                   |
|     │   └─Limit_58                  | 0.00    | cop[tikv] |                                | offset:0, count:20                                                                                                                               |
|     │     └─IndexRangeScan_38       | 0.00    | cop[tikv] | table:i, index:a(a, object_id) | range:["\n\xaa","\n\xaa"], keep order:true                                                                                                       |
|     └─TableReader_43(Probe)         | 0.00    | root      |                                | data:TableRangeScan_42                                                                                                                           |
|       └─TableRangeScan_42           | 0.00    | cop[tikv] | table:d                        | range: decided by [test.i.object_id], keep order:false                                                                                           |
+-------------------------------------+---------+-----------+--------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.01 sec)

4. What is your TiDB version? (Required)

mysql> select @@Version;
+--------------------+
| @@Version |
+--------------------+
| 8.0.11-TiDB-v7.5.1 |
+--------------------+
1 row in set (0.00 sec)

@mzhang77 mzhang77 added the type/bug The issue is confirmed as a bug. label Sep 11, 2024
@mzhang77
Copy link
Author

mzhang77 commented Sep 11, 2024

mysql> show stats_meta;
+-------------------+--------------------------+----------------+---------------------+--------------+-----------+
| Db_name           | Table_name               | Partition_name | Update_time         | Modify_count | Row_count |
+-------------------+--------------------------+----------------+---------------------+--------------+-----------+
| test              | i                        |                | 2024-09-11 00:17:55 |      6378996 |        36 |
| test              | d                        |                | 2024-09-11 00:17:55 |    280170547 |        36 |
+-------------------+--------------------------+----------------+---------------------+--------------+-----------+

table d only has 36 rows, so full table scan table d is still very fast. However per test, using hint inl_join(d) is still more than 10 times faster than the default plan optimizer chooses. Also the defaut plan chosen by optimizer is risky since statistics can be inaccurate. The execution plan using hint inl_join(d) is safer when data grows and statistics becomes out of date.

@mzhang77
Copy link
Author

replayer_OUvHRwpy4C5HFJ3gMfpyUg==_1726071458482602000.zip
uploading a plan replayer dump for reproducing the issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants