
Why did performance drop sharply after partitioning the database table?

Pendragon · 2020-06-17 23:10:58 +08:00 · 3200 views

    Originally I had 12 tables split by month, each holding roughly 20 million rows.

    ...
    CREATE TABLE `parcel_with_no_partition_m05` (
    ...
    )
    CREATE TABLE `parcel_with_no_partition_m06` (
      `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'surrogate primary key, meaningless',
      `equip_id` int(11) DEFAULT NULL,
      `scan_id` int(11) DEFAULT NULL,
      `create_datetime` datetime DEFAULT NULL,
      `minute_of_hour` int(11) DEFAULT NULL,
      `hour_of_day` int(11) DEFAULT NULL,
      `day_of_month` int(11) DEFAULT NULL,
      `week_of_year` int(11) DEFAULT NULL,
      `month_of_year` int(11) DEFAULT NULL,
      `parcel_length` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
    
    CREATE TABLE `parcel_with_no_partition_m07` (
    ...
    )
    ...
    

    Now I want to try MySQL 8's native partitioning instead: a single table holding all 12 months of data (over 200 million rows in total), with each partition holding around 20 million rows, the same data as the original tables.

    CREATE TABLE `parcel_with_partition` (
      `equip_id` int(11) DEFAULT NULL,
      `scan_id` int(11) DEFAULT NULL,
      `create_datetime` datetime DEFAULT NULL,
      `minute_of_hour` int(11) DEFAULT NULL,
      `hour_of_day` int(11) DEFAULT NULL,
      `day_of_month` int(11) DEFAULT NULL,
      `week_of_year` int(11) DEFAULT NULL,
      `month_of_year` int(11) DEFAULT NULL,
      `parcel_length` int(11) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 PARTITION BY RANGE(`month_of_year`) (
    PARTITION `JAN` VALUES LESS THAN (2),
    PARTITION `FEB` VALUES LESS THAN (3),
    PARTITION `MAR` VALUES LESS THAN (4),
    PARTITION `APR` VALUES LESS THAN (5),
    PARTITION `MAY` VALUES LESS THAN (6),
    PARTITION `JUN` VALUES LESS THAN (7),
    PARTITION `JUL` VALUES LESS THAN (8),
    PARTITION `AUG` VALUES LESS THAN (9),
    PARTITION `SEP` VALUES LESS THAN (10),
    PARTITION `OCT` VALUES LESS THAN (11),
    PARTITION `NOV` VALUES LESS THAN (12),
    PARTITION `DEC` VALUES LESS THAN (13)
    );
    

    I ran one of our business SQL queries against it and found it took about twice as long.

    # fetch data from the MySQL native partition
    select sql_no_cache count(*) as num,`day_of_month` as day,`hour_of_day` as hour,`scan_id` as scanId from `parcel_with_partition` partition(`JUN`) where `equip_id` = 3 and `create_datetime` >'2020-06-01' and `create_datetime` <= '2020-06-29' group by `scan_id`,`day_of_month`,`hour_of_day`
    2018 rows in set (59.56 sec)
    
    # fetch data from the original per-month table
     select sql_no_cache count(*) as num,`day_of_month` as day,`hour_of_day` as hour,`scan_id` as scanId from `parcel_with_no_partition_m06` where `equip_id` = 3 and `create_datetime` >'2020-06-01' and `sort_datetime` <= '2020-06-29' group by `scan_id`,`day_of_month`,`hour_of_day`
     2018 rows in set, 1 warning (29.13 sec)
    

    I then used EXPLAIN and the Performance Schema to examine both queries.
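    (The post doesn't say how the counters below were isolated. A minimal sketch of one way to get comparable numbers: reset the Performance Schema summary tables before each run; truncating these tables only zeroes the counters, it does not touch any data.)

    -- reset the aggregated counters before each test run
    TRUNCATE TABLE performance_schema.table_io_waits_summary_by_table;
    TRUNCATE TABLE performance_schema.file_summary_by_instance;
    -- run one of the two test queries, then inspect the counters, e.g.:
    SELECT * FROM performance_schema.table_io_waits_summary_by_table WHERE count_star > 0\G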

    Query details for the MySQL native partition

    explain:
    +----+-------------+-----------------------+------------+------+---------------+------+---------+------+----------+----------+------------------------------+
    | id | select_type | table                 | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra                        |
    +----+-------------+-----------------------+------------+------+---------------+------+---------+------+----------+----------+------------------------------+
    |  1 | SIMPLE      | parcel_with_partition | JUN        | ALL  | NULL          | NULL | NULL    | NULL | 20481855 |     1.11 | Using where; Using temporary |
    +----+-------------+-----------------------+------------+------+---------------+------+---------+------+----------+----------+------------------------------+
    
    table i/o:
    select * from table_io_waits_summary_by_table where count_star>0 \G;
    *************************** 1. row ***************************
         OBJECT_TYPE: TABLE
       OBJECT_SCHEMA: test
         OBJECT_NAME: parcel_with_partition
          COUNT_STAR: 22123040
      SUM_TIMER_WAIT: 55709022384855
      MIN_TIMER_WAIT: 55709022384855
      AVG_TIMER_WAIT: 2517879
      MAX_TIMER_WAIT: 55709022384855
          COUNT_READ: 22123040
      SUM_TIMER_READ: 55709022384855
      MIN_TIMER_READ: 55709022384855
      AVG_TIMER_READ: 2517879
      MAX_TIMER_READ: 55709022384855
         COUNT_WRITE: 0
     SUM_TIMER_WRITE: 0
     MIN_TIMER_WRITE: 0
     AVG_TIMER_WRITE: 0
     MAX_TIMER_WRITE: 0
         COUNT_FETCH: 22123040
     SUM_TIMER_FETCH: 55709022384855
     MIN_TIMER_FETCH: 55709022384855
     AVG_TIMER_FETCH: 2517879
     MAX_TIMER_FETCH: 55709022384855
        COUNT_INSERT: 0
    SUM_TIMER_INSERT: 0
    MIN_TIMER_INSERT: 0
    AVG_TIMER_INSERT: 0
    MAX_TIMER_INSERT: 0
        COUNT_UPDATE: 0
    SUM_TIMER_UPDATE: 0
    MIN_TIMER_UPDATE: 0
    AVG_TIMER_UPDATE: 0
    MAX_TIMER_UPDATE: 0
        COUNT_DELETE: 0
    SUM_TIMER_DELETE: 0
    MIN_TIMER_DELETE: 0
    AVG_TIMER_DELETE: 0
    MAX_TIMER_DELETE: 0
    
    file i/o:
                    FILE_NAME: /home/wedo/mysql/data/mysql/test/parcel_with_partition#p#jun.ibd
                   EVENT_NAME: wait/io/file/innodb/innodb_data_file
        OBJECT_INSTANCE_BEGIN: 140081009323904
                   COUNT_STAR: 422002
               SUM_TIMER_WAIT: 3965120084812
               MIN_TIMER_WAIT: 5739112
               AVG_TIMER_WAIT: 9395831
               MAX_TIMER_WAIT: 498159493
                   COUNT_READ: 422002
               SUM_TIMER_READ: 3965120084812
               MIN_TIMER_READ: 5739112
               AVG_TIMER_READ: 9395831
               MAX_TIMER_READ: 498159493
     SUM_NUMBER_OF_BYTES_READ: 6914080768
                  COUNT_WRITE: 0
              SUM_TIMER_WRITE: 0
              MIN_TIMER_WRITE: 0
              AVG_TIMER_WRITE: 0
              MAX_TIMER_WRITE: 0
    SUM_NUMBER_OF_BYTES_WRITE: 0
                   COUNT_MISC: 0
               SUM_TIMER_MISC: 0
               MIN_TIMER_MISC: 0
               AVG_TIMER_MISC: 0
               MAX_TIMER_MISC: 0
    
    

    Query details for the original per-month table

    explain:
    +----+-------------+------------------------------+------------+------+---------------+------+---------+------+----------+----------+------------------------------+
    | id | select_type | table                        | partitions | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra                        |
    +----+-------------+------------------------------+------------+------+---------------+------+---------+------+----------+----------+------------------------------+
    |  1 | SIMPLE      | parcel_with_no_partition_m06 | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 22123040 |     1.11 | Using where; Using temporary |
    +----+-------------+------------------------------+------------+------+---------------+------+---------+------+----------+----------+------------------------------+
    
    table i/o:
         OBJECT_TYPE: TABLE
       OBJECT_SCHEMA: test
         OBJECT_NAME: parcel_with_no_partition_m06
          COUNT_STAR: 22123040
      SUM_TIMER_WAIT: 23182194259854
      MIN_TIMER_WAIT: 23182194259854
      AVG_TIMER_WAIT: 1047813
      MAX_TIMER_WAIT: 23182194259854
          COUNT_READ: 22123040
      SUM_TIMER_READ: 23182194259854
      MIN_TIMER_READ: 23182194259854
      AVG_TIMER_READ: 1047813
      MAX_TIMER_READ: 23182194259854
         COUNT_WRITE: 0
     SUM_TIMER_WRITE: 0
     MIN_TIMER_WRITE: 0
     AVG_TIMER_WRITE: 0
     MAX_TIMER_WRITE: 0
         COUNT_FETCH: 22123040
     SUM_TIMER_FETCH: 23182194259854
     MIN_TIMER_FETCH: 23182194259854
     AVG_TIMER_FETCH: 1047813
     MAX_TIMER_FETCH: 23182194259854
        COUNT_INSERT: 0
    SUM_TIMER_INSERT: 0
    MIN_TIMER_INSERT: 0
    AVG_TIMER_INSERT: 0
    MAX_TIMER_INSERT: 0
        COUNT_UPDATE: 0
    SUM_TIMER_UPDATE: 0
    MIN_TIMER_UPDATE: 0
    AVG_TIMER_UPDATE: 0
    MAX_TIMER_UPDATE: 0
        COUNT_DELETE: 0
    SUM_TIMER_DELETE: 0
    MIN_TIMER_DELETE: 0
    AVG_TIMER_DELETE: 0
    MAX_TIMER_DELETE: 0
    
    file i/o:
                    FILE_NAME: /home/wedo/mysql/data/mysql/test/parcel_with_no_partition_m06.ibd
                   EVENT_NAME: wait/io/file/innodb/innodb_data_file
        OBJECT_INSTANCE_BEGIN: 140081009372480
                   COUNT_STAR: 424687
               SUM_TIMER_WAIT: 4045883911263
               MIN_TIMER_WAIT: 5301220
               AVG_TIMER_WAIT: 9526557
               MAX_TIMER_WAIT: 861143891
                   COUNT_READ: 424687
               SUM_TIMER_READ: 4045883911263
               MIN_TIMER_READ: 5301220
               AVG_TIMER_READ: 9526557
               MAX_TIMER_READ: 861143891
     SUM_NUMBER_OF_BYTES_READ: 6958071808
                  COUNT_WRITE: 0
              SUM_TIMER_WRITE: 0
              MIN_TIMER_WRITE: 0
              AVG_TIMER_WRITE: 0
              MAX_TIMER_WRITE: 0
    SUM_NUMBER_OF_BYTES_WRITE: 0
                   COUNT_MISC: 0
               SUM_TIMER_MISC: 0
               MIN_TIMER_MISC: 0
               AVG_TIMER_MISC: 0
               MAX_TIMER_MISC: 0
    

    Nothing looks abnormal in the EXPLAIN output or the file I/O stats, but in the table I/O stats the native partition spends far more time than the original per-month table.

    Does this mean the extra time is being lost in the logical I/O part? Why would partitioning lead to this result? I'd appreciate any pointers.
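    (One way to check that, not shown in the post, is to compare the session handler counters for the two queries; Handler_read_rnd_next counts rows read by a full scan, so a large gap there would point at the logical-read layer.)

    -- a sketch: compare logical row reads for the two queries
    FLUSH STATUS;                               -- reset this session's status counters
    -- ... run the partitioned-table query here ...
    SHOW SESSION STATUS LIKE 'Handler_read%';
    FLUSH STATUS;
    -- ... run the per-month-table query here ...
    SHOW SESSION STATUS LIKE 'Handler_read%';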

    Environment: OS: CentOS 7; database: MySQL 8; server: 12 cores @ 2.5 GHz, 64 GB RAM; disk: spinning HDD; peak memory usage while running the query was about 3 GB.

    10 replies · 2020-07-01 17:20:42 +08:00
    xuanbg · #1 · 2020-06-17 23:36:50 +08:00
    Can 20 million rows and 240 million rows really be treated the same? PARTITION BY RANGE is in fact still a single table; the data is just stored in separate pieces.
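    (For reference, a sketch of how to see that layout, i.e. each partition as its own tablespace with its own row count; the schema and table names are taken from the post, and TABLE_ROWS is only an estimate for InnoDB.)

    SELECT PARTITION_NAME, TABLE_ROWS, ROUND(DATA_LENGTH/1024/1024) AS data_mb
    FROM information_schema.PARTITIONS
    WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'parcel_with_partition';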
    ajsonx · #2 · 2020-06-18 00:02:23 +08:00
    I've tried a few of MySQL 8's window functions, mainly for convenience when writing queries. With 10 million rows of data, they all performed worse than equivalent join queries.
    Pendragon (OP) · #3 · 2020-06-18 00:09:11 +08:00
    @xuanbg That was my first thought too, but the profiling shows the disk I/O time is the same for both queries, and the peak memory usage matches the size of a single per-month table / partition, so the partitioned query doesn't seem to be reading all 240 million rows.
    mahone3297 · #4 · 2020-06-18 00:18:12 +08:00
    # fetch data from the MySQL native partition
    select sql_no_cache count(*) as num,`day_of_month` as day,`hour_of_day` as hour,`scan_id` as scanId from `parcel_with_partition` partition(`JUN`) where `equip_id` = 3 and `create_datetime` >'2020-06-01' and `create_datetime` <= '2020-06-29' group by `scan_id`,`day_of_month`,`hour_of_day`
    2018 rows in set (59.56 sec)

    # fetch data from the original per-month table
    select sql_no_cache count(*) as num,`day_of_month` as day,`hour_of_day` as hour,`scan_id` as scanId from `parcel_with_no_partition_m06` where `equip_id` = 3 and `create_datetime` >'2020-06-01' and `sort_datetime` <= '2020-06-29' group by `scan_id`,`day_of_month`,`hour_of_day`
    2018 rows in set, 1 warning (29.13 sec)


    * The two queries have different conditions?
    The partition query uses `create_datetime` >'2020-06-01' and `create_datetime` <= '2020-06-29'
    The per-month table query uses `create_datetime` >'2020-06-01' and `sort_datetime` <= '2020-06-29'
    * Also, you're selecting a single partition with from `parcel_with_partition` partition(`JUN`)? Could you run the comparison as from parcel_with_partition_JUN instead?
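    (For an apples-to-apples timing, the per-month query would presumably need the same predicate on both bounds; a sketch of the corrected version:)

    select sql_no_cache count(*) as num,`day_of_month` as day,`hour_of_day` as hour,`scan_id` as scanId
    from `parcel_with_no_partition_m06`
    where `equip_id` = 3 and `create_datetime` > '2020-06-01' and `create_datetime` <= '2020-06-29'
    group by `scan_id`,`day_of_month`,`hour_of_day`;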
    mahone3297 · #5 · 2020-06-18 00:19:28 +08:00
    The point of a partitioned table is that your WHERE clause contains the partitioning condition, so the SQL only has to look at a few partitions rather than all of them.
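    (A sketch of what that looks like here: with the partitioning column in the WHERE clause, the partitions column of EXPLAIN should list only JUN if pruning kicks in.)

    explain
    select sql_no_cache count(*) as num,`day_of_month` as day,`hour_of_day` as hour,`scan_id` as scanId
    from `parcel_with_partition`
    where `month_of_year` = 6
      and `equip_id` = 3
      and `create_datetime` > '2020-06-01' and `create_datetime` <= '2020-06-29'
    group by `scan_id`,`day_of_month`,`hour_of_day`;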
    flashrick · #6 · 2020-06-18 09:40:30 +08:00
    The way I see it, you're querying the partitioned table wrong. You partitioned by month_of_year, so the query should just be an ordinary SELECT, as long as the WHERE clause uses month_of_year. Go easy on me if I'm wrong.
    Pendragon (OP) · #7 · 2020-06-18 09:54:05 +08:00
    @flashrick
    @mahone3297
    You're right: specifying the month_of_year value in the WHERE clause also hits the corresponding partition. I used partition() here to force the partition so that the other conditions wouldn't affect the comparison; the end result is the same either way.
    Still4 · #8 · 2020-06-18 10:00:05 +08:00
    I looked at how partitioned tables are implemented: each partition gets its own file, so the two files being compared should be about the same size. With no index, both queries are full table scans, so the execution times shouldn't differ this much.

    How about making another copy of the parcel_with_no_partition_m06 table, to rule out the effect of inner vs. outer tracks on the spinning disk?
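    (A sketch of that copy; the _copy suffix is just an illustrative name, and copying roughly 20 million rows will take a while.)

    CREATE TABLE `parcel_with_no_partition_m06_copy` LIKE `parcel_with_no_partition_m06`;
    INSERT INTO `parcel_with_no_partition_m06_copy` SELECT * FROM `parcel_with_no_partition_m06`;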
    someonedeng · #9 · 2020-06-24 17:13:16 +08:00 via Android
    By the way, you already had the data split into per-month tables. Why merge them into one big table and then partition it, instead of partitioning the small tables directly?
    encro · #10 · 2020-07-01 17:20:42 +08:00
    Since you always query by create_datetime, just partition by create_datetime.

    http://mysql.taobao.org/monthly/2017/11/09/
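    (A sketch of that layout using RANGE COLUMNS so create_datetime can be compared directly; the table name, partition names, and boundaries are illustrative, and the column list is abbreviated.)

    CREATE TABLE `parcel_by_create_datetime` (
      `equip_id` int(11) DEFAULT NULL,
      `scan_id` int(11) DEFAULT NULL,
      `create_datetime` datetime DEFAULT NULL,
      `parcel_length` int(11) DEFAULT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    PARTITION BY RANGE COLUMNS(`create_datetime`) (
      PARTITION p202001 VALUES LESS THAN ('2020-02-01'),
      PARTITION p202002 VALUES LESS THAN ('2020-03-01'),
      -- ... one partition per month ...
      PARTITION p202012 VALUES LESS THAN ('2021-01-01'),
      PARTITION pmax    VALUES LESS THAN (MAXVALUE)
    );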