MySQL Partition pruning Explained

Partitioning is a process in which a single larger table is split into several smaller tables (physically) and still considered as a single table.It is generally a good idea for the tables whose size is in a few 100 GB’s.

While performing select,update,delete operations in a partitioned tables, we can notice a better performance in queries .This simple process of optimization is called partition pruning in which we can avoid scanning the certain partitions which does not have any matching values based on partition key.

The following example will explain you in detail about the partition pruning

Here I have used an Amazon linux 2 machine

Hardware Info

1 core CPU
1 GB RAM 
20 GB Disk (SSD) 
maximum of 3000 IOPS

MySQL 5.7 is installed in this machine and two tables (one is not partitioned and other is partitioned) with 1.7 million records of  same data is loaded in using Sysbench.

Table Structure without partition

mysql> show create table sbtest.sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT ,
`pad` char(60) NOT NULL DEFAULT ,
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`created`)
) ENGINE=InnoDB AUTO_INCREMENT=1734978 DEFAULT CHARSET=latin1

I have partitioned the second table with the help of the partition manager.we have explained about partition manager in our previous blog. partition_manager

mysql> show create table sbtest.sbtest1\G
*************************** 1. row ***************************
Table: sbtest1
Create Table: CREATE TABLE `sbtest1`
`id` int(11) NOT NULL AUTO_INCREMENT,
`k` int(11) NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`,`created`)
) ENGINE=InnoDB AUTO_INCREMENT=1734978 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (unix_timestamp(created))
(PARTITION p_START VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p_1544832000 VALUES LESS THAN (1544832000) ENGINE = InnoDB,
PARTITION p_1547424000 VALUES LESS THAN (1547424000) ENGINE = InnoDB,
PARTITION p_1550016000 VALUES LESS THAN (1550016000) ENGINE = InnoDB,
PARTITION p_1552608000 VALUES LESS THAN (1552608000) ENGINE = InnoDB,
PARTITION p_1555200000 VALUES LESS THAN (1555200000) ENGINE = InnoDB,
PARTITION p_1557792000 VALUES LESS THAN (1557792000) ENGINE = InnoDB,
PARTITION p_1560384000 VALUES LESS THAN (1560384000) ENGINE = InnoDB,
PARTITION p_1562976000 VALUES LESS THAN (1562976000) ENGINE = InnoDB,
PARTITION p_1565568000 VALUES LESS THAN (1565568000) ENGINE = InnoDB,
PARTITION p_1568160000 VALUES LESS THAN (1568160000) ENGINE = InnoDB,
PARTITION p_1570752000 VALUES LESS THAN (1570752000) ENGINE = InnoDB,
PARTITION p_1573344000 VALUES LESS THAN (1573344000) ENGINE = InnoDB,
PARTITION p_1575936000 VALUES LESS THAN (1575936000) ENGINE = InnoDB,
PARTITION p_1578528000 VALUES LESS THAN (1578528000) ENGINE = InnoDB,
PARTITION p_1581120000 VALUES LESS THAN (1581120000) ENGINE = InnoDB,
PARTITION p_1583712000 VALUES LESS THAN (1583712000) ENGINE = InnoDB,
PARTITION p_1586304000 VALUES LESS THAN (1586304000) ENGINE = InnoDB,
PARTITION p_1588896000 VALUES LESS THAN (1588896000) ENGINE = InnoDB,
PARTITION p_1591488000 VALUES LESS THAN (1591488000) ENGINE = InnoDB,
PARTITION p_1594080000 VALUES LESS THAN (1594080000) ENGINE = InnoDB,
PARTITION p_1596672000 VALUES LESS THAN (1596672000) ENGINE = InnoDB,
PARTITION p_1599264000 VALUES LESS THAN (1599264000) ENGINE = InnoDB,
PARTITION p_1607040000 VALUES LESS THAN (1607040000) ENGINE = InnoDB,
PARTITION p_END VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

Now Let us try to fetch the same data from both tables.we will see the difference between them in the Explain plan.

mysql> explain select * from sbtest1.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1574797
filtered: 11.11
Extra: Using where

The table without partition nearly examines the full table scan and it takes around 8 secs to fetch the records.

mysql> select * from sbtest1.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00";
0f7550ef001dab270bd4a6849151ce79  -
175616 rows in set (8.25 sec)

Repeating the same in partitioned table

mysql> explain select * from sbtest.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00"\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: sbtest1
partitions: p_1550016000,p_1552608000,p_1555200000,p_1557792000
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using where
mysql>  select * from sbtest.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-12 00:00:00";
0f7550ef001dab270bd4a6849151ce79  -
175616 rows in set (0.47 sec)

It is taken only less than 0.5 seconds to fetch the same data.Instead of searching the entire table, it scans only the certain partition where the matched value will be present.By limiting the search,it is possible to find matching rows in short span when compared with unpartitioned tables.This process of avoiding the unwanted partitions is known as pruning.

partition pruning-3

The below pictorial representation shows the disk operations while fetching the data

1)while scanning the full table

image (1)

image (2)

Without partition, it consumes more resources.Disk IOPS reaches the maximum of 900 and utilization goes above 95%

2) While scanning only specific partitions(Pruning)

pasted_image_at_2019-01-06__11_47_pmimage (3)

While pruning the specific partitions are accessed and consumption of system resources is greatly reduced. The optimiser performs pruning in the following two cases (in Where clause)

A) Partition_column = constant [Eq Ref]

B) Partition_column in (constant 1, constant 2,….) [Range]

In case A,

The optimiser search against a single value(key) and find a partition where the matching value will be present and scans only that partition

In case B,

The optimiser search for each given values in the where clause and makes the list of matched partitioning and then scan only that partitioning.

One of our clients reported an issue that partition is not pruning is not working well as expected. While  checking the issue further, we analysed that they had given incorrect values in WHERE clause. If we are giving invalid values in where clause, it will be treated as NULL. So partition will not be efficient for those kind of scenarios.

mysql> explain select * from sbtest.sbtest1 where created between "2019-01-31 00:00:00" and "2019-05-121 00:00:00"\G 
*************************** 1. row *************************** 
id: 1 
select_type: SIMPLE 
table: sbtest1 
partitions:p_1550016000,p_1552608000,p_1555200000,p_1557792000,p_1560384000,p_1562976000,p_1565568000,p_1568160000,p_1570752000,p_1573344000,p_1575936000,p_1578528000,p_1581120000,p_1583712000,p_1586304000,p_1588896000,p_1591488000,p_1594080000,p_1596672000,p_1599264000,p_1601856000,p_END          
type: ALL 
possible_keys: NULL 
key: NULL 
key_len: NULL 
ref: NULL 
rows: 1 
filtered: 100.00 
Extra: Using where

The wrong value in above where clause causes the optimiser to scan all the partitions.Here the pruning is not working.

As like read operations(select), write operations (only update,insert is not supported currently)can also speed up by this process.

mysql> explain update sbtest1.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00"\G 
*************************** 1. row *************************** 
id: 1 
select_type: UPDATE 
table: sbtest1 
partitions: NULL 
type: index 
possible_keys: NULL 
key: PRIMARY 
key_len: 8 
ref: NULL 
rows: 1574797 
filtered: 100.00 
Extra: Using where 
1 row in set (0.00 sec)
mysql> update sbtest1.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00"; 
Query OK, 353397 rows affected (17.36 sec) 
Rows matched: 353397  Changed: 353397 Warnings: 0

To update  rows, it took nearly 18 seconds.Also have a look on the below graphs. Again with partitioned tables

pasted_image_at_2019-01-06__11_47_pm (1)image (4)

mysql> explain update sbtest.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00"\G
*************************** 1. row *************************** 
id: 1 
select_type: UPDATE 
table: sbtest1 partitions:p_1547424000,p_1550016000,p_1552608000,p_1555200000,p_1557792000,p_1560384000,p_1562976000,p_1565568000 
type: index 
possible_keys: NULL 
key: PRIMARY 
key_len: 8 
ref: NULL 
rows: 1 
filtered: 100.00 
Extra: Using where 
1 row in set (0.00 sec)
mysql> update sbtest.sbtest1 set k=100 where created between "2018-12-25 09:08:00" and "2019-07-17 00:00:00";
Query OK, 353397 rows affected (2.51 sec) 
Rows matched: 353397  Changed: 353397 Warnings: 0

pasted_image_at_2019-01-06__11_48_pm

image (5)

Benefits of partition pruning:

  1. Faster looks up with Point Select and range conditions
  2. Better response time with write operations updates and deletes
  3. Reduced system resources like IO utilisation.

This is all about partition pruning and how it will be useful in large tables of production environment.

Advertisements

2 thoughts on “MySQL Partition pruning Explained

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s