Getting Started with Mysqlpump ( Parallel mysqldump)

MySQL 5.7 is about to release. Our DBA’s has started to explore a few feature from the complete list of features.

Mysqlpump is a new client utility for performing logical backup that is being currently shipped with version MySQL – 5.7.8-rc. This could be the replacement for traditional Mysqldump utility. It was criticized for being a single threaded process but now MySQL has come with a parallelism and compression. ( We have an open source equivalent in percona called mydumper )

Now let’s discuss about the key features and current limitation of mysqlpump before exploring the options.

Key Features:

  • Multi-threaded parallel processing of schemas and object to speed the dumping process
  • Enhanced control over the databases and its objects.
  • Dumping of user accounts as account management statements such as CREATE USER and GRANT, rather than as insert statement to MySQLDB.
  • Ability to create compressed output
  • Option to view the progress of the dumping.
  • Faster reloading for InnoDB tables by adding secondary indexes after the rows are being inserted.

Current Limitations:

  • The dump threads lack a synchronization point before they start backing up the data. This makes it currently unsafe as a general-purpose backup replacement (for mysqldump).
  • Since utility is quite new and its bound to have some draw backs and the MySQL team is currently working to fix those issues at the earliest. Memory leak is there on some scenarios.

Restrictions:

Mysqlpump doesn’t dump system databases like INFORMATION_SCHEMA, performance_schema, ndbinfo, or sys schema by default like mysqldump.

Now let’s look at the various important options to take backup, since the options are pretty much similar to mysqldump.

Below is the environment where we have ran the test

OS : CentOS release 6.3 (Final)
RAM : 8G
Cores : 4
Datasize: 33G.

Full backup:

By default when the mysqlpump option is called with the connection parameter it will go for a full backup and the progression will be indicated

#/usr/local/mysql_57/bin/mysqlpump –u root –p –S /var/lib/mysql_old/mysql.sock –P 3307 > all.sql

[root@mysqltest.mydbops.com]# mysqlpump -u root -p > all.sql
Enter password:
Dump progress: 0/1 tables, 250/66233749 rows
Dump progress: 22/25 tables, 160142/129692203 rows
Dump progress: 22/25 tables, 546392/129692203 rows
Dump progress: 22/25 tables, 916642/129692203 rows
Dump progress: 22/25 tables, 1287892/129692203 rows
Dump progress: 22/25 tables, 1670392/129692203 rows
Dump progress: 22/25 tables, 2053142/129692203 rows
Dump progress: 22/25 tables, 2438142/129692203 rows

Full backup can also be invoked by the option –all-databases

Partial Backup:

Partial backup refers to backup for certain database and tables, the usage is pretty much similar to mysqldump.

#/usr/local/mysql_57/bin/mysqlpump -u root -p -S /var/lib/mysql_old/mysql.sock -P 3307 monty > monty.sql

We can see the progression showing the table as 1 and the row count for the particular table.

Database level backup

[root@mysqltest.mydbops.com]# mysqlpump -u root -p –databases monty > monty.sql
Enter password:
Dump progress: 0/1 tables, 250/66233749 rows
Dump progress: 0/1 tables, 227250/66233749 rows
Dump progress: 0/1 tables, 548750/66233749 rows
Dump progress: 0/1 tables, 911750/66233749 rows
Dump progress: 0/1 tables, 1274750/66233749 rows
Dump progress: 0/1 tables, 1637750/66233749 rows
Dump progress: 0/1 tables, 2000750/66233749 rows
.
.
Dump completed in 669004 milliseconds

Table level backup:


[root@mysqltest.mydbops.com]# mysqlpump -u root -p monty grptag > grptag.sql
Enter password:
Dump progress: 0/1 tables, 250/66233749 rows
Dump progress: 0/1 tables, 232750/66233749 rows
Dump progress: 0/1 tables, 507000/66233749 rows
Dump progress: 0/1 tables, 709500/66233749 rows

Mysqlpump Parallelism:

Parallelism is one of the most important feature of mysqlpump, to increase the speed of backup process, the options which control the parallel processing are

By using mysqlpump we can select concurrency between databases and also within database

–default-parallelism=N : This option defines the default number of threads used for each queue, the default number of threads is 2 if N is not defined

–parallel-schemas=[N:]db_list: This option is used to setup the processing queue for particular defined databases, the db_list could be a comma separated list of databases, If the option begins with N: the queue uses N threads, else the –default-parallelism defines the value of N

Let us see how this variables are effectively used.

In the below option we are defining the parallel threads is 4

#/usr/local/mysql_57/bin/mysqlpump -u root -p -S /var/lib/mysql_old/mysql.sock -P 3307 –default-parallelism=4 –all-databases > parallel.sql

[root@mysqltest.mydbops.com]# mysqlpump -u root -p –default-parallelism=4 –all-databases > all_db.sql
Enter password:
Dump progress: 0/2 tables, 2/66233751 rows
Dump progress: 23/25 tables, 316142/129692203 rows
Dump progress: 23/25 tables, 740642/129692203 rows
Dump progress: 23/25 tables, 1107142/129692203 rows
Dump progress: 23/25 tables, 1316892/129692203 rows
Dump progress: 23/25 tables, 1536142/129692203 rows

If you analyze the mysql processlist you can see the 5 dump threads inside the server as below.

mysql> show processlist;
+—-+——+———–+——-+———+——+————–+———————————————————————————+
| Id | User | Host     | db   | Command | Time | State       | Info                                                                           |
+—-+——+———–+——-+———+——+————–+———————————————————————————+
| 11 | root | localhost | monty | Sleep   | 1974 |             | NULL                                                                           |
| 15 | root | localhost | NULL | Query   |   0 | starting     | show processlist                                                               |
| 16 | root | localhost | NULL | Sleep   |   40 |             | NULL                                                                           |
| 17 | root | localhost | NULL | Sleep   |   40 |             | NULL                                                                           |
| 18 | root | localhost | NULL | Query   |   41 | Sending data | SELECT SQL_NO_CACHE * FROM `monty`.`grptag`                                     |
| 19 | root | localhost | NULL | Sleep   |   40 |             | NULL                                                                           |
| 20 | root | localhost | NULL | Query   |   40 | Sending data | SELECT SQL_NO_CACHE * FROM `sakila`.`group_xxxxxxxxxxxxxxx` |
+—-+——+———–+——-+———+——+————–+———————————————————————————+

7 rows in set (0.00 sec)
Now we will be defining parallelism in the database level. In database level the queries .

# /usr/local/mysql_57/bin/mysqlpump -u root -p -S /var/lib/mysql_old/mysql.sock -P 3307 –parallel-schemas=4:monty > database_ll.sql

[root@mysqltest.mydbops.com]# mysqlpump -u root -p –default-parallelism=4 –parallel-schemas=3:monty > all_db.sql
Enter password:
Dump progress: 0/2 tables, 2/66233751 rows
Dump progress: 23/25 tables, 316642/129692203 rows
Dump progress: 23/25 tables, 734142/129692203 rows
Dump progress: 23/25 tables, 1068892/129692203 rows
Dump progress: 23/25 tables, 1296642/129692203 rows
Dump progress: 23/25 tables, 1489642/129692203 rows

In order to disable parallelism we can set –default-paralleism=0. Then it will be like a mysqldump utility.

Mysqlpump Compression:

[root@mysqltest.mydbops.com]# mysqlpump -u root -p –default-parallelism=4 –compress –compress-output=LZ4 –databases monty > monty_compress.sql
Enter password:
Dump progress: 0/1 tables, 250/66233749 rows
Dump progress: 0/1 tables, 97500/66233749 rows
Dump progress: 0/1 tables, 232000/66233749 rows
Dump progress: 0/1 tables, 362500/66233749 rows
Dump progress: 0/1 tables, 479750/66233749 rows

.
.
Dump completed in 1078812 milliseconds

Mysqlpump supports compression of backup directly, by shipping the below options. They use Zlib library.

–compress
–compress-output=LZ4 or ZLIB

Once the compression is done decompression has to done. Based on the algorithm used.

shell> lz4 -d input_file output_file
shell> openssl zlib -d < input_file > output_file

We will explore much more features of MySQL 5.7 in the upcoming blogs.

 

Advertisements

2 thoughts on “Getting Started with Mysqlpump ( Parallel mysqldump)

  1. Team,
    I want to take mysqlpump between 2 date ranges like below. Can we do this please suggest.

    Can we use where clause on mysqlpump. I want to fatch the data for selected date range like

    mysqlpump –user= –password= –database –tables where data >= ‘2015-12-01’ and ‘2016-12-01’ > dump.sql

    Liked by 1 person

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s