Faster Load data outfile in MySQL

While exporting the table with MySQL native utility, we don’t have any control on the process, and also there will be no progress update as well on the process completion. So when exporting the larger table will consume high resource utilization and also the disk space usage will also be high.

MySQL shell utility will make the process easier. It will export the table and we can import the data back with a parallel thread and also will provide the current progress status on export/import progress.

util.exportTable() utility was introduced in Shell – 8.0.22 version, will export the data in a controlled manner. We can store the data in either local or Cloud Infrastructure Object Storage bucket as well.

We will see about the compression ratio along with the time taken for native MySQL vs Shell utility

Feature :

  • Compression
  • Progress status
  • Supported output format – CSV,CSV-unix,TSV
  • Controlled process (Maxrate)
  • Output file to local or Oracle Cloud Infrastructure Object Storage bucket

Example :

MySQL  localhost  JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/sbtest.txt")
Initializing - done 
Gathering information - done 
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done       
Starting data dump
107% (26.22M rows / ~24.34M rows), 287.77K rows/s, 56.98 MB/s
Dump duration: 00:02:35s                                     
Total duration: 00:02:35s                                    
Data size: 5.18 GB                                           
Rows written: 26216172                                       
Bytes written: 5.18 GB                                       
Average throughput: 33.35 MB/s                               
                                                             
The dump can be loaded using:                                
util.importTable("file:///home/vagrant/sbtest.txt", {        
    "characterSet": "utf8mb4",
    "schema": "sbtest",
    "table": "sbtest1"
})                 

We should make sure the directory already exists. But default it will generate the file with fieldsTerminatedBy – TAB. At last, it will provide the import command which needs to use while restoring the data back.

Export as CSV :

To export the file as CSV we need to use the dialect option will determine the output format.

 MySQL  localhost  JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/sbtest.csv",{dialect:"csv"})
Initializing - done 
Gathering information - done 
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done       
Starting data dump
107% (26.22M rows / ~24.34M rows), 136.35K rows/s, 24.68 MB/s
Dump duration: 00:02:44s                                     
Total duration: 00:02:44s                                    
Data size: 5.31 GB                                           
Rows written: 26216172                                       
Bytes written: 5.31 GB                                       
Average throughput: 32.34 MB/s                               
                                                             
The dump can be loaded using:                                
util.importTable("file:///home/vagrant/sbtest.csv", {        
    "characterSet": "utf8mb4",
    "dialect": "csv",
    "schema": "sbtest",
    "table": "sbtest1"
})
 MySQL  localhost  JS >                 

Compression:

Two methods of compression zstd and gzip, By default the compression is disabled. We can enable this by adding a compression option.

MySQL localhost JS > util.exportTableMySQL  localhost  JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/sbtest.zstd",{compression:"zstd"})
Initializing - done 
Gathering information - done 
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done       
Starting data dump
107% (26.22M rows / ~24.34M rows), 135.43K rows/s, 26.62 MB/s uncompressed, 11.92 MB/s compressed
Dump duration: 00:03:43s                                                                         
Total duration: 00:03:43s                                                                        
Uncompressed data size: 5.18 GB                                                                  
Compressed data size: 2.32 GB                                                                    
Compression ratio: 2.2                                                                           
Rows written: 26216172                                                                           
Bytes written: 2.32 GB                                                                           
Average uncompressed throughput: 23.20 MB/s                                                      
Average compressed throughput: 10.41 MB/s                                                        
                                                                                                 
The dump can be loaded using:                                                                    
util.importTable("file:///home/vagrant/sbtest.zstd", {                                           
    "characterSet": "utf8mb4",
    "schema": "sbtest",
    "table": "sbtest1"
})
 MySQL  localhost  JS >                                           
                                                                        

Maxrate:

Exporting process should not become the bottleneck for other processes. To make it a controlled process, we can set the maximum number of bytes per second per thread for maxrate option.

MySQL  localhost  JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/sbtest_file.csv",{dialect:"csv",maxRate:"5000000",fieldsOptionallyEnclosed: true, fieldsTerminatedBy: ",", linesTerminatedBy: "\n", fieldsEnclosedBy: '"'})
Initializing - done 
Gathering information - done 
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done       
Starting data dump
53% (12.98M rows / ~24.34M rows), 36.47K rows/s, 7.55 MB/s 
76% (18.66M rows / ~24.34M rows), 31.73K rows/s, 6.30 MB/s
107% (26.22M rows / ~24.34M rows), 31.46K rows/s, 6.50 MB/s
Dump duration: 00:12:38s                                   
Total duration: 00:12:38s                                  
Data size: 5.28 GB                                         
Rows written: 26216172                                     
Bytes written: 5.28 GB                                     
Average throughput: 6.97 MB/s                              
                                                           
The dump can be loaded using:                              
util.importTable("file:///home/vagrant/sbtest_file.csv", { 
    "characterSet": "utf8mb4",
    "dialect": "csv",
    "fieldsEnclosedBy": "\"",
    "fieldsOptionallyEnclosed": true,
    "fieldsTerminatedBy": ",",
    "linesTerminatedBy": "\n",
    "schema": "sbtest",
    "table": "sbtest1"
})
 MySQL  localhost  JS > 

Export time – Native Vs Shell utility

For testing the export time taken for Native MySQL vs Shell utility, I have used the below lab environment.

MySQL version5.7.35
Shell version8.0.27
Table size5.16 GB

Note: Data loading is done using sysbench.

Native method :

I have exported the data from the table with single thread native MySQL. The execution time was 3 min and13 sec and the file is around 5 GB.

mysql> select * from sbtest1 INTO OUTFILE '/var/lib/mysql-files/sample_native.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
Query OK, 26216172 rows affected (3 min 13.74 sec)

[root@centos11 mysql-files]# ls -ltrh
total 5.1G
-rw-rw-rw-. 1 mysql mysql 5.1G Nov 15 17:03 sample_native.csv

Shell utility :

The same table export is done using Shell utility – util.exportTable, the export is done within 2 min and 30 sec.

MySQL localhost JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/Sample_utility.csv",{dialect:"csv"})
Initializing - done
Gathering information - done
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done
Starting data dump
101% (26.22M rows / ~25.88M rows), 139.50K rows/s, 27.35 MB/s
Dump duration: 00:02:33s                                     
Total duration: 00:02:33s                                    
Data size: 5.31 GB                                           
Rows written: 26216172                                       
Bytes written: 5.31 GB                                       
Average throughput: 34.63 MB/s                               
                                                             
The dump can be loaded using:                                
util.importTable("file:///home/vagrant/Sample_utility.csv", {
    "characterSet": "utf8mb4",
    "dialect": "csv",
    "schema": "sbtest",
    "table": "sbtest1"
})
 MySQL  localhost  JS > 


[vagrant@centos11 ~]$ ls -ltrh Sample_utility.csv
-rw-r-----. 1 root root 5.0G Nov 15 17:08 Sample_utility.csv
Native MySQLShell Utility
Time taken3 min and 13 sec2 min and 33 sec
Export file size5.1 GB5 GB

Compression ratio

While performing the export to a larger table we need to take care of the export file size as well. So I have compared the compression ratio for zstd and gzip methods.

gzip method :

MySQL localhost JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/Sample_utility.gz",{compression:"gzip"})
Initializing - done
Gathering information - done
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done
Starting data dump
101% (26.22M rows / ~25.88M rows), 122.02K rows/s, 24.26 MB/s uncompressed, 12.52 MB/s compressed
Dump duration: 00:03:49s                                                                         
Total duration: 00:03:50s                                                                        
Uncompressed data size: 5.18 GB                                                                  
Compressed data size: 2.68 GB                                                                    
Compression ratio: 1.9                                                                           
Rows written: 26216172                                                                           
Bytes written: 2.68 GB                                                                           
Average uncompressed throughput: 22.52 MB/s                                                      
Average compressed throughput: 11.64 MB/s                                                        
                                                                                                 
The dump can be loaded using:                                                                    
util.importTable("file:///home/vagrant/Sample_utility.gz", {                                     
    "characterSet": "utf8mb4",
    "schema": "sbtest",
    "table": "sbtest1"
})
 MySQL  localhost  JS > 

[vagrant@centos11 ~]$ ls -ltrh Sample_utility.gz
-rw-r-----. 1 root root 2.5G Nov 15 17:14 Sample_utility.gz

Zstd method :

MySQL localhost JS > util.exportTable("sbtest.sbtest1","file:///home/vagrant/sbtest1.zstd",{compression:"zstd"})
Initializing - done
Gathering information - done
Running data dump using 1 thread.
NOTE: Progress information uses estimated values and may not be accurate.
Writing table metadata - done
Starting data dump
101% (26.22M rows / ~25.88M rows), 165.84K rows/s, 32.40 MB/s uncompressed, 14.50 MB/s compressed
Dump duration: 00:02:38s                                                                         
Total duration: 00:02:38s                                                                        
Uncompressed data size: 5.18 GB                                                                  
Compressed data size: 2.32 GB                                                                    
Compression ratio: 2.2                                                                           
Rows written: 26216172                                                                           
Bytes written: 2.32 GB                                                                           
Average uncompressed throughput: 32.61 MB/s                                                      
Average compressed throughput: 14.63 MB/s                                                        
                                                                                                 
The dump can be loaded using:                                                                    
util.importTable("file:///home/vagrant/employees.zstd", {                                        
    "characterSet": "utf8mb4",
    "schema": "sbtest",
    "table": "sbtest1"
})
 MySQL  localhost  JS > 

[vagrant@centos11 ~]$ ls -ltrh sbtest1.zstd
-rw-r-----. 1 vagrant vagrant 2.2G Nov 15 17:50 sbtest1.zstd
GzipZstd
Execution time3 min and 50 sec2 min and 38 sec
Export file size2.5 GB2.1 GB

By MySQL Shell utility we have exported the tables along with the compression.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s