Handling case sensitive column in pt-archiver

To copy the data of the particular column of the table to another table/server, We have an option to export the data as CSV and import the data back to a different table. But when the table size is large and we need to copy the data only for the required data to the target table will cause the load in the server since the table scanning is huge.

To overcome this, we have the pt-archiver copy the data from the source table to the destination as a whole or only for required columns. And also we can do this in a controlled manner as well. So there will be no performance impact even on the production time.

Source table structure :

mysql> show create table source\G
*************************** 1. row ***************************
       Table: source
Create Table: CREATE TABLE `source` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `FIRSTNAME` varchar(30) NOT NULL,
  `LASTNAME` varchar(30) NOT NULL,
  `email` varchar(50) DEFAULT NULL,
  `reg_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Destination table structure :

mysql> show create table destination\G
*************************** 1. row ***************************
       Table: destination
Create Table: CREATE TABLE `destination` (
  `firstname` varchar(30) DEFAULT NULL,
  `lastname` varchar(30) DEFAULT NULL,
  `email` varchar(50) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

I have inserted only 10 records for the sample in the source table using mysql_random_data_load.

mysql> select count(*) from source;
+----------+
| count(*) |
+----------+
|       10 |
+----------+
1 row in set (0.00 sec)

While using pt-archiver to copy only the required column, we need to make sure we have mentioned the below options. Because by default pt-archiver will perform the column level check between source and the destination.

–columnsColumns need to be copied
–no-check-columnsTo overcome the column level check between source and destination.

–columns options are case sensitive, we need to mention the column name only in lower case, not in the upper case. If we mentioned the column names in the upper case, then the default value for the column will be inserted. If any of the columns is not mentioned for the copy process, the default value will be loaded for that column.

[vagrant@centos11 ~]$ pt-archiver --source h=localhost,D=archiver,t=source --no-delete --dest h=localhost,D=archiver,t=destination --no-safe-auto-increment --where '1=1' --limit 1 --txn-size 1 --progress 1 --statistics --no-check-charset --columns firstname,lastname --no-check-columns
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
	LANGUAGE = (unset),
	LC_ALL = (unset),
	LC_CTYPE = "UTF-8",
	LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
TIME                ELAPSED   COUNT
2021-11-23T20:57:12       0       0
2021-11-23T20:57:12       0       1
2021-11-23T20:57:12       0       2
2021-11-23T20:57:12       0       3
2021-11-23T20:57:12       0       4
2021-11-23T20:57:12       0       5
2021-11-23T20:57:12       0       6
2021-11-23T20:57:12       0       7
2021-11-23T20:57:12       0       8
2021-11-23T20:57:12       0       9
2021-11-23T20:57:12       0      10
2021-11-23T20:57:12       0      10
Started at 2021-11-23T20:57:12, ended at 2021-11-23T20:57:12
Source: D=archiver,h=localhost,t=source
Dest:   D=archiver,h=localhost,t=destination
SELECT 10
INSERT 10
DELETE 0
Action         Count       Time        Pct
commit            22     0.0318      57.90
select            11     0.0121      22.00
inserting         10     0.0062      11.28
other              0     0.0048       8.82
mysql> select * from destination;
+--------------------------------+--------------------------------+-------+
| firstname                      | lastname                       | email |
+--------------------------------+--------------------------------+-------+
| incidunt consectetur aut autem | accusamus et et provident quas | NULL  |
| et ab doloribus natus.         | veniam tempora odit facilis ea | NULL  |
| eveniet aut qui laborum aut er | sequi consectetur eum qui vero | NULL  |
| quasi qui natus optio unde.    | voluptatem dolorem facere ut q | NULL  |
| sunt voluptate qui ut ab a.    | nihil et deserunt quis eligend | NULL  |
| est ab ducimus odio sapiente a | dolor aut officiis nostrum del | NULL  |
| corrupti voluptatem debitis ne | rerum dolor accusantium qui ne | NULL  |
| ipsam provident exercitationem | similique rem fuga velit dicta | NULL  |
| odit ex quas unde repudiandae  | nostrum accusamus officia et.  | NULL  |
| impedit hic ratione quia minim | sed non beatae perferendis cup | NULL  |
+--------------------------------+--------------------------------+-------+
10 rows in set (0.00 sec)

We have gone through the best practices that need to be taken while performing the required column copy process using pt-archiver and also go through the –dry-run output carefully before executing it.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s