In our work, We used to get a lot of requirements for replicating data from one data source to another.Previously I wrote replication from MySQL to Red-shift.
In this blog I am going to explain about replicating the data from MySQL to Oracle using Tungsten replicator.
1.0. Tungsten Replicator :
It is an open source replication engine supports data extract from MySQL, MySQL Variants such as RDS, Percona Server, MariaDB and Oracle and allows the data extracted to be applied on other data sources such as Vertica, Cassandra, Redshift etc.
Tungsten Replicator includes support for parallel replication, and advanced topologies such as fan-in and multi-master, and can be used efficiently in cross-site deployments.
1.1.0. Architecture :
There are three major components in tungsten replicator.
1. Extractor / Master Service
2. Transaction History Log (THL)
3. Applier / Slave Service
1.1.1. Extractor / Master Service :
The extractor component reads data from MySQL’s binary log and writes that information into the Transaction History Log (THL).
1.1.2. Transaction History Log (THL) :
The Transaction History Log (THL) acts as a translator between two different data sources. It stores transactional data from different data servers in a universal format using the replicator service acting as a master, That could then be processed Applier / Slave service.
1.1.3. Applier / Slave Service :
All the raw row-data recorded on the THL logs is re-assembled or constructed into another format such as JSON or BSON, or external CSV formats that enable the data to be loaded in bulk batches into a variety of different targets.
Therefore Statement information is not supported for heterogeneous deployments. So It’s mandatory that Binary log format on MySQL is ROW ( with Full Image ).
1.2.0. Schema Creation :
This heterogeneous replication does not replicated SQL statements, including DDL statements that would normally define and generate the table structures, a different method must be used.
Tungsten Replicator includes a tool called ddlscan which can read the schema definition from MySQL or Oracle and translate that into the schema definition required on the target database.
1.3.0. Pre Requisites:
1.3.1. Server Packages:
- JDK 7 or higher
- Ant 1.8 or higher
- Ruby 2.4
- Net-SSH
- Net-SCP
1.3.2. MySQL:
- All the tables to be replicated must have a primary key.
Following MySQL configuration should be enabled on MySQL
binlog-format = row binlog-row-image = full character-set-server=utf8 collation-server=utf8_general_ci default-time-zone='+00:00'
2.0. Requirements :
User creation in Oracle :
CREATE USER accounts_places IDENTIFIED BY accounts_1 DEFAULT TABLESPACE ACCOUNTS_PUB QUOTA UNLIMITED ON ACCOUNTS_PUB; GRANT CONNECT TO accounts_places; GRANT ALL PRIVILEGES TO accounts_places;
User creation in MySQL :
root@localhost:(none)> create user 'tungsten'@'%' identified by 'secret'; Query OK, 0 rows affected (0.01 sec) root@localhost:(none)> GRANT ALL PRIVILEGES ON *.* TO 'tungsten'@'%'; Query OK, 0 rows affected (0.00 sec)
We would need to replicate the NOTES_TESTING table from accounts schema on MySQL to Oracle.Structures of the table are given below.
CREATE TABLE NOTES_TESTING ( ID INT(11) NOT NULL AUTO_INCREMENT, NOTE TEXT, CREATED_AT DATETIME DEFAULT NULL, UPDATED_AT DATETIME DEFAULT NULL, PERSON_ID INT(11) DEFAULT NULL, ADDED_BY INT(11) DEFAULT NULL, PRIMARY KEY (ID));
Note :
The above table was created in MySQL .Moving forward the MySQL to Oracle replication a few datatypes are not supported in Oracle.Link
3.0. Implementation:
The implementation consists of following steps.
- Installation / Building tungsten from source
- Preparing equivalent schema for Oracle
- Configuring Master service
- Configuring Slave service
- Generating worker tables (temp tables used by tungsten) for replication to be created on MySQL
- Start the replication
3.1. Installation / Building From Source:
- Download the source package from the GIT.
# git clone https://github.com/continuent/tungsten-replicator.git
- Compile this package it will generate the tungsten-replicator.tar file.
# sh tungsten-replicator/builder/build.sh # mkdir -p tungsten
- Once the tar file is generated extract the file under the folder tungsten
# tar --strip-components 1 -zxvf tungsten-replicator/builder/build/tungsten-replicator-5.2.1.tar.gz -C tungsten/
3.2. Preparing equivalent table for Oracle :
In tungsten replicator the ddl extractor which will read table definitions from MySQL and create appropriate Oracle table definitions to use during replication.
./bin/ddlscan -user root -url 'jdbc:mysql:thin://mysql-stg:3306/accounts' -pass root2345 -template ddl-mysql-oracle.vm -db ACCOUNTS > access_log.ddl
This ddlscan will extract the mysql table definitions and stored in access_log.ddl file.
The table structure will be like this.
DROP TABLE ACCOUNTS.notes_testing; CREATE TABLE ACCOUNTS.notes_testing ( ID NUMBER(10, 0) NOT NULL, NOTE CLOB /* TEXT */, CREATED_AT DATE, UPDATED_AT DATE, PERSON_ID NUMBER(38,0), ADDED_BY NUMBER(38,0), PRIMARY KEY (ID) );
Then we need to restore this ddl to Oracle like this .
# cat access_log.ddl | sqlplus sys/oracle as sysdba
To check tables are created inside the correct accounts schema.
SQL> desc notes_testing; Name Null? Type -------- -------- ----------------- ID NOT NULL NUMBER(10, 0) NOTE CLOB CREATED_AT DATE UPDATED_AT DATE PERSON_ID NUMBER(38,0) ADDED_BY NUMBER(38,0)
3.3. Configuring Master Service:
- Switch to tungsten directory and Reset the defaults configuration file.
#cd ~/tungsten #./tools/tpm configure defaults --reset
- Configure the Master service on the directory of your choice, We have used /opt/master
- Following commands will prepare the configuration file for Master service.
./tools/tpm configure master \ --install-directory=/opt/master \ --enable-heterogeneous-service=true \ --enable-heterogeneous-master=true \ --members=mysql-stg \ --master=mysql-stg
./tools/tpm configure master --hosts=mysql-stg \ --replication-user=tungsten \ --replication-password=password \ --skip-validation-check=MySQLUnsupportedDataTypesCheck
- Once the configuration is prepared, Then we can install it using tpm.
./tools/tpm install Configuration is now complete. For further information, please consult Tungsten documentation, which is available at docs.continuent.com. NOTE >> Command successfully completed
- Now Master service will be configured under /opt/master/
- Start the tungsten Master service.
[root@mysql-stg tungsten]# /opt/master/tungsten/cluster-home/bin/startall Starting Tungsten Replicator Service... Waiting for Tungsten Replicator Service. running: PID:15141
- Verify it’s working by checking the master status.
[root@mysql-stg tungsten]# /opt/master/tungsten/tungsten-replicator/bin/trepctl services Processing services command... NAME VALUE ---- ----- appliedLastSeqno: 0 appliedLatency : 1.412 role : master serviceName : master serviceType : local started : true state : ONLINE Finished services command...
[root@mysql-stg tungsten]# /opt/master/tungsten/tungsten-replicator/bin/trepctl status Processing status command... NAME VALUE ---- ----- appliedLastEventId : mysql-bin.000134:0000000000000652;-1 appliedLastSeqno : 0 appliedLatency : 1.412 autoRecoveryEnabled : false autoRecoveryTotal : 0 channels : 1 clusterName : master currentEventId : mysql-bin.000134:0000000000000652 currentTimeMillis : 1536839268029 dataServerHost : mysql-stg extensions : host : mysql-stg latestEpochNumber : 0 masterConnectUri : thl://localhost:/ masterListenUri : thl://mysql-stg:2112/ maximumStoredSeqNo : 0 minimumStoredSeqNo : 0 offlineRequests : NONE pendingError : NONE pendingErrorCode : NONE pendingErrorEventId : NONE pendingErrorSeqno : -1 pendingExceptionMessage: NONE pipelineSource : jdbc:mysql:thin://mysql-stg:3306/tungsten_master?noPrepStmtCache=true relativeLatency : 13.029 resourcePrecedence : 99 rmiPort : 10000 role : master seqnoType : java.lang.Long serviceName : master serviceType : local simpleServiceName : master siteName : default sourceId : mysql-stg state : ONLINE timeInStateSeconds : 12.854 timezone : GMT transitioningTo : uptimeSeconds : 13.816 useSSLConnection : false version : Tungsten Replicator 5.2.1 Finished status command...
- If the master did not start properly refer to this (/opt/master/service_logs/trepsvc.log) error log.
3.4. Configuring Slave service:
- Switch to tungsten directory and Reset the defaults configuration file.
#cd ~/tungsten #./tools/tpm configure defaults --reset
- Configure the Slave service on the directory of your choice, We have used /opt/slave
- Following commands will prepare the configuration file for Slave service.
./tools/tpm configure slave \ --install-directory=/opt/slave \ --enable-heterogeneous-service=true \ --members=mysql-stg
./tools/tpm configure slave --hosts=mysql-stg \ --datasource-type=oracle \ --datasource-host=172.17.4.106 \ --datasource-port=1526 \ --datasource-oracle-sid=PLACES \ --datasource-user=accounts_places \ --datasource-password=accounts_1 \ --svc-applier-filters=dropstatementdata,replicate \ --property=replicator.filter.replicate.do=accounts --property=replicator.applier.dbms.getColumnMetadataFromDB=false \ --skip-validation-check=InstallerMasterSlaveCheck \ --rmi-port=10002 \ --thl-port=2113 \ --master-thl-port=2112 \ --master-thl-host=mysql-stg
- Once the configuration is prepared, Then we can install it using tpm.
./tools/tpm install Configuration is now complete. For further information, please consult Tungsten documentation, which is available at docs.continuent.com. NOTE >> Command successfully completed
3.5. Starting Replication:
- Once the slave is configured then start the slave
[root@mysql-stg tungsten]# /opt/slave/tungsten/cluster-home/bin/startall Starting Tungsten Replicator Service... Waiting for Tungsten Replicator Service. running: PID:17039
- Verify it’s working by checking the slave status.
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl services Processing services command... NAME VALUE ---- ----- appliedLastSeqno: -1 appliedLatency : -1.0 role : slave serviceName : slave serviceType : unknown started : true state : OFFLINE:ERROR Finished services command...
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl status Processing status command... NAME VALUE ---- ----- appliedLastEventId : NONE appliedLastSeqno : -1 appliedLatency : -1.0 autoRecoveryEnabled : false autoRecoveryTotal : 0 channels : 1 clusterName : slave currentEventId : NONE currentTimeMillis : 1536839732221 dataServerHost : 172.17.4.106 extensions : host : 172.17.4.106 latestEpochNumber : -1 masterConnectUri : thl://mysql-stg:2112/ masterListenUri : null maximumStoredSeqNo : -1 minimumStoredSeqNo : -1 offlineRequests : NONE pendingError : NONE pendingErrorCode : NONE pendingErrorEventId : NONE pendingErrorSeqno : -1 pendingExceptionMessage: NONE pipelineSource : thl://mysql-stg:2112/ relativeLatency : -1.0 resourcePrecedence : 99 rmiPort : 10002 role : slave seqnoType : java.lang.Long serviceName : slave serviceType : local simpleServiceName : slave siteName : default sourceId : 172.17.4.106 state : ONLINE timeInStateSeconds : 4.476 timezone : GMT transitioningTo : uptimeSeconds : 77.996 useSSLConnection : false version : Tungsten Replicator 5.2.1 Finished status command...
4.0. Testing:
- Now both master & slave are in sync. Now I am going to insert a few record on MySQL server in notes_testing table.
insert into notes_testing values(1,'Mydbops ',NULL,NULL,13,45); insert into notes_testing values(2,'MySQL DBA',NULL,NULL,1,2);
- Above these records are inserted in the master server. At the same I have checked Oracle these records are replicated or not.
5.0. Troubleshooting:
While reconfiguring the tungsten replicatior I am getting below error replication is not synced.Here I have mentioned the sample error. Sample Error :
NAME VALUE ---- ----- appliedLastEventId : NONE appliedLastSeqno : -1 appliedLatency : -1.0 autoRecoveryEnabled : false autoRecoveryTotal : 0 channels : -1 clusterName : slave currentEventId : NONE currentTimeMillis : 1536839670076 dataServerHost : 172.17.4.106 extensions : host : 172.17.4.106 latestEpochNumber : -1 masterConnectUri : thl://mysql-stg:2112/ masterListenUri : null maximumStoredSeqNo : -1 minimumStoredSeqNo : -1 offlineRequests : NONE pendingError : Event extraction failed pendingErrorCode : NONE pendingErrorEventId : NONE pendingErrorSeqno : -1 pendingExceptionMessage: Client handshake failure: Client response validation failed: Master log does not contain requested transaction: master source ID=mysql-stg client source ID=172.17.4.106 requested seqno=3 client epoch number=0 master min seqno=0 master max seqno=0 pipelineSource : UNKNOWN relativeLatency : -1.0 resourcePrecedence : 99 rmiPort : 10002 role : slave
The reason is some transactions on the slave from a previous installation was not clear properly.
solution :
To clear old transactions from slave.
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl -service slave reset -all Do you really want to delete replication service slave completely? [yes/NO] yes
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl online
[root@mysql-stg tungsten]# /opt/slave/tungsten/tungsten-replicator/bin/trepctl status Processing status command... NAME VALUE ---- ----- appliedLastEventId : NONE appliedLastSeqno : -1 appliedLatency : -1.0 autoRecoveryEnabled: false autoRecoveryTotal: 0 channels : 1 clusterName: slave currentEventId : NONE currentTimeMillis: 1536839732221 dataServerHost : 172.17.4.106 extensions : host : 172.17.4.106 latestEpochNumber: -1 masterConnectUri : thl://mysql-stg:2112/ masterListenUri: null maximumStoredSeqNo : -1 minimumStoredSeqNo : -1 offlineRequests: NONE pendingError : NONE pendingErrorCode : NONE pendingErrorEventId: NONE pendingErrorSeqno: -1 pendingExceptionMessage: NONE pipelineSource : thl://mysql-stg:2112/ relativeLatency: -1.0 resourcePrecedence : 99 rmiPort: 10002 role : slave seqnoType: java.lang.Long serviceName: slave serviceType: local simpleServiceName: slave siteName : default sourceId : 172.17.4.106 state: ONLINE timeInStateSeconds : 4.476 timezone : GMT transitioningTo: uptimeSeconds: 77.996 useSSLConnection : false version: Tungsten Replicator 5.2.1 Finished status command...
6.0. Conclusion:
Tungsten replicator is a great tool when it comes to replication of data with heterogeneous data sources. If we understand it’s working, It’s easy to configure and perform efficiently.
Image Courtesy : Photo by Karl JK Hedin on Unsplash