ProxySQL Series: Seamless Replication Switchover Using MHA

This is the second blog in our ProxySQL Series (Blog I: MySQL Replication Read-Write Split-Up). It covers how to integrate ProxySQL with MHA to handle failover of the database servers.

[Architecture diagram: ProxySQL with MySQL replication and MHA]

We already have a Master – Slave replication setup behind ProxySQL from the previous blog [ProxySQL On MySQL Replication].

For this setup we have added one more node as the MHA Manager, which will keep an eye on the master and slave status.

  • node5 (172.17.0.5), MHA Manager

ProxySQL can be combined with MHA for a highly available setup with near-zero downtime.

MHA's role in failover :

The MHA tool is used for failover. During a failover, MHA promotes the most up-to-date slave (the slave with the most recent transactions) as the new master, applies the CHANGE MASTER command on the remaining slaves, and changes the read_only flag on the new master and the slaves.
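
Conceptually, what MHA runs during a failover boils down to something like the sketch below. This is only an illustration for this setup; MHA computes the real binlog coordinates itself, and the hosts, credentials and MASTER_LOG_FILE/MASTER_LOG_POS values shown here are placeholders.

# On the most up-to-date slave (the promotion candidate), make it writable:
mysql -h 172.17.0.2 -u root -pxxx -e "SET GLOBAL read_only = 0;"

# On each remaining slave, repoint replication to the new master
# (the binlog file/position below are placeholders; MHA derives the real values):
mysql -h 172.17.0.3 -u root -pxxx -e "
  STOP SLAVE;
  CHANGE MASTER TO
    MASTER_HOST='172.17.0.2', MASTER_USER='repl', MASTER_PASSWORD='xxx',
    MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=4;
  START SLAVE;
  SET GLOBAL read_only = 1;"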

ProxySQL's role in failover :

When a failover happens (due to a crash, or triggered manually for a maintenance activity), ProxySQL detects the change by checking the read_only flag, promotes the new master's IP into the writer hostgroup, and starts sending write traffic to the new master.

Each row in the mysql_replication_hostgroups table in ProxySQL represents a pair of writer_hostgroup and reader_hostgroup.
ProxySQL monitors the value of read_only for all the servers and records it in mysql_server_read_only_log.
If read_only=1 the host is copied/moved to the reader_hostgroup, while if read_only=0 the host is copied/moved to the writer_hostgroup.
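
For reference, this pairing of hostgroups (0 as writer, 1 as reader, as used throughout this series) can be defined from the ProxySQL admin interface roughly as below. The admin credentials and port (admin/admin on 6032) are the ProxySQL defaults and may differ in your installation; the comment value is arbitrary.

mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
  INSERT INTO mysql_replication_hostgroups (writer_hostgroup, reader_hostgroup, comment)
  VALUES (0, 1, 'cluster1');
  LOAD MYSQL SERVERS TO RUNTIME;
  SAVE MYSQL SERVERS TO DISK;"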

Installing MHA

If the replication is classic binlog file/position based, then the MHA Node package must be installed on all hosts involved (manager, master and slaves); for GTID-based replication it has to be installed only on the manager node.

Install MHA node :

Install the MHA Node package on all DB nodes and on the MHA manager server. More information

apt-get -y install libdbd-mysql-perl
dpkg -i mha4mysql-node_0.56-0_all.deb

Install MHA manager :

Install this only on the MHA manager server.

#dependencies
apt-get install -y  libdbi-perl libdbd-mysql-perl libconfig-tiny-perl liblog-dispatch-perl libparallel-forkmanager-perl libnet-amazon-ec2-perl

dpkg -i mha4mysql-manager_0.56-0_all.deb

Configuration changes :

The following changes are made only on node5 (172.17.0.5), the MHA Manager :

Create directories :

mkdir -p /etc/mha/ /var/log/mha/

Config file :

cat /etc/mha/cluster1.conf

[server default]
# mysql user and password
user=root
password=xxx
# replication user password
repl_user=repl
repl_password=xxx

remote_workdir=/var/tmp
# working directory on the manager
manager_workdir=/var/log/mha/
# manager log file
manager_log=/var/log/mha/mha.log
ping_interval=15

# As we don't have to deal with VIPs here, disable master_ip_failover_script
#master_ip_failover_script=/usr/local/bin/master_ip_failover 
master_ip_online_change_script=/usr/local/bin/master_ip_online_change 
master_binlog_dir=/data/log/
secondary_check_script=/etc/mha/mha_prod/failover_triggered.sh
report_script=/etc/mha/mha_prod/failover_report.sh
master_pid_file=/var/run/mysqld/mysqld.pid 
ssh_user=root 
log_level=debug 
#set this to 0 if YOU ARE SURE THIS CAN'T BREAK YOUR REPLICATION
check_repl_filter=1 
[server1] 
hostname=172.17.0.1 
port=3306 
[server2] 
hostname=172.17.0.2 
port=3306 
[server3] 
hostname=172.17.0.3 
port=3306 
no_master=1 

master_ip_failover : Script used to switch the virtual IP address during failover.
master_ip_online_change : Script used during an online (planned) master switchover.

NOTE: Don’t forget to comment out the “FIX ME” lines in the above scripts.

Custom scripts : The scripts below are optional.

secondary_check_script : It is always good to double-check the availability of the master, ideally from a different network route, before triggering a failover.  More info

report_script : With this script we can configure alerts or an email when a failover completes. In Detail
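
As an illustration only, a minimal report_script could simply log and mail whatever MHA hands to it. The script below does not parse individual arguments, so it does not depend on their exact names; the mail command and recipient are placeholders for whatever alerting you use.

#!/bin/bash
# /etc/mha/mha_prod/failover_report.sh -- minimal sketch, not production-ready.
# MHA calls this with arguments such as --orig_master_host, --new_master_host,
# --subject and --body; we simply record them and send a plain-text mail.
LOG=/var/log/mha/failover_report.log
echo "$(date '+%F %T') report_script called with: $*" >> "$LOG"
echo "$*" | mail -s "MHA failover report" dba@example.com   # placeholder recipient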

Now run tests against the cluster using the below two scripts :
– masterha_check_ssh
– masterha_check_repl

Please note that if these checks fail, MHA will refuse to run any kind of failover.

root@MHA-Node /etc/mha # masterha_check_ssh --conf=/etc/mha/cluster1.conf
-- truncated long output
[info] All SSH connection tests passed successfully.

root@MHA-Node# masterha_check_repl --conf=/etc/mha/cluster1.conf
172.17.0.1(172.17.0.1:3306) (current master)
 +--172.17.0.2(172.17.0.2:3306)
 +--172.17.0.3(172.17.0.3:3306)

-- truncated long output
MySQL Replication Health is OK.

To run a Manual failover :

  • To execute a manual failover, make sure your MHA Manager is stopped.
  • new_master_host – This parameter is optional, in case you want to select the new master yourself. If we don't specify any value, the most up-to-date slave is considered as the new master.

masterha_master_switch --master_state=alive --conf=/etc/mha/cluster1.conf
--orig_master_is_new_slave [--new_master_host=]

Automatic failover :

We need to run masterha_manager in the background to monitor the cluster status :

nohup masterha_manager --conf=/etc/mha/cluster1.conf < /dev/null > /var/log/mha/mha.log 2>&1 &

When an automatic failover happens (in case of a master crash), the logs look like :

tail -f /var/log/mha/mha.log
 ----- Failover Report -----

Master 172.17.0.1(172.17.0.1:3306) is down!
Started automated(non-interactive) failover.
172.17.0.1(172.17.0.1:3306)
Selected 172.17.0.2(172.17.0.2:3306) as a new master.
172.17.0.2(172.17.0.2:3306): OK: Applying all logs succeeded.
172.17.0.2(172.17.0.2:3306): OK: Activated master IP address.
172.17.0.3(172.17.0.3:3306): OK: Slave started, replicating from 172.17.0.2(172.17.0.2:3306)
    172.17.0.2(172.17.0.2:3306): Resetting slave info succeeded.
    Master failover to 172.17.0.2(172.17.0.2:3306) completed successfully.

We can also check the status of masterha_manager :

RUNNING :
root@mysql-monitoring /etc/mha # masterha_check_status --conf=/etc/mha/cluster1.conf
cluster1 (pid:15810) is running(0:PING_OK), master:172.17.0.1

NOT RUNNING :
root@mysql-monitoring /etc/mha # masterha_check_status --conf=/etc/mha/cluster1.conf
cluster1 is stopped(2:NOT_RUNNING).

Remember that the masterha_manager script stops working in two situations :

  1. After an automatic failover; we then need to manually start the above command again to continue monitoring the cluster nodes (a simple cron check for this is sketched below).
  2. If replication is not running on any of the slaves defined in the cluster, or if any slave server is down.
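
A simple way to catch the first case is a cron job built around masterha_check_status, which exits non-zero when the manager is not running. The sketch below only alerts (blindly auto-restarting right after a failover is usually not desirable); the script path and mail recipient are placeholders.

#!/bin/bash
# /etc/mha/check_mha_manager.sh -- minimal sketch, run from cron on the MHA Manager node.
if ! masterha_check_status --conf=/etc/mha/cluster1.conf > /dev/null 2>&1; then
    echo "masterha_manager for cluster1 is not running on $(hostname)" \
        | mail -s "MHA manager down" dba@example.com   # placeholder recipient
fi

A crontab entry such as "*/5 * * * * root /etc/mha/check_mha_manager.sh" would run this check every five minutes.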

Check backend status at ProxySQL :

The below ProxySQL table shows the current master and its slaves after the failover, with their ONLINE status.

Admin > select hostgroup,srv_host,status,Queries,Bytes_data_sent,Latency_us from stats_mysql_connection_pool where hostgroup in (0,1);
+-----------+------------+----------+---------+-----------------+------------+
| hostgroup | srv_host   | status   | Queries | Bytes_data_sent | Latency_us |
+-----------+------------+----------+---------+-----------------+------------+
| 0         | 172.17.0.1 | ONLINE   | 12349   | 76543232        | 144        |
| 1         | 172.17.0.2 | ONLINE   | 22135   | 87654356        | 190        |
| 1         | 172.17.0.3 | ONLINE   | 22969   | 85344235        | 110        |
| 1         | 172.17.0.1 | ONLINE   | 1672    | 4534332         | 144        |
+-----------+------------+----------+---------+-----------------+------------+
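
We can also confirm that ProxySQL actually observed the read_only flip by querying the monitor log (table and column names as in ProxySQL 1.4/2.0; adjust the admin credentials and port to your installation):

mysql -u admin -padmin -h 127.0.0.1 -P 6032 -e "
  SELECT hostname, port, read_only, error
  FROM monitor.mysql_server_read_only_log
  ORDER BY time_start_us DESC LIMIT 6;"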

Preserve relay logs and purge them regularly :

As we have two slaves in this setup, MHA keeps the relay logs for recovering other slaves. [To ensure that, it disables relay_log_purge]

We need to periodically purge old relay logs, just like binary logs. MHA Node has a command-line tool, purge_relay_logs, to do that.

purge_relay_logs removes relay logs without blocking the SQL thread. Relay logs need to be purged regularly (i.e. once per day, once every 6 hours, etc), so purge_relay_logs should be invoked regularly on each slave server, at a different time on each. It can be scheduled as a cron job too.

[root@mysql-slave1]$

cat /etc/cron.d/purge_relay_logs
#purge relay logs after every 5 hours 
0 */5 * * * /usr/bin/purge_relay_logs --user=mha --password=PASSWORD --workdir=/data/archive_relay --disable_relay_log_purge >> /var/log/mha/purge_relay_logs.log 2>&1

The above cron job purges the relay logs and sets relay_log_purge = 0 [OFF] to avoid MySQL's automatic relay log purge.

More Details : https://github.com/yoshinorim/mha4mysql-manager/wiki/Requirements

We can also use MySQL Utilities to perform the failover with ProxySQL. The main advantage of the MHA-ProxySQL integration is that it avoids the need for a VIP or re-defining DNS after an MHA failover; ProxySQL takes care of that.


5 thoughts on “ProxySQL Series: Seamless Replication Switchover Using MHA”

  1. There is a risk in using a read_only-based proxy: the old master could lose network for a while, then come back.
    In such a scenario you will have failed over and promoted a new master with read_only=0, and then the old master walks in, still advertising read_only=0. It was not accessible for MHA to set read_only=1 on it, so it still thinks it is the master.
    ProxySQL (or equally any other proxy) would then face two servers both claiming to have read_only=0.
    The result depends on the proxy and its configuration: a proxy might choose the former master, or may include both, or what have you.

    The scenario is particularly interesting when multiple availability zones / datacenters are involved.

    It’s a scenario which should be taken into account. If you use a read_only-based proxy you must accept that risk. This is of course legitimate, as long as one is aware of it.

    To solve this problem one must not rely on the servers’ own perception. That a server _thinks_ it should be writable is not enough; one would need another entity to assist in that decision making. There are various configurations for that, beyond the scope of this comment.

