Making MaxScale/ProxySQL Highly Available (2 > 1)

At Mydbops we implement load balancing using MaxScale or ProxySQL (see our presentation) for many of our clients, but on its own the load balancer becomes a SPOF (Single Point of Failure). We explored services like HAProxy, Nginx and Keepalived; except for Keepalived, all of them need to run on a standalone instance and did not satisfy our exact need.

Keepalived does not require a standalone instance. It can be deployed and configured with minimal effort and provides an HA solution for the DB infra. This approach is not limited to our DB setup; the same method can be used to provide HA for application or front-end servers as well. It is a simple and elegant solution.

Below we will see how to set up and configure Keepalived for our requirement.


The question arises: how do you load balance the MaxScale or ProxySQL servers themselves without any standalone instance? One way is to use Keepalived. This solution needs at least two MaxScale or ProxySQL servers, with Keepalived installed on both as explained below. The two servers share a floating IP. The application servers connect to 192.168.33.30 (the floating IP); whichever MaxScale or ProxySQL server is the master owns that IP. If that server fails, the backup server starts issuing gratuitous ARP responses for the same IP, 192.168.33.30, and the requests from the app servers then go through the backup MaxScale or ProxySQL server, which has now become the primary.
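From the application's point of view only the floating IP matters. A minimal connectivity check might look like the sketch below; the listener port (3306) and the user name are assumptions for illustration, so substitute whatever your MaxScale/ProxySQL listener actually exposes:

```shell
# Connect through the floating IP; whichever node currently holds the VIP
# answers. Port 3306 and app_user are hypothetical placeholders.
mysql -h 192.168.33.30 -P 3306 -u app_user -p -e "SELECT @@hostname;"
```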

Note:

  1. ProxySQL has clustering built in from version 1.4 onwards to synchronise its configuration across nodes. ProxySQL also offers the most fine-grained configuration among the SQL load balancers.
  2. MaxScale has a config-file based setup. Its configuration has to be synced manually across nodes (though config changes are infrequent on a production setup).
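For point 2, a simple way to keep the backup node's MaxScale config in sync is sketched below. The host name `backup-lb` and the config path are assumptions for this sketch; adapt them to your environment:

```shell
# Sketch: copy the current MaxScale config to the backup load balancer
# and restart the service there, so both nodes carry identical config.
# "backup-lb" is a placeholder hostname -- replace with your backup node.
rsync -av /etc/maxscale.cnf backup-lb:/etc/maxscale.cnf
ssh backup-lb 'systemctl restart maxscale'
```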

Install Keepalived on both the Master and Slave LBs

yum install keepalived -y

Configure Keepalived on Master

vim /etc/keepalived/keepalived.conf
global_defs {
  notification_email {
    keepalive@mydbops.com
  }
  notification_email_from keepalived@mydbops.com
  smtp_server 192.168.1.100
  smtp_connect_timeout 30
  router_id LVS_DEVEL
}

vrrp_script check_maxscale {
  script "/bin/systemctl status maxscale.service"
  interval 2
  fall 2
  rise 2
}

vrrp_instance VI_1 {
  state MASTER
  interface enp0s8
  virtual_router_id 51
  priority 101
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  virtual_ipaddress {
    192.168.33.30
  }
  track_script {
    check_maxscale
  }
}

Configure Keepalived on Slave

vim /etc/keepalived/keepalived.conf
global_defs {
  notification_email {
    admin@example.com
  }
  notification_email_from keepalived@example.com
  smtp_server 10.1.1.100
  smtp_connect_timeout 30
  router_id LVS_DEVEL
}

vrrp_script check_maxscale {
  script "/bin/systemctl status maxscale.service"
  interval 2
  fall 2
  rise 2
}

vrrp_instance VI_1 {
  state BACKUP
  interface enp0s8
  virtual_router_id 51
  priority 100
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 1111
  }
  virtual_ipaddress {
    192.168.33.30
  }
  track_script {
    check_maxscale
  }
}

Start Keepalived on both the Master and Slave

systemctl start keepalived

Verifying Keepalived is Running on the Master

systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
 Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
 Active: active (running) since Tue 2018-01-23 13:21:57 UTC; 2h 22min ago
 Process: 3615 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 3616 (keepalived)
 CGroup: /system.slice/keepalived.service
 ├─3616 /usr/sbin/keepalived -D
 ├─3617 /usr/sbin/keepalived -D
 └─3618 /usr/sbin/keepalived -D

Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:56 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:26:01 mydbopslabs13 Keepalived_vrrp[3618]: Sending gratuitous ARP on enp0s8 for 192.168.33.30

Verifying Keepalived is Running on the Slave

systemctl status keepalived
keepalived.service - LVS and VRRP High Availability Monitor
 Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
 Active: active (running) since Tue 2018-01-23 13:21:30 UTC; 2h 25min ago
 Process: 8682 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 8683 (keepalived)
 CGroup: /system.slice/keepalived.service
 ├─8683 /usr/sbin/keepalived -D
 ├─8684 /usr/sbin/keepalived -D
 └─8685 /usr/sbin/keepalived -D

Jan 23 13:25:29 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:34 mydbopslabs12 Keepalived_vrrp[8685]: Sending gratuitous ARP on enp0s8 for 192.168.33.30
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Received advert with higher priority 101, ours 100
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jan 23 13:25:54 mydbopslabs12 Keepalived_vrrp[8685]: VRRP_Instance(VI_1) removing protocol VIPs.

As we know, 192.168.33.13 is the MASTER; the floating IP is assigned to enp0s8 while the master is up.

ip add show enp0s8
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 08:00:27:c3:8e:7e brd ff:ff:ff:ff:ff:ff
inet 192.168.33.13/24 brd 192.168.33.255 scope global enp0s8
valid_lft forever preferred_lft forever
inet 192.168.33.30/32 scope global enp0s8
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fec3:8e7e/64 scope link
valid_lft forever preferred_lft forever

Notice the vrrp_script check_maxscale section in the configs above. It runs the MaxScale status check every 2 seconds. If the return code is '0', the service is considered up. If the return code is anything other than '0', the service is considered down, and the BACKUP host then takes over the floating IP.
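The contract Keepalived relies on here is just the script's exit code. The sketch below simulates it, with `true`/`false` standing in for the real `systemctl` check:

```shell
#!/bin/sh
# Demonstrates the exit-code contract used by vrrp_script: exit 0 means
# healthy, non-zero means failed. After `fall` consecutive failures the
# node gives up the VIP; after `rise` consecutive successes it becomes
# eligible to hold it again.
check() {
  if "$@" >/dev/null 2>&1; then
    echo "service up (exit 0)"
  else
    echo "service down (non-zero exit)"
  fi
}
check true    # stands in for: systemctl status maxscale.service (running)
check false   # stands in for: systemctl status maxscale.service (stopped)
```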

Testing the Failover and How Keepalived Helps Achieve Minimal Downtime

For this test, I will ping the floating IP 192.168.33.30 from a local machine and manually stop the MaxScale service, so that Keepalived detects the service failure and the slave server takes over the floating IP.

Initiate a Ping to the Floating IP

ping 192.168.33.30

Stop MaxScale on the Master

systemctl stop maxscale
systemctl status maxscale
maxscale.service - MariaDB MaxScale Database Proxy
Loaded: loaded (/usr/lib/systemd/system/maxscale.service; disabled; vendor preset: disabled)
Active: inactive (dead)

Jan 23 13:25:49 mydbopslabs13 maxscale[3883]: [mysqlmon] No Master can be determined
Jan 23 16:15:12 mydbopslabs13 systemd[1]: Stopping MariaDB MaxScale Database Proxy...
Jan 23 16:15:12 mydbopslabs13 maxscale[3883]: Waiting for housekeeper to shut down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Finished MaxScale log flusher.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Housekeeper shutting down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: Housekeeper has shut down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale received signal SIGTERM. Exiting.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale is shutting down.
Jan 23 16:15:13 mydbopslabs13 maxscale[3883]: MaxScale shutdown completed.
Jan 23 16:15:13 mydbopslabs13 systemd[1]: Stopped MariaDB MaxScale Database Proxy.

Ping Output

ping 192.168.33.30
PING 192.168.33.30 (192.168.33.30) 56(84) bytes of data.
64 bytes from 192.168.33.30: icmp_seq=1 ttl=64 time=0.288 ms
64 bytes from 192.168.33.30: icmp_seq=2 ttl=64 time=0.285 ms
64 bytes from 192.168.33.30: icmp_seq=3 ttl=64 time=0.220 ms
64 bytes from 192.168.33.30: icmp_seq=4 ttl=64 time=0.208 ms
64 bytes from 192.168.33.30: icmp_seq=5 ttl=64 time=0.187 ms
64 bytes from 192.168.33.30: icmp_seq=6 ttl=64 time=0.236 ms
64 bytes from 192.168.33.30: icmp_seq=7 ttl=64 time=0.293 ms
64 bytes from 192.168.33.30: icmp_seq=8 ttl=64 time=0.198 ms
64 bytes from 192.168.33.30: icmp_seq=9 ttl=64 time=0.255 ms
64 bytes from 192.168.33.30: icmp_seq=10 ttl=64 time=0.440 ms
64 bytes from 192.168.33.30: icmp_seq=11 ttl=64 time=0.288 ms
64 bytes from 192.168.33.30: icmp_seq=12 ttl=64 time=0.366 ms
64 bytes from 192.168.33.30: icmp_seq=13 ttl=64 time=0.507 ms
64 bytes from 192.168.33.30: icmp_seq=14 ttl=64 time=0.401 ms
64 bytes from 192.168.33.30: icmp_seq=15 ttl=64 time=0.284 ms
64 bytes from 192.168.33.30: icmp_seq=16 ttl=64 time=0.366 ms
64 bytes from 192.168.33.30: icmp_seq=18 ttl=64 time=0.510 ms
64 bytes from 192.168.33.30: icmp_seq=19 ttl=64 time=0.242 ms
64 bytes from 192.168.33.30: icmp_seq=20 ttl=64 time=0.254 ms
64 bytes from 192.168.33.30: icmp_seq=21 ttl=64 time=0.175 ms
64 bytes from 192.168.33.30: icmp_seq=22 ttl=64 time=0.205 ms
64 bytes from 192.168.33.30: icmp_seq=23 ttl=64 time=0.246 ms
64 bytes from 192.168.33.30: icmp_seq=24 ttl=64 time=0.231 ms
64 bytes from 192.168.33.30: icmp_seq=25 ttl=64 time=0.249 ms
64 bytes from 192.168.33.30: icmp_seq=26 ttl=64 time=0.243 ms

In the above output only one packet was dropped (icmp_seq=17 is missing). Within a few seconds the floating IP was assigned to the slave and started responding again.

Check the Floating IP Assigned to the Slave

ip add show enp0s8
3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
 link/ether 08:00:27:a8:1d:d7 brd ff:ff:ff:ff:ff:ff
 inet 192.168.33.12/24 brd 192.168.33.255 scope global enp0s8
 valid_lft forever preferred_lft forever
 inet 192.168.33.30/32 scope global enp0s8
 valid_lft forever preferred_lft forever
 inet6 fe80::a00:27ff:fea8:1dd7/64 scope link
 valid_lft forever preferred_lft forever

Once the master's MaxScale service comes back online, the floating IP is automatically re-assigned to the master server, since the master carries the higher priority (101 vs 100). The same setup can be applied to a ProxySQL cluster too.
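For a ProxySQL pair, only the tracked service changes. A sketch of the equivalent check block, assuming the systemd unit is named proxysql:

```
vrrp_script check_proxysql {
  script "/bin/systemctl status proxysql.service"
  interval 2
  fall 2
  rise 2
}
```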

We believe this blog was worth your time. If you have any comments, please share them below.
