ProxySQL Series : Query Cache with ProxySQL

We know that MySQL query cache is deprecated as of MySQL 5.7.20 and removed in MySQL 8.0 and It has been recommended to use ProxySQL (or other) external query cache instead.

In this blog post, we’ll present the ProxySQL query cache functionality and how does it help us.

How to setup ProxySQL between client and database servers is out of the scope of this article.

If you are looking for other articles on ProxySQL Series :

ProxySQL Query Cache :

ProxySQL query cache stores the text of a SELECT statement together with the corresponding result sent to the client. If the application will re-execute the same query, the resultset will be returned by the embedded Query Cache.

It is a quite common scenario to identify database load caused by not-optimal SELECT statements that generate a result set that should be cached for a few seconds. To implement a code change can be a long process (developers should write new code, build it, testing in staging, then deploy on production), and this is often not a suitable option during an emergency. As the configuration of the database proxy layer (ProxySQL in this case) falls under the responsibility of DBAs, to enable caching DBAs won’t require developers to make changes to the application.

Therefore this is a feature that empowers the DBAs.

proxysql_query_cache

How ProxySQL query cache gets enabled?

In ProxySQL query cache we have to define what traffic gets cached. This is done by defining query rules that match traffic that is going to be cached and setting a “cache_ttl” for the cached results

Below is the scenario for optimal use of query cache –

For one of our client, we were getting thread spike alerts from master node.

after debugging we saw ProxySQL is opening number of connection to a master server to execute some verification from code to see if the table has all correct columns.

An application was sending `DESCRIBE table` queries very often and developer did not have a quick fix to control or modify that code because it was coming from some ORM.

As per 10 minutes stats, we observed around 346K describe queries comming on DB , i.e. 578 describe queries per second.

mysql> select sum(count_star) from stats_mysql_query_digest where digest_text like 'DESCRIBE%';
+-----------------+
| sum(count_star) |
+-----------------+
| 346741          |
+-----------------+

We have decided to create query rule to forward all queries starts with DESCRIBE on replica nodes and cache result set for 60 minutes.

60 mins cache because we got to know from client, that DB schema changes happen mostly on weekend and every time they introduce changes in the database first and after ~ 60 mins later they deploy new code.

Taking this into consideration, we added the following rules:

1 : Writer hostgroup

2 : Reader hostgroup

INSERT INTO mysql_query_rules (rule_id,active,match_digest,destination_hostgroup,cache_ttl,apply,comment) VALUES (100,1,'^DESCRIBE',2,3600000,1,'Cache and reroute DESCRIBE Table queries on readers HG');

LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

mysql> select rule_id,active,username,schemaname,flagIN,digest,match_digest,match_pattern,destination_hostgroup,cache_ttl,apply,comment from mysql_query_rules where rule_id=164\G
************************* 1. row *************************
              rule_id: 100
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
               digest: NULL
         match_digest: ^DESCRIBE
        match_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 3600000
                apply: 1
              comment: Cache and reroute DESCRIBE Table queries on readers HG 

Execute below command to reset the stats_mysql_query_digest results:

SELECT 1 FROM stats_mysql_query_digest_reset LIMIT 1;

Monitor stats : 

Cached queries are the ones marked with a special hostgroup -1 (this means that these queries were not sent to any hostgroup), and the total execution time for the queries cached is 0 (this means that the request was served within the same events loop).

mysql> select hostgroup,digest_text,count_star,FROM_UNIXTIME(last_seen) from stats_mysql_query_digest where digest_text like '%DESCRIBE%';
+-----------+------------------------------------+------------+--------------------------+
| hostgroup | digest_text                        | count_star | FROM_UNIXTIME(last_seen) |
+-----------+------------------------------------+------------+--------------------------+
| -1        | DESCRIBE `agent`                   | 178        | 2018-10-11 11:35:43      |
| 2         | DESCRIBE `integration`.            | 1          | 2018-10-11 11:22:37      |
| 2         | DESCRIBE `options_available`       | 1          | 2018-10-11 11:22:37      |
| 2         | DESCRIBE `campaignsettings`        | 1          | 2018-10-11 11:22:37      |
| 2         | DESCRIBE `integration`             | 1          | 2018-10-11 11:22:38      |
| 2         | DESCRIBE `clients`                 | 1          | 2018-10-11 11:22:57      |
| 2         | DESCRIBE `companies`               | 1          | 2018-10-11 11:22:37      |
| 2         | DESCRIBE `vsts_options`            | 1          | 2018-10-11 11:22:37      |
| -1        | DESCRIBE `users_details`           | 190211     | 2018-10-11 11:35:43      |
| -1        | DESCRIBE `ticket`                  | 9081       | 2018-10-11 11:35:43      |
| 2         | DESCRIBE `ticket`                  | 1          | 2018-10-11 11:22:38      |
| 2         | DESCRIBE `stats`                   | 1          | 2018-10-11 11:22:37      |
| -1        | DESCRIBE `stats`                   | 19856      | 2018-10-11 11:35:43      |
| -1        | DESCRIBE `companies`               | 40079      | 2018-10-11 11:35:43      |
| -1        | DESCRIBE `options`                 | 20590      | 2018-10-11 11:35:43      |
| -1        | DESCRIBE `com_settings`            | 20215      | 2018-10-11 11:35:43      |
| 2         | DESCRIBE `users_details`           | 3          | 2018-10-11 11:23:00      |
| -1        | DESCRIBE `visit`                   | 20166      | 2018-10-11 11:35:43      |
| 2         | DESCRIBE `integration_many`        | 1          | 2018-10-11 11:22:37      |
| -1        | DESCRIBE `integration_many`        | 21875      | 2018-10-11 11:35:43      |
| -1        | DESCRIBE `integration`             | 20767      | 2018-10-11 11:35:43      |
| -1        | DESCRIBE `task_integration`        | 9032       | 2018-10-11 11:35:43      |
+-----------+------------------------------------+------------+--------------------------+
22 rows in set (0.01 sec)

As per last 30 mins stats, only 13 DESCRIBE queries were served by MySQL and 676980 queries from proxysql cache

mysql> select hostgroup, sum(count_star) from stats_mysql_query_digest where digest_text like 'DESCRIBE%'  group by hostgroup;
+-----------+-----------------+
| hostgroup | sum(count_star) |
+-----------+-----------------+
| -1        | 676980          |
| 2         | 13              |
+-----------+-----------------+
2 rows in set (0.00 sec) 

Another use case we observed that when the client sends more queries on INFORMATION_SCHEMA to perform some pre-validation checks. we can use similar query rules to overcome this situation.

Conclusion: – For some application logic ProxySQL query cache can achieve ~2X performance boost at a minimum and can be more better if ProxySQL instance closer to the application.

ProxySQL query cache may serve stale resultsets due to the way it invalidates cached data (cached data are valid for the interval specified by “cache_ttl”, while MySQL’s cached data get invalidated each time data change). Every query that is cached may return stale data and this may or may not be acceptable by the application so make proper use of this feature on requirements.

Some FAQ’s answer on query cache :

1. Does prepared statement support ProxySQL QC?

– Query cache doesn’t work with PS

2. Does query cache pre-allocate memory based on query_cache_size_MB? Should we reduce the value if not in use?

– No, query cache is not preallocated, Therefore no need to reduce it.

3. How to purge content of the query cache

– There is no command to immediately delete the whole content of the query cache. To define cache invalidation only make use of “cache_ttl”

Some benchmark has been performed [here](https://proxysql.com/blog/scaling-with-proxysql-query-cache) to showcase how ProxySQL Query Cache boosts performance!

More details here : https://github.com/sysown/proxysql/wiki/Query-Cache

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s