Aurora MySQL Parallel query Performance and its gains

As a continuation to my previous blog , I have made benchmarking to find the performance improvement using parallel query.Before starting the testing, we will go through the monitoring stats variables and query analysis (explain).

Monitoring Aurora Parallel Query stats

S:NoParameter Purpose
1Aurora_pq_request_attemptedNo of attempt that query tried to use parallel query feature.
2Aurora_pq_request_executedNo of query executed by parallel query
3Aurora_pq_request_not_chosen_below_min_rowsDue to smaller row in table how many query bypass the parallel query.
4Aurora_pq_max_concurrent_requestsRepresent the no of parallel query can execute based on instance type.
5Aurora_pq_request_in_progressCurrent number of query which is executing by parallel query.

The above variables are status variables, many more variables are there for monitoring. Kindly have a look on. https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-mysql-parallel-query.html#aurora-mysql-parallel-query-monitoring

Performance analysis

Lab Environment :

Instance type (writer and reader)r3.large
Aurora version2.09
MySQL version5.7
Table size255 GB
Data Simulationsysbench
Architecture1 reader and 1 writer

Explain plan

By using the explain plain of the query we can able to see whether query is using parallel query or not.

Without parallel query

mysql>  explain select count(id) from sbtest1 where id > 12345 and k < 6789;
+----+-------------+---------+------------+-------+---------------+---------+---------+------+-----------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows      | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+-----------+----------+-------------+
|  1 | SIMPLE      | sbtest1 | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL | 654698147 |    33.33 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+-----------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

With parallel query

mysql>  explain select count(id) from sbtest1 where id > 12345 and k < 6789;
+----+-------------+---------+------------+------+---------------+------+---------+------+------------+----------+----------------------------------------------------------------------------+
| id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows       | filtered | Extra                                                                      |
+----+-------------+---------+------------+------+---------------+------+---------+------+------------+----------+----------------------------------------------------------------------------+
|  1 | SIMPLE      | sbtest1 | NULL       | ALL  | PRIMARY       | NULL | NULL    | NULL | 1309396294 |    16.66 | Using where; Using parallel query (2 columns, 2 filters, 0 exprs; 0 extra) |
+----+-------------+---------+------------+------+---------------+------+---------+------+------------+----------+----------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

Observation in Extra ( Explain )

In explain plain, extra column we can see there are 4 options along with that. We will see one by one explanation of it.

Using parallel query (2 columns, 2 filters, 0 exprs; 0 extra)

ColumnsNo.of column in the query.
FiltersNo.of column in where clause with equal, not-equal, range.
ExprsColumn with function or operator , that can proceed by the parallel query.
ExtraNumber of expression that cannot be proceed by parallel query.

Testing

Instead of testing only with single format, we are going to test the parallel query with different WHERE clause and conditions.

Using function

Without parallel query

mysql> select sql_no_cache sum(k) from sbtest1 where upper(k)=231212 and upper(c) is not null;
+--------+
| sum(k) |
+--------+
|   NULL |
+--------+
1 row in set (2 hours 33 min 22.40 sec)

Without parallel Query there is a spike in system resources persists for a prolonged period.

With parallel query

mysql> select sql_no_cache sum(k) from sbtest1 where upper(k)=231212 and upper(c) is not null;
+--------+
| sum(k) |
+--------+
|   NULL |
+--------+
1 row in set (1 min 6.61 sec)

From the above stats, we can able to understand from the above stats that without parallel query the CPU utilisation was in complete saturation state and also the read latency of the server got increased due to the query. And the query execution took around 2 hours 30 mins to complete.

We can see there is a dramatic change post enabling the parallel query. The execution just took 1 min to complete as well as the resource utilisation was less, CPU just spike to 70% to process the query and the latency is also less.

Using Eq Ref

Without parallel query

mysql> select sql_no_cache count(*) from sbtest1 where k=7256238746;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (2 hours 25 min 9.11 sec)

With parallel query

mysql> select count(*) from sbtest1 where k=7256238746;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (2 min 12.22 sec)

Using Join Condition

Without parallel query

mysql> select sql_no_cache count(t1.k) from sbtest1 t1 inner join sbtest3 t2 on t1.id=t2.id where t1.k=247423;
+-------------+
| count(t1.k) |
+-------------+
|           0 |
+-------------+
1 row in set (10 min 57.72 sec)

With parallel query

mysql> select count(t1.k) from sbtest1 t1 inner join sbtest3 t2 on t1.id=t2.id where t1.k=247423;
+-------------+
| count(t1.k) |
+-------------+
|           0 |
+-------------+
1 row in set (2 min 38.48 sec)

Performance improvement

We have done a testing based on different pattern, now it’s time to consolidate the performance report.

Parallel QueryDisabledEnabled
Function2 hrs 33 mins1 min
Eq Ref2 hrs 25 mins2 mins
Inner Join10 mins2 mins

By the stats we can summarise that parallel query can be more gainful for MySQL analytical workload with better performance. We have gained around 150X performance in a few cases.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s