30 mins with MySQL Query Rewriter

February 25, 2016

Lire cet article en français

TL;TR

Sometime DBAs have to deal with problematic queries and cannot tackle the problem at the source (problematic queries from ORMs, third party apps,… or source unknown…).

MySQL 5.7 provides :

  • A pre and post parse query rewrite APIs
    • Users can write their own plug-ins
    • You can eliminates many legacy use cases for proxies
  • With the post-parse query plugin, you can:
    • Rewrite problematic queries without the need to make application changes
    • Add hints
    • Modify join order

 

Query Rewrite Plugin APIs

Paraphrasing this MySQL Server Blog’s post :

MySQL now offer two APIs for writing query rewrite plugins.

Pre-parse rewrite plugin API,  is for when you know exactly – as in character-by-character exactly – what the offending queries look like. This one has a hook to intercept the query string right before it’s parsed.

Post-parse rewrite plugin API, comes in right after parsing and acts on the parse tree. It offers the basic functionality to walk over the parsed query, which is a lot more efficient than dealing with a string.

This blog post is about the Rewriter plugin, a post-parse query rewrite plugin, include in the MySQL 5.7 distribution (as of MySQL 5.7.6).

 

Rewriter plugin

Installation and checks

You are in MySQL world, so that’s obviously very easy! 🙂

To install the Rewriter query rewrite plugin, run install_rewriter.sql located in the share directory of your MySQL installation.

~ $ mysql -u root -p < install_rewriter.sql

 

You can now check if everything is ok

mysql> SELECT * FROM mysql.plugin;
+--------------------------+-------------+
| name                     | dl          |
+--------------------------+-------------+
| rewriter                 | rewriter.so |
+--------------------------+-------------+

mysql> SHOW GLOBAL VARIABLES LIKE 'rewriter%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| rewriter_enabled | ON    |
| rewriter_verbose | 1     |
+------------------+-------+

mysql> SHOW GLOBAL STATUS LIKE 'rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 0     |
| Rewriter_number_reloads           | 1     |
| Rewriter_number_rewritten_queries | 0     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

New database objects were also created

# Stored procedure

mysql> SELECT ROUTINE_NAME, ROUTINE_TYPE FROM information_schema.ROUTINES 
WHERE ROUTINE_SCHEMA='query_rewrite';
+---------------------+--------------+
| ROUTINE_NAME        | ROUTINE_TYPE |
+---------------------+--------------+
| flush_rewrite_rules | PROCEDURE    |
+---------------------+--------------+


# UDF

mysql> SELECT name, dl, type FROM mysql.func 
WHERE name LIKE '%rewrite%';
+--------------------+-------------+----------+
| name               | dl          | type     |
+--------------------+-------------+----------+
| load_rewrite_rules | rewriter.so | function |
+--------------------+-------------+----------+



# New table is created in a new schema

mysql> SHOW TABLES IN query_rewrite;
+-------------------------+
| Tables_in_query_rewrite |
+-------------------------+
| rewrite_rules           |
+-------------------------+

 

The rewrite_rules table in the query_rewrite database provides persistent storage for the rules that the Rewriter plugin uses to decide whether to rewrite statements.

mysql> SHOW CREATE TABLE query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
       Table: rewrite_rules
Create Table: CREATE TABLE `rewrite_rules` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `pattern` varchar(10000) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `pattern_database` varchar(20) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `replacement` varchar(10000) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `enabled` enum('YES','NO') CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'YES',
  `message` varchar(1000) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `pattern_digest` varchar(32) DEFAULT NULL,
  `normalized_pattern` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

 

You can enable and disable the plugin at runtime :

mysql> SET GLOBAL rewriter_enabled = ON;

mysql> SET GLOBAL rewriter_enabled = OFF;

 

Obviously it also possible to enable the plugin in the configuration file (my.cnf | my.ini)

[mysqld]

rewriter_enabled = ON

 

 

Rewrite a query

Post parse rewrite plugin works only with SELECT statements.

You can try non-select statements but you’ll see, in rewrite_rules.message column, the following error message when you’ll flush the rules:

          message: Pattern needs to be a select statement.

 

Rewriter plugin usage is very simple. Let’s start with a dummy example: transform a SELECT n into a SELECT n+1 (n is an integer).

Patterns

 -> SELECT 1      # Input

 <= SELECT 2    # Output

Add the rewrite rule

To add rules for the Rewriter plugin, add rows to the rewrite_rules table.

mysql> INSERT INTO query_rewrite.rewrite_rules (pattern, replacement) VALUES('SELECT ?', 'SELECT ? + 1');

 

rewrite_rules table content is:

mysql> SELECT * FROM query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
               id: 1
          pattern: SELECT ?
 pattern_database: NULL
      replacement: SELECT ? + 1
          enabled: YES
          message: NULL
    pattern_digest: NULL
normalized_pattern: NULL

 

Rewriter’s stats shows:

mysql> SHOW GLOBAL status LIKE '%rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 0     |
| Rewriter_number_reloads           | 1     |
| Rewriter_number_rewritten_queries | 0     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

 

Flush the rewrite rule

Then invoke the flush_rewrite_rules() stored procedure to load the rules from the table into the plugin.

mysql> CALL query_rewrite.flush_rewrite_rules();
Query OK, 0 rows affected (0,01 sec)

When the plugin reads each rule from the rules table, it computes a normalized form (digest) from the pattern and a digest hash value, and updates the normalized_pattern and pattern_digest columns:

mysql> SELECT * FROM query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
               id: 1
          pattern: SELECT ?
 pattern_database: NULL
      replacement: SELECT ? + 1
          enabled: YES
          message: NULL
    pattern_digest: 2c7e64d74a4f06d8ceff62d23ae9180c
normalized_pattern: select ?

 

Within a pattern template, ? characters act as parameter markers that match data values. Parameter markers can be used only where data values should appear, not for SQL keywords, identifiers, and so forth. The ? characters should not be enclosed within quotation marks.

 

If the query parsing failed the stored procedure will raise an error:

mysql> CALL query_rewrite.flush_rewrite_rules();
ERROR 1644 (45000): Loading of some rule(s) failed.

You have more details in query_rewrite.rewrite_rules.message column

e.g.

message: Parse error in replacement: "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1"

Rewriter’s stats shows now:

mysql> SHOW GLOBAL status LIKE '%rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 1     |
| Rewriter_number_reloads           | 2     |
| Rewriter_number_rewritten_queries | 0     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

 

Rewritten query

So we added and flushed the rules, we can now execute a query according to the pattern and see the behaviour

mysql> SELECT 1;
+-------+
| 1 + 1 |
+-------+
|     2 |
+-------+
1 row in set, 1 warning (0,00 sec)

 

Magic !!!

The query was rewritten “on the fly”. However some interesting details are hiding in the warning.

mysql> SHOW WARNINGS;
+-------+------+------------------------------------------------------------------------+
| Level | Code | Message                                                                |
+-------+------+------------------------------------------------------------------------+
| Note  | 1105 | Query 'select 1' rewritten to 'SELECT 1 + 1' by a query rewrite plugin |
+-------+------+------------------------------------------------------------------------+

All we need to know are in the Note : 1105

Rewriter’s stats are updated accordingly:

mysql> SHOW GLOBAL status LIKE '%rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 1     |
| Rewriter_number_reloads           | 2     |
| Rewriter_number_rewritten_queries | 1     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

 

To disable an existing rule, modify its enabled column and reload the table into the plugin:

mysql> UPDATE query_rewrite.rewrite_rules SET enabled = 'NO' WHERE id = 1;

mysql> CALL query_rewrite.flush_rewrite_rules();

 

You can also delete table rows

mysql> TRUNCATE TABLE query_rewrite.rewrite_rules;  # Delete all the rows (all the rules)

mysql> CALL query_rewrite.flush_rewrite_rules();

 

OK folks! Let’s see more relevant examples

 

Rewrite plugin examples

Ex 1

Rewrite a Join into a Sub-query; ie for performance reason you must rewrite the query but you don’t have access to the app.

Patterns

 -> SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < {integer};

 <= SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < {integer});

 

Without the rule

SELECT count(distinct emp_no) 
FROM employees.employees INNER JOIN employees.salaries USING(emp_no) 
WHERE DATEDIFF(to_date, from_date) < 2000;
+------------------------+
| count(distinct emp_no) |
+------------------------+
|                 300024 |
+------------------------+

1 row in set (12,93 sec)

 

Add the rewrite rule

INSERT INTO query_rewrite.rewrite_rules 
(
pattern, 
replacement
) 
VALUES
(
'SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < ?', 
'SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < ?)'
);


CALL query_rewrite.flush_rewrite_rules();

 

With the rule

SELECT count(distinct emp_no) 
FROM employees.employees INNER JOIN employees.salaries USING(emp_no) 
WHERE DATEDIFF(to_date, from_date) < 2000;
+---------------+
| count(emp_no) |
+---------------+
|        300024 |
+---------------+

1 row in set, 1 warning (3,77 sec)

The query time is dropped from 12.93 s to 3.77

SHOW WARNINGS;
*************************** 1. row ***************************
 Level: Note
  Code: 1105
Message: Query 'SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < 2000' rewritten to 'SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < 2000)' by a query rewrite plugin

 

 

Ex 2

Limit query execution time; ie add MySQL 5.7 hint /*+ MAX_EXECUTION_TIME(X)*/ that is query duration can’t be more than X milliseconds

Patterns

 -> SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = {integer};

 <= SELECT /*+ MAX_EXECUTION_TIME(10000)*/ count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = {integer};

 

Without the rule

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 110000;
1 row in set (11,82 sec)

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 70000;
1 row in set (9,22 sec)

 

Add the rewrite rule

INSERT INTO query_rewrite.rewrite_rules 
(
pattern, 
replacement, 
pattern_database
) 
VALUES
(
'SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = ?', 
'SELECT /*+ MAX_EXECUTION_TIME(10000)*/ count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary =  ?', 
'employees'
);


CALL query_rewrite.flush_rewrite_rules();

 

With the rule

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 110000;
ERROR 3024 (HY000): Query execution was interrupted, maximum statement execution time exceeded

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 70000;
1 row in set (9,85 sec)

 

 

Ex 3

Schema evolution; ie table column added (or dropped) but you can’t change the query (or not immediately)

Patterns

 -> SELECT first_name, last_name FROM employees.employees WHERE year(hire_date) = 2000;

 <= SELECT first_name, last_name, birth_date FROM employees.employees WHERE year(hire_date) = 2000;

 

Without the rule

SELECT 
    first_name, 
    last_name 
FROM employees.employees WHERE year(hire_date) = 2000;
+-------------+------------+
| first_name  | last_name  |
+-------------+------------+
| Ulf         | Flexer     |

...
13 rows in set (0,01 sec)

 

Add the rewrite rule

INSERT INTO query_rewrite.rewrite_rules 
(
pattern, 
replacement, 
pattern_database
) 
VALUES
(
'SELECT first_name, last_name FROM employees.employees WHERE year(hire_date) = ?', 
'SELECT first_name, last_name, birth_date FROM employees.employees WHERE year(hire_date) =  ?', 
'employees'
);

CALL query_rewrite.flush_rewrite_rules();

 

With the rule

SELECT 
    first_name, 
    last_name 
FROM employees.employees WHERE year(hire_date) = 2000;
+-------------+------------+------------+
| first_name  | last_name  | birth_date |
+-------------+------------+------------+
| Ulf         | Flexer     | 1960-09-09 |
...

13 rows in set (0,01 sec)

 

Other ideas ?

– Index hints : the optimizer don’t use the best index, so you can rewrite the query with USE | FORCE | IGNORE index

– Prevent SELECT with “infinite result, so you can rewrite the query adding LIMIT 1000 or whatever.

Like mysql client text –safe-update option

 $ ./mysql --help | grep dummy
  -U, --i-am-a-dummy  Synonym for option --safe-updates, -U.

Sky is the limit 🙂

 

The Rewriter plugin does not replace proper code and server optimisation neither fine tuned query.

Nevertheless it can be very useful.

 

Give it a try, it definitely worth more than 30 minutes.

 

 

Going further

Post-parse query rewrite plugin

 

Pre-parse query rewrite plugin

 

Thank you for using MySQL!

Comments are closed.