30 mins with MySQL Query Rewriter

February 25, 2016

Lire cet article en français

TL;TR

Sometime DBAs have to deal with problematic queries and cannot tackle the problem at the source (problematic queries from ORMs, third party apps,… or source unknown…).

MySQL 5.7 provides :

  • A pre and post parse query rewrite APIs
    • Users can write their own plug-ins
    • You can eliminates many legacy use cases for proxies
  • With the post-parse query plugin, you can:
    • Rewrite problematic queries without the need to make application changes
    • Add hints
    • Modify join order

 

Query Rewrite Plugin APIs

Paraphrasing this MySQL Server Blog’s post :

MySQL now offer two APIs for writing query rewrite plugins.

Pre-parse rewrite plugin API,  is for when you know exactly – as in character-by-character exactly – what the offending queries look like. This one has a hook to intercept the query string right before it’s parsed.

Post-parse rewrite plugin API, comes in right after parsing and acts on the parse tree. It offers the basic functionality to walk over the parsed query, which is a lot more efficient than dealing with a string.

This blog post is about the Rewriter plugin, a post-parse query rewrite plugin, include in the MySQL 5.7 distribution (as of MySQL 5.7.6).

 

Rewriter plugin

Installation and checks

You are in MySQL world, so that’s obviously very easy! 🙂

To install the Rewriter query rewrite plugin, run install_rewriter.sql located in the share directory of your MySQL installation.

 

You can now check if everything is ok

New database objects were also created

 

The rewrite_rules table in the query_rewrite database provides persistent storage for the rules that the Rewriter plugin uses to decide whether to rewrite statements.

 

You can enable and disable the plugin at runtime :

 

Obviously it also possible to enable the plugin in the configuration file (my.cnf | my.ini)

 

 

Rewrite a query

Post parse rewrite plugin works only with SELECT statements.

You can try non-select statements but you’ll see, in rewrite_rules.message column, the following error message when you’ll flush the rules:

 

Rewriter plugin usage is very simple. Let’s start with a dummy example: transform a SELECT n into a SELECT n+1 (n is an integer).

Patterns

 -> SELECT 1      # Input

 <= SELECT 2    # Output

Add the rewrite rule

To add rules for the Rewriter plugin, add rows to the rewrite_rules table.

 

rewrite_rules table content is:

 

Rewriter’s stats shows:

 

Flush the rewrite rule

Then invoke the flush_rewrite_rules() stored procedure to load the rules from the table into the plugin.

When the plugin reads each rule from the rules table, it computes a normalized form (digest) from the pattern and a digest hash value, and updates the normalized_pattern and pattern_digest columns:

 

Within a pattern template, ? characters act as parameter markers that match data values. Parameter markers can be used only where data values should appear, not for SQL keywords, identifiers, and so forth. The ? characters should not be enclosed within quotation marks.

 

If the query parsing failed the stored procedure will raise an error:

You have more details in query_rewrite.rewrite_rules.message column

e.g.

Rewriter’s stats shows now:

 

Rewritten query

So we added and flushed the rules, we can now execute a query according to the pattern and see the behaviour

 

Magic !!!

The query was rewritten “on the fly”. However some interesting details are hiding in the warning.

All we need to know are in the Note : 1105

Rewriter’s stats are updated accordingly:

 

To disable an existing rule, modify its enabled column and reload the table into the plugin:

 

You can also delete table rows

 

OK folks! Let’s see more relevant examples

 

Rewrite plugin examples

Ex 1

Rewrite a Join into a Sub-query; ie for performance reason you must rewrite the query but you don’t have access to the app.

Patterns

 -> SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < {integer};

 <= SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < {integer});

 

Without the rule

 

Add the rewrite rule

 

With the rule

The query time is dropped from 12.93 s to 3.77

 

 

Ex 2

Limit query execution time; ie add MySQL 5.7 hint /*+ MAX_EXECUTION_TIME(X)*/ that is query duration can’t be more than X milliseconds

Patterns

 -> SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = {integer};

 <= SELECT /*+ MAX_EXECUTION_TIME(10000)*/ count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = {integer};

 

Without the rule

 

Add the rewrite rule

 

With the rule

 

 

Ex 3

Schema evolution; ie table column added (or dropped) but you can’t change the query (or not immediately)

Patterns

 -> SELECT first_name, last_name FROM employees.employees WHERE year(hire_date) = 2000;

 <= SELECT first_name, last_name, birth_date FROM employees.employees WHERE year(hire_date) = 2000;

 

Without the rule

 

Add the rewrite rule

 

With the rule

 

Other ideas ?

– Index hints : the optimizer don’t use the best index, so you can rewrite the query with USE | FORCE | IGNORE index

– Prevent SELECT with “infinite result, so you can rewrite the query adding LIMIT 1000 or whatever.

Like mysql client text –safe-update option

Sky is the limit 🙂

 

The Rewriter plugin does not replace proper code and server optimisation neither fine tuned query.

Nevertheless it can be very useful.

 

Give it a try, it definitely worth more than 30 minutes.

 

 

Going further

Post-parse query rewrite plugin

 

Pre-parse query rewrite plugin

 

Thank you for using MySQL!

Comments are closed.