MySQL Security – User Account Locking

March 14, 2018

If you are using MySQL 8.0, I would recommend you to read : MySQL Security – Failed-Login Tracking and Temporary Account Locking


When thinking about security within a MySQL installation, you should consider a wide range of possible procedures / best practices and how they affect the security of your MySQL server and related applications. MySQL provides many tools / features / plugins in order to protect your data including some advanced features like Transparent Data Encryption (aka TDE), Audit, Data Masking & De-Identification, Firewall, Password Management, Password Validation Plugin, The Connection-Control Plugins, etc…

MySQL Security

For security reasons, some contexts require you to set up a user account locking policy, so that an unauthorized user is no longer able to log in to the MySQL server. In this 3rd article of the MySQL Security series, we will see how to [un]lock a user account.

User Account Locking

MySQL supports locking and unlocking user accounts using the ACCOUNT LOCK and ACCOUNT UNLOCK clauses for the CREATE USER and ALTER USER statements:

  • When used with CREATE USER, these clauses specify the initial locking state for a new account. In the absence of either clause, the account is created in an unlocked state.
  • When used with ALTER USER, these clauses specify the new locking state for an existing account. In the absence of either clause, the account locking state remains unchanged.
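As a quick illustration, here is a minimal sketch of both clauses (the account name below is a throwaway example, not used in the rest of this article): an account can be created locked from the start and unlocked later.

mysql> 
-- Create an account that cannot log in until it is explicitly unlocked
CREATE USER 'locked_app'@'localhost' IDENTIFIED BY 'a_Str0ng-P4ssw0rd' ACCOUNT LOCK;

-- Unlock it when the account is ready to be used
ALTER USER 'locked_app'@'localhost' ACCOUNT UNLOCK;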

So let’s create a user batchman { not this one 🙂 } with all privileges on sakila database:

mysql> 
select VERSION();
+-----------+
| version() |
+-----------+
| 5.7.21    |
+-----------+

CREATE USER batchman@localhost IDENTIFIED BY 'p4s5W0%d';

GRANT ALL ON sakila.* TO batchman@localhost;

Account locking state is recorded in the account_locked column of the mysql.user table:

mysql> 
SELECT user, host, account_locked FROM mysql.user WHERE user = 'batchman'\G
*************************** 1. row ***************************
          user: batchman
          host: localhost
account_locked: N

The output from SHOW CREATE USER indicates whether an account is locked or unlocked:

mysql> 
SHOW CREATE USER batchman@localhost\G
*************************** 1. row ***************************
CREATE USER for batchman@localhost: CREATE USER 'batchman'@'localhost' IDENTIFIED WITH 'mysql_native_password' 
AS '*15E0AF3C6647A20428477A460202CD7C89D78DDF' REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK

Indeed, accounts are unlocked by default.

The new user account is allowed to connect to the server and can access sakila’s objects:

$ mysql -u batchman -p

mysql_batchman> 
SELECT USER(); SHOW SCHEMAS;
+--------------------+
| USER()             |
+--------------------+
| batchman@localhost |
+--------------------+


+--------------------+
| Database           |
+--------------------+
| information_schema |
| sakila             |
+--------------------+

Lock account

What if your security policy requires you to lock this account?

The ACCOUNT LOCK clause is the answer:

mysql> 
ALTER USER batchman@localhost ACCOUNT LOCK;


SELECT user, host, account_locked FROM mysql.user WHERE user = 'batchman';
+----------+-----------+----------------+
| user     | host      | account_locked |
+----------+-----------+----------------+
| batchman | localhost | Y              |
+----------+-----------+----------------+


mysql> 
SHOW CREATE USER batchman@localhost\G
*************************** 1. row ***************************
CREATE USER for batchman@localhost: CREATE USER 'batchman'@'localhost' 
IDENTIFIED WITH 'mysql_native_password' AS '*15E0AF3C6647A20428477A460202CD7C89D78DDF' 
REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT LOCK

batchman@localhost is now locked!

Any connection attempt with this user will fail:

-- User try to connect
$ mysql -u batchman -p 

ERROR 3118 (HY000): Access denied for user 'batchman'@'localhost'. Account is locked.

Note.

If the user is still connected, you’ll need to kill the connection: locking only blocks new connection attempts.

MySQL returns an ER_ACCOUNT_HAS_BEEN_LOCKED error.

The server increments the Locked_connects status variable, which indicates the number of attempts to connect to a locked account, and writes a message to the error log:

-- Back to the administrator session
mysql> 
SHOW GLOBAL STATUS LIKE 'Locked_connects';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Locked_connects | 1     |
+-----------------+-------+

-- In the error log
mysql> system tail -n1 /var/log/mysqld.log
2018-02-01T15:30:35.649393Z 65 [Note] Access denied for user 'batchman'@'localhost'. Account is locked.
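To restore access, the ACCOUNT UNLOCK clause reverses the operation. A minimal sketch, run from the administrator session:

mysql> 
ALTER USER batchman@localhost ACCOUNT UNLOCK;

-- account_locked is back to N and new connections are accepted again
SELECT user, host, account_locked FROM mysql.user WHERE user = 'batchman';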

In order to go further

MySQL Security Series

  1. Password Validation Plugin
  2. Password Management
  3. User Account Locking
  4. The Connection-Control Plugins
  5. Enterprise Audit
  6. Enterprise Transparent Data Encryption (TDE)
  7. Enterprise Firewall
  8. Enterprise Data Masking and De-Identification

Reference Manual

MySQL Security

Thanks for using MySQL!


MySQL Security – Password Management

March 7, 2018

When thinking about security within a MySQL installation, you should consider a wide range of possible procedures / best practices and how they affect the security of your MySQL server and related applications. MySQL provides many tools / features / plugins in order to protect your data including some advanced features like Transparent Data Encryption (aka TDE), Audit, Data Masking & De-Identification, Firewall, Password Validation Plugin, User Account Locking, The Connection-Control Plugins, etc…

MySQL Security

Some regulations require that passwords be renewed in a timely and appropriate manner (e.g. every 90 days). In this article, 2nd of the MySQL Security series, we will see how to establish a password expiration policy with MySQL 5.7 Password Management.

Password Management

Basic password policy practice teaches us:

  • Each user must have a password
  • A user’s password should be changed periodically

MySQL provides password-expiration capability, which enables database administrators to require that users reset their password.

Let’s have a closer look!

$ mysql -u root -p
mysql {root}> 
SELECT VERSION();
+------------+
| VERSION()  |
+------------+
| 5.7.21-log |
+------------+

Basically there are different clauses a DBA can use with CREATE USER or ALTER USER to establish a per account password expiration policy.

Let’s play with some of them.

PASSWORD EXPIRE

Force the user to change their password at the first connection.

-- Mark the password expired so that the user must choose a new one at the first connection to the server
mysql {root}> 
CREATE USER 'aUser'@'localhost' IDENTIFIED BY 'AutoGeneratedPwd' PASSWORD EXPIRE;

SELECT user, host, password_lifetime, password_expired, password_last_changed FROM mysql.user WHERE user = 'aUser'\G
*************************** 1. row ***************************
                 user: aUser
                 host: localhost
    password_lifetime: NULL
     password_expired: Y
password_last_changed: 2018-02-08 14:22:24

Note that password_expired column is Y.

This new MySQL user will be able to connect to the server but must first reset their password.

$ mysql -u aUser -p
...

mysql {aUser}> 
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

ALTER USER 'aUser'@'localhost' IDENTIFIED BY 'new_passw0rd';
Query OK, 0 rows affected (0.00 sec)


SELECT USER();
+-----------------+
| USER()          |
+-----------------+
| aUser@localhost |
+-----------------+

Column password_expired is now N.

mysql {root}> 
SELECT user, host, password_lifetime, password_expired, password_last_changed FROM mysql.user WHERE user = 'aUser'\G
*************************** 1. row ***************************
                 user: aUser
                 host: localhost
    password_lifetime: NULL
     password_expired: N
password_last_changed: 2018-02-08 14:23:50

PASSWORD EXPIRE INTERVAL N DAY

Force the user to change their password every N days.

mysql {root}> 
-- Require that a new password be chosen every 90 days:
CREATE USER 'aNewUser'@'localhost' IDENTIFIED BY 'n3w_password'  PASSWORD EXPIRE INTERVAL 90 DAY;

SELECT user, host, password_lifetime, password_expired, password_last_changed FROM mysql.user WHERE user = 'aNewUser'\G
*************************** 1. row ***************************
                 user: aNewUser
                 host: localhost
    password_lifetime: 90
     password_expired: N
password_last_changed: 2018-02-08 14:24:53

Note that password_lifetime column is 90.

After 90 days any statement will generate error 1820 :

$ mysql -u aNewUser -p

mysql {aNewUser}> SELECT USER();
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.

The password can be reset with the ALTER USER command:

mysql {root}>
ALTER USER 'aNewUser'@'localhost' IDENTIFIED BY '4noth3r_password' PASSWORD EXPIRE INTERVAL 90 DAY;

PASSWORD EXPIRE NEVER

It’s also possible to disable password expiration for an account

mysql {root}>
-- Disables password expiration for the account so that its password never expires.
CREATE USER 'pingDB'@'localhost' IDENTIFIED BY 'new_p4ssword' PASSWORD EXPIRE NEVER;

SELECT user, host, password_lifetime, password_expired, password_last_changed FROM mysql.user WHERE user = 'pingDB'\G
*************************** 1. row ***************************
                 user: pingDB
                 host: localhost
    password_lifetime: 0
     password_expired: N
password_last_changed: 2018-02-08 14:29:43

Note that password_lifetime column is 0.

PASSWORD EXPIRE DEFAULT

This clause sets the account so that the global password expiration policy applies, as specified by the default_password_lifetime system variable.

In MySQL 5.7 it applies to accounts that use MySQL built-in authentication methods (accounts that use an authentication plugin of mysql_native_password or sha256_password).

The default default_password_lifetime value is 0, which disables automatic password expiration. If the value of default_password_lifetime is a positive integer N, it indicates the permitted password lifetime; passwords must be changed every N days.

default_password_lifetime can be changed at runtime with the SET GLOBAL command. However, it must be set in the MySQL configuration file to persist across restarts.

e.g.

$ cat my.cnf 

[mysqld]
# global policy that passwords have a lifetime of approximately 3 months
default_password_lifetime=90
...

e.g. Create a user account with the default global password policy (90 days)

mysql>
-- default_password_lifetime set to 90 days
SET GLOBAL default_password_lifetime = 90;

SHOW VARIABLES LIKE 'default_password_lifetime';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| default_password_lifetime | 90    |
+---------------------------+-------+

-- Sets the account so that the global expiration policy applies, as specified by the default_password_lifetime system variable.
CREATE USER 'dba'@'localhost' IDENTIFIED BY 'new_pas5word' PASSWORD EXPIRE DEFAULT;
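The same clause also works with ALTER USER, which is handy to bring an existing account back under the global policy. A minimal sketch, reusing the pingDB account created above:

mysql>
-- Switch the account back to the global default_password_lifetime policy
ALTER USER 'pingDB'@'localhost' PASSWORD EXPIRE DEFAULT;

-- password_lifetime is now NULL, meaning the global policy applies
SELECT user, host, password_lifetime FROM mysql.user WHERE user = 'pingDB';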

In order to go further

MySQL Security Series

  1. Password Validation Plugin
  2. Password Management
  3. User Account Locking
  4. The Connection-Control Plugins
  5. Enterprise Audit
  6. Enterprise Transparent Data Encryption (TDE)
  7. Enterprise Firewall
  8. Enterprise Data Masking and De-Identification

Reference Manual

MySQL Security

 

Thanks for using MySQL!


MySQL Security – Password Validation Plugin

March 1, 2018

When thinking about security within a MySQL installation, you should consider a wide range of possible procedures / best practices and how they affect the security of your MySQL server and related applications. MySQL provides many tools / features / plugins in order to protect your data including some advanced features like Transparent Data Encryption (aka TDE), Audit, Data Masking & De-Identification, Firewall, Password Management, User Account Locking, The Connection-Control Plugins, etc…

MySQL Security

In this article, 1st of a MySQL Security series, we will see how to enforce Strong Passwords with Password Validation Plugin when using MySQL 5.7.

Authentication with an ID and password is a very simple and common (because it’s simple) way to secure access to a resource; however, the password can be the weak point of this system. In order to increase the security level, you can require that your user passwords meet certain minimal security requirements, using the MySQL Password Validation Plugin!

Password Validation Plugin

The Password validation plugin serves to test passwords and improve security. It exposes a set of system variables that enable you to define password policy.

For ALTER USER, CREATE USER, GRANT, and SET PASSWORD statements the plugin checks the password against the current password policy and rejects it if it is weak.

Examples are made with MySQL CE 5.7.21 on Linux:

mysql> SHOW VARIABLES LIKE 'version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| version                 | 5.7.21                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+

Installation

Plugins are located in the… plugin directory. To find out where your MySQL plugin directory is, you can use SHOW VARIABLES:

mysql> 
SHOW VARIABLES LIKE 'plugin_dir';
+---------------+--------------------------+
| Variable_name | Value                    |
+---------------+--------------------------+
| plugin_dir    | /usr/lib64/mysql/plugin/ |
+---------------+--------------------------+

system ls -l /usr/lib64/mysql/plugin/ | grep validate
-rwxr-xr-x 1 root root   29336 Dec 28 04:07 validate_password.so

Use the regular INSTALL PLUGIN statement:

-- Install validate_password plugin
mysql> 
INSTALL PLUGIN validate_password SONAME 'validate_password.so';

-- Check validate_password status
SELECT PLUGIN_NAME, PLUGIN_STATUS 
FROM INFORMATION_SCHEMA.PLUGINS 
WHERE PLUGIN_NAME LIKE 'validate%';
+-------------------+---------------+
| PLUGIN_NAME       | PLUGIN_STATUS |
+-------------------+---------------+
| validate_password | ACTIVE        |
+-------------------+---------------+

INSTALL PLUGIN loads the plugin, and also registers it in the mysql.plugin system table so that the plugin is loaded on each subsequent normal server startup.

Alternatively you can modify the MySQL configuration file (e.g. my.cnf or my.ini) and reboot the instance.

e.g.

# sample from my.cnf
[mysqld]
plugin-load-add=validate_password.so

Once installed, some system and status variables are available:

mysql> SHOW VARIABLES LIKE 'validate%';
+--------------------------------------+--------+
| Variable_name                        | Value  |
+--------------------------------------+--------+
| validate_password_check_user_name    | OFF    |
| validate_password_dictionary_file    |        |
| validate_password_length             | 8      |
| validate_password_mixed_case_count   | 1      |
| validate_password_number_count       | 1      |
| validate_password_policy             | MEDIUM |
| validate_password_special_char_count | 1      |
+--------------------------------------+--------+


SHOW STATUS LIKE 'validate%';
+-----------------------------------------------+---------------------+
| Variable_name                                 | Value               |
+-----------------------------------------------+---------------------+
| validate_password_dictionary_file_last_parsed | 2018-02-06 14:58:19 |
| validate_password_dictionary_file_words_count | 0                   |
+-----------------------------------------------+---------------------+

They are described here.
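The plugin also exposes the VALIDATE_PASSWORD_STRENGTH() SQL function, which returns a score from 0 (weak) to 100 (strong) against the current settings; it is convenient for testing candidate passwords. A quick sketch with throwaway passwords:

mysql> 
SELECT VALIDATE_PASSWORD_STRENGTH('abc') AS weak, VALIDATE_PASSWORD_STRENGTH('p_P45678!xYz') AS stronger;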

Playtime

Let’s play a little bit with the Password Validation Plugin.

Set Password Validation Plugin to the LOW level

When validate_password_policy is set to LOW (or 0), it checks only the password length, which must be at least validate_password_length characters (8 by default).

mysql> 
SET GLOBAL validate_password_policy = 0;


SHOW VARIABLES LIKE 'validate_password_policy';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| validate_password_policy | LOW   |
+--------------------------+-------+

Warning

Passwords in the following examples are not secure. Do NOT use trivial passwords!

A user creation that does not satisfy the policy will fail:

mysql> 
-- NOK because password length < 8 
CREATE USER u_low1 IDENTIFIED by 'p';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements


-- OK because password length >= 8
CREATE USER u_low2 IDENTIFIED by 'p2345678';
Query OK, 0 rows affected (0.01 sec)

CREATE USER u_low3 IDENTIFIED by 'pppppppp';
Query OK, 0 rows affected (0.00 sec)



mysql> 
-- new users created
SELECT user FROM mysql.user WHERE user LIKE 'u%';
+--------+
| user   |
+--------+
| u_low2 |
| u_low3 |
+--------+

Set Password Validation Plugin to the MEDIUM level

When validate_password_policy is set to MEDIUM (or 1) it checks

  • the length i.e. validate_password_length >= 8 (by default)
  • numeric
  • lowercase/uppercase
  • special characters
mysql> 
SET GLOBAL validate_password_policy = 1;


SHOW VARIABLES LIKE 'validate_password_policy';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| validate_password_policy | MEDIUM |
+--------------------------+--------+
mysql>
-- NOK because password length < 8, no special character, nor numeric, nor uppercase
CREATE USER u_medium1 IDENTIFIED by 'p';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirement

-- NOK because no special character, nor uppercase
CREATE USER u_medium2 IDENTIFIED by 'p2345678';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements

-- NOK because no uppercase
CREATE USER u_medium3 IDENTIFIED by 'p_345678';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements

-- NOK because no uppercase
CREATE USER u_medium4 IDENTIFIED by 'p_p45678';
ERROR 1819 (HY000): Your password does not satisfy the current policy requirements


-- OK because password length >= 8, numeric, lowercase/uppercase, special character
CREATE USER u_medium5 IDENTIFIED by 'p_P45678';
Query OK, 0 rows affected (0.00 sec)



mysql>
-- new users created
SELECT user FROM mysql.user WHERE user LIKE 'u%';
+-----------+
| user      |
+-----------+
| u_low2    |
| u_low3    |
| u_medium5 |
+-----------+

Set Password Validation Plugin to the STRONG level

When validate_password_policy is set to STRONG (or 2) it checks

  • the length i.e. validate_password_length >= 8 (by default)
  • numeric
  • lowercase/uppercase
  • special characters
  • dictionary file
mysql> 
SET GLOBAL validate_password_policy = 2;


SHOW VARIABLES LIKE 'validate_password_policy';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| validate_password_policy | STRONG |
+--------------------------+--------+

The main difference from the MEDIUM policy is the ability to check passwords against a dictionary file, configured with the validate_password_dictionary_file variable. By default, this variable has an empty value and dictionary checks are not performed.

-- No dictionary file by default
mysql> 
SHOW VARIABLES LIKE 'validate_password_dictionary_file';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| validate_password_dictionary_file |       |
+-----------------------------------+-------+
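As a sketch, assuming a word list already exists at /var/lib/mysql-files/dictionary.txt (a hypothetical path), the variable can be set at runtime in MySQL 5.7 (5.7.8 and later) and dictionary checks then apply under the STRONG policy:

mysql> 
SET GLOBAL validate_password_dictionary_file = '/var/lib/mysql-files/dictionary.txt';

-- Check when the file was last parsed and how many words were loaded
SHOW STATUS LIKE 'validate_password_dictionary_file%';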

In order to go further

MySQL Security Series

  1. Password Validation Plugin
  2. Password Management
  3. User Account Locking
  4. The Connection-Control Plugins
  5. Enterprise Audit
  6. Enterprise Transparent Data Encryption (TDE)
  7. Enterprise Firewall
  8. Enterprise Data Masking and De-Identification

Reference Manual

MySQL Security

 

Thanks for using MySQL!


Setting up ProxySQL 1.4 with MySQL 5.7 Group Replication

January 9, 2018

Lire cet article en français

There are 3 pillars for a database architecture:

  • Monitoring
  • Backup / Restore process
  • High Availability

This blog post is about database High Availability; more precisely about one of the best combo of the moment :
MySQL + ProxySQL

 

  • MySQL 5.7 Group Replication : the only native HA solution for MySQL, it’s a Single/Multi-master update everywhere replication plugin for MySQL with built-in automatic distributed recovery, conflict detection and group membership.
  • ProxySQL 1.4 : probably the best proxy for MySQL.

 

Note : MySQL Replication is indeed more than a serious alternative for HA. By the way, ProxySQL natively handles it.

Note : ProxySQL has a plethora of features; however, the purpose of this article is its use in a MySQL Group Replication context.

Note : MySQL NDB Cluster is also a HA solution adapted for the distributed computing environment. This is out of the scope of this article.

 

ProxySQL has been compatible with MySQL Group Replication since version 1.3, see (in French): Configurer ProxySQL pour MySQL Group Replication. However, version 1.4 has native MySQL Group Replication support. It is therefore easier to use these two popular technologies together.

This is what we will see now.

 

In this article I will make the following assumptions :

 

MySQL Group Replication

Specifications

  • Server version : 5.7.20
  • Plugin version : 1.0
  • Node 1 : mysql_node1, 172.22.0.10 : 3306
  • Node 2 : mysql_node2, 172.22.0.20 : 3306
  • Node 3 : mysql_node3, 172.22.0.30 : 3306

These characteristics therefore represent a 3-node MySQL Group Replication cluster, up and running:

node1>
-- MySQL Server version number
SELECT left(version(),6);
+-------------------+
| left(version(),6) |
+-------------------+
| 5.7.20            |
+-------------------+

-- MySQL Group Replication plugin details
SELECT * FROM INFORMATION_SCHEMA.PLUGINS WHERE PLUGIN_NAME LIKE 'group%'\G
*************************** 1. row ***************************
           PLUGIN_NAME: group_replication
        PLUGIN_VERSION: 1.0
         PLUGIN_STATUS: ACTIVE
           PLUGIN_TYPE: GROUP REPLICATION
   PLUGIN_TYPE_VERSION: 1.1
        PLUGIN_LIBRARY: group_replication.so
PLUGIN_LIBRARY_VERSION: 1.7
         PLUGIN_AUTHOR: ORACLE
    PLUGIN_DESCRIPTION: Group Replication (1.0.0)
        PLUGIN_LICENSE: PROPRIETARY
           LOAD_OPTION: FORCE_PLUS_PERMANENT

-- MySQL Group Replication member status
SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 3c2039a6-f152-11e7-90da-0242ac16001e
 MEMBER_HOST: mysql_node3
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 3c2039a8-f152-11e7-9132-0242ac160014
 MEMBER_HOST: mysql_node2
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
*************************** 3. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 3c36e366-f152-11e7-91a5-0242ac16000a
 MEMBER_HOST: mysql_node1
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE

MySQL Enterprise Monitor gives us a graphical view of the cluster and its state:

[screenshot: MySQL Enterprise Monitor]

 

The cluster is in single primary mode, i.e. only one node is available for reading & writing at a time (while the other 2 nodes are read-only).

node1>
-- Is single primary mode activated?
SHOW VARIABLES LIKE 'group_replication_single_primary_mode';
+---------------------------------------+-------+
| Variable_name                         | Value |
+---------------------------------------+-------+
| group_replication_single_primary_mode | ON    |
+---------------------------------------+-------+

-- Who is the primary node?
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
FROM performance_schema.replication_group_members
INNER JOIN performance_schema.global_status ON (MEMBER_ID = VARIABLE_VALUE)
WHERE VARIABLE_NAME='group_replication_primary_member'\G
*************************** 1. row ***************************
   MEMBER_ID: 3c36e366-f152-11e7-91a5-0242ac16000a
 MEMBER_HOST: mysql_node1
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE

MySQL Enterprise Monitor shows us:

[screenshot: MySQL Enterprise Monitor]

 

I am going to extend the MySQL 5.7 sys schema with the following script: addition_to_sys.sql

USE sys;

DELIMITER $$

CREATE FUNCTION IFZERO(a INT, b INT)
RETURNS INT
DETERMINISTIC
RETURN IF(a = 0, b, a)$$

CREATE FUNCTION LOCATE2(needle TEXT(10000), haystack TEXT(10000), offset INT)
RETURNS INT
DETERMINISTIC
RETURN IFZERO(LOCATE(needle, haystack, offset), LENGTH(haystack) + 1)$$

CREATE FUNCTION GTID_NORMALIZE(g TEXT(10000))
RETURNS TEXT(10000)
DETERMINISTIC
RETURN GTID_SUBTRACT(g, '')$$

CREATE FUNCTION GTID_COUNT(gtid_set TEXT(10000))
RETURNS INT
DETERMINISTIC
BEGIN
  DECLARE result BIGINT DEFAULT 0;
  DECLARE colon_pos INT;
  DECLARE next_dash_pos INT;
  DECLARE next_colon_pos INT;
  DECLARE next_comma_pos INT;
  SET gtid_set = GTID_NORMALIZE(gtid_set);
  SET colon_pos = LOCATE2(':', gtid_set, 1);
  WHILE colon_pos != LENGTH(gtid_set) + 1 DO
     SET next_dash_pos = LOCATE2('-', gtid_set, colon_pos + 1);
     SET next_colon_pos = LOCATE2(':', gtid_set, colon_pos + 1);
     SET next_comma_pos = LOCATE2(',', gtid_set, colon_pos + 1);
     IF next_dash_pos < next_colon_pos AND next_dash_pos < next_comma_pos THEN
       SET result = result +
         SUBSTR(gtid_set, next_dash_pos + 1,
                LEAST(next_colon_pos, next_comma_pos) - (next_dash_pos + 1)) -
         SUBSTR(gtid_set, colon_pos + 1, next_dash_pos - (colon_pos + 1)) + 1;
     ELSE
       SET result = result + 1;
     END IF;
     SET colon_pos = next_colon_pos;
  END WHILE;
  RETURN result;
END$$

CREATE FUNCTION gr_applier_queue_length()
RETURNS INT
DETERMINISTIC
BEGIN
  RETURN (SELECT sys.gtid_count( GTID_SUBTRACT( (SELECT
Received_transaction_set FROM performance_schema.replication_connection_status
WHERE Channel_name = 'group_replication_applier' ), (SELECT
@@global.GTID_EXECUTED) )));
END$$

CREATE FUNCTION gr_member_in_primary_partition()
RETURNS VARCHAR(3)
DETERMINISTIC
BEGIN
  RETURN (SELECT IF( MEMBER_STATE='ONLINE' AND ((SELECT COUNT(*) FROM
performance_schema.replication_group_members WHERE MEMBER_STATE != 'ONLINE') >=
((SELECT COUNT(*) FROM performance_schema.replication_group_members)/2) = 0),
'YES', 'NO' ) FROM performance_schema.replication_group_members JOIN
performance_schema.replication_group_member_stats USING(member_id));
END$$

CREATE VIEW gr_member_routing_candidate_status AS SELECT
sys.gr_member_in_primary_partition() as viable_candidate,
IF( (SELECT (SELECT GROUP_CONCAT(variable_value) FROM
performance_schema.global_variables WHERE variable_name IN ('read_only',
'super_read_only')) != 'OFF,OFF'), 'YES', 'NO') as read_only,
sys.gr_applier_queue_length() as transactions_behind, Count_Transactions_in_queue as 'transactions_to_cert' from performance_schema.replication_group_member_stats;$$

DELIMITER ;

 

So I load the functions and views into the cluster primary node (mysql_node1) :

# Loading extra functions and views to sys schema on the cluster (using a primary node)

$ mysql -u root -p -h mysql_node1 < ./addition_to_sys.sql

 

This script will allow ProxySQL to monitor the state of the cluster nodes.

e.g.

node1> 
-- Status of the primary node
SELECT * FROM sys.gr_member_routing_candidate_status;
+------------------+-----------+---------------------+----------------------+
| viable_candidate | read_only | transactions_behind | transactions_to_cert |
+------------------+-----------+---------------------+----------------------+
| YES              | NO        |                   0 |                    0 |
+------------------+-----------+---------------------+----------------------+
node2> 
-- Status of a secondary node
SELECT * FROM sys.gr_member_routing_candidate_status;
+------------------+-----------+---------------------+----------------------+
| viable_candidate | read_only | transactions_behind | transactions_to_cert |
+------------------+-----------+---------------------+----------------------+
| YES              | YES       |                   0 |                    0 |
+------------------+-----------+---------------------+----------------------+

 

On the cluster side, the last configuration step consists of creating the supervision users that will be used by ProxySQL (yes, there is a relationship with the previous step) :).

Here again I use the primary of the group :

node1>
-- Create ProxySQL monitoring user inside the cluster
CREATE USER proxysqlmonitor@'%' IDENTIFIED BY 'Very-S3cr3t';

GRANT SELECT ON sys.* TO proxysqlmonitor@'%';

 

Let’s configure ProxySQL now !

 

ProxySQL

Specifications

  • Proxy version : 1.4.4
  • Admin Interface : 172.22.0.2:6032
  • Cluster connection : 172.22.0.2:3306

 

$ proxysql --version
ProxySQL version 1.4.4-139-ga51b040, codename Truls

$ service proxysql status
ProxySQL is running (58).

 

ProxySQL configuration can be done online, which is obviously a very good thing.

Let’s start by logging into the admin interface on the default port 6032 with the default admin user and password… : admin (!) – Defaults that can and must be changed in real life.

$ mysql -u admin -p -P 6032 --protocol=tcp main

 

Servers configuration

First step, add the cluster nodes to the proxy :

-- Add 3 nodes of the cluster into the mysql_servers table
proxy> 
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES (2, 'mysql_node1', 3306), (2, 'mysql_node2', 3306), (2, 'mysql_node3', 3306);
proxy> 
select * from mysql_servers;
+--------------+-------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname    | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+-------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 2            | mysql_node1 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 2            | mysql_node2 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 2            | mysql_node3 | 3306 | ONLINE | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
+--------------+-------------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+

 

Hostgroups configuration

I have done a brief presentation of ProxySQL 1.3 objects last time : Configurer ProxySQL pour MySQL Group Replication

Version 1.4 has some differences; the most notable one in our context is the new table mysql_group_replication_hostgroups:

proxy> 
SHOW CREATE TABLE main.mysql_group_replication_hostgroups\G
*************************** 1. row ***************************
       table: mysql_group_replication_hostgroups
Create Table: CREATE TABLE mysql_group_replication_hostgroups (
    writer_hostgroup INT CHECK (writer_hostgroup>=0) NOT NULL PRIMARY KEY,
    backup_writer_hostgroup INT CHECK (backup_writer_hostgroup>=0 AND backup_writer_hostgroup<>writer_hostgroup) NOT NULL,
    reader_hostgroup INT NOT NULL CHECK (reader_hostgroup<>writer_hostgroup AND backup_writer_hostgroup<>reader_hostgroup AND reader_hostgroup>0),
    offline_hostgroup INT NOT NULL CHECK (offline_hostgroup<>writer_hostgroup AND offline_hostgroup<>reader_hostgroup AND backup_writer_hostgroup<>offline_hostgroup AND offline_hostgroup>=0),
    active INT CHECK (active IN (0,1)) NOT NULL DEFAULT 1,
    max_writers INT NOT NULL CHECK (max_writers >= 0) DEFAULT 1,
    writer_is_also_reader INT CHECK (writer_is_also_reader IN (0,1)) NOT NULL DEFAULT 0,
    max_transactions_behind INT CHECK (max_transactions_behind>=0) NOT NULL DEFAULT 0,
    comment VARCHAR,
    UNIQUE (reader_hostgroup),
    UNIQUE (offline_hostgroup),
    UNIQUE (backup_writer_hostgroup))

 

The best description I have found is available in my brilliant colleague’s blog post MySQL Group Replication: native support in ProxySQL

I quote

There are many new columns, let’s have a look at their meaning:

  • writer_hostgroup : the id of the hostgroup that will contain all the members that are writers
  • backup_writer_hostgroup : if the group is running in multi-primary mode, there are multiple writers (read_only=0); if the number of writers is larger than max_writers, the extra nodes are placed in this backup writer hostgroup
  • reader_hostgroup : the id of the hostgroup that will contain all the members in read_only
  • offline_hostgroup : the id of the hostgroup that will contain the hosts that are not online or not part of the Group
  • active : when enabled, ProxySQL monitors the Group and moves the servers accordingly into the appropriate hostgroups
  • max_writers : limits the number of nodes in the writer hostgroup when the group runs in multi-primary mode
  • writer_is_also_reader : boolean value, 0 or 1; when enabled, a node in the writer hostgroup will also belong to the reader hostgroup
  • max_transactions_behind : if the value is greater than 0, it defines how far a node can lag in applying the transactions from the Group, see this post for more info

by  @Lefred 🙂

 

Our configuration is as follows :

  • writer_hostgroup = 2
  • backup_writer_hostgroup = 4
  • reader_hostgroup = 3
  • offline_hostgroup = 1
  • active = 1
  • max_writers = 1
  • writer_is_also_reader = 1
  • max_transactions_behind = 0

Which gives us :

proxy> 
INSERT INTO mysql_group_replication_hostgroups (writer_hostgroup,backup_writer_hostgroup,reader_hostgroup,offline_hostgroup,active,max_writers,writer_is_also_reader,max_transactions_behind) VALUES (2,4,3,1,1,1,1,0);

-- Save & load new configuration
SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL SERVERS TO RUNTIME;
proxy> 
select * from mysql_group_replication_hostgroups\G
*************************** 1. row ***************************
       writer_hostgroup: 2
backup_writer_hostgroup: 4
       reader_hostgroup: 3
      offline_hostgroup: 1
                 active: 1
            max_writers: 1
  writer_is_also_reader: 1
max_transactions_behind: 0
                comment: NULL

 

Supervision configuration

A little further up we created a supervision user in the cluster (remember?).

It is this user that ProxySQL will use to be aware of the state of the various nodes of the cluster:

-- Set user name & password for monitoring module
SET mysql-monitor_username='proxysqlmonitor';
SET mysql-monitor_password='Very-S3cr3t';

-- Save & load new configuration
SAVE MYSQL VARIABLES TO DISK;
LOAD MYSQL VARIABLES TO RUNTIME;
proxy> 
select hostgroup_id, hostname, status  from runtime_mysql_servers;
+--------------+-------------+--------+
| hostgroup_id | hostname    | status |
+--------------+-------------+--------+
| 2            | mysql_node1 | ONLINE |
| 2            | mysql_node3 | ONLINE |
| 2            | mysql_node2 | ONLINE |
+--------------+-------------+--------+

 

Application user creation

The application connects to the MySQL server with a user and password (in the best case :D).

This user must obviously exist in the different MySQL instances that form the cluster, with the correct MySQL privileges.
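For instance, a minimal sketch run on the current primary (the target schema below simply matches the Drupal example used later in this article; adapt it to your application):

node1>
-- Create the application user on the cluster; the statement is replicated to all members
CREATE USER 'app_user'@'%' IDENTIFIED BY 'app_pwd';

GRANT ALL ON drupal.* TO 'app_user'@'%';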

This user must also be created in ProxySQL :

proxy> 
INSERT INTO mysql_users (username, password, active, default_hostgroup, max_connections) VALUES ('app_user', 'app_pwd', 1, 2, 200);

-- Save & load new configuration
SAVE MYSQL USERS TO DISK;
LOAD MYSQL USERS TO RUNTIME;
proxy> 
select * from mysql_users;
*************************** 1. row ***************************
              username: app_user
              password: app_pwd
                active: 1
               use_ssl: 0
     default_hostgroup: 2
        default_schema: NULL
         schema_locked: 0
transaction_persistent: 1
          fast_forward: 0
               backend: 1
              frontend: 1
       max_connections: 200

 

Change the listening port

By default the application will connect to the cluster through ProxySQL using port 6033 (in our case: 172.22.0.2:6033)

In real life, applications usually connect to MySQL using its default port, 3306.

It is therefore possible (not required) to modify the ProxySQL port to listen on 3306 :

proxy> 
SET mysql-interfaces='0.0.0.0:3306;/tmp/proxysql.sock';

SAVE MYSQL VARIABLES TO DISK;
proxy> 
SHOW VARIABLES LIKE 'mysql-interfaces'\G
*************************** 1. row ***************************
Variable_name: mysql-interfaces
        Value: 0.0.0.0:3306;/tmp/proxysql.sock

 

Note : For a mysterious reason, this last configuration change is not online. In other words I must restart ProxySQL for this change to be taken into account.

$ service proxysql restart
Shutting down ProxySQL: DONE!
Starting ProxySQL: DONE!

 

The ProxySQL configuration for MySQL Group Replication is now complete \o/

I’m taking this opportunity to introduce a new table in version 1.4 : mysql_server_group_replication_log.

Useful for monitoring :

proxy> 
select * from mysql_server_group_replication_log order by time_start_us desc limit 3\G
*************************** 1. row ***************************
           hostname: mysql_node3
               port: 3306
      time_start_us: 1515079109822616
    success_time_us: 1782
   viable_candidate: YES
          read_only: YES
transactions_behind: 0
              error: NULL
*************************** 2. row ***************************
           hostname: mysql_node2
               port: 3306
      time_start_us: 1515079109822292
    success_time_us: 1845
   viable_candidate: YES
          read_only: YES
transactions_behind: 0
              error: NULL
*************************** 3. row ***************************
           hostname: mysql_node1
               port: 3306
      time_start_us: 1515079109821971
    success_time_us: 1582
   viable_candidate: YES
          read_only: NO
transactions_behind: 0
              error: NULL

 

Playtime

As a reminder, workflow is as follows :

  • The app connects to ProxySQL (i.e. it only sees and knows the proxy)
  • ProxySQL retrieves transactions and redirects them to the primary node of the MySQL Group Replication cluster.
  • In case of primary node crash / shutdown,
    • MySQL Group Replication elects a new primary
    • ProxySQL identifies the new primary and sends transactions to this new primary

 

The application must therefore be configured to connect to ProxySQL. For example, if my app is Drupal, then my settings.php file will look like the following code snippet:

...
$databases['default']['default'] = array (
  'database' => 'drupal',
  'username' => 'app_user',
  'password' => 'app_pwd',
  'prefix' => 'drupal_',
  'host' => '172.22.0.2',
  'port' => '3306',
  'namespace' => 'Drupal\\Core\\Database\\Driver\\mysql',
  'driver' => 'mysql',
);
...
  • User : app_user
  • Password : app_pwd
  • Port : 3306 (ProxySQL)
  • Host : 172.22.0.2 (ProxySQL)

QED!

 

Let’s simulate all this in the command line!

My app is the mysql command-line tool. To simplify my commands, I first create a configuration file for the mysql client that contains the following information (by the way, it is not recommended to store passwords in clear text in a file):

$ cat /tmp/temp.cnf
[mysql]
user=app_user
password=app_pwd
protocol=tcp

 

My servers are configured with the variable report_host filled (see http://dasini.net/blog/2016/11/08/deployer-un-cluster-mysql-group-replication/).

$ for x in {1..5}; do mysql --defaults-file=/tmp/temp.cnf -h 172.22.0.2 -P 3306  -BNe"SELECT @@report_host;" ; done;
mysql_node1
mysql_node1
mysql_node1
mysql_node1
mysql_node1

The cluster primary node is still mysql_node1 (no database failover yet since the beginning of this article, but we’ll get there).

 

Now let’s test the failover with a slightly more complex example.

First, let’s create the test.poc table in InnoDB format :

proxy>
CREATE SCHEMA test;

CREATE TABLE test.poc (
id tinyint unsigned NOT NULL AUTO_INCREMENT,
host varchar(11),
time timestamp,
PRIMARY KEY (id)
) ENGINE=InnoDB;

 

The test application will run the following 2 queries :

  • INSERT INTO test.poc (host) VALUES (@@report_host);
  • SELECT * FROM test.poc;

 

With these 2 queries, the little script below and a timely kill -9, we will be able to follow the routing process (ProxySQL) and the database failover (MySQL Group Replication).

$ while true; do
>   mysql --defaults-file=/tmp/temp.cnf -h 172.22.0.2 -P 3306 -BNe"INSERT INTO test.poc (host) VALUES (@@report_host); SELECT * FROM test.poc;";
>   echo;
>   sleep 4;
> done

2	mysql_node1	2018-01-04 16:42:31 <=> Inserted row #1

2	mysql_node1	2018-01-04 16:42:31
9	mysql_node1	2018-01-04 16:42:35 <=> Inserted row #2

2	mysql_node1	2018-01-04 16:42:31
9	mysql_node1	2018-01-04 16:42:35
11	mysql_node3	2018-01-04 16:42:43 <=> Inserted row #3 => Failover! New primary is mysql_node3

2	mysql_node1	2018-01-04 16:42:31
9	mysql_node1	2018-01-04 16:42:35
11	mysql_node3	2018-01-04 16:42:43
18	mysql_node3	2018-01-04 16:42:47 <=> Inserted row #4

Transactions 1 & 2 (ids 2 and 9) are run on mysql_node1.

From the third transaction onwards (ids 11 and 18), they are run on the new primary mysql_node3, because mysql_node1 crashed.

 

Let’s finish this tutorial with pictures.

 

MySQL Enterprise Monitor

As stated in the introduction, a database architecture should rely on the 3 pillars : Monitoring / Backup process / High Availability.

Below is a brief introduction to MySQL Enterprise Monitor (MEM), the MySQL monitoring tool available with the commercial editions of MySQL (MySQL Enterprise Edition). MEM enables monitoring of MySQL instances and their hosts, notification of potential issues and problems, and advice on how to correct issues. You can also monitor and troubleshoot all types of MySQL Replication including MySQL Group Replication.

If you want to try our  Enterprise tools, please click here.

Below are some screenshots of the various states of the cluster supervised by MySQL Enterprise Monitor (click to enlarge):

[screenshots: MySQL Enterprise Monitor]

 

 

The final word(s)

You already know MySQL Replication and MySQL semi-synchronous Replication.

MySQL Group Replication is an additional tool that notably brings in the concept of built-in Automatic Database Failover that was missing in MySQL.

These architectures are used most of the time with a proxy. ProxySQL is without a doubt one of the best companions for MySQL today.

 

Note : MySQL provides MySQL InnoDB Cluster, which combines MySQL technologies ( MySQL Group Replication, MySQL Router and MySQL Shell) to enable you to create highly available clusters of MySQL server instances.
Tutorial : Tutoriel – Déployer MySQL 5.7 InnoDB Cluster (French)

 

 

References

dasini.net

MySQL Group Replication

ProxySQL

 

Thanks for using MySQL!

 


How to Get MySQL?

July 24, 2017

Lire cet article en français

When starting a new project, it is generally recommended to go with the most recent version of MySQL, to take advantage of the latest features but also (mainly?) to be sure to be up to date with the security patches.

Powered by MySQL

This blog post centralizes the various URLs where to download the world’s most popular open source database.

MySQL Community Edition

If you’re looking for the latest binaries, client/server packages or source code, in 64/32 bits for Linux, Windows, Mac, Solaris, FreeBSD :

Linux Repository

On your Linux distros, the packaged MySQL is often very old (or worse, you could end up downloading a totally different database).

The solution? Install the MySQL APT/Yum/SUSE repository, which provides a simple and convenient way to install and update the latest GA, as well as the other products:

BTW please have a look at Ivan Ma‘s blog post: Installation of MySQL.

Others

Official MySQL container images, created, maintained and supported by MySQL :

GitHub repository, created, maintained and supported by MySQL :


MySQL NDB Cluster

MySQL NDB Cluster is a high-availability, high-redundancy version of MySQL adapted for the distributed computing environment. It is available in the above repositories, as well as at these 2 dedicated URLs:


MySQL Enterprise Edition

Oracle customers can sign in to their My Oracle Support account:

Oracle Software Delivery Cloud (e-delivery) allows you to evaluate the MySQL Enterprise products :

Official MySQL Enterprise container images:

MySQL Enterprise Edition is also available in Oracle Cloud :


Old versions

Sometimes we do not have a choice: a particular version is mandatory. You will find it in the archives:

Thanks for using MySQL!


MySQL first Public Releases

July 19, 2017

Lire cet article en français

I regularly meet with MySQL customers and I’m still a little bit surprised to see critical applications running on “not really” recent versions (to put it mildly) 🙂

The good news is that obviously old versions of MySQL are sufficiently stable and powerful to run modern businesses. However, even if I understand that it is sometimes appropriate to freeze all layers of an architecture, it is often a shame not to take advantage of the latest improvements from a performance, stability and security point of view, and obviously of the new features that the latest GA provides:

 

Time being relative, let’s go back to the future of MySQL’s first public releases (not GA)…

 

MySQL 3.22 | 1998

  • #1 movie in box office : Saving Private Ryan
  • #1 at Billboard’s Top Hot 100 : Too Close – Next
  • Event : September 4, Google, Inc. is founded in Menlo Park, California, by Stanford University PhD candidates Larry Page and Sergey Brin

 

Sources:
http://dev.cs.ovgu.de/db/mysql/News-3.22.x.html
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/1998
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_1998

 

MySQL 3.23 | 1999

 

Sources:
http://mysql.localhost.net.ar/doc/refman/4.1/en/news-3-23-x.html
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/1999
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_1999

 

MySQL 4.0 | 2001

 

Sources:
http://mysql.localhost.net.ar/doc/refman/4.1/en/news-4-0-x.html
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2001
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2001

 

MySQL 4.1 | 2003

 

Sources:
http://mysql.localhost.net.ar/doc/refman/4.1/en/news-4-1-x.html
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2003
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2003

 

MySQL 5.0 | 2003

 

Sources:
http://ftp.nchu.edu.tw/MySQL/doc/refman/5.0/en/news-5-0-x.html
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2003
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2003

 

MySQL 5.1 | 2005

 

Sources:
http://ftp.nchu.edu.tw/MySQL/doc/refman/5.1/en/news-5-1-x.html
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2005
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2005

 

MySQL 5.5 | 2009

  • #1 movie in box office : Avatar
  • #1 at Billboard’s Top Hot 100 : Boom Boom Pow – The Black Eyed Peas
  • Event : July 22, The longest total solar eclipse of the 21st century, lasting up to 6 minutes and 38.8 seconds, occurs over parts of Asia and the Pacific Ocean.

 

Sources:
https://dev.mysql.com/doc/relnotes/mysql/5.5/en/
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2009
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2009

 

MySQL 5.6 | 2011

 

Sources:
https://dev.mysql.com/doc/relnotes/mysql/5.6/en/
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2011
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2011

 

MySQL 5.7 | 2013

 

Sources:
https://dev.mysql.com/doc/relnotes/mysql/5.7/en/
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2013
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2013

 

MySQL 8.0 | 2016

 

Sources:
https://dev.mysql.com/doc/relnotes/mysql/8.0/en/
http://www.boxofficemojo.com/yearly/
https://en.wikipedia.org/wiki/2016
https://en.wikipedia.org/wiki/Billboard_Year-End_Hot_100_singles_of_2016

 

Sakila in Norway

Thanks for using MySQL!


30 mins with MySQL Query Rewriter

February 25, 2016

Lire cet article en français

TL;DR

Sometimes DBAs have to deal with problematic queries and cannot tackle the problem at the source (problematic queries from ORMs, third-party apps… or an unknown source…).

MySQL 5.7 provides :

  • Pre- and post-parse query rewrite APIs
    • Users can write their own plug-ins
    • This eliminates many legacy use cases for proxies
  • With the post-parse query plugin, you can:
    • Rewrite problematic queries without the need to make application changes
    • Add hints
    • Modify join order

 

Query Rewrite Plugin APIs

Paraphrasing this MySQL Server Blog’s post :

MySQL now offers two APIs for writing query rewrite plugins.

The pre-parse rewrite plugin API is for when you know exactly – as in character-by-character exactly – what the offending queries look like. This one has a hook to intercept the query string right before it’s parsed.

The post-parse rewrite plugin API comes in right after parsing and acts on the parse tree. It offers the basic functionality to walk over the parsed query, which is a lot more efficient than dealing with a string.

This blog post is about the Rewriter plugin, a post-parse query rewrite plugin, included in the MySQL 5.7 distribution (as of MySQL 5.7.6).

 

Rewriter plugin

Installation and checks

You are in MySQL world, so that’s obviously very easy! 🙂

To install the Rewriter query rewrite plugin, run install_rewriter.sql located in the share directory of your MySQL installation.

~ $ mysql -u root -p < install_rewriter.sql

 

You can now check that everything is OK:

mysql> SELECT * FROM mysql.plugin;
+--------------------------+-------------+
| name                     | dl          |
+--------------------------+-------------+
| rewriter                 | rewriter.so |
+--------------------------+-------------+

mysql> SHOW GLOBAL VARIABLES LIKE 'rewriter%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| rewriter_enabled | ON    |
| rewriter_verbose | 1     |
+------------------+-------+

mysql> SHOW GLOBAL STATUS LIKE 'rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 0     |
| Rewriter_number_reloads           | 1     |
| Rewriter_number_rewritten_queries | 0     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

New database objects were also created

# Stored procedure

mysql> SELECT ROUTINE_NAME, ROUTINE_TYPE FROM information_schema.ROUTINES 
WHERE ROUTINE_SCHEMA='query_rewrite';
+---------------------+--------------+
| ROUTINE_NAME        | ROUTINE_TYPE |
+---------------------+--------------+
| flush_rewrite_rules | PROCEDURE    |
+---------------------+--------------+


# UDF

mysql> SELECT name, dl, type FROM mysql.func 
WHERE name LIKE '%rewrite%';
+--------------------+-------------+----------+
| name               | dl          | type     |
+--------------------+-------------+----------+
| load_rewrite_rules | rewriter.so | function |
+--------------------+-------------+----------+



# New table is created in a new schema

mysql> SHOW TABLES IN query_rewrite;
+-------------------------+
| Tables_in_query_rewrite |
+-------------------------+
| rewrite_rules           |
+-------------------------+

 

The rewrite_rules table in the query_rewrite database provides persistent storage for the rules that the Rewriter plugin uses to decide whether to rewrite statements.

mysql> SHOW CREATE TABLE query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
       Table: rewrite_rules
Create Table: CREATE TABLE `rewrite_rules` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `pattern` varchar(10000) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `pattern_database` varchar(20) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `replacement` varchar(10000) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `enabled` enum('YES','NO') CHARACTER SET utf8 COLLATE utf8_bin NOT NULL DEFAULT 'YES',
  `message` varchar(1000) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  `pattern_digest` varchar(32) DEFAULT NULL,
  `normalized_pattern` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

 

You can enable and disable the plugin at runtime :

mysql> SET GLOBAL rewriter_enabled = ON;

mysql> SET GLOBAL rewriter_enabled = OFF;

 

Obviously it is also possible to enable the plugin in the configuration file (my.cnf | my.ini):

[mysqld]

rewriter_enabled = ON

 

 

Rewrite a query

The post-parse rewrite plugin works only with SELECT statements.

You can try non-SELECT statements, but you’ll see the following error message in the rewrite_rules.message column when you flush the rules:

          message: Pattern needs to be a select statement.

 

Rewriter plugin usage is very simple. Let’s start with a dummy example: transform a SELECT n into a SELECT n+1 (n is an integer).

Patterns

 -> SELECT 1      # Input

 <= SELECT 2    # Output

Add the rewrite rule

To add rules for the Rewriter plugin, add rows to the rewrite_rules table.

mysql> INSERT INTO query_rewrite.rewrite_rules (pattern, replacement) VALUES('SELECT ?', 'SELECT ? + 1');

 

rewrite_rules table content is:

mysql> SELECT * FROM query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
               id: 1
          pattern: SELECT ?
 pattern_database: NULL
      replacement: SELECT ? + 1
          enabled: YES
          message: NULL
    pattern_digest: NULL
normalized_pattern: NULL

 

Rewriter’s stats shows:

mysql> SHOW GLOBAL status LIKE '%rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 0     |
| Rewriter_number_reloads           | 1     |
| Rewriter_number_rewritten_queries | 0     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

 

Flush the rewrite rule

Then invoke the flush_rewrite_rules() stored procedure to load the rules from the table into the plugin.

mysql> CALL query_rewrite.flush_rewrite_rules();
Query OK, 0 rows affected (0,01 sec)

When the plugin reads each rule from the rules table, it computes a normalized form (digest) from the pattern and a digest hash value, and updates the normalized_pattern and pattern_digest columns:

mysql> SELECT * FROM query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
               id: 1
          pattern: SELECT ?
 pattern_database: NULL
      replacement: SELECT ? + 1
          enabled: YES
          message: NULL
    pattern_digest: 2c7e64d74a4f06d8ceff62d23ae9180c
normalized_pattern: select ?

 

Within a pattern template, ? characters act as parameter markers that match data values. Parameter markers can be used only where data values should appear, not for SQL keywords, identifiers, and so forth. The ? characters should not be enclosed within quotation marks.
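A rule can also carry several markers; each ? in the replacement is filled with the corresponding value matched in the pattern, from left to right. A hedged sketch (the schema, table and column names are made up) of a rule that caps a potentially huge result set:

mysql> INSERT INTO query_rewrite.rewrite_rules (pattern, replacement) 
VALUES ('SELECT * FROM db1.t1 WHERE col1 = ? AND col2 > ?', 
        'SELECT * FROM db1.t1 WHERE col1 = ? AND col2 > ? LIMIT 1000');

mysql> CALL query_rewrite.flush_rewrite_rules();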

 

If the query parsing fails, the stored procedure will raise an error:

mysql> CALL query_rewrite.flush_rewrite_rules();
ERROR 1644 (45000): Loading of some rule(s) failed.

You will find more details in the query_rewrite.rewrite_rules.message column

e.g.

message: Parse error in replacement: "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1"

Rewriter’s stats now show:

mysql> SHOW GLOBAL status LIKE '%rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 1     |
| Rewriter_number_reloads           | 2     |
| Rewriter_number_rewritten_queries | 0     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

 

Rewritten query

Now that the rule has been added and flushed, we can execute a query matching the pattern and observe the behaviour:

mysql> SELECT 1;
+-------+
| 1 + 1 |
+-------+
|     2 |
+-------+
1 row in set, 1 warning (0,00 sec)

 

Magic !!!

The query was rewritten “on the fly”. However, some interesting details are hiding in the warning.

mysql> SHOW WARNINGS;
+-------+------+------------------------------------------------------------------------+
| Level | Code | Message                                                                |
+-------+------+------------------------------------------------------------------------+
| Note  | 1105 | Query 'select 1' rewritten to 'SELECT 1 + 1' by a query rewrite plugin |
+-------+------+------------------------------------------------------------------------+

Everything we need to know is in the Note (code 1105).

Rewriter’s stats are updated accordingly:

mysql> SHOW GLOBAL status LIKE '%rewriter%';
+-----------------------------------+-------+
| Variable_name                     | Value |
+-----------------------------------+-------+
| Rewriter_number_loaded_rules      | 1     |
| Rewriter_number_reloads           | 2     |
| Rewriter_number_rewritten_queries | 1     |
| Rewriter_reload_error             | OFF   |
+-----------------------------------+-------+

 

To disable an existing rule, modify its enabled column and reload the table into the plugin:

mysql> UPDATE query_rewrite.rewrite_rules SET enabled = 'NO' WHERE id = 1;

mysql> CALL query_rewrite.flush_rewrite_rules();

 

You can also delete rows from the table:

mysql> TRUNCATE TABLE query_rewrite.rewrite_rules;  # Delete all the rows (all the rules)

mysql> CALL query_rewrite.flush_rewrite_rules();

 

OK folks! Let’s see some more relevant examples.

 

Rewrite plugin examples

Ex 1

Rewrite a join into a subquery; i.e. for performance reasons you must rewrite the query, but you don’t have access to the application.

Patterns

 -> SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < {integer};

 <= SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < {integer});

 

Without the rule

SELECT count(distinct emp_no) 
FROM employees.employees INNER JOIN employees.salaries USING(emp_no) 
WHERE DATEDIFF(to_date, from_date) < 2000;
+------------------------+
| count(distinct emp_no) |
+------------------------+
|                 300024 |
+------------------------+

1 row in set (12,93 sec)

 

Add the rewrite rule

INSERT INTO query_rewrite.rewrite_rules 
(
pattern, 
replacement
) 
VALUES
(
'SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < ?', 
'SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < ?)'
);


CALL query_rewrite.flush_rewrite_rules();

 

With the rule

SELECT count(distinct emp_no) 
FROM employees.employees INNER JOIN employees.salaries USING(emp_no) 
WHERE DATEDIFF(to_date, from_date) < 2000;
+---------------+
| count(emp_no) |
+---------------+
|        300024 |
+---------------+

1 row in set, 1 warning (3,77 sec)

The query time dropped from 12.93 s to 3.77 s.

SHOW WARNINGS;
*************************** 1. row ***************************
 Level: Note
  Code: 1105
Message: Query 'SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE DATEDIFF(to_date, from_date) < 2000' rewritten to 'SELECT count(emp_no) FROM employees.employees WHERE emp_no IN ( SELECT emp_no FROM employees.salaries WHERE DATEDIFF(to_date, from_date) < 2000)' by a query rewrite plugin

 

 

Ex 2

Limit query execution time; i.e. add the MySQL 5.7 hint /*+ MAX_EXECUTION_TIME(X) */ so that the query duration cannot exceed X milliseconds.

Patterns

 -> SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = {integer};

 <= SELECT /*+ MAX_EXECUTION_TIME(10000)*/ count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = {integer};

 

Without the rule

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 110000;
1 row in set (11,82 sec)

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 70000;
1 row in set (9,22 sec)

 

Add the rewrite rule

INSERT INTO query_rewrite.rewrite_rules 
(
pattern, 
replacement, 
pattern_database
) 
VALUES
(
'SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = ?', 
'SELECT /*+ MAX_EXECUTION_TIME(10000)*/ count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary =  ?', 
'employees'
);


CALL query_rewrite.flush_rewrite_rules();

 

With the rule

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 110000;
ERROR 3024 (HY000): Query execution was interrupted, maximum statement execution time exceeded

SELECT count(distinct emp_no) FROM employees.employees INNER JOIN employees.salaries USING(emp_no) WHERE salary = 70000;
1 row in set (9,85 sec)

 

 

Ex 3

Schema evolution; i.e. a table column was added (or dropped) but you cannot change the query (or not immediately).

Patterns

 -> SELECT first_name, last_name FROM employees.employees WHERE year(hire_date) = 2000;

 <= SELECT first_name, last_name, birth_date FROM employees.employees WHERE year(hire_date) = 2000;

 

Without the rule

SELECT 
    first_name, 
    last_name 
FROM employees.employees WHERE year(hire_date) = 2000;
+-------------+------------+
| first_name  | last_name  |
+-------------+------------+
| Ulf         | Flexer     |

...
13 rows in set (0,01 sec)

 

Add the rewrite rule

INSERT INTO query_rewrite.rewrite_rules 
(
pattern, 
replacement, 
pattern_database
) 
VALUES
(
'SELECT first_name, last_name FROM employees.employees WHERE year(hire_date) = ?', 
'SELECT first_name, last_name, birth_date FROM employees.employees WHERE year(hire_date) =  ?', 
'employees'
);

CALL query_rewrite.flush_rewrite_rules();

 

With the rule

SELECT 
    first_name, 
    last_name 
FROM employees.employees WHERE year(hire_date) = 2000;
+-------------+------------+------------+
| first_name  | last_name  | birth_date |
+-------------+------------+------------+
| Ulf         | Flexer     | 1960-09-09 |
...

13 rows in set (0,01 sec)

 

Other ideas?

– Index hints: the optimizer does not use the best index, so you can rewrite the query with USE | FORCE | IGNORE INDEX.

– Prevent SELECT statements with an “infinite” result set: rewrite the query to add LIMIT 1000 or whatever fits (see the sketch below, which combines both ideas).

This is similar to the mysql client’s --safe-updates option:

 $ ./mysql --help | grep dummy
  -U, --i-am-a-dummy  Synonym for option --safe-updates, -U.
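Here is a rough sketch of a rule combining both ideas (table, column and index names are made up for the example):

mysql> INSERT INTO query_rewrite.rewrite_rules (pattern, replacement)
    -> VALUES ('SELECT * FROM my_db.big_table WHERE status = ?',
    ->         'SELECT * FROM my_db.big_table FORCE INDEX (idx_status) WHERE status = ? LIMIT 1000');

mysql> CALL query_rewrite.flush_rewrite_rules();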

Sky is the limit 🙂

 

The Rewriter plugin does not replace proper application code, server optimisation, or finely tuned queries.

Nevertheless it can be very useful.

 

Give it a try, it is definitely worth more than 30 minutes.

 

 

Going further

Post-parse query rewrite plugin

 

Pre-parse query rewrite plugin

 

Thank you for using MySQL!


30 mins with JSON in MySQL

November 17, 2015

Read this article in French

Note: This article is inspired by Morgan Tocker‘s talk MySQL 5.7 + JSON.

Note 2: You may also be interested by 30 mins with MySQL JSON functions

Note 3: Handling JSON documents could be also done with MySQL Document Store.

As you may know, MySQL 5.7 is GA and has more than 150 new features. One of them is a Native JSON Data Type and JSON Functions: “Allows for efficient and flexible storage, search and manipulation of schema-less data. Enhancements include a new internal binary format, support for easy integration within SQL, and index management on the JSON Documents using generated columns”.

Sounds interesting! Let’s take half an hour to have a foretaste…

Get JSON documents

First, let’s get some data in JSON format. Mirco Zeiss provides a really big JSON file representing San Francisco’s subdivision parcels (from SF Open Data) at https://github.com/zemirco/sf-city-lots-json
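A simple way to fetch the file is to clone the repository (assuming git is installed; you can also download the archive from GitHub):

$ git clone https://github.com/zemirco/sf-city-lots-json.git
$ cd sf-city-lots-json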

To use this data, some tweaks are necessary:

$ grep "^{ \"type" citylots.json > properties.json

$ head -n1 properties.json 
{  
   "type":"Feature",
   "properties":{  
      "MAPBLKLOT":"0001001",
      "BLKLOT":"0001001",
      "BLOCK_NUM":"0001",
      "LOT_NUM":"001",
      "FROM_ST":"0",
      "TO_ST":"0",
      "STREET":"UNKNOWN",
      "ST_TYPE":null,
      "ODD_EVEN":"E"
   },
   "geometry":{  
      "type":"Polygon",
      "coordinates":[  
         [  
            [  
               -122.422003528252475,
               37.808480096967251,
               0.0
            ],
            [  
               -122.422076013325281,
               37.808835019815085,
               0.0
            ],
            [  
               -122.421102174348633,
               37.808803534992904,
               0.0
            ],
            [  
               -122.421062569067274,
               37.808601056818148,
               0.0
            ],
            [  
               -122.422003528252475,
               37.808480096967251,
               0.0
            ]
         ]
      ]
   }
}

Looks good!

Note: The size of JSON documents stored in JSON columns is limited to the value of the max_allowed_packet system variable. (While the server manipulates a JSON value internally in memory, it can be larger; the limit applies when the server stores it.).
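You can check the current limit and, if needed, raise it. A minimal sketch (4194304, i.e. 4 MB, is the 5.7 default; your value may differ, and the 64 MB value below is just an example that applies to new connections):

mysql> SELECT @@max_allowed_packet;
+----------------------+
| @@max_allowed_packet |
+----------------------+
|              4194304 |
+----------------------+

mysql> SET GLOBAL max_allowed_packet = 64 * 1024 * 1024;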

Our JSON documents will be stored in an InnoDB table named features:

CREATE TABLE features (
 id int(11) NOT NULL AUTO_INCREMENT,
 feature json NOT NULL,
 PRIMARY KEY (id)
) ENGINE=InnoDB;

Another way to store JSON documents is to put them in a string (VARCHAR or TEXT).

Let’s see if there are some differences between JSON documents stored in a string or in a JSON column.

CREATE TABLE features_TEXT (
 id int(11) NOT NULL AUTO_INCREMENT,
 feature longtext NOT NULL,
 PRIMARY KEY (id)
) ENGINE=InnoDB;

Note: TEXT type is not large enough to handle our JSON data. (ERROR 1406 (22001): Data too long for column ‘feature’ at row 17360). LONGTEXT will do the job.

Populate tables

Note: In order to have a better idea of query execution time on my old (and not so stable) laptop with a small MySQL config (e.g. buffer pool = 128 MB), I ran the queries many times using mysqlslap: mysqlslap -c1 -i <N> { concurrency = 1 / iterations > 20 (depending on the query duration) }

So most of the time I’ll show the mysqlslap output, e.g. “Minimum number of seconds to run all queries: 59.392 seconds”, instead of the regular query output, e.g. “Query OK, 206560 rows affected (59.39 sec)”.
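For reference, a typical invocation looks like this (the test schema name is an assumption, adapt the user, schema and query to your setup; --no-drop prevents mysqlslap from dropping the schema at the end of the run):

$ mysqlslap -u root -p --concurrency=1 --iterations=20 --no-drop \
    --create-schema=test \
    --query='SELECT DISTINCT feature->"$.type" AS json_extract FROM features'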

Copy JSON data in features

LOAD DATA INFILE 'properties.json' INTO TABLE features (feature);

Minimum number of seconds to run all queries: 59.392 seconds

Copy JSON data in features_TEXT

LOAD DATA INFILE 'properties.json' INTO TABLE features_TEXT (feature);

Minimum number of seconds to run all queries: 39.784 seconds

Loading the 206560 records into the respective tables shows a clear performance difference: the load into the JSON column is roughly 50% slower than into TEXT (59.4 s vs. 39.8 s).

However, be aware that the MySQL JSON data type provides:

  • Automatic validation of JSON documents stored in JSON columns. Meaning that invalid documents produce an error.
  • Optimized storage format. JSON documents stored in JSON columns are converted to an internal format that permits quick read access to document elements. When the server later must read a JSON value stored in this binary format, the value need not be parsed from a text representation. The binary format is structured to enable the server to look up subobjects or nested values directly by key or array index without reading all values before or after them in the document.

There is nothing comparable with the TEXT data type; in other words, these features have a cost, which is logical and fair!

Let’s have a look at the tables’ metadata:

SHOW TABLE STATUS LIKE 'features'\G
*************************** 1. row ***************************
          Name: features
        Engine: InnoDB
       Version: 10
    Row_format: Dynamic
          Rows: 184218
Avg_row_length: 1250
   Data_length: 230326272  #220 MB
  Index_length: 0
     Data_free: 3145728
SHOW TABLE STATUS LIKE 'features_TEXT'\G
*************************** 1. row ***************************
          Name: features_TEXT
        Engine: InnoDB
       Version: 10
    Row_format: Dynamic
          Rows: 188784
Avg_row_length: 1370
   Data_length: 258654208  #247 MB
  Index_length: 0
     Data_free: 4194304

Interesting point here: the LONGTEXT data type consumes more space than the JSON data type (optimized storage format), about 12% more (247 MB vs. 220 MB).

Return data from a JSON document

MySQL 5.7 provides a bunch of JSON functions.

JSON_EXTRACT returns data from a JSON document. Furthermore, since MySQL 5.7.9 you can use inlined JSON path expressions, which simplify queries that deal with JSON data and make them more human-readable:

e.g.

JSON_EXTRACT(col, “$.json_field”) is similar to col->”$.json_field”
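Applied to our table, both forms behave the same way. For instance (assuming the first stored document is the one shown above; note that the returned value keeps its JSON double quotes):

mysql> SELECT JSON_EXTRACT(feature, "$.properties.STREET") AS street FROM features LIMIT 1;
+-----------+
| street    |
+-----------+
| "UNKNOWN" |
+-----------+

mysql> SELECT feature->"$.properties.STREET" AS street FROM features LIMIT 1;
+-----------+
| street    |
+-----------+
| "UNKNOWN" |
+-----------+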

So how about retrieving our JSON documents?

Table with JSON data type

EXPLAIN SELECT DISTINCT feature->"$.type" AS json_extract FROM features\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: features
        type: ALL
possible_keys: NULL
         key: NULL
     key_len: NULL
         ref: NULL
        rows: 182309
    filtered: 100.00
       Extra: Using temporary

Minimum number of seconds to run all queries: 4.470 seconds

Table with TEXT data type

EXPLAIN SELECT DISTINCT feature->"$.type" AS json_extract FROM features_TEXT\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: features_TEXT
        type: ALL
possible_keys: NULL
         key: NULL
     key_len: NULL
         ref: NULL
        rows: 177803
    filtered: 100.00
       Extra: Using temporary

Minimum number of seconds to run all queries: 29.365 seconds

Getting these documents implies a full table scan (no surprise).

However we can see the power of the MySQL JSON internal format that permits quick read access to document elements.

In this example, the query execution time is about 7 times faster with the JSON data type compared to TEXT (4.47 s vs. 29.37 s).

Generated column

JSON columns cannot be indexed. BUT you can work around this restriction by creating an index on a generated column that extracts a scalar value from the JSON column. Generated columns can either be materialized (stored) or non-materialized (virtual).

Creating a generated column is quite easy, and a VIRTUAL one is costless because column values are not stored: they are evaluated when rows are read, immediately after any BEFORE triggers.

A virtual column takes no storage. It’s usually what you’ll need in real life.

Last but not least, it is only a metadata change, so adding a virtual column is fast (no table rebuild): it only requires a quick system table update that registers the new metadata.

ALTER TABLE features ADD feature_type VARCHAR(30) AS (feature->"$.type") VIRTUAL;
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0
ALTER TABLE features_TEXT ADD feature_type VARCHAR(30) AS  (feature->"$.type") VIRTUAL;
Query OK, 0 rows affected (0.10 sec)
Records: 0  Duplicates: 0  Warnings: 0

Note: The disadvantage of such an approach is that the value is stored twice; once as the value of the generated column and once in the index.

New table descriptions are now:

CREATE TABLE features (
 id int(11) NOT NULL AUTO_INCREMENT,
 feature json NOT NULL,
 feature_type varchar(30) GENERATED ALWAYS AS (feature->"$.type") VIRTUAL,
 PRIMARY KEY (id)
) ENGINE=InnoDB
CREATE TABLE features_TEXT (
 id int(11) NOT NULL AUTO_INCREMENT,
 feature longtext NOT NULL,
 feature_type varchar(30) GENERATED ALWAYS AS (feature->"$.type") VIRTUAL,
 PRIMARY KEY (id)
) ENGINE=InnoDB

Let’s have a look at the table metadata:

ANALYZE TABLE features, features_TEXT;

SHOW TABLE STATUS LIKE 'features'\G
*************************** 1. row ***************************
          Name: features
        Engine: InnoDB
       Version: 10
    Row_format: Dynamic
          Rows: 184218
Avg_row_length: 1250
   Data_length: 230326272  #220 MB
  Index_length: 0
     Data_free: 314572
SHOW TABLE STATUS LIKE 'features_TEXT'\G
*************************** 1. row ***************************
          Name: features_TEXT
        Engine: InnoDB
       Version: 10
    Row_format: Dynamic
          Rows: 188784
Avg_row_length: 1370
   Data_length: 258654208  #247 MB
  Index_length: 0
     Data_free: 4194304

Identical!

As expected, the data length of each table is unchanged.

Is there any cost difference between selecting the value through the virtual column and through the JSON_EXTRACT function?

EXPLAIN SELECT DISTINCT feature_type FROM features\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: features
        type: ALL
possible_keys: NULL
         key: NULL
     key_len: NULL
         ref: NULL
        rows: 195195
    filtered: 100.00
       Extra: Using temporary

Minimum number of seconds to run all queries: 2.790 seconds

EXPLAIN SELECT DISTINCT feature_type FROM features_TEXT\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: features_TEXT
        type: ALL
possible_keys: NULL
         key: NULL
     key_len: NULL
         ref: NULL
        rows: 171004
    filtered: 100.00
       Extra: Using temporary

Minimum number of seconds to run all queries: 25.933 seconds

Obviously the QEP is the same: Full Table Scan (FTS).

Anyway, two comments:

  • The MySQL JSON internal format is still more efficient than the TEXT data type; in this example the query execution time is about 9 times faster with JSON (2.790 s vs. 25.933 s).
  • In this example, a full table scan on the virtual generated column (feature_type) is faster than using the json_extract function on the JSON document in the SELECT clause (from 4.470 s down to 2.790 s).

Create indexes on generated column

As of MySQL 5.7.8, InnoDB supports secondary indexes on virtual columns.

Adding or dropping a secondary index on a virtual column is an in-place operation.

ALTER TABLE features ADD INDEX (feature_type);
Query OK, 0 rows affected (5.04 sec)
Records: 0  Duplicates: 0  Warnings: 0
ALTER TABLE features_TEXT ADD INDEX (feature_type);
Query OK, 0 rows affected (27.89 sec)
Records: 0  Duplicates: 0  Warnings: 0

New table descriptions are:

CREATE TABLE features (
 id int(11) NOT NULL AUTO_INCREMENT,
 feature json NOT NULL,
 feature_type varchar(30) GENERATED ALWAYS AS (feature->"$.type") VIRTUAL,
 PRIMARY KEY (id),
 KEY feature_type (feature_type)
) ENGINE=InnoDB
CREATE TABLE features_TEXT (
 id int(11) NOT NULL AUTO_INCREMENT,
 feature longtext NOT NULL,
 feature_type varchar(30) GENERATED ALWAYS AS (feature->"$.type") VIRTUAL,
 PRIMARY KEY (id),
 KEY feature_type (feature_type)
) ENGINE=InnoDB

Let’s have another look at the table metadata:

ANALYZE TABLE features, features_TEXT;

SHOW TABLE STATUS LIKE 'features'\G
*************************** 1. row ***************************
          Name: features
        Engine: InnoDB
       Version: 10
    Row_format: Dynamic
          Rows: 180400
Avg_row_length: 1276
   Data_length: 230326272  #220 MB
  Index_length: 5783552    #6 MB
     Data_free: 5242880
SHOW TABLE STATUS LIKE 'features_TEXT'\G
*************************** 1. row ***************************
          Name: features_TEXT
        Engine: InnoDB
       Version: 10
    Row_format: Dynamic
          Rows: 192445
Avg_row_length: 1344
   Data_length: 258654208  #247 MB
  Index_length: 5783552    #6 MB
     Data_free: 2097152

The index on the feature_type column is materialized. Its size is approximately 6 MB.

Now because of this index, the query should be more efficient:

EXPLAIN SELECT DISTINCT feature_type FROM features\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: features
        type: index
possible_keys: feature_type
         key: feature_type
     key_len: 33
         ref: NULL
        rows: 193763
    filtered: 100.00
       Extra: Using index

Minimum number of seconds to run all queries: 0.178 seconds

EXPLAIN SELECT DISTINCT feature_type FROM features_TEXT\G
*************************** 1. row ***************************
          id: 1
 select_type: SIMPLE
       table: features_TEXT
        type: index
possible_keys: feature_type
         key: feature_type
     key_len: 33
         ref: NULL
        rows: 208134
    filtered: 100.00
       Extra: Using index

Minimum number of seconds to run all queries: 0.178 seconds

As expected, the optimizer uses the index (feature_type) and the query execution time is much better in both cases (from 2.790 s down to 0.178 s for the JSON column).

Wrapup

MySQL 5.7 implements native JSON data type support and provides a set of functions that allow you to create, search and modify JSON values, and to return JSON attribute values as well. That’s good, and I guess many developers will be happy to use this new feature.

Generated columns are also an interesting feature. They can be used, among other things, to simulate functional indexes, as a materialized cache for frequently used expressions… or, as we did here, to provide index management on JSON documents.

Give it a try, it is definitely worth more than 30 minutes.

Want to know more about MySQL 5.7?

Going further

MySQL Documentation

The JSON data type

https://dev.mysql.com/doc/refman/5.7/en/json.html

JSON Functions

https://dev.mysql.com/doc/refman/5.7/en/json-functions.html

CREATE TABLE and Generated Columns

http://dev.mysql.com/doc/refman/5.7/en/create-table.html#create-table-generated-columns

MySQL Server Blog

Native JSON Data Type and Binary Format

http://mysqlserverteam.com/json-labs-release-native-json-data-type-and-binary-format/

JSON functions

http://mysqlserverteam.com/json-labs-release-json-functions-part-1-manipulation-json-data/

http://mysqlserverteam.com/mysql-5-7-lab-release-json-functions-part-2-querying-json-data/

https://mysqlserverteam.com/new-json-functions-in-mysql-5-7-22/

Inline JSON Path Expressions in MySQL 5.7

http://mysqlserverteam.com/inline-json-path-expressions-in-mysql-5-7/

Getting Started With MySQL & JSON on Windows

http://mysqlserverteam.com/getting-started-with-mysql-json-on-windows/

Effective Functional Indexes in InnoDB

http://mysqlserverteam.com/json-labs-release-effective-functional-indexes-in-innodb/

MySQL 5.7

What Is New in MySQL 5.7

https://dev.mysql.com/doc/refman/5.7/en/mysql-nutshell.html

Complete list of new features in MySQL 5.7

http://www.thecompletelistoffeatures.com/

Thanks for using MySQL!


Free ebooks on Packt Publishing website

February 28, 2015

Packt Publishing started a Free Learning Campaign, providing a free ebook every day until 6th March 2015.

How? Follow this link: http://bit.ly/1DoRno5


Generate html with the mysql client

September 12, 2013

I’m a big fan of the MySQL command-line tool, i.e. the default text client modestly named mysql. I use it every day because you can do almost everything with it (DML, DDL, DCL, administrative tasks, …).

It has many options, including generating XML or HTML output.

$ mysql --help | grep "\--html"
 -H, --html          Produce HTML output.

 

So you can easily script it and generate an HTML document:

 

$ mysql -e"SELECT now() as DATE;" --html
<TABLE BORDER=1><TR><TH>DATE</TH></TR><TR><TD>2013-09-11 19:25:41</TD></TR></TABLE>
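You can redirect that output straight into a file, and the -X / --xml option works the same way for XML (file names below are arbitrary):

$ mysql -H -e "SELECT now() AS DATE;" > report.html
$ mysql -X -e "SELECT now() AS DATE;" > report.xml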

 

It’s very simple and it’s valid HTML, but it is not valid XHTML!

Why, in 2013, does a tool generate HTML rather than XHTML?
