MySQL InnoDB Cluster – Recovering and provisioning with MySQL Enterprise Backup

July 11, 2019
Full moon by Olivier DASINI

Like I stated in my previous article – MySQL InnoDB Cluster – Recovering and provisioning with mysqldump :
“As the administrator of a cluster, among other tasks, you should be able to restore failed nodes and grow (or shrink) your cluster by adding (or removing) new nodes”.
Well, I still agree with myself 🙂

MySQL customers using a Commercial Edition have access to MySQL Enterprise Backup (MEB), which provides enterprise-grade physical backup and recovery for MySQL.

MEB delivers hot, online, non-blocking backups on multiple platforms including Linux, Windows, Mac & Solaris.
More details here.

Note:
If you want to know how to recover a node and/or how to provision nodes with mysqldump please read this blog post.

Context

Let’s make it as simple as possible 🙂
I’m using MySQL Enterprise 8.0.16, available for MySQL customers on My Oracle Support or on Oracle Software Delivery Cloud.

I have an InnoDB Cluster setup, up and running.
So my main assumption is that you already know what MySQL Group Replication & MySQL InnoDB Cluster are.
Additionally you can read this tutorial and this article from my colleague lefred or this one on Windows Platform from my colleague Ivan.

All nodes must have the right MySQL Enterprise Backup privileges.
Details here.

All nodes must use the same value for log-bin and the same value for relay-log.
For example: log-bin=binlog & relay-log=relaylog (on all nodes)
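
The requirement above can be checked mechanically. Below is a minimal sketch in plain JavaScript, assuming you have already collected the relevant options from each node by hand; the function name findMismatchedOptions and the sample values (including the deliberately wrong relay-log on M3) are purely illustrative and not part of any MySQL tool.

```javascript
// Hypothetical helper: return the option names whose value is not
// identical on every node. `configs` maps a node name to its options.
function findMismatchedOptions(configs, optionNames) {
  const mismatched = [];
  for (const name of optionNames) {
    // Collect the distinct values seen for this option across all nodes.
    const values = new Set(Object.values(configs).map((cfg) => cfg[name]));
    if (values.size > 1) {
      mismatched.push(name);
    }
  }
  return mismatched;
}

// Illustrative input: M3 uses a different relay-log base name on purpose.
const nodeConfigs = {
  M1: { "log-bin": "binlog", "relay-log": "relaylog" },
  M2: { "log-bin": "binlog", "relay-log": "relaylog" },
  M3: { "log-bin": "binlog", "relay-log": "relay-bin" },
};

const mismatches = findMismatchedOptions(nodeConfigs, ["log-bin", "relay-log"]);
console.log(mismatches);
```

An empty result means the listed options are consistent on all nodes.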

Note:
Depending on how you configured your MySQL InnoDB Cluster, some steps could be slightly different.

Scenario 1 – Node Recovering

  • A 3-node MySQL InnoDB Cluster – M1 / M2 / M3, in single-primary mode
  • MySQL Router is configured to enable R/W connections on 3306 and RO connections on 3307
  • M1 is currently the primary (that is in Read/Write mode)
  • M2 & M3 are currently the secondaries (that is Read Only mode)
  • M1 failed! Assuming it is irreconcilably corrupted :'(
  • M2 & M3 are now the (new temporary) cluster

The goal then is to rebuild M1 and put it back into the cluster.

So, as I said before, we have a 3-node MySQL Enterprise 8.0.16 InnoDB Cluster up and running:

$ mysqlsh clusterAdmin@{mysqlRouterIP}:3306 --cluster
...

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M1:3306"
}

Then node M1 crashed… (status is “MISSING“) :

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}
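
If you monitor the cluster from a script, the same information can be extracted from the status document. This is a minimal sketch in plain JavaScript; it only assumes the structure of the cluster.status() output shown above (trimmed here to the fields that are used), and the helper name findUnavailableMembers is hypothetical.

```javascript
// Hypothetical helper: list the members whose status is not ONLINE,
// given an object shaped like the cluster.status() output.
function findUnavailableMembers(status) {
  const topology = status.defaultReplicaSet.topology;
  return Object.values(topology)
    .filter((member) => member.status !== "ONLINE")
    .map((member) => ({ address: member.address, status: member.status }));
}

// Trimmed copy of the status shown above (M1 is missing).
const statusAfterCrash = {
  defaultReplicaSet: {
    primary: "M2:3306",
    status: "OK_NO_TOLERANCE",
    topology: {
      "M1:3306": { address: "M1:3306", mode: "n/a", status: "(MISSING)" },
      "M2:3306": { address: "M2:3306", mode: "R/W", status: "ONLINE" },
      "M3:3306": { address: "M3:3306", mode: "R/O", status: "ONLINE" },
    },
  },
};

const unavailable = findUnavailableMembers(statusAfterCrash);
console.log(unavailable);
```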

M1 was the primary.
The cluster initiated an automatic database failover to elect a new primary… blablabla
Well you already know the story 🙂

After a while M1 is fixed and ready to be part of the cluster again.
To minimize the recovery time instead of using the last backup we prefer to take a fresh one.

Speaking of backup, I recommend reading the excellent blog post from my colleague Jesper: MySQL Backup Best Practices.

Let’s take a fresh backup on a secondary node (we could also have used the primary).

MySQL Enterprise Backup is a very versatile tool and has many different configuration options.
For clarity I’ll use a simplistic command. Please read the MEB documentation for a more “production style” command.
The backup will roughly look like:

$ mysqlbackup --defaults-file=/etc/my.cnf  --with-timestamp --messages-logdir=/data/backups/ --backup-image=/data/backups/db.mbi backup-to-image
...
-------------------------------------------------------------
   Parameters Summary         
-------------------------------------------------------------
   Start LSN                  : 44603904
   End LSN                    : 44607630
-------------------------------------------------------------

mysqlbackup completed OK! with 1 warnings

Please note that it is highly recommended to include in your backup process, in addition to the my.cnf, a copy of the auto.cnf and mysqld-auto.cnf configuration files for all nodes.

If you “lose” your auto.cnf file, don’t worry the server will generate a new one for you.
However the recovery process will be slightly different… (more on that below).

Now it’s time to restore this backup on node M1.

Because this server is part of a MySQL InnoDB Cluster, obviously there are some extra steps compared to a standalone server restoration.

Node Recovering

The node recovering process is simple:

  • Delete all contents of the MySQL Server data directory
  • Restore the backup
  • Restore the auto.cnf file
  • Restore the mysqld-auto.cnf file (if there is one)
  • Start the MySQL instance

This gives us on M1 , something like (simplified version, please adapt to your context) :

# Delete all contents of the MySQL Server data directory
$ rm -rf /var/lib/mysql/*


# Restore the backup
$ mysqlbackup --backup-dir=/exp/bck --datadir=/var/lib/mysql --backup-image=/data/backups/db.mbi copy-back-and-apply-log


# Restore the auto.cnf file
$ cp -p /data/backups/auto.cnf  /var/lib/mysql


# Restore the mysqld-auto.cnf file 
$ cp -p /data/backups/mysqld-auto.cnf  /var/lib/mysql


# Start the MySQL instance
$ service mysql start

Then you can connect to the cluster and… see that the node M1 is recovering (“status: RECOVERING”) or if you’re not fast enough that the node is again part of the cluster (“status: ONLINE”):

$ mysqlsh clusterAdmin@{mysqlRouterIP}:3306 --cluster
...

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "RECOVERING", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}


// After a while 


MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}
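
Rather than re-running cluster.status() by hand, the wait can be scripted. The sketch below is plain JavaScript with a simulated status source so that it runs on its own; the names waitUntilOnline, fakeStatus and observedStates are hypothetical, and the status callback and pause callback are injected precisely because how you obtain the status and how you sleep depend on your environment.

```javascript
// Hypothetical helper: poll a status source until `address` is ONLINE.
// Returns the number of attempts needed, or -1 if maxTries was exhausted.
function waitUntilOnline(getStatus, address, maxTries, pause) {
  for (let attempt = 1; attempt <= maxTries; attempt++) {
    const member = getStatus().defaultReplicaSet.topology[address];
    if (member && member.status === "ONLINE") {
      return attempt;
    }
    pause(); // wait before polling again
  }
  return -1;
}

// Simulated status source: RECOVERING twice, then ONLINE.
const observedStates = ["RECOVERING", "RECOVERING", "ONLINE"];
let pollCount = 0;
const fakeStatus = () => {
  const state = observedStates[Math.min(pollCount, observedStates.length - 1)];
  pollCount += 1;
  return {
    defaultReplicaSet: {
      topology: { "M1:3306": { address: "M1:3306", status: state } },
    },
  };
};

const attemptsNeeded = waitUntilOnline(fakeStatus, "M1:3306", 5, () => {});
console.log(attemptsNeeded);
```

In a real session you would replace fakeStatus with a function that returns the live cluster status and pass a pause callback that actually sleeps for a few seconds.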

Lost the auto.cnf file

As promised, the case when the auto.cnf configuration file is not restored.
In fact, in this case the cluster would see this node as a new node (because the server will have a new UUID).
So the process for putting it back is different.
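
To tell the two cases apart before starting the server, you can compare the server UUID stored in auto.cnf with the member IDs the cluster knows about. This is a minimal sketch in plain JavaScript, assuming the usual auto.cnf layout (an [auto] section with a server-uuid entry); the helper names are hypothetical, the first UUID is M1's member_id as reported by the cluster, and the second UUID is a made-up value standing in for a regenerated one.

```javascript
// Hypothetical helper: extract the server UUID from the text of auto.cnf.
function parseServerUuid(autoCnfText) {
  const match = autoCnfText.match(
    /^\s*server[-_]uuid\s*=\s*([0-9a-fA-F-]{36})\s*$/m
  );
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: true if the UUID in auto.cnf is already a known member.
function isKnownMember(autoCnfText, knownMemberIds) {
  const uuid = parseServerUuid(autoCnfText);
  return uuid !== null && knownMemberIds.includes(uuid);
}

// M1's member_id as known by the cluster.
const knownMemberIds = ["6ad8caed-9d90-11e9-96e5-0242ac13000b"];

// auto.cnf restored from the backup: same UUID as before the crash.
const restoredAutoCnf =
  "[auto]\nserver-uuid=6ad8caed-9d90-11e9-96e5-0242ac13000b\n";
// auto.cnf regenerated by the server: made-up UUID for illustration only.
const regeneratedAutoCnf =
  "[auto]\nserver-uuid=00000000-1111-2222-3333-444444444444\n";

console.log(isKnownMember(restoredAutoCnf, knownMemberIds));
console.log(isKnownMember(regeneratedAutoCnf, knownMemberIds));
```

When the check is false, the cluster will treat the server as a new node and the procedure for a lost auto.cnf applies.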

Also note that if you lose the mysqld-auto.cnf file you’ll probably need to configure (again) the server to be Group Replication aware.

To begin, you must stop the Group Replication plugin on the node that needs to be restored (M1):

$ mysqlsh root@M1 --sql -e"STOP GROUP_REPLICATION;"

Then if necessary you can check the configuration and/or configure the node:

MySQL JS> dba.checkInstanceConfiguration("root@M1:3306")
...


MySQL JS> dba.configureInstance("root@M1:3306")
...

You need to remove the old node from the InnoDB Cluster metadata:

MySQL JS> cluster.rescan()
Rescanning the cluster...

Result of the rescanning operation for the 'default' ReplicaSet:
{
    "name": "default", 
    "newTopologyMode": null, 
    "newlyDiscoveredInstances": [], 
    "unavailableInstances": [
        {
            "host": "M1:3306", 
            "label": "M1:3306", 
            "member_id": "6ad8caed-9d90-11e9-96e5-0242ac13000b"
        }
    ]
}

The instance 'M1:3306' is no longer part of the ReplicaSet.
The instance is either offline or left the HA group. You can try to add it to the cluster again with the cluster.rejoinInstance('M1:3306') command or you can remove it from the cluster configuration.
Would you like to remove it from the cluster metadata? [Y/n]: Y
Removing instance from the cluster metadata...
The instance 'M1:3306' was successfully removed from the cluster metadata.

Add the “new” node:

MySQL JS> cluster.addInstance("clusterAdmin@M1:3306")
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster ...

Validating instance at M1:3306...

This instance reports its own address as M1

Instance configuration is suitable.
The instance 'clusterAdmin@M1:3306' was successfully added to the cluster.

Check – and after the recovery stage, the “new” node is online:

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "RECOVERING", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

// After a while...

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

Et voilà!

A simple alternative to deal with this “unpleasantness“, if you don’t need to configure the node, is basically to remove the node and add it again.
Below an example with M3:

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_PARTIAL", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}


MySQL JS> cluster.removeInstance("root@M3:3306")
The instance will be removed from the InnoDB cluster. Depending on the instance
being the Seed or not, the Metadata session might become invalid. If so, please
start a new session to the Metadata Storage R/W instance.

Instance 'M3:3306' is attempting to leave the cluster...

The instance 'M3:3306' was successfully removed from the cluster.


MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}


MySQL JS> cluster.addInstance("root@M3:3306")
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster ...

Validating instance at M3:3306...

This instance reports its own address as M3

Instance configuration is suitable.
The instance 'root@M3:3306' was successfully added to the cluster.


MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

Scenario 2 – Node Provisioning

  • A 3-node MySQL InnoDB Cluster – M1 / M2 / M3, in single-primary mode
  • MySQL Router is configured to enable R/W connections on port 3306 and RO connections on port 3307
  • M2 is currently the primary (that is Read/Write mode)
  • M1 & M3 are currently the secondaries (that is Read Only mode)

The goal then is to add 2 new nodes: M4 & M5

So we have the 3-node MySQL 8.0.16 InnoDB Cluster that we used in the first part of this article, and it is up and running.

Actually adding new nodes is very close to what we have done previously.

The process is :

  • Deploy the new MySQL instance preferably already configured for Group Replication
  • Restore the data in the way that we have seen previously

Checking the configuration and configuring the instance can be done with the dba.checkInstanceConfiguration() and dba.configureInstance() functions respectively (it could also be useful to use cluster.checkInstanceState(); see this article).
e.g. on node M4:

$ mysqlsh clusterAdmin@M4:3306 -- dba checkInstanceConfiguration
Validating MySQL instance at M4:3306 for use in an InnoDB cluster...

This instance reports its own address as M4

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'M4:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}

Restore the backup on M4, the new node:

# Restore the backup on M4
$ mysqlbackup --backup-dir=/exp/bck --datadir=/var/lib/mysql --backup-image=/data/backups/db.mbi copy-back-and-apply-log

And finally add the new node (M4):

// Add the new instance
MySQL JS> cluster.addInstance("clusterAdmin@M4:3306")


// Check
MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M4:3306": {
                "address": "M4:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

Same process for the last node, M5.

You end up with a 5-node MySQL InnoDB Cluster \o/:

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to 2 failures.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M4:3306": {
                "address": "M4:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M5:3306": {
                "address": "M5:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}
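
The "can tolerate up to N failures" figures in the status outputs follow from the majority rule of Group Replication: the group needs more than half of its members to stay available. A minimal sketch of that arithmetic in plain JavaScript follows; the function name toleratedFailures is hypothetical.

```javascript
// Hypothetical helper: number of member failures a group of the given
// size can survive while still keeping a majority.
function toleratedFailures(memberCount) {
  if (!Number.isInteger(memberCount) || memberCount < 1) {
    throw new RangeError("memberCount must be a positive integer");
  }
  return Math.floor((memberCount - 1) / 2);
}

// Cluster sizes seen in this article: 2 (M1 missing), 3, 4 and 5 members.
for (const size of [2, 3, 4, 5]) {
  console.log(size + " members -> " + toleratedFailures(size) + " failure(s)");
}
```

Note that going from 3 to 4 members does not improve tolerance, which is one reason to grow the cluster up to 5 nodes.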

This is one way to do node recovery and provisioning using MySQL Enterprise Backup.

You may also take a look at this article: InnoDB Cluster: Recovering an instance with MySQL Enterprise Backup – from my colleague Keith.

If you are not yet a MySQL customer, and therefore cannot use our advanced features/tools and technical support, mysqldump is probably a good fit. Please read: MySQL InnoDB Cluster – Recovering and provisioning with mysqldump.

Note that some new features are coming in this area… 🙂
Stay tuned!

Misc
Node 3 – Group Replication configuration

SQL> SHOW VARIABLES LIKE 'group_replication%';
+-----------------------------------------------------+--------------------------------------+
| Variable_name                                       | Value                                |
+-----------------------------------------------------+--------------------------------------+
| group_replication_allow_local_lower_version_join    | OFF                                  |
| group_replication_auto_increment_increment          | 7                                    |
| group_replication_autorejoin_tries                  | 0                                    |
| group_replication_bootstrap_group                   | OFF                                  |
| group_replication_communication_debug_options       | GCS_DEBUG_NONE                       |
| group_replication_communication_max_message_size    | 10485760                             |
| group_replication_components_stop_timeout           | 31536000                             |
| group_replication_compression_threshold             | 1000000                              |
| group_replication_consistency                       | EVENTUAL                             |
| group_replication_enforce_update_everywhere_checks  | OFF                                  |
| group_replication_exit_state_action                 | READ_ONLY                            |
| group_replication_flow_control_applier_threshold    | 25000                                |
| group_replication_flow_control_certifier_threshold  | 25000                                |
| group_replication_flow_control_hold_percent         | 10                                   |
| group_replication_flow_control_max_quota            | 0                                    |
| group_replication_flow_control_member_quota_percent | 0                                    |
| group_replication_flow_control_min_quota            | 0                                    |
| group_replication_flow_control_min_recovery_quota   | 0                                    |
| group_replication_flow_control_mode                 | QUOTA                                |
| group_replication_flow_control_period               | 1                                    |
| group_replication_flow_control_release_percent      | 50                                   |
| group_replication_force_members                     |                                      |
| group_replication_group_name                        | 28f66c86-9d66-11e9-876e-0242ac13000b |
| group_replication_group_seeds                       | M1:33061,M2:33061                    |
| group_replication_gtid_assignment_block_size        | 1000000                              |
| group_replication_ip_whitelist                      | AUTOMATIC                            |
| group_replication_local_address                     | M3:33061                             |
| group_replication_member_expel_timeout              | 0                                    |
| group_replication_member_weight                     | 50                                   |
| group_replication_message_cache_size                | 1073741824                           |
| group_replication_poll_spin_loops                   | 0                                    |
| group_replication_recovery_complete_at              | TRANSACTIONS_APPLIED                 |
| group_replication_recovery_get_public_key           | OFF                                  |
| group_replication_recovery_public_key_path          |                                      |
| group_replication_recovery_reconnect_interval       | 60                                   |
| group_replication_recovery_retry_count              | 10                                   |
| group_replication_recovery_ssl_ca                   |                                      |
| group_replication_recovery_ssl_capath               |                                      |
| group_replication_recovery_ssl_cert                 |                                      |
| group_replication_recovery_ssl_cipher               |                                      |
| group_replication_recovery_ssl_crl                  |                                      |
| group_replication_recovery_ssl_crlpath              |                                      |
| group_replication_recovery_ssl_key                  |                                      |
| group_replication_recovery_ssl_verify_server_cert   | OFF                                  |
| group_replication_recovery_use_ssl                  | ON                                   |
| group_replication_single_primary_mode               | ON                                   |
| group_replication_ssl_mode                          | REQUIRED                             |
| group_replication_start_on_boot                     | ON                                   |
| group_replication_transaction_size_limit            | 150000000                            |
| group_replication_unreachable_majority_timeout      | 0                                    |
+-----------------------------------------------------+--------------------------------------+

Thanks for using MySQL!

MySQL InnoDB Cluster – Recovering and provisioning with mysqldump

July 9, 2019
Butterfly by Olivier DASINI

As the administrator of a cluster, among other tasks, you should be able to restore failed nodes and grow (or shrink) your cluster by adding (or removing) new nodes.

In MySQL, as a backup tool (and if your amount of data is not too big), you can use mysqldump, a client utility that performs logical backups.
The results are SQL statements that reproduce the original schema objects and data.

For substantial amounts of data however, a physical backup solution such as MySQL Enterprise Backup is faster, particularly for the restore operation.
Hey! guess what? You can read: MySQL InnoDB Cluster – Recovering and provisioning with MySQL Enterprise Backup

Context

Let’s make it as simple as possible 🙂
I’m using MySQL 8.0.16.
I have an InnoDB Cluster setup – up and running.
So my main assumption is that you already know what MySQL Group Replication & MySQL InnoDB Cluster are.
Additionally you can read this tutorial and this article from my colleague lefred or this one on Windows Platform from my colleague Ivan.

Note:

Depending on how you configured your MySQL InnoDB Cluster, some steps could be slightly different.

Scenario 1 – Node Recovering

  • A 3-node MySQL InnoDB Cluster – M1 / M2 / M3, in single-primary mode
  • MySQL Router is configured to enable R/W connections on port 3306 and RO connections on port 3307
  • M1 is currently the primary (so in Read/Write mode)
  • M2 & M3 are currently the secondaries (i.e. Read Only mode)
  • M1 failed! Some tables are irreconcilably corrupted 🙁
  • M2 & M3 are now the (new temporary) cluster

The goal then is to rebuild M1 and put it back into the cluster.

So, as I stated, we have a 3-node MySQL 8.0.16 InnoDB Cluster up and running:

$ mysqlsh clusterAdmin@{mysqlRouterIP}:3306 --cluster
...

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M1:3306"
}

Then M1 failed (its status is “(MISSING)”):

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}
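If you want to script this check, the same information can be pulled out of the status document. Here is a minimal sketch in Python, assuming the output of cluster.status() was saved as JSON. The function name summarize_status and the trimmed sample document are mine, they are not part of MySQL Shell:

```python
import json

# Trimmed copy of the cluster.status() output shown above (illustrative sample).
STATUS_JSON = """
{
    "clusterName": "pocCluster",
    "defaultReplicaSet": {
        "primary": "M2:3306",
        "status": "OK_NO_TOLERANCE",
        "topology": {
            "M1:3306": {"mode": "n/a", "status": "(MISSING)"},
            "M2:3306": {"mode": "R/W", "status": "ONLINE"},
            "M3:3306": {"mode": "R/O", "status": "ONLINE"}
        }
    }
}
"""


def summarize_status(status):
    """Return the primary, the overall status and the members that are not ONLINE."""
    replica_set = status["defaultReplicaSet"]
    not_online = sorted(
        name
        for name, member in replica_set["topology"].items()
        if member["status"] != "ONLINE"
    )
    return {
        "primary": replica_set["primary"],
        "status": replica_set["status"],
        "not_online": not_online,
    }


summary = summarize_status(json.loads(STATUS_JSON))
print(summary)
```

With the sample above, M1:3306 is the only member reported as not ONLINE.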

M1 was the primary.
The cluster initiated an automatic database failover to elect a new primary… blablabla
Anyway you already know the story 🙂

After a while M1 is finally fixed and ready to be part of the cluster again.
To minimize the recovery time, instead of using the last dump we prefer to take a fresh one: the fresher the dump, the fewer transactions M1 has to catch up on when it rejoins.

Speaking of backup, I recommend reading the excellent blog post from my colleague Jesper: MySQL Backup Best Practices.

Let’s take a fresh dump on a secondary node using MySQL Router, on port 3307 with my custom configuration (we could also have used the primary).
The dump command will roughly look like this:

mysqldump --defaults-file=/etc/my.cnf -u mdump -p -h {mysqlRouterIP} -P 3307 --all-databases --routines --events --single-transaction --flush-privileges --hex-blob --log-error=/var/log/mysqldump.log --result-file=/data/backups/dump.sql

Minimum privileges for the MySQL user with which mysqldump connects to the server

Actually, the minimum privileges depend on which objects you want to dump and what the dump is for.

However, the following privileges should be fine for most common use cases:

GRANT SELECT, SHOW VIEW, EVENT, TRIGGER, LOCK TABLES, CREATE, ALTER, RELOAD, REPLICATION CLIENT, REPLICATION_SLAVE_ADMIN ON *.* TO <dumpUser>

Please note that it is highly recommended, in addition to the my.cnf, to include in your backup process a copy of the auto.cnf and mysqld-auto.cnf configuration files for all nodes.

If you “lose” your auto.cnf file, don’t worry: the server will generate a new one for you.
However the recovery process will be slightly different… (more on that below).

Now it’s time to restore this dump on node M1.

Because this server is part of an InnoDB Cluster, there are clearly some extra steps compared to a standalone server restoration.

Restore the data

First, restore the data on M1:

  • It’s a logical restoration so the server to restore must be up 😀
  • Group Replication plugin must be stopped
    • STOP GROUP_REPLICATION;
  • Disable logging to the binary log
    • SET SQL_LOG_BIN=0;
  • Delete binary log files
    • RESET MASTER;
  • Clear the master info and relay log info repositories and delete all the relay log files
    • RESET SLAVE;
  • Enable updates
    • SET GLOBAL super_read_only=0;
  • Load the dump
    • source /data/backups/dump.sql

This gives us:

mysqlsh root@M1:3306 --sql
...

M1 SQL> STOP GROUP_REPLICATION;
Query OK, 0 rows affected (12.04 sec)

M1 SQL> SET SQL_LOG_BIN=0;
Query OK, 0 rows affected (0.00 sec)

M1 SQL> RESET MASTER; 
Query OK, 0 rows affected (0.06 sec)

M1 SQL> RESET SLAVE;
Query OK, 0 rows affected (0.13 sec)

M1 SQL> SET GLOBAL super_read_only=0;
Query OK, 0 rows affected (0.00 sec)

M1 SQL> source /data/backups/dump.sql
Query OK, 0 rows affected (0.00 sec)
...

Put the node back to the cluster

Second, put the node back into the cluster.
Connect to the primary through MySQL Router (port 3306 in my case):

$ mysqlsh clusterAdmin@{mysqlRouterIP}:3306 --cluster
...

MySQL JS> cluster.rejoinInstance("clusterAdmin@M1:3306")
Rejoining the instance to the InnoDB cluster. Depending on the original problem that made the instance unavailable, the rejoin operation might not be successful and further manual steps will be needed to fix the underlying problem.

Please monitor the output of the rejoin operation and take necessary action if the instance cannot rejoin.

Rejoining instance to the cluster ...

The instance 'M1:3306' was successfully rejoined on the cluster.

Now you should check the new cluster status:

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

Note:

The restored node will show the status “RECOVERING” before becoming “ONLINE”.

Lost the auto.cnf file

As promised, the case when the auto.cnf configuration file is not restored.
In fact, in that case the cluster would see this node as a new node (because the server will have a new UUID).
So the process for putting it back is different.
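To make the UUID point concrete: auto.cnf stores the server UUID in an [auto] section, under the server-uuid key. The following Python sketch is only an illustration. The auto.cnf content is a hypothetical sample built from the member_id that the cluster metadata reports for M1 in the rescan output later in this article, and the function names are mine:

```python
import configparser

# Hypothetical content of the lost auto.cnf: the UUID is the member_id that the
# cluster metadata still holds for M1.
AUTO_CNF = """\
[auto]
server-uuid=a3f1ee50-9be3-11e9-a3fe-0242ac13000b
"""


def read_server_uuid(auto_cnf_text):
    """Extract the server-uuid value from the text of an auto.cnf file."""
    parser = configparser.ConfigParser()
    parser.read_string(auto_cnf_text)
    return parser["auto"]["server-uuid"]


def recovery_path(current_uuid, uuid_in_metadata):
    """Same UUID: the node can be rejoined. New UUID: it has to be added as a new node."""
    if current_uuid.lower() == uuid_in_metadata.lower():
        return "rejoinInstance"
    return "rescan + addInstance"


uuid_in_metadata = "a3f1ee50-9be3-11e9-a3fe-0242ac13000b"
print(recovery_path(read_server_uuid(AUTO_CNF), uuid_in_metadata))
# A regenerated auto.cnf carries a different UUID (made-up value here):
print(recovery_path("11111111-2222-3333-4444-555555555555", uuid_in_metadata))
```

The first call corresponds to the case where auto.cnf was restored, the second one to the case described in this section.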

Also note that if you lose the mysqld-auto.cnf file you’ll probably need to configure the server (again) to be Group Replication aware.

So basically, the process consists of some cleanup and then adding the old node as if it were a new one.


Assuming Group Replication plugin is stopped on M1:

// Check the configuration (even more important if you lost mysqld-auto.cnf) :)
MySQL JS> dba.checkInstanceConfiguration('clusterAdmin@M1:3306')
Validating MySQL instance at M1:3306 for use in an InnoDB cluster...

This instance reports its own address as M1

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'M1:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}


// If needed configure your instance
MySQL JS> dba.configureInstance('clusterAdmin@M1:3306')


// Remove the old node from the cluster metadata
MySQL JS> cluster.rescan()
Rescanning the cluster...

Result of the rescanning operation for the 'default' ReplicaSet:
{
    "name": "default", 
    "newTopologyMode": null, 
    "newlyDiscoveredInstances": [], 
    "unavailableInstances": [
        {
            "host": "M1:3306", 
            "label": "M1:3306", 
            "member_id": "a3f1ee50-9be3-11e9-a3fe-0242ac13000b"
        }
    ]
}

The instance 'M1:3306' is no longer part of the ReplicaSet.
The instance is either offline or left the HA group. You can try to add it to the cluster again with the cluster.rejoinInstance('M1:3306') command or you can remove it from the cluster configuration.
Would you like to remove it from the cluster metadata? [Y/n]: Y
Removing instance from the cluster metadata...
The instance 'M1:3306' was successfully removed from the cluster metadata.


// Add the new instance
MySQL JS> cluster.addInstance("clusterAdmin@M1:3306")
A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster ...

Validating instance at M1:3306...

This instance reports its own address as M1

Instance configuration is suitable.
The instance 'clusterAdmin@M1:3306' was successfully added to the cluster.


// Check
MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

Et voilà !

Scenario 2 – Node Provisioning

  • A 3-node MySQL InnoDB Cluster – M1 / M2 / M3 in single-primary mode
  • MySQL Router is configured to enable R/W connections on port 3306 and RO connections on port 3307
  • M2 is currently the primary in Read/Write mode
  • M1 & M3 are currently the secondaries in Read Only mode

The goal then is to add 2 new nodes: M4 & M5

So we have the 3-node MySQL 8.0.16 InnoDB Cluster that we used in the first part of this article.
And it is up and running.

Actually adding new nodes is very close to what we have done previously.

The process is :

  • Deploy the new MySQL instance preferably already configured for Group Replication
  • Restore the data in the way that we have seen previously

Checking the configuration and configuring the instance can be done with the dba.checkInstanceConfiguration() and dba.configureInstance() functions respectively (it could also be useful to use checkInstanceState(); see this article).
For example, on node M4:

$ mysqlsh clusterAdmin@M4:3306 -- dba checkInstanceConfiguration
Validating MySQL instance at M4:3306 for use in an InnoDB cluster...

This instance reports its own address as M4

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'M4:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}

The first part of the restore process is the same as the one we have seen:

mysqlsh root@M4:3306 --sql
...

M4 SQL> STOP GROUP_REPLICATION; -- if necessary
Query OK, 0 rows affected (12.04 sec)

M4 SQL> SET SQL_LOG_BIN=0;
Query OK, 0 rows affected (0.00 sec)

M4 SQL> RESET MASTER; 
Query OK, 0 rows affected (0.06 sec)

M4 SQL> RESET SLAVE;
Query OK, 0 rows affected (0.13 sec)

M4 SQL> SET GLOBAL super_read_only=0;
Query OK, 0 rows affected (0.00 sec)

M4 SQL> source /data/backups/dump.sql
Query OK, 0 rows affected (0.00 sec)
...
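Since the same statements are used for every node, the sequence can be generated instead of typed. A small Python sketch, where restore_script is a hypothetical helper of mine that only builds the text of the script, in the order used above:

```python
def restore_script(dump_path, stop_group_replication=True):
    """Build the ordered statements used in this article to restore a dump on a node.

    All statements must run in the same session, because SQL_LOG_BIN is a
    session-level setting.
    """
    statements = []
    if stop_group_replication:
        # Only needed when Group Replication is running on the node.
        statements.append("STOP GROUP_REPLICATION;")
    statements += [
        "SET SQL_LOG_BIN=0;",             # do not write the restore to the binary log
        "RESET MASTER;",                  # delete the binary log files
        "RESET SLAVE;",                   # clear the repositories and the relay logs
        "SET GLOBAL super_read_only=0;",  # enable updates
        f"source {dump_path}",            # load the dump
    ]
    return "\n".join(statements) + "\n"


print(restore_script("/data/backups/dump.sql"))
```

The stop_group_replication flag covers the “if necessary” case of a freshly deployed node.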

For the second part, we will now add the new node (M4) :

// Add the new instance
MySQL JS> cluster.addInstance("clusterAdmin@M4:3306")


// Check
MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M4:3306": {
                "address": "M4:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}

Note:

If necessary, just before the addInstance() you can run checkInstanceConfiguration() and configureInstance().

Same process for the last node, M5.

And finally you get a 5-node MySQL InnoDB Cluster \o/:

MySQL JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "M2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to 2 failures.", 
        "topology": {
            "M1:3306": {
                "address": "M1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M2:3306": {
                "address": "M2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M3:3306": {
                "address": "M3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M4:3306": {
                "address": "M4:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }, 
            "M5:3306": {
                "address": "M5:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.16"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "M2:3306"
}
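The statusText values seen in this article follow from the majority rule of Group Replication: a group must keep a majority of its members to stay available, so a group of n members tolerates (n - 1) / 2 failures, rounded down. A tiny Python sketch (the function name is mine):

```python
def tolerated_failures(members):
    """Number of members a group can lose while still keeping a majority."""
    if members < 1:
        raise ValueError("a group needs at least one member")
    return (members - 1) // 2


for n in (2, 3, 4, 5):
    print(n, "members ->", tolerated_failures(n), "tolerated failure(s)")
```

This is why the 4-node cluster still reported “up to ONE failure”: going from 3 to 4 members does not add fault tolerance, going to 5 does.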

This is one way to do node recovery and provisioning using mysqldump, when the amount of data is not too large.

When a logical backup is not efficient, it is time to use a hot, online and non-blocking physical backup tool like MySQL Enterprise Backup.
Actually, it’s even easier!
Please read: MySQL InnoDB Cluster – Recovering and provisioning with MySQL Enterprise Backup.

A final word to say that some new features are coming in this area… 🙂
Stay tuned!

Misc
Node 1 – Group Replication configuration

mysql> SHOW VARIABLES LIKE 'group_replication%';
+-----------------------------------------------------+-------------------------------------+
| Variable_name                                       | Value                               |
+-----------------------------------------------------+-------------------------------------+
| group_replication_allow_local_lower_version_join    | OFF                                 |
| group_replication_auto_increment_increment          | 7                                   |
| group_replication_autorejoin_tries                  | 1                                   |
| group_replication_bootstrap_group                   | OFF                                 |
| group_replication_communication_debug_options       | GCS_DEBUG_NONE                      |
| group_replication_communication_max_message_size    | 10485760                            |
| group_replication_components_stop_timeout           | 31536000                            |
| group_replication_compression_threshold             | 1000000                             |
| group_replication_consistency                       | EVENTUAL                            |
| group_replication_enforce_update_everywhere_checks  | OFF                                 |
| group_replication_exit_state_action                 | READ_ONLY                           |
| group_replication_flow_control_applier_threshold    | 25000                               |
| group_replication_flow_control_certifier_threshold  | 25000                               |
| group_replication_flow_control_hold_percent         | 10                                  |
| group_replication_flow_control_max_quota            | 0                                   |
| group_replication_flow_control_member_quota_percent | 0                                   |
| group_replication_flow_control_min_quota            | 0                                   |
| group_replication_flow_control_min_recovery_quota   | 0                                   |
| group_replication_flow_control_mode                 | QUOTA                               |
| group_replication_flow_control_period               | 1                                   |
| group_replication_flow_control_release_percent      | 50                                  |
| group_replication_force_members                     |                                     |
| group_replication_group_name                        | d1b109bf-9be3-11e9-9ea2-0242ac13000b|
| group_replication_group_seeds                       | M2:33061,M3:33061,M4:33061,M5:33061 |
| group_replication_gtid_assignment_block_size        | 1000000                             |
| group_replication_ip_whitelist                      | AUTOMATIC                           |
| group_replication_local_address                     | M1:33061                            |
| group_replication_member_expel_timeout              | 0                                   |
| group_replication_member_weight                     | 50                                  |
| group_replication_message_cache_size                | 1073741824                          |
| group_replication_poll_spin_loops                   | 0                                   |
| group_replication_recovery_complete_at              | TRANSACTIONS_APPLIED                |
| group_replication_recovery_get_public_key           | OFF                                 |
| group_replication_recovery_public_key_path          |                                     |
| group_replication_recovery_reconnect_interval       | 60                                  |
| group_replication_recovery_retry_count              | 10                                  |
| group_replication_recovery_ssl_ca                   |                                     |
| group_replication_recovery_ssl_capath               |                                     |
| group_replication_recovery_ssl_cert                 |                                     |
| group_replication_recovery_ssl_cipher               |                                     |
| group_replication_recovery_ssl_crl                  |                                     |
| group_replication_recovery_ssl_crlpath              |                                     |
| group_replication_recovery_ssl_key                  |                                     |
| group_replication_recovery_ssl_verify_server_cert   | OFF                                 |
| group_replication_recovery_use_ssl                  | ON                                  |
| group_replication_single_primary_mode               | ON                                  |
| group_replication_ssl_mode                          | REQUIRED                            |
| group_replication_start_on_boot                     | ON                                  |
| group_replication_transaction_size_limit            | 150000000                           |
| group_replication_unreachable_majority_timeout      | 0                                   |
+-----------------------------------------------------+-------------------------------------+

Thanks for using MySQL!

Follow me on Linkedin

Watch my videos on my YouTube channel and subscribe.

My Slideshare account.

My Speaker Deck account.

Thanks for using HeatWave & MySQL!


Check the MySQL server startup configuration

June 11, 2019
Caribbean by Olivier DASINI

Since 8.0.16, MySQL Server supports a validate-config option that enables the startup configuration to be checked for problems without running the server in normal operational mode:

  • If no errors are found, the server terminates with an exit code of 0.
  • If an error is found, the server displays a diagnostic message and terminates with an exit code of 1.
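Because the result is carried by the exit code, the check is easy to script. Here is a minimal sketch in Python. The function name config_is_valid is mine, and the two stand-in commands only mimic the exit codes 0 and 1 so that the sketch runs without a MySQL installation:

```python
import subprocess
import sys


def config_is_valid(command):
    """Run a validation command; return (ok, diagnostics) based on its exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    diagnostics = (result.stdout + result.stderr).strip()
    return result.returncode == 0, diagnostics


# Real usage (assumption: mysqld is on the PATH of the user running the script):
#   ok, diagnostics = config_is_valid(["mysqld", "--validate-config"])

# Stand-in commands that only reproduce the two possible exit codes.
fake_ok = [sys.executable, "-c", "import sys; sys.exit(0)"]
fake_error = [
    sys.executable,
    "-c",
    "import sys; print('unknown option', file=sys.stderr); sys.exit(1)",
]

print(config_is_valid(fake_ok))
print(config_is_valid(fake_error))
```

In an upgrade procedure, a False result would be the signal to stop before restarting the server.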

validate-config can be used any time, but is particularly useful after an upgrade, to check whether any options previously used with the older server are considered by the upgraded server to be deprecated or obsolete.

First let’s get some information about my MySQL version and configuration.

$ mysqld --help --verbose | head -n13
mysqld  Ver 8.0.16 for Linux on x86_64 (MySQL Community Server - GPL)
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Starts the MySQL database server.

Usage: mysqld [OPTIONS]

Default options are read from the following files in the given order:
/etc/my.cnf /etc/mysql/my.cnf /usr/etc/my.cnf ~/.my.cnf 

I’m using MySQL 8.0.16.
The default options are read, in the given order, from:

  • /etc/my.cnf
  • /etc/mysql/my.cnf
  • /usr/etc/my.cnf
  • ~/.my.cnf

Now let’s check my MySQL server startup configuration :

$ mysqld --validate-config
$

No error !
No output, everything looks good.
My server will start with this configuration.

If there is an error, the server terminates.
The output is obviously different :

$ mysqld --validate-config --fake-option
2019-06-05T15:10:08.653775Z 0 [ERROR] [MY-000068] [Server] unknown option '--fake-option'.
2019-06-05T15:10:08.653822Z 0 [ERROR] [MY-010119] [Server] Aborting

Usually your configuration options are written in your configuration file (in general named my.cnf).
Therefore you can also use validate-config in this context :

$ mysqld --defaults-file=/etc/my.cnf --validate-config 
$ 

Note:

defaults-file, if specified, must be the first option on the command line.

Furthermore you can handle the verbosity using log_error_verbosity :

  • A value of 1 gives you ERROR
  • A value of 2 gives you ERROR & WARNING
  • A value of 3 gives you ERROR, WARNING & INFORMATION (i.e. note)

With a verbosity of 2, in addition to errors, we will be able to display warnings :

$ mysqld --defaults-file=/etc/my.cnf --validate-config  --log_error_verbosity=2
2019-06-05T15:53:42.785422Z 0 [Warning] [MY-011068] [Server] The syntax 'expire-logs-days' is deprecated and will be removed in a future release. Please use binlog_expire_logs_seconds instead.
2019-06-05T15:53:42.785660Z 0 [Warning] [MY-010101] [Server] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.

Nothing very serious, however it is a best practice to eliminate warnings when possible.

So I fixed these warnings :

$ mysqld --defaults-file=/etc/my.cnf --validate-config  --log_error_verbosity=2
2019-06-05T16:04:32.363297Z 0 [ERROR] [MY-000067] [Server] unknown variable 'binlog_expire_logs_second=7200'.
2019-06-05T16:04:32.363369Z 0 [ERROR] [MY-010119] [Server] Aborting

Oops!!! There is a typo… :-0
I wrote binlog_expire_logs_second instead of binlog_expire_logs_seconds.
(I forgot the final “s”)

With that typo, my MySQL server would not have started.
Thanks to validate-config, I can now avoid such unpleasant surprises when starting the server 🙂

With the correct spelling I now have no errors and no warnings:

$ mysqld --defaults-file=/etc/my.cnf --validate-config  --log_error_verbosity=2
$ 

Note that you could also use verbosity 3:

$ mysqld --defaults-file=/etc/my.cnf --validate-config  --log_error_verbosity=3
2019-06-05T16:02:03.589770Z 0 [Note] [MY-010747] [Server] Plugin 'FEDERATED' is disabled.
2019-06-05T16:02:03.590719Z 0 [Note] [MY-010733] [Server] Shutting down plugin 'MyISAM'
2019-06-05T16:02:03.590763Z 0 [Note] [MY-010733] [Server] Shutting down plugin 'CSV'

validate-config is convenient and can be very useful.
It may be worthwhile to include it in your upgrade process.
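If you include it in an upgrade script, the diagnostic lines can be parsed to decide whether to go on. A minimal sketch in Python, assuming the line layout seen in the outputs above (timestamp, thread id, severity, error code, subsystem, message). The function names are mine, and the regular expression is only fitted to these samples:

```python
import re

# Layout assumed from the outputs above:
# <timestamp> <thread id> [<severity>] [<error code>] [<subsystem>] <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+) (?P<thread>\d+) "
    r"\[(?P<severity>\w+)\] \[(?P<code>MY-\d+)\] \[(?P<subsystem>\w+)\] "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one diagnostic line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def has_errors(output):
    """True when at least one parsed line has the severity ERROR."""
    entries = (parse_log_line(line) for line in output.splitlines())
    return any(entry and entry["severity"].upper() == "ERROR" for entry in entries)


SAMPLE = """\
2019-06-05T16:04:32.363297Z 0 [ERROR] [MY-000067] [Server] unknown variable 'binlog_expire_logs_second=7200'.
2019-06-05T16:04:32.363369Z 0 [ERROR] [MY-010119] [Server] Aborting
"""

for line in SAMPLE.splitlines():
    entry = parse_log_line(line)
    print(entry["severity"], entry["code"], entry["message"])
print("errors found:", has_errors(SAMPLE))
```

Warnings and notes are parsed the same way, so the script can also report them without blocking the upgrade.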


MySQL 8.0.16 New Features Summary

June 5, 2019
Sakila mozaik by Olivier DASINI

Presentation of some of the new features of MySQL 8.0.16 released on April 25, 2019.

Agenda

  • mysql_upgrade is no longer necessary
  • CHECK Constraints
  • Constant-Folding Optimization
  • SYSTEM_USER & partial_revokes
  • Chinese collation for utf8mb4
  • Performance Schema keyring_keys table
  • MySQL Shell Enhancements
  • MySQL Router Enhancements
  • InnoDB Cluster Enhancements
  • Group Replication Enhancements
  • Size of the binary tarball for Linux
  • Server quick settings validation

Download this presentation and others on my SlideShare account.

I’ve also made a video (in French) on my Youtube channel.

You can subscribe here.


MySQL InnoDB Cluster – HowTo #2 – Validate an instance

May 21, 2019
Sakila HA by Olivier DASINI

How do I… Validate an instance for MySQL InnoDB Cluster usage?

Short answer

Use:

checkInstanceConfiguration()

Long answer…

In this article I assume you already know what MySQL Group Replication and MySQL InnoDB Cluster are.
Additionally you can read this tutorial and this article from my colleague lefred or this one on Windows Platform from my colleague Ivan.

During the cluster creation process or when you want to add a node to a running cluster, the chosen MySQL instance must be valid for an InnoDB Cluster usage.
That is, be compliant with Group Replication requirements.

MySQL Shell provides a simple and easy way to check if your instance is valid: checkInstanceConfiguration()

I’m using MySQL Shell 8.0.16:

$ mysqlsh
MySQL Shell 8.0.16

Copyright (c) 2016, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.

MySQL JS> 

In this scenario my cluster is not created yet. However the logic would have been the same for adding a node to a running cluster.

Ask for help

The built-in help is simply awesome!

MySQL JS> dba.help('checkInstanceConfiguration')
NAME
      checkInstanceConfiguration - Validates an instance for MySQL InnoDB
                                   Cluster usage.

SYNTAX
      dba.checkInstanceConfiguration(instance[, options])

WHERE
      instance: An instance definition.
      options: Data for the operation.

RETURNS
       A descriptive text of the operation result.

DESCRIPTION
      This function reviews the instance configuration to identify if it is
      valid for usage with group replication. Use this to check for possible
      configuration issues on MySQL instances before creating a cluster with
      them or adding them to an existing cluster.

      The instance definition is the connection data for the instance.

      For additional information on connection data use \? connection.

      Only TCP/IP connections are allowed for this function.

      The options dictionary may contain the following options:

      - mycnfPath: Optional path to the MySQL configuration file for the
        instance. Alias for verifyMyCnf
      - verifyMyCnf: Optional path to the MySQL configuration file for the
        instance. If this option is given, the configuration file will be
        verified for the expected option values, in addition to the global
        MySQL system variables.
      - password: The password to get connected to the instance.
      - interactive: boolean value used to disable the wizards in the command
        execution, i.e. prompts are not provided to the user and confirmation
        prompts are not shown.

      The connection password may be contained on the instance definition,
      however, it can be overwritten if it is specified on the options.

      The returned descriptive text of the operation result indicates whether
      the instance is valid for InnoDB Cluster usage or not. If not, a table
      containing the following information is presented:

      - Variable: the invalid configuration variable.
      - Current Value: the current value for the invalid configuration
        variable.
      - Required Value: the required value for the configuration variable.
      - Note: the action to be taken.

      The note can be one of the following:

      - Update the config file and update or restart the server variable.
      - Update the config file and restart the server.
      - Update the config file.
      - Update the server variable.
      - Restart the server.

EXCEPTIONS
      ArgumentError in the following scenarios:

      - If the instance parameter is empty.
      - If the instance definition is invalid.
      - If the instance definition is a connection dictionary but empty.

      RuntimeError in the following scenarios:

      - If the instance accounts are invalid.
      - If the instance is offline.
      - If the instance is already part of a Replication Group.
      - If the instance is already part of an InnoDB Cluster.
      - If the given the instance cannot be used for Group Replication.

Check Instance Configuration

In order to check a MySQL instance I must connect to that instance, either by connecting to that instance with MySQL Shell or by providing the connection data to the function:

MySQL JS> dba.checkInstanceConfiguration('root@172.20.0.11')
Validating MySQL instance at 172.20.0.11:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1

Checking whether existing tables comply with Group Replication requirements...
WARNING: The following tables do not have a Primary Key or equivalent column: 
test.squares, test.people, test.animal

Group Replication requires tables to use InnoDB and have a PRIMARY KEY or PRIMARY KEY Equivalent (non-null unique key). Tables that do not follow these requirements will be readable but not updateable when used with Group Replication. If your applications make updates (INSERT, UPDATE or DELETE) to these tables, ensure they use the InnoDB storage engine and have a PRIMARY KEY or PRIMARY KEY Equivalent.

Checking instance configuration...

Some configuration options need to be fixed:
+--------------------------+---------------+----------------+--------------------------------------------------+
| Variable                 | Current Value | Required Value | Note                                             |
+--------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum          | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                | OFF           | ON             | Update read-only variable and restart the server |
| server_id                | 1             | <unique ID>    | Update read-only variable and restart the server |
+--------------------------+---------------+----------------+--------------------------------------------------+

Some variables need to be changed, but cannot be done dynamically on the server.
Please use the dba.configureInstance() command to repair these issues.

{
    "config_errors": [
        {
            "action": "server_update", 
            "current": "CRC32", 
            "option": "binlog_checksum", 
            "required": "NONE"
        }, 
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "enforce_gtid_consistency", 
            "required": "ON"
        }, 
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "gtid_mode", 
            "required": "ON"
        }, 
        {
            "action": "restart", 
            "current": "1", 
            "option": "server_id", 
            "required": "<unique ID>"
        }
    ], 
    "status": "error"
}

The output depends on the current state of the instance.
In my case 3 tables do not meet the requirements because they lack a primary key (or a non-null unique key).
I also need to set 4 variables correctly, and 3 of them require a restart of the MySQL instance.
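
The returned dictionary is easy to post-process. Here is a minimal sketch in plain Python, assuming the result has been turned into a regular dict (for example by parsing the JSON output above); the summarize_check() helper is my own illustration, not part of MySQL Shell, and it only knows the action values visible in this sample:

```python
# Sample result, copied from the checkInstanceConfiguration() output above.
result = {
    "config_errors": [
        {"action": "server_update", "current": "CRC32",
         "option": "binlog_checksum", "required": "NONE"},
        {"action": "restart", "current": "OFF",
         "option": "enforce_gtid_consistency", "required": "ON"},
        {"action": "restart", "current": "OFF",
         "option": "gtid_mode", "required": "ON"},
        {"action": "restart", "current": "1",
         "option": "server_id", "required": "<unique ID>"},
    ],
    "status": "error",
}


def summarize_check(result):
    """Return (is_valid, options_to_fix, options_needing_restart).

    Hypothetical helper: it assumes that an action containing "restart"
    means the server must be restarted, as in the sample above.
    """
    errors = result.get("config_errors", [])
    to_fix = [e["option"] for e in errors]
    need_restart = [e["option"] for e in errors
                    if "restart" in e.get("action", "")]
    return result.get("status") == "ok", to_fix, need_restart


is_valid, to_fix, need_restart = summarize_check(result)
print("valid:", is_valid)
print("to fix:", ", ".join(to_fix))
print("restart needed for:", ", ".join(need_restart))
```

With the sample above, it reports 4 options to fix, 3 of which need a restart, which matches the table.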

Automation

It is not always convenient (or recommended) to do this kind of task manually.
MySQL Shell is built with DevOps usage in mind:

$ mysqlsh -e "dba.checkInstanceConfiguration('root@172.20.0.12')"
Validating MySQL instance at 172.20.0.12:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node2

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...

Some configuration options need to be fixed:
+--------------------------+---------------+----------------+--------------------------------------------------+
| Variable                 | Current Value | Required Value | Note                                             |
+--------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum          | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                | OFF           | ON             | Update read-only variable and restart the server |
| server_id                | 1             | <unique ID>    | Update read-only variable and restart the server |
+--------------------------+---------------+----------------+--------------------------------------------------+

Some variables need to be changed, but cannot be done dynamically on the server.
Please use the dba.configureInstance() command to repair these issues.

Or even more practical:

$ mysqlsh -- dba checkInstanceConfiguration --user=root --host=172.20.0.13
Validating MySQL instance at 172.20.0.13:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node3

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...

Some configuration options need to be fixed:
+--------------------------+---------------+----------------+--------------------------------------------------+
| Variable                 | Current Value | Required Value | Note                                             |
+--------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum          | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                | OFF           | ON             | Update read-only variable and restart the server |
| server_id                | 1             | <unique ID>    | Update read-only variable and restart the server |
+--------------------------+---------------+----------------+--------------------------------------------------+

Some variables need to be changed, but cannot be done dynamically on the server.
Please use the dba.configureInstance() command to repair these issues.

{
    "config_errors": [
        {
            "action": "server_update", 
            "current": "CRC32", 
            "option": "binlog_checksum", 
            "required": "NONE"
        }, 
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "enforce_gtid_consistency", 
            "required": "ON"
        }, 
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "gtid_mode", 
            "required": "ON"
        }, 
        {
            "action": "restart", 
            "current": "1", 
            "option": "server_id", 
            "required": "<unique ID>"
        }
    ], 
    "status": "error"
}

Another option is to create a script and pass it to MySQL Shell.
A very simple (and naive) example could be:

$ cat /tmp/servers.js
dba.checkInstanceConfiguration('root@172.20.0.11');
dba.checkInstanceConfiguration('root@172.20.0.12');
dba.checkInstanceConfiguration('root@172.20.0.13');

then process the file:

$ mysqlsh  -f /tmp/servers.js
Validating MySQL instance at 172.20.0.11:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance '172.20.0.11:3306' is valid for InnoDB cluster usage.

Validating MySQL instance at 172.20.0.12:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node2

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance '172.20.0.12:3306' is valid for InnoDB cluster usage.

Validating MySQL instance at 172.20.0.13:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node3

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance '172.20.0.13:3306' is valid for InnoDB cluster usage.

In the previous scenario all the MySQL instances were properly configured before the check.

Note that everything done previously in JavaScript can also be done in Python:

MySQL 172.20.0.11:33060+ JS> \py
Switching to Python mode...

MySQL 172.20.0.11:33060+ Py> dba.check_instance_configuration()
Validating MySQL instance at 172.20.0.11:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance '172.20.0.11:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}

A Python script can be run in batch mode as well:

$ mysqlsh root@172.20.0.11 --py -f check_servers.py
...
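
The content of check_servers.py is not shown here. As a purely hypothetical sketch, such a script could loop over the instances and collect the status of each one. The check_all() helper and the fake_check() stub below are my own illustrations; inside MySQL Shell you would pass dba.check_instance_configuration instead of the stub, assuming the returned object can be indexed by key as the JSON output suggests:

```python
def check_all(targets, check):
    """Run `check` on each target and map target -> reported status."""
    statuses = {}
    for target in targets:
        try:
            statuses[target] = check(target)["status"]
        except Exception as exc:  # e.g. instance offline or invalid account
            statuses[target] = "exception: %s" % exc
    return statuses


# Inside MySQL Shell (mysqlsh --py) the `dba` global exists, so one could call:
#   check_all(targets, dba.check_instance_configuration)
# Outside the shell, this stub stands in for it so the logic can be tried anywhere.
def fake_check(target):
    if target.endswith(".13"):
        raise RuntimeError("instance is offline")
    return {"status": "ok"}


targets = ["root@172.20.0.11", "root@172.20.0.12", "root@172.20.0.13"]
report = check_all(targets, fake_check)
for target, status in report.items():
    print(target, "->", status)
```

Catching the exception per instance keeps one unreachable server from aborting the whole run, which is the main weakness of the naive JavaScript file above.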

To summarize

Q: How do I validate an instance for MySQL InnoDB Cluster usage?

A: Use check_instance_configuration()


CHECK constraints in MySQL

May 14, 2019
Above the clouds by Olivier DASINI

MySQL (really) supports CHECK CONSTRAINT since version 8.0.16.
In this article I will show you 2 things:

  1. An elegant way to simulate a CHECK constraint in MySQL 5.7 & 8.0.
  2. How easy and convenient it is to use CHECK constraints starting from MySQL 8.0.16.

Please note that this article is strongly inspired by Mablomy‘s blog post: CHECK constraint for MySQL – NOT NULL on generated columns.

I’m using the optimized MySQL Server Docker images, created, maintained and supported by the MySQL team at Oracle.
For clarity I chose MySQL 8.0.15 for the check constraint hack and obviously 8.0.16 for the “real” check constraint implementation.


Deployment of MySQL 8.0.15 & MySQL 8.0.16:

$ docker run --name=mysql-8.0.15 -e MYSQL_ROOT_PASSWORD=unsafe -d mysql/mysql-server:8.0.15
 d4ce35e429e08bbf46a02729e6667458e2ed90ce94e7622f1342ecb6c0dfa009
$ docker run --name=mysql-8.0.16 -e MYSQL_ROOT_PASSWORD=unsafe -d mysql/mysql-server:8.0.16
 d3b22dff1492fe6cb488a7f747e4709459974e79ae00b60eb0aee20546b68a0f

Note:

Obviously using a password on the command line interface can be insecure.

Please read the best practices of deploying MySQL on Linux with Docker.

Example 1

Check constraints hack

$ docker exec -it mysql-8.0.15 mysql -uroot -p --prompt='mysql-8.0.15> '
Enter password: 

mysql-8.0.15> CREATE SCHEMA test;
Query OK, 1 row affected (0.03 sec)

mysql-8.0.15> USE test
Database changed

mysql-8.0.15> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.15    |
+-----------+


mysql-8.0.15> 
CREATE TABLE checker_hack ( 
    i tinyint, 
    i_must_be_between_7_and_12 BOOLEAN 
         GENERATED ALWAYS AS (IF(i BETWEEN 7 AND 12, true, NULL)) 
         VIRTUAL NOT NULL
);

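As you can see, the trick is to combine a Generated Column (available since MySQL 5.7) with the flow control function IF, which holds the check condition: when the condition is not met, IF returns NULL, and the NOT NULL attribute of the generated column rejects the row.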
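
To make the mechanism explicit, here is a small Python model of the hack (an illustration only, it does not talk to MySQL). None plays the role of NULL, and the generated_flag() and insert() names are mine:

```python
def generated_flag(i):
    """Mimic IF(i BETWEEN 7 AND 12, true, NULL); None plays the role of NULL."""
    if i is not None and 7 <= i <= 12:
        return True
    return None


def insert(table, i):
    """Append the row only if the NOT NULL generated column has a value."""
    if generated_flag(i) is None:
        raise ValueError("Column 'i_must_be_between_7_and_12' cannot be null")
    table.append(i)


table = []
rejected = []
for value in (11, 12, 13):
    try:
        insert(table, value)
    except ValueError:
        rejected.append(value)

print("stored:", table)
print("rejected:", rejected)
```

The "constraint" is therefore nothing more than a NOT NULL violation on a column whose value is computed from the condition.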

mysql-8.0.15> INSERT INTO checker_hack (i) VALUES (11);
Query OK, 1 row affected (0.03 sec)

mysql-8.0.15> INSERT INTO checker_hack (i) VALUES (12);
Query OK, 1 row affected (0.01 sec)


mysql-8.0.15> SELECT i FROM checker_hack;
+------+
| i    |
+------+
|   11 |
|   12 |
+------+
2 rows in set (0.00 sec)

As expected, values that respect the condition (between 7 and 12) can be inserted.

mysql-8.0.15> INSERT INTO checker_hack (i) VALUES (13);
ERROR 1048 (23000): Column 'i_must_be_between_7_and_12' cannot be null


mysql-8.0.15> SELECT i FROM checker_hack;
+------+
| i    |
+------+
|   11 |
|   12 |
+------+
2 rows in set (0.00 sec)

Outside the limits, an error is raised.
We have our “check constraint” like feature 🙂

Check constraint since MySQL 8.0.16

$ docker exec -it mysql-8.0.16 mysql -uroot -p --prompt='mysql-8.0.16> '
Enter password: 

mysql-8.0.16> CREATE SCHEMA test;
Query OK, 1 row affected (0.08 sec)

mysql-8.0.16> USE test
Database changed

mysql-8.0.16> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.16    |
+-----------+


mysql-8.0.16> 
CREATE TABLE checker ( 
    i tinyint, 
    CONSTRAINT i_must_be_between_7_and_12 CHECK (i BETWEEN 7 AND 12 )
);

Since MySQL 8.0.16, the CHECK keyword does the job.
I recommend naming your constraints wisely.
The syntax is:

[CONSTRAINT [symbol]] CHECK (expr) [[NOT] ENFORCED]

From there, the following is rather obvious:

mysql-8.0.16> INSERT INTO checker (i) VALUES (11);
Query OK, 1 row affected (0.02 sec)

mysql-8.0.16> INSERT INTO checker (i) VALUES (12);
Query OK, 1 row affected (0.03 sec)


mysql-8.0.16> SELECT i FROM checker;
+------+
| i    |
+------+
|   11 |
|   12 |
+------+
2 rows in set (0.00 sec)

mysql-8.0.16> INSERT INTO checker (i) VALUES (13);
ERROR 3819 (HY000): Check constraint 'i_must_be_between_7_and_12' is violated.


mysql-8.0.16> SELECT i FROM checker;
+------+
| i    |
+------+
|   11 |
|   12 |
+------+
2 rows in set (0.00 sec)

Easy! 🙂

Example 2

You can check a combination of columns.

Check constraints hack

mysql-8.0.15> 
CREATE TABLE squares_hack (
     dx DOUBLE, 
     dy DOUBLE, 
     area_must_be_larger_than_10 BOOLEAN 
           GENERATED ALWAYS AS (IF(dx*dy>10.0, true, NULL)) NOT NULL
);

mysql-8.0.15> INSERT INTO squares_hack (dx,dy) VALUES (7,4);
Query OK, 1 row affected (0.02 sec)


mysql-8.0.15> INSERT INTO squares_hack (dx,dy) VALUES (2,4);
ERROR 1048 (23000): Column 'area_must_be_larger_than_10' cannot be null


mysql-8.0.15> SELECT dx, dy FROM squares_hack;
+------+------+
| dx   | dy   |
+------+------+
|    7 |    4 |
+------+------+
1 row in set (0.00 sec)

Check constraint since MySQL 8.0.16

mysql-8.0.16> 
CREATE TABLE squares (
     dx DOUBLE, 
     dy DOUBLE, 
     CONSTRAINT area_must_be_larger_than_10 CHECK ( dx * dy > 10.0 )
);


mysql-8.0.16> INSERT INTO squares (dx,dy) VALUES (7,4);
Query OK, 1 row affected (0.01 sec)


mysql-8.0.16> INSERT INTO squares (dx,dy) VALUES (2,4);
ERROR 3819 (HY000): Check constraint 'area_must_be_larger_than_10' is violated.


mysql-8.0.16> SELECT dx, dy FROM squares;
+------+------+
| dx   | dy   |
+------+------+
|    7 |    4 |
+------+------+
1 row in set (0.00 sec)

Still easy!

Example 3

You can also check text columns.

Check constraints hack

mysql-8.0.15> 
CREATE TABLE animal_hack (  
     name varchar(30) NOT NULL,  
     class varchar(100) DEFAULT NULL,  
     class_allow_Mammal_Reptile_Amphibian BOOLEAN 
           GENERATED ALWAYS AS (IF(class IN ("Mammal", "Reptile", "Amphibian"), true, NULL)) NOT NULL
);  

mysql-8.0.15> INSERT INTO animal_hack (name, class) VALUES ("Agalychnis callidryas",'Amphibian');  
Query OK, 1 row affected (0.02 sec)

mysql-8.0.15> INSERT INTO animal_hack (name, class) VALUES ("Orycteropus afer", 'Mammal');  
Query OK, 1 row affected (0.02 sec)

mysql-8.0.15> INSERT INTO animal_hack (name, class) VALUES ("Lacerta agilis", 'Reptile');  
Query OK, 1 row affected (0.02 sec)


mysql-8.0.15> SELECT name, class FROM animal_hack;
+-----------------------+-----------+
| name                  | class     |
+-----------------------+-----------+
| Agalychnis callidryas | Amphibian |
| Orycteropus afer      | Mammal    |
| Lacerta agilis        | Reptile   |
+-----------------------+-----------+
3 rows in set (0.00 sec)

mysql-8.0.15> INSERT INTO animal_hack (name, class) VALUES ("Palystes castaneus", 'Arachnid'); 
ERROR 1048 (23000): Column 'class_allow_Mammal_Reptile_Amphibian' cannot be null


mysql-8.0.15> SELECT name, class FROM animal_hack;
+-----------------------+-----------+
| name                  | class     |
+-----------------------+-----------+
| Agalychnis callidryas | Amphibian |
| Orycteropus afer      | Mammal    |
| Lacerta agilis        | Reptile   |
+-----------------------+-----------+
3 rows in set (0.00 sec)

Check constraint since MySQL 8.0.16

mysql-8.0.16> 
CREATE TABLE animal (  
     name varchar(30) NOT NULL,  
     class varchar(100) DEFAULT NULL,  
     CONSTRAINT CHECK (class IN ("Mammal", "Reptile", "Amphibian"))
);  

mysql-8.0.16> INSERT INTO animal (name, class) VALUES ("Agalychnis callidryas",'Amphibian');  
Query OK, 1 row affected (0.04 sec)

mysql-8.0.16> INSERT INTO animal (name, class) VALUES ("Orycteropus afer", 'Mammal');  
Query OK, 1 row affected (0.04 sec)

mysql-8.0.16> INSERT INTO animal (name, class) VALUES ("Lacerta agilis", 'Reptile');  
Query OK, 1 row affected (0.04 sec)


mysql-8.0.16> SELECT name, class FROM animal;
+-----------------------+-----------+
| name                  | class     |
+-----------------------+-----------+
| Agalychnis callidryas | Amphibian |
| Orycteropus afer      | Mammal    |
| Lacerta agilis        | Reptile   |
+-----------------------+-----------+
3 rows in set (0.00 sec)

mysql-8.0.16> INSERT INTO animal (name, class) VALUES ("Palystes castaneus", 'Arachnid');  
ERROR 3819 (HY000): Check constraint 'animal_chk_1' is violated.


mysql-8.0.16> SELECT name, class FROM animal;
+-----------------------+-----------+
| name                  | class     |
+-----------------------+-----------+
| Agalychnis callidryas | Amphibian |
| Orycteropus afer      | Mammal    |
| Lacerta agilis        | Reptile   |
+-----------------------+-----------+
3 rows in set (0.00 sec)

Frankly easy!

I did not mention that the hack works as well in 8.0.16, though not needed anymore.
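
One difference between the two approaches is worth keeping in mind: a CHECK constraint is violated only when its expression evaluates to FALSE, so a NULL value passes, whereas the hack turns both FALSE and NULL into NULL and rejects the row. A small Python model of this (None stands for NULL; the function names are mine, for illustration only):

```python
def between_7_and_12(i):
    """SQL-style three-valued BETWEEN: None (NULL) in, None (UNKNOWN) out."""
    if i is None:
        return None
    return 7 <= i <= 12


def native_check_accepts(i):
    # A CHECK constraint is violated only when the expression is FALSE.
    return between_7_and_12(i) is not False


def hack_accepts(i):
    # IF(cond, true, NULL) yields NULL for both FALSE and UNKNOWN,
    # and the NOT NULL generated column then rejects the row.
    return between_7_and_12(i) is True


for value in (11, 13, None):
    print(value, "native:", native_check_accepts(value), "hack:", hack_accepts(value))
```

If NULL must be rejected with a real CHECK constraint as well, declaring the column NOT NULL does the job.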

CHECK constraints are another useful feature implemented in MySQL (and not the last one, stay tuned!).
There are other interesting things to know about this feature, and about the others available in MySQL 8.0.16.
Please have a look at the references below.

References


Constant-Folding Optimization in MySQL 8.0

May 7, 2019

TL;DR

In MySQL 8.0.16 the optimizer has improved again!
Comparisons of columns of numeric types with constant values are checked and folded or removed for invalid or out-of-range values.
The goal is to speed up query execution.


The title of this article, Constant-Folding Optimization, is named after this kind of optimization and is quite cryptic. Nevertheless the principle is simple and, more importantly, there is nothing to do from the user's perspective.

What is “Constant-Folding Optimization” ?

From the MySQL Documentation :
Comparisons between constants and column values in which the constant value is out of range or of the wrong type with respect to the column type are now handled once during query optimization rather row-by-row than during execution.

From the MySQL Server Team Blog :
The goal is to speed up execution at the cost of a little more analysis at optimize time.
Always true and false comparisons are detected and eliminated.
In other cases, the type of the constant is adjusted to match that of the field if they are not the same, avoiding type conversion at execution time.

Clear enough?

One example is worth a thousand words, so let’s have a deeper look comparing the old behavior in MySQL 8.0.15 to the new one beginning with MySQL 8.0.16.

I’m using the optimized MySQL Server Docker images, created, maintained and supported by the MySQL team at Oracle.

Deployment of MySQL 8.0.15 & MySQL 8.0.16:

$ docker run --name=mysql_8.0.15 -e MYSQL_ROOT_PASSWORD=unsafe -d mysql/mysql-server:8.0.15
$ docker run --name=mysql_8.0.16 -e MYSQL_ROOT_PASSWORD=unsafe -d mysql/mysql-server:8.0.16

Note:

Obviously using a password on the command line interface can be insecure.

Please read the best practices of deploying MySQL on Linux with Docker.


Copy the test table dump file on 8.0.15 & 8.0.16:

$ docker cp ./testtbl.sql mysql_8.0.15:/tmp/testtbl.sql
$ docker cp ./testtbl.sql mysql_8.0.16:/tmp/testtbl.sql


Load the test table into 8.0.15 instance:

$ docker exec -it mysql_8.0.15 mysql -u root -p --prompt='mysql_8.0.15> '

Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 31
Server version: 8.0.15 MySQL Community Server - GPL

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql_8.0.15> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.15    |
+-----------+

mysql_8.0.15> CREATE SCHEMA test;
Query OK, 1 row affected (0.04 sec)

mysql_8.0.15> USE test
Database changed

mysql_8.0.15> source /tmp/testtbl.sql
... <snip> ...


Load the test table into 8.0.16 instance:

$ docker exec -it mysql_8.0.16 mysql -u root -p --prompt='mysql_8.0.16> '

Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 8.0.16 MySQL Community Server - GPL

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql_8.0.16> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.16    |
+-----------+

mysql_8.0.16> CREATE SCHEMA test;
Query OK, 1 row affected (0.04 sec)

mysql_8.0.16> USE test
Database changed

mysql_8.0.16> source /tmp/testtbl.sql
... <snip> ...



Let’s see what we have loaded:

mysql_8.0.16> SHOW CREATE TABLE testtbl\G
*************************** 1. row ***************************
       Table: testtbl
Create Table: CREATE TABLE `testtbl` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `val` varchar(36) NOT NULL,
  `val2` varchar(36) DEFAULT NULL,
  `val3` varchar(36) DEFAULT NULL,
  `val4` varchar(36) DEFAULT NULL,
  `num` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx2` (`val2`),
  KEY `idx3` (`val3`),
  KEY `idx4` (`val4`)
) ENGINE=InnoDB AUTO_INCREMENT=14220001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci


mysql_8.0.16> SELECT COUNT(*) FROM testtbl;
+----------+
| COUNT(*) |
+----------+
|  5000000 |
+----------+

What is important for us here is the non indexed column – num :

num int(10) unsigned DEFAULT NULL

It contains only positive numbers:

mysql_8.0.16> SELECT min(num), max(num) FROM testtbl;
+----------+----------+
| min(num) | max(num) |
+----------+----------+
|  9130001 | 14130000 |
+----------+----------+

The old behavior

What happens if I look for a negative number, let’s say -12345, in the column num?
Remember that it contains only positive numbers and there is no index.

mysql_8.0.15> EXPLAIN SELECT * FROM testtbl WHERE num=-12345\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtbl
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 4820634
     filtered: 10.00
        Extra: Using where

According to the EXPLAIN plan, we have a full table scan. In a way that makes sense because there is no index on num.
However we know that there is no negative value, so there is certainly some room for improvement 🙂

Running the query:

mysql_8.0.15> SELECT * FROM testtbl WHERE num=-12345;
Empty set (2.77 sec)

Indeed the full table scan could be costly.

The current behavior – 8.0.16+

The Constant-Folding Optimization improves the execution of this type of query.

The EXPLAIN plan for MySQL 8.0.16 is completely different:

mysql_8.0.16> EXPLAIN SELECT * FROM testtbl WHERE num=-12345\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Impossible WHERE

Did you notice the:

Extra: Impossible WHERE

The search for a negative value in a column that can only hold non-negative values was resolved at optimization time!
So there is obviously a positive impact on the query execution time:

mysql_8.0.16> SELECT * FROM testtbl WHERE num=-12345;
Empty set (0.00 sec)

Yay!



In addition to the = operator, this optimization currently applies to >, >=, <, <=, <>, != and <=> as well.
e.g.

mysql_8.0.16> EXPLAIN SELECT * FROM testtbl WHERE num > -42 AND num <= -1 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Impossible WHERE


mysql_8.0.16> SELECT * FROM testtbl WHERE num > -42 AND num <=  -1;
Empty set (0.00 sec)
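
The reasoning behind the folding can be modelled in a few lines of Python. This is a simplified sketch of the idea, not the optimizer's actual algorithm: the fold() function is hypothetical, it only looks at the value range of the column type (INT UNSIGNED by default) and ignores type adjustments:

```python
UINT32_MIN, UINT32_MAX = 0, 2**32 - 1  # value range of an INT UNSIGNED column


def fold(op, const, lo=UINT32_MIN, hi=UINT32_MAX, nullable=True):
    """Return 'FALSE', 'TRUE', 'IS NOT NULL', or None when nothing can be folded."""
    out_of_range = const < lo or const > hi
    # The comparison holds for every non-NULL value the column can store.
    always_true = {
        "<": hi < const, "<=": hi <= const,
        ">": lo > const, ">=": lo >= const,
        "<>": out_of_range, "!=": out_of_range,
    }
    # The comparison holds for no value the column can store.
    always_false = {
        "<": lo >= const, "<=": lo > const,
        ">": hi <= const, ">=": hi < const,
        "=": out_of_range, "<=>": out_of_range,
    }
    if always_false.get(op, False):
        return "FALSE"
    if always_true.get(op, False):
        # A NULL column value never satisfies the comparison.
        return "IS NOT NULL" if nullable else "TRUE"
    return None  # constant is in range: keep the comparison as it is


for op, const in (("=", -12345), (">", -42), ("<=", -1), ("=", 9130001)):
    print("num", op, const, "->", fold(op, const))
```

In the example above, num <= -1 can never be satisfied, so the whole AND condition is impossible, hence the Impossible WHERE in the EXPLAIN plan.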

Indexed column

As a side note, if your column is indexed the optimizer already has the relevant information, so even before 8.0.16 the query is fast without the Constant-Folding Optimization :).

mysql_8.0.15> CREATE INDEX idx_num ON testtbl(num);
Query OK, 0 rows affected (24.84 sec)
Records: 0  Duplicates: 0  Warnings: 0


mysql_8.0.15> EXPLAIN SELECT * FROM testtbl WHERE num = -12345\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: no matching row in const table
1 row in set, 1 warning (0.00 sec)


mysql_8.0.15> SELECT * FROM testtbl WHERE num = -12345;
Empty set (0.00 sec)


MySQL InnoDB Cluster – HowTo #1 – Monitor your cluster

April 11, 2019
Sakila HA by Olivier DASINI

How do I… Monitor the status & the configuration of my cluster?

Short answer

Use:

status()

Long answer…

I assume you already have a MySQL InnoDB Cluster up and running. If not, please RTFM 🙂
Additionally you can read this tutorial and this article from my colleague lefred or this one on Windows Platform from my colleague Ivan.

I’m using MySQL 8.0.15

MySQL localhost:33060+ JS> session.sql('SELECT VERSION()')
+-----------+
| VERSION() |
+-----------+
| 8.0.15    |
+-----------+

Let’s connect to my cluster

$ mysqlsh root@localhost --cluster

Please provide the password for 'root@localhost': ****
MySQL Shell 8.0.15

Copyright (c) 2016, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
Creating a session to 'root@localhost'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 1520 (X protocol)
Server version: 8.0.15 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
You are connected to a member of cluster 'pocCluster'.
Variable 'cluster' is set.
Use cluster.status() in scripting mode to get status of this cluster or cluster.help() for more commands.

The --cluster argument enables cluster management by setting the global variable cluster.
This variable is a reference to the MySQL InnoDB Cluster object. It gives you access to (among others) the status() method, which allows you to check and monitor the cluster.

Ask for help

The built-in help is simply awesome!

MySQL localhost:33060+ JS> cluster.help('status')
NAME
      status - Describe the status of the cluster.

SYNTAX
      <Cluster>.status([options])

WHERE
      options: Dictionary with options.

RETURNS
       A JSON object describing the status of the cluster.

DESCRIPTION
      This function describes the status of the cluster including its
      ReplicaSets and Instances. The following options may be given to control
      the amount of information gathered and returned.

      - extended: if true, includes information about transactions processed by
        connection and applier, as well as groupName and memberId values.
      - queryMembers: if true, connect to each Instance of the ReplicaSets to
        query for more detailed stats about the replication machinery.

EXCEPTIONS
      MetadataError in the following scenarios:

      - If the Metadata is inaccessible.
      - If the Metadata update operation failed.

Cluster status

So let’s discover the status of our cluster

MySQL localhost:33060+ JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.19.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.19.0.11:3306": {
                "address": "172.19.0.11:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.19.0.12:3306": {
                "address": "172.19.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.19.0.13:3306": {
                "address": "172.19.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "172.19.0.11:3306"
}

Note:
The instance’s state in the cluster directly influences the information provided in the status report. Therefore ensure the instance you are connected to has a status of ONLINE.

As you can see, by default status() gives you a lot of relevant information.
Thus it could be used to monitor your cluster although the best tool available to monitor your MySQL InnoDB Cluster (but also MySQL Replication, MySQL NDB Cluster and obviously your standalone MySQL servers) is MySQL Enterprise Monitor.

More details with “A Guide to MySQL Enterprise Monitor“.
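
For a quick home-made check, the status report can be reduced to a few facts. The sketch below works on a plain Python dict (for example the JSON output above parsed with json.loads); the health() helper is my own illustration, and it treats only the "OK" cluster status as fully healthy:

```python
# Trimmed copy of the cluster.status() output shown above.
status = {
    "clusterName": "pocCluster",
    "defaultReplicaSet": {
        "primary": "172.19.0.11:3306",
        "status": "OK",
        "topology": {
            "172.19.0.11:3306": {"mode": "R/W", "status": "ONLINE"},
            "172.19.0.12:3306": {"mode": "R/O", "status": "ONLINE"},
            "172.19.0.13:3306": {"mode": "R/O", "status": "ONLINE"},
        },
    },
}


def health(status):
    """Return (cluster_ok, primary, members_not_online) from a status report."""
    replica_set = status["defaultReplicaSet"]
    not_online = sorted(
        address for address, member in replica_set["topology"].items()
        if member["status"] != "ONLINE"
    )
    cluster_ok = replica_set["status"] == "OK" and not not_online
    return cluster_ok, replica_set["primary"], not_online


cluster_ok, primary, not_online = health(status)
print("cluster ok:", cluster_ok)
print("primary:", primary)
print("members not ONLINE:", not_online)
```

A monitoring job could run this periodically and raise an alert as soon as cluster_ok becomes false.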

Extended cluster status

MySQL Group Replication provides several metrics and detailed information about the underlying cluster in MySQL InnoDB clusters.
These metrics, which are used for monitoring, are based on these Performance Schema tables.

Some of this information is available through MySQL Shell. You can control the amount of information gathered and returned with 2 options: extended & queryMembers.

extended

If enabled, includes information about groupName and memberId for each member, and general statistics about the number of transactions checked, proposed, rejected by members…

MySQL localhost:33060+ JS> cluster.status({extended:true})
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "groupName": "72568575-561c-11e9-914c-0242ac13000b", 
        "name": "default", 
        "primary": "172.19.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.19.0.11:3306": {
                "address": "172.19.0.11:3306", 
                "memberId": "4a85f6c4-561c-11e9-8401-0242ac13000b", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "transactions": {
                    "appliedCount": 2, 
                    "checkedCount": 53, 
                    "committedAllMembers": "4a85f6c4-561c-11e9-8401-0242ac13000b:1-12,
72568575-561c-11e9-914c-0242ac13000b:1-51", 
                    "conflictsDetectedCount": 0, 
                    "inApplierQueueCount": 0, 
                    "inQueueCount": 0, 
                    "lastConflictFree": "72568575-561c-11e9-914c-0242ac13000b:56", 
                    "proposedCount": 53, 
                    "rollbackCount": 0
                }
            }, 
            "172.19.0.12:3306": {
                "address": "172.19.0.12:3306", 
                "memberId": "4ad75450-561c-11e9-baa8-0242ac13000c", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "transactions": {
                    "appliedCount": 44, 
                    "checkedCount": 43, 
                    "committedAllMembers": "4a85f6c4-561c-11e9-8401-0242ac13000b:1-12,
72568575-561c-11e9-914c-0242ac13000b:1-41", 
                    "conflictsDetectedCount": 0, 
                    "inApplierQueueCount": 0, 
                    "inQueueCount": 0, 
                    "lastConflictFree": "72568575-561c-11e9-914c-0242ac13000b:52", 
                    "proposedCount": 0, 
                    "rollbackCount": 0
                }
            }, 
            "172.19.0.13:3306": {
                "address": "172.19.0.13:3306", 
                "memberId": "4b77c1ec-561c-11e9-9cc1-0242ac13000d", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "transactions": {
                    "appliedCount": 42, 
                    "checkedCount": 42, 
                    "committedAllMembers": "4a85f6c4-561c-11e9-8401-0242ac13000b:1-12,
72568575-561c-11e9-914c-0242ac13000b:1-41", 
                    "conflictsDetectedCount": 0, 
                    "inApplierQueueCount": 0, 
                    "inQueueCount": 0, 
                    "lastConflictFree": "72568575-561c-11e9-914c-0242ac13000b:53", 
                    "proposedCount": 0, 
                    "rollbackCount": 0
                }
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "172.19.0.11:3306"
}
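
The per-member counters above can be condensed into one row per member once the status is available as a plain object. The following is only a minimal sketch in plain JavaScript: the function name summarizeTransactions and the trimmed sample object are assumptions made for illustration, and only the field names are taken from the output above.

```javascript
// Hypothetical helper: condense the "transactions" counters of each member
// from an object shaped like the cluster.status({extended:true}) output.
function summarizeTransactions(status) {
  const topology = status.defaultReplicaSet.topology;
  return Object.keys(topology).map((address) => {
    const member = topology[address];
    const tx = member.transactions || {};
    return {
      address: address,
      mode: member.mode,
      proposed: tx.proposedCount,
      checked: tx.checkedCount,
      conflicts: tx.conflictsDetectedCount,
      inApplierQueue: tx.inApplierQueueCount,
    };
  });
}

// Trimmed sample that reuses a few values from the output above.
const sampleExtendedStatus = {
  defaultReplicaSet: {
    topology: {
      "172.19.0.11:3306": {
        mode: "R/W",
        transactions: {
          proposedCount: 53,
          checkedCount: 53,
          conflictsDetectedCount: 0,
          inApplierQueueCount: 0,
        },
      },
      "172.19.0.12:3306": {
        mode: "R/O",
        transactions: {
          proposedCount: 0,
          checkedCount: 43,
          conflictsDetectedCount: 0,
          inApplierQueueCount: 0,
        },
      },
    },
  },
};

console.log(summarizeTransactions(sampleExtendedStatus));
```

A growing conflicts or inApplierQueue value on one member is the kind of signal such a summary would make easier to spot.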

queryMembers

If enabled, the output includes information about recovery and regular transaction I/O, applier worker thread statistics and any lag, applier coordinator statistics…

MySQL localhost:33060+ JS> cluster.status({queryMembers:true})
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.19.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.19.0.11:3306": {
                "address": "172.19.0.11:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "transactions": {
                    "connection": {
                        "lastHeartbeatTimestamp": "", 
                        "lastQueued": {
                            "endTimestamp": "2019-04-03 14:26:33.394755", 
                            "immediateCommitTimestamp": "", 
                            "immediateCommitToEndTime": null, 
                            "originalCommitTimestamp": "", 
                            "originalCommitToEndTime": null, 
                            "queueTime": 0.000077, 
                            "startTimestamp": "2019-04-03 14:26:33.394678", 
                            "transaction": "72568575-561c-11e9-914c-0242ac13000b:13"
                        }, 
                        "receivedHeartbeats": 0, 
                        "receivedTransactionSet": "4a85f6c4-561c-11e9-8401-0242ac13000b:1-12,
72568575-561c-11e9-914c-0242ac13000b:1-65", 
                        "threadId": null
                    }, 
                    "workers": [
                        {
                            "lastApplied": {
                                "applyTime": 0.022927, 
                                "endTimestamp": "2019-04-03 14:26:33.417643", 
                                "immediateCommitTimestamp": "", 
                                "immediateCommitToEndTime": null, 
                                "originalCommitTimestamp": "", 
                                "originalCommitToEndTime": null, 
                                "retries": 0, 
                                "startTimestamp": "2019-04-03 14:26:33.394716", 
                                "transaction": "72568575-561c-11e9-914c-0242ac13000b:13"
                            }, 
                            "threadId": 58
                        }
                    ]
                }
            }, 
            "172.19.0.12:3306": {
                "address": "172.19.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "transactions": {
                    "connection": {
                        "lastHeartbeatTimestamp": "", 
                        "lastQueued": {
                            "endTimestamp": "2019-04-03 15:42:30.855989", 
                            "immediateCommitTimestamp": "", 
                            "immediateCommitToEndTime": null, 
                            "originalCommitTimestamp": "2019-04-03 15:42:30.854594", 
                            "originalCommitToEndTime": 0.001395, 
                            "queueTime": 0.000476, 
                            "startTimestamp": "2019-04-03 15:42:30.855513", 
                            "transaction": "72568575-561c-11e9-914c-0242ac13000b:65"
                        }, 
                        "receivedHeartbeats": 0, 
                        "receivedTransactionSet": "4a85f6c4-561c-11e9-8401-0242ac13000b:1-12,
72568575-561c-11e9-914c-0242ac13000b:1-65", 
                        "threadId": null
                    }, 
                    "workers": [
                        {
                            "lastApplied": {
                                "applyTime": 0.024685, 
                                "endTimestamp": "2019-04-03 15:42:30.880361", 
                                "immediateCommitTimestamp": "", 
                                "immediateCommitToEndTime": null, 
                                "originalCommitTimestamp": "2019-04-03 15:42:30.854594", 
                                "originalCommitToEndTime": 0.025767, 
                                "retries": 0, 
                                "startTimestamp": "2019-04-03 15:42:30.855676", 
                                "transaction": "72568575-561c-11e9-914c-0242ac13000b:65"
                            }, 
                            "threadId": 54
                        }
                    ]
                }
            }, 
            "172.19.0.13:3306": {
                "address": "172.19.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "transactions": {
                    "connection": {
                        "lastHeartbeatTimestamp": "", 
                        "lastQueued": {
                            "endTimestamp": "2019-04-03 15:42:30.855678", 
                            "immediateCommitTimestamp": "", 
                            "immediateCommitToEndTime": null, 
                            "originalCommitTimestamp": "2019-04-03 15:42:30.854594", 
                            "originalCommitToEndTime": 0.001084, 
                            "queueTime": 0.000171, 
                            "startTimestamp": "2019-04-03 15:42:30.855507", 
                            "transaction": "72568575-561c-11e9-914c-0242ac13000b:65"
                        }, 
                        "receivedHeartbeats": 0, 
                        "receivedTransactionSet": "4a85f6c4-561c-11e9-8401-0242ac13000b:1-12,
72568575-561c-11e9-914c-0242ac13000b:1-65", 
                        "threadId": null
                    }, 
                    "workers": [
                        {
                            "lastApplied": {
                                "applyTime": 0.021354, 
                                "endTimestamp": "2019-04-03 15:42:30.877398", 
                                "immediateCommitTimestamp": "", 
                                "immediateCommitToEndTime": null, 
                                "originalCommitTimestamp": "2019-04-03 15:42:30.854594", 
                                "originalCommitToEndTime": 0.022804, 
                                "retries": 0, 
                                "startTimestamp": "2019-04-03 15:42:30.856044", 
                                "transaction": "72568575-561c-11e9-914c-0242ac13000b:65"
                            }, 
                            "threadId": 54
                        }
                    ]
                }
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "172.19.0.11:3306"
}
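
To compare appliers across members, the workers section can be flattened into a short list. This is a hedged sketch in plain JavaScript, assuming the status is available as a plain object with the shape shown above; the function name lastAppliedByMember is hypothetical.

```javascript
// Hypothetical helper: for each member, report the last transaction applied
// by each worker and its applyTime, based on the queryMembers output shape.
function lastAppliedByMember(status) {
  const topology = status.defaultReplicaSet.topology;
  const rows = [];
  for (const address of Object.keys(topology)) {
    const workers = (topology[address].transactions || {}).workers || [];
    for (const worker of workers) {
      const applied = worker.lastApplied || {};
      rows.push({
        address: address,
        threadId: worker.threadId,
        transaction: applied.transaction,
        applyTime: applied.applyTime,
        // null when no original commit timestamp was reported
        commitToEndTime: applied.originalCommitToEndTime,
      });
    }
  }
  // Largest applyTime first, so the slowest applier is at the top.
  return rows.sort((a, b) => (b.applyTime || 0) - (a.applyTime || 0));
}

// Trimmed sample that reuses values from the output above.
const sampleQueryMembersStatus = {
  defaultReplicaSet: {
    topology: {
      "172.19.0.11:3306": {
        transactions: {
          workers: [
            {
              threadId: 58,
              lastApplied: {
                applyTime: 0.022927,
                originalCommitToEndTime: null,
                transaction: "72568575-561c-11e9-914c-0242ac13000b:13",
              },
            },
          ],
        },
      },
      "172.19.0.12:3306": {
        transactions: {
          workers: [
            {
              threadId: 54,
              lastApplied: {
                applyTime: 0.024685,
                originalCommitToEndTime: 0.025767,
                transaction: "72568575-561c-11e9-914c-0242ac13000b:65",
              },
            },
          ],
        },
      },
    },
  },
};

console.log(lastAppliedByMember(sampleQueryMembersStatus));
```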

To summarize

Q: How do I monitor the status & the configuration of my cluster?

A: Use status() or status({extended:true}) or status({queryMembers:true})


MySQL JSON Document Store

April 2, 2019

Introduction

MySQL is the most popular Open Source database!
It is an ACID (Atomicity, Consistency, Isolation, and Durability) compliant relational database that lets you, among other things, manage your data with powerful and proven SQL and protect your data integrity with transactions, foreign keys, …
But you already know that 🙂

JavaScript Object Notation, better known as JSON, is a lightweight and very popular data-interchange format, used for storing and exchanging data.
A JSON document is a standardized object that can represent structured data, and the structure is implicit in the document.
Anyway, I bet you know that too!

Starting with MySQL 5.7.8, you can handle JSON documents in a “relational way”, using SQL queries and storing them with the MySQL native JSON data type.
We also provide a large set of JSON functions.
I hope you were aware of that!

You should be interested in:

Note:

I recommend taking a closer look at the JSON_TABLE function, which extracts data from a JSON document and returns it as a relational table… It’s just amazing!

However, MySQL 8.0 provides another way to handle JSON documents: a “Not only SQL” (NoSQL) approach…
In other words, if you need or want to manage JSON documents (collections) in a non-relational manner, with CRUD (Create/Read/Update/Delete) operations, you can use MySQL 8.0!
Did you know that?

MySQL Document Store Architecture

Let’s have a quick overview of the architecture.

  • X Plugin – The X Plugin enables MySQL to use the X Protocol and uses Connectors and the Shell to act as clients to the server.
  • X Protocol – The X Protocol is a new client protocol built on top of the Protobuf library, and works for both CRUD and SQL operations.
  • X DevAPI – The X DevAPI is a new, modern, async developer API for CRUD and SQL operations on top of X Protocol. It introduces Collections as new Schema objects. Documents are stored in Collections and have their dedicated CRUD operation set.
  • MySQL Shell – The MySQL Shell is an interactive JavaScript, Python, or SQL interface supporting development and administration for the MySQL Server. You can use the MySQL Shell to perform data queries and updates as well as various administration operations.
  • MySQL Connectors – Connectors that support the X Protocol and enable you to use X DevAPI in your chosen language (Node.js, PHP, Python, Java, .NET, C++, …).

Write applications using X DevAPI

As a disclaimer, I am not a developer, so sorry, no fancy code in this blog post.
However, the good news is that I can show you where to find the best MySQL developer resources ever 🙂 that is:

https://insidemysql.com/

To start, I recommend focusing on the following articles:

And of course the newest articles as well.
Furthermore, another resource that would be useful to you is the X DevAPI User Guide.

Use Document Store with MySQL Shell

Whether you are a DBA, in ops, or of course a developer, the simplest way to use (or test) MySQL Document Store is with MySQL Shell.

MySQL Shell is an integrated development & administration shell where all MySQL products will be available through a common scripting interface.
If you don’t know it yet, please download it.
Trust me you are going to love it !

MySQL Shell key features are:

  • Scripting for JavaScript, Python, and SQL mode
  • Supports MySQL Standard and X Protocols
  • Document and Relational Models
  • CRUD Document and Relational APIs via scripting
  • Traditional Table, JSON, Tab Separated output results formats
  • Both Interactive and Batch operations

Note:

MySQL Shell is also a key component of MySQL InnoDB Cluster. In this context, it allows you to deploy and manage a MySQL Group Replication cluster.

See my MySQL InnoDB Cluster tutorial.

First steps with MySQL Shell

Let’s connect to the MySQL Server with MySQL Shell (mysqlsh)

$ mysqlsh root@myHost
MySQL Shell 8.0.15
... snip ...
Your MySQL connection id is 15 (X protocol)
Server version: 8.0.15 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.

We must be inside an X session in order to use MySQL as a document store. Luckily there is no extra step, because it’s the default in MySQL 8.0. Note that the default “X” port is 33060.
You can check that you are inside an X session, and thus using the X Protocol:

MySQL myHost:33060+ JS> session
<Session:root@myHost:33060>

MySQL myHost:33060+ JS> \status
...snip...

Session type:                 X

Default schema:               
Current schema:               
Server version:               8.0.15 MySQL Community Server - GPL
Protocol version:             X protocol
...snip...

If you are connected inside a classic session, you’ll get the following output (note “<ClassicSession” instead of “<Session”):

MySQL myHost:3306 JS> session
<ClassicSession:root@myHost:3306>

You can find your X Protocol port by checking the mysqlx_port variable.
I’ll switch to the MySQL Shell SQL mode to execute my SQL command:

MySQL myHost:3306 JS> \sql
Switching to SQL mode... Commands end with ;

MySQL myHost:3306 SQL> SHOW VARIABLES LIKE 'mysqlx_port';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| mysqlx_port   | 33060 |
+---------------+-------+

Then reconnect to the server using the right port (33060 by default) and you should be fine :

MySQL myHost:3306 SQL> \connect root@myHost:33060 
...snip...
Your MySQL connection id is 66 (X protocol)
Server version: 8.0.15 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.

MySQL myHost:33060+ SQL> \js
Switching to JavaScript mode...

MySQL myHost:33060+ JS> session
<Session:root@myHost:33060>

CRUD

We are going to create a schema (demo) where we will do our tests

MySQL myHost:33060+ JS> session.createSchema('demo')
<Schema:demo>


MySQL myHost:33060+ JS> \use demo
Default schema `demo` accessible through db.

Note:

The MySQL Shell default language is JavaScript. However, all the steps described in this article can also be done in Python.

e.g.

JS> session.createSchema('demo')

Py> session.create_schema('demo')

Create documents

Create a collection (my_coll1) inside the schema demo and insert documents:

MySQL myHost:33060+ demo JS> db.createCollection('my_coll1');
<Collection:my_coll1>


MySQL myHost:33060+ demo JS> db.my_coll1.add({"title":"MySQL Document Store", "abstract":"SQL is now optional!", "code": "42"})
Query OK, 1 item affected (0.0358 sec)

Trying to add an invalid JSON document raises an error:

MySQL myHost:33060+ demo JS> db.my_coll1.add("This is not a valid JSON document")
CollectionAdd.add: Argument #1 expected to be a document, JSON expression or a list of documents (ArgumentError)

List collections

To get the list of collections belonging to the current schema use getCollections() :

MySQL myHost:33060+ demo JS> db.getCollections()
[
    <Collection:my_coll1>
]

Find documents

Display the content of a collection with find() :

MySQL myHost:33060+ demo JS> db.my_coll1.find()
[
    {
        "_id": "00005c9514e60000000000000053",
        "code": "42",
        "title": "MySQL Document Store",
        "abstract": "SQL is now optional!"
    }
]
1 document in set (0.0029 sec)

Note:

Each document requires an identifier field called _id. The value of the _id field must be unique among all documents in the same collection.
MySQL server sets an _id value if the document does not contain the _id field.

Please read: Understanding Document IDs.

You can execute many operations on your document. One practical way to get the list of available functions is to press the <TAB> key, to ask for auto-completion, after the dot “.”
For example, type db.my_coll1. then press <TAB> twice, and you’ll get the following result:

MySQL myHost:33060+ demo JS> db.my_coll1.
add()               count()             dropIndex()         find()              getOne()            getSession()        modify()            remove()            replaceOne()        session
addOrReplaceOne()   createIndex()       existsInDatabase()  getName()           getSchema()         help()              name                removeOne()         schema

You can also use the awesome MySQL Shell built-in help (I strongly recommend my colleague Jesper’s article), and please bookmark his blog.
Last but not least, our documentation: X DevAPI User Guide, MySQL Shell JavaScript API Reference & MySQL Shell Python API Reference.

Modify documents

You’ll need the modify() function :

MySQL myHost:33060+ demo JS> db.my_coll1.find("_id='00005c9514e60000000000000053'").fields("code")
[
    {
        "code": "42"
    }
]


MySQL myHost:33060+ demo JS> db.my_coll1.modify("_id='00005c9514e60000000000000053'").set("code","2019")
Query OK, 1 item affected (0.0336 sec)
Rows matched: 1  Changed: 1  Warnings: 0


MySQL myHost:33060+ demo JS> db.my_coll1.find("_id='00005c9514e60000000000000053'").fields("code")
[
    {
        "code": "2019"
    }
]

Remove content from documents

You can also modify the structure of a document by removing a key and its content with modify() and unset().

MySQL myHost:33060+ demo JS> db.my_coll1.add({"title":"Quote", "message": "Strive for greatness"})
Query OK, 1 item affected (0.0248 sec)

MySQL myHost:33060+ demo JS> db.my_coll1.find()
[
    {
        "_id": "00005c9514e60000000000000053",
        "code": "42",
        "title": "MySQL Document Store",
        "abstract": "SQL is now optional!"
    },
    {
        "_id": "00005c9514e60000000000000054",
        "title": "Quote",
        "message": "Strive for greatness"
    }
]
2 documents in set (0.0033 sec)

MySQL myHost:33060+ demo JS> db.my_coll1.modify("_id='00005c9514e60000000000000054'").unset("title")
Query OK, 1 item affected (0.0203 sec)

Rows matched: 1  Changed: 1  Warnings: 0

MySQL myHost:33060+ demo JS> db.my_coll1.find("_id='00005c9514e60000000000000054'")
[
    {
        "_id": "00005c9514e60000000000000054",
        "message": "Strive for greatness"
    }
]

Remove documents

We are missing one last important operation, delete documents with remove()

MySQL myHost:33060+ demo JS> db.my_coll1.remove("_id='00005c9514e60000000000000054'")
Query OK, 1 item affected (0.0625 sec)


MySQL myHost:33060+ demo JS> db.my_coll1.find("_id='00005c9514e60000000000000054'")
Empty set (0.0003 sec)

You can also remove all documents in a collection with one command. To do so, call remove("true"), that is, with a search condition that matches every document.
Obviously it is usually not a good practice…

Import JSON documents

Let’s work with a bigger JSON collection.
MySQL Shell provides a very convenient tool, named util.importJson(), to easily import JSON documents into your MySQL Server, either as a collection or as a table.

MySQL myHost:33060+ demo JS> db.getCollections()
[
    <Collection:my_coll1>
]


MySQL myHost:33060+ demo JS> util.importJson('GoT_episodes.json')
Importing from file "GoT_episodes.json" to collection `demo`.`GoT_episodes` in MySQL Server at myHost:33060

.. 73.. 73
Processed 47.74 KB in 73 documents in 0.1051 sec (694.75 documents/s)
Total successfully imported documents 73 (694.75 documents/s)


MySQL myHost:33060+ demo JS> db.getCollections()
[
    <Collection:GoT_episodes>, 
    <Collection:my_coll1>
]

You can find the JSON file source here.
Note that I had to do an extra step before importing the data:
sed 's/}}},{"id"/}}} {"id"/g' got_episodes.json.BAK > got_episodes.json
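
The sed command above separates consecutive documents with whitespace instead of a comma. The same idea can be sketched in plain JavaScript. This is only an illustration: it assumes (the post does not say so) that the input text is one valid JSON array of documents, and the function name arrayToDocumentLines is hypothetical.

```javascript
// Hypothetical helper: turn the text of a JSON array of documents into
// one document per line, with no commas between documents.
function arrayToDocumentLines(jsonArrayText) {
  const docs = JSON.parse(jsonArrayText);
  if (!Array.isArray(docs)) {
    throw new TypeError("expected a JSON array of documents");
  }
  return docs.map((doc) => JSON.stringify(doc)).join("\n");
}

// Two made-up documents standing in for the real episode data.
const sampleArrayText = '[{"id":1,"title":"first"},{"id":2,"title":"second"}]';
console.log(arrayToDocumentLines(sampleArrayText));
```

Unlike the sed pattern, this approach does not depend on how each document happens to end, at the cost of parsing the whole file in memory.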

By the way you can import data from MongoDB to MySQL \o/

No more excuses to finally get rid of MongoDB 😉

Let’s do some queries…

Display 1 document

MySQL myHost:33060+ demo JS> db.GoT_episodes.find().limit(1)
[
    {
        "id": 4952,
        "_id": "00005c9514e6000000000000009e",
        "url": "http://www.tvmaze.com/episodes/4952/game-of-thrones-1x01-winter-is-coming",
        "name": "Winter is Coming",
        "image": {
            "medium": "http://static.tvmaze.com/uploads/images/medium_landscape/1/2668.jpg",
            "original": "http://static.tvmaze.com/uploads/images/original_untouched/1/2668.jpg"
        },
        "_links": {
            "self": {
                "href": "http://api.tvmaze.com/episodes/4952"
            }
        },
        "number": 1,
        "season": 1,
        "airdate": "2011-04-17",
        "airtime": "21:00",
        "runtime": 60,
        "summary": "<p>Lord Eddard Stark, ruler of the North, is summoned to court by his old friend, King Robert Baratheon, to serve as the King's Hand. Eddard reluctantly agrees after learning of a possible threat to the King's life. Eddard's bastard son Jon Snow must make a painful decision about his own future, while in the distant east Viserys Targaryen plots to reclaim his father's throne, usurped by Robert, by selling his sister in marriage.</p>",
        "airstamp": "2011-04-18T01:00:00+00:00"
    }
]

Looks like data relative to a famous TV show 🙂

All episodes from season 1

MySQL myHost:33060+ demo JS> db.GoT_episodes.find("season=1").fields("name", "summary", "airdate").sort("number")
[
    {
        "name": "Winter is Coming",
        "airdate": "2011-04-17",
        "summary": "<p>Lord Eddard Stark, ruler of the North, is summoned to court by his old friend, King Robert Baratheon, to serve as the King's Hand. Eddard reluctantly agrees after learning of a possible threat to the King's life. Eddard's bastard son Jon Snow must make a painful decision about his own future, while in the distant east Viserys Targaryen plots to reclaim his father's throne, usurped by Robert, by selling his sister in marriage.</p>"
    },
    {
        "name": "The Kingsroad",
        "airdate": "2011-04-24",
        "summary": "<p>An incident on the Kingsroad threatens Eddard and Robert's friendship. Jon and Tyrion travel to the Wall, where they discover that the reality of the Night's Watch may not match the heroic image of it.</p>"
    },
    {
        "name": "Lord Snow",
        "airdate": "2011-05-01",
        "summary": "<p>Jon Snow attempts to find his place amongst the Night's Watch. Eddard and his daughters arrive at King's Landing.</p>"
    },
    {
        "name": "Cripples, Bastards, and Broken Things",
        "airdate": "2011-05-08",
        "summary": "<p>Tyrion stops at Winterfell on his way home and gets a frosty reception from Robb Stark. Eddard's investigation into the death of his predecessor gets underway.</p>"
    },
    {
        "name": "The Wolf and the Lion",
        "airdate": "2011-05-15",
        "summary": "<p>Catelyn's actions on the road have repercussions for Eddard. Tyrion enjoys the dubious hospitality of the Eyrie.</p>"
    },
    {
        "name": "A Golden Crown",
        "airdate": "2011-05-22",
        "summary": "<p>Viserys is increasingly frustrated by the lack of progress towards gaining his crown.</p>"
    },
    {
        "name": "You Win or You Die",
        "airdate": "2011-05-29",
        "summary": "<p>Eddard's investigations in King's Landing reach a climax and a dark secret is revealed.</p>"
    },
    {
        "name": "The Pointy End",
        "airdate": "2011-06-05",
        "summary": "<p>Tyrion joins his father's army with unexpected allies. Events in King's Landing take a turn for the worse as Arya's lessons are put to the test.</p>"
    },
    {
        "name": "Baelor",
        "airdate": "2011-06-12",
        "summary": "<p>Catelyn must negotiate with the irascible Lord Walder Frey.</p>"
    },
    {
        "name": "Fire and Blood",
        "airdate": "2011-06-19",
        "summary": "<p>Daenerys must realize her destiny. Jaime finds himself in an unfamiliar predicament.</p>"
    }
]

First episode of each season

MySQL myHost:33060+ demo JS> db.GoT_episodes.find("number=1").fields("name", "airdate", "season").sort("season")
[
    {
        "name": "Winter is Coming",
        "season": 1,
        "airdate": "2011-04-17"
    },
    {
        "name": "The North Remembers",
        "season": 2,
        "airdate": "2012-04-01"
    },
    {
        "name": "Valar Dohaeris",
        "season": 3,
        "airdate": "2013-03-31"
    },
    {
        "name": "Two Swords",
        "season": 4,
        "airdate": "2014-04-06"
    },
    {
        "name": "The Wars to Come",
        "season": 5,
        "airdate": "2015-04-12"
    },
    {
        "name": "The Red Woman",
        "season": 6,
        "airdate": "2016-04-24"
    },
    {
        "name": "Dragonstone",
        "season": 7,
        "airdate": "2017-07-16"
    },
    {
        "name": "TBA",
        "season": 8,
        "airdate": "2019-04-14"
    }
]
8 documents in set (0.0047 sec)

CRUD Prepared Statements

A common pattern with document store datastores is to repeatedly execute the same (or similar) kind of simple queries (e.g. “id” based lookup).
These queries can be accelerated using prepared (CRUD) statements.

For example, suppose your application often uses the following query:

MySQL myHost:33060+ demo JS> db.GoT_episodes.find("number=1 AND season=1").fields("name", "airdate")
[
    {
        "name": "Winter is Coming",
        "airdate": "2011-04-17"
    }
]

In that case, it’s probably a good idea to use prepared statements.
First we need to prepare the query:

// Prepare a statement using a named parameter
var gotEpisode = db.GoT_episodes.find("number = :episodeNum AND season = :seasonNum").fields("name", "airdate")

Then bind values to the parameters:

MySQL myHost:33060+ demo JS> gotEpisode.bind('episodeNum', 1).bind('seasonNum', 1)
[
    {
        "name": "Winter is Coming",
        "airdate": "2011-04-17"
    }
]
MySQL myHost:33060+ demo JS> gotEpisode.bind('episodeNum', 7).bind('seasonNum', 3)
[
    {
        "name": "The Bear and the Maiden Fair",
        "airdate": "2013-05-12"
    }
]

Simply powerful!

Index

Adding relevant indexes is a common practice to improve performance. MySQL Document Store allows you to index keys inside the JSON document.

Add a composite index on the keys season and number.

MySQL myHost:33060+ demo JS> db.GoT_episodes.createIndex('idxSeasonEpisode', {fields: [{field: "$.season", type: "TINYINT UNSIGNED", required: true}, {field: "$.number", type: "TINYINT UNSIGNED", required: true}]})
Query OK, 0 rows affected (0.1245 sec)

The required: true option means that every document must contain at least the keys number and season.
E.g.

MySQL myHost:33060+ demo JS> db.GoT_episodes.add({"name": "MySQL 8 is Great"})
ERROR: 5115: Document is missing a required field


MySQL myHost:33060+ demo JS> db.GoT_episodes.add({"name": "MySQL 8 is Great", "number": 8})
ERROR: 5115: Document is missing a required field


MySQL myHost:33060+ demo JS> db.GoT_episodes.add({"name": "MySQL 8 is Great", "season": 8})
ERROR: 5115: Document is missing a required field

Add an index on key summary (30 first characters)

MySQL myHost:33060+ demo JS> db.GoT_episodes.createIndex('idxSummary', {fields: [{field: "$.summary", type: "TEXT(30)"}]})
Query OK, 0 rows affected (0.1020 sec)

Add a unique index on the key id (not the _id generated by MySQL, which is already indexed as the primary key):

MySQL myHost:33060+ demo JS> db.GoT_episodes.createIndex('idxId', {fields: [{field: "$.id", type: "INT UNSIGNED"}], unique: true})
Query OK, 0 rows affected (0.3379 sec)

The unique: true option means that values of key id must be unique for each document inside the collection. i.e. no duplicate values.
E.g.

MySQL myHost:33060+ demo JS> db.GoT_episodes.add({"id":4952, "number": 42, "season": 42 })
ERROR: 5116: Document contains a field value that is not unique but required to be

You can obviously drop an index, using dropIndex().
E.g. db.GoT_episodes.dropIndex("idxSummary")
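
The index definitions passed to createIndex() above all share the same shape: a fields array, plus an optional unique flag. As a hedged sketch, that object can be built from a compact description of the keys. The helper name buildIndexDefinition and its input format are assumptions made for illustration; only the resulting structure mirrors the calls shown in this section.

```javascript
// Hypothetical helper: build the second argument of createIndex()
// from a compact list of keys, following the shape used in this section.
function buildIndexDefinition(keys, options) {
  const definition = {
    fields: keys.map((key) => {
      const field = { field: "$." + key.name, type: key.type };
      if (key.required) {
        field.required = true;
      }
      return field;
    }),
  };
  if (options && options.unique) {
    definition.unique = true;
  }
  return definition;
}

// Same definition as the composite index on season and number.
const seasonEpisodeIndex = buildIndexDefinition([
  { name: "season", type: "TINYINT UNSIGNED", required: true },
  { name: "number", type: "TINYINT UNSIGNED", required: true },
]);
console.log(JSON.stringify(seasonEpisodeIndex, null, 2));
```

In MySQL Shell, the resulting object could then be passed as the second argument, as in db.GoT_episodes.createIndex('idxSeasonEpisode', seasonEpisodeIndex).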

Transactions

MySQL Document Store is fully ACID compliant; it relies on InnoDB’s proven strength & robustness.

Yes, you got it right: we do care about your data!

You need the following session functions: startTransaction(), commit() and rollback().

Let’s see an example of a multi-collection transaction that will be rolled back.

// Start the transaction
session.startTransaction()

MySQL myHost:33060+ demo JS> db.my_coll1.find()
[
    {
        "_id": "00005c9514e60000000000000053",
        "code": "42",
        "title": "MySQL Document Store",
        "abstract": "SQL is now optional!"
    }
]
1 document in set (0.0033 sec)


// Modify a document in collection my_coll1
MySQL myHost:33060+ demo JS> db.my_coll1.modify("_id = '00005c9514e60000000000000053'").unset("code")
Query OK, 1 item affected (0.0043 sec)
Rows matched: 1  Changed: 1  Warnings: 0


//Collection 1 : my_coll1
// Add a new document in my_coll1
MySQL myHost:33060+ demo JS> db.my_coll1.add({"title":"Quote", "message": "Be happy, be bright, be you"})
Query OK, 1 item affected (0.0057 sec)


MySQL myHost:33060+ demo JS> db.my_coll1.find()
[
    {
        "_id": "00005c9514e60000000000000053",
        "title": "MySQL Document Store",
        "abstract": "SQL is now optional!"
    },
    {
        "_id": "00005c9514e600000000000000e7",
        "title": "Quote",
        "message": "Be happy, be bright, be you"
    }
]
2 documents in set (0.0030 sec)



// Collection 2 : GoT_episodes
// Number of documents in GoT_episodes
MySQL myHost:33060+ demo JS> db.GoT_episodes.count()
73


// Remove all the 73 documents from GoT_episodes
MySQL myHost:33060+ demo JS> db.GoT_episodes.remove("true")
Query OK, 73 items affected (0.2075 sec)


// Empty collection
MySQL myHost:33060+ demo JS> db.GoT_episodes.count()
0



// Finally want my previous status back
// Rollback the transaction (if necessary e.g. in case of an error)
MySQL myHost:33060+ demo JS> session.rollback() 
Query OK, 0 rows affected (0.0174 sec)

Tadam!!!
We are back in the past 🙂

MySQL myHost:33060+ demo JS> db.my_coll1.find()
[
    {
        "_id": "00005c9514e60000000000000053",
        "code": "42",
        "title": "MySQL Document Store",
        "abstract": "SQL is now optional!"
    }
]
1 document in set (0.0028 sec)


MySQL myHost:33060+ demo JS> db.GoT_episodes.count()
73
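
In a script, the same start / commit-or-rollback sequence is usually wrapped so that a failure always triggers the rollback. Below is a minimal sketch, not a definitive implementation: runInTransaction and makeFakeSession are hypothetical names, and the fake session only records calls so that the example runs without a server. The method names startTransaction(), commit() and rollback() are those of the session object.

```javascript
// Hypothetical wrapper: run `work` inside a transaction, commit on success,
// roll back if it throws. `session` must expose startTransaction(),
// commit() and rollback().
function runInTransaction(session, work) {
  session.startTransaction();
  try {
    const result = work();
    session.commit();
    return result;
  } catch (err) {
    session.rollback();
    throw err;
  }
}

// Stand-in session that only records calls (an assumption for this sketch).
function makeFakeSession() {
  const calls = [];
  return {
    calls: calls,
    startTransaction: () => calls.push("startTransaction"),
    commit: () => calls.push("commit"),
    rollback: () => calls.push("rollback"),
  };
}

const okSession = makeFakeSession();
runInTransaction(okSession, () => "done");
console.log(okSession.calls.join(" -> "));

const failingSession = makeFakeSession();
try {
  runInTransaction(failingSession, () => {
    throw new Error("simulated failure");
  });
} catch (err) {
  console.log(failingSession.calls.join(" -> "));
}
```

With a real session, the work function would contain the collection calls shown above, such as modify(), add() or remove().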

Execute (complex) SQL queries

NoSQL + SQL = MySQL

From the MySQL server point of view, collections are tables as well, like regular tables.
And this is very powerful !!!

Powerful because it allows you, within the same datastore (MySQL), to run CRUD queries and SQL queries on the same dataset.
Powerful because it allows you to have your OLTP CRUD workload and your analytics SQL workload in the same place.
So no need to transfer/sync/… data from one datastore to another anymore!!!

You can run SQL queries using the sql() function:

MySQL myHost:33060+ demo JS> session.sql("SELECT count(*) FROM GoT_episodes")
+----------+
| count(*) |
+----------+
|       73 |
+----------+

You can also do SQL queries just as you have done until now, using the rich set of MySQL JSON functions.
OK let’s have a closer look.

Remember this CRUD query?

MySQL myHost:33060+ demo JS> db.GoT_episodes.find("number=1 AND season=1").fields("name", "airdate")
[
    {
        "name": "Winter is Coming",
        "airdate": "2011-04-17"
    }
]

Its SQL alter ego is:

MySQL myHost:33060+ demo JS> \sql

MySQL myHost:33060+ demo SQL> 
SELECT doc->>"$.name" AS name, doc->>"$.airdate" AS airdate 
FROM GoT_episodes 
WHERE doc->>"$.number" = 1 AND doc->>"$.season" = 1\G
*************************** 1. row ***************************
   name: Winter is Coming
airdate: 2011-04-17
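
In MySQL, doc->>"$.name" is shorthand for JSON_UNQUOTE(JSON_EXTRACT(doc, "$.name")): it extracts the value and strips the JSON quoting. Here is a rough Python sketch of that idea, limited to top-level keys. The helper name extract_unquoted and the sample document are only illustrative assumptions, not part of MySQL:

```python
import json


def extract_unquoted(doc_json, key):
    """Rough analogue of doc->>"$.key" for a top-level key (illustrative only)."""
    value = json.loads(doc_json).get(key)
    if value is None:
        return None
    # Strings come back without their JSON quotes; other values as JSON text.
    return value if isinstance(value, str) else json.dumps(value)


# Hypothetical document shaped like the GoT_episodes ones used above.
episode = '{"name": "Winter is Coming", "airdate": "2011-04-17", "number": 1, "season": 1}'

print(extract_unquoted(episode, "name"))
print(extract_unquoted(episode, "season"))
```

Note that the extracted value is text, which is why the season column is left-aligned like a string in the result sets of this article.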

Let’s do some SQL queries…

Number of episodes by season

MySQL myHost:33060+ demo SQL> 
SELECT doc->>"$.season", COUNT(doc->>"$.number") 
FROM GoT_episodes 
GROUP BY doc->>"$.season";
+------------------+-------------------------+
| doc->>"$.season" | count(doc->>"$.number") |
+------------------+-------------------------+
| 1                |                      10 |
| 2                |                      10 |
| 3                |                      10 |
| 4                |                      10 |
| 5                |                      10 |
| 6                |                      10 |
| 7                |                       7 |
| 8                |                       6 |
+------------------+-------------------------+

Episode statistics for each season

MySQL myHost:33060+ demo SQL> 
SELECT DISTINCT
    doc->>"$.season" AS Season,
    max(doc->>"$.runtime") OVER w AS "Max duration",
    min(doc->>"$.runtime") OVER w AS "Min duration",
    AVG(doc->>"$.runtime") OVER w AS "Avg duration"
FROM GoT_episodes
WINDOW w AS (
    PARTITION BY doc->>"$.season"
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
+--------+--------------+--------------+--------------+
| Season | Max duration | Min duration | Avg duration |
+--------+--------------+--------------+--------------+
| 1      | 60           | 60           |           60 |
| 2      | 60           | 60           |           60 |
| 3      | 60           | 60           |           60 |
| 4      | 60           | 60           |           60 |
| 5      | 60           | 60           |           60 |
| 6      | 69           | 60           |         60.9 |
| 7      | 60           | 60           |           60 |
| 8      | 90           | 60           |           80 |
+--------+--------------+--------------+--------------+

Statistics on the number of days between episodes

MySQL myHost:33060+ demo SQL> 
SELECT
    doc->>"$.airdate" AS airdate, 
    DATEDIFF(doc->>"$.airdate", lag(doc->>"$.airdate") OVER w) AS "Delta days between episode",
    DATEDIFF(doc->>"$.airdate", first_value(doc->>"$.airdate") OVER w) AS "Total days since 1st episode"
FROM GoT_episodes
    WINDOW w AS (ORDER BY doc->>"$.airdate")
;
+------------+----------------------------+------------------------------+
| airdate    | Delta days between episode | Total days since 1st episode |
+------------+----------------------------+------------------------------+
| 2011-04-17 |                       NULL |                            0 |
| 2011-04-24 |                          7 |                            7 |
| 2011-05-01 |                          7 |                           14 |
| 2011-05-08 |                          7 |                           21 |
| 2011-05-15 |                          7 |                           28 |
| 2011-05-22 |                          7 |                           35 |
| 2011-05-29 |                          7 |                           42 |
| 2011-06-05 |                          7 |                           49 |
| 2011-06-12 |                          7 |                           56 |
| 2011-06-19 |                          7 |                           63 |
| 2012-04-01 |                        287 |                          350 |
| 2012-04-08 |                          7 |                          357 |
| 2012-04-15 |                          7 |                          364 |
| 2012-04-22 |                          7 |                          371 |
| 2012-04-29 |                          7 |                          378 |
| 2012-05-06 |                          7 |                          385 |
| 2012-05-13 |                          7 |                          392 |
| 2012-05-20 |                          7 |                          399 |
| 2012-05-27 |                          7 |                          406 |
| 2012-06-03 |                          7 |                          413 |
| 2013-03-31 |                        301 |                          714 |
| 2013-04-07 |                          7 |                          721 |
| 2013-04-14 |                          7 |                          728 |
| 2013-04-21 |                          7 |                          735 |
| 2013-04-28 |                          7 |                          742 |
| 2013-05-05 |                          7 |                          749 |
| 2013-05-12 |                          7 |                          756 |
| 2013-05-19 |                          7 |                          763 |
| 2013-06-02 |                         14 |                          777 |
| 2013-06-09 |                          7 |                          784 |
| 2014-04-06 |                        301 |                         1085 |
| 2014-04-13 |                          7 |                         1092 |
| 2014-04-20 |                          7 |                         1099 |
| 2014-04-27 |                          7 |                         1106 |
| 2014-05-04 |                          7 |                         1113 |
| 2014-05-11 |                          7 |                         1120 |
| 2014-05-18 |                          7 |                         1127 |
| 2014-06-01 |                         14 |                         1141 |
| 2014-06-08 |                          7 |                         1148 |
| 2014-06-15 |                          7 |                         1155 |
| 2015-04-12 |                        301 |                         1456 |
| 2015-04-19 |                          7 |                         1463 |
| 2015-04-26 |                          7 |                         1470 |
| 2015-05-03 |                          7 |                         1477 |
| 2015-05-10 |                          7 |                         1484 |
| 2015-05-17 |                          7 |                         1491 |
| 2015-05-24 |                          7 |                         1498 |
| 2015-05-31 |                          7 |                         1505 |
| 2015-06-07 |                          7 |                         1512 |
| 2015-06-14 |                          7 |                         1519 |
| 2016-04-24 |                        315 |                         1834 |
| 2016-05-01 |                          7 |                         1841 |
| 2016-05-08 |                          7 |                         1848 |
| 2016-05-15 |                          7 |                         1855 |
| 2016-05-22 |                          7 |                         1862 |
| 2016-05-29 |                          7 |                         1869 |
| 2016-06-05 |                          7 |                         1876 |
| 2016-06-12 |                          7 |                         1883 |
| 2016-06-19 |                          7 |                         1890 |
| 2016-06-26 |                          7 |                         1897 |
| 2017-07-16 |                        385 |                         2282 |
| 2017-07-23 |                          7 |                         2289 |
| 2017-07-30 |                          7 |                         2296 |
| 2017-08-06 |                          7 |                         2303 |
| 2017-08-13 |                          7 |                         2310 |
| 2017-08-20 |                          7 |                         2317 |
| 2017-08-27 |                          7 |                         2324 |
| 2019-04-14 |                        595 |                         2919 |
| 2019-04-21 |                          7 |                         2926 |
| 2019-04-28 |                          7 |                         2933 |
| 2019-05-05 |                          7 |                         2940 |
| 2019-05-12 |                          7 |                         2947 |
| 2019-05-19 |                          7 |                         2954 |
+------------+----------------------------+------------------------------+
73 rows in set (0.0066 sec)
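
If window functions are new to you, here is what lag() and first_value() compute in the query above, sketched in plain Python under the assumption that the air dates are ISO formatted strings. The function name airdate_deltas is hypothetical, and the sample only reuses a few dates from the result set:

```python
from datetime import date


def airdate_deltas(airdates):
    """Return (airdate, days since previous airdate, days since first airdate)."""
    ordered = sorted(date.fromisoformat(d) for d in airdates)
    if not ordered:
        return []
    first = ordered[0]       # plays the role of first_value() OVER w
    previous = None          # plays the role of lag() OVER w
    rows = []
    for current in ordered:
        delta = None if previous is None else (current - previous).days
        rows.append((current.isoformat(), delta, (current - first).days))
        previous = current
    return rows


sample = ["2011-04-17", "2011-04-24", "2011-06-19", "2012-04-01"]
for row in airdate_deltas(sample):
    print(row)
```

The first row has no previous episode, hence None, just like the NULL in the SQL output.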

Note:

Hey buddy, aren’t Window Functions very cool?

More here and here.

Drop collections

Use dropCollection() :

MySQL myHost:33060+ demo JS> db.getCollections()
[
    <Collection:GoT_episodes>, 
    <Collection:my_coll1>
]


MySQL myHost:33060+ demo JS> db.dropCollection("my_coll1")
MySQL myHost:33060+ demo JS> db.getCollections()
[
    <Collection:GoT_episodes>
]

Conclusion

Wow!
Probably one of my longest articles, but I wanted to give you a broad (although not exhaustive) overview of MySQL Document Store from a non-developer's point of view.


Now it is your turn to give it a try 🙂

NoSQL + SQL = MySQL

Thanks for using MySQL!


MySQL Security – MySQL Enterprise Data Masking and De-Identification

March 19, 2019

When thinking about security within a MySQL installation, you should consider a wide range of possible procedures / best practices and how they affect the security of your MySQL server and related applications. MySQL provides many tools / features / plugins in order to protect your data including some advanced features like Transparent Data Encryption aka TDE,  Audit, Data Masking & De-Identification, Firewall, Password Management, Password Validation Plugin, etc…

MySQL Security

In order to mitigate the effects of data breaches, and therefore the associated risks for your organization’s brand and reputation, popular regulations or standards including GDPR, PCI DSS, HIPAA,… recommend (among other things) data masking and de-identification.

According to Wikipedia:

  • Data masking or data obfuscation is the process of hiding original data with modified content (characters or other data.)
  • De-identification is the process used to prevent a person’s identity from being connected with information. For example, data produced during human subject research might be de-identified to preserve research participants’ privacy.

In other words, MySQL Enterprise Data Masking and De-Identification hides sensitive information by replacing real values with substitutes, so that the data is protected while still looking real and consistent.

This is the topic of this eighth episode of the MySQL Security series (URLs to all the articles at the end of this page).

MySQL Enterprise Data Masking and De-Identification

The simplest way to present this MySQL feature :
A built-in database solution to help organizations protect sensitive data from unauthorized uses

MySQL Enterprise Data Masking and De-Identification protects sensitive data from unauthorized users.

Note:

MySQL Enterprise Data Masking and De-Identification is an extension included in MySQL Enterprise Edition, a commercial product.

Available in MySQL 8.0, as of 8.0.13 and in MySQL 5.7, as of 5.7.24.

First step, installation.

Installation

MySQL Enterprise Data Masking and De-Identification is implemented as a plugin library file containing a plugin and user-defined functions (UDFs).
As usual, installation is easy:

mysql> 
INSTALL PLUGIN data_masking SONAME 'data_masking.so';
CREATE FUNCTION gen_blacklist RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_dictionary RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_dictionary_drop RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_dictionary_load RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_range RETURNS INTEGER  SONAME 'data_masking.so';
CREATE FUNCTION gen_rnd_email RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_rnd_pan RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_rnd_ssn RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION gen_rnd_us_phone RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION mask_inner RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION mask_outer RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION mask_pan RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION mask_pan_relaxed RETURNS STRING  SONAME 'data_masking.so';
CREATE FUNCTION mask_ssn RETURNS STRING  SONAME 'data_masking.so';

You can check the activation of the data masking plugin:

mysql> 
SELECT PLUGIN_NAME, PLUGIN_STATUS, PLUGIN_VERSION, PLUGIN_LIBRARY, PLUGIN_DESCRIPTION 
FROM INFORMATION_SCHEMA.PLUGINS 
WHERE PLUGIN_NAME='data_masking'\G
*************************** 1. row ***************************
       PLUGIN_NAME: data_masking
     PLUGIN_STATUS: ACTIVE
    PLUGIN_VERSION: 0.1
    PLUGIN_LIBRARY: data_masking.so
PLUGIN_DESCRIPTION: Data masking facilities

Note:

If the plugin and UDFs are used on a master replication server, install them on all slave servers as well to avoid replication problems.

Uninstalling is simple as well: uninstall the plugin and drop the UDFs:

mysql>
UNINSTALL PLUGIN data_masking;
DROP FUNCTION gen_blacklist;
DROP FUNCTION gen_dictionary;
DROP FUNCTION gen_dictionary_drop;
DROP FUNCTION gen_dictionary_load;
DROP FUNCTION gen_range;
DROP FUNCTION gen_rnd_email;
DROP FUNCTION gen_rnd_pan;
DROP FUNCTION gen_rnd_ssn;
DROP FUNCTION gen_rnd_us_phone;
DROP FUNCTION mask_inner;
DROP FUNCTION mask_outer;
DROP FUNCTION mask_pan;
DROP FUNCTION mask_pan_relaxed;
DROP FUNCTION mask_ssn;

Now we’re ready to play!

Data Generation

One of the nice “side features” of MySQL Data Masking and De-Identification is the ability to generate business-relevant datasets. Because it is not always possible to test/simulate your application on your real dataset (indeed, playing with customer credit card or social security numbers is a very bad practice), this feature is very convenient.

Generating Random Data with Specific Characteristics

Several functions are available. Their names start with the 4 characters gen_ and you’ll find the complete list here.
In this article I’ll use the following functions :

  • gen_range() : returns a random integer selected from a given range.
  • gen_rnd_email() : returns a random email address in the example.com domain.
  • gen_rnd_pan() : returns a random payment card Primary Account Number.
  • gen_rnd_us_phone() : returns a random U.S. phone number in the 555 area code not used for legitimate numbers.
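
To give an idea of what these generators return, here is a small Python sketch of two of them. It is only an approximation under my own assumptions (inclusive bounds for gen_range(), NULL modeled as None when the upper bound is below the lower one, and the 1-555-XXX-XXXX phone format seen in this article); it is not the plugin's implementation:

```python
import random


def gen_range(lower, upper):
    """Random integer between lower and upper, both included (assumed)."""
    if upper < lower:
        return None  # stands in for SQL NULL
    return random.randint(lower, upper)


def gen_rnd_us_phone():
    """Random phone number in the 555 area code, formatted 1-555-XXX-XXXX."""
    return "1-555-{:03d}-{:04d}".format(
        random.randint(0, 999), random.randint(0, 9999)
    )


print(gen_range(30000, 120000))
print(gen_rnd_us_phone())
```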

Generating Random Data Using Dictionaries

Sometimes you will need data of better quality. So another way to generate a relevant dataset is to use dictionaries.

Again, several functions are available. Their names also start with the 4 characters gen_ and you’ll find the complete list here.
I’ll use the following functions :

  • gen_dictionary_load() : Loads a file into the dictionary registry and assigns the dictionary a name to be used with other functions that require a dictionary name argument.
  • gen_dictionary() : Returns a random term from a dictionary.

OK, let’s move forward!
In order to use data from a dictionary we must first load the data.

A dictionary is a plain text file, with one term per line:

$ head /dict/mq_cities.txt
Basse-Pointe
Bellefontaine
Case-Pilote
Ducos
Fonds-Saint-Denis
Fort-de-France
Grand'Rivière
Gros-Morne
L'Ajoupa-Bouillon
La Trinité
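
Conceptually, a dictionary is just a list of terms from which one is picked at random. Here is a minimal Python sketch of that idea; the helper load_dictionary and the in-memory text are hypothetical stand-ins, whereas the real functions read the file on the server and keep the dictionary in a registry:

```python
import random


def load_dictionary(text):
    """Turn 'one term per line' text into a list of terms, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def gen_dictionary(terms):
    """Return a random term from an already loaded dictionary."""
    return random.choice(terms)


# First lines of the cities file shown above.
cities_text = """Basse-Pointe
Bellefontaine
Case-Pilote
Ducos
"""

cities = load_dictionary(cities_text)
print(gen_dictionary(cities))
```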

Then we must load the dictionaries:

Note:

The secure_file_priv variable must be set properly (usually in your my.cnf or my.ini).

mysql> SHOW VARIABLES LIKE 'secure_file_priv'\G
*************************** 1. row ***************************
Variable_name: secure_file_priv
        Value: /dict/
1 row in set (0,00 sec)

mysql> SELECT gen_dictionary_load('/dict/Firstnames.txt', 'Firstnames')\G
*************************** 1. row ***************************
gen_dictionary_load('/dict/Firstnames.txt', 'Firstnames'): Dictionary load success
1 row in set (0,20 sec)

mysql> SELECT gen_dictionary_load('/dict/Lastnames.txt', 'Lastnames')\G
*************************** 1. row ***************************
gen_dictionary_load('/dict/Lastnames.txt', 'Lastnames'): Dictionary load success
1 row in set (0,24 sec)

mysql> SELECT gen_dictionary_load('/dict/JobTitles.txt', 'JobTitles')\G
*************************** 1. row ***************************
gen_dictionary_load('/dict/JobTitles.txt', 'JobTitles'): Dictionary load success
1 row in set (0,00 sec)

mysql> SELECT gen_dictionary_load('/dict/BirthDates.txt', 'BirthDates')\G
*************************** 1. row ***************************
gen_dictionary_load('/dict/BirthDates.txt', 'BirthDates'): Dictionary load success
1 row in set (0,00 sec)

mysql> SELECT gen_dictionary_load('/dict/mq_cities.txt', 'mq_Cities')\G
*************************** 1. row ***************************
gen_dictionary_load('/dict/mq_cities.txt', 'mq_Cities'): Dictionary load success
1 row in set (0,00 sec)

Note:

Dictionaries are not persistent. Any dictionary used by applications must be loaded for each server startup.

Now I have all the bricks I need to build my business-centric test dataset.
For example I can generate a random email address:

mysql> SELECT gen_rnd_email();
+---------------------------+
| gen_rnd_email()           |
+---------------------------+
| rcroe.odditdn@example.com |
+---------------------------+

Or a random city from my dictionary of the cities of Martinique :

mysql> SELECT gen_dictionary('mq_Cities');
+-------------------------------+
| gen_dictionary('mq_Cities')   |
+-------------------------------+
| Fort-de-France                |
+-------------------------------+

Awesome!

Now let’s use these functions to generate some random but business-oriented data.
Below is our test table called sensitive_data which contains… sensitive data :

CREATE TABLE sensitive_data(
    emp_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    firstname VARCHAR(100) NOT NULL,
    lastname VARCHAR(100) NOT NULL,
    birth_date date,
    email VARCHAR(100) NOT NULL,
    phone VARCHAR(20),
    jobTitle VARCHAR(50),
    salary INT UNSIGNED,
    city VARCHAR(30),
    credit_card CHAR(19),
    PRIMARY KEY (emp_id))
;

I created a stored procedure (sorry but I’m a DBA) to fill my table with data. However a script in your favorite programming language could be a better choice:

DELIMITER //
DROP PROCEDURE IF EXISTS add_rows//
CREATE PROCEDURE add_rows( IN numRow TINYINT UNSIGNED)
BEGIN
    DECLARE cpt TINYINT UNSIGNED DEFAULT 0;
    WHILE cpt < numRow DO
        INSERT INTO sensitive_data(firstname, lastname, birth_date, email, phone, jobTitle, salary, city, credit_card)
        SELECT
        gen_dictionary('Firstnames'),
        gen_dictionary('Lastnames'),
        gen_dictionary('BirthDates'),
        gen_rnd_email(),
        gen_rnd_us_phone(),
        gen_dictionary('JobTitles'),
        gen_range(30000, 120000),
        gen_dictionary('mq_Cities'),
        gen_rnd_pan()
        FROM DUAL;
        SET cpt = cpt + 1;
        SELECT sleep(1);
    END WHILE;
END//
DELIMITER ;


-- Call the procedure and insert 10 rows in the table
CALL add_rows(10);


mysql> SELECT firstname, lastname, phone, salary, city FROM sensitive_data;
+-----------+-----------+----------------+--------+------------------+
| firstname | lastname  | phone          | salary | city             |
+-----------+-----------+----------------+--------+------------------+
| Fresh     | Daz       | 1-555-381-3165 |  78920 | Ducos            |
| Doowon    | Vieri     | 1-555-645-3332 |  78742 | Macouba          |
| Marsja    | Speckmann | 1-555-455-3688 |  56526 | Les Trois-Îlets  |
| Carrsten  | Speckmann | 1-555-264-8108 |  51253 | Fort-de-France   |
| Yonghong  | Marrevee  | 1-555-245-0883 |  86820 | Le Lorrain       |
| Shuji     | Magliocco | 1-555-628-3771 |  88615 | Le Marin         |
| Luisa     | Sury      | 1-555-852-7710 | 117957 | Le Morne-Rouge   |
| Troy      | Zobel     | 1-555-805-0270 |  78801 | Bellefontaine    |
| Lunjin    | Pettis    | 1-555-065-0517 |  69782 | Le Prêcheur      |
| Boriana   | Marletta  | 1-555-062-4226 |  70970 | Saint-Joseph     |
+-----------+-----------+----------------+--------+------------------+
10 rows in set (0,00 sec)

It looks like real data, it smells like real data, it sounds like real data but these are not real data. That’s what we wanted 🙂

Data Masking and De-Identification

Many masking functions are available. Their names start with the 5 characters mask_ and you’ll find the complete list here.
I’ll use the following functions :

mask_inner() masks the interior of its string argument, leaving the ends unmasked. Other arguments specify the sizes of the unmasked ends.

SELECT phone, mask_inner(phone, 0, 4) FROM sensitive_data LIMIT 1;
+----------------+-------------------------+
| phone          | mask_inner(phone, 0, 4) |
+----------------+-------------------------+
| 1-555-381-3165 | XXXXXXXXXX3165          |
+----------------+-------------------------+

mask_outer() does the reverse, masking the ends of its string argument, leaving the interior unmasked. Other arguments specify the sizes of the masked ends.

SELECT birth_date, mask_outer(birth_date, 5, 0) FROM sensitive_data LIMIT 1;
+------------+------------------------------+
| birth_date | mask_outer(birth_date, 5, 0) |
+------------+------------------------------+
| 1954-06-08 | XXXXX06-08                   |
+------------+------------------------------+

mask_pan() masks all but the last four digits of the number;
mask_pan_relaxed() is similar but leaves unmasked the first six digits, which indicate the payment card issuer.

SELECT mask_pan(credit_card), mask_pan_relaxed(credit_card) FROM sensitive_data LIMIT 1;
+-----------------------+-------------------------------+
| mask_pan(credit_card) | mask_pan_relaxed(credit_card) |
+-----------------------+-------------------------------+
| XXXXXXXXXXXX4416      | 262491XXXXXX4416              |
+-----------------------+-------------------------------+

Note:

If you deal with U.S. Social Security Numbers, you could also use the mask_ssn() function.

e.g. mysql> SELECT mask_ssn(gen_rnd_ssn());

So how do you mask and de-identify sensitive customer data?


There are different strategies. One is to use views.
Thus you already have a first level of security, because you can expose only the columns the business needs and/or filter the rows.
Furthermore you have another level of security because you can control who can access these data with relevant privileges, with or without roles.

Let’s see some examples:

Ex. 1
Mask the firstname (firstname) & the lastname (lastname)

CREATE VIEW v1_mask AS
  SELECT
    mask_inner(firstname, 1, 0) AS firstname,
    mask_outer(lastname, 3, 3) AS lastname,
    salary
  FROM sensitive_data;
SELECT * FROM v1_mask WHERE salary > 100000;
+-----------+----------+--------+
| firstname | lastname | salary |
+-----------+----------+--------+
| LXXXX     | XXXX     | 117957 |
+-----------+----------+--------+

Ex. 2
Mask the credit card number (credit_card)

CREATE VIEW v2_mask AS
  SELECT
    firstname,
    lastname,
    email,
    phone,
    mask_pan(credit_card) AS credit_card
  FROM sensitive_data;  
SELECT email, phone, credit_card 
FROM v2_mask 
WHERE firstname='Fresh' AND lastname='Daz';
+---------------------------+----------------+------------------+
| email                     | phone          | credit_card      |
+---------------------------+----------------+------------------+
| bcnnk.wnruava@example.com | 1-555-381-3165 | XXXXXXXXXXXX4416 |
+---------------------------+----------------+------------------+

Ex. 3
Replace real values of employee id (emp_id) and birth date (birth_date) with random ones.

CREATE VIEW v3_mask AS
  SELECT
    gen_range(1, 1000) AS emp_id,
    FROM_DAYS(gen_range(715000, 731000)) AS birth_date,
    jobTitle,
    salary,
    city 
  FROM sensitive_data;
SELECT DISTINCT
    jobTitle,
    max(salary) OVER w AS Max,
    min(salary) OVER w AS Min,
    AVG(salary) OVER w AS Avg
FROM v3_mask
WINDOW w AS (
    PARTITION BY jobTitle
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);
+--------------------+--------+-------+------------+
| jobTitle           | Max    | Min   | Avg        |
+--------------------+--------+-------+------------+
| Assistant Engineer |  78920 | 78920 | 78920.0000 |
| Engineer           |  88615 | 88615 | 88615.0000 |
| Manager            |  78801 | 51253 | 65027.0000 |
| Senior Engineer    |  86820 | 70970 | 78895.0000 |
| Staff              |  78742 | 69782 | 74262.0000 |
| Technique Leader   | 117957 | 56526 | 87241.5000 |
+--------------------+--------+-------+------------+

Et voilà!
In conclusion, MySQL Enterprise Data Masking and De-Identification enables organizations to:

  • Meet regulatory requirements and data privacy laws
  • Significantly reduce the risk of a data breach
  • Protect confidential information

To conclude this conclusion, I recommend reading the Data Masking in MySQL blog post from the MySQL Server Blog.

MySQL Enterprise Edition

MySQL Enterprise Edition includes the most comprehensive set of advanced features, management tools and technical support to achieve the highest levels of MySQL scalability, security, reliability, and uptime.

It reduces the risk, cost, and complexity in developing, deploying, and managing business-critical MySQL applications.

MySQL Enterprise Edition server Trial Download (Note – Select Product Pack: MySQL Database).

In order to go further

MySQL Security Series

  1. Password Validation Plugin
  2. Password Management
  3. User Account Locking
  4. The Connection-Control Plugins
  5. Enterprise Audit
  6. Enterprise Transparent Data Encryption (TDE)
  7. Enterprise Firewall
  8. Enterprise Data Masking and De-Identification

Reference Manual

MySQL Security

Thanks for using MySQL!
