Tutoriel – Déployer MySQL 5.7 InnoDB Cluster | dasini.net

Tutoriel – Déployer MySQL 5.7 InnoDB Cluster

août 21, 2018

Tags: cluster, group replication, Haute disponibilité, InnoDB Cluster

Cet article remplace le précédent tuto : Tutoriel – Déployer MySQL innoDB Cluster.

Si tu utilises MySQL 8.0, alors lit plutôt ce tuto : Tutoriel – Déployer MySQL 8.0 InnoDB Cluster.

Je reçois pas mal de questions de clients et d’utilisateurs de MySQL 5.7, j’espère donc que ce post t’apportera l’essentiel des réponses et bonnes pratiques pour te permettre de déployer un cluster InnoDB avec MySQL 5.7.

De plus, les nouvelles versions de MySQL Router et de MySQL Shell amènent de sensibles améliorations qu’il faut que je te montre à tout prix 🙂

L’un des principaux besoins de mes clients est la Haute Disponibilité avec MySQL. On va voir, dans cet article, comment déployer et gérer un cluster MySQL 5.7 « from scratch » , sous la forme d’un tutoriel, grâce à la solution HA tout en un : MySQL (5.7) InnoDB Cluster.

Note: L’article traite de MySQL InnoDB Cluster, HA natif de MySQL Server (via le plugin Group Replication) avec pour moteur de stockage InnoDB, solution à ne pas confondre avec MySQL NDB Cluster (moteur de stockage NDB).

Le contexte

3 instances MySQL autonomes, <spoil altert> qui vont grâce au plugin MySQL Group Replication se transformer en une base de données distribuée.</spoil altert>

Instance MySQL 1 (mysql_5.7_node1) : 172.18.0.11; Nœud numéro 1 du cluster
Instance MySQL 2 (mysql_5.7_node2) : 172.18.0.12; Nœud numéro 2 du cluster
Instance MySQL 3 (mysql_5.7_node3) : 172.18.0.13; Nœud numéro 3 du cluster

1 instance applicative : 192.168.1.11; MySQL Router + mon application.

Note: J’utilise l’image Docker MySQL Server supportée par l’équipe MySQL d’Oracle.

Note: Je n’aborde pas dans cet article la redondance de MySQL Router. Plusieurs scénarios sont possibles, je te recommande de lire ça, ceci et cela.

MySQL Shell est installé sur toutes les instances.

En ce qui concerne les versions des logiciels, ce sont les plus récentes à ce jour (journée caniculaire du mois d’août 2018):

Note: Dans cet article j’utilise la dernière GA de MySQL 5.7. Je publierai un autre tuto avec MySQL 8.0. Cependant, en ce qu’il concerne MySQL Router et MySQL Shell, il faut TOUJOURS prendre la dernière version (branche 8.0).

Pour récapituler notre architecture, une image valant (au moins) 1000 mots, ça nous donne :

MySQL InnoDB Cluster PoC Architecture

Vérifier la configuration des instances

La première étape consiste à s’assurer que les instances MySQL sont correctement configurées pour l’utilisation de MySQL Group Replication, la couche haute disponibilité de notre architecture. A noter qu’il est préférable de provisionner ses instances déjà correctement configurées (comme détaillé dans cet article) pour MySQL Group Replication.

Note: J’utiliser le compte utilisateur root pour configurer le cluster, cependant ce n’est pas une obligation. Il est effectivement possible de créer un compte utilisateur spécifique (ou plusieurs), avec les droits qui vont bien (accès total sur les tables des méta-données d’InnoDB Cluster + des droits d’administration de l’instance MySQL). Plus d’info ici (Paragraphe « User Privileges) ».

La vérification de la configuration se fait grâce à MySQL Shell et la méthode dba.checkInstanceConfiguration() :

$ mysqlsh --uri root@172.18.0.11
...
Server version: 5.7.23 MySQL Community Server (GPL)
No default schema selected; type \use <schema> to set one.
MySQL Shell 8.0.12
...

JS> dba.checkInstanceConfiguration('root@172.18.0.11:3306')
Please provide the password for 'root@172.18.0.11:3306': 
Validating MySQL instance at 172.18.0.11:3306 for use in an InnoDB cluster...

This instance reports its own address as 51306ade1992
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Note: verifyMyCnf option was not given so only dynamic configuration will be verified.

Some configuration options need to be fixed:
+----------------------------------+---------------+----------------+--------------------------------------------------+
| Variable                         | Current Value | Required Value | Note                                             |
+----------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum                  | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency         | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                        | OFF           | ON             | Update read-only variable and restart the server |
| log_bin                          | 0             | 1              | Update read-only variable and restart the server |
| log_slave_updates                | 0             | ON             | Update read-only variable and restart the server |
| master_info_repository           | FILE          | TABLE          | Update read-only variable and restart the server |
| relay_log_info_repository        | FILE          | TABLE          | Update read-only variable and restart the server |
| server_id                        | 0             | <unique ID>    | Update read-only variable and restart the server |
| transaction_write_set_extraction | OFF           | XXHASH64       | Update read-only variable and restart the server |
+----------------------------------+---------------+----------------+--------------------------------------------------+

The following variable needs to be changed, but cannot be done dynamically: 'log_bin'
Please use the dba.configureInstance() command to repair these issues.

{
    "config_errors": [
        {
            "action": "server_update", 
            "current": "CRC32", 
            "option": "binlog_checksum", 
            "required": "NONE"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "enforce_gtid_consistency", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "gtid_mode", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "log_bin", 
            "required": "1"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "log_slave_updates", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "FILE", 
            "option": "master_info_repository", 
            "required": "TABLE"
        },
        {
            "action": "restart", 
            "current": "FILE", 
            "option": "relay_log_info_repository", 
            "required": "TABLE"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "server_id", 
            "required": "<unique ID>"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "transaction_write_set_extraction", 
            "required": "XXHASH64"
        }
    ], 
    "errors": [], 
    "status": "error"
}

Dans mon cas, avec l’installation de MySQL 5.7 par défaut sous Ubuntu (avec l’image Docker), niveau configuration… bah j’ai tout à faire 🙂

La méthode renvoie un document JSON (pratique pour l’automatisation) avec la liste des tâches à effectuer pour être conforme… Configurons donc !

J’ai deux solutions :

1/ je prépare mes instances « manuellement » (cette tâche peut bien évidemment s’automatiser e.g. Ansible, Puppet, Chef, …) comme expliqué dans l’article comment configurer un groupe.
2/ je me connecte à chaque instance en local, et j’utilise la méthode : configureLocalInstance()

Et ensuite je ne dois pas oublier de redémarrer les instances 🙂

Soyons fou ! allons y pour la méthode 2 :

JS> dba.configureLocalInstance('root@172.18.0.11:3306')
Please provide the password for 'root@172.18.0.11:3306': 
Configuring MySQL instance at 172.18.0.11:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_5.7_node1
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Some configuration options need to be fixed:
+----------------------------------+---------------+----------------+--------------------------------------------------+
| Variable                         | Current Value | Required Value | Note                                             |
+----------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum                  | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency         | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                        | OFF           | ON             | Update read-only variable and restart the server |
| log_bin                          | 0             | 1              | Update read-only variable and restart the server |
| log_slave_updates                | 0             | ON             | Update read-only variable and restart the server |
| master_info_repository           | FILE          | TABLE          | Update read-only variable and restart the server |
| relay_log_info_repository        | FILE          | TABLE          | Update read-only variable and restart the server |
| server_id                        | 0             | <unique ID>    | Update read-only variable and restart the server |
| transaction_write_set_extraction | OFF           | XXHASH64       | Update read-only variable and restart the server |
+----------------------------------+---------------+----------------+--------------------------------------------------+

The following variable needs to be changed, but cannot be done dynamically: 'log_bin'
WARNING: Cannot update configuration file for a remote target instance.
ERROR: Unable to change MySQL configuration.
MySQL server configuration needs to be updated, but neither remote nor local configuration is possible.
Please run this command locally, in the same host as the MySQL server being configured, and pass the path to its configuration file through the mycnfPath option.
Dba.configureLocalInstance: Unable to update configuration (RuntimeError)

Ouppss!!! dba.configureLocalInstance ne fonctionne qu’en local, c’est-à-dire, si je suis connecté sur la machine hôte de l’instance MySQL (ce qui est une bonne chose). Du coup après m’être connecté à l’hôte 172.18.0.11 :

daz@172.18.0.11:~$ mysqlsh 
MySQL Shell 8.0.12

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.


JS> dba.configureLocalInstance('root@localhost:3306')
Please provide the password for 'root@localhost:3306': 
Configuring local MySQL instance listening at port 3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_5.7_node1
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Some configuration options need to be fixed:
+----------------------------------+---------------+----------------+--------------------------------------------------+
| Variable                         | Current Value | Required Value | Note                                             |
+----------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum                  | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency         | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                        | OFF           | ON             | Update read-only variable and restart the server |
| log_bin                          | 0             | 1              | Update read-only variable and restart the server |
| log_slave_updates                | 0             | ON             | Update read-only variable and restart the server |
| master_info_repository           | FILE          | TABLE          | Update read-only variable and restart the server |
| relay_log_info_repository        | FILE          | TABLE          | Update read-only variable and restart the server |
| server_id                        | 0             | <unique ID>    | Update read-only variable and restart the server |
| transaction_write_set_extraction | OFF           | XXHASH64       | Update read-only variable and restart the server |
+----------------------------------+---------------+----------------+--------------------------------------------------+

The following variable needs to be changed, but cannot be done dynamically: 'log_bin'

Detecting the configuration file...
Default file not found at the standard locations.
Please specify the path to the MySQL configuration file: /etc/my.cnf
Do you want to perform the required configuration changes? [y/n]: y
Configuring instance...
The instance 'localhost:3306' was configured for cluster usage.
MySQL server needs to be restarted for configuration changes to take effect.

Note: Assure toi d’avoir les droits nécessaires pour mettre à jour le fichier de configuration de MySQL.

Les informations ajoutées dans le fichier de configuration se trouvent en fin de fichier :

daz@172.18.0.11:~$ cat /etc/my.cnf

[mysqld]
... <data previously in the file> ...

log_slave_updates = ON
server_id = 1467421716
relay_log_info_repository = TABLE
master_info_repository = TABLE
transaction_write_set_extraction = XXHASH64
binlog_format = ROW
report_port = 3306
binlog_checksum = NONE
enforce_gtid_consistency = ON
log_bin
gtid_mode = ON

172.18.0.11 est configurée !

Après redémarrage de l’instance MySQL, la sortie de checkInstanceConfiguration est beaucoup moins anxiogène :

JS> dba.checkInstanceConfiguration('root@172.18.0.11:3306')
Please provide the password for 'root@172.18.0.11:3306': 
Validating MySQL instance at 172.18.0.11:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_5.7_node1
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Note: verifyMyCnf option was not given so only dynamic configuration will be verified.
Instance configuration is compatible with InnoDB cluster

The instance '172.18.0.11:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}

OK ! Le membre est prêt pour faire parti d’un groupe.

La même procédure doit être appliquée sur les autres instances MySQL.

… <Quelques commandes de configuration>…

Et je me retrouve avec le résultat suivant:

JS > dba.checkInstanceConfiguration('root@172.18.0.12:3306')
Please provide the password for 'root@172.18.0.12:3306':
Validating MySQL instance at 172.18.0.12:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_5.7_node2
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Note: verifyMyCnf option was not given so only dynamic configuration will be verified.
Instance configuration is compatible with InnoDB cluster

The instance '172.18.0.12:3306' is valid for InnoDB cluster usage.

{
"status": "ok"
}

JS > dba.checkInstanceConfiguration('root@172.18.0.13:3306')
Please provide the password for 'root@172.18.0.13:3306':
Validating MySQL instance at 172.18.0.13:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_5.7_node3
Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Note: verifyMyCnf option was not given so only dynamic configuration will be verified.
Instance configuration is compatible with InnoDB cluster

The instance '172.18.0.13:3306' is valid for InnoDB cluster usage.

{
"status": "ok"
}

All good!

Créer le cluster

Une fois les 3 instances correctement configurées, l’étape suivante consiste à créer le cluster avec createCluster. Cette méthode va être jouée sur le premier membre, l’instance MySQL sur 172.18.0.11, elle va permettre de bootstrapper le cluster:

JS> \c root@172.18.0.11
Creating a Session to 'root@172.18.0.11'
Enter password: 
...

JS> var cluster = dba.createCluster('pocCluster', {ipWhitelist: "172.18.0.0/16"});
A new InnoDB cluster will be created on instance 'root@172.18.0.11:3306'.

Validating instance at 172.18.0.11:3306...

This instance reports its own address as mysql_5.7_node1

Instance configuration is suitable.
Creating InnoDB cluster 'pocCluster' on 'root@172.18.0.11:3306'...
WARNING: On instance '172.18.0.11:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
Adding Seed Instance...

Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.

createCluster() prend comme paramètre le nom du cluster (pocCluster). Tu peux lui passer également quelques informations optionnelles comme la whitelist.

Tu peux ensuite vérifier l’état du nœud dans le cluster avec status() :

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.11:3306"
}

Notes : Assure toi que ton DNS (ou /etc/hosts) est correctement configuré, sinon tu vas avoir des soucis de connections…

L’ajouts des nœuds suivant se fait avec addInstance(), il est néanmoins conseillé d’exécuter checkInstanceState() au préalable pour s’assurer de la compatibilité des GTID sets :

Nœud 2

JS> cluster.checkInstanceState('root@172.18.0.12:3306')
Analyzing the instance replication state...
Please provide the password for 'root@172.18.0.12:3306': 

The instance 'root@172.18.0.12:3306' is valid for the cluster.
The instance is new to Group Replication.

{
    "reason": "new", 
    "state": "ok"
}



JS> cluster.addInstance("root@172.18.0.12:3306", {ipWhitelist: "172.18.0.0/16"})
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster ...

Please provide the password for 'root@172.18.0.12:3306': 
Validating instance at 172.18.0.12:3306...

This instance reports its own address as mysql_5.7_node2

Instance configuration is suitable.
WARNING: On instance '172.18.0.12:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
WARNING: On instance '172.18.0.11:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
The instance 'root@172.18.0.12:3306' was successfully added to the cluster.



JS> cluster.status();
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.11:3306"
}

Au cas où l’instance ajoutée n’a pas un GTID set compatible avec le groupe checkInstanceState te le fait savoir :

JS> cluster.checkInstanceState('root@172.18.0.12:3306')
Analyzing the instance replication state...
Please provide the password for 'root@172.18.0.12:3306': 

The instance '172.18.0.12:3306' is invalid for the cluster.
The instance contains additional transactions in relation to the cluster.

{
    "reason": "diverged", 
    "state": "error"
}

En fonction du contexte, il faut alors soit restaurer une sauvegarde d’un membre du cluster sur l’instance problématique (celle qui diverge) ou alors si tu sais ce que tu fais, une synchronisation des GTIDs est toujours possible, voir un reset master.

Nœud 3

JS> cluster.checkInstanceState('root@172.18.0.13:3306')
Analyzing the instance replication state...
Please provide the password for 'root@172.18.0.13:3306': 

The instance 'root@172.18.0.13:3306' is valid for the cluster.
The instance is new to Group Replication.

{
    "reason": "new", 
    "state": "ok"
}



JS> cluster.addInstance("root@172.18.0.13:3306", {ipWhitelist: "172.18.0.0/16"})
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster ...

Please provide the password for 'root@172.18.0.13:3306': 
Validating instance at 172.18.0.13:3306...

This instance reports its own address as mysql_5.7_node3

Instance configuration is suitable.
WARNING: On instance '172.18.0.13:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
WARNING: On instance '172.18.0.11:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
WARNING: On instance '172.18.0.12:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
The instance 'root@172.18.0.13:3306' was successfully added to the cluster.

Le résultat final:

MySQL InnoDB Cluster PoC Architecture

Et voilà!

Un cluster MySQL Group Replication de 3 nœuds est déployé grâce à MySQL Shell !

La configuration actuelle est la suivante:

Nœud 1 (mysql_5.7_node1) = 172.18.0.11 : Primaire (lecture/écriture)
Nœud 2 (mysql_5.7_node2) = 172.18.0.12 : Secondaire (lecture seule)
Nœud 3 (mysql_5.7_node3) = 172.18.0.13 : Secondaire (lecture seule)

Si tu as été attentif, lors de l’ajout des nœuds (pareil donc pour la création du cluster), tu as noté que MySQL Shell me renvoi des « Warnings », cependant, rien de bien méchant !

En MySQL 5.7 la commande SET PERSIST n’existe tout simplement pas. Il n’est donc pas possible, à cette étape, d’automatiquement rendre persistante la configuration ie l’écrire dans le fichier de configuration à distance (remote en bon franglais). Bref, en clair, la conf des nœuds est en mémoire.

Persistance de la configuration

Pour rendre la configuration persistante, il faut alors exécuter, sur chacun des nœuds et après que le nœud soit configuré, la méthode (déjà vue) dba.configureLocalInstance() :

# Before persistence, no MySQL Group Replication variable in my.cnf
daz@172.18.0.11:~$ grep -c group_replication /etc/my.cnf
0



daz@172.18.0.11:~$ mysqlsh
MySQL Shell 8.0.12

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.


JS > dba.configureLocalInstance("root@localhost:3306")  
Please provide the password for 'root@localhost:3306': 
The instance 'localhost:3306' belongs to an InnoDB cluster.

Detecting the configuration file...
Default file not found at the standard locations.
Please specify the path to the MySQL configuration file: /etc/my.cnf
Persisting the cluster settings...
The instance 'localhost:3306' was configured for use in an InnoDB cluster.

The instance cluster settings were successfully persisted.



# After persistence, some MySQL Group Replication variables were added in my.cnf
daz@172.18.0.11:~$ grep -c group_replication /etc/my.cnf
35

A noter que cette opération ne peut se faire qu’en local.

Evidemment, à faire sur tout les autres nœuds:

daz@172.18.0.12:~$ mysqlsh
MySQL Shell 8.0.12

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.


JS > dba.configureLocalInstance("root@localhost:3306")
Please provide the password for 'root@localhost:3306': 
The instance 'localhost:3306' belongs to an InnoDB cluster.

Detecting the configuration file...
Default file not found at the standard locations.
Please specify the path to the MySQL configuration file: /etc/my.cnf
Persisting the cluster settings...
The instance 'localhost:3306' was configured for use in an InnoDB cluster.

The instance cluster settings were successfully persisted.

Et le dernier:

daz@172.18.0.13:~$ mysqlsh
MySQL Shell 8.0.12

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.


JS > dba.configureLocalInstance("root@localhost:3306")
Please provide the password for 'root@localhost:3306': 
The instance 'localhost:3306' belongs to an InnoDB cluster.

Detecting the configuration file...
Default file not found at the standard locations.
Please specify the path to the MySQL configuration file: /etc/my.cnf
Persisting the cluster settings...
The instance 'localhost:3306' was configured for use in an InnoDB cluster.

The instance cluster settings were successfully persisted.

Bien que pas obligatoire, je recommande de le faire systématiquement.

La suite ?

Configuration de MySQL Router

Les recommandations de MySQL sont d’installer MySQL Router sur la machine hôte de l’application, je vais donc l’installer sur la machine 192.168.1.11.

Note: Si tu ne peux (veux) pas mettre MySQL Router sur l’application, il va alors te falloir gérer le HA du Router. Plusieurs solutions sont envisageables comme :

Pacemaker
Keepalived

Bootstrap MySQL Router

La première étape est le bootstrap:

daz@192.168.1.11:~$ mysqlrouter --bootstrap root@172.18.0.11:3306 --conf-base-port 3306 --directory ~/routerConf/RouterPoC
Please enter MySQL password for root: 

Bootstrapping MySQL Router instance at '/home/daz/routerConf/RouterPoC'...
Checking for old Router accounts
Creating account mysql_router6_7gnev5crokb8@'%'
MySQL Router  has now been configured for the InnoDB cluster 'pocCluster'.

The following connection information can be used to connect to the cluster.

Classic MySQL protocol connections to cluster 'pocCluster':
- Read/Write Connections: localhost:3306
- Read/Only Connections: localhost:3307
X protocol connections to cluster 'pocCluster':
- Read/Write Connections: localhost:3308
- Read/Only Connections: localhost:3309

Note: il se peut que tu rencontres un problème de permission. Probablement dû à la configuration de AppArmor… Google (ou équivalent) est ton ami 🙂 (si tu es sous Ubuntu click ici).

J’ai créé une configuration différente de celle par défaut, en personnalisant avec quelques options:

conf-base-port : le port proposé par défaut est 6446 pour la lecture/écriture. Dans mon cas, je veux utiliser le célèbre port 3306
directory : histoire de ranger tout le bazar de cette instance de Router dans le répertoire spécifié

La liste complète des options est disponible ici.

Pour résumer, 4 ports TCP ont été configurés, dont 2 pour les connexions MySQL traditionnelles:

3306 (au lieu de 6446 par défaut) : lectures / écritures pour le nœud primaire
3307 (au lieu de 6447 par défaut) : lectures seules pour les nœuds secondaires (en Round-Robin)
Et le pendant pour les connexions avec le protocole X (3308 & 3309 (au lieu de respectivement 64460 & 64470)), pour une utilisation NoSQL Document Store de MySQL.

Le fichier de configuration de MySQL Router contient quelques informations importantes, tel que le(s) port(s) à utiliser par l’application (comme vu précédemment) :

daz@192.168.1.11:~$ cat ~/routerConf/RouterPoC/mysqlrouter.conf 
# File automatically generated during MySQL Router bootstrap
[DEFAULT]
logging_folder=/home/daz/routerConf/RouterPoC/log
runtime_folder=/home/daz/routerConf/RouterPoC/run
data_folder=/home/daz/routerConf/RouterPoC/data
keyring_path=/home/daz/routerConf/RouterPoC/data/keyring
master_key_path=/home/daz/routerConf/RouterPoC/mysqlrouter.key
connect_timeout=30
read_timeout=30

[logger]
level = INFO

[metadata_cache:pocCluster]
router_id=6
bootstrap_server_addresses=mysql://172.18.0.11:3306,mysql://172.18.0.12:3306,mysql://172.18.0.13:3306
user=mysql_router6_7gnev5crokb8
metadata_cluster=pocCluster
ttl=0.5

[routing:pocCluster_default_rw]
bind_address=0.0.0.0
bind_port=3306
destinations=metadata-cache://pocCluster/default?role=PRIMARY
routing_strategy=round-robin
protocol=classic

[routing:pocCluster_default_ro]
bind_address=0.0.0.0
bind_port=3307
destinations=metadata-cache://pocCluster/default?role=SECONDARY
routing_strategy=round-robin
protocol=classic

[routing:pocCluster_default_x_rw]
bind_address=0.0.0.0
bind_port=3308
destinations=metadata-cache://pocCluster/default?role=PRIMARY
routing_strategy=round-robin
protocol=x

[routing:pocCluster_default_x_ro]
bind_address=0.0.0.0
bind_port=3309
destinations=metadata-cache://pocCluster/default?role=SECONDARY
routing_strategy=round-robin
protocol=x

Il est évidemment possible de modifier ce fichier.

Ensuite, il faut démarrer MySQL Router avec le script start.sh

daz@192.168.1.11:~$ ~/routerConf/RouterPoC/start.sh

L’arrêt du Router se fait avec le script stop.sh (mais tu l’avais deviné)

daz@192.168.1.11:~$ ~/routerConf/RouterPoC/stop.sh

Voilà pour MySQL Router !

Se connecter au cluster

A partir de maintenant, ton cluster et « up and running« . Ton application va donc devoir se connecter au port 3306 (car on l’a configuré comme cela, sinon c’est 6446 par défaut – je radote, je sais) pour utiliser la base de donnée. D’ailleurs du point de vue de l’application, la base de donnée est MySQL Router, sauf qu’en réalité ce n’est pas 1 instance, mais bel et bien 3 instances qui sont en backend et ceci en toute transparence (épatant! hein ?).

La partie utilisation du cluster est hors du scope de cet article, mais on peut facilement simuler le comportement de l’application avec un client MySQL (MySQL Shell ici) et MySQL router.

daz@192.168.1.11:~$ mysqlsh --uri=root@localhost:3306 --sql
SQL > SELECT @@report_host;
+-----------------+
| @@report_host   |
+-----------------+
| mysql_5.7_node1 |
+-----------------+

Je me connecte avec MySQL Shell en mode SQL (ça c’est l’applicatif), au cluster (à mysql_5.7_node1, nœud primaire InnoDB Cluster), par l’intermédiaire de MySQL Router en localhost (car je suis sur la machine 192.168.1.11) sur le port 3306.

Le paramètre report_host (défini dans mon fichier de configuration) me renvoi la valeur du nœud 1, le primaire.

En cas d’arrêt du primaire, un nouveau va être automatiquement élu par le cluster (voir paragraphe failover plus bas) est la même commande me donnera un résultat différent:

-- /!\ Arrêt du nœud 1
SQL> SELECT @@report_host;
ERROR: 2013 (HY000): Lost connection to MySQL server during query
The global session got disconnected..
Attempting to reconnect to 'mysql://root@localhost:3306'..
The global session was successfully reconnected.

SQL> SELECT @@report_host;
+-----------------+
| @@report_host   |
+-----------------+
| mysql_5.7_node2 |
+-----------------+

Gestion des nœuds

Quelques commandes qui vont te simplifier la vie…

Performance_Schema

Quelques informations sont disponibles en SQL au niveau des instances.

Identifier le nœud primaire

SQL> SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT, MEMBER_STATE 
FROM performance_schema.replication_group_members 
    INNER JOIN performance_schema.global_status 
        ON (MEMBER_ID = VARIABLE_VALUE) 
WHERE VARIABLE_NAME='group_replication_primary_member'\G
*************************** 1. row ***************************
   MEMBER_ID: 56ea1a24-9cbe-11e8-aba1-0242ac12000b
 MEMBER_HOST: mysql_5.7_node1
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE

Description des membres du cluster

SQL> SELECT * FROM performance_schema.replication_group_members\G
*************************** 1. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 556ba78e-9cbe-11e8-ac75-0242ac12000c
 MEMBER_HOST: mysql_5.7_node2
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
*************************** 2. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 5693705a-9cbe-11e8-abf9-0242ac12000d
 MEMBER_HOST: mysql_5.7_node3
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE
*************************** 3. row ***************************
CHANNEL_NAME: group_replication_applier
   MEMBER_ID: 56ea1a24-9cbe-11e8-aba1-0242ac12000b
 MEMBER_HOST: mysql_5.7_node1
 MEMBER_PORT: 3306
MEMBER_STATE: ONLINE

Récupérer les méta-données d’un cluster

Les méta-données du cluster sont stockées sur les membres dans le schéma mysql_innodb_cluster_metadata :

daz@172.18.0.11 ~ $ mysqlsh --uri root@172.18.0.11 --sql
...

SQL> SHOW SCHEMAS;
+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
+-------------------------------+


SQL> SHOW TABLES IN mysql_innodb_cluster_metadata;
+-----------------------------------------+
| Tables_in_mysql_innodb_cluster_metadata |
+-----------------------------------------+
| clusters                                |
| hosts                                   |
| instances                               |
| replicasets                             |
| routers                                 |
| schema_version                          |
+-----------------------------------------+


SQL> SELECT cluster_name FROM mysql_innodb_cluster_metadata.clusters;
+--------------+
| cluster_name |
+--------------+
| pocCluster   |
+--------------+


SQL > SELECT host_name FROM mysql_innodb_cluster_metadata.hosts;
+-----------------+
| host_name       |
+-----------------+
| 172.18.0.11     |
| 172.18.0.12     |
| 172.18.0.13     |
| 192.168.1.11    |
+-----------------+

Pour récupérer les informations relatives à l’état du cluster dans une nouvelle session il faut utiliser la méthode getCluster :

JS> cluster.status()
ReferenceError: cluster is not defined


JS> var cluster = dba.getCluster('pocCluster')


JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.11:3306"
}

Failover

Le basculement niveau base de données (changement de primaire) est automatiquement géré par les membres du cluster entre eux.

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.11:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.11:3306"
}

Crash du noeud primaire (172.18.0.11)…

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.13:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.13:3306"
}

Nouveau primaire élu par le groupe : 172.18.0.13.

Et 172.18.0.11 est porté disparu (MIA).

Les données configuration cluster étant sauvegardées dans le fichier de configuration (voir paragraphe Persistance de la configuration), une fois le redémarré/réparé/restauré il fera automatiquement parti du cluster à nouveau. et il aura un rôle de secondaire.

En cas de configuration non persistante, un rejoinInstance() est nécessaire pour remettre le nœud dans le cluster. (voir paragraphe suivant Remettre un membre dans le groupe).

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.13:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.13:3306"
}

Remettre un membre dans le groupe

Nécessaire si la conf n’est pas persistante ou si la variable group_replication_start_on_boot = OFF.

Le nœud peut alors être remit dans le groupe avec la commande rejoinInstance() :

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}



JS> dba.checkInstanceConfiguration('root@172.18.0.13:3306')
Please provide the password for 'root@172.18.0.13:3306': 
Validating MySQL instance at 172.18.0.13:3306 for use in an InnoDB cluster...
 
This instance reports its own address as mysql_5.7_node3

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Note: verifyMyCnf option was not given so only dynamic configuration will be verified.
Instance configuration is compatible with InnoDB cluster

The instance '172.18.0.13:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}



JS> cluster.rejoinInstance("root@172.18.0.13:3306", {ipWhitelist: "172.18.0.0/16"})
Rejoining the instance to the InnoDB cluster. Depending on the original
problem that made the instance unavailable, the rejoin operation might not be
successful and further manual steps will be needed to fix the underlying
problem.

Please monitor the output of the rejoin operation and take necessary action if
the instance cannot rejoin.

Rejoining instance to the cluster ...

Please provide the password for 'root@172.18.0.13:3306': 
The instance '172.18.0.13:3306' was successfully rejoined on the cluster.



JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}

Supprimer une instance du groupe

Sans grande surprise, c’est la commande removeInstance

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}



JS> cluster.removeInstance("root@172.18.0.13:3306")
The instance will be removed from the InnoDB cluster. Depending on the instance
being the Seed or not, the Metadata session might become invalid. If so, please
start a new session to the Metadata Storage R/W instance.

Attempting to leave from the Group Replication group...
WARNING: On instance '172.18.0.13:3306' configuration cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please set the 'group_replication_start_on_boot' variable to 'OFF' in the server configuration file, otherwise it might rejoin the cluster upon restart.
WARNING: On instance '172.18.0.12:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
WARNING: On instance '172.18.0.11:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.

The instance '172.18.0.13:3306' was successfully removed from the cluster.



JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}

L’instance n’est alors plus listée dans les méta-données :

JS> \sql
Switching to SQL mode... Commands end with ;


SQL> SELECT instance_name FROM mysql_innodb_cluster_metadata.instances;
+------------------+
| instance_name    |
+------------------+
| 172.18.0.11:3306 |
| 172.18.0.12:3306 |
+------------------+

Pour la remettre dans le groupe, il faut donc rejouer le processus de l’ajout d’instance vu plus haut :

JS> cluster.checkInstanceState('root@172.18.0.13:3306')
Analyzing the instance replication state...
Please provide the password for 'root@172.18.0.13:3306': 

The instance 'root@172.18.0.13:3306' is valid for the cluster.
The instance is fully recoverable.

{
    "reason": "recoverable", 
    "state": "ok"
}



JS> cluster.addInstance("root@172.18.0.13:3306", {ipWhitelist: "172.18.0.0/16"})
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster ...

Please provide the password for 'root@172.18.0.13:3306': 
Validating instance at 172.18.0.13:3306...

This instance reports its own address as mysql_5.7_node3

Instance configuration is suitable.
WARNING: On instance '172.18.0.13:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
WARNING: On instance '172.18.0.12:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
WARNING: On instance '172.18.0.11:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.
The instance 'root@172.18.0.13:3306' was successfully added to the cluster.



JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}

Perte du quorum

Si le cluster perd plus de la moitié de ses membres (crash ou split brain par exemple) il se retrouve dans un état assez désagréable, network partitioning, en clair il faut une intervention externe au cluster pour permettre aux membres restant de continuer à faire leur boulot.

Note: Par perte j’entend arrêt non prévu (crash). En cas d’arrêt normal ou propre, même si le cluster perd son quorum (dans ce cas présent arrêt normal de 2 nœuds), le nœud restant sait que les autres nœuds ne sont plus là (en clair pas de risque de split brain) donc le cluster continue de fonctionner. Mais c’est un cluster avec un seul nœud…

JS > cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}

Dans notre cas, avec 3 instances, il faut en perdre 2 d’un coup :

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.12:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.12:3306"
}

Perte des nœuds (crash) 172.18.0.11 & 172.18.0.12… (Mayday, Mayday, Mayday!!!)

JS> cluster.status()
WARNING: Cluster has no quorum and cannot process write transactions: 2 out of 3 members of the InnoDB cluster are unreachable from the member we’re connected to, which is not sufficient for a quorum to be reached.
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.13:3306", 
        "ssl": "REQUIRED", 
        "status": "NO_QUORUM", 
        "statusText": "Cluster has no quorum as visible from '172.18.0.13:3306' and cannot process write transactions. 2 members are not active", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "UNREACHABLE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "UNREACHABLE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.13:3306"
}

Le failover automatique ne peut pas s’enclencher, le nœud survivant (172.18.0.13) est bloqué.

Il faut donc intervenir :

$ mysqlsh --uri=root@172.18.0.13
Creating a Session to 'root@172.18.0.13'
...


JS> var cluster = dba.getCluster("pocCluster")
WARNING: Cluster has no quorum and cannot process write transactions: 2 out of 3 members of the InnoDB cluster are unreachable from the member we’re connected to, which is not sufficient for a quorum to be reached.



JS>  cluster.forceQuorumUsingPartitionOf('root@172.18.0.13:3306')
Restoring replicaset 'default' from loss of quorum, by using the partition composed of [172.18.0.13:3306]

Restoring the InnoDB cluster ...

Please provide the password for 'root@172.18.0.13:3306': 
The InnoDB cluster was successfully restored using the partition from the instance 'root@172.18.0.13:3306'.

WARNING: To avoid a split-brain scenario, ensure that all other members of the replicaset are removed or joined back to the group that was restored.



JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.13:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.13:3306"
}

Evidemment, sauf si tu es joueur 🙂 , il faut éviter de rester trop longtemps dans cet état.

Une fois les instances remisent en condition, il faut soit simplement les démarrer ou alors utiliser rejoinInstance() pour les remettre dans le cluster, en tant que secondaire.

JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.13:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.13:3306"
}

Repartir après un arrêt total du cluster

La perte du quorum est une chose, mais il y a pire, perdre tout les nœuds…

En cas d’arrêt total du cluster i.e. toutes les instances sont éteintes, il faut utiliser, une fois les instances MySQL de nouveau démarrées rebootClusterFromCompleteOutage() :

$ mysqlsh root@172.18.0.12:3306
...

JS > var cluster = dba.rebootClusterFromCompleteOutage('pocCluster')
Reconfiguring the cluster 'pocCluster' from complete outage...

The instance '172.18.0.11:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The instance '172.18.0.13:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

Dba.rebootClusterFromCompleteOutage: The active session instance isn't the most updated in comparison with the ONLINE instances of the Cluster's metadata. Please use the most up to date instance: '172.18.0.13:3306'. (RuntimeError)

Le reboot doit se faire sur l’instance la plus à jour (ici la machine 172.18.0.13) :

$ mysqlsh --uri=root@172.18.0.13
Creating a session to 'root@172.18.0.13'
...

JS> var cluster = dba.rebootClusterFromCompleteOutage('pocCluster')
var cluster = dba.rebootClusterFromCompleteOutage('pocCluster')
Reconfiguring the cluster 'pocCluster' from complete outage...

The instance '172.18.0.11:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The instance '172.18.0.12:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

WARNING: On instance '172.18.0.13:3306' membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the <Dba>.configureLocalInstance command locally to persist the changes.

The cluster was successfully rebooted.



JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "172.18.0.13:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "172.18.0.11:3306": {
                "address": "172.18.0.11:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.12:3306": {
                "address": "172.18.0.12:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "172.18.0.13:3306": {
                "address": "172.18.0.13:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@172.18.0.13:3306"
}

Le membre sur lequel la commande est exécutée est le nouveau primaire.

Voilà c’est tout pour aujourd’hui 🙂

C’est très certainement l’un de mes derniers articles sur MySQL 5.7, cependant, vu le nombre élevé de retour que je reçois, je tenais à mettre à jour le tuto sur la version 5.7 de MySQL InnoDB Cluster.

Dans la même thématique :

Thanks for using MySQL!

Olivier DASINI

Architecte Solution Cloud IA & Analytique chez Oracle
MySQL Geek, passionné par la data & l’IA, Architecte, DBA, Consultant, Formateur, Auteur, Blogueur eCloud Solutions Architectt Conférencier.
—–

Blog: www.dasini.net/blog/en/
Linkedin : www.linkedin.com/in/olivier-dasini
Speaker Deck: https://speakerdeck.com/freshdaz
SlideShare: www.slideshare.net/freshdaz
Youtube: https://www.youtube.com/channel/UC12TulyJsJZHoCmby3Nm3WQ
—–

6 Responses to “Tutoriel – Déployer MySQL 5.7 InnoDB Cluster”

Tutoriel - Déployer MySQL innoDB Cluster | dasini.net - Journal d'un expert MySQL

août 21st, 2018 at 9:33

[…] C’est ici. […]
Tutoriel – Déployer MySQL 8.0 InnoDB Cluster | dasini.net - Journal d'un expert MySQL

août 30th, 2018 at 11:25

[…] Si tu utilises MySQL 5.7 tu peux également mettre en oeuvre InnoDB Cluster, je te recommande cet article: Tutoriel – Déployer MySQL 5.7 InnoDB Cluster. […]
Jug

février 20th, 2019 at 17:13

Bonjour,
Avant tout, merci pour ce blog et pour les différents billets que vous partagez avec nous.
Je débute avec MySQL notamment la partie Cluster, je souhaiterais savoir s’il faut des licences particulière pour utiliser « MySQL 5.7 InnoDB Cluster » Est-ce payant ou gratuit ?

Merci d’avance pour votre aide.
Olivier DASINI

février 21st, 2019 at 13:10

Merci pour les encouragements.
Oui il est possible d’utiliser MySQL InnoDB Cluster avec la version communautaire (libre) de MySQL.
Olivier
Jug

février 28th, 2019 at 11:43

Un grand merci pour votre réponse.
Je viens de finaliser la mise en place de mon premier Cluster MySQL InnoDB grâce à votre tuto 😉

Merci encore une fois.
200 !! | dasini.net - Journal d'un expert MySQL

avril 9th, 2019 at 11:06

[…] Tutoriel – Déployer MySQL 5.7 InnoDB Cluster […]

dasini.net – Journal d'un expert MySQL