Tutorial – Deploying MySQL InnoDB Cluster

May 11, 2017

 

Warning

 

The article you are about to read is now obsolete 🙁 Ohhhhhh!

But don't worry, it has been replaced by a brand-new article, shinier and above all more complete and up to date 🙂 Ahhhhh!!

It's right here.

In previous episodes we saw how to deploy MySQL Group Replication "manually", how to understand and test MySQL InnoDB Cluster, and how to easily manage an already deployed Group Replication cluster with MySQL Shell.

Today, in the High Availability with MySQL series, we will see how to deploy and manage a cluster from scratch, as a tutorial, using the all-in-one solution: MySQL InnoDB Cluster.

Note:
This article covers MySQL InnoDB Cluster, MySQL Server's native HA (Group Replication plugin) with InnoDB as the storage engine, a solution not to be confused with MySQL NDB Cluster (NDB storage engine).

 

The context

3 standalone MySQL instances, <spoiler alert> which, thanks to the MySQL Group Replication plugin, will turn into a distributed database.</spoiler alert>

  • MySQL instance 1: 192.168.1.22; cluster node number 1
  • MySQL instance 2: 192.168.1.50; cluster node number 2
  • MySQL instance 3: 192.168.1.48; cluster node number 3

1 application instance: 192.168.1.11; MySQL Router + my application.

MySQL Shell is installed on all instances.

Regarding the software versions:


Update 08-08-2018

Note: ALWAYS use the latest version of the Router and the Shell. That is currently the 8.0.x branch, which is of course compatible with MySQL 5.7 InnoDB Cluster.

e.g. MySQL Router 8.0.12 + MySQL Shell 8.0.12 with MySQL 5.7.23

 

As a diagram, it looks (roughly) like this:

MySQL InnoDB Cluster PoC Architecture

 

Checking the instance configuration

The first step is to make sure the MySQL instances are properly configured for MySQL Group Replication, the high-availability layer of our architecture. Note that it is preferable to provision instances that are already properly configured for MySQL Group Replication (as detailed in this article).

The configuration check is done with MySQL Shell and the checkInstanceConfiguration() method:

$ mysqlsh
Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.



mysql-js> dba.checkInstanceConfiguration('root@192.168.1.22:3306')
Please provide the password for 'root@192.168.1.22:3306': 
Validating instance...

The instance '192.168.1.22:3306' is not valid for Cluster usage.

The following issues were encountered:

 - Some configuration options need to be fixed.

+----------------------------------+---------------+----------------+--------------------------------------------------+
| Variable                         | Current Value | Required Value | Note                                             |
+----------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum                  | CRC32         | NONE           | Update the server variable or restart the server |
| enforce_gtid_consistency         | OFF           | ON             | Restart the server                               |
| gtid_mode                        | OFF           | ON             | Restart the server                               |
| log_bin                          | 0             | 1              | Restart the server                               |
| log_slave_updates                | 0             | ON             | Restart the server                               |
| master_info_repository           | FILE          | TABLE          | Restart the server                               |
| relay_log_info_repository        | FILE          | TABLE          | Restart the server                               |
| transaction_write_set_extraction | OFF           | XXHASH64       | Restart the server                               |
+----------------------------------+---------------+----------------+--------------------------------------------------+


Please fix these issues, restart the server and try again.

{
    "config_errors": [
        {
            "action": "server_update", 
            "current": "CRC32", 
            "option": "binlog_checksum", 
            "required": "NONE"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "enforce_gtid_consistency", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "gtid_mode", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "log_bin", 
            "required": "1"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "log_slave_updates", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "FILE", 
            "option": "master_info_repository", 
            "required": "TABLE"
        },
        {
            "action": "restart", 
            "current": "FILE", 
            "option": "relay_log_info_repository", 
            "required": "TABLE"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "transaction_write_set_extraction", 
            "required": "XXHASH64"
        }
    ], 
    "errors": [], 
    "restart_required": true, 
    "status": "error"
}

In my case, with the default installation on Ubuntu, configuration-wise I have everything left to do 🙂

The method returns a JSON document (handy for automation) with the list of tasks to perform in order to be compliant… So let's configure!
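Since the result is a plain document, it can also be captured in a variable and inspected from the Shell's JavaScript mode, which is convenient for scripting. A minimal sketch (the fields are the ones shown in the output above):

mysql-js> var result = dba.checkInstanceConfiguration('root@192.168.1.22:3306')
Please provide the password for 'root@192.168.1.22:3306': 
...
mysql-js> result.status
error
mysql-js> result.restart_required
true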

 

I have two options:

  • 1/ I prepare my instances "manually" (this task can of course be automated) as explained in the article on how to configure a group.
  • 2/ I connect to each instance locally and use the configureLocalInstance() method

Let's go wild! Let's take option 2:

mysql-js> dba.configureLocalInstance('root@192.168.1.22:3306')
Dba.configureLocalInstance: This function only works with local instances (RuntimeError)

Oops!!! dba.configureLocalInstance only works locally, that is, when I am connected on the host machine of the MySQL instance (which is a good thing). So, after logging in to host 192.168.1.22:

sudo mysqlsh
Welcome to MySQL Shell 1.0.9
...

mysql-js> dba.configureLocalInstance('root@localhost:3306')
Please provide the password for 'root@localhost:3306': 

Detecting the configuration file...
Found configuration file at standard location: /etc/mysql/mysql.conf.d/mysqld.cnf
Do you want to modify this file? [Y|n]: Y
Validating instance...

The configuration has been updated but it is required to restart the server.

{
    "config_errors": [
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "enforce_gtid_consistency", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "gtid_mode", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "log_bin", 
            "required": "1"
        },
        {
            "action": "restart", 
            "current": "0", 
            "option": "log_slave_updates", 
            "required": "ON"
        },
        {
            "action": "restart", 
            "current": "FILE", 
            "option": "master_info_repository", 
            "required": "TABLE"
        },
        {
            "action": "restart", 
            "current": "FILE", 
            "option": "relay_log_info_repository", 
            "required": "TABLE"
        },
        {
            "action": "restart", 
            "current": "OFF", 
            "option": "transaction_write_set_extraction", 
            "required": "XXHASH64"
        }
    ], 
    "errors": [], 
    "restart_required": true, 
    "status": "error"
}

Note: Make sure you have the privileges required to update the MySQL configuration file.

The settings added to the configuration file can be found at the end of the file:

# cat /etc/mysql/mysql.conf.d/mysqld.cnf

[mysqld]
... <data previously in the file> ...

log_slave_updates = ON
server_id = 2323701546
relay_log_info_repository = TABLE
master_info_repository = TABLE
transaction_write_set_extraction = XXHASH64
binlog_format = ROW
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
report_port = 3306
binlog_checksum = NONE
enforce_gtid_consistency = ON
log_bin
gtid_mode = ON

192.168.1.22 is now configured. After restarting the MySQL instance, the output of checkInstanceConfiguration is much less alarming:

mysql-js> dba.checkInstanceConfiguration('root@192.168.1.22:3306')
Please provide the password for 'root@192.168.1.22:3306': 
Validating instance...

The instance '192.168.1.22:3306' is valid for Cluster usage
{
    "status": "ok"
}

OK! The member is ready to be part of a group.

The same procedure must be applied to the other MySQL instances.

mysql-js> dba.checkInstanceConfiguration('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306': 
Validating instance...

The instance '192.168.1.50:3306' is valid for Cluster usage
{
    "status": "ok"
}
mysql-js> dba.checkInstanceConfiguration('root@192.168.1.48:3306')
Please provide the password for 'root@192.168.1.48:3306': 
Validating instance...

The instance '192.168.1.48:3306' is valid for Cluster usage
{
    "status": "ok"
}

 

 

Creating the cluster

Once the 3 instances are properly configured, the next step is to create the cluster with createCluster. This method is run on the first member, the MySQL instance on 192.168.1.22, and bootstraps the cluster:

mysql-js> \c root@192.168.1.22
Creating a Session to 'root@192.168.1.22'
Enter password: 
Classic Session successfully established. No default schema selected.


mysql-js> var cluster = dba.createCluster('pocCluster', {ipWhitelist: "192.168.1.0/16"});
A new InnoDB cluster will be created on instance 'root@192.168.1.22:3306'.

Creating InnoDB cluster 'pocCluster' on 'root@192.168.1.22:3306'...
Adding Seed Instance...

Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.

createCluster() takes the cluster name as a parameter (pocCluster). I can also pass it a few optional settings, such as the IP whitelist.

The state of the node within the cluster can be checked with status():

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

Note: Make sure your DNS (or /etc/hosts) is properly configured, otherwise you will run into connection problems…
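For a quick PoC, a few entries in /etc/hosts on every machine (including the Router host) are usually enough. A minimal sketch, with purely hypothetical hostnames:

# /etc/hosts on each node and on 192.168.1.11 (hostnames are illustrative)
192.168.1.22    mysql-node1
192.168.1.50    mysql-node2
192.168.1.48    mysql-node3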

 

The next nodes are added with addInstance(); it is nevertheless advisable to run checkInstanceState() beforehand to make sure the GTID sets are compatible:

Node 2

mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306': 
Analyzing the instance replication state...

The instance '192.168.1.50:3306' is valid for the cluster.
The instance is new to Group Replication.

{
    "reason": "new", 
    "state": "ok"
}


mysql-js> cluster.addInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"})
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@192.168.1.50:3306': 
Adding instance to the cluster ...

The instance 'root@192.168.1.50:3306' was successfully added to the cluster.


mysql-js> cluster.status();
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

If the instance being added contains more transactions than the group, checkInstanceState lets you know:

mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306': 
Analyzing the instance replication state...

The instance '192.168.1.50:3306' is invalid for the cluster.
The instance contains additional transactions in relation to the cluster.

{
    "reason": "diverged", 
    "state": "error"
}

Depending on the context, you then have to either restore a backup of a cluster member onto the problematic instance (the one that diverged), or, if you know what you are doing, a synchronization of the GTIDs is always possible, or even a reset master.
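To see how far the instance has diverged, you can compare the executed GTID sets in SQL; GTID_SUBSET() tells you whether the candidate's transactions are all contained in the group's set. A minimal sketch (the value passed to GTID_SUBSET() is the candidate's gtid_executed, copied by hand):

mysql> SELECT @@GLOBAL.gtid_executed;  -- run on the diverging instance and on a cluster member

mysql> SELECT GTID_SUBSET('<candidate gtid_executed>', @@GLOBAL.gtid_executed);  -- run on a member: 1 = contained, 0 = diverged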

 

Node 3

mysql-js> cluster.checkInstanceState('root@192.168.1.48:3306')
Please provide the password for 'root@192.168.1.48:3306': 
Analyzing the instance replication state...

The instance '192.168.1.48:3306' is valid for the cluster.
The instance is new to Group Replication.

{
    "reason": "new", 
    "state": "ok"
}


mysql-js> cluster.addInstance("root@192.168.1.48:3306", {ipWhitelist: "192.168.1.0/16"})
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@192.168.1.48:3306': 
Adding instance to the cluster ...

The instance 'root@192.168.1.48:3306' was successfully added to the cluster.


mysql-js> cluster.status();
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

Et voilà!

A 3-node MySQL Group Replication cluster deployed with MySQL Shell!

  • Node 1: 192.168.1.22: Primary (read/write)
  • Node 2: 192.168.1.50: Secondary (read-only)
  • Node 3: 192.168.1.48: Secondary (read-only)

 

To make the configuration persistent, i.e. to write it to the configuration file, you must run the dba.configureLocalInstance() method after the node has been configured:

daz@192.168.1.50:~$ grep -c group_replication /etc/mysql/mysql.conf.d/mysqld.cnf
0


daz@192.168.1.50:~$ sudo mysqlsh --uri=root@localhost
Creating a Session to 'root@localhost'
Enter password: 
Classic Session successfully established. No default schema selected.
Welcome to MySQL Shell 1.0.9
...


mysql-js> dba.configureLocalInstance("root@localhost:3306")  
Please provide the password for 'root@localhost:3306': 

Detecting the configuration file...
Found configuration file at standard location: /etc/mysql/mysql.conf.d/mysqld.cnf
Do you want to modify this file? [Y|n]: Y
Validating instance...

The instance 'localhost:3306' is valid for Cluster usage
You can now use it in an InnoDB Cluster.

{
    "status": "ok"
}



daz@192.168.1.50:~$ grep -c group_replication /etc/mysql/mysql.conf.d/mysqld.cnf
32

Note that this operation can only be done locally.

What's next?

 

 

Configuring MySQL Router

MySQL's recommendation is to install MySQL Router on the application host machine, so I will install it on machine 192.168.1.11.

 

Note: If you can't (or don't want to) put MySQL Router on the application host, you will then have to handle the Router's high availability yourself. Several approaches can be considered.

 

 

Bootstrap MySQL Router

daz@192.168.1.11 ~ $ mysqlrouter --bootstrap root@192.168.1.22:3306 --directory RouterPoC
Please enter MySQL password for root: 

Bootstrapping MySQL Router instance at /home/daz/RouterPoC...
MySQL Router  has now been configured for the InnoDB cluster 'pocCluster'.

The following connection information can be used to connect to the cluster.

Classic MySQL protocol connections to cluster 'pocCluster':
- Read/Write Connections: localhost:6446
- Read/Only Connections: localhost:6447

X protocol connections to cluster 'pocCluster':
- Read/Write Connections: localhost:64460
- Read/Only Connections: localhost:64470

4 TCP ports have been configured, including 2 for classic MySQL connections:

  • 6446: reads/writes, routed to the primary node
  • 6447: reads only, routed to the secondary nodes (round-robin)

And their counterparts for X protocol connections (64460 & 64470), for using MySQL as a NoSQL Document Store.
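Once the Router is started (see further below), you can check the routing with a classic client: a connection on port 6446 lands on the primary, while 6447 round-robins over the secondaries. A quick sanity check (adjust the user and credentials to your setup):

daz@192.168.1.11 ~ $ mysql -u root -p -h 127.0.0.1 -P 6446 -e "SELECT @@hostname, @@port"
daz@192.168.1.11 ~ $ mysql -u root -p -h 127.0.0.1 -P 6447 -e "SELECT @@hostname, @@port"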

 

The Router configuration file contains some important information, such as the port(s) to be used by the application:

daz@192.168.1.11 ~ $ cat ~/RouterPoC/mysqlrouter.conf
# File automatically generated during MySQL Router bootstrap
[DEFAULT]
logging_folder=/home/daz/RouterPoC/log
runtime_folder=/home/daz/RouterPoC/run
data_folder=/home/daz/RouterPoC/data
keyring_path=/home/daz/RouterPoC/data/keyring
master_key_path=/home/daz/RouterPoC/mysqlrouter.key

[logger]
level = INFO

[metadata_cache:pocCluster]
router_id=3
bootstrap_server_addresses=mysql://192.168.1.22:3306,mysql://192.168.1.50:3306,mysql://192.168.1.48:3306
user=mysql_router3_qeteek4wi41s
metadata_cluster=pocCluster
ttl=300

[routing:pocCluster_default_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://pocCluster/default?role=PRIMARY
mode=read-write
protocol=classic

[routing:pocCluster_default_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://pocCluster/default?role=SECONDARY
mode=read-only
protocol=classic

[routing:pocCluster_default_x_rw]
bind_address=0.0.0.0
bind_port=64460
destinations=metadata-cache://pocCluster/default?role=PRIMARY
mode=read-write
protocol=x

[routing:pocCluster_default_x_ro]
bind_address=0.0.0.0
bind_port=64470
destinations=metadata-cache://pocCluster/default?role=SECONDARY
mode=read-only
protocol=x

It is of course possible to modify this file. For example, applications often connect to port 3306, so it can make sense to change the primary node port from 6446 to 3306:

daz@192.168.1.11 ~ $ grep -A6 pocCluster_default_rw ~/RouterPoC/mysqlrouter.conf
[routing:pocCluster_default_rw]
bind_address=0.0.0.0
bind_port=3306
destinations=metadata-cache://pocCluster/default?role=PRIMARY
mode=read-write
protocol=classic

Your application will therefore (keep on) connecting to port 3306, except that there is now not 1 instance but indeed 3 instances in the backend, and this completely transparently.

 

Next, MySQL Router must be started with the start.sh script

daz@192.168.1.11 ~ $ ~/RouterPoC/start.sh

The Router is stopped with the stop.sh script (but you had guessed that)

daz@192.168.1.11 ~ $ ~/RouterPoC/stop.sh

 

 

Node management

A few commands that will make your life easier…

 

Retrieving a cluster's metadata

The cluster metadata is stored on the members in the mysql_innodb_cluster_metadata schema:

mysql> SHOW SCHEMAS;
+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
+-------------------------------+
5 rows in set (0,00 sec)


mysql> SHOW TABLES IN mysql_innodb_cluster_metadata;
+-----------------------------------------+
| Tables_in_mysql_innodb_cluster_metadata |
+-----------------------------------------+
| clusters                                |
| hosts                                   |
| instances                               |
| replicasets                             |
| routers                                 |
| schema_version                          |
+-----------------------------------------+
6 rows in set (0,01 sec)


mysql> SELECT cluster_name FROM mysql_innodb_cluster_metadata.clusters;
+--------------+
| cluster_name |
+--------------+
| pocCluster   |
+--------------+

To retrieve this information in a new session, you must use the getCluster method:

mysql-js> cluster.status()
ReferenceError: cluster is not defined


mysql-js> var cluster = dba.getCluster('pocCluster')


mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

 

 

Failover

Failover at the database level (change of primary) is automatically handled by the cluster members among themselves.
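Under the hood this is Group Replication's primary election. You can also watch the group's membership from SQL through Performance Schema, on any surviving member (a quick check, not required for the failover itself):

mysql> SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE FROM performance_schema.replication_group_members;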

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

Crash of the primary node (192.168.1.22)…

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.48:3306", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

New primary elected by the group: 192.168.1.48.

And 192.168.1.22 is reported missing (MIA).

 

A rejoinInstance() is needed to put the member back into the cluster. It will take on a secondary role (see the next section, Putting a member back into the group).

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.48:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

 

 

Putting a member back into the group

A member whose MySQL instance is running and properly configured can be put back into the group (after an instance restart, for example) with the rejoinInstance() command:

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }
        }
    }
}


mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306': 
Analyzing the instance replication state...

The instance '192.168.1.50:3306' is valid for the cluster.
The instance is fully recoverable.

{
    "reason": "recoverable", 
    "state": "ok"
}


mysql-js> cluster.rejoinInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"})
Rejoining the instance to the InnoDB cluster. Depending on the original
problem that made the instance unavailable, the rejoin operation might not be
successful and further manual steps will be needed to fix the underlying
problem.

Please monitor the output of the rejoin operation and take necessary action if
the instance cannot rejoin.

Please provide the password for 'root@192.168.1.50:3306': 
Rejoining instance to the cluster ...

The instance 'root@192.168.1.50:3306' was successfully rejoined on the cluster.

The instance '192.168.1.50:3306' was successfully added to the MySQL Cluster.


mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

 

 

Removing an instance from the group

Unsurprisingly, it's the removeInstance command

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}


mysql-js> cluster.removeInstance("root@192.168.1.50:3306")
The instance will be removed from the InnoDB cluster. Depending on the 
instance being the Seed or not, the Metadata session might become invalid. 
If so, please start a new session to the Metadata Storage R/W instance.

The instance 'root@192.168.1.50:3306' was successfully removed from the cluster.


mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

The instance is then no longer listed in the metadata:

mysql-js> \sql
Switching to SQL mode... Commands end with ;


mysql-sql> SELECT instance_name FROM mysql_innodb_cluster_metadata.instances;
+-------------------+
| instance_name     |
+-------------------+
| 192.168.1.22:3306 |
| 192.168.1.48:3306 |
+-------------------+
2 rows in set (0.00 sec)

To put it back into the group, you therefore have to replay the instance-addition process seen above:

mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306': 
Analyzing the instance replication state...

The instance '192.168.1.50:3306' is valid for the cluster.
The instance is fully recoverable.

{
    "reason": "recoverable", 
    "state": "ok"
}


mysql-js> cluster.addInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"})
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@192.168.1.50:3306': 
Adding instance to the cluster ...

The instance 'root@192.168.1.50:3306' was successfully added to the cluster.


mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

 

 

Quorum loss

If the cluster loses more than half of its members, it ends up in a rather unpleasant state, network partitioning; in plain terms, an intervention from outside the group is required so that the remaining members can keep doing their job as MySQL servers.

In our case, with 3 instances, 2 of them have to be lost at once:

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

Loss of nodes (crash) 192.168.1.22 & 192.168.1.50… (Mayday, Mayday, Mayday!!!)

mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.22:3306", 
        "status": "NO_QUORUM", 
        "statusText": "Cluster has no quorum as visible from 'localhost:3306' and cannot process write transactions. 2 members are not active", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "UNREACHABLE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "UNREACHABLE"
            }
        }
    }
}

Automatic failover cannot kick in; the surviving member (192.168.1.48) is in read-only mode. So we have to intervene:

$ mysqlsh --uri=root@192.168.1.48
Creating a Session to 'root@192.168.1.48'
...

mysql-js> var cluster = dba.getCluster("pocCluster")
WARNING: The session is on a Read Only instance.
         Write operations on the InnoDB cluster will not be allowed


mysql-js> cluster.forceQuorumUsingPartitionOf('192.168.1.48:3306')
Restoring replicaset 'default' from loss of quorum, by using the partition composed of [192.168.1.48:3306]

Please provide the password for 'root@192.168.1.48:3306': 
Restoring the InnoDB cluster ...

The InnoDB cluster was successfully restored using the partition from the instance 'root@192.168.1.48:3306'.

WARNING: To avoid a split-brain scenario, ensure that all other members of the replicaset are removed or joined back to the group that was restored.


mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.48:3306", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }
        }
    }
}

Obviously, unless you like to gamble 🙂, you should avoid staying in this state for too long.

Once the instances are back in shape, use rejoinInstance() to bring the members back into the cluster as secondaries (see the section Putting a member back into the group).
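A minimal sketch, reusing the same whitelist as earlier (assuming both crashed instances are up and properly configured again):

mysql-js> cluster.rejoinInstance("root@192.168.1.22:3306", {ipWhitelist: "192.168.1.0/16"})
mysql-js> cluster.rejoinInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"})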

 

 

Restarting after a complete cluster outage

Losing quorum is one thing, but there is worse: losing all the nodes…

In the case of a complete cluster outage, i.e. all instances are down, once the MySQL instances have been started again you must use rebootClusterFromCompleteOutage():

$ mysqlsh --uri=root@192.168.1.48
Creating a Session to 'root@192.168.1.48'
...

mysql-js> var cluster = dba.rebootClusterFromCompleteOutage('pocCluster')
Reconfiguring the cluster 'pocCluster' from complete outage...

The instance '192.168.1.22:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y|N]: y

The instance '192.168.1.50:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y|N]: y


The cluster was successfully rebooted.


mysql-js> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "192.168.1.48:3306", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.48:3306": {
                "address": "192.168.1.48:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "192.168.1.50:3306": {
                "address": "192.168.1.50:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }
}

The member on which the command is executed is the new primary.

 


 

Thanks for using MySQL!

 
