Tutorial – Deploying MySQL InnoDB Cluster
The article you are about to read is now obsolete 🙁 Ohhhhhh!
But don't worry, it has been replaced by a brand new article, all shiny and, above all, more complete and up to date 🙂 Ahhhhh!!
In the previous episodes we saw how to deploy MySQL Group Replication "manually", how to understand and test MySQL InnoDB Cluster, and how to easily manage an already deployed Group Replication cluster with MySQL Shell.
Today, in the High Availability with MySQL series, we are going to see how to deploy and manage a cluster from scratch, in tutorial form, using the all-in-one solution: MySQL InnoDB Cluster.
Note:
This article covers MySQL InnoDB Cluster, the native HA solution of MySQL Server (the Group Replication plugin) with InnoDB as the storage engine, not to be confused with MySQL NDB Cluster (the NDB storage engine).
The context
3 standalone MySQL instances, <spoiler alert> which, thanks to the MySQL Group Replication plugin, will turn into a distributed database.</spoiler alert>
- MySQL instance 1: 192.168.1.22; cluster node number 1
- MySQL instance 2: 192.168.1.50; cluster node number 2
- MySQL instance 3: 192.168.1.48; cluster node number 3
1 application instance: 192.168.1.11; MySQL Router + my application.
MySQL Shell is installed on all instances.
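If it is not there yet, on Ubuntu the installation could look like this (a sketch assuming the MySQL APT repository is already configured; the package names come from that repository, not from this article):

# On the three MySQL nodes
$ sudo apt-get install mysql-shell

# On the application host (192.168.1.11), the Router is needed as well
$ sudo apt-get install mysql-shell mysql-router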
Regarding the software versions:
—
Update 08-08-2018
Note: ALWAYS use the latest version of the Router and the Shell. That is currently the 8.0.x branch, which is of course compatible with MySQL 5.7 InnoDB Cluster.
e.g. MySQL Router 8.0.12 + MySQL Shell 8.0.12 with MySQL 5.7.23
—
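To see which versions you actually have installed, the usual version flags work (assuming the binaries are on the PATH of the relevant hosts):

# On the database nodes
$ mysqld --version
$ mysqlsh --version

# On the application host
$ mysqlrouter --version
$ mysqlsh --version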
As a diagram, it looks (more or less) like this:
Checking the instance configuration
The first step is to make sure the MySQL instances are correctly configured for MySQL Group Replication, the high-availability layer of our architecture. Note that it is preferable to provision your instances already correctly configured for MySQL Group Replication (as detailed in this article).
The configuration check is done with MySQL Shell and the checkInstanceConfiguration() method:
$ mysqlsh
Welcome to MySQL Shell 1.0.9

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help', '\h' or '\?' for help, type '\quit' or '\q' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.

mysql-js> dba.checkInstanceConfiguration('root@192.168.1.22:3306')
Please provide the password for 'root@192.168.1.22:3306':
Validating instance...

The instance '192.168.1.22:3306' is not valid for Cluster usage.
The following issues were encountered:

- Some configuration options need to be fixed.

+----------------------------------+---------------+----------------+--------------------------------------------------+
| Variable                         | Current Value | Required Value | Note                                             |
+----------------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum                  | CRC32         | NONE           | Update the server variable or restart the server |
| enforce_gtid_consistency         | OFF           | ON             | Restart the server                               |
| gtid_mode                        | OFF           | ON             | Restart the server                               |
| log_bin                          | 0             | 1              | Restart the server                               |
| log_slave_updates                | 0             | ON             | Restart the server                               |
| master_info_repository           | FILE          | TABLE          | Restart the server                               |
| relay_log_info_repository        | FILE          | TABLE          | Restart the server                               |
| transaction_write_set_extraction | OFF           | XXHASH64       | Restart the server                               |
+----------------------------------+---------------+----------------+--------------------------------------------------+

Please fix these issues, restart the server and try again.

{
    "config_errors": [
        {
            "action": "server_update",
            "current": "CRC32",
            "option": "binlog_checksum",
            "required": "NONE"
        },
        {
            "action": "restart",
            "current": "OFF",
            "option": "enforce_gtid_consistency",
            "required": "ON"
        },
        {
            "action": "restart",
            "current": "OFF",
            "option": "gtid_mode",
            "required": "ON"
        },
        {
            "action": "restart",
            "current": "0",
            "option": "log_bin",
            "required": "1"
        },
        {
            "action": "restart",
            "current": "0",
            "option": "log_slave_updates",
            "required": "ON"
        },
        {
            "action": "restart",
            "current": "FILE",
            "option": "master_info_repository",
            "required": "TABLE"
        },
        {
            "action": "restart",
            "current": "FILE",
            "option": "relay_log_info_repository",
            "required": "TABLE"
        },
        {
            "action": "restart",
            "current": "OFF",
            "option": "transaction_write_set_extraction",
            "required": "XXHASH64"
        }
    ],
    "errors": [],
    "restart_required": true,
    "status": "error"
}
In my case, with the default installation on Ubuntu, configuration-wise I have everything left to do 🙂
The method returns a JSON document (handy for automation) with the list of tasks to perform to become compliant... So let's configure!
I have two options:
- 1/ prepare my instances "manually" (this task can of course be automated), as explained in the article on how to configure a group.
- 2/ connect to each instance locally and use the configureLocalInstance() method
Let's go wild! Option 2 it is:
mysql-js> dba.configureLocalInstance('root@192.168.1.22:3306')
Dba.configureLocalInstance: This function only works with local instances (RuntimeError)
Oops!!! dba.configureLocalInstance only works locally, i.e. when connected to the host machine of the MySQL instance (which is a good thing). So after logging in to host 192.168.1.22:
sudo mysqlsh
Welcome to MySQL Shell 1.0.9
...
mysql-js> dba.configureLocalInstance('root@localhost:3306')
Please provide the password for 'root@localhost:3306':
Detecting the configuration file...
Found configuration file at standard location: /etc/mysql/mysql.conf.d/mysqld.cnf
Do you want to modify this file? [Y|n]: Y
Validating instance...

The configuration has been updated but it is required to restart the server.
{
    "config_errors": [
        {
            "action": "restart",
            "current": "OFF",
            "option": "enforce_gtid_consistency",
            "required": "ON"
        },
        {
            "action": "restart",
            "current": "OFF",
            "option": "gtid_mode",
            "required": "ON"
        },
        {
            "action": "restart",
            "current": "0",
            "option": "log_bin",
            "required": "1"
        },
        {
            "action": "restart",
            "current": "0",
            "option": "log_slave_updates",
            "required": "ON"
        },
        {
            "action": "restart",
            "current": "FILE",
            "option": "master_info_repository",
            "required": "TABLE"
        },
        {
            "action": "restart",
            "current": "FILE",
            "option": "relay_log_info_repository",
            "required": "TABLE"
        },
        {
            "action": "restart",
            "current": "OFF",
            "option": "transaction_write_set_extraction",
            "required": "XXHASH64"
        }
    ],
    "errors": [],
    "restart_required": true,
    "status": "error"
}
Note: Make sure you have the privileges required to update the MySQL configuration file.
The settings added to the configuration file can be found at the end of the file:
# cat /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
...
<data previously in the file>
...
log_slave_updates = ON
server_id = 2323701546
relay_log_info_repository = TABLE
master_info_repository = TABLE
transaction_write_set_extraction = XXHASH64
binlog_format = ROW
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
report_port = 3306
binlog_checksum = NONE
enforce_gtid_consistency = ON
log_bin
gtid_mode = ON
192.168.1.22 is configured. After restarting the MySQL instance, the output of checkInstanceConfiguration is much less alarming:
mysql-js> dba.checkInstanceConfiguration('root@192.168.1.22:3306')
Please provide the password for 'root@192.168.1.22:3306':
Validating instance...

The instance '192.168.1.22:3306' is valid for Cluster usage
{
    "status": "ok"
}
OK! The member is ready to be part of a group.
The same procedure must be applied to the other MySQL instances.
…
mysql-js> dba.checkInstanceConfiguration('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306':
Validating instance...

The instance '192.168.1.50:3306' is valid for Cluster usage
{
    "status": "ok"
}

mysql-js> dba.checkInstanceConfiguration('root@192.168.1.48:3306')
Please provide the password for 'root@192.168.1.48:3306':
Validating instance...

The instance '192.168.1.48:3306' is valid for Cluster usage
{
    "status": "ok"
}
Creating the cluster
Once the 3 instances are correctly configured, the next step is to create the cluster with createCluster. This method is run on the first member, the MySQL instance on 192.168.1.22, and it bootstraps the cluster:
mysql-js> \c root@192.168.1.22
Creating a Session to 'root@192.168.1.22'
Enter password:
Classic Session successfully established. No default schema selected.

mysql-js> var cluster = dba.createCluster('pocCluster', {ipWhitelist: "192.168.1.0/16"});
A new InnoDB cluster will be created on instance 'root@192.168.1.22:3306'.

Creating InnoDB cluster 'pocCluster' on 'root@192.168.1.22:3306'...
Adding Seed Instance...

Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.
createCluster() takes the cluster name (pocCluster) as a parameter. I can also pass it some optional settings, such as the whitelist.
The node's state in the cluster can be checked with status():
mysql-js> cluster.status()
{
    "clusterName": "pocCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "192.168.1.22:3306",
        "status": "OK_NO_TOLERANCE",
        "statusText": "Cluster is NOT tolerant to any failures.",
        "topology": {
            "192.168.1.22:3306": {
                "address": "192.168.1.22:3306",
                "mode": "R/W",
                "readReplicas": {},
                "role": "HA",
                "status": "ONLINE"
            }
        }
    }
}
Note: Make sure your DNS (or /etc/hosts) is correctly configured, otherwise you will run into connection problems...
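For example, a minimal /etc/hosts on each machine could look like this (only the IP addresses come from this setup; the host names are made up for illustration):

# /etc/hosts – illustrative entries, host names are hypothetical
192.168.1.22   mysql-node1
192.168.1.50   mysql-node2
192.168.1.48   mysql-node3
192.168.1.11   app-router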
The next nodes are added with addInstance(); it is nevertheless advisable to run checkInstanceState() beforehand to make sure the GTID sets are compatible:
Node 2
mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306') Please provide the password for 'root@192.168.1.50:3306': Analyzing the instance replication state... The instance '192.168.1.50:3306' is valid for the cluster. The instance is new to Group Replication. { "reason": "new", "state": "ok" } mysql-js> cluster.addInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"}) A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours. Please provide the password for 'root@192.168.1.50:3306': Adding instance to the cluster ... The instance 'root@192.168.1.50:3306' was successfully added to the cluster. mysql-js> cluster.status(); { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
If the instance being added contains more transactions than the group, checkInstanceState lets you know:
mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306')
Please provide the password for 'root@192.168.1.50:3306':
Analyzing the instance replication state...

The instance '192.168.1.50:3306' is invalid for the cluster.
The instance contains additional transactions in relation to the cluster.

{
    "reason": "diverged",
    "state": "error"
}
Depending on the context, you then have to either restore a backup of a cluster member onto the problem instance (the one that diverged) or, if you know what you are doing, synchronize the GTIDs, or even run a RESET MASTER.
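As a starting point, you can compare the executed GTID sets before deciding what to do (a sketch run from a SQL session, e.g. after \sql in MySQL Shell; the placeholder '<diverged_gtid_set>' is to be pasted in by hand):

-- On a cluster member and on the diverged instance, look at what has been executed
mysql-sql> SELECT @@GLOBAL.gtid_executed;

-- On a cluster member, check whether the diverged instance's set is contained in the cluster's set
mysql-sql> SELECT GTID_SUBSET('<diverged_gtid_set>', @@GLOBAL.gtid_executed);

-- Last resort, ONLY if the extra transactions can safely be thrown away:
-- this wipes the binary logs and the GTID history of the diverged instance
mysql-sql> RESET MASTER;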
Node 3
mysql-js> cluster.checkInstanceState('root@192.168.1.48:3306') Please provide the password for 'root@192.168.1.48:3306': Analyzing the instance replication state... The instance '192.168.1.48:3306' is valid for the cluster. The instance is new to Group Replication. { "reason": "new", "state": "ok" } mysql-js> cluster.addInstance("root@192.168.1.48:3306", {ipWhitelist: "192.168.1.0/16"}) A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours. Please provide the password for 'root@192.168.1.48:3306': Adding instance to the cluster ... The instance 'root@192.168.1.48:3306' was successfully added to the cluster. mysql-js> cluster.status(); { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Et voilà!
A 3-node MySQL Group Replication cluster deployed with MySQL Shell!
- Node 1: 192.168.1.22: Primary (read/write)
- Node 2: 192.168.1.50: Secondary (read-only)
- Node 3: 192.168.1.48: Secondary (read-only)
To make the configuration persistent, i.e. to write it to the configuration file, you must run the dba.configureLocalInstance() method once the node has been configured:
daz@192.168.1.50:~$ grep -c group_replication /etc/mysql/mysql.conf.d/mysqld.cnf
0

daz@192.168.1.50:~$ sudo mysqlsh --uri=root@localhost
Creating a Session to 'root@localhost'
Enter password:
Classic Session successfully established. No default schema selected.
Welcome to MySQL Shell 1.0.9
...
mysql-js> dba.configureLocalInstance("root@localhost:3306")
Please provide the password for 'root@localhost:3306':
Detecting the configuration file...
Found configuration file at standard location: /etc/mysql/mysql.conf.d/mysqld.cnf
Do you want to modify this file? [Y|n]: Y
Validating instance...

The instance 'localhost:3306' is valid for Cluster usage
You can now use it in an InnoDB Cluster.

{
    "status": "ok"
}

daz@192.168.1.50:~$ grep -c group_replication /etc/mysql/my.cnf
32
Note that this operation can only be done locally.
What's next?
Configuring MySQL Router
MySQL's recommendation is to install MySQL Router on the application host machine, so I am going to install it on 192.168.1.11.
Note: If you cannot (or do not want to) put MySQL Router next to the application, you will then have to handle the Router's own HA. Several approaches are possible.
Bootstrap MySQL Router
daz@192.168.1.11 ~ $ mysqlrouter --bootstrap root@192.168.1.22:3306 --directory RouterPoC
Please enter MySQL password for root:

Bootstrapping MySQL Router instance at /home/daz/RouterPoC...
MySQL Router has now been configured for the InnoDB cluster 'pocCluster'.

The following connection information can be used to connect to the cluster.

Classic MySQL protocol connections to cluster 'pocCluster':
- Read/Write Connections: localhost:6446
- Read/Only Connections: localhost:6447

X protocol connections to cluster 'pocCluster':
- Read/Write Connections: localhost:64460
- Read/Only Connections: localhost:64470
Four TCP ports have been configured, two of which are for classic MySQL connections:
6446: reads/writes, routed to the primary node
6447: reads only, routed to the secondary nodes (round-robin)
And their counterparts for X protocol connections (64460 & 64470), for using MySQL as a NoSQL document store.
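As a quick sanity check, you can point a plain mysql client at the Router ports from the application host (a sketch; @@hostname tells you which backend answered, @@read_only whether it accepts writes):

# Writes go to the primary through port 6446
$ mysql -h 127.0.0.1 -P 6446 -u root -p -e "SELECT @@hostname, @@read_only;"

# Reads are balanced round-robin across the secondaries through port 6447
$ mysql -h 127.0.0.1 -P 6447 -u root -p -e "SELECT @@hostname, @@read_only;"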
The Router configuration file contains some important information, such as the port(s) the application should use:
daz@192.168.1.11 ~ $ cat ~/RouterPoC/mysqlrouter.conf
# File automatically generated during MySQL Router bootstrap
[DEFAULT]
logging_folder=/home/daz/RouterPoC/log
runtime_folder=/home/daz/RouterPoC/run
data_folder=/home/daz/RouterPoC/data
keyring_path=/home/daz/RouterPoC/data/keyring
master_key_path=/home/daz/RouterPoC/mysqlrouter.key

[logger]
level = INFO

[metadata_cache:pocCluster]
router_id=3
bootstrap_server_addresses=mysql://192.168.1.22:3306,mysql://192.168.1.50:3306,mysql://192.168.1.48:3306
user=mysql_router3_qeteek4wi41s
metadata_cluster=pocCluster
ttl=300

[routing:pocCluster_default_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://pocCluster/default?role=PRIMARY
mode=read-write
protocol=classic

[routing:pocCluster_default_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://pocCluster/default?role=SECONDARY
mode=read-only
protocol=classic

[routing:pocCluster_default_x_rw]
bind_address=0.0.0.0
bind_port=64460
destinations=metadata-cache://pocCluster/default?role=PRIMARY
mode=read-write
protocol=x

[routing:pocCluster_default_x_ro]
bind_address=0.0.0.0
bind_port=64470
destinations=metadata-cache://pocCluster/default?role=SECONDARY
mode=read-only
protocol=x
This file can of course be modified. For example, applications often connect to port 3306, so it can make sense to change the primary node's port from 6446 to 3306:
daz@192.168.1.11 ~ $ grep -A6 pocCluster_default_rw ~/RouterPoC/mysqlrouter.conf
[routing:pocCluster_default_rw]
bind_address=0.0.0.0
bind_port=3306
destinations=metadata-cache://pocCluster/default?role=PRIMARY
mode=read-write
protocol=classic
Your application will therefore (continue to) connect to port 3306, except that now there is not 1 instance but indeed 3 instances in the backend, completely transparently.
Next, start MySQL Router with the start.sh script:
daz@192.168.1.11 ~ $ ~/RouterPoC/start.sh
Stopping the Router is done with the stop.sh script (but you had guessed that):
daz@192.168.1.11 ~ $ ~/RouterPoC/stop.sh
Managing the nodes
A few commands that will make your life easier...
Retrieving a cluster's metadata
The cluster metadata is stored on the members in the mysql_innodb_cluster_metadata schema:
mysql> SHOW SCHEMAS;
+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
+-------------------------------+
5 rows in set (0,00 sec)

mysql> SHOW TABLES IN mysql_innodb_cluster_metadata;
+-----------------------------------------+
| Tables_in_mysql_innodb_cluster_metadata |
+-----------------------------------------+
| clusters                                |
| hosts                                   |
| instances                               |
| replicasets                             |
| routers                                 |
| schema_version                          |
+-----------------------------------------+
6 rows in set (0,01 sec)

mysql> SELECT cluster_name FROM mysql_innodb_cluster_metadata.clusters;
+--------------+
| cluster_name |
+--------------+
| pocCluster   |
+--------------+
To retrieve this information in a new session, use the getCluster method:
mysql-js> cluster.status() ReferenceError: cluster is not defined mysql-js> var cluster = dba.getCluster('pocCluster') mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Failover
Database-level failover (switching the primary) is handled automatically by the cluster members among themselves.
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Crash of the primary node (192.168.1.22)...
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.48:3306", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "(MISSING)" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
New primary elected by the group: 192.168.1.48.
And 192.168.1.22 is missing in action (MIA).
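Besides cluster.status(), you can also confirm who the primary is directly in SQL on any surviving member (a sketch, valid for single-primary Group Replication on MySQL 5.7):

-- The primary's UUID is exposed as a status variable; join it back to the members table
mysql-sql> SELECT member_host, member_port
           FROM performance_schema.replication_group_members
           WHERE member_id = (SELECT variable_value
                              FROM performance_schema.global_status
                              WHERE variable_name = 'group_replication_primary_member');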
A rejoinInstance() is needed to bring the member back into the cluster. It will then act as a secondary (see the next section, Bringing a member back into the group).
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.48:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Bringing a member back into the group
A member whose MySQL instance is running and correctly configured can be put back into the group (after an instance restart, for example) with the rejoinInstance() command:
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "(MISSING)" } } } } mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306') Please provide the password for 'root@192.168.1.50:3306': Analyzing the instance replication state... The instance '192.168.1.50:3306' is valid for the cluster. The instance is fully recoverable. { "reason": "recoverable", "state": "ok" } mysql-js> cluster.rejoinInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"}) Rejoining the instance to the InnoDB cluster. Depending on the original problem that made the instance unavailable, the rejoin operation might not be successful and further manual steps will be needed to fix the underlying problem. Please monitor the output of the rejoin operation and take necessary action if the instance cannot rejoin. Please provide the password for 'root@192.168.1.50:3306': Rejoining instance to the cluster ... The instance 'root@192.168.1.50:3306' was successfully rejoined on the cluster. The instance '192.168.1.50:3306' was successfully added to the MySQL Cluster. mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Removing an instance from the group
Unsurprisingly, that is the removeInstance command:
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } } mysql-js> cluster.removeInstance("root@192.168.1.50:3306") The instance will be removed from the InnoDB cluster. Depending on the instance being the Seed or not, the Metadata session might become invalid. If so, please start a new session to the Metadata Storage R/W instance. The instance 'root@192.168.1.50:3306' was successfully removed from the cluster. mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
The instance is then no longer listed in the metadata:
mysql-js> \sql
Switching to SQL mode... Commands end with ;

mysql-sql> SELECT instance_name FROM mysql_innodb_cluster_metadata.instances;
+-------------------+
| instance_name     |
+-------------------+
| 192.168.1.22:3306 |
| 192.168.1.48:3306 |
+-------------------+
2 rows in set (0.00 sec)
To put it back into the group, you have to replay the instance-addition process described above:
mysql-js> cluster.checkInstanceState('root@192.168.1.50:3306') Please provide the password for 'root@192.168.1.50:3306': Analyzing the instance replication state... The instance '192.168.1.50:3306' is valid for the cluster. The instance is fully recoverable. { "reason": "recoverable", "state": "ok" } mysql-js> cluster.addInstance("root@192.168.1.50:3306", {ipWhitelist: "192.168.1.0/16"}) A new instance will be added to the InnoDB cluster. Depending on the amount of data on the cluster this might take from a few seconds to several hours. Please provide the password for 'root@192.168.1.50:3306': Adding instance to the cluster ... The instance 'root@192.168.1.50:3306' was successfully added to the cluster. mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Losing quorum
If the cluster loses more than half of its members, it ends up in a rather unpleasant state: network partitioning. In plain terms, an intervention from outside the group is needed so that the remaining members can keep doing their job as MySQL servers.
With n members, the group needs a majority of floor(n/2) + 1 members to keep quorum; in our case, with 3 instances, that majority is 2, so 2 nodes have to be lost at once:
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
Loss (crash) of nodes 192.168.1.22 & 192.168.1.50... (Mayday, Mayday, Mayday!!!)
mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.22:3306", "status": "NO_QUORUM", "statusText": "Cluster has no quorum as visible from 'localhost:3306' and cannot process write transactions. 2 members are not active", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "UNREACHABLE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "UNREACHABLE" } } } }
Automatic failover cannot kick in, and the surviving member (192.168.1.48) is read-only. So we have to step in:
$ mysqlsh --uri=root@192.168.1.48 Creating a Session to 'root@192.168.1.48' ... mysql-js> var cluster = dba.getCluster("pocCluster") WARNING: The session is on a Read Only instance. Write operations on the InnoDB cluster will not be allowed mysql-js> cluster.forceQuorumUsingPartitionOf('192.168.1.48:3306') Restoring replicaset 'default' from loss of quorum, by using the partition composed of [192.168.1.48:3306] Please provide the password for 'root@192.168.1.48:3306': Restoring the InnoDB cluster ... The InnoDB cluster was successfully restored using the partition from the instance 'root@192.168.1.48:3306'. WARNING: To avoid a split-brain scenario, ensure that all other members of the replicaset are removed or joined back to the group that was restored. mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.48:3306", "status": "OK_NO_TOLERANCE", "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "(MISSING)" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "(MISSING)" } } } }
Obviously, unless you like to live dangerously 🙂, you should avoid staying in this state for too long.
Once the instances are back in shape, use rejoinInstance() to bring the members back into the cluster as secondaries (see the section Bringing a member back into the group).
Recovering from a complete cluster outage
Losing quorum is one thing, but there is worse: losing all the nodes...
In case of a complete cluster outage, i.e. all instances are down, once the MySQL instances have been restarted you must use rebootClusterFromCompleteOutage():
$ mysqlsh --uri=root@192.168.1.48 Creating a Session to 'root@mysql_node1' ... mysql-js> var cluster = dba.rebootClusterFromCompleteOutage('pocCluster') Reconfiguring the cluster 'pocCluster' from complete outage... The instance '192.168.1.22:3306' was part of the cluster configuration. Would you like to rejoin it to the cluster? [y|N]: y The instance '192.168.1.50:3306' was part of the cluster configuration. Would you like to rejoin it to the cluster? [y|N]: y The cluster was successfully rebooted. mysql-js> cluster.status() { "clusterName": "pocCluster", "defaultReplicaSet": { "name": "default", "primary": "192.168.1.48:3306", "status": "OK", "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", "topology": { "192.168.1.22:3306": { "address": "192.168.1.22:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.48:3306": { "address": "192.168.1.48:3306", "mode": "R/W", "readReplicas": {}, "role": "HA", "status": "ONLINE" }, "192.168.1.50:3306": { "address": "192.168.1.50:3306", "mode": "R/O", "readReplicas": {}, "role": "HA", "status": "ONLINE" } } } }
The member on which the command is run becomes the new primary.
On the same topic:
- Tutoriel – Déployer MySQL 5.7 InnoDB Cluster
- Déployer un cluster MySQL Group Replication
- Déployer MySQL InnoDB Cluster avec Docker
- FAQ Webinar MySQL Group Replication
- Tester MySQL InnoDB Cluster
- Configurer ProxySQL pour MySQL Group Replication
- Doc – MySQL Group Replication
- Doc – MySQL InnoDB Cluster
Thanks for using MySQL!