Tutorial – Deploying MySQL 8.0 InnoDB Cluster (09-2019)

September 3, 2019
Rocher du Diamant - Martinique by Olivier DASINI

It has now been a year since I wrote v1 of this tutorial on MySQL InnoDB Cluster (here), and during that time my Oracle MySQL colleagues have worked very hard to enrich the solution and develop quite a few new features.
Nothing but good stuff, in fact 🙂 thanks to you and your feedback. Thank you so much!

In short, you get it: it's time to update this tutorial.
To be clear, this article replaces and supersedes the old one.

I invite you to spend some time with me discovering the various steps of deploying a MySQL InnoDB Cluster, using MySQL Router, and also the main commands for managing your cluster with MySQL Shell.


High Availability with MySQL is on the list of my customers' essential needs, alongside monitoring and backup/restore processes.

So in this article we'll see, in tutorial form, how to deploy and manage a MySQL cluster from scratch, using the native all-in-one HA solution: MySQL InnoDB Cluster.

Note
This article covers MySQL InnoDB Cluster, the native HA of MySQL Server (Group Replication plugin), with InnoDB as the storage engine.
Not to be confused with MySQL NDB Cluster (NDB storage engine).

The context

3 standalone MySQL instances (I can have up to 9),
<spoiler alert> which, thanks to the MySQL Group Replication plugin, will turn into a distributed database.</spoiler alert>

  • MySQL instance #1 – mysql_node1 – cluster node 1
  • MySQL instance #2 – mysql_node2 – cluster node 2
  • MySQL instance #3 – mysql_node3 – cluster node 3

1 application instance – app1 – made up of MySQL Router + my application.
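As a side note, the 9-member limit and the fault tolerance reported later by cluster.status() both follow from Group Replication's majority-based consensus: a group of N members tolerates floor((N-1)/2) simultaneous failures. A quick sketch of that arithmetic:

```shell
# Majority-based fault tolerance of a Group Replication group:
# the group keeps quorum as long as a majority of members survives,
# i.e. a group of N members tolerates floor((N-1)/2) failures.
for n in 3 5 7 9; do
  echo "group of $n members tolerates $(( (n - 1) / 2 )) failure(s)"
done
```

So 3 members tolerate exactly one failure, which is why 3 is the smallest useful cluster size.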

Note:
I use the MySQL Server Docker image supported by Oracle's MySQL team.

$ docker ps
CONTAINER ID        IMAGE                       COMMAND                  CREATED             STATUS                   PORTS                 NAMES
e65479bbc98c        mysql/mysql-server:latest   "/entrypoint.sh mysq…"   9 minutes ago       Up 9 minutes (healthy)   3306/tcp, 33060/tcp   mysql_node3
bc140052242c        mysql/mysql-server:latest   "/entrypoint.sh mysq…"   9 minutes ago       Up 9 minutes (healthy)   3306/tcp, 33060/tcp   mysql_node2
9ae71ad1cfce        mysql/mysql-server:latest   "/entrypoint.sh mysq…"   9 minutes ago       Up 9 minutes (healthy)   3306/tcp, 33060/tcp   mysql_node1

Note:
I don't cover MySQL Router redundancy in this article. Several scenarios are possible; I recommend reading this, this and that.

MySQL Shell does not need to be installed on every instance (unlike with MySQL 5.7, where it is required on each node to persist the configuration).
This text client is far more powerful than the default one.
If you don't know it yet, give it a try; you'll see, you won't be able to live without it 🙂

As for software versions, they are the most recent to date (a scorching August 2019 day):

Note:
In this article I use the latest GA of MySQL Server 8.0.
As for MySQL Router and MySQL Shell, always use the latest available version (even with a 5.7 server).

Checking the instance configuration

The first step is to make sure the MySQL instances are correctly configured, in other words "InnoDB Cluster aware" (yeah, I know, I'm bilingual :D).

Note:
I like to pre-configure the instances to be provisioned. At a minimum, install the plugins:

e.g.: plugin_load=group_replication.so & plugin-load-add=mysql_clone.so
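A sketch of that pre-configuration (the option file path below is an assumption; adapt it to your distribution's conf.d layout):

```shell
# Sketch: pre-load the Group Replication and Clone plugins at startup.
# The target path is hypothetical; point it at your actual include dir.
cat >> /etc/mysql/conf.d/innodb_cluster.cnf <<'EOF'
[mysqld]
plugin-load-add=group_replication.so
plugin-load-add=mysql_clone.so
EOF
```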

Note:
I use the root account to configure the cluster; however, this is not a best practice. It is possible (and desirable) to create one (or more) dedicated user account(s), with the appropriate privileges.
The recommended way to create this user is to use the clusterAdmin option of the dba.configureInstance() and cluster.addInstance() methods.
More info here (paragraph "User Privileges").
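A hedged sketch of that recommended approach (the account name and password are made up for illustration):

```shell
# Sketch: have configureInstance() create a dedicated cluster admin
# account; 'icadmin' and the password are illustrative values only.
mysqlsh root@mysql_node1:3306 -e "dba.configureInstance('root@mysql_node1:3306', \
  {clusterAdmin: 'icadmin@%', clusterAdminPassword: 'S3cr3tPwd!'})"
```

You would then use that account, rather than root, for all subsequent cluster operations.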

The configuration check is done with MySQL Shell and the checkInstanceConfiguration() function:

$ mysqlsh
MySQL Shell 8.0.17

Copyright (c) 2016, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
MySQL JS> dba.checkInstanceConfiguration('root@mysql_node1:3306')
Validating MySQL instance at mysql_node1:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1:3306

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...

NOTE: Some configuration options need to be fixed:
+--------------------------+---------------+----------------+--------------------------------------------------+
| Variable                 | Current Value | Required Value | Note                                             |
+--------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum          | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                | OFF           | ON             | Update read-only variable and restart the server |
| server_id                | 1             | <unique ID>    | Update read-only variable and restart the server |
+--------------------------+---------------+----------------+--------------------------------------------------+

Some variables need to be changed, but cannot be done dynamically on the server.
NOTE: Please use the dba.configureInstance() command to repair these issues.

{
    "config_errors": [
        {
            "action": "server_update", 
            "current": "CRC32", 
            "option": "binlog_checksum", 
            "required": "NONE"
        }, 
        {
            "action": "server_update+restart", 
            "current": "OFF", 
            "option": "enforce_gtid_consistency", 
            "required": "ON"
        }, 
        {
            "action": "server_update+restart", 
            "current": "OFF", 
            "option": "gtid_mode", 
            "required": "ON"
        }, 
        {
            "action": "server_update+restart", 
            "current": "1", 
            "option": "server_id", 
            "required": "<unique ID>"
        }
    ], 
    "status": "error"
}

The checkInstanceConfiguration() function returns a JSON document (handy for automation) listing the tasks required to be compliant.
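To illustrate that automation angle, the "status" field is trivial to consume from a script; here with python3, on a hand-written, abridged copy of such a result:

```shell
# Sketch: extract the "status" field of a checkInstanceConfiguration()
# result. The JSON below is a made-up, abridged example document.
result='{"status": "error", "config_errors": [{"option": "gtid_mode", "required": "ON"}]}'
status=$(printf '%s' "$result" | python3 -c 'import json, sys; print(json.load(sys.stdin)["status"])')
echo "$status"
```

A deployment script can simply loop until every instance reports "ok".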

In my case, with the official MySQL 8.0.17 Docker image (on my Ubuntu host), configuration-wise I have almost everything left to do 🙂

So let's configure…

I have 2 options:

  • 1/ I prepare my instances "manually" (this task can obviously be automated), as explained in the article how to configure a group.
  • 2/ I let MySQL Shell do all the work, using the configureInstance() method.

I sense your heart leans toward option 2… mine too 🙂

MySQL JS> dba.configureInstance('root@mysql_node1:3306')
Configuring MySQL instance at mysql_node1:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1:3306

NOTE: Some configuration options need to be fixed:
+--------------------------+---------------+----------------+--------------------------------------------------+
| Variable                 | Current Value | Required Value | Note                                             |
+--------------------------+---------------+----------------+--------------------------------------------------+
| binlog_checksum          | CRC32         | NONE           | Update the server variable                       |
| enforce_gtid_consistency | OFF           | ON             | Update read-only variable and restart the server |
| gtid_mode                | OFF           | ON             | Update read-only variable and restart the server |
| server_id                | 1             | <unique ID>    | Update read-only variable and restart the server |
+--------------------------+---------------+----------------+--------------------------------------------------+

Some variables need to be changed, but cannot be done dynamically on the server.
Do you want to perform the required configuration changes? [y/n]: y
Do you want to restart the instance after configuring it? [y/n]: n
Configuring instance...
The instance 'mysql_node1:3306' was configured for InnoDB cluster usage.
NOTE: MySQL server needs to be restarted for configuration changes to take effect.

A quick aside on configuration persistence.


Configuration settings are automatically saved with the SET PERSIST statement. A very handy (cloud-friendly) statement, introduced in 8.0, that lets me make configuration changes online and persist them, stored in the mysqld-auto.cnf file.

# cat /var/lib/mysql/mysqld-auto.cnf 

This information can also be seen in SQL, through the persisted_variables table in performance_schema:

SELECT * FROM performance_schema.persisted_variables;
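A quick hedged sketch of the whole persistence workflow (max_connections is just an example variable; run it against any 8.0 instance):

```shell
# Sketch: persist a dynamic variable, inspect the persisted store,
# then undo the change with RESET PERSIST.
mysqlsh root@mysql_node1:3306 --sql <<'EOF'
SET PERSIST max_connections = 500;
SELECT * FROM performance_schema.persisted_variables;
RESET PERSIST max_connections;
EOF
```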

Persisted changes are applied automatically at server restart thanks to the persisted_globals_load parameter, which is enabled (ON) by default.

Note:
I recommend this excellent article on persisted variables by my no less excellent colleague Jesper.

SET PERSIST is especially valuable in environments where accessing the configuration file is complicated or even impossible (cloud-friendly, I'm telling you!!!).
However, in a more controlled (more traditional) environment, it may be preferable to centralize all settings in the original configuration file (my.cnf / my.ini).
A matter of taste and habit…

That is of course possible, thanks to the mycnfPath option of the dba.configureInstance() method.

Example:

dba.configureInstance('root@mysql_node1:3306', {mycnfPath: "/etc/my.cnf"})

So the choice is yours!

Note:
Make sure you have the necessary privileges to update the MySQL configuration file.


End of the aside; back to configuring our instance.

With your sharp eye, you noticed that the tool offers to restart the MySQL instance, which is quite convenient.
However, in this case, the platform I'm using (the official MySQL Docker image) has no process supervising mysqld (as is generally the case on your favorite platform; info here).
In short, I'll have to restart without the tool (but I should manage… or not 🙂 ).

The mysql_node1 instance is configured and ready to be a member of a MySQL InnoDB Cluster \o/

You can verify this by restarting the MySQL instance and running checkInstanceConfiguration again:

MySQL JS> dba.checkInstanceConfiguration('root@mysql_node1:3306')
Validating MySQL instance at mysql_node1:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1:3306

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'mysql_node1:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}

Status: OK.

The same procedure must be applied to the other MySQL instances.

MySQL JS> dba.configureInstance('root@mysql_node2:3306')
...

MySQL JS> dba.configureInstance('root@mysql_node3:3306')
...

After configuration and restart, I end up with the following result:

$ mysqlsh -- dba checkInstanceConfiguration --user=root  --host=mysql_node1
Validating MySQL instance at mysql_node1:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node1:3306

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'mysql_node1:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}


$ mysqlsh -- dba checkInstanceConfiguration --user=root  --host=mysql_node2
Validating MySQL instance at mysql_node2:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node2:3306

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'mysql_node2:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}


$ mysqlsh -- dba checkInstanceConfiguration --user=root  --host=mysql_node3
Validating MySQL instance at mysql_node3:3306 for use in an InnoDB cluster...

This instance reports its own address as mysql_node3:3306

Checking whether existing tables comply with Group Replication requirements...
No incompatible tables detected

Checking instance configuration...
Instance configuration is compatible with InnoDB cluster

The instance 'mysql_node3:3306' is valid for InnoDB cluster usage.

{
    "status": "ok"
}

A pleasure to see!

Creating the cluster

Once the 3 instances are correctly configured, the next step is to create the cluster with createCluster.
This function is run on the first member, the MySQL instance on mysql_node1, and creates a cluster… of one node.
Well, you have to start somewhere 🙂

$ mysqlsh root@mysql_node1
MySQL Shell 8.0.17
...
Server version: 8.0.17 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.


MySQL mysql_node1 JS> var cluster = dba.createCluster('pocCluster')
A new InnoDB cluster will be created on instance 'mysql_node1:3306'.

Validating instance at mysql_node1:3306...

This instance reports its own address as mysql_node1:3306

Instance configuration is suitable.
Creating InnoDB cluster 'pocCluster' on 'mysql_node1:3306'...

Adding Seed Instance...
Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.

createCluster() takes the cluster name as a parameter (pocCluster).
You can also pass optional parameters such as ipWhitelist, multiPrimary, adoptFromGR…
The exhaustive list is available in the documentation.
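As a hedged sketch (option values below are illustrative, not recommendations):

```shell
# Sketch: create the cluster non-interactively with explicit options.
# The whitelist is an example Docker bridge network range.
mysqlsh root@mysql_node1:3306 -e "dba.createCluster('pocCluster', \
  {multiPrimary: false, ipWhitelist: '172.17.0.0/16'})"
```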

The node's state in the cluster can be checked with status():

MySQL mysql_node1 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

Note:
Make sure your DNS (or /etc/hosts) is correctly configured, otherwise you'll run into connection problems…
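For example (the IP addresses below are purely illustrative), every host should be able to resolve every node name:

```shell
# Sketch: static name resolution for the three nodes.
# Addresses are made up; match them to your actual containers/VMs.
cat >> /etc/hosts <<'EOF'
172.17.0.2  mysql_node1
172.17.0.3  mysql_node2
172.17.0.4  mysql_node3
EOF
```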

Subsequent nodes are added with addInstance().

Node 2

MySQL mysql_node1 JS> cluster.addInstance('root@mysql_node2')

NOTE: The target instance 'mysql_node2:3306' has not been pre-provisioned (GTID set
is empty). The Shell is unable to decide whether incremental distributed state
recovery can correctly provision it.
The safest and most convenient way to provision a new instance is through
automatic clone provisioning, which will completely overwrite the state of
'mysql_node2:3306' with a physical snapshot from an existing cluster member. To
use this method by default, set the 'recoveryMethod' option to 'clone'.

The incremental distributed state recovery may be safely used if you are sure
all updates ever executed in the cluster were done with GTIDs enabled, there
are no purged transactions and the new instance contains the same GTID set as
the cluster or a subset of it. To use this method by default, set the
'recoveryMethod' option to 'incremental'.


Please select a recovery method [C]lone/[I]ncremental recovery/[A]bort (default Clone): 
Validating instance at mysql_node2:3306...

This instance reports its own address as mysql_node2:3306

Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster...

Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
Clone based state recovery is now in progress.

NOTE: A server restart is expected to happen as part of the clone process. If the
server does not support the RESTART command or does not come back after a
while, you may need to manually start it back.

* Waiting for clone to finish...
NOTE: mysql_node2:3306 is being cloned from mysql_node1:3306
** Stage DROP DATA: Completed
** Clone Transfer  
    FILE COPY  ############################################################  100%  Completed
    PAGE COPY  ############################################################  100%  Completed
    REDO COPY  ############################################################  100%  Completed
** Stage RECOVERY: |
NOTE: mysql_node2:3306 is shutting down...

* Waiting for server restart... ready
* mysql_node2:3306 has restarted, waiting for clone to finish...
* Clone process has finished: 65.84 MB transferred in 3 sec (21.95 MB/s)

State recovery already finished for 'mysql_node2:3306'

The instance 'mysql_node2' was successfully added to the cluster.

The added node must be "in sync" with the cluster.
MySQL InnoDB Cluster can handle that automatically for you.

It offers 2 methods: Clone (a physical copy, the default) or Incremental (a logical copy).
Clone is the safest method from the standpoint of overall cluster consistency, so it's the one I recommend.

Unless you can vouch for the consistency of the data on the node being added (and, incidentally, your data volume is large, which means a long cloning time).
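If you script the deployment, the interactive prompt can be skipped by choosing the method up front; a sketch:

```shell
# Sketch: add a node non-interactively, forcing clone-based recovery.
mysqlsh root@mysql_node1:3306 -e "var c = dba.getCluster('pocCluster'); \
  c.addInstance('root@mysql_node2', {recoveryMethod: 'clone'})"
```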


Let me draw your attention to an important point.
One of the cloning steps requires a restart of the MySQL server.
As seen above, if, like here, your MySQL instances have no process supervising mysqld, as is the case with Docker, well, MySQL won't restart automatically.

No panic: you just have to restart the instance by hand at the right moment, that is, when you see:

Note: * Waiting for server restart

In my case, I ran the following in another terminal:

daz@daz-TECRA-Z40-B:~$ docker restart mysql_node2
 mysql_node2

Easy!
OK, nothing to write home about, but it will save you from wasting time 😉


I can see the new state of my cluster with status:

MySQL mysql_node1 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

It's taking shape…
Come on, one more node!

Node 3

MySQL mysql_node1 JS> cluster.addInstance('root@mysql_node3')

NOTE: The target instance 'mysql_node3:3306' has not been pre-provisioned (GTID set
is empty). The Shell is unable to decide whether incremental distributed state
recovery can correctly provision it.
The safest and most convenient way to provision a new instance is through
automatic clone provisioning, which will completely overwrite the state of
'mysql_node3:3306' with a physical snapshot from an existing cluster member. To
use this method by default, set the 'recoveryMethod' option to 'clone'.

The incremental distributed state recovery may be safely used if you are sure
all updates ever executed in the cluster were done with GTIDs enabled, there
are no purged transactions and the new instance contains the same GTID set as
the cluster or a subset of it. To use this method by default, set the
'recoveryMethod' option to 'incremental'.


Please select a recovery method [C]lone/[I]ncremental recovery/[A]bort (default Clone): 
Validating instance at mysql_node3:3306...

This instance reports its own address as mysql_node3:3306

Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster...

Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
Clone based state recovery is now in progress.

NOTE: A server restart is expected to happen as part of the clone process. If the
server does not support the RESTART command or does not come back after a
while, you may need to manually start it back.

* Waiting for clone to finish...
NOTE: mysql_node3:3306 is being cloned from mysql_node1:3306
** Stage DROP DATA: Completed
** Clone Transfer  
    FILE COPY  ############################################################  100%  Completed
    PAGE COPY  ############################################################  100%  Completed
    REDO COPY  ############################################################  100%  Completed

NOTE: mysql_node3:3306 is shutting down...

* Waiting for server restart... ready
* mysql_node3:3306 has restarted, waiting for clone to finish...
** Stage RESTART: Completed
* Clone process has finished: 65.84 MB transferred in 2 sec (32.92 MB/s)

Incremental distributed state recovery is now in progress.

* Waiting for distributed recovery to finish...
NOTE: 'mysql_node3:3306' is being recovered from '<NULL>:0'
* Distributed recovery has finished

The instance 'mysql_node3' was successfully added to the cluster.

Same process as for node 2.

The final result is:

MySQL mysql_node1 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

And voilà!

A 3-node MySQL Group Replication cluster, easily and quickly deployed thanks to MySQL Shell: that's MySQL InnoDB Cluster (well, almost; one component is still missing).

The current configuration is as follows:

  • Node 1 = mysql_node1: primary (read/write)
  • Node 2 = mysql_node2: secondary (read-only)
  • Node 3 = mysql_node3: secondary (read-only)

So what do we do now???

The Router!

The Router!

The Router!

Configuring MySQL Router

It is recommended (but not mandatory) to install MySQL Router on the application host, so I'll follow that recommendation and install it on the app1 machine.

If you can't (or don't want to) put MySQL Router on the application host, no problem.
You'll need at least 2 MySQL Router instances, and on top of that you'll have to handle the Router's HA.
Several approaches are possible.

Bootstrapping MySQL Router

The first step is the bootstrap, that is, creating THE link between MySQL Router and your cluster.
You must give mysqlrouter at least the address of one cluster member:

daz@app1:~$ mysqlrouter --bootstrap root@mysql_node1:3306 --conf-base-port 3306 --directory /routerDir/RouterPoC1
Please enter MySQL password for root: 
# Bootstrapping MySQL Router instance at '/routerDir/RouterPoC1'...

- Checking for old Router accounts
  - No prior Router accounts found
- Creating mysql account mysql_router1_h8rjf49kin1d@'%' for cluster management
- Storing account in keyring
- Adjusting permissions of generated files
- Creating configuration /routerDir/RouterPoC1/mysqlrouter.conf

# MySQL Router configured for the InnoDB cluster 'pocCluster'

After this MySQL Router has been started with the generated configuration

    $ mysqlrouter -c /routerDir/RouterPoC1/mysqlrouter.conf

the cluster 'pocCluster' can be reached by connecting to:

## MySQL Classic protocol

- Read/Write Connections: localhost:3306
- Read/Only Connections:  localhost:3307

## MySQL X protocol

- Read/Write Connections: localhost:3308
- Read/Only Connections:  localhost:3309

Note:
You may run into a permission problem, probably due to the AppArmor configuration.
Google (or equivalent) is your friend (or not).
If you're on Ubuntu, click here.

A few explanations…

I created a configuration different from the default one, customizing it with a few options:

  • conf-base-port: the default port offered is 6446 for read/write. In my case, I want to use the famous port 3306.
  • directory: to tidy all of this Router instance's clutter into the specified directory. Very useful if you run several Router instances on the machine.

The full list of options is available here.

To sum up, 4 TCP ports were configured, including 2 for classic MySQL connections:

  • 3306 (instead of the default 6446): reads/writes, routed to the primary node (routing strategy: first-available by default).
  • 3307 (instead of the default 6447): reads only, routed to the secondary nodes (routing strategy: round-robin-with-fallback by default).

And the counterpart for X protocol connections (3308 & 3309, instead of 64460 & 64470 respectively), for using MySQL as a NoSQL document store.

The MySQL Router configuration file contains some important information, such as the port(s) the application should use (as seen previously):

daz@app1:~$ cat /routerDir/RouterPoC1/mysqlrouter.conf 
# File automatically generated during MySQL Router bootstrap
[DEFAULT]
logging_folder=/routerDir/RouterPoC1/log
runtime_folder=/routerDir/RouterPoC1/run
data_folder=/routerDir/RouterPoC1/data
keyring_path=/routerDir/RouterPoC1/data/keyring
master_key_path=/routerDir/RouterPoC1/mysqlrouter.key
connect_timeout=15
read_timeout=30
dynamic_state=/routerDir/RouterPoC1/data/state.json

[logger]
level = INFO

[metadata_cache:pocCluster]
router_id=1
user=mysql_router1_h8rjf49kin1d
metadata_cluster=pocCluster
ttl=0.5
use_gr_notifications=0

[routing:pocCluster_default_rw]
bind_address=0.0.0.0
bind_port=3306
destinations=metadata-cache://pocCluster/default?role=PRIMARY
routing_strategy=first-available
protocol=classic

[routing:pocCluster_default_ro]
bind_address=0.0.0.0
bind_port=3307
destinations=metadata-cache://pocCluster/default?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=classic

[routing:pocCluster_default_x_rw]
bind_address=0.0.0.0
bind_port=3308
destinations=metadata-cache://pocCluster/default?role=PRIMARY
routing_strategy=first-available
protocol=x

[routing:pocCluster_default_x_ro]
bind_address=0.0.0.0
bind_port=3309
destinations=metadata-cache://pocCluster/default?role=SECONDARY
routing_strategy=round-robin-with-fallback
protocol=x

This file can be edited if needed.

Then MySQL Router must be started with the start.sh script:

daz@app1:~$ /routerDir/RouterPoC1/start.sh 
PID 20307 written to '/routerDir/RouterPoC1/mysqlrouter.pid'
logging facility initialized, switching logging to loggers specified in configuration


daz@app1:~$ ps aux | grep mysqlroute[r]
daz      20307  0.8  0.0 697792  3612 pts/3    Sl   Aug08  12:10 /usr/bin/mysqlrouter -c /routerDir/RouterPoC1/mysqlrouter.conf

The Router is stopped with the stop.sh script (but you'd guessed that).

That's it for the Router!
The cluster deployment phase is now complete.

Simple, fast and, above all, easily automatable: those are the main characteristics of a MySQL InnoDB Cluster deployment.
Qualities that form the very core of MySQL's DNA.

Connecting to the cluster

Your MySQL InnoDB Cluster is now up and running.

Your application will therefore connect to port 3306 (because that's how we configured it; otherwise it's 6446 by default) to use the database.
In fact, from the application's point of view, the database is MySQL Router, except that in reality there isn't 1 instance but indeed 3 MySQL instances in the backend, completely transparently \o/.

Using the cluster is out of scope for this article, but we can easily simulate the application's behavior with a MySQL client and MySQL Router.

From app1, I can for example connect with MySQL Shell (my "application"), locally, on port 3306, in SQL mode:

daz@app1:~$ mysqlsh root@localhost:3306 --sql -e"SELECT @@report_host AS 'Primary is';"
Primary is
mysql_node1

The connection on port 3306 sends me to the cluster's primary member, here the node mysql_node1.

The report_host parameter (set in my configuration file) returns the value of node 1, which is the primary for now.
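Similarly, you can observe the round-robin routing on the read-only port; a sketch (run from app1; the exact alternation depends on the Router's state):

```shell
# Sketch: consecutive read-only connections (port 3307) should
# alternate between the secondary nodes.
for i in 1 2 3 4; do
  mysqlsh root@localhost:3307 --sql -e "SELECT @@report_host AS 'RO node';"
done
```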

Managing the nodes

I now suggest we review the most useful commands.

Retrieving the cluster metadata

The variable commonly named cluster (the very one used when the cluster was created: var cluster = dba.createCluster('…')), which allows a great number of operations on the cluster, has a lifetime limited to the session.

To retrieve this cluster state information in a new session, use the dba.getCluster() method:

$ mysqlsh root@localhost:3306
MySQL Shell 8.0.17
...
Server version: 8.0.17 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.


MySQL localhost JS > cluster.status()
ReferenceError: cluster is not defined


MySQL localhost JS> var cluster = dba.getCluster('pocCluster')


MySQL localhost JS > cluster.status()
...

A handy alternative is to use the --cluster option when connecting with MySQL Shell:

$ mysqlsh root@localhost:3306 --cluster
MySQL Shell 8.0.17
...
Your MySQL connection id is 38657
Server version: 8.0.17 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
You are connected to a member of cluster 'pocCluster'.
Variable 'cluster' is set.
Use cluster.status() in scripting mode to get status of this cluster or cluster.help() for more commands.


MySQL localhost JS> cluster.status()
{

Electing a new primary node

To change the primary on the fly, use the setPrimaryInstance() function:

## Set New Primary
MySQL mysql_node1 JS> cluster.setPrimaryInstance("mysql_node2:3306")
Setting instance 'mysql_node2:3306' as the primary instance of cluster 'pocCluster'...

Instance 'mysql_node1:3306' was switched from PRIMARY to SECONDARY.
Instance 'mysql_node2:3306' was switched from SECONDARY to PRIMARY.
Instance 'mysql_node3:3306' remains SECONDARY.

WARNING: The cluster internal session is not the primary member anymore. For cluster management operations please obtain a fresh cluster handle using <Dba>.getCluster().

The instance 'mysql_node2:3306' was successfully elected as primary.

... {Reconnection} ...


MySQL mysql_node1 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

Describing the cluster

describe() shows the structure of the cluster:

MySQL mysql_node1 JS> cluster.describe()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "topology": [
            {
                "address": "mysql_node1:3306", 
                "label": "mysql_node1:3306", 
                "role": "HA", 
                "version": "8.0.17"
            }, 
            {
                "address": "mysql_node2:3306", 
                "label": "mysql_node2:3306", 
                "role": "HA", 
                "version": "8.0.17"
            }, 
            {
                "address": "mysql_node3:3306", 
                "label": "mysql_node3:3306", 
                "role": "HA", 
                "version": "8.0.17"
            }
        ], 
        "topologyMode": "Single-Primary"
    }
}

Switching to multi-primary mode

Switching from single-primary to multi-primary mode is done with the switchToMultiPrimaryMode() function:

## Switch To Multi Primary Mode
MySQL mysql_node1 JS> cluster.switchToMultiPrimaryMode()
Switching cluster 'pocCluster' to Multi-Primary mode...

Instance 'mysql_node1:3306' was switched from SECONDARY to PRIMARY.
Instance 'mysql_node2:3306' remains PRIMARY.
Instance 'mysql_node3:3306' was switched from SECONDARY to PRIMARY.

The cluster successfully switched to Multi-Primary mode.


MySQL mysql_node1 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Multi-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

Switching back to single-primary mode

Switching from multi-primary back to single-primary mode is done with the switchToSinglePrimaryMode() function:

## switch To Single PrimaryMode
MySQL mysql_node1 JS> cluster.switchToSinglePrimaryMode("mysql_node1:3306")
Switching cluster 'pocCluster' to Single-Primary mode...

Instance 'mysql_node1:3306' remains PRIMARY.
Instance 'mysql_node2:3306' was switched from PRIMARY to SECONDARY.
Instance 'mysql_node3:3306' was switched from PRIMARY to SECONDARY.

WARNING: Existing connections that expected a R/W connection must be disconnected, i.e. instances that became SECONDARY.

The cluster successfully switched to Single-Primary mode.

... {Reconnection} ...


MySQL mysql_node1 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

Database Failover

Failover at the database level (a change of primary) is handled automatically by the cluster members among themselves.

For example, suppose the primary node crashes: mysql_node1.
Connections through MySQL Router will then get the following view:

$ mysqlsh root@localhost:3306 --cluster
MySQL Shell 8.0.17
...


MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node1' (113)", 
                "status": "(MISSING)"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node2:3306"
}

A new primary is elected by the group, here mysql_node2.
As for mysql_node1, it is missing in action (MIA).

Since the cluster configuration is persisted, once the node has been restarted / repaired / restored it automatically becomes part of the cluster again, with a secondary role, i.e. read-only (if I am in single-primary mode).

MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node2:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node2:3306"
}

If the configuration was not persisted, a rejoinInstance() is needed to put the node back into the cluster.
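As a minimal sketch (the node name is an assumption; use whichever member failed to rejoin), from a Shell session connected through the Router:

```shell
# Get a cluster handle, then manually rejoin the instance that did not come back
$ mysqlsh root@localhost:3306 --cluster
MySQL localhost JS> cluster.rejoinInstance("mysql_node3:3306")
```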

Removing a member from the group

To remove a node from the cluster, use the removeInstance() function:

## Remove a member from the group
MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}


MySQL localhost JS> cluster.removeInstance("mysql_node3:3306")
The instance will be removed from the InnoDB cluster. Depending on the instance
being the Seed or not, the Metadata session might become invalid. If so, please
start a new session to the Metadata Storage R/W instance.

Instance 'mysql_node3:3306' is attempting to leave the cluster...

The instance 'mysql_node3:3306' was successfully removed from the cluster.


MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

To put the member back, simply use the addInstance() function:

MySQL localhost JS> cluster.addInstance("mysql_node3:3306")
The safest and most convenient way to provision a new instance is through
automatic clone provisioning, which will completely overwrite the state of
'mysql_node3:3306' with a physical snapshot from an existing cluster member. To
use this method by default, set the 'recoveryMethod' option to 'clone'.

The incremental distributed state recovery may be safely used if you are sure
all updates ever executed in the cluster were done with GTIDs enabled, there
are no purged transactions and the new instance contains the same GTID set as
the cluster or a subset of it. To use this method by default, set the
'recoveryMethod' option to 'incremental'.

Incremental distributed state recovery was selected because it seems to be safely usable.
Validating instance at mysql_node3:3306...

This instance reports its own address as mysql_node3:3306

Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster...

Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
State recovery already finished for 'mysql_node3:3306'

The instance 'mysql_node3:3306' was successfully added to the cluster.
MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

Restarting after a complete cluster outage

In the event of a complete cluster outage, i.e. all instances are down, you must use rebootClusterFromCompleteOutage() once the MySQL instances are up and running again.

The reboot must be performed on the most up-to-date instance; otherwise the following error is raised (Dba.rebootClusterFromCompleteOutage: The active session instance isn't the most updated in comparison with the ONLINE instances of the Cluster's metadata.):

$ mysqlsh root@mysql_node3:3306
MySQL Shell 8.0.17
...


MySQL mysql_node3 JS> var cluster = dba.rebootClusterFromCompleteOutage('pocCluster')
Reconfiguring the cluster 'pocCluster' from complete outage...

The instance 'mysql_node1:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The instance 'mysql_node2:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

Dba.rebootClusterFromCompleteOutage: The active session instance isn't the most updated in comparison with the ONLINE instances of the Cluster's metadata. Please use the most up to date instance: 'mysql_node1:3306'. (RuntimeError)

The tool is kind enough to tell me which member is the most up to date, here mysql_node1:3306.

$ mysqlsh root@mysql_node1:3306
MySQL Shell 8.0.17
...


MySQL mysql_node1 JS>  var cluster = dba.rebootClusterFromCompleteOutage('pocCluster')
Reconfiguring the cluster 'pocCluster' from complete outage...

The instance 'mysql_node2:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The instance 'mysql_node3:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The safest and most convenient way to provision a new instance is through
automatic clone provisioning, which will completely overwrite the state of
'mysql_node1:3306' with a physical snapshot from an existing cluster member. To
use this method by default, set the 'recoveryMethod' option to 'clone'.

The incremental distributed state recovery may be safely used if you are sure
all updates ever executed in the cluster were done with GTIDs enabled, there
are no purged transactions and the new instance contains the same GTID set as
the cluster or a subset of it. To use this method by default, set the
'recoveryMethod' option to 'incremental'.

Incremental distributed state recovery was selected because it seems to be safely usable.

The cluster was successfully rebooted.

I can now connect to my cluster again (through the Router):

$ mysqlsh root@localhost:3306 --cluster
MySQL Shell 8.0.17
...


MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node1:3306"
}

The member on which the command was executed is the new primary.

Quorum loss

If the cluster loses more than half of its members, i.e. a majority of nodes become unreachable (after a crash or a split-brain, for example), it ends up in a rather unpleasant state, a network partition. In short, an intervention from outside the cluster is required to let the remaining members carry on with their job.

Note:
By loss I mean an unplanned shutdown (a crash). In the event of a normal, clean shutdown, even if the cluster loses its quorum (in this case a normal shutdown of 2 nodes), the surviving node knows that the other nodes are gone (in short, no risk of split-brain), so the cluster keeps working. But it is then a single-node cluster…
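As a reminder, a Group Replication group of n members needs a majority of floor(n/2) + 1 reachable members to keep its quorum, which is why 3 nodes are the minimum to tolerate one unplanned failure. A quick arithmetic sketch:

```shell
# Majority needed and unplanned failures tolerated for typical group sizes
for n in 3 5 7 9; do
  majority=$(( n / 2 + 1 ))       # floor(n/2) + 1 members must stay reachable
  tolerated=$(( n - majority ))   # crashes the group can absorb without losing quorum
  echo "group of $n: majority=$majority, tolerated failures=$tolerated"
done
```

With our 3 nodes, losing 2 of them (as below) therefore drops the group under its majority of 2.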

$ mysqlsh root@mysql_node3:3306 --cluster
MySQL Shell 8.0.17
...
WARNING: Cluster has no quorum and cannot process write transactions: 2 out of 3 members of the InnoDB cluster are unreachable from the member we’re connected to, which is not sufficient for a quorum to be reached.
WARNING: You are connected to an instance in state 'Read Only'
Write operations on the InnoDB cluster will not be allowed.

You are connected to a member of cluster 'pocCluster'.
Variable 'cluster' is set.
Use cluster.status() in scripting mode to get status of this cluster or cluster.help() for more commands.


MySQL mysql_node3 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node1:3306", 
        "ssl": "REQUIRED", 
        "status": "NO_QUORUM", 
        "statusText": "Cluster has no quorum as visible from 'mysql_node3:3306' and cannot process write transactions. 2 members are not active", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node1' (113)", 
                "status": "UNREACHABLE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node2' (113)", 
                "status": "UNREACHABLE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node3:3306"
}

mysql_node1 & mysql_node2 are unreachable (e.g. the datacenter hosting them went down).

Automatic failover cannot kick in; the surviving node mysql_node3 protects itself against inconsistencies by automatically switching to read-only.
It will therefore not accept any writes until it is explicitly told to do so.

This intervention is called unblocking the partition, and it is performed with the forceQuorumUsingPartitionOf() function:

MySQL mysql_node3 JS> cluster.forceQuorumUsingPartitionOf("root@mysql_node3:3306")
Restoring replicaset 'default' from loss of quorum, by using the partition composed of [mysql_node3:3306]

Restoring the InnoDB cluster ...

The InnoDB cluster was successfully restored using the partition from the instance 'root@mysql_node3:3306'.

WARNING: To avoid a split-brain scenario, ensure that all other members of the replicaset are removed or joined back to the group that was restored.


MySQL mysql_node3 JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node3:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 2 members are not active", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node1' (113)", 
                "status": "(MISSING)"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node2' (113)", 
                "status": "(MISSING)"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node3:3306"
}

Obviously, unless you like to live dangerously 🙂 , you should avoid staying in this state for too long.

Once the instances are back in shape, they have to be put back into the cluster. The process depends on the failure scenario.

In the most favorable case, once the issue has been identified and fixed, simply restart the instance. Thanks to the cluster's recovery mechanism, it will catch up automatically.

Example with mysql_node2:

MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node3:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node1' (113)", 
                "status": "(MISSING)"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node3:3306"
}

The other, more annoying hypothesis is irreversible data corruption…
In that case, one solution is to re-provision the node.

First of all, reset the cluster metadata with a rescan().

Example with mysql_node1:

MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node3:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "n/a", 
                "readReplicas": {}, 
                "role": "HA", 
                "shellConnectError": "MySQL Error 2003 (HY000): Can't connect to MySQL server on 'mysql_node1' (113)", 
                "status": "(MISSING)"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node3:3306"
}

rescan():

MySQL localhost JS> cluster.rescan()
Rescanning the cluster...

Result of the rescanning operation for the 'default' ReplicaSet:
{
    "name": "default", 
    "newTopologyMode": null, 
    "newlyDiscoveredInstances": [], 
    "unavailableInstances": [
        {
            "host": "mysql_node1:3306", 
            "label": "mysql_node1:3306", 
            "member_id": "f6cdb648-b9d7-11e9-bf6c-0242ac14000b"
        }
    ]
}

The instance 'mysql_node1:3306' is no longer part of the ReplicaSet.
The instance is either offline or left the HA group. You can try to add it to the cluster again with the cluster.rejoinInstance('mysql_node1:3306') command or you can remove it from the cluster configuration.
Would you like to remove it from the cluster metadata? [Y/n]: Y
Removing instance from the cluster metadata...
The instance 'mysql_node1:3306' was successfully removed from the cluster metadata.

New state of the cluster, with no more information about mysql_node1:

MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node3:3306", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures.", 
        "topology": {
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node3:3306"
}

Then add the (new) member back, using the clone feature:

MySQL localhost JS> cluster.addInstance("root@mysql_node1:3306")

WARNING: A GTID set check of the MySQL instance at 'mysql_node1:3306' determined that it
contains transactions that do not originate from the cluster, which must be
discarded before it can join the cluster.

mysql_node1:3306 has the following errant GTIDs that do not exist in the cluster:
f6cdb648-b9d7-11e9-bf6c-0242ac14000b:1-3

WARNING: Discarding these extra GTID events can either be done manually or by completely
overwriting the state of mysql_node1:3306 with a physical snapshot from an
existing cluster member. To use this method by default, set the
'recoveryMethod' option to 'clone'.

Having extra GTID events is not expected, and it is recommended to investigate
this further and ensure that the data can be removed prior to choosing the
clone recovery method.

Please select a recovery method [C]lone/[A]bort (default Abort): C
Validating instance at mysql_node1:3306...

This instance reports its own address as mysql_node1:3306

Instance configuration is suitable.
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Adding instance to the cluster...

WARNING: User 'mysql_innodb_cluster_2835519091'@'%' already existed at instance 'localhost:3306'. It will be deleted and created again with a new password.
Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
Clone based state recovery is now in progress.

NOTE: A server restart is expected to happen as part of the clone process. If the
server does not support the RESTART command or does not come back after a
while, you may need to manually start it back.

* Waiting for clone to finish...
NOTE: mysql_node1:3306 is being cloned from mysql_node2:3306
** Stage DROP DATA: Completed
** Clone Transfer  
    FILE COPY  ############################################################  100%  Completed
    PAGE COPY  ############################################################  100%  Completed
    REDO COPY  ############################################################  100%  Completed
** Stage RECOVERY: \
NOTE: mysql_node1:3306 is shutting down...

* Waiting for server restart... ready
* mysql_node1:3306 has restarted, waiting for clone to finish...
* Clone process has finished: 66.89 MB transferred in about 1 second (~inf TB/s)

State recovery already finished for 'mysql_node1:3306'

The instance 'mysql_node1:3306' was successfully added to the cluster.
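To skip the interactive [C]lone/[A]bort prompt shown above (handy when scripting), the recovery method can be passed explicitly to addInstance — a minimal sketch using the documented recoveryMethod option:

```js
// Force clone-based recovery, so no interactive prompt is shown
cluster.addInstance("root@mysql_node1:3306", {recoveryMethod: "clone"})
```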

Once again, keep in mind that with Docker, MySQL is not managed by a supervisor process (mysqld is not managed by supervisor process).
So you have to restart the instance yourself.
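Since the clone step shuts the server down, the restart has to be done by hand; assuming the container is named mysql_node1 as in this tutorial:

```shell
# The clone process stops mysqld; restart the container manually
docker restart mysql_node1
```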

And finally, the cluster ends up with the topology I want:

MySQL localhost JS> cluster.status()
{
    "clusterName": "pocCluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "mysql_node3:3306", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "mysql_node1:3306": {
                "address": "mysql_node1:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node2:3306": {
                "address": "mysql_node2:3306", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }, 
            "mysql_node3:3306": {
                "address": "mysql_node3:3306", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE", 
                "version": "8.0.17"
            }
        }, 
        "topologyMode": "Single-Primary"
    }, 
    "groupInformationSourceMember": "mysql_node3:3306"
}
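For more detail than the default view (group protocol version, member states, fenced writes...), cluster.status() also accepts an extended option in recent MySQL Shell versions — a hedged example:

```js
// Richer status output; 'extended' is supported by MySQL Shell's AdminAPI
cluster.status({extended: true})
```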

That's all for today.
Stay tuned, because there is still plenty to say on the subject, and plenty of good things on the way 🙂


Misc
Some optional but useful parameters from my node1’s my.cnf:

[mysqld]
# Identity this node reports to the group (useful behind Docker NAT)
report_port        = 3306
report_host        = mysql_node1

# Load the Group Replication and clone plugins at server startup
plugin_load        = group_replication.so
plugin_load_add    = mysql_clone.so
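A quick way to check that both plugins were actually loaded is to query information_schema on the node:

```sql
-- Verify the Group Replication and clone plugins are active
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_NAME IN ('group_replication', 'clone');
```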

Thanks for using MySQL!


Responses to “Tutoriel – Déployer MySQL 8.0 InnoDB Cluster (09-2019)”

  1. Hello Olivier,

     I am running MySQL 8.0.18 as a cluster. I am trying to script the MySQL Router installation (it will be launched via Ansible), namely:
     mysqlrouter --bootstrap root@node1:3306 --user=mysqlrouter

     Problem: I don't see how to supply the root password. Which variable?

     Thanks
     Regards
     Thierry (we met at SETRA)

  2. Hello Thierry,
     use the syntax

     {user}:{pwd}@{host}:{port}

     e.g.:
     root:myPwd@node1:3306

     Olivier

  3. Many thanks
     This is one of the best tutorials I've seen on the subject. Keep it up!

  4. Thanks for the support!