MySQL InnoDB Cluster – Easy Recovering and provisioning

September 10, 2019
Cloudy Sky by Olivier DASINI

As the administrator of a cluster, among other tasks, you should be able to restore failed nodes and grow (or shrink) your cluster by adding (or removing) new nodes

Olivier DASINI

Up to MySQL 8.0.16, to perform these tasks you could:

Starting with MySQL 8.0.17, the easiest and more convenient method is to use the CLONE feature.

Note:
Regardless of the recovering and provisioning method, it is important to have a proven backup/restore procedure.

I recommend reading the excellent blog post from my colleague Jesper – MySQL Backup Best Practices.

Context

Let’s make it as simple as possible 🙂
I’m using MySQL 8.0.17.

I running a MySQL InnoDB Cluster well setup.
So my main assumption is that you already know what is MySQL Group Replication & MySQL InnoDB Cluster.
Additionally you can read this tutorial and this article from my colleague lefred or this one on Windows Platform from my colleague Ivan.

Scenario 1 – Node Recovering

  • A 3 nodes MySQL InnoDB Cluster – M1 / M2 / M3, in single primary mode
  • MySQL Router is configured to enable R/W connections on 3306 and RO connections on 3307
  • M1 is currently the primary (that is in Read/Write mode)
  • M2 & M3 are currently the secondaries (that is Read Only mode)
  • M1 crashed! :'(
  • M2 & M3 are now the (new temporary) cluster
  • Then…

The goal then is to rebuild M1 and put it back to the cluster.

So like I said before we have a 3 nodes MySQL 8.0.17 InnoDB Cluster up and running:

Then node M1 crashed… (status is “MISSING“ & we have a relevant shellConnectError message) :

M1 was the primary.
The cluster initiated an automatic database failover to elect a new primary… blablabla
Anyway you already know the story ?

After a while M1 is finally fixed and ready to be part of the cluster again.

Node Recovery: Auto distributed recovery

Well I have a great news for you most of the time your only task will be to start the fixed MySQL instance.

Before starting the node (M1), status is Missing:

Then MySQL InnoDB Cluster will automatically choose the relevant distributed recovery method.
During a certain period of time status is Recovering.
Details are available in recoveryStatusText field:

The node eventually becomes online.
Status: Online.

So you get it now, the node recovering process is handle by MySQL InnoDB Cluster using the automated distributed recovery.

That means, if your lucky, your only task (in addition to fixing the issue) is to (re)start the MySQL instance.

If you’re not lucky, you’ll probably need to remove and add the node again.

Node Recovery: Remove and add the node

The strategy here is to remove the node from the cluster metadata and then add it back – as a new node – to the cluster.

For example for node M3, remove the node with removeInstance():

force is a boolean. By default, set to false.
Indicating if the instance must be removed (even if only from metadata) in case it cannot be reached.
Useful if the instance is no longer reachable (RTFM).

Then add the node with addInstance():

An alternative, that I prefer, is to use rescan() then addInstance().

Let’s see how to use it, assuming M3 crashed.
The current status is:

Let’s (re)scan the cluster and remove the missing node (M3):

The new cluster status is:

Now it’s time to bring back M3, using the Clone distributed process:

Here we go!
Our 3 nodes MySQL InnoDB Cluster can tolerate up to 1 failure again:

Scenario 2 – Node Provisioning

  • A 3 nodes MySQL InnoDB Cluster – M1 / M2 / M3 in single primary mode
  • MySQL Router is configured to enable R/W connections on port 3306 and RO connections on port 3307
  • M1 is currently the primary (that is Read/Write mode)
  • M2 & M3 are currently the secondaries (that is Read Only mode)

The goal then is to add 2 new nodes: M4 & M5

So we have the 3 nodes MySQL 8.0.17 InnoDB Cluster that we used in the first part of this article. The cluster is up and running.

Actually adding new nodes is very close to what we have done previously.

The process is :

  • Deploy the new MySQL instance preferably already configured for Group Replication
  • Use the automatic distributed recovery process

Add node M4:

As you can see we use the Clone process.
The new cluster status is:

Same process for node M5:

Our final status is then:

References

Video:

Misc
Node 1 – Group Replication configuration variables

Some optional but useful parameters from my node1’s my.cnf:

Thanks for using MySQL!

Follow me on twitter

Watch my videos on my YouTube channel and subscribe.

My Slideshare account.

My Speaker Deck account.

Thanks for using MySQL!

One Response to “MySQL InnoDB Cluster – Easy Recovering and provisioning”

  1. […] MySQL InnoDB Cluster – Easy Recovering and provisioning […]