Putting a MySQL database server on Ceph storage seems like a moderately good idea, at least with my hardware. With GlusterFS it was absolutely terrible, but even Ceph produces some worrying I/O wait times on my MySQL host.
To remedy the GlusterFS situation I installed Percona XtraDB Cluster on three nodes, each storing its data on LVM thin volumes on SSD. Great performance, but also problems: wsrep commit errors were really common while one of the cluster nodes was made of bits of old twig and connected via a not entirely ideal LAN connection. They still happened, much more rarely, in a tighter setup, so I didn't dare put anything other than Zabbix monitoring data on it.
And a good thing too, because as of last week I can't get it back online. Or to put it another way, I don't have the time or the inclination to trawl back through Btrfs snapshots trying to find a combination of perceived states that the cluster will accept. I've restored it from a cold stop before by bootstrapping a new cluster on one node and then letting the others rejoin. But this time, even after editing the grastate.dat files, it won't start.
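For reference, this is the cold-start recovery dance I mean, as I understand it from Percona's documentation (assuming Percona XtraDB Cluster 5.7+ with the systemd units; the grastate.dat contents below are a made-up sample, written to a scratch directory so the sketch is safe to run anywhere — on a real node the file lives in the data directory, usually /var/lib/mysql):

```shell
# Sample of what a node's saved Galera state looks like (hypothetical
# values; on a real node you would read the existing file instead).
DATADIR=$(mktemp -d)
cat > "$DATADIR/grastate.dat" <<'EOF'
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   1234
safe_to_bootstrap: 0
EOF

# Step 1: run this on EVERY node; the one with the highest seqno has
# the most recent data. A seqno of -1 means an unclean shutdown --
# recover the real position there with `mysqld --wsrep-recover`.
grep -E 'seqno|safe_to_bootstrap' "$DATADIR/grastate.dat"

# Step 2: on the most advanced node ONLY, mark it safe to bootstrap.
sed -i 's/^safe_to_bootstrap: 0$/safe_to_bootstrap: 1/' "$DATADIR/grastate.dat"

# Step 3 (on a real cluster, not runnable here):
#   systemctl start mysql@bootstrap.service   # on the bootstrap node
#   systemctl start mysql                     # on the others; they rejoin via IST/SST
```

That sequence has worked for me before; last week it didn't.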
This is exactly what I used to worry about when it came to Ceph. The conversation goes like this:
– Can I have my data now?
– No, can’t give you any data because Partition.
– Oh, the cluster is partitioned?
– Yes.
– Can you discard nodes that are unable to join?
– No, because Partition.
And so on. Split-brain was easier to solve on GlusterFS, and that involved hand-editing bitmasks. I understand the whole idea of data reliability and that clustering makes it harder, but forcing a node to become the master should always be available as a fallback.
My conclusion is this: if you want a horizontally scalable database, then you can't have a relational database. You're going to have to use MongoDB, Cassandra or one of those graph databases. The exception might be if you can afford IBM DB2 or whatever Oracle sells. But by and large, don't do multi-master RDBMS. The people who make PostgreSQL seem to agree: https://www.postgresql.org/docs/8.2/high-availability.html