MySQL crash recovery

Let’s crash some MySQL-databases, shall we? Well that’s harder than it seems. Especially if you’ve had to recover crashed databases on production servers. If they – set up to be very reliable – can crash during a network failure or hard reset, how come I can’t replicate the same when I tweak all settings to be at their most unreliable? Maybe the uptime of the server, the size of the database on disk, number of distinct databases and so on play a role? Well, let’s try to investigate MySQL crash recovery as best we can.

Hmm, maybe we should first establish some method of simulating a workload? Well, I have a script that simulates the sort of workload that you would see in high volume eCommerce. It makes full use of transactions to make multiple updates to multiple tables happen atomically(i.e. completely or not at all). Basically it pretends that there are customers that add credit to their account and then make purchases using that credit. Let’s try it out:

The script ownsysbench.py starts on process per IP or hostname provided and each runs 12 threads that each simulate the workload. As we saw in mytop, the number of queries per second varies between 700 and 100. This is on a dual core VM running on an Intel Atom server. So I’m kind of impressed. But in fairness I ran this on an ext4 filesystem with nobarrier on and relaxed commit rules(https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit). Let’s try it again with standard barrier for ext4 and default commit rules. Oh, maybe I should turn off Cache(unsafe) for the disk in Proxmox? That probably gave this run a performance advantage. Okey, so no extra disk caching, default flush of transactions to disk and ext4 has its barrier again:

So 600 to 800 queries per second. That’s not so bad. It’s not like it’s a tiny database either:

Well, 90 MB is kind of tiny but 200 000 customers is on the big side. With a smaller database we get up to 1200 and 1300 queries per second. Now, the point here is actually just to get a feel for what kind of load we’re putting on the system when we tweak variables. Even with all safeties on we have several hundred queries per second, of which many are INSERT and UPDATE.

So one could be forgiven for thinking that rebooting the server would cause problems. Whenever you pull the plug there will probably be a couple of transactions running. Let’s try it with these safe settings. I’m using a slightly smaller database size but we can try the big one later on:

I have set MySQL to not start automatically, to make it easier for me to spot error states. But as you can see, MySQL starts up without issue. It’s only when we look in the error.log(it’s a general log, not really sure why they named it that) that we see that some recovery-work had to be done. But it was entirely automatic.

5 transaction(s) which must be rolled back or cleaned up in total 6 row operations to undo 
2020-03-21T21:19:59.885968Z 0 [Note] InnoDB: Trx id counter is 373760 
2020-03-21T21:19:59.885997Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
<...>
2020-03-21T21:20:00.715726Z 0 [Note] InnoDB: Rolling back trx with id 373313, 1 rows to undo 
2020-03-21T21:20:00.715914Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1" 
2020-03-21T21:20:00.715974Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables 
2020-03-21T21:20:00.716079Z 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ... 
2020-03-21T21:20:00.724026Z 0 [Note] InnoDB: Rollback of trx with id 373313 completed 
2020-03-21T21:20:00.724074Z 0 [Note] InnoDB: Rolling back trx with id 373312, 1 rows to undo 
2020-03-21T21:20:00.731498Z 0 [Note] InnoDB: Rollback of trx with id 373312 completed 
2020-03-21T21:20:00.731537Z 0 [Note] InnoDB: Rolling back trx with id 373309, 1 rows to undo 
2020-03-21T21:20:00.734520Z 0 [Note] InnoDB: Rollback of trx with id 373309 completed 
2020-03-21T21:20:00.734557Z 0 [Note] InnoDB: Rolling back trx with id 373286, 1 rows to undo 
2020-03-21T21:20:00.737645Z 0 [Note] InnoDB: Rollback of trx with id 373286 completed 
2020-03-21T21:20:00.737681Z 0 [Note] InnoDB: Rolling back trx with id 373213, 2 rows to undo 
2020-03-21T21:20:00.890034Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 
2020-03-21T21:20:00.891339Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active. 
2020-03-21T21:20:00.891411Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active. 
2020-03-21T21:20:00.892242Z 0 [Note] InnoDB: Waiting for purge to start 
2020-03-21T21:20:00.892640Z 0 [Note] InnoDB: Rollback of trx with id 373213 completed
2020-03-21T21:20:00.890034Z 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB. 
2020-03-21T21:20:00.891339Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active. 
2020-03-21T21:20:00.891411Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active. 
2020-03-21T21:20:00.892242Z 0 [Note] InnoDB: Waiting for purge to start 
2020-03-21T21:20:00.892640Z 0 [Note] InnoDB: Rollback of trx with id 373213 completed 
2020-03-21T21:20:00.892756Z 0 [Note] InnoDB: Rollback of non-prepared transactions completed

mysqlcheck gave the tables a clean bill of health and the internal consistency is correct according the benchmark software itself:

cjp@kdemain:~/sysbench$ bin/python3.7 ownsysbench.py --hosts 192.168.1.154 --check true                 
192.168.1.154 
Connection to 192.168.1.154: <mysql.connector.connection.MySQLConnection object at 0x7fbbda456710> by X 
self.conn.in_transaction: False 
Checking correctness 
Numbers check out on a high level. Checking each customer.
cjp@kdemain:~/sysbench$

So I’ve tried turning off ext4’s barrier(letting writes be done a bit more sloppy), flush of innodb logs on transaction commit(they are then done once a second instead) and even that caching-mode in Proxmox called “Writeback (unsafe)”. InnoDB hasn’t given me a single complaint… As mentioned earlier, I have seen InnoDB crash in the wild so it’s not completely safe but I at least can hit it with a shovel for a couple of days and fail to make a dent.

I was planning on provoking an error which could be solved by starting MySQL in InnoDB force recovery mode level 1 and demonstrate how that can fix things. This setting(https://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html) is something you hope not having to fiddle with but when InnoDB does crash in some way, innodb_force_recovery is the way forward. We would move on to something that requires recovery level 2 or 3. All the way up past level 4. Levels 4 and up typically involve some data loss and usually requires exporting and importing all the databases.

As noted above however, no go on generating these kinds of errors as intended. But if we can’t get one of those elusive errors that can be solved with innodb_force_recovery=3 then I know a way to generate errors that can be “solved” with innodb_force_recovery=6. So let me reset the database server while it is running again, with 200 000 customers and maximum danger settings on. Then we can do some damage ourselves and see what happens.

Okey, that actually didn’t go to plan… I was expecting to have to go to level 6 and for mysqldump to fail at some stage. Maybe I deleted the wrong ib_logfile? I have a snapshot of the whole mysql partition after the reset. I can try again. No, now not even level 6 worked. All right, let’s start over. I have – as previously mentioned – been hitting this setup with a shovel for a couple of days.

So I switched to Btrfs instead of Ext4, for no other reason than feeling like it. Now let’s try that intentional break again:

Again we only had to go to level 5 of innodb_force_recovery but importantly we got stuck when trying to dump a specific table from the database. In a real life situation it’s easy to imagine how losing the list of items for all purchases would be a bad thing. I’m pretty sure audits by the IRS go better when you can show what you sold and to whom.

So while this doesn’t show tremendously well how to deal with more subtle forms of InnoDB crashes you get the general idea. Notice how I created a Btrfs snapshot before starting to tinker with stuff. That can be a good idea if the option is available. Here I intentionally broke something but the process of fixing things could actually break things even more. And the most important take-away here is that not all corruptions are recoverable. Even if all the data in the sysbench-folder is correct a failure in one of the ib_logfiles can leave you with entire tables being unrecoverable. You got to have backups!

So what about MyISAM? Well it’s a lot easier to break MyISAM. In fact, since the benchmark tool is based on transactions – and MyISAM doesn’t support transactions – I only need to run it for a few seconds to get garbage data out. Let’s first dig ourselves out of the hole we dug for ourselves.

But MyISAM says this clearly on the box so we can’t very well complain. Need transactions? Don’t use MyISAM. Need crash tolerance? Don’t use MyISAM. Have large table? Don’t use MyISAM. It’s surprising how many MySQL-databases in the world don’t conflict with those constraints and merrily run MyISAM without issue. But let’s break a MyISAM setup just for kicks!

This run was done with maximum safe settings. Btrfs doesn’t have a nobarrier equivalent as far as I know. But Proxmox added no caching and MySQL’s settings were conservative. The errors that arose were recoverable but we can’t tell if the contents of the databas is consistent with the application because as we saw before, MyISAM can’t guarantee that even when there’s no one doing a hard reset every couple of minutes.

So that was a slightly disjointed tour of MySQL failure and recovery. The next installment will be about how to cover our backs. Step 1 is clearly to use InnoDB as the underlying engine and I guess step 2 is not deleting ib_logfiles. Snapshots, binlogs and replication will follow.