MySQL Log Sequence Error: solved, sort of

Back in November, I noticed that I was getting a lot of MySQL crashes, with MySQL’s logs saying that it had a log sequence error. It was actually MariaDB, the community-managed version of MySQL (as opposed to Oracle’s “MySQL proper“), but for all intents and purposes, it was MySQL.

I tried everything I could think of: delete the logs, restore from binary backup, restore from text backup, everything. After switching away from MariaDB and to the “actual” MySQL, it seemed to be more stable at last…

But last week my wife reported that her site was having some of the same problems. I took a look at the logs again, and .. the log sequence errors were back. What’s more, while watching the logs (via tail -f mysql.log I could see new failures happen, through routine usage: MySQL crashed, then restarted. Then crashed, and restarted.

Back to the drawing board: first, I repeated some of the same steps I’d done before, making backups just in case. I cleared out the database, and recreated it from scratch; same problems.

By this time I was really frustrated, but then I noticed someone mentioning… memory.

I immediately cranked up top (actually, htop), and noticed mysqld was using a ton of RAM, and that my server was swapping quite a bit.

I’d not tuned MySQL (as the RamNode wiki actually suggests you do!) but my first thought was that I wanted this server to actually be used for more things – so I bumped up my VPS to give it three gigabytes of RAM. It’s a new service plan, but the nice thing about it was that as soon as I ordered it, the instance was allocated more RAM – I literally watched it happen via htop.

And all my MySQL crashes went away. It hasn’t had a crash since then.

My other server – oddly enough, the one that runs this site – is still on a more constrained instance. I may move it at some point, because I don’t know that I need or want two servers; they’re just cheap enough that it wasn’t a big deal.

MySQL log sequence error

I just had the most fun time ever with MySQL backing an instance of WordPress. I’m writing up what happened and how I “fixed” it – note the quotes – just so others are aware of it. Maybe someone has insight into how it happened or has a better idea of how to fix it for the future.

The Errors

The timeline for the error stretches back months, to an ill-advised VPS update on my part back in August or September, I think.

I use RamNode as a hosting provider; I chose Fedora as my OS (because it’s Fedora, and I like Fedora.) However, it’s limited to Fedora 20. I wanted to run something more current (since Fedora 20’s a little bit outdated) so I went through the update process — which broke everything catastrophically.

RamNode justified itself, by providing me with a system image backup (which isn’t, by the way, their responsibility – they just came through anyway. I highly recommend RamNode.) So I rebuilt the image. I copied all of the WordPress files back into place, and did the same for the MySQL database directories.

The truth is: it was MariaDB, not MySQL. All of the commands are the same and the database files are binary-compatible; I prefer MariaDB for social reasons. By the end of the adventure, though, I was using the community version of MySQL instead of MariaDB just in case it was something about the MariaDB release. Voodoo debugging on my part. 🙁

I got a few log sequence number errors there, but removing the old log files cleared that up, or so I figured. At the very least, I didn’t see any problems.

The “Fix”

Now let’s zoom up to November. My wife told me her site was performing very poorly, and since her stuff’s pretty important to me (it’s hers!) I took a look at the logs to try to figure out what was happening to the database.

It turns out that the log sequence problem was back, with a vengeance. Now every database interaction was firing off dozens of log sequence errors, resulting in the database being killed and restarted. No wonder her site was performing poorly.

I removed the log files again (to try to get the log sequence reset, because ignorance) and that didn’t fix anything; I tried to do a mysqldump and it couldn’t even read the data. I’d get errors trying to make a backup.

This was not good.

I then took a copy of the backup (made via tar cvzf) to a virtual machine (through VirtualBox, on a Fedora image). Here’s the odd thing: the machine in VirtualBox had no problem reading the database. Record counts were fine, all of the data was there.

I didn’t alter the files at all – but took the opportunity to dump the data (via mysqldump again, this time getting a valid SQL dump.)

I went back to the VPS, and uninstalled the database, cleared the directories, and then restarted the database (i.e., exactly what I’d done with my virtual image). (This is when I switched from MariaDB to MySQL, incidentally.)

… and what happened, you ask? Well, the same thing – lots of journal sequence errors. Here’s a sample:

151115 15:42:11  InnoDB: Error: page 1749 log sequence number 632633543
InnoDB: is in the future! Current system log sequence number 510316477.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.

This was concerning. I was about to have to open up the MySQL files and alter them directly (which I’d been trying to avoid.) But I wanted to try one more thing, since I’d done the data dump on my virtual image…

So I shut down the database, and removed the data again. I restarted the database server, and created the database and database user.

Then I fed the SQL into the mysql client, and ran some simple queries to see record counts and some data. No errors.

Then I restarted WordPress … and lo and behold, everything seems to be okay.

The moral of the story is, of course, to make regular backups and watch your servers… I’m hardly a MySQL admin, and clearly I need to get better about all of this.