Automatic, Incremental Backups of MYSQL Databases

Mysql is the world's most popular database for Internet use, powering applications on thousands of websites.  It's used in most blogs (such as Wordpress), most CMS's (such as Joomla and Drupal), and most forums (such as phpBB).  It's very likely that you use one or more Mysql databases.

Mysql databases need to be backed-up, and backup should usually be done more than once per day.   Rimuhosting (my VPS provider, which I highly recommend) backs up VPS servers once per week. Obviously, if you use mysql databases for frequently-updated data, you *must* have a backup and recovery strategy.  (A RAID sometimes fails!)

This article (with backup script) shows you how to set up full and incremental compressed backups of multiple databases to Rimuhosting's (or another) backup server using mysqldump and mysql binary log files. Backup will be done every 6 hours (configurable), with a full backup once per week.  Since backups are incremental and are compressed, the amount of data to be transmitted is relatively small.  It is assumed you have a running mysql server, which should provide you also with the utility programs needed for both backup and recovery.  This backup method should work for any mysql database storage engine (e.g., myisam and innodb).  It is however tested only for myisam databases on Debian Etch.
 
MYSQLDUMP
mysqldump is a utility program that reads databases and generates the mysql statements necessary to completely rebuild the databases.   It can be used to make a full backup of one or more databases in a single operation.
 
BINARY LOG FILES
Mysql can be configured to generate "binary logs."  A mysql binary log file in an incremental backup.  It contains the mysql  statements needed to update the database since the last full or incremental backup.
 
The backup strategy is:

  1. Generate a full backup of databases with mysqldump.  This full backup is compressed and backed-up to off-site storage via FTP. 
  2. With binary logging enabled, every mysql action which modifies the database after the full backup is stored in the current binary log file.
  3. Every 6 hours (configurable), the current binary log file is closed, and it is FTP'd to the backup server.  A new binary log file is created to continue to record changes to the databases.
  4. Step 3 is repeated.  Once per week, a new full backup is created (step 1), and all binary log files are purged.  The previous full backup and the now-obsolete binary log files are deleted from the backup server.

The restore strategy is:

  1. The full backup and all binary log files are copied to the server where the databases are to be recreated  (the databases will be created if they do not exist.)  The full backup is restored.
  2. Apply the incremental database changes contained in each binary log file, consecutively.  The utility programs "mysqladmin" and "mysqlbinlog" are used for this.  This brings the databases up-to-date to the moment of
    the latest incremental backup.

Utility Programs

  • mysqldump and mysqladmin are part of the mysql-client-x.x package
  • mysqlbinlog is part of the mysql-server-x.x package
  • lftp: Sophisticated command-line FTP/HTTP client program (required).  Install it with apt-get install lftp

 

 

IMPLEMENTATION
 
1. Configure mysql to do binary logging.

edit /etc/mysql/my.cnf:
Add:
log-bin = mybinlog

 You can specify which databases to do binary logging for, or which databases NOT to do binary logging for.
 
"binlog_do_db" turns binary logging on for a given database.  If this statement is used, all databases not specifically named will have binary logging turned off.  Multiple statements can be made, one to a line:

binlog_do_db = drupalaa
binlog_do_db = tosdrupal1

Alternatively, you can use the statement binlog_ignore_db.  This statements turns binary logging on for all databases EXCEPT the database(s) names.  Again, you can make several binlog_ignore_db statements, one to a line:

binlog_ignore_db  = sqlgrey
binlog_ignore_db  = uselessdb

If you have applications that use more than one database, you should be careful when using either binlog_do_db or binlog_ignore_db.  If you have doubts, you should read http://dev.mysql.com/doc/refman/5.0/en/binary-log.html.
 
Restart your mysql server:

/etc/init.d/mysqld restart

 
2. Set up automatic full and incremental backup via cron

(The following script assumes the backup directory on the FTP server is /myvar/lib/mysql. Modify this as desired.  The script refers to the Rimuhosting backup server.  However, the script can easily be modified to back up to any FTP server.  I can install this system on most servers for a fee.  If interested, contact me)

Copy this script to /root/crontasks/newmysqlbinlog:

#!/bin/sh
#echo "Creating mysql new binary log at `date`"
if [ `date +%A` == "Sunday" -a `date +%H` == "06" -o "$1" == "dump" ]; then
        echo "Weekly Backup started `date`"
        echo "Full mysql database dump started"
        echo 'All existing full backups and binary log files will be removed'
        PREFIX='mysql-dump.'
        DT=`date "+%m%d%y"`
        DBFN=$PREFIX$DT'.sql'

        rm -f /var/lib/mysql/*.bz2

        mysqldump -uusername -ppassword --flush-logs --delete-master-logs --master-data=2 --add-drop-table --lock-all-tables --databases database1 database2 database3 > /var/lib/mysql/$DBFN
        bzip2 /var/lib/mysql/$DBFN
        echo "mysql dump complete"
else
#       echo "starting new bin log"
        mysqladmin -uusername -ppassword flush-logs
fi
newestlog=`ls -d /var/lib/mysql/mybinlog.?????? | sed 's/^.*\.//' | sort -g | tail -n 1`
for file in `ls /var/lib/mysql/mybinlog.??????`
do
        if [ "/var/lib/mysql/mybinlog.$newestlog" != "$file" ]; then
                bzip2 "$file"
        fi
done

lftp -u 'ftpuser,ftppassword' backupspace.rimuhosting.com -e "set ftp:ssl-protect-data true; mirror -er --reverse -I *.bz2 -X $newestlog /var/lib/mysql /myvar/lib/mysql; mput /var/lib/mysql/mybinlog.index -O /myvar/lib/mysql; exit;"
#echo "Bin Logs backed up"

 
Edit the above script to reflect your mysql username (probably root) and password, as well as your FTP backup server username and password.  Replace "database1 database2" with the names of the databases to back up. (The "echoed" statements will be emailed to you, the system administrator, by cron.  Remove these lines if you do not want the emails.)

The script handles both incremental and full backups, compresses all files before transmission, and automatically cleans outdated backup files from the backup ftp server.  It performs one full backup per week, on Sunday at 06:02.  You can force the creation of a full backup (for testing or whatever purpose) by passing the argument "dump".

The "flush-logs" mysql command is used to create a new binary log file.  This command might be issued to the mysql server outside of this script (such as when the mysql server is restarted), producing an extra binary log file.  In this case, the script will compress and back up the extra binary log file the next time backup is done.
 
Store the following line in /etc/crontab to execute newmysqlbinlog every 6 hours (this is of course configurable):

2 */6 * * * root /root/crontasks/newmysqlbinlog

DATABASE RECOVERY
 
(If you have to restore damaged MyISAM tables, try to recover them using REPAIR TABLE or myisamchk -r first.)
 

  1. The full backup will be bzip2-compressed, and will look like this: mysql-dump.122307.sql.bz2  Binary log files will look like this:  mybinlog.000006.bz2)  Copy all these bz2 files to the server where you need to recover the databases.
  2. Decompress all the backup files (for example: bunzip2 mysql-dump.122307.sql.bz2 and bunzip2 mybinlog.000006.bz2).
  3. Run a command like this: mysql < mysql-dump.122307.sql  This should recreate the databases as they were at the time of the full backup.  (If the the databases exist, any existing tables will be removed!) 
  4. Next run a command like this: mysqlbinlog binlog.[0-9]* | mysql  This command tells the program "mysqlbinlog" to process all the binary log files, converting them into mysql statements.  Mysql applies all the sql commands, bringing the databases up-to-date, to the moment of the last incremental backup.

 
For any backup system, It is a good idea to practice the recovery step before assuming that you are protected!

 

PLEASE READ THE MANUAL!

The options you have specified DO NOT limit binary logging for the database.

binlog_do_db = drupalaa
binlog_do_db = tosdrupal1
binlog_ignore_db = sqlgrey
binlog_ignore_db = uselessdb

If you use the command "use sqlgrey;" and then issue "delete from drupalaa.users;" the commands will not be logged in the binlog as your current database is set to sqlgrey. Filtering is performed on the connection NOT on the SQL commands issued.

See the following URL for a proper explanation of these options http://www.mysqlperformanceblog.com/2009/05/14/why-mysqls-binlog-do-db-option-is-dangerous

It is true that the binary

It is true that the binary logging controls (binlog_do_db and binlog_ignore_db) do their filtering based on the default database (the one in USE). That could potentially cause trouble, but not, I think, under most circumstances in which these backup instructions would be used.

In sum, to avoid the trouble you have cited, one must name, in binlog_do_db statements, ALL the databases USED by your applications for which you want to back up data.

Moreover, an application which USES a database that you DON'T want to log queries for should not ever USE (or probably even know about, or have access to) a database you DO want to record statements for.

If the above two conditions are not met, there is definitely a danger here. Thanks for pointing this out!

(In my case, the sqlgrey database is used by an entirely separate process which knows nothing about the Drupal databases drupalaa and tosdrupal1.)

Really nice work

This is really a nice work, it works for me.

Thanks a lot!

Thanks for letting me know

Thanks for letting me know this was useful to you!
--
Lloyd

Great, great job. Very

Great, great job. Very useful to me.

Thanks.