A common question MySQL users have after populating their first database is, “how do I make a backup?”
Like most things in life, it really depends on your requirements:
- How much data do you have?
- How often does data change?
- How much downtime can you afford per week?
- How many tables are read-only, how many are write?
- How valuable is the data?
- How much data can you afford to lose?
- How fast do you want to restore?
- How much manual intervention in the backup process is acceptable?
- How much manual intervention in the restore process is acceptable?
- How often can you do backups?
- Do you need full or incremental backups?
- Which storage engine are you using?
- Do you need consistent data within and across tables?
- Can you recompute your data instead of backing it up? (common in science)
- Can you recalculate your indexes, or are they especially large or versioned?
- What type of data is it? BLOB?
Once you understand your requirements, you can look at the available options. These are listed in order of overall reliability, starting with the most reliable:
- True Backups (consistent, binary copy of original tables)
- cold (binary) backup – fool-proof, consistent, useful, constant speed (would be fast on ZFS), requires database shutdown
- InnoDB Hot Backup (ibbackup) for InnoDB tables
- filesystem snapshots – more difficult to setup, consistent, useful, fast (instant), requires FLUSH TABLES WITH READ LOCK and all files on same volume
- binary logs – easiest to setup (my.cnf: log-bin), not easiest to restore, must be enabled for replication.
- Near-True Backups (possibly inconsistent, binary copy of original tables)
- mysqlhotcopy for MyISAM tables – complicated by locking, not consistent unless you lock all the tables, speed depends on lock availability
- Non-True Backups (better than nothing, corruption or failed restores possible)
- LOAD DATA FROM MASTER and LOAD TABLE FROM MASTER for MyISAM tables
- SELECT … INTO OUTFILE – not difficult to setup, can be consistent with read locks, slow
- mysqldump – not difficult to setup, can be consistent with –lock-all-tables or –master-data options, slow, prone to bugs so may not restore
- replication – statement and binary – moderately complicated to setup and manage, real-time, prone to bugs, prone to interruptions. statement-based replication is single-threaded and can lag behind a master that does a lot of writes.
And a decision needs to be made how and where to store the backups:
- hard drives aren’t permanent, safe storage media
- tape backups are portable, but must be tested by restoring them periodically. You would also need to keep a working drive for the forseeable future.
- not sure of the shelf-life of burned DVDs
- punch cards can be mis-sorted (just kidding)
- do the backups need the same level of privacy as the live database? (HIPAA, secret, trade-secret, etc.)
- how long does it take to retrieve a physical backup media? Corporations in North America like using Iron Mountain, but it would take a while to retrieve your backup tapes from the bottom of a salt mine shaft.
Personally, I generally use periodic cold backups, statement-based replication to another data center, and nightly mysqldumps on key tables to a local spare hard drive to backup production systems.
mysqldump is very popular, but I am not comfortable in relying on it. (That’s a polite way of saying I don’t trust it.) However, the -d option is a handy way of dumping the database schema only to a text file for backup though.