Performing a backup is one of those tasks that ensures your application can recover from database or hardware failures should they ever occur. Over four years ago, we recognized this as a best practice and came out with PGBackups, an add-on that reduces the risk and complexity of taking database backups. Today, we’re pleased to announce two big improvements: enhanced reliability, and the ability to schedule backups.
One of the main drivers for the upgrade was the occasional backup stall experienced by users. In some cases, PGBackups would encounter a bug that resulted in degraded performance of the database while a backup was being taken. This had adverse effects for the database as well as the applications that counted on it. While we espouse Continuous Protection for our standard plans and above, we needed to fix the issue of these backup stalls -- so we did.
Dirk Kelly, a Lead Software Engineer at Interexchange.org, has been using the new backup system for a few months now and says that “[he] uses it heavily at Interexchange to sync data between environments. Since using the new system, we’ve noticed an increase in reliability and performance.”
Because PGBackups has been out in the wild for such a long time, we’ve had the opportunity to see the many different ways it has been used. One of the main use cases was scheduling backups. We saw thousands of manual backups being performed daily across all of the databases we host, so we wanted to give everyone the ability to perform a backup on their own schedule without resorting to Heroku Scheduler. With the new PGBackups, you can do just that. For each database in your application, you can specify the time and time zone at which the backup process should start:
heroku pg:backups schedule HEROKU_POSTGRESQL_GOLD --at="02:00 PDT" --app sushi
In this example, we’re kicking off our daily backup process at 2 AM Pacific time. When scheduling with the --at option, you can specify a timezone abbreviation, like PDT, or the full time zone designation, like ‘America/Los_Angeles’. The best part about this new feature is that there’s no need to manage an external crontab or Heroku Scheduler to invoke scheduled backups.
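As a rough sketch of the full time zone form, the snippet below builds the same schedule command with ‘America/Los_Angeles’ in place of PDT. The schedule_cmd helper is hypothetical (it is not part of the Heroku CLI) and only prints the command so you can review it before running it yourself; the database name and app name are carried over from the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Heroku CLI): prints the schedule
# command for review instead of executing it.
schedule_cmd() {
  db="$1"   # database config var name, e.g. HEROKU_POSTGRESQL_GOLD
  at="$2"   # time plus time zone, e.g. "02:00 America/Los_Angeles"
  app="$3"  # app name
  printf 'heroku pg:backups schedule %s --at="%s" --app %s\n' "$db" "$at" "$app"
}

# Same 2 AM Pacific schedule, written with the full time zone designation.
schedule_cmd HEROKU_POSTGRESQL_GOLD "02:00 America/Los_Angeles" sushi
```

A full designation may be the safer choice for a recurring job, since an abbreviation like PDT names only the daylight-saving half of the year.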
Heroku Postgres Backups still uses pg_dump to create the backup on disk, which works well for databases that are 20GB in size or less. The reason we have a limit around 20GB is that pg_dump on larger databases causes contention for IO, memory and CPU. As a result, the longer run time needed to complete the backup increases the chance of an error that will end your backup capture prematurely.
If your database is larger than 20GB in size, don’t fret! Heroku’s Continuous Protection has got you covered. As a best practice, when a database does get big enough, the backup mechanism should transition from using a tool like pg_dump to creating binary copies of the cluster files and Write Ahead Logs (WAL).
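The 20GB guideline can be expressed as a simple rule of thumb. The snippet below is only an illustration: backup_approach is a hypothetical helper, the size is assumed to be given in whole gigabytes, and the labels it prints are made up for this example rather than taken from any Heroku tool.

```shell
#!/bin/sh
# Hypothetical helper: suggests a backup approach from the database size.
# Assumes the size is passed as a whole number of gigabytes.
backup_approach() {
  size_gb="$1"
  if [ "$size_gb" -le 20 ]; then
    # Small enough for a logical backup taken with pg_dump.
    echo "pg_dump"
  else
    # Larger databases: rely on binary copies of cluster files plus WAL.
    echo "continuous-protection"
  fi
}

backup_approach 8
backup_approach 150
```

The threshold is a guideline rather than a hard cutoff, so a database hovering near 20GB is worth watching for slow or failed captures.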
Over the course of the next two weeks, we will be working to migrate all of the old backups from the PGBackups add-on to the new system. You will be notified when this cutover will be taking place. On top of that, we will be removing PGBackups from the Add-on Marketplace starting today. As a result, the backup commands have changed slightly because they’re now included in the pg namespace of the Heroku Postgres add-on. This means that you should start using the new commands immediately and, to get you started, we’ve provided a mapping document between the old commands and the new ones.
heroku pg:backups [subcommand]
Of course, if you don’t like our approach to backups, you don’t need to use PGBackups. It’s an add-on like any other in the Add-on Marketplace, and we welcome other add-ons that may be different or better for different circumstances. Ours is but one approach to doing backups and we want to make sure that developers have the choice and flexibility to build a solution that works for them.