Easily one of my favorite things about developing apps with Heroku is its bevy of console commands that make it easy to manage Postgres databases. Recently, however, as one of my apps crossed the threshold from a small “hobby” tier database to a larger “standard” database, the heroku CLI started warning me that several of my favorite commands could risk degrading the database’s performance and its availability to my production application.
[Related: I ignored these warnings and ran the commands anyway, only for Bugsnag to send me dozens of emails as my users began to experience database network timeouts and transaction failures. Would not recommend.]
Unfortunately, the documentation explaining why it’s more complicated to fetch backups from standard and premium databases doesn’t make it terribly clear how to do it anyway. So I wrote this post as a recap of Heroku’s backup facilities for fetching hobby-tier databases, followed by the quickest, most direct path to safely using those same tools once you’ve moved on to a larger database.
Backup tools available to small databases
To begin, let’s recap some of the fun and convenient commands that make development such a breeze for small databases, because the `heroku` CLI lets you do several really cool operations much more easily than the built-in Postgres commands do.
Pull it down
My favorite command for fetching a production database to reproduce something locally is easily `pg:pull`. With it (and its sibling, `pg:push`), you can fetch and restore the entire database with an easy-to-remember one-line command.
In most simple cases (and assuming you have your `heroku` git remote pointing to the right app), the command will look like this:
```
$ heroku pg:pull DATABASE_URL some_new_local_database
```
The above will download the full database and restore it to a new local Postgres database named `some_new_local_database`. This is super handy, because if you’re using a framework like Rails, you can just update your local `config/database.yml` to point to the new database:
```
default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

development:
  <<: *default
  database: some_new_local_database
```
Then restart your server or console and start playing around with a pristine copy of whatever had just been in your production database.
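One small gotcha worth knowing: `pg:pull` creates the local database itself and will bail out if a database by that name already exists, so when you want to refresh an older copy, drop it first. A minimal sketch, sticking with the example name from above:

```
# Drop the stale local copy before pulling a fresh one
$ dropdb some_new_local_database
$ heroku pg:pull DATABASE_URL some_new_local_database
```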
I’ve only needed it once or twice when deploying major data migrations, but `pg:push` can be used much the same way in reverse. You’ll just want to make sure to turn on maintenance mode first and understand that pushing requires you to first delete any data in the remote database.
What this enables are simple workflows like the following (assuming your users can stomach the downtime):
```
# Disable access to your application
$ heroku maintenance:on
# Pull down the data
$ heroku pg:pull DATABASE_URL data_to_migrate
# Do whatever you need to run and validate your migration tasks locally
$ script/run_your_migration_or_otherwise_do_stuff
# Delete the remote data (which requires confirmation input)
$ heroku pg:reset DATABASE_URL
# Push up the migrated data
$ heroku pg:push data_to_migrate DATABASE_URL
# Re-enable access to the application
$ heroku maintenance:off
```
I don’t know about you, but for something as fraught as “delete and replace all of my production data”, it’s really nice to only need five easy-to-remember commands to pull it off.
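One aside for the script-minded: the destructive `pg:reset` step normally stops and waits for you to type the app name. If you ever fold this workflow into a script, the standard `--confirm` flag lets you supply that confirmation up front (the app name below is just a placeholder):

```
# Non-interactive variant of the reset step
$ heroku pg:reset DATABASE_URL --confirm your-app-name-here
```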
Save a backup dump
If you want to save your production data to a file as opposed to a local Postgres database, that’s a breeze, too. It requires two commands, `pg:backups:capture` and `pg:backups:url` (docs):
```
$ heroku pg:backups:capture
```
Which will output:
```
Starting backup of postgresql-something-1234... done
Use Ctrl-C at any time to stop monitoring progress; the backup will continue running.
Use heroku pg:backups:info to check progress.
Stop a running backup with heroku pg:backups:cancel.
Backing up DATABASE to b1799... done
```
Then with that `b1799` identifier, you can fetch a URL and download the dump file like this:
```
$ heroku pg:backups:url b1799
```
This will output a very long and very secure-looking URL, which you can paste into a browser to download or otherwise pass to another command like `curl` or `wget`.
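If you’d rather not copy a giant URL out of your terminal, a couple of shortcuts work nicely here. A quick sketch, using the `b1799` identifier from the capture above (`latest.dump` and `restored_from_backup` are just names I picked for illustration):

```
# Hand the signed URL straight to curl...
$ curl -o latest.dump "$(heroku pg:backups:url b1799)"

# ...or let the CLI download the dump for you
$ heroku pg:backups:download b1799

# Either way, pg_restore can load the dump into a fresh local database
$ createdb restored_from_backup
$ pg_restore --no-acl --no-owner -d restored_from_backup latest.dump
```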
Schedule regular backups
Even though a `hobby-basic` Postgres database only costs $9 per month, it includes an automated backup scheduler, which you can enable with something like:
```
$ heroku pg:backups:schedule --at 12:00
```
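The `--at` flag also accepts a time zone if you’d rather not do UTC math in your head, along the lines of:

```
$ heroku pg:backups:schedule DATABASE_URL --at '02:00 America/New_York'
```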
You can verify your hobby database’s schedule with:
```
$ heroku pg:backups:schedules
=== Backup Schedules
DATABASE_URL: daily at 12:00 UTC
```
And you can see a list of backups (including the identifiers you’d need to invoke `heroku pg:backups:url`, as shown above) with the top-level `pg:backups` command:
```
$ heroku pg:backups
=== Backups
ID    Created at                 Status                               Size    Database
───── ────────────────────────── ──────────────────────────────────── ─────── ────────
b1799 2019-11-11 02:51:55 +0000  Completed 2019-11-11 02:51:57 +0000  4.42MB  DATABASE
# ……
```
Not bad for nine bucks!
Downloading a backup of larger databases
Even though an entry-level `standard-0` database will run you $50 per month, the `heroku pg:backups:schedule` command will warn you not to use it, and `pg:backups:capture` and `pg:pull` will very likely cause connection timeouts between your production application and the database, leading to availability issues for your users.
All this might seem like a downgrade, but it’s important to keep in mind that for the primary purpose of actually backing up and recovering critical data, the standard & premium tier databases provide physical backups that are far more reliable than the limited logical backup tools made available to the smaller hobby tier.
But we’re not here to use backups to protect our data, we’re here for quick and convenient ways to download our production data for our own nefarious purposes!
If you attempt to schedule a backup of a larger database using the method shown above, it’ll succeed, but the CLI will spit out this warning:
```
$ heroku pg:backups:schedule --at 00:00
▸ Continuous protection is already enabled for this database. Logical backups of
▸ large databases are likely to fail.
▸ See
▸ https://devcenter.heroku.com/articles/heroku-postgres-data-safety-and-continuous-protection#physical-backups-on-heroku-postgres.
Scheduling automatic daily backups of postgresql-something-1234... done
```
If you try to find an alternate method by reading the article at the URL above, you’ll likely be disappointed by the lack of one-off commands that are as simple and memorable as what you’re used to for smaller databases. But fear not, there’s an easy workaround!
Forking your primary database
The simplest path to “just give me a copy of production without crippling my app” is to create a fork of your primary database and then fetch it as you normally would, comforted by the knowledge that it’s completely separated from your production application and won’t break anything.
First, you’ll need to know which addon tier your production database is on, because you’ll probably want your fork to be the same size. You can find this by running `pg:info`:
```
$ heroku pg:info DATABASE_URL
=== DATABASE_URL, HEROKU_POSTGRESQL_AQUA_URL
Plan: Standard 0
# ……
```
Because my primary database’s plan is “Standard 0”, we need to provision a `heroku-postgresql:standard-0` addon for our fork, like this:
```
$ heroku addons:create heroku-postgresql:standard-0 --fork DATABASE_URL
```
This will do two things: first, it will provision a new database from the most recent physical production backup (which might be up to 30 hours old), and then it will replay all of the intervening write transactions (stored as WAL files) to catch it up to the current state of production. This can take anywhere from a few minutes to a long time.
And if you don’t want to wait
But what if you don’t care about the most recent 30 hours of data and you just want your backup now? Well, lucky for us, Heroku has implemented a `--fast` flag for just this impulse.
```
$ heroku addons:create heroku-postgresql:standard-0 --fast --fork DATABASE_URL
Creating heroku-postgresql:standard-0 on ⬢ your-app-name-here... $50/month
▸ Release command executing: config vars set by this add-on will not be available
▸ until the command succeeds. Use `heroku releases:output` to view the log.
Fork will contain data from November 10, 2019 6:22 AM UTC (about 21 hours ago).
To create a fork with up-to-date data, exclude the `--fast` flag.
Database will become available after it completes forking.
Use `heroku pg:wait` to track status.
postgresql-something-4321 is being created in the background. The app will restart when complete...
Use heroku addons:info postgresql-something-4321 to check creation progress
Use heroku addons:docs heroku-postgresql to view documentation
```
Invoking the `addons:info` command from the output above will tell you the creation status of the fork, as well as its database URL alias, like so:
```
$ heroku addons:info postgresql-something-4321
=== postgresql-something-4321
Attachments: your-app-name-here::HEROKU_POSTGRESQL_NEONGREEN
Installed at: Mon Nov 11 2019 13:07:56 GMT-0500 (Eastern Standard Time)
Owning app: your-app-name-here
Plan: heroku-postgresql:standard-0
Price: $50/month
State: creating
```
I ran the command a few times, and it took about 5 minutes for the state to switch from “creating” to “created”.
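If polling `addons:info` by hand feels tedious, the `heroku pg:wait` command that the output above suggests will simply block until the new database is ready, which is handy if you’re scripting any of this:

```
# Blocks until the fork finishes provisioning
$ heroku pg:wait HEROKU_POSTGRESQL_NEONGREEN
```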
Downloading your forked data
Once you have the fork and its database URL, you can use all the same tricks outlined above, like `pg:backups:capture`:
```
$ heroku pg:backups:capture HEROKU_POSTGRESQL_NEONGREEN
▸ Continuous protection is already enabled for this database. Logical backups of
▸ large databases are likely to fail.
▸ See
▸ https://devcenter.heroku.com/articles/heroku-postgres-data-safety-and-continuous-protection#physical-backups-on-heroku-postgres.
Starting backup of postgresql-something-4321 done
Use Ctrl-C at any time to stop monitoring progress; the backup will continue running.
Use heroku pg:backups:info to check progress.
Stop a running backup with heroku pg:backups:cancel.
Backing up NEONGREEN to b279... done
```
Which, in spite of the same “standard” tier warning, should be fine, since nothing is connected to or writing to the database. The backup can then be downloaded from the link generated by `heroku pg:backups:url b279`.
And you’ll find that running `pg:pull` also works, even though it’s likely to take longer now if you have a lot of data:
```
$ heroku pg:pull HEROKU_POSTGRESQL_NEONGREEN yay_production_data
```
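From there, you can point your local app at `yay_production_data` by editing `config/database.yml` as shown earlier, or skip the config change entirely: Rails also honors a `DATABASE_URL` environment variable, so something like the following should work, assuming a typical local Postgres setup:

```
$ DATABASE_URL=postgres://localhost/yay_production_data bin/rails console
```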
Don’t forget to delete all your data!
In your excitement over fetching a backup of your production database, don’t forget to delete the newly provisioned addon when you’re done! It’d be a shame not to remember until you get next month’s bill and discover it to be $50 to $8,500 pricier than usual.
To make sure you delete the right database (a worthwhile precaution), run `heroku addons` to verify the name of the database that isn’t assigned the `DATABASE_URL` alias, and then delete it with the `addons:remove` command (which, since it’s a destructive operation, will require confirmation input):
```
$ heroku addons:remove postgresql-something-4321
```
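For what it’s worth, once you’ve done this dance a couple of times, the whole fork-fetch-destroy loop is easy to string together. Here’s a rough sketch of how it could be scripted, assuming a `standard-0` plan and using `BACKUP_FORK` as a made-up attachment name (`addons:destroy` performs the same removal as `addons:remove` above); treat it as a starting point, not gospel:

```
# Fork production under a throwaway attachment name
# (add --fast if slightly stale data is acceptable)
$ heroku addons:create heroku-postgresql:standard-0 --fork DATABASE_URL --as BACKUP_FORK

# Block until the fork finishes provisioning
$ heroku pg:wait BACKUP_FORK

# Capture a logical backup of the fork and download it to latest.dump
$ heroku pg:backups:capture BACKUP_FORK
$ heroku pg:backups:download

# Tear the fork down so it doesn't linger on next month's bill
$ heroku addons:destroy BACKUP_FORK --confirm your-app-name-here
```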
We did it! We backed up Heroku's Postgres!
And that’s it! I hope this guide showed you a thing or two about leveraging Heroku’s robust backup tools to more easily fetch and work with your data. But be warned: once you grow accustomed to this level of convenience, it can be tough to go back to other services that lack the same features.
By the way, a big thanks to Heroku engineer and friend of Test Double Mike Busch for patiently answering all my questions about how to work with backups for standard-tier database plans.