
PostgreSQL backup to S3: Part one

Where is your backup? In a datacenter? What happens if that datacenter burns down? How fast can you have a new database up and running if your current one dies? Would it take hours? Days? Backups of your database are absolutely essential, and with Amazon Web Services it’s easy to do the right thing.

At Opbeat we absolutely love PostgreSQL. Its versatility and robustness are second to none, and frankly, I wouldn’t trust my data with many others. Thus, this article is aimed at PostgreSQL users. In this post I will show you how to set up a good backup scheme. In the next post I will go deeper into the considerations you should make to ensure your backup strategy meets your business goals.

Jan 23rd 2013: Updated installation instructions and changed links to the new wal-e organization on GitHub.

Feb 24th 2013: Updated ‘archive_command’ to use the full wal-e path.

Jun 11th 2013: Updated wal-e installation instructions to use PyPI.

Background

Part of ensuring smooth sailing with any database is making sure you have a consistent backup. We do both streaming replication (available from PostgreSQL 9.1) and WAL archiving to AWS S3. Streaming replication ensures that we have a “hot standby” slave at all times. With a “hot standby” slave, any change that happens on our master is applied almost instantaneously on our slaves - this is both good and bad. The good thing is that we can switch to a different server in the event that our master PostgreSQL server goes down and keep trucking with very little downtime and virtually no data loss.

On the other hand, if someone accidentally performs a “DROP TABLE users;”, it will effectively remove the users table on all our servers and the data will be lost. Additionally, if you have a complete datacenter outage, you should be able to pull out your backup and set up shop in a new datacenter. If your only backup strategy is streaming replication inside a single datacenter, you’ll be out of luck in the case of a catastrophic failure. Sending Write Ahead Log (WAL) archives to S3 solves these issues.

The Write Ahead Log (WAL) is essentially a chronological log of changes to your database. By shipping archives of WALs to S3 you can ensure that if your server goes down, you’ll be able to reconstruct the database by replaying the WAL archives. In order to replay the log of changes, you need a base backup to begin from. I advise you to take a daily base backup to S3 and ship WALs to S3 in the interim.

The WAL-E project from Heroku makes this pretty straightforward.
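For context, a restore works the other way around: you fetch a base backup and let PostgreSQL replay the archived WALs on top of it. Here is a rough sketch with WAL-E - the paths, cluster version and /etc/wal-e.d/env setup are the examples used later in this post, and this is not a full restore walkthrough:

# Fetch the most recent base backup into an empty, stopped data directory:
$ envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-fetch /var/lib/postgresql/9.1/main LATEST
# Then point recovery.conf at WAL-E so the archived WALs are replayed on startup:
#   restore_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"'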

Access to S3

The first thing you need to do is create a bucket to upload your backups to; we call ours “opbeat-pg-backups”. You can do this from the console. Make sure “versioning” is enabled for your bucket. Also, create a dedicated AWS user in IAM for sending backups to S3. If your database server is compromised, the attacker might delete the data in your database on that machine. They would also likely gain access to the AWS credentials you use to send backups to S3 - if those credentials also allow deleting your backups, you’re in big trouble. With a policy like the one below, you can prevent the user from deleting files, but they can still overwrite an existing file, rendering your backup useless. With versioning enabled, an attacker can overwrite a file, but you will always be able to get the original file back. Only the owner of the S3 bucket can permanently delete an object in a versioned bucket.
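If you prefer the command line to the console for the bucket step, here is a sketch using the AWS CLI (assuming you have the aws tool installed and configured; the bucket name is just our example):

# Create the bucket (add --create-bucket-configuration LocationConstraint=<region>
# if you are outside us-east-1):
$ aws s3api create-bucket --bucket opbeat-pg-backups
# Enable versioning so that overwritten objects can always be recovered:
$ aws s3api put-bucket-versioning --bucket opbeat-pg-backups \
    --versioning-configuration Status=Enabled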

Regardless of versioning, I recommend using a policy that disallows deletion. You can adapt the policy below to your own needs:

{
  "Statement": [
    {
      "Sid": "Stmt1355040132425",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::opbeat-pg-backup",
        "arn:aws:s3:::opbeat-pg-backup/*"
      ]
    }
  ]
}

(You may wonder why the policy includes the ListBucket and GetObject permissions. We bootstrap new db-slaves automatically from S3, so we need to be able to read from S3 as well.)
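If you would rather be explicit than rely on the delete permissions simply being absent, you could also add a statement like the following to the policy. This is an illustrative addition, not part of the policy above; adjust the bucket name to your own:

{
  "Sid": "DenyDeletes",
  "Action": [
    "s3:DeleteObject",
    "s3:DeleteObjectVersion"
  ],
  "Effect": "Deny",
  "Resource": [
    "arn:aws:s3:::opbeat-pg-backup/*"
  ]
}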

Pushing WAL archives to S3

When you have the access key ID and secret access key, install “envdir”. On Ubuntu/Debian you can install the daemontools package:

$ sudo apt-get install daemontools

Install WAL-E:

$ sudo pip install wal-e

Set up the access keys in the following manner (from the WAL-E readme):

# Assumption: the group is trusted to read secret information
$ umask u=rwx,g=rx,o=
$ mkdir -p /etc/wal-e.d/env
$ echo "secret-key-content" > /etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY
$ echo "access-key" > /etc/wal-e.d/env/AWS_ACCESS_KEY_ID
$ echo 's3://some-bucket/directory/or/whatever' > \
  /etc/wal-e.d/env/WALE_S3_PREFIX
$ chown -R root:postgres /etc/wal-e.d
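Before wiring this into PostgreSQL, it can be worth sanity-checking the credentials and bucket prefix. One way (run as a user that can read /etc/wal-e.d/env, e.g. postgres or root) is to list base backups - on a fresh bucket this should return an empty listing rather than an authentication error:

$ envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-list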

Make sure you use the access key and secret key from your newly created backup user. Now, set up your postgresql.conf to include the following:

wal_level = archive       # hot_standby in 9.0+ is also acceptable
archive_mode = on         # enable archiving of completed WAL segments
archive_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-push %p'
archive_timeout = 60      # force a segment switch at least every 60 seconds
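Note that wal_level and archive_mode only take effect after a full restart of the server (a reload is not enough). On Ubuntu/Debian that might look like this; adjust to your init system and cluster name:

$ sudo service postgresql restart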

Base backups

Finally, you need to make a base backup, and you should do this periodically, e.g. once a day:

$ sudo su postgres
$ crontab -e

Paste the following:

0 2 * * * /usr/bin/envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /var/lib/postgresql/9.1/main

This will perform a base backup every day at 2am. You also need to delete old backups from S3 once in a while; WAL-E makes this quite easy - take a look at the documentation.
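As a sketch of what that can look like (note that pruning requires credentials that are actually allowed to delete from the bucket, which the restricted backup user above is not), WAL-E’s delete operator can keep only the most recent base backups and the WALs they need:

# Without --confirm this is a dry run; with it, everything older than the
# 7 most recent base backups is deleted.
$ envdir /etc/wal-e.d/env /usr/local/bin/wal-e delete --confirm retain 7

Stay tuned for part 2!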

About the author
Ron Cohen is co-founder and CTO at Opbeat. A founding member of Django Copenhagen, Ron enjoys sharing his experiences with code and ops around the world.
You can follow him on Twitter or GitHub.