Personal backups: The geek way
Encrypting and uploading to Backblaze using duplicity

Living in the digital era is awesome. Everyone has smartphones, iPads, GoPros and the same old computer. We all have email, all sorts of documents and especially photos and videos (can you keep up with an Instagram feed??).

Actually based on a random live counter, the whole internet is 10,961,628 Petabytes big as of the second I started this paragraph. That means the average person carries 1.5 Terabytes in his pocket.

Great! Now everyone can just buy whatever 2 TB drive is available in the store and we can all save our data in our pocket drive. Problem is: sometimes stuff gets broken. Who doesn’t have that friend that lost his entire life because his pocket drive went nuts for good?

Bad luck Brian agrees

Moving faster: almost everyone I know uses services like Dropbox, iCloud Storage, Google Drive or OneDrive. And that’s great! Especially because the latter two actually offer a decent amount of storage for free.

While this is a huge step to avoid losing your data, there are a few caveats:

Because of this I decided to start uploading all my data, encrypted, to a cloud storage provider. I’m not ditching my pocket 2TB drive or Google Drive, but now I also have an encrypted replica of both my personal and my sensitive data.

A few providers I know: Amazon, Google, Microsoft, OVH and Backblaze. I chose Backblaze to start with, mainly due to their good prices. You can even have them ship a drive to your house.

The technical bits

When I was looking for a tool to upload my data to the cloud, I had 3 main requirements:

That’s how I found Duplicity, and I even got a bonus feature: it’s incredibly easy to use! You can get it on your Mac using Homebrew:

$ brew install duplicity

or Linux using your package manager:

$ sudo apt-get install duplicity # Debian based
$ sudo yum install duplicity # CentOS based

Sorry Windows folks, this is the geek way!

For reference, the version I have installed now is 0.7.12.

I decided to create two buckets, one for documents and another for photos and video. That’s because documents are always changing and media is not, so I can set different Lifecycle Rules. You’ll probably want to keep all versions of documents, but that is not necessarily true with media (unless your job is to produce and edit media).
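If you prefer the command line to the web interface, the two buckets can be created with the b2 CLI. This is only a sketch: create_backup_buckets and the "<prefix>-documents" / "<prefix>-media" names are made up for illustration, and it assumes the same generation of the b2 tool used later in this post (the one with commands like list-file-names), already authorized with b2 authorize-account.

```shell
# Sketch: create one private bucket per kind of data.
# The naming scheme is hypothetical; B2 bucket names must be globally unique,
# so pick a prefix nobody else is likely to use.
create_backup_buckets() {
    prefix=$1
    for kind in documents media; do
        # allPrivate: files can only be downloaded with your credentials
        b2 create-bucket "${prefix}-${kind}" allPrivate || return 1
    done
}
```

Calling create_backup_buckets myname would ask B2 for myname-documents and myname-media. The Lifecycle Rules themselves can then be set per bucket in the Backblaze web interface.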

Generating a PGP key with Keybase

To manage PGP keys I use GPGTools on Mac, but GnuPG will work on both Mac and Linux. If you’re not familiar with PGP, follow this tutorial from Red Hat to create your key.

There is also a recent tool which is still invite only but it’s definitely bringing a shiny face to managing keys and encryption: Keybase.

After installing the Keybase app, you can generate your new key with these simple commands:

$ keybase pgp gen
Enter your real name, which will be publicly visible in your new key: My Name
Enter a public email address for your key: my@email.com
Enter another email address (or <enter> when done):
Push an encrypted copy of your new secret key to the Keybase.io server? [Y/n] n
▶ INFO PGP User ID: My Name <my@email.com> [primary]
▶ INFO Generating primary key (4096 bits)
▶ INFO Generating encryption subkey (4096 bits)
▶ INFO Generated new PGP key:
▶ INFO   user: My Name <my@email.com>
▶ INFO   4096-bit RSA key, ID 6A3D610F79008975, created 2017-07-03
▶ INFO Exported new key to the local GPG keychain
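The key ID in that output (6A3D610F79008975 here) is the value duplicity asks for later. If you created your key some other way, or simply lost track of the ID, GnuPG can list it for you. A minimal sketch, assuming gpg is on your PATH; list_secret_key_ids is a made-up helper name, and it relies on gpg’s machine-readable --with-colons listing, where the fifth field of every "sec" line is the long key ID:

```shell
# Sketch: print the long key ID of every secret key in the local keychain.
list_secret_key_ids() {
    # In --with-colons output, field 1 is the record type ("sec" = secret key)
    # and field 5 is the 16-character key ID.
    gpg --list-secret-keys --with-colons | awk -F: '$1 == "sec" { print $5 }'
}
```

Run list_secret_key_ids after the Keybase export and the ID from the output above should be one of the lines printed.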

Backup

Now to the good part, let’s upload a folder to Backblaze:

$ duplicity --encrypt-sign-key=<your-key-id> --encrypt-key=<your-key-id> <folder-to-backup> b2://<your-account-id>@<bucket-name>/<destination-folder> --log-file=duplicity_$(date +"%Y%m%d%H%M%S").log

In the example above, I encrypt and sign the data with my key and send my folder-to-backup to my bucket-name inside the destination-folder. This also produces a log file named something like duplicity_20170622013652.log.

You’ll be prompted for your Backblaze Application Key and, if your PGP key is protected by a passphrase, for that passphrase as well, both for encryption and for signing.
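Those prompts get in the way once you want to script the backup. duplicity reads the PGP passphrase from the PASSPHRASE environment variable, so a small wrapper can skip that prompt. Treat this as a sketch under a few assumptions: backup_folder is a made-up name, the same key is used to encrypt and sign (as in the command above), and depending on your GnuPG version gpg-agent may still decide to ask you itself.

```shell
# Sketch: run the backup without the interactive PGP passphrase prompt.
backup_folder() {
    src=$1       # folder to back up
    key_id=$2    # PGP key ID used to encrypt and sign
    dest=$3      # e.g. b2://<your-account-id>@<bucket-name>/<destination-folder>

    # Refuse to run if the passphrase was not provided by the caller.
    if [ -z "${PASSPHRASE:-}" ]; then
        echo "PASSPHRASE is not set" >&2
        return 1
    fi
    # duplicity picks the passphrase up from the environment.
    export PASSPHRASE

    duplicity --encrypt-sign-key="$key_id" --encrypt-key="$key_id" "$src" "$dest"
}
```

If you sign with a different key than the one you encrypt with, duplicity looks at SIGN_PASSPHRASE for the signing key. Keep in mind that anything in the environment is easier to leak than something you type, so only do this on a machine you trust.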

With a bit of magic from the b2 CLI tool and jq, let’s list the files in our bucket:

$ b2 list-file-names <bucket-name> | jq -r '.files[] |  [.fileName ,.size] | @tsv'
destination-folder/duplicity-full-signatures.20170622T015944Z.sigtar.gpg	3156104
destination-folder/duplicity-full.20170622T015944Z.manifest.gpg	1570
destination-folder/duplicity-full.20170622T015944Z.vol1.difftar.gpg	209787961
destination-folder/duplicity-full.20170622T015944Z.vol2.difftar.gpg	209770656
destination-folder/duplicity-full.20170622T015944Z.vol3.difftar.gpg	117115477

Here I want you to notice the manifest file, and that the sizes of the vol* files are something like 200MB each, except the last one.

Try to download and open the manifest file (you’ll have to decrypt it):

Hostname your-computer-hostname
Localdir 20170622
Volume 1:
    StartingPath   .  
    EndingPath     file1.pdf 11
    Hash SHA1 6abebce21621499d4cb63ab05fd87ee845eb2a97
Volume 2:
    StartingPath   file1.pdf 12
    EndingPath     file4.docx 373
    Hash SHA1 86c01b4cb69e3ab04750b4066165222790362e38
Volume 3:
    StartingPath   file4.docx 374
    EndingPath     file7.xlsx  
    Hash SHA1 10c534901e759a1de3ab021dc09d4cb692ea033e
Filelist 7
    new      file1.pdf
    new      file2.pdf
    new      file3.pdf
    new      file4.docx
    new      file5.docx
    new      file6.docx
    new      file7.xlsx

As you can see, the manifest maps every file to the volume (or volumes) that hold it. A file can straddle a volume boundary, as file1.pdf does between volumes 1 and 2 above, so restoring a file smaller than 200MB means downloading at most two volumes, about 400MB, instead of the whole thing!
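The same arithmetic works for bigger files. Here is a back-of-the-envelope sketch: max_volumes and max_download_mb are made-up helpers, sizes are whole megabytes as stored inside the volumes (so after compression and encryption), tar overhead is ignored, and the 200 default is simply the volume size seen in the bucket listing. The worst case is a file that starts in the last megabyte of a volume.

```shell
# Sketch: worst-case number of volumes a file of size_mb can touch.
# A file starting at offset o (0 <= o < V) touches ceil((o + S) / V) volumes;
# the worst case o = V - 1 works out to (S + V - 2) / V + 1 in integer math.
max_volumes() {
    mv_size_mb=$1
    mv_volsize_mb=${2:-200}
    echo $(( (mv_size_mb + mv_volsize_mb - 2) / mv_volsize_mb + 1 ))
}

# Sketch: worst-case download, in MB, needed to restore that file.
max_download_mb() {
    md_volsize_mb=${2:-200}
    echo $(( $(max_volumes "$1" "$md_volsize_mb") * md_volsize_mb ))
}
```

For example, max_download_mb 150 prints 400, and max_download_mb 500 prints 800. If that overhead bothers you, duplicity has a --volsize option (in MB) to make volumes smaller, at the cost of more files in the bucket.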

Of course you won’t need to download the manifest file every time you want to do an ls of your files. Duplicity has a command for that:

$ duplicity list-current-files b2://<your-account-id>@<bucket-name>/<destination-folder>
Password for '<your-account-id>@B2':
Local and Remote metadata are synchronised, no sync needed.
Last full backup date: Thu Jun  22 02:59:44 2017
Thu Jun  22 02:02:44 2017 .
Thu Jun  22 01:57:40 2017 file1.pdf
Thu Jun  22 01:57:41 2017 file2.pdf
Thu Jun  22 01:57:43 2017 file3.pdf
Thu Jun  22 01:57:47 2017 file4.docx
Thu Jun  22 01:57:49 2017 file5.docx
Thu Jun  22 01:57:50 2017 file6.docx
Thu Jun  22 01:57:51 2017 file7.xlsx

Restore

We have our files backed up in Backblaze, now let’s try to restore them:

$ duplicity restore b2://<your-account-id>@<bucket-name>/<destination-folder> <restore_folder>

Or if you want to restore a single file:

$ duplicity restore --file-to-restore file3.pdf b2://<your-account-id>@<bucket-name>/<destination-folder> file3.pdf

After this you’ll have your file3.pdf in your current directory, just like that!
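Two more restore-related commands are worth knowing once you have run the backup more than once: restoring the files as they were at an earlier point in time, and checking that the backup still matches what you have locally. A sketch using duplicity’s --time option and verify command; restore_as_of and verify_backup are made-up wrapper names, and the URL is the same b2:// URL used above.

```shell
# Sketch: restore the backup as it was at an earlier point in time.
restore_as_of() {
    when=$1      # e.g. 3D (three days ago) or a date like 2017-06-22
    url=$2       # b2://<your-account-id>@<bucket-name>/<destination-folder>
    target=$3    # folder to restore into
    duplicity restore --time "$when" "$url" "$target"
}

# Sketch: compare the backup against a local folder, file contents included.
verify_backup() {
    url=$1
    local_dir=$2
    duplicity verify --compare-data "$url" "$local_dir"
}
```

For instance, restore_as_of 3D followed by the bucket URL and a <restore_folder> brings back the state from three days ago, provided a backup from that time exists.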

Some ideas

Feel free to check out the duplicity man page for more information. It has a lot of features, and once you get your backups up and running you can tweak the example commands I gave as much as you want.

If you also want to back up data you have on a server, one idea is to create a PGP key that belongs to the server and use it to sign the backup, while using your personal PGP public key to encrypt it. Use cron for periodic backups!
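Here is roughly what that could look like. It is a sketch, not a tested setup: server_backup, the script path and the schedule are all hypothetical, and the key IDs are whatever your own keys are called. It uses duplicity’s separate --sign-key and --encrypt-key options, so the server only needs its own secret key plus your public key.

```shell
# Sketch: sign with the server's key, encrypt to your personal public key.
server_backup() {
    server_key_id=$1      # key that lives on the server, used to sign
    personal_key_id=$2    # your personal key; only the public half is needed here
    src=$3                # folder to back up
    dest=$4               # b2:// URL of the bucket and destination folder
    duplicity --sign-key="$server_key_id" --encrypt-key="$personal_key_id" "$src" "$dest"
}

# Hypothetical crontab entry, assuming the function above is wrapped in a
# script at /usr/local/bin/server-backup.sh: run every night at 03:00.
# 0 3 * * * /usr/local/bin/server-backup.sh
```

Cron cannot answer prompts, so the signing passphrase has to come from the SIGN_PASSPHRASE environment variable (or the server key has to have none), and the Backblaze Application Key has to be supplied some other way too; the duplicity man page describes a b2://account_id:application_key@bucket_name/folder URL form for that.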

Note: Backblaze doesn’t really support users and ACLs, so I’d create a different account for automated backups.

Now get backing up those petabytes!

*****
Written by Ricardo Marques on 22 June 2017
© 2024 Ricardo Marques