Living in the digital era is awesome. Everyone has smartphones, iPads, GoPros and the same old computer. We all have email, all sorts of documents and especially photos and videos (can you keep up with an Instagram feed??).
Actually, based on a random live counter, the whole internet is 10,961,628 Petabytes big as of the second I started this paragraph. That means the average person carries about 1.5 Terabytes in their pocket.
Great! Now everyone can just buy whatever 2 TB drive is available in the store and we can all save our data on our pocket drive. Problem is: sometimes stuff gets broken. Who doesn’t have that friend who lost his entire life because his pocket drive went nuts for good?
Moving faster: almost everyone I know uses services like Dropbox, iCloud Storage, Google Drive or OneDrive. And that’s great! Especially because the latter two actually offer a decent amount of storage for free.
While this is a huge step to avoid losing your data, there are a few caveats:
- If you have a lot of data (let’s say 100 GB and up) it starts getting expensive
- Those providers have raw access to your data
- People set short passwords for convenience and get hacked
Because of this I decided to start uploading all my data encrypted to a cloud storage provider. While I’m not ditching my pocket 2 TB drive or Google Drive, I now have an encrypted replica of both personal and sensitive data.
The technical bits
When I was looking for a tool to upload my data to the cloud, I had three main requirements:
- Support for multiple cloud storage providers
- Incremental backups
- Encrypted backups via PGP (nice to have)
Duplicity ticks all three. On a Mac you can install it with Homebrew:
$ brew install duplicity
or on Linux using your package manager:
$ apt-get install duplicity # Debian based
$ yum install duplicity # CentOS based
Sorry Windows folks, this is the geek way!
For reference, the version I have installed now is 0.7.12.
I decided to create two buckets, one for documents and another for photos and video. That’s because documents are always changing and media is not, so I can set different Lifecycle Rules. You’ll probably want to keep all versions of documents, but that is not necessarily true with media (unless your job is to produce and edit media).
Generating a PGP key with Keybase
Keybase is a recent tool, still invite-only, that is definitely bringing a shiny face to managing keys and encryption.
After installing the Keybase app, you can generate your new key with these simple commands:
$ keybase pgp gen
Enter your real name, which will be publicly visible in your new key: My Name
Enter a public email address for your key: email@example.com
Enter another email address (or <enter> when done):
Push an encrypted copy of your new secret key to the Keybase.io server? [Y/n] n
▶ INFO PGP User ID: My Name <firstname.lastname@example.org> [primary]
▶ INFO Generating primary key (4096 bits)
▶ INFO Generating encryption subkey (4096 bits)
▶ INFO Generated new PGP key:
▶ INFO   user: My Name <email@example.com>
▶ INFO   4096-bit RSA key, ID 6A3D610F79008975, created 2017-07-03
▶ INFO Exported new key to the local GPG keychain
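The key ID in that output is what goes in place of <your-key-id> in the duplicity commands. Here is a minimal sketch for pulling it out of the output, assuming you saved it to a file or piped it in, and assuming the "key, ID <hex>, created" line format shown above. The function name extract_key_id is my own and not part of Keybase.

```shell
#!/bin/sh
# Hypothetical helper: read `keybase pgp gen` output on stdin and print the
# first key ID found. Assumes the "key, ID <hex>, created ..." line format.
extract_key_id() {
  sed -n 's/.*key, ID \([0-9A-F]*\),.*/\1/p' | head -n 1
}

# Demo using the line from the output above.
sample='INFO 4096-bit RSA key, ID 6A3D610F79008975, created 2017-07-03'
printf '%s\n' "$sample" | extract_key_id
```

Since the key was exported to the local GPG keychain, running gpg --list-secret-keys --keyid-format long should show the same ID.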
Now to the good part, let’s upload a folder to Backblaze:
$ duplicity --encrypt-sign-key=<your-key-id> --encrypt-key=<your-key-id> <folder-to-backup> b2://<your-account-id>@<bucket-name>/<destination-folder> --log-file=duplicity_$(date +"%Y%m%d%H%M%S").log
In the example above, I encrypt and sign the data with my key and send my folder-to-backup to my bucket-name inside the destination-folder. This also produces a log file named after the timestamp in the command, something like duplicity_<YYYYMMDDHHMMSS>.log.
You’ll be prompted for your Backblaze Application Key and, if your key requires a passphrase, for the passphrase of both the encryption and the signing key.
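Those prompts get in the way once you want to script the backup. As far as I can tell from the duplicity man page, the B2 URL also accepts the application key after the account ID, and the passphrases can come from the PASSPHRASE and SIGN_PASSPHRASE environment variables. A small sketch, where b2_url and all the values are placeholders of mine:

```shell
#!/bin/sh
# Hypothetical helper: build a B2 target URL with the application key embedded
# (b2://account_id:application_key@bucket/folder) so duplicity does not prompt.
b2_url() {
  # $1 account id, $2 application key, $3 bucket name, $4 destination folder
  printf 'b2://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values only; substitute your own.
b2_url "ACCOUNT_ID" "APP_KEY" "my-documents" "documents"

# duplicity reads the key passphrases from these environment variables
# (set them in the calling shell, never in a file you commit):
#   PASSPHRASE       passphrase for the encryption key
#   SIGN_PASSPHRASE  passphrase for the signing key
```

Keep in mind that a key embedded in the URL can show up in your shell history and process list, so this trades some safety for convenience.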
Once the upload finishes, you can list what landed in the bucket with the b2 command-line tool and jq:
$ b2 list-file-names <bucket-name> | jq -r '.files[] | [.fileName, .size] | @tsv'
destination-folder/duplicity-full-signatures.20170622T015944Z.sigtar.gpg	3156104
destination-folder/duplicity-full.20170622T015944Z.manifest.gpg	1570
destination-folder/duplicity-full.20170622T015944Z.vol1.difftar.gpg	209787961
destination-folder/duplicity-full.20170622T015944Z.vol2.difftar.gpg	209770656
destination-folder/duplicity-full.20170622T015944Z.vol3.difftar.gpg	117115477
Here I want you to notice the manifest file, and that the vol* files are something like 200MB each, except the last one.
Try to download and open the manifest file (you’ll have to decrypt it):
Hostname your-computer-hostname
Localdir 20170622
Volume 1:
    StartingPath .
    EndingPath file1.pdf 11
    Hash SHA1 6abebce21621499d4cb63ab05fd87ee845eb2a97
Volume 2:
    StartingPath file1.pdf 12
    EndingPath file4.docx 373
    Hash SHA1 86c01b4cb69e3ab04750b4066165222790362e38
Volume 3:
    StartingPath file4.docx 374
    EndingPath file7.xlsx
    Hash SHA1 10c534901e759a1de3ab021dc09d4cb692ea033e
Filelist 7
    new file1.pdf
    new file2.pdf
    new file3.pdf
    new file4.docx
    new file5.docx
    new file6.docx
    new file7.xlsx
As you can see, the manifest maps all your files to volumes. A file smaller than 200MB can straddle at most two adjacent volumes, so restoring it means downloading a maximum of about 400MB, instead of the whole thing!
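To make that mapping concrete, here is a rough sketch that reads a decrypted manifest and prints the candidate volumes for one file. It is only an illustration: volumes_for_file is my own helper, it assumes the layout shown above, file names without spaces, and plain byte-wise ordering, which may not match how duplicity orders nested paths.

```shell
#!/bin/sh
# Sketch: print the volume numbers whose StartingPath..EndingPath range covers
# a given file. Assumes a decrypted manifest laid out as shown above, file
# names without spaces, and byte-wise ordering (LC_ALL=C).
volumes_for_file() {
  # $1 path to the decrypted manifest, $2 file name to look up
  LC_ALL=C awk -v wanted="$2" '
    $1 == "Volume"       { vol = $2; sub(/:$/, "", vol) }
    $1 == "StartingPath" { start = $2 }
    $1 == "EndingPath"   { if ((start "") <= (wanted "") && (wanted "") <= ($2 "")) print vol }
  ' "$1"
}

# Demo with the volume entries from the manifest above.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
Volume 1:
    StartingPath .
    EndingPath file1.pdf 11
Volume 2:
    StartingPath file1.pdf 12
    EndingPath file4.docx 373
Volume 3:
    StartingPath file4.docx 374
    EndingPath file7.xlsx
EOF
volumes_for_file "$manifest" file3.pdf    # prints 2
volumes_for_file "$manifest" file4.docx   # prints 2 and 3, one per line
rm -f "$manifest"
```

So file3.pdf lives entirely in vol2, while file4.docx is split across vol2 and vol3, which is the two-volume worst case.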
Of course you won’t need to download the manifest file every time you want to do an ls of your files; duplicity has that feature:
$ duplicity list-current-files b2://<your-account-id>@<bucket-name>/<destination-folder>
Password for '<your-account-id>@B2':
Local and Remote metadata are synchronised, no sync needed.
Last full backup date: Thu Jun 22 02:59:44 2017
Thu Jun 22 02:02:44 2017 .
Thu Jun 22 01:57:40 2017 file1.pdf
Thu Jun 22 01:57:41 2017 file2.pdf
Thu Jun 22 01:57:43 2017 file3.pdf
Thu Jun 22 01:57:47 2017 file4.docx
Thu Jun 22 01:57:49 2017 file5.docx
Thu Jun 22 01:57:50 2017 file6.docx
Thu Jun 22 01:57:51 2017 file7.xlsx
We have our files backed up in Backblaze, now let’s try to restore them:
$ duplicity restore b2://<your-account-id>@<bucket-name>/<destination-folder> <restore_folder>
Or if you want to restore a single file:
$ duplicity restore --file-to-restore file3.pdf b2://<your-account-id>@<bucket-name>/<destination-folder> file3.pdf
After this you’ll have your file3.pdf in the folder you were in, just like that!
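A backup is only as good as its restore, so it is worth comparing a restored folder with the original at least once. A minimal sketch using plain diff, where restore_matches is a name I made up and the demo folders are temporary stand-ins for your real ones:

```shell
#!/bin/sh
# Sketch: compare an original folder with its restored copy.
# Prints "match" and returns 0 when every file is identical; otherwise diff
# reports the differences and the function returns 1.
restore_matches() {
  # $1 original folder, $2 restored folder
  if diff -r "$1" "$2"; then
    echo "match"
  else
    return 1
  fi
}

# Demo with two throwaway folders standing in for the original and the restore.
orig=$(mktemp -d)
restored=$(mktemp -d)
echo "hello" > "$orig/file3.pdf"
cp "$orig/file3.pdf" "$restored/file3.pdf"
restore_matches "$orig" "$restored"
rm -rf "$orig" "$restored"
```

Duplicity also ships a verify command of its own; the man page has the details.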
Feel free to check out the duplicity man page for more information. It has a lot of features and once you get your backups up and running, you can tweak the example commands I gave as much as you want.
If you also want to back up data you have on a server, one idea is to create a PGP key that belongs to the server and use it to sign the backup, while using your personal PGP public key to encrypt it. Use cron for periodic backups!
Note: Backblaze doesn’t really support users and ACLs, so I’d create a different account for automated backups.
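The server idea could look roughly like the sketch below. It uses duplicity’s --sign-key and --encrypt-key options with two different keys; the function name, key IDs, paths, bucket URL and script location are all placeholders of mine. By default it only prints the command so you can review it first.

```shell
#!/bin/sh
# Sketch of a cron-driven server backup: the server's key signs, your
# personal public key encrypts. With DRY_RUN=1 (the default here) the
# command is printed instead of executed.
run_server_backup() {
  # $1 server key id, $2 personal key id, $3 folder to back up, $4 target URL
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s %s %s %s %s\n' duplicity "--sign-key=$1" "--encrypt-key=$2" "$3" "$4"
  else
    duplicity "--sign-key=$1" "--encrypt-key=$2" "$3" "$4"
  fi
}

run_server_backup "SERVER_KEY_ID" "PERSONAL_KEY_ID" "/var/www" "b2://ACCOUNT_ID@server-bucket/www"

# Example crontab entry, daily at 03:00, assuming the script is saved as
# /usr/local/bin/server-backup.sh and run with DRY_RUN=0:
#   0 3 * * * DRY_RUN=0 /usr/local/bin/server-backup.sh
```

Cron cannot answer prompts, so the application key and the signing passphrase have to reach duplicity some other way, for example through the URL and the SIGN_PASSPHRASE environment variable.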
Now get backing up those petabytes!