Your own public cloud: why not?

Alright, this is going to be quite a long article about a subject that I have grown particularly fond of lately: online privacy.

We're going to deal with self-hosting too, a lot 😇. And we'll get technical, don't worry.

In a nutshell, this article is my personal take on what I value when I say online privacy, and how I decided to try to mitigate the risks by using open-source tools to create my own public cloud ☁️.


Your mileage may vary on the definition of privacy, on the risks taken, and on the legitimacy of the tools I will describe and implement, but I think this article can give you a good head start if you're looking into this kind of stuff at the moment.

⎈ Summary

    
  1. What is the problem, to begin with?
  2. A personal cloud?
  3. Technical implications
    a. Self-hosted to the extreme ?
    b. Docker / Compose
    c. Risks and mitigations
  4. Solutions
    a. Drive
    b. Bookmarks
    c. Passwords
    d. Tasks / Boards
    e. Notes
    f. Calendar / Contacts
    g. Raw syncing
  5. Go
    a. Get docker and compose up and running
    b. Retrieve the repo
    c. Create your cloud infrastructure
    d. Add or remove services
    e. Tweak the configuration
    f. Build the custom images
  6. Launch
  7. Final words
    a. A note on E2E encryption
    b. Backups
    c. Donate and contribute


1. 💁🏻 What is the problem, to begin with?

Privacy is becoming a mainstream subject right now, and that is good. The immense amount of data that one generates while using internet services such as social networks, online services or software is increasingly a personal stress factor; for techies of course, but also more and more for non-technical people.

* * *

Because of data breaches, first. Not one day goes by without its new security breach (Equifax, Option Way, CircleCI, CapitalOne, Movie Pass, LinkedIn, Twitter ...) exposing thousands of people's personal data, including, in the worst cases, passwords in plain text.

Passwords in plain text? We're in 2019 for Christ's sake. It is just mind-boggling that platforms or SaaS providers with thousands or millions of accounts to manage do not have the slightest knowledge of basic security, encryption and good practice.

[Screenshot] This type of answer is beyond understanding

You can also browse the 453+ pages of https://plaintextoffenders.com to see whether a service you want to use has questionable practices.

* * *

But also, and maybe more importantly (and more concerning, to be honest), because of the marketing and targeting mayhem this data is subject to. What does Google do with my emails? Does it read my Drive documents? And these Facebook searches? Why do I keep seeing ads that are directly related to my previous internet searches? Why do I have to load more than 80 kB of trackers on each and every website I browse? Etc.

This is not a problem per se, but it could well become one when the companies / countries that control these targeting and marketing efforts decide or are forced to hand over our data to malicious third parties.

I don't want to be too alarmist, but I am concerned about the data I create and how it may or may not be used by the providers I trust it with.

That's why I went (and am still) on a quest to mitigate this risk.

* * *

I have quite a bit of history with what I would call "incumbent" service providers, since I've been mainly active on the Internet at the time these providers were gaining a lot of traction: I have Gmail emails and associated Google accounts. I use it for contacts and calendar, too. I have a Dropbox account. I use Trello for todos and whatnots. I use various sync services for my browsers (Firefox, Safari) ... the list goes on.

Now I'm not saying that these providers do a bad job regarding privacy, but given all the recent heat a few of them (all of them?) have received, I no longer trust any "big-enough" actor with my personal data.

Now that means that I could find and use smaller, local providers, but how can I be sure that they are well-funded and can provide a full-featured service in the long run (I'm happy to pay for a service of course, but even then)? And what if they get bought later on? How can I assess their security measures and processes? At least I'm sure Google has a dedicated team (even many teams) to secure and back up its servers and put them back online in a snap.

So a small provider is not a really satisfying answer.

Of course I could revoke all these accounts and do everything offline; but we're in the 21st century and, like a lot of people out there, I'm not ready to trade the practicality of online tools such as mail, calendar, contacts and file storage, readily available on any device anywhere, for an extreme privacy posture.

TL;DR: Richard Stallman is right on many points, but his position is generally too absolute and radical, and not compatible with a fair and reasonable use of today's tools (I think).

2. 🌥 A personal cloud?

So a personal cloud could be a good solution, in this regard. A small provider, that happens to be yourself. Not perfect, but better than other options.

To me, here are some pros and cons of this solution:

Pros :
  • The data is on your servers / you own your data for real
  • You control the applications you put on these servers
  • You can modify them if needed, to suit your needs
  • You can choose solutions that are open-source and have auditable security
  • You can find feature-rich solutions that are the almost exact equivalent of Google Drive, Trello, etc.
Cons :
  • You are the admin of your cloud apps. This means you are the technical admin of your cloud apps.
  • In case of a problem, it's up to you to fix it
  • It needs a bit of work to set up correctly
  • It needs a bit of work to maintain and to update

So yes, you need to get a bit under the hood to be able to create your own cloud; that's the catch. But the amount of work needed is not overwhelming, and we're going to use standard technologies here, with extensive documentation, communities, and easy-to-access help on all the different stacks.


You've got this.

What exactly does a personal cloud cover in terms of functionality ?

It depends. You might need a feed reader, a webmail, a calendar, and a file storage solution, or you might just need a todo list and a task manager, with a VPN, ... that's up to you to decide.

In this article, I will focus on some standard needs (that happen to be my needs 😁) : a drive, a calendar and contacts, a bookmarks manager, a password manager, a notes app, a kanban-board (à la Trello) and a sync/backup solution.

This should cover a wide range of needs for a personal user, and could very well work for small associations or companies too, that want to go the open-source / privacy-focused way (and that have a technical contact that can manage that of course).

3. 🛠 Technical implications

Ok, now let's dive into the how.

a. Self-hosted to the extreme ?

The first obvious technical issue we're facing is that this cloud will need to be hosted somewhere on the Internet.

Two options are available: either you push the paradigm to the extreme and use a real, physical server that you own and that sits behind your home broadband connection, or you rely on the "cloud", that is, a third party that will provide you with a virtual or physical server, or even an abstracted infrastructure you can deploy things on.

If you are really worried about the whereabouts of your data, your own server is the straightforward solution; personally, I think it is over-complicated and has various drawbacks :

  • Your home IP might change
  • Your home ADSL / fibre may have connectivity problems from time to time
  • You could have a power cut / brownout etc
  • You need to maintain hardware
  • Your cat could trip over the network cable of the server

(On a side note, you could also have your own server, but on someone else's Internet, like what Jeff Atwood did for Discourse. I think that is not ideal though, since apart from the Internet connection, you still have the drawbacks of having to manage hardware, i.e. replace disks, etc.)

So putting your infrastructure somewhere else, where it likely will be monitored round the clock, seems reasonable to me. You need to assess that your provider is somewhat respectful of what you do on your instances, but you can mitigate that quite easily too.

I'm already using OVH and I'm pretty happy with their pricing, so I would recommend them, but there are a lot of other cloud infrastructure providers out there: Amazon of course, Microsoft Azure, Oracle, Vultr, Rackspace, etc.

For the setup that I'm going to describe here, you should expect to pay about 20€ per month at OVH, which is very reasonable.

b. How is it going to work exactly ?

The plan is the following :

First, we'll create a relatively powerful instance on your chosen infrastructure provider. This will serve as a host for an automation / containerization tool (see below), that will do the heavy lifting and provide isolation between services.

You could as well spin up an instance (aka virtual server) for each service of your cloud that you need, but this is not ideal since most of them will sit doing nothing most of the time (you are the sole user of your cloud); moreover, you end up with a lot of instances to take care of and to pay for.

A powerful host, containerized, should be a better use of the resources.

Next, we are going to create a container for every service that we wish to use: one for our drive, one for our calendars, etc. We'll see that in detail later on.

Of course, we'll need to expose all these services to the Internet through the host, while limiting the attack surface. Our containerization tool will help us do that.

Finally, if you want to have a clean namespaced cloud, you will certainly need to create some DNS entries to match for all your container services / ports on the host.

b. Docker / Compose

Alright, now, with this plan, the obvious simple containerization tool that comes to (my) mind is Docker.


I'm saying obvious here because it's relatively stable, has a lot of documentation, has a great community and a lot of Stack Overflow questions and answers.

We're clearly not going to create a handful of cloud apps from scratch with our bare hands; there are fantastic automation tools out there that are just right for the job. Docker is my choice, but feel free to use the one you like or are comfortable with.

Docker itself would be sufficient to instantiate all we need, but it would be tedious, so we're going to rely heavily on docker-compose to do the job.

In fact, our entire configuration for our cloud apps will likely reside in a single docker-compose configuration file.
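To make this more concrete, here is a rough sketch of what such a file can look like, written to a scratch directory from the shell. It is not the file from my repository: the service names (reverse-proxy, boards, boards-db), the network names, the example domain and the image tags are all illustrative assumptions, and the real file has many more services.

```shell
# Write a hypothetical, trimmed-down compose file to a scratch directory.
# Every name below is an illustrative assumption, not the repository's actual file.
sketch_dir=$(mktemp -d)

cat > "$sketch_dir/docker-compose.yml" <<'EOF'
version: "3"
services:
  reverse-proxy:            # the only container published on the host
    image: nginx:stable
    ports:
      - "80:80"
      - "443:443"
    networks:
      - boards
    depends_on:
      - boards
  boards:                   # the kanban application
    image: wekanteam/wekan
    environment:
      - MONGO_URL=mongodb://boards-db:27017/wekan
      - ROOT_URL=https://boards.example.com
    networks:
      - boards
      - boards-internal
    depends_on:
      - boards-db
  boards-db:                # its database, reachable only from the app
    image: mongo:4
    volumes:
      - /mnt/databases/mongo:/data/db
    networks:
      - boards-internal
networks:
  boards:
  boards-internal:
EOF

echo "Wrote $sketch_dir/docker-compose.yml"
# To check the syntax (requires docker-compose):
#   docker-compose -f "$sketch_dir/docker-compose.yml" config
```

The idea to take away is the shape: one service block per app, a database that is not published on the host, and a single reverse proxy in front of everything.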

c. Risks and mitigations

So what's the catch?

Our infrastructure provider has access to our instance and storage

Yes, it does. I see two risks here :

  1. what if the provider gets p*wned ?
  2. what if the provider decides to use this data (sell it to third-parties, etc) ?

Well, point 1 is legit: if someone infiltrates their DC or network and gets your block storage, they will have access to your data. Point 2 is irrelevant; your data is a big blob of disk storage, and is not valuable as-is. You would really have to be a person of interest for any third party to try to buy your raw data and extract something from it. The only data your provider could sell is your account on their platform with your billing info / personal info, that's it.

For point 1 though, it's up to you to store your data securely, for instance by choosing apps that store your data encrypted. In the apps that we're going to install here, some do and some don't. But for the important data (passwords, notes for example), I chose ones that do.

What if someone steals my database?

That could happen if you happen to have a misconfigured service or a kind of exploit that has not been patched yet.

Same as point 1 above: if your data is encrypted, who cares?

What if my infrastructure provider fails?

Well, that could happen too. If it's temporary, well, you just had a downtime. If the physical server on which your data (apps or storage) was stored crashes, then, you lost it all.

It's not as likely as you would think, since the "cloud" is not just someone else's computer; it's a bit more complicated than that and involves clusters of CPU and storage and a few layers of software that generally run on cheap hardware that is expected to fail at some point. "Hardware failure is the norm" (it's Hadoop's motto); hence your data is not really just spinning on one disk somewhere, and is less prone to hardware failures.

But anyway, the providers generally have several ways to deal with that :

  • data is replicated on their side: in case of a crash, they will restore your instance and data from hot backups, or from cold backups they have at another location
  • they provide tools to make block storage snapshots on your own: either via a web interface or an API, or in an automated fashion
    - you can also make backups (a little different from snapshots) so you can restore to any previous state, especially for your instance

What if one of my containers fails / stops / crashes?

Well, it's generally not that bad. Maybe a glitch. As we're going to see, your apps are going to be separated from your data, thus it's possible to restart / stop / rebuild a container from scratch without losing anything.

If you're away on vacation while it crashes, and you don't have access to a computer to restart it, chances are you don't need your cloud anyway ;)

4. 🖥 Apps

Ok enough talking (sorry for that), let's get to the point: what apps are we going to use ?

This list and my choices are totally personal. If you want more (a lot more) options, check out Kickball's GitHub repository, which has an extensive list for all kinds of needs.

a. Drive

Ok let's start with the obvious solution you need: storing documents in the cloud.


I've settled on Cozy (https://cozy.io); it's a French solution. It's still a bit young and sometimes buggy, but their interface is very intuitive, very clear, and it does the job well. They have a mobile app and a desktop app that are getting updated almost on a daily basis.


They take security and privacy quite seriously, even on their hosted plan (very reasonably priced by the way). Their CPO was Tristan Nitot (now heading the Qwant search engine), founder and former president of Mozilla Europe, and they are likely to take over some cloud solutions in the French government and administrations.

Unfortunately their docs for the self-hosted version are not on par yet, but I've already done that work for you with the dockerfile so you won't have to deal with outdated instructions.


The beauty of Cozy is its simple set of features: it's clearly aimed at single users and not at companies (at least in my opinion). You can still share folders quite easily with people nevertheless, which is very practical.

Extra modules are available (photos, contacts and banks, to name a few), and they share the same "simple, functional" philosophy as their drive counterpart.

Alternatives

  • ownCloud : I've been using it in a work context and it's pretty powerful, with a lot of extensions and modules. The desktop clients are robust, and they have a strong community if you run into issues.
  • Seafile : quite powerful too, the team is Chinese, so you might want to check that it complies with your needs and principles. Otherwise, it has everything you need.
  • NextCloud : I haven't tested this one, but it's the go-to solution that a lot of experts advocate. Might be worth it even if a bit complicated I think for a single-user solution.

b. Bookmarks

I use bookmarks a lot when researching the web, so I needed a solution to store them and access them anywhere easily.

I obviously didn't want to use any vendor solution such as Firefox Sync or iCloud, to be able to use any browser, and of course so I could self-host this too.

X-browser sync (https://www.xbrowsersync.org/) seemed to me the perfect solution : simple, encrypted, robust.


And it has Docker instructions, too.

Alternatives

c. Password manager

This is a must-have for your own private cloud.

I've looked into strong and robust solutions, and I found Passbolt to be a very good contender.


It uses GnuPG, is open-source, has a clear roadmap, and its development is taking place in the heart of Europe (Luxembourg).

Moreover, they answer quite quickly (on Github at least) which is very nice.

They have a plugin for Firefox / Chrome / Edge (with the Chromium engine), but not Safari yet.


Alternatives

  • Bitwarden looks great but it threw me a bit off that one needs to request an id and key (https://bitwarden.com/host/) to self-host. It seems quite powerful and has a strong documentation.

d. Tasks / Boards

Everybody loves a good Kanban board to sort things out. It deserves a space in your private cloud.

The only worthy Trello opponent I found is Wekan.


It looks very much like Trello, you can import boards (but beware: if the boards are too big, it will fail silently!), and you can tweak settings for every part of the software.


It has a docker image already, which is a plus.

Alternatives

  • TaskBoard: nice, but it was lacking the Trello look and feel. Also, on the technical side, it was not as easy to install properly and maintain
  • Restyaboard: same here, very interesting but the interface did not suit me at all.

e. Notes

This was an easy choice. There is one player out there that is really ahead of the competition: Standard Notes.


It's simple, encrypted, it has a robust set of features and you can add extensions that are simple JS apps (there are a few already available, such as theming, markdown editors, spreadsheets ...).


Yet the apps are beautifully designed and are available on MacOS, Android, iOS ... and there is a web app too, that you can use even if you self-host your instance.

I use it daily and it's a pleasure to take notes with it.

Alternatives

To be really honest, there are a lot, but none of them comes close to SN. Check the "Others" section below for a list (not curated by me).

f. Calendar / Contacts

The de facto standard for calendar and contacts exchange on the Internet is CalDav/CardDav, but it's a complicated protocol / standard.

There are a few options out there and I decided to go with one that was on a stack I'm familiar with : PHP and MySQL, so I could easily tweak the code if needed.

So here comes Baikal.


Baikal has a dockerized installation, is pretty easy to install and configure, and has multi-user support. It works out-of-the-box with all my macOS apps (calendar, contacts) and also with all my Android apps (see below).

There is a web dashboard that gives you a lot of information at a glance :

Baikal-Dashboard

The only problem I'm still trying to mitigate is that answers to invitations do not change the invitee RSVP status in my calendar: for instance, if I create an event and invite a few people on Gmail addresses, and they respond to the invitation, I will receive a mail, but my calendar event will not be updated.

This may be a configuration problem, but I think it's a deeper issue; I'll try to investigate later on to see if this can be fixed, but I have to up my knowledge of the CalDAV standard first...

Oh, you're on Android ?

Unfortunately getting a CalDav / CardDav server to work on a vanilla Android phone is tricky. I would recommend using the very nice DavX application. It's open-source too (https://gitlab.com/bitfireAT/davx5-ose) but at 4€, it's worth just paying for it.

Alternatives

  • SOGo: more complete and extensive, but I needed something simpler since I am planning to use it as a single user
  • Radicale: a more bare-bones solution, but not updated recently
  • Calendar Server (Apple): seems very robust and well thought out, but I'm wary of the amount of effort Apple has put into the documentation and the open-source community on this

g. Raw syncing

So we already have a drive solution, with a web interface so we can access our documents anywhere, anytime, with just an Internet connection.

But not all our documents need to be accessed this way.

You could also have a need to sync a large collection of photos / files to the cloud as a kind of backup, or to be able to retrieve it on another machine (work / home for instance).

I chose Syncthing for that.


Syncthing is purely decentralized, so you can easily add "nodes" to improve your data resilience. It is easily dockerized.

It has a neat web interface to see how things get replicated across your networks of devices, but it's really a "fire and forget" type of software, which is pretty convenient.


Alternatives

  • Seafile, ownCloud and Nextcloud (that we presented above) are solutions that could work for this too

h. Others ?

Maybe you need another type of web app that I didn't include here. Rejoice! There is very surely an open-source and easily dockerizable piece of software out there waiting for you.

As a start, Edward has compiled a very comprehensive list on Github here : https://github.com/Kickball/awesome-selfhosted so you can find your dream package to install in your cloud.

Some of them already have a dockerfile / Docker hub repository, some don't but it's generally not a big deal to create one.

5. ▶️ Go!

Now's the technical part at last !


We'll need a few tools to do this.

On your local machine, you'll need docker, docker-machine and docker-compose. At some point, you'll need to clone the repository where I have put all the configuration / scripts, so git is kind of mandatory.

You won't need to install a VM on your local machine since we'll use Docker to remote control another Docker instance (that will be in the cloud). So hopefully you don't need to install Virtualbox or Hyperkit. Nevertheless, if you want to test your config locally when making changes, I'd recommend that you have a minimal VM locally.

a. Get docker and compose up and running on your local machine

I won't dwell on this. There are many tutorials on the web if you are not an expert, and the Docker documentation is very comprehensive.

b. Retrieve the repo

Head to https://github.com/tchapi/own-private-cloud and clone it locally.

The repository uses submodules, don't forget to init them.

git clone https://github.com/tchapi/own-private-cloud
cd own-private-cloud
git submodule update --init

I have tried to organize it properly so it is understandable. The main part is of course the top-level docker-compose.yml file.

In the build folder reside all the needed Dockerfiles.
All the configuration files (that will ultimately be copied to the containers) live in configurations.
And finally, locally-executed scripts are in the scripts folder.

c. Create your cloud infrastructure

In this example I will use OVH Public Cloud as my infrastructure provider.

For my cloud, I use the smallest possible production instance, a b2-7 (see screenshot below for specs), and it's largely sufficient in terms of CPU and RAM, but you can adapt accordingly.

Disclaimer: these instructions only work as-is with OVH Public Cloud. It should be relatively straightforward to adapt them to another cloud provider, since most of the steps are just Docker CLI commands.

0. Sign up for a public cloud account

Obviously.

1. Create a new instance

This will be your main machine and Docker host.

You could be tempted to use the web interface like below but don't :

Screenshot-2019-09-27-at-15.01.37

Why ? Because if you do so, you won't be able to manage your Docker host with your local docker-machine utility, since it will have no record of having created it.

You're better off doing this with the command line.

But first, we need to do two things:

First, you need to retrieve your credentials for the underlying OpenStack service.

To do so, go to the Users panel, create a user if you haven't done so yet, and then select "Download OpenStack's RC file" from the menu :

Screenshot-2019-09-27-at-15.14.37

Select the datacenter you want to use primarily (it should be the one your instance will live in; here, I chose GRA5), check "V3", then click Download:

Screenshot-2019-09-27-at-15.14.48

The resulting openrc.sh file will need to be executed whenever you want to use Docker to control your machines, so the correct environment variables are set beforehand.

Second, you need to add at least one SSH key to your account so that you can log in to your machine (and so that Docker can).

In the SSH Keys menu, add a new key, and give it a name (for instance, HOME) :

Screenshot-2019-09-27-at-15.02.30

(PS: You may need to create the SSH key in the Horizon web interface too. It's OpenStack's own web GUI, accessible from the menu on the left.)

Once you know the model you want to create (like b2-7), you're ready to create the instance via a cli. On your local machine, source openrc.sh:

source openrc.sh
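If you want to make sure the RC file has actually been sourced in the current shell before going further, a small check like the one below can help. It is only a sketch: the function name is made up, and I'm assuming the four standard OpenStack variables listed here are the ones that matter; your RC file may define more.

```shell
# Hypothetical helper: returns 0 when the usual OpenStack variables
# are set and non-empty in the current shell, 1 otherwise.
check_openstack_env() {
  local var missing=0
  for var in OS_AUTH_URL OS_USERNAME OS_PASSWORD OS_REGION_NAME; do
    # ${!var} is bash indirect expansion: the value of the variable named by $var
    if [ -z "${!var:-}" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Typical use, after `source openrc.sh`:
#   check_openstack_env && echo "ready for docker-machine"
```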

Then, use docker-machine to create the instance :

docker-machine create -d openstack \
  --openstack-flavor-name="b2-7" \
  --openstack-region="GRA5" \
  --openstack-image-name="Debian 9" \
  --openstack-net-name="Ext-Net" \
  --openstack-ssh-user="debian" \
  --openstack-keypair-name="HOME" \
  --openstack-private-key-file="path_to/.ssh/id_rsa" \
  default

Some notes:

  • OVH uses OpenStack as its backend for its cloud, so we use the openstack driver to talk to it
  • use the same datacenter code as in your RC file
  • use the same SSH key as the one you uploaded
  • name the machine default: it's not mandatory, but if you do, you won't have to type the name of your machine when invoking docker-machine

The rest should be straightforward.

Wait a few seconds and 💥! You have a Docker host living on a Debian 9 instance in the cloud.

You can check that everything is correct by going to the Instances web interface. Your default instance should be there, with your brand new IP (make a note of it!):

Screenshot-2019-09-27-at-14.59.17

2. Configure your instance

Now, we need to tweak the host a little bit before we can start working with it.

Entropy

For a cloud instance to be able to generate entropy at a correct rate, we need the haveged package. This will be necessary for the password manager, for instance:

docker-machine ssh default 'sudo apt update && sudo apt install -y -f haveged'

Paths and mounted volumes

If you want to do things correctly, you want to store your data (databases, files, etc) not directly on the instance, but rather on an attached volume.

Why? Because that will allow you to :

  • Rebuild your instance without losing any data
  • Backup your data and your data only when you want
  • Generally speaking, "separate concerns": your data and your apps should not live in the same space

How to do this ? It's quite easy, you need to create a block storage device (or several, in my case), that you will then mount on your instance.

To create them, simply head to the Storage menu, and create two blocks. Here, one named "databases" that will store all database data files (MySQL, mongo, couch, ...), and one named "files" that will store all "real" files (the Cozy cloud files or Syncthing files for instance).

Screenshot-2019-09-27-at-14.59.26

Attach them to your default instance via the web interface too; they will be made available as disks.

Once it's done, you need to initialize these disks and mount them. Behold (using fdisk is beyond the scope of this article):

In the following lines, I assume that the databases volume is at /dev/sdb and the files volume at /dev/sdc.

The databases volume :

docker-machine ssh default 'sudo fdisk /dev/sdb # n, p, w'
docker-machine ssh default 'sudo mkfs.ext4 /dev/sdb1'
docker-machine ssh default 'sudo mkdir /mnt/databases && sudo mount /dev/sdb1 /mnt/databases'
docker-machine ssh default 'sudo mkdir /mnt/databases/mysql /mnt/databases/couch /mnt/databases/mongo'

The files volume :

docker-machine ssh default 'sudo fdisk /dev/sdc # n, p, w'
docker-machine ssh default 'sudo mkfs.ext4 /dev/sdc1'
docker-machine ssh default 'sudo mkdir /mnt/files && sudo mount /dev/sdc1 /mnt/files'
docker-machine ssh default 'sudo mkdir /mnt/files/cozy /mnt/files/sync'
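One thing to keep in mind: a plain mount does not survive a reboot of the instance. A common way to make it permanent is to add a line to /etc/fstab on the host. The helper below is a small sketch that only builds such a line; the function name is made up, and the device names are the same assumptions as above (/dev/sdb1 and /dev/sdc1).

```shell
# Hypothetical helper: print an /etc/fstab line for an ext4 volume.
# "nofail" lets the instance boot even if the volume is detached.
fstab_line() {
  local device=$1 mountpoint=$2
  printf '%s %s ext4 defaults,nofail 0 2\n' "$device" "$mountpoint"
}

# Possible use (appends to /etc/fstab on the remote host, run it only once):
#   docker-machine ssh default "echo '$(fstab_line /dev/sdb1 /mnt/databases)' | sudo tee -a /etc/fstab"
#   docker-machine ssh default "echo '$(fstab_line /dev/sdc1 /mnt/files)' | sudo tee -a /etc/fstab"
```

Device names such as /dev/sdb can change when volumes are re-attached, so using the UUID reported by blkid instead of the device path is usually the more robust choice.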

3. Test that Docker works well on the host

Now that your machine is up to date, let's try to see if Docker works correctly on the host.

First, we need to tell your local Docker instance what to control. For that, docker-machine gives us the correct environment variables that we need to use:

docker-machine env default

To eval those vars automatically, just do :

eval $(docker-machine env default)

Once it's done, your local docker utility should directly control your remote Docker host. Try :

docker info

... to see if it works ! This should return information from your instance (operating system should be Debian 9, etc)
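Since it is easy to open a new terminal and forget the eval step, here is a tiny guard you could put in your own scripts. It is a sketch with a made-up function name, and it relies on the fact that docker-machine env exports a DOCKER_HOST of the form tcp://IP:PORT.

```shell
# Hypothetical guard: succeeds only when the docker CLI in this shell
# is pointed at a remote host over TCP (i.e. DOCKER_HOST=tcp://...).
docker_targets_remote() {
  case "${DOCKER_HOST:-}" in
    tcp://*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Possible use at the top of a deployment script:
#   docker_targets_remote || { echo "run: eval \$(docker-machine env default)" >&2; exit 1; }
```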

If it's all good, you're set up to create containers on your host, from the comfort of your local terminal 🎉.

You can now use all of Docker commands to manage your host and containers.

d. Add or remove services

Now, before you deploy, you might not need all of these services I have described.

Here is the dependency graph so you can change the docker-compose file easily (each block is a container):

depgraph

Note that if you remove a service (let's say for instance, Calendar), you must remove its network from the networks list, and from the reverse-proxy networks too.

The certbot container is standalone, but uses a shared volume with the reverse-proxy container, so that the certificates are available for the two of them.

The only container that is exposed to the web is the reverse-proxy (through port 80 of the host). This container depends on all others just because it needs to know the backend for all virtual hosts.

For instance, in the nginx config of this container (just an excerpt) :

upstream docker-passbolt {
    server passbolt;
}

...

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name passbolt.mydomain.com;

    ssl_certificate /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;

    location / {
        proxy_pass  http://docker-passbolt;
        proxy_redirect     off;
        proxy_set_header   Host $host;
    }
}

If the passbolt container is not up when nginx starts, it will refuse to start because it can't resolve the upstream.

I could have done one file per virtual host, or a different setup that would allow the reverse-proxy container to not depend on the other containers, but I guess modifying an nginx configuration file is relatively easy.

e. Tweak the configuration

You will notice a .env.dist file in the root of the repository. You must duplicate this file to a .env file before building or creating containers.

This .env file is very important : it will allow you to tweak the configuration for your entire cloud, namely :

  • the reverse DNS for each service
  • the password for each database, service, web interface
  • your mail configuration
  • some other container-specific configuration

This file is a bit cumbersome to fill in, and it's due to the fact that you must duplicate information in different parts because different containers do not use the same env vars.

This would be easy if the file was a regular env file (in a shell way of speaking), but it's not, and Docker doesn't like variable replacement in this file, so 🤷.

Beware of shell expansions too, if you're using special characters in the passwords. I've put an example in the dist file.
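Because the file is long, it is easy to forget a key when copying it over. The following sketch lists the keys that exist in .env.dist but not in .env. The function name is made up, and it assumes plain KEY=value lines, which is the usual shape of such files.

```shell
# Hypothetical helper: print every KEY defined in the dist file
# that has no KEY= line in the env file.
missing_env_keys() {
  local dist=$1 env=$2 key
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$dist" | cut -d= -f1 | while read -r key; do
    grep -q "^${key}=" "$env" || echo "$key"
  done
}

# Possible use, from the root of the repository:
#   missing_env_keys .env.dist .env
```

An empty output means every key of the dist file is at least present in your .env; it says nothing about whether the values are correct.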

f. Build the custom images

The docker-compose file is based upon official images and custom ones that I modified. Before launching, and to check that your configuration tweaks did not break anything, it's a good idea to build them.

Beforehand though, you need to build the configuration files. Configuration files are copied onto the various containers and I created them so that the environment variables are written directly in the configuration, to make the whole repository agnostic of my own implementation.

I use a very simple templating system where configuration files include bash-style env vars, like so (see the $NOTES_DOMAIN var):

{
  "url": "https://$NOTES_DOMAIN/extensions/secure-spreadsheets/dist/index.html",
  "download_url": "https://github.com/sn-extensions/secure-spreadsheets/archive/1.3.2.zip",
  "latest_url": "https://$NOTES_DOMAIN/extensions/secure-spreadsheets.json"
}

These vars are simply replaced when running the script :

./scripts/build-configuration-files.sh

The only caveat of this system is that you need to escape any literal $ as \$.
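For readers curious about how this kind of substitution can work, here is a minimal sketch, not the actual content of the script: it lets the shell expand a template as an unquoted here-document. It shows the same behaviour as described above (`$NOTES_DOMAIN` is replaced, `\$` yields a literal `$`). The function name `render_template` and the `__TEMPLATE_EOF__` delimiter are hypothetical.

```shell
#!/bin/sh
# Read a template on stdin and print it with $VARS expanded by the shell.
# The template is wrapped in an unquoted here-document, so "\$" stays a literal "$".
# Warning: $(...) and backticks in the template are executed too; trusted files only.
render_template() {
  eval "cat <<__TEMPLATE_EOF__
$(cat)
__TEMPLATE_EOF__"
}

NOTES_DOMAIN=notes.example.com
printf '%s\n' '"latest_url": "https://$NOTES_DOMAIN/extensions/secure-spreadsheets.json"' | render_template
# prints: "latest_url": "https://notes.example.com/extensions/secure-spreadsheets.json"
```

The double quotes of the JSON survive because quotes are not special inside a here-document; only `$`, backticks and backslashes are.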

Once it's done, you can build the different custom containers :

docker-compose build

Configuring things before launch

Some images need to be prepared before they can be started, or else they will fail and quit.

This is especially true for the reverse-proxy that needs SSL certificates (we only serve https, you should too).

To set dummy SSL certificates, then launch nginx and retrieve the real certificates from Let's Encrypt, you should run this script :

./scripts/certbot/init-letsencrypt.sh

It relies heavily on the configuration of the domains that is in the .env file, so make sure that everything is ok in there before running it.

Of course, before that, you must have created the correct DNS entries pointing to your instance (the IP you noted down earlier).

As for the Cozy container, you must create what they call an instance before being able to connect to your cloud. I guess that this is related to multi-user installations. Just run the script below once :

./scripts/cozy/init-cozycloud.sh

NB : you can change the quota (10GB) in the script directly if you need more, but then be careful not to set a quota above your real disk capacity (it's part of the files block storage from before, that we created via the OVH web interface).
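Before raising the quota, you can compare it with the size of the volume. A minimal sketch, assuming the quota is expressed in GiB and that you know where the block storage is mounted; the function name `quota_fits` and the `/mnt/files` path in the usage note are hypothetical.

```shell
#!/bin/sh
# Succeed when quota_gb does not exceed the total size of the filesystem holding path.
quota_fits() {
  quota_gb=$1
  path=$2
  # df -P guarantees one line per filesystem; -k reports sizes in 1 KiB blocks.
  total_kb=$(df -Pk "$path" | awk 'NR == 2 { print $2 }')
  [ $((quota_gb * 1024 * 1024)) -le "$total_kb" ]
}
```

For instance, `quota_fits 50 /mnt/files || echo "quota is larger than the volume"` warns you before you edit the script.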

A note about these custom images and scripts

So as I was saying, I mainly use custom images derived from the official ones to add my own tweaks. Here are some details of what I did, for those interested.

Standard notes

I've included a few extensions I find useful in the container.

To add an extension to your desktop Standard Notes app, go to "Extensions" in the bottom bar, click "Import extension" and paste the link to the JSON description file of the extension you want :

On your custom domain : https://notes.mydomain.com/extensions/

  • Advanced markdown editor : advanced-markdown-editor.json
  • Plus editor : plus-editor.json
  • Secure spreadsheets : secure-spreadsheets.json
  • Simple task editor : simple-task-editor.json
  • Autocomplete tags : autocomplete-tags.json

NB : Side note on this; Standard Notes says that the extensions are "public source" but not "open source", which I don't fully understand, to be honest (see here, where there is no clear answer to "Can I self host existing extensions?"). The source code is published on Github, without a license. In my repository I only link to it via submodule, which I think is pretty much in line with Github's Terms, but if you're from Standard Notes and want me to remove this, just contact me and I will comply.

Nginx - reverse proxy

I've created a quite reasonable configuration file by following the best practices for SSL parameters and HTTPS.

All services are only served in HTTPS and if possible, via HTTP/2.

All insecure requests are redirected to their secure counterpart, except for the certificate challenges.

With all these settings, all the front websites should achieve at least an A rating on SSL Labs.

Baikal - calendars and contacts

The official Dockerfile was not really up to date and was including too many things like postfix for instance, so I revamped it quite a lot.

Following binfalse's article, I also used ssmtp in lieu of sendmail to be able to route the emails through an external SMTP server.

Check out the Dockerfile to see exactly all the steps.

xBrowser sync

The configuration I created allows only one sync (this is a personal cloud), and by default, does not allow new syncs.

That means that you need to create the container with a different configuration allowing new syncs, create your sync id (via the browser extension), and then, if you wish, recreate the container with the initial configuration (not allowing new syncs). This is just a security measure.

Passbolt

By default on the new versions, you are disconnected every 24 minutes because of the way PHP manages sessions.

So I extended the session garbage collector lifetime, so that you can stay connected for 3 days without having to log in again.

(see this Github issue for more context)
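The 24 minutes are not arbitrary: PHP's `session.gc_maxlifetime` setting defaults to 1440 seconds. Assuming this is the setting being raised (check the Dockerfile for the exact change), the arithmetic for a 3-day lifetime looks like this; the helper names are hypothetical.

```shell
#!/bin/sh
# Convert a number of days to seconds.
days_to_seconds() { echo $(( $1 * 24 * 60 * 60 )); }

# Convert a number of seconds to whole minutes.
seconds_to_minutes() { echo $(( $1 / 60 )); }

seconds_to_minutes 1440   # PHP default lifetime -> prints 24
days_to_seconds 3         # 3 days               -> prints 259200
```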

6. 🚀 Launch !

It's time.

docker-compose up -d

NB : -d runs the containers in detached mode (in the background).

After having done that, you still need to :

Init the Baikal instance if needed (if the tables do not already exist in the database)

./scripts/baikal/init-mysql-tables.sh

Create the Passbolt admin user

./scripts/passbolt/init-admin-user.sh

It will give you a URL that you need to visit to set up your account, your key, etc.

You should now have secured, working services behind your custom subdomains or domains. Congrats 🎉 !

7. 💭 Final words

A note on E2E encryption

Back to privacy. As you have seen, not all solutions I use here offer end-to-end encryption.

Only the notes, passwords and bookmarks are fully E2E encrypted.

The other solutions often provide transport encryption (like Syncthing) and of course we have https, but if someone gets hold of the table containing your Wekan boards and cards, they will be able to read it for sure.

So it's up to you to decide on which services you want to create data. For now, I'm pretty happy with the non-encrypted drive, sync and kanban boards because my main focus is privacy, and not a total security from a hacker that would explicitly decide to extract my data and files.

If you're concerned about this, Cryptpad apparently has a neat solution for an encrypted Drive + cloud apps, with a Dockerfile on their Github. I haven't tested it though.

As for the calendar / contacts, I haven't found an open-source encrypted calendar solution yet. Some providers offer it as a service (for instance, Tutanota), but that seems to be quite a niche.

I guess the best solution here would be to dive into the Baikal source code, fork it and add encryption on persistence. Feasible, but could be quite a challenge (see this).

NB : The CalDav / CardDav protocols are plain-text in essence, hence the importance of https.

If you find good encrypted alternatives, do not hesitate to put them in the comments below this article.

Backup

Your storage is more likely to fail than someone is to try to steal your data. So backups are an important part of your private personal cloud.

I touched on the subject before, but to recap, it's a good idea to back up your block storage from time to time.

On OVH (my provider of choice), it's relatively easy with the web interface, where it's called Volume snapshot : just select your storage, click "create a snapshot" and you're done.

NB : Of course, you can also do this automatically with the OpenStack API / the Horizon interface, but this is out of the scope of this article.

As for the instance, I think it's less important since your instance is basically a configuration file, so if it crashes, you can just recreate the containers on a new instance with a single command line, without losing any data (only downtime).

💰 Donate or contribute

Ok so now you have your own private cloud and all these apps working well. Time to thank all the people that made this possible!

I would encourage you to donate to all these projects, for instance the amount that you would have paid for a month or two of services (for those that provide a cloud-based solution), or a small reasonable fee (like 10€ ?) for the others.

We've worked with free software here, free as in speech. They're also free (as in beer) to download and self-host, but if we want the maintainers of these tools to continue to make releases, patch bugs and add features, we need to acknowledge that these people need to earn a living, too. A little gesture is always welcome.

You can also get involved with the development of these tools if you have the ability and time to do so : by writing clear and precise bug reports, by submitting pull requests, or helping in any other way possible. That's the beauty of open-source.


Extra literature

Some articles I found while researching my self-hosting mania with Docker:

If you want to go further and also self-host your emails, Gilles Chehade (poolp) made a very nice article about this very topic: https://poolp.org/posts/2019-09-14/setting-up-a-mail-server-with-opensmtpd-dovecot-and-rspamd/