Simuliidae: an open-source design for hosting with containers

2020-06-29 11:35

My recent blog posts have talked about containers, Docker and why I think they're a good fit for hosting CiviCRM. Here's a deeper dive into two design ideas that are part of my Simuliidae open source hosting project.

You can try these out using the project's Quick Start.

What does Open Source hosting even mean?

I'll admit, it's not really A Thing. Open Source usually applies to tools, and most hosting services use lots of open source tools. By describing this as an open source hosting project, I'm describing my intent: to be transparent about how I'm using the different tools, and to share the bits of glue code I use. It's not designed as a turn-key open source solution (you'd get a lot closer to that goal using something like Aegir).

Container Images, Microservices and File Persistence

Any container-based hosting solution needs to make strategic decisions about what the images look like, and there are lots of options. For example, Pantheon's web server images don't contain any PHP code; they're just a receptacle for the code, which gets pulled in via git in a clever custom workflow. A more conventional image might be similar, but hold all the PHP code in a volume container. And then images like the official Drupal Docker image, or most of the CiviCRM Docker images out in the wild, contain the Drupal (and/or CiviCRM) code, so that upgrading Drupal (or CiviCRM) means updating the image itself (and keeping the other files via mounted volume containers).

Orthogonal to those decisions are ones about how many containers to use, i.e. whether to follow the Docker orthodoxy of one service per container or to combine some functions/services into one container. My point is that there is no single agreed-upon answer to these questions, and you should first of all try to understand why they are important. The answer you choose may depend on the tools you pick, but also on the kind of hosting you're looking to support (e.g. are you using a container hosting service? Running local copies on your desktop?).

So what's my answer? Unfortunately for you: it depends.

Using the Docker Compose Format, and the script

I use the Docker Compose file format to answer these questions. It is a declarative, machine-readable way to describe which images make up your application, how they fit together, and which volumes provide persistence. Naturally, it can also be used with the eponymous docker-compose program to actually launch a 'site'.

A docker compose file can get very complicated, and hard to maintain and understand, so one of my innovations in this project is to generate docker compose files from smaller, easier-to-maintain bits. I do this using the 'docker-compose' tool, which is usually used to launch applications, but which can also combine multiple docker-compose files and write out the merged result.
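Here's a minimal sketch of that merging step. It is not the project's actual script: the fragment file names, image tags and service definitions are made up for illustration. The parts it relies on are that docker-compose accepts several -f files and that its config subcommand prints the combined result, which I'm assuming is the feature described above.

```shell
#!/bin/sh
# Sketch only: fragment names and contents are hypothetical, not Simuliidae's.
set -eu

workdir=$(mktemp -d)

# Fragment 1: the web server and its database.
cat > "$workdir/web.yml" <<'EOF'
version: "3"
services:
  web:
    image: drupal:7-apache        # placeholder image tag
    ports:
      - "8080:80"
    volumes:
      - sites:/var/www/html/sites # persist site files outside the image
  db:
    image: mariadb:10.4           # placeholder image tag
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: "yes"
volumes:
  sites:
EOF

# Fragment 2: an optional cache service.
cat > "$workdir/redis.yml" <<'EOF'
version: "3"
services:
  redis:
    image: redis:5                # placeholder image tag
EOF

# Merge the fragments into one file, if docker-compose is installed.
if command -v docker-compose >/dev/null 2>&1; then
  docker-compose -f "$workdir/web.yml" -f "$workdir/redis.yml" config \
    > "$workdir/compose.yml"
  echo "wrote $workdir/compose.yml"
else
  echo "docker-compose not found; fragments left in $workdir"
fi
```

Later -f files extend or override earlier ones, so the order of the fragments matters when two of them define the same service.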

The result is this file: which I suggestively named because the result is a docker-compose file that can be used to launch a 'full stack' for a site, i.e. at least a web server and db server, but also potentially other services (see the 'admin' service next, but also things like a redis server for better performance).

That also lets me tailor my hosting services more easily, providing different tools/capacities for different clients/prices.
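To illustrate the tailoring idea, here's a small sketch that picks compose fragments based on a hosting plan. The plan names ("basic", "premium") and the fragment files (web.yml, admin.yml, redis.yml) are hypothetical; the real project may organise this quite differently. It only prints the docker-compose command it would run.

```shell
#!/bin/sh
# Sketch only: plan names and fragment files are hypothetical.
set -eu

# Print the docker-compose -f arguments for a given hosting plan.
compose_args_for_plan() {
  plan=$1
  args="-f web.yml -f admin.yml"           # every site gets web/db and admin
  case "$plan" in
    basic)   ;;                            # nothing extra
    premium) args="$args -f redis.yml" ;;  # add a cache service
    *) echo "unknown plan: $plan" >&2; return 1 ;;
  esac
  echo "$args"
}

# Print (rather than run) the command that would build the full file.
echo "docker-compose $(compose_args_for_plan premium) config > compose.yml"
```

Keeping the plan-to-fragment mapping in one function means adding a new service for some clients is a one-line change, and the fragments themselves stay small.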

If you're trying out the quick start, you'll see that step 2 uses this script to generate the docker compose file at "7/apache/compose.yml".

The "admin" service

Another innovation (one I haven't seen elsewhere) is an 'admin' container. All my hosted sites have one of these, and it's essentially a copy of the web server, with a few additions.

The main addition is having Drush installed, and I use that for every site to run the Drupal site crons. There are other ways of accomplishing this, but having a Drush that can operate on every site in a programmatic way is essential for any mass site management (e.g. upgrades!). I also confess that I sometimes update code in production sites directly, and this container supports that questionable activity.
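As a sketch of what "programmatic" can look like, the loop below runs Drupal cron in each site's admin container. The sites/<name>/compose.yml layout and the service name "admin" are assumptions for illustration, not the project's real layout. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only: the sites/<name>/compose.yml layout is hypothetical.
set -eu

SITES_DIR=${SITES_DIR:-./sites}
DRY_RUN=${DRY_RUN:-1}   # default to printing commands instead of running them

run_cron_everywhere() {
  for compose_file in "$SITES_DIR"/*/compose.yml; do
    [ -f "$compose_file" ] || continue   # the glob matched nothing
    # -T disables pseudo-TTY allocation, which suits unattended runs.
    set -- docker-compose -f "$compose_file" exec -T admin drush cron
    if [ "$DRY_RUN" = 1 ]; then
      echo "$*"
    else
      # Keep going if one site fails, but report it.
      "$@" || echo "cron failed for $compose_file" >&2
    fi
  done
}

run_cron_everywhere
```

The same loop works for other mass operations by swapping "drush cron" for another Drush command.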

Of course, I could just include Drush in the web server, and the reasons I don't are a mix of "following the Docker Way" and security/reliability/performance. For example: every extra tool you add to the main web server provides an extra vector that can be used to attack it, so the less you expose, the better. The admin container doesn't increase the attack surface because it isn't exposed; it's only accessible from the infrastructure and its scripts. As a side benefit, the admin environment can be tuned for what it needs (fewer parallel processes, but using more memory), whereas the web server usually wants to limit the resources that any one process can use.
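In compose terms, "not exposed" can be as simple as never giving the admin service a ports section. The fragment below is a sketch under that assumption: the image name and the PHP_MEMORY_LIMIT variable are hypothetical, and the real admin service may be defined differently. The small check at the end guards against a port being published by accident later.

```shell
#!/bin/sh
# Sketch only: the image name and PHP setting are hypothetical.
set -eu

fragment=$(mktemp)

cat > "$fragment" <<'EOF'
version: "3"
services:
  admin:
    image: example/drupal-admin:7   # hypothetical: web image plus Drush
    volumes:
      - sites:/var/www/html/sites   # same site files as the web service
    environment:
      PHP_MEMORY_LIMIT: 1G          # hypothetical: the image would need to honor this
    # Deliberately no "ports:" section, so nothing is published on the host.
volumes:
  sites:
EOF

# Fail if the admin fragment ever starts publishing ports.
if grep -Eq '^[[:space:]]*ports:' "$fragment"; then
  echo "admin fragment must not publish ports" >&2
  exit 1
fi
echo "ok: admin fragment publishes no ports"
```

Services in the same compose project can still reach each other over the project's internal network, so the admin container can talk to the database without being reachable from outside.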

You might think this is a bit of a waste of resources (haven't I just doubled how much memory I need?), but it turns out that with the magic of Docker (overlay, copy-on-write), these containers take very little memory - essentially, only the extra memory required during program execution.

If you're trying out the quick start, you'll see that the admin container runs a script when it starts. If that script notices you don't already have a site, it generates one with Drush, using the configuration values in the evaluate.env file.
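A startup check along those lines might look like the sketch below. This is not the project's script: the variable names stand in for whatever evaluate.env actually defines, and the defaults are placeholders. It relies on Drush reporting "Successful" for the bootstrap field of "drush status" once a site is installed, and on the standard "drush site-install" options.

```shell
#!/bin/sh
# Sketch only: variable names are hypothetical stand-ins for evaluate.env values.
set -eu

DRUSH=${DRUSH:-drush}

# Placeholder defaults so the sketch is self-contained.
: "${DB_HOST:=db}" "${DB_NAME:=drupal}" "${DB_USER:=drupal}" "${DB_PASS:=changeme}"
: "${SITE_NAME:=Example site}" "${ADMIN_USER:=admin}" "${ADMIN_PASS:=changeme}"

# Succeeds if Drupal is already installed and bootstraps fully.
site_exists() {
  $DRUSH status --fields=bootstrap 2>/dev/null | grep -q Successful
}

# Install a site only when none is found.
ensure_site() {
  if site_exists; then
    echo "site already installed"
  else
    echo "no site found, installing"
    $DRUSH site-install standard -y \
      --db-url="mysql://${DB_USER}:${DB_PASS}@${DB_HOST}/${DB_NAME}" \
      --site-name="${SITE_NAME}" \
      --account-name="${ADMIN_USER}" \
      --account-pass="${ADMIN_PASS}"
  fi
}

# Only act when Drush is actually available (i.e. inside the admin container).
if command -v "$DRUSH" >/dev/null 2>&1; then
  ensure_site
else
  echo "drush not available; skipping"
fi
```

Because the check runs on every start and does nothing when a site exists, restarting the container is safe.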

Questions? Did you try out the Quick Start?