[A video of me talking about this is available here]
Docker is extremely popular with developers, having gone as a product from zero to pretty much everywhere in a few years.
I started tinkering with Docker three years ago, got it going in a relatively small corp (700 employees) in a relatively unregulated environment. This was great fun: we set up our own registry, installed Docker on our development servers, installed Jenkins plugins to use Docker containers in our CI pipeline, even wrote our own build tool to get over the limitations of Dockerfiles.
I now work for an organisation in arguably the most heavily regulated industry, with over 100K employees. The IT security department alone is bigger than the entire company I used to work for.
There’s no shortage of companies offering solutions that claim to meet all the demands of an enterprise Docker platform, and I seem to spend most of my days being asked for opinions on them.
I want to outline the areas that may be important to an enterprise when considering developing a Docker infrastructure.
If I’ve missed anything or you have any comments get in touch below or tweet @ianmiell
Registry
You will need a registry. There's an open source one (Distribution), but there are numerous offerings out there to choose from if you want to pay for an enterprise one.
- Does this registry play nice with your authentication system?
- Does it have a means of promoting images?
- Does it have role-based access control?
- Does it cohere well with your other artifact stores?
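As a concrete sketch of the basic workflow (the registry hostname and image name here are assumptions, substitute your own), pushing a locally built image to an internal registry looks like this:

```shell
# Hypothetical internal registry -- substitute your own hostname and port.
REGISTRY="registry.internal.example.com:5000"
IMAGE="myteam/myapp:1.0.0"

# Re-tag the locally built image with the registry prefix, then push it.
# Guarded so the sketch degrades gracefully where no Docker daemon is running.
if docker info >/dev/null 2>&1; then
  docker tag "$IMAGE" "$REGISTRY/$IMAGE"
  docker push "$REGISTRY/$IMAGE"
fi

# The fully qualified name other teams would pull:
echo "$REGISTRY/$IMAGE"
```

Anything that sits in front of that push — authentication, role-based access control, promotion — is where the enterprise questions above start to bite.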
Image Scanning
An important one.
When images are uploaded to your registry, you have a golden opportunity to check that they conform to standards. For example, can questions like these be answered:
- Is there a Shellshock-vulnerable version of bash in there?
- Is there an out-of-date SSL library?
- Is it based on a fundamentally insecure or unacceptable base image?
Static image analysers exist and you probably want to use one.
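As one example of wiring a scanner into the upload path, the open source Trivy scanner can gate a pipeline on known CVEs (the image name is illustrative):

```shell
IMAGE="myteam/myapp:1.0.0"
SEVERITIES="HIGH,CRITICAL"

# Fail the pipeline if HIGH or CRITICAL vulnerabilities are found:
# --exit-code 1 makes trivy return non-zero when any matches exist.
if command -v trivy >/dev/null 2>&1; then
  trivy image --severity "$SEVERITIES" --exit-code 1 "$IMAGE"
fi
```

A registry that supports webhooks or admission checks can run this automatically on every push rather than trusting teams to remember.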
Image Building
How are images going to be built? Which build methods will be supported and/or are strategic for your organisation? How do these fit together?
Dockerfiles are the standard, but some users might want to use S2I, Docker + Chef/Puppet/Ansible or even hand-craft them.
- Which CM tool do you want to mandate (if any)?
- Can you re-use your standard governance process for your configuration management of choice?
- Can anyone build an image?
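For reference, the standard Dockerfile route is a few lines of shell; everything else (S2I, CM tools) is a variation on producing the same artifact. A minimal sketch, assuming a hypothetical corporate base image:

```shell
# Set up a throwaway build context with a trivial app.
mkdir -p /tmp/build-demo/app && cd /tmp/build-demo
printf '#!/bin/sh\necho hello\n' > app/run.sh

# The base image name is an assumption -- substitute your corporate base.
cat > Dockerfile <<'EOF'
FROM registry.internal.example.com:5000/base/ubuntu:22.04
COPY app /app
CMD ["/app/run.sh"]
EOF

# Guarded: only attempt the build where a Docker daemon is available.
if docker info >/dev/null 2>&1; then
  docker build -t myteam/myapp:1.0.0 .
fi
```

Whether "anyone" may run that `docker build`, and on which hosts, is the governance question.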
Image Integrity
You need to know that the images running on your system haven’t been tampered with between building and running.
- Have you got a means of signing images with a secure key?
- Have you got a key store you can re-use?
- Can that key store integrate with the products you choose?
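Docker's built-in mechanism here is Content Trust (backed by Notary): with it enabled, pushes are signed and pulls verify signatures. A sketch (the registry name is illustrative):

```shell
# With Content Trust on, 'docker push' signs the image and
# 'docker pull' refuses images without a valid signature.
export DOCKER_CONTENT_TRUST=1

if docker info >/dev/null 2>&1; then
  docker push registry.internal.example.com:5000/myteam/myapp:1.0.0
  docker pull registry.internal.example.com:5000/myteam/myapp:1.0.0
fi
```

The hard part in an enterprise is not the flag but the key management behind it — hence the key store questions above.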
Third Party Images
Vendors will arrive with Docker images, expecting there to be a process for adopting them.
- Do you have a governance process already for ingesting vendor technology?
- Can it be re-used for Docker images?
- Do you need to mandate specific environments (eg DMZs) for these to run on?
- Will Docker be available in those environments?
SDLC and Patching
If you already have software development lifecycle (SDLC) processes, how does Docker fit in?
- How will patches be handled?
- How do you identify which images need updating?
- How do you update them?
- How do you tell teams to update?
- How do you force them to update if they don’t do so in a timely way?
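One sketch of an answer to "which images need updating": mandate a label recording the base image at build time, then query on it when that base needs patching (the label key is an assumption, not a standard):

```shell
# Hypothetical mandated label identifying the base an image was built from.
BASE_LABEL="com.example.base=ubuntu:22.04"

if docker info >/dev/null 2>&1; then
  # List local images built on the base that now needs patching.
  docker images --filter "label=$BASE_LABEL" \
    --format '{{.Repository}}:{{.Tag}}'
fi
```

Run against your registry's API rather than a single host, this becomes the list of teams to chase.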
Secrets
Somehow, information like database passwords needs to be passed into your containers. This can be done at build time (probably a bad idea) or at run time.
- How will secrets be managed within your containers?
- Is the use of this information audited/tracked and secure?
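A minimal run-time sketch (the secret path and image name are illustrative — a real setup would use a dedicated secret store):

```shell
# Read the secret at run time from a file the platform provides.
# Falls back to a placeholder here purely so the sketch is self-contained.
DB_PASSWORD="$(cat /run/secrets/db_password 2>/dev/null || echo placeholder)"

if docker info >/dev/null 2>&1; then
  # Injected at run time: the secret never gets baked into an image layer,
  # which is the main hazard of the build-time approach.
  docker run --rm -e DB_PASSWORD="$DB_PASSWORD" myteam/myapp:1.0.0
fi
```

Note that environment variables are still visible via `docker inspect`, which is why the audit question above matters.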
Base Image
If you run Docker in an enterprise, you might want to mandate the use of a company-wide base image:
- What should go into this base image?
- What standard tooling should be everywhere?
- Who is responsible for it?
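A base image definition can itself answer the ownership question if it records who owns it and which version it is. A sketch — the tooling choices and label keys are illustrative only:

```shell
cat > /tmp/Dockerfile.base <<'EOF'
FROM ubuntu:22.04
# Standard tooling every container inherits -- illustrative choices.
RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates curl tzdata \
    && rm -rf /var/lib/apt/lists/*
# Record version and ownership so consumers can be tracked and chased.
LABEL com.example.base-version="1.0.0" \
      com.example.owner="platform-team"
EOF
```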
Security and Audit
The ‘root’ problem
By default, access to the docker command implies root privileges over the whole host. This is unlikely to be acceptable to most security teams in production.
- Who (or what) is able to run the docker command?
- What control do you have over who runs it?
- What control do you have over what is run?
Solutions exist for this, but they are relatively new.
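To see why this matters, note that anyone who can run the docker command can mount the host's root filesystem and act as root on it:

```shell
HOST_MOUNT="/:/host"

if docker info >/dev/null 2>&1; then
  # Mount the host's / into the container, chroot into it, and run a
  # command as root on the host's own filesystem. Access to docker
  # is, in effect, root access to the host.
  docker run --rm -v "$HOST_MOUNT" alpine chroot /host id
fi
```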
Monitoring what’s running
A regulated enterprise is likely to want to be able to determine what is running across its estate, and what cannot be accounted for.
- How do you tell what’s running?
- Can you match that content up to your registry/registries?
- Is what is running up to date?
- Have any containers changed critical files since startup?
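The stock CLI answers some of these on a per-host basis — `docker ps` for what's running and `docker diff` for files changed since startup. A sketch, with the path filter as an illustrative choice of "critical files":

```shell
if docker info >/dev/null 2>&1; then
  # What is running, and from which images?
  docker ps --format '{{.Names}}\t{{.Image}}'

  # 'docker diff' lists added (A), changed (C) and deleted (D) paths
  # relative to the image. Filter for paths you consider critical.
  for c in $(docker ps -q); do
    docker diff "$c" | grep -E ' /etc/| /usr/bin/' || true
  done
fi
```

Doing this across an estate, and reconciling it against the registry, is where tooling beyond the CLI comes in.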
Forensics
When things go wrong, people will want to know what happened. In the ‘old’ world of physical machines and VMs there were a lot of safeguards in place to assist post-incident investigation. A Docker world can become one without ‘black box recorders’.
- Can you tell who ran a container?
- Can you tell who built a container?
- Can you determine what a container did once it’s gone?
- Can you determine what a container might have done once it’s gone?
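The daemon's event stream is one building block for a black box recorder: it records container lifecycle actions, and can be shipped somewhere durable. A sketch of querying the last day's container events:

```shell
if docker info >/dev/null 2>&1; then
  # Container lifecycle events (create, start, die, destroy...) from
  # the last 24 hours. In practice you would ship this stream to a
  # central, tamper-evident store rather than query it ad hoc.
  docker events --since 24h --until "$(date +%s)" \
    --filter type=container \
    --format '{{.Time}} {{.Action}} {{.Actor.Attributes.name}}'
fi
```

This tells you *that* a container ran and died; reconstructing what it *did* needs logging and audit infrastructure on top.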
Logging
Application logging is likely to be a managed or controlled area of concern:
- Do the containers log what’s needed for operations?
- Do they follow standards for logging?
- Where do they log to?
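Docker's logging drivers give one answer to "where do they log to": ship stdout/stderr centrally rather than leaving JSON files on each host. A sketch — the syslog endpoint and image name are assumptions:

```shell
SYSLOG_ADDR="tcp://logs.internal.example.com:514"  # illustrative endpoint

if docker info >/dev/null 2>&1; then
  # Send container stdout/stderr to central syslog instead of
  # the default per-host json-file driver.
  docker run --rm \
    --log-driver=syslog \
    --log-opt syslog-address="$SYSLOG_ADDR" \
    myteam/myapp:1.0.0
fi
```

The same flags can be set daemon-wide so teams do not have to remember them per container.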
Orchestration
Containers can quickly proliferate across your estate, and this is where orchestration comes in. Do you want to mandate one?
- Does your orchestrator of choice play nicely with other pieces of your Docker infrastructure?
- Do you want to bet on one orchestrator, hedge with a mainstream one, or just sit it out until you have to make a decision?
Operating System
Enterprise operating systems can lag behind the latest and greatest.
- Is your standard OS capable of supporting all the latest features? For example, some orchestrators, and Docker itself, require kernel versions or packages more recent than your OS provides. This can come as a nasty surprise…
- Which version of Docker is available in your local package manager?
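Two quick checks are worth running on your standard build before committing to a platform:

```shell
# The running kernel version -- some Docker and orchestrator features
# require newer kernels than enterprise distributions ship by default.
uname -r

if docker info >/dev/null 2>&1; then
  # The daemon version actually installed from your package manager.
  docker version --format '{{.Server.Version}}'
fi
```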
Development
- Developers love having admin. Are you ready to effectively give them admin with Docker?
- Are their clients going to be consistent with deployment? If they’re using docker-compose, they might resent switching to pods in production.
CI/CD
Jenkins is the most popular CI tool, but there are other alternatives popular in the enterprise.
- What’s your policy around CI/CD plugins?
- Are you ready to switch on a load of new plugins PDQ?
- Does your process for CI cater for ephemeral Jenkins instances as well as persistent, supported ones?
Storage
Docker has at its core the use of volumes: storage that is independent of the running containers, in which persistent data is kept.
- Is shared storage easy to provision?
- Is shared storage support ready for increased demand?
- Is there a need for shared storage to be available across deployment locations?
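The basic mechanics, for reference — a named volume outlives any container that mounts it (the volume and image names are illustrative):

```shell
VOLUME="appdata"

if docker info >/dev/null 2>&1; then
  # The 'local' driver stores data on one host only; sharing data across
  # deployment locations needs a volume driver backed by shared storage.
  docker volume create --driver local "$VOLUME"
  docker run --rm -v "$VOLUME":/var/lib/app myteam/myapp:1.0.0
fi
```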
Networking
Enterprises often have their own preferred software-defined networking (SDN) solutions, such as Nuage, or newer players like Calico.
- Do you have a prescribed SDN solution?
- How does that interact with your chosen solutions?
- Does SDN interaction create an overhead that will cause issues?
aPaaS
Having an aPaaS such as OpenShift or Tutum Cloud can resolve many of the above questions by centralising the context in which Docker runs and making it supportable.
- Have you considered using an aPaaS?
- Which one answers the questions that need answering?
Cloud Providers
If you’re using a cloud provider such as Amazon or Google:
- How do you plan to deliver images and run containers on your cloud provider?
- Do you want to tie yourself into their Docker solutions, or make your usage cloud-agnostic?
Hey, what about x?
Get in touch: @ianmiell
My book Docker in Practice
Get 39% off with the code: 39miell