For years, IT teams were told that containers were best suited to stateless workloads and that it wasn't a good...
idea to put databases in them. But then, starting with SQL Server 2016, Microsoft decided to enable its relational database software to run in Docker containers. So, why is it now OK to containerize SQL Server on Docker?
To answer that question, it's helpful to understand a little bit about the anatomy of a container. Although containers are technically a type of virtualization technology, they differ significantly from virtual servers. Server virtualization abstracts a server's hardware layer so multiple operating system copies can be run simultaneously, each isolated on virtualized hardware.
Conversely, containers perform virtualization at the application layer. Different Docker containers running on a single host system share a common base image, meaning that they all use the same OS kernel and there's no need for a guest OS within each container. This is why containers have a reputation for being lightweight, and for performing so well on startup.
Data loss a non-starter for databases
If containers are really just virtualized application environments, why was running SQL Server on Docker once deemed a bad idea? One of the key differences between virtual machines and containers is that the latter are essentially disposable. Containers can be spun up at a moment's notice, but they are often removed from systems just as quickly.
When that happens, the data within a container is lost. It's the reason why containers are so often used to run stateless workloads that don't require persistent data, such as the processing jobs done by front-end web servers.
Docker Engine does give IT administrators the option of associating what's called a union file system with a container to act as a writable layer of storage. However, that approach is most often used in situations in which a permanent modification needs to be made to the base container image, thereby altering the containers that are based on the image. It might be appropriate, for example, when installing an application patch.
The problem is that the files created in the writable layer of a container go away when the container does. While that's OK for stateless uses, most applications -- and certainly databases like SQL Server -- need to be able to write to disk, even if only on a temporary basis.
To accomplish that, an alternative to writing data to a union file system is to write it to a volume. Volumes are mapped to containers, but they are set up as subdirectories in the Docker-managed portion of the host file system outside of the containers themselves.
Multiple containers can share a single volume and simultaneously access the data stored in it, and volumes aren't automatically destroyed when the containers associated with them are removed. Instead, you have to explicitly delete volumes if they're no longer needed.
Volume approach to database containers
Using volumes has other advantages beyond data persistence. It doesn't expand the size of individual containers, and volumes typically provide better read and write performance than the storage drivers used with union file systems. As a result, when it comes to hosting SQL Server on Docker -- or running any other database in a container -- the general rule of thumb is to use volumes for data storage.
One trick to ensure that you don't accidentally lose your data is to create volumes in a way that guarantees ongoing data persistence. For example, you could create a permanent, named container and associate a particular volume with it. Doing so enables you to safely share the volume with your SQL Server containers. Even if all of the database containers are removed, there's -- hopefully -- no danger that the volume will be deleted because it's associated with a container that still exists.
The persistent data storage enabled by volumes makes containerizing SQL Server more palatable for database administrators and developers -- a change in attitude that Microsoft confirmed when it first added support for putting SQL Server on Docker in Windows systems as part of the 2016 release. It then expanded the support to include Linux-based containers when SQL Server 2017 was released. To get started, prebuilt SQL Server container images for both Docker for Windows and Linux are available in the Docker Hub registry.
In spite of all of the hype, containers may or may not be the best choice for running SQL Server. It's still easier to put your databases in a virtual machine, but containers are a good option in some cases. For example, they could be the way to go if you develop database applications in-house and want to ensure that the same environment is used all the way from development to production, or if you need to maintain strict version control over the SQL Server software.