Creating and mounting data volumes
All meaningful applications consume or produce data. Yet containers are, ideally, meant to be stateless. How are we going to deal with this? One way is to use Docker volumes. Volumes allow containers to consume, produce, and modify a state. Volumes have a life cycle that goes beyond the life cycle of containers. When a container that uses a volume dies, the volume continues to exist. This is great for the durability of the state.
Modifying the container layer
Before we dive into volumes, let’s first discuss what happens if an application in a container changes something in the filesystem of the container. In this case, the changes are all happening in the writable container layer that we introduced in Chapter 4, Creating and Managing Container Images. Let’s quickly demonstrate this:
- Run a container and execute a script in it that creates a new file, like this:
$ docker container run --name demo \
  alpine /bin/sh -c 'echo "This is a test" > sample.txt'
- The preceding command creates a container named demo and, inside this container, creates a file called sample.txt with the content This is a test. The container exits after running the echo command but remains on the system in a stopped state, available for us to investigate.
- Let’s use the diff command to find out what has changed in the container’s filesystem relative to the filesystem of the original image, as follows:
$ docker container diff demo
The output should look like this:
A /sample.txt
- A new file, as indicated by the letter A, has been added to the filesystem of the container, as expected. Since all layers that stem from the underlying image (Alpine, in this case) are immutable, the change could only happen in the writable container layer.
Files that have changed compared to the original image will be marked with a C and those that have been deleted with a D.
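When a container has changed many files, the A, C, and D markers can be condensed into a count per change type. The following is a minimal sketch, assuming a POSIX shell with awk available; the function name summarize_diff is hypothetical and not part of Docker.

```shell
# Classify `docker container diff` output (read from stdin) by change type.
# A = added, C = changed, D = deleted.
# summarize_diff is a hypothetical helper name, not a Docker command.
summarize_diff() {
  awk '
    $1 == "A" { added++ }
    $1 == "C" { changed++ }
    $1 == "D" { deleted++ }
    END { printf "added=%d changed=%d deleted=%d\n", added, changed, deleted }
  '
}
```

With the demo container from the preceding steps still present, piping docker container diff demo into summarize_diff would report one added entry, matching the single A line shown earlier.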
Now, if we remove the container from the system, its container layer will also be removed, and with it, all the changes will be irreversibly deleted. If we need our changes to persist beyond the lifetime of the container, this is not a solution. Luckily, we have better options, in the form of Docker volumes. Let’s get to know them.
Creating volumes
When using Docker Desktop on a macOS or Windows computer, containers are not running natively on macOS or Windows but rather in a (hidden) VM created by Docker Desktop.
To demonstrate how and where the underlying data structures are created in the respective filesystem (macOS or Windows), we need to be a bit creative. If, on the other hand, we are doing the same on a Linux computer, things are straightforward.
Let’s start with a simple exercise to create a volume:
- Open a new Terminal window and type in this command:
$ docker volume create sample
You should get this response:
sample
Here, the name of the created volume will be the output.
The default volume driver is the so-called local driver, which stores the data locally in the host filesystem.
- The easiest way to find out where the data is stored on the host is by using the docker volume inspect command on the volume we just created. The actual location can differ from system to system, so this is the safest way to find the target folder. So, let’s use this command:
$ docker volume inspect sample
We should see something like this:
Figure 5.1 – Inspecting the Docker volume called sample
The host folder can be found in the output under Mountpoint. In our case, the folder is /var/lib/docker/volumes/sample/_data.
- Alternatively, we can create a volume using the dashboard of Docker Desktop:
- Open the Dashboard of Docker Desktop.
- On the left-hand side, select the Volumes tab.
- In the top-right corner, click the Create button, as shown in the following screenshot:
Figure 5.2 – Creating a new Docker volume with Docker Desktop
- Type in sample-2 as the name for the new volume and click Create. You should now see this:
Figure 5.3 – List of Docker volumes shown in Docker Desktop
There are other volume drivers available from third parties, in the form of plugins. We can use the --driver parameter in the create command to select a different volume driver.
Other volume drivers use different types of storage systems to back a volume, such as cloud storage, Network File System (NFS) drives, software-defined storage, and more. The discussion of the correct usage of other volume drivers is beyond the scope of this book, though.
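To make the role of the --driver parameter and the Mountpoint field concrete, here is a small sketch that creates a volume with an explicitly named driver and then prints only the driver and the mount point, rather than the full inspect output. The helper names create_volume and volume_info are hypothetical; the sketch assumes the Docker CLI is installed and uses the --format option of docker volume inspect with the Driver and Mountpoint fields.

```shell
# Create a volume with an explicitly chosen driver ("local" is the default).
# create_volume and volume_info are hypothetical helper names.
create_volume() {
  name="$1"
  driver="${2:-local}"   # fall back to the default local driver
  docker volume create --driver "$driver" "$name"
}

# Print only the driver and the host mount point of a volume,
# using a Go template instead of reading the full JSON output.
volume_info() {
  docker volume inspect --format '{{ .Driver }} {{ .Mountpoint }}' "$1"
}
```

For the sample volume inspected earlier, volume_info sample would print the local driver followed by the Mountpoint path reported by docker volume inspect.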
Mounting a volume
Once we have created a named volume, we can mount it into a container by following these steps:
- For this, we can use the --volume or -v parameter in the docker container run command, like this:
$ docker container run --name test -it \
  -v sample:/data \
  alpine /bin/sh
If you are working on a clean Docker environment, then the output produced by this command should look similar to this:
Unable to find image 'alpine:latest' locally
latest: Pulling from library/alpine
050382585609: Pull complete
Digest: sha256: 8914eb54f968791faf6a86...
Status: Downloaded newer image for alpine:latest
/ #
Otherwise, you should just see the prompt of the Bourne shell running inside the Alpine container:
/ #
The preceding command mounts the sample volume to the /data folder inside the container.
- Inside the container, we can now create files in the /data folder, as follows:
/ # cd /data
/data # echo "Some data" > data.txt
/data # echo "Some more data" > data2.txt
- If we were to navigate to the host folder that contains the data of the volume and list its content, we should see the two files we just created inside the container. This is a bit more involved on a Mac or Windows computer, however, and is explained in detail in the Accessing Docker volumes section. Stay tuned.
- Exit the Alpine container by pressing Ctrl + D.
- Now, let’s delete the stopped test container:
$ docker container rm test
- Next, let’s run another container, this time based on CentOS. We even mount our volume to a different container folder, /app/data, like this:
$ docker container run --name test2 -it --rm \
  -v sample:/app/data \
  centos:7 /bin/bash
You should see an output similar to this:
Unable to find image 'centos:7' locally
7: Pulling from library/centos
8ba884070f61: Pull complete
Digest: sha256:a799dd8a2ded4a83484bbae769d9765...
Status: Downloaded newer image for centos:7
[root@275c1fe31ec0 /]#
The last line of the preceding output indicates that we are at the prompt of the Bash shell running inside the CentOS container.
- Once inside the CentOS container, we can navigate to the /app/data folder to which we have mounted the volume and list its content, as follows:
[root@275c1fe31ec0 /]# cd /app/data
[root@275c1fe31ec0 data]# ls -l
As expected, we should see these two files:
-rw-r--r-- 1 root root 10 Dec 4 14:03 data.txt
-rw-r--r-- 1 root root 15 Dec 4 14:03 data2.txt
This is the definitive proof that data in a Docker volume persists beyond the lifetime of a container, as well as that volumes can be reused by other, even different, containers from the one that used it first.
It is important to note that the folder inside the container to which we mount a Docker volume is excluded from the Union filesystem. That is, each change inside this folder and any of its subfolders will not be part of the container layer but will be persisted in the backing storage provided by the volume driver. This fact is really important since the container layer is deleted when the corresponding container is stopped and removed from the system.
- Exit the CentOS container with Ctrl + D.
Great – we have learned how to mount Docker volumes into a container! Next, we will learn how to delete existing volumes from our system.
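The interactive steps above can also be run as a single non-interactive round trip: one short-lived container writes into the volume, and a second, unrelated container reads the file back from a different mount path. This is only a sketch under the assumption that the sample volume exists and the alpine image can be pulled; volume_roundtrip is a hypothetical helper name.

```shell
# Write a file into a named volume from one short-lived container, then read
# it back from a second container that mounts the same volume at another path.
# volume_roundtrip is a hypothetical helper name.
volume_roundtrip() {
  volume="$1"
  # First container: writes into the volume and is removed on exit (--rm).
  docker container run --rm -v "$volume":/data alpine \
    /bin/sh -c 'echo "Some data" > /data/data.txt' || return 1
  # Second container: a fresh one, with the same volume mounted at /app/data.
  docker container run --rm -v "$volume":/app/data alpine \
    cat /app/data/data.txt
}
```

Calling volume_roundtrip sample should print Some data, even though the container that wrote the file has already been removed by the time the second container starts.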
Removing volumes
Volumes can be removed using the docker volume rm command. It is important to remember that removing a volume irreversibly destroys the data it contains, and thus is to be considered a dangerous command. Docker helps us a bit in this regard, as it does not allow us to delete a volume that is still in use by a container. Before you remove a volume, always make sure that you either have a backup of its data or no longer need this data. Let’s learn how to remove volumes by following these steps:
- The following command deletes the sample volume that we created earlier:
$ docker volume rm sample
- After executing the preceding command, double-check that the folder on the host has been deleted. You can use this command to list all volumes defined on your system:
$ docker volume ls
Make sure the sample volume has been deleted.
- Now, also remove the sample-2 volume from your system.
- To clean up the system by removing all containers, whether running or stopped, run the following command:
$ docker container rm -v -f $(docker container ls -aq)
- Note that by using the -v or --volumes flag in the command you use to remove a container, you can ask the system to also remove any anonymous volume associated with that particular container. Of course, that will only work if the particular volume is only used by this container.
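Because removing a volume is irreversible, it can be useful to check which containers still reference it before calling docker volume rm. The following sketch relies on the volume filter of docker container ls; the helper names containers_using_volume and safe_volume_rm are hypothetical, and Docker itself would also refuse to remove a volume that is in use.

```shell
# List the IDs of all containers (running or stopped) that reference a volume.
containers_using_volume() {
  docker container ls --all --quiet --filter "volume=$1"
}

# Remove a volume only when no container references it any more.
# safe_volume_rm is a hypothetical helper name.
safe_volume_rm() {
  users="$(containers_using_volume "$1")"
  if [ -n "$users" ]; then
    echo "volume $1 is still used by: $users" >&2
    return 1
  fi
  docker volume rm "$1"
}
```

The explicit check mainly serves to produce a readable message that names the containers involved; it does not replace a backup of the data.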
In the next section, we will show you how to access the backing folder of a volume when working with Docker Desktop.
Accessing Docker volumes
Now, let’s for a moment assume that we are on a Mac with macOS. This operating system is not based on Linux but on a different Unix flavor. Let’s see whether we can find the data for the sample and sample-2 volumes in the location that the docker volume inspect command reported:
- First, let’s create two named Docker volumes, either using the command line or doing the same via the dashboard of Docker Desktop:
$ docker volume create sample
$ docker volume create sample-2
- In your Terminal, try to navigate to that folder:
$ cd /var/lib/docker/volumes/sample/_data
On the author’s MacBook Air, this is the response to the preceding command:
cd: no such file or directory: /var/lib/docker/volumes/sample/_data
This was expected since Docker is not running natively on Mac but inside a slim VM, as mentioned earlier in this chapter.
Similarly, if you are using a Windows machine, you won’t find the data where the inspect command indicated.
It turns out that on a Mac, the data for the VM that Docker creates can be found in the ~/Library/Containers/com.docker.docker/Data/vms/0 folder.
To access this data, we need to somehow get into this VM. On a Mac, we have two options to do so. The first is to use the screen command in the terminal. However, this is very specific to macOS and thus we will not discuss it here. The second option is to get access to the filesystem of Docker on Mac via the special nsenter command, which should be executed inside a Linux container such as Debian. This also works on Windows, and thus we will show the steps needed using this second option.
- To run a container that can inspect the underlying host filesystem on your system, use this command:
$ docker container run -it --privileged --pid=host \
  debian nsenter -t 1 -m -u -n -i sh
When running the container, we execute the following command inside the container:
nsenter -t 1 -m -u -n -i sh
If that sounds complicated to you, don’t worry; you will understand more as we proceed through this book. If there is one takeaway, then it is to realize how powerful the right use of containers can be.
- From within this container, we can now list all the volumes that are defined with the following command:
/ # ls -l /var/lib/docker/volumes
What we get should look similar to this:
Figure 5.4 – List of Docker volumes via nsenter
- Next, navigate to the folder representing the mount point of the volume:
/ # cd /var/lib/docker/volumes/sample/_data
- And then list its content, as follows:
/var/lib/docker/volumes/sample/_data # ls -l
This should output the following:
total 0
The folder is currently empty since we have not yet stored any data in the volume.
- Similarly, for our sample-2 volume, we can use the following commands:
/ # cd /var/lib/docker/volumes/sample-2/_data
/var/lib/docker/volumes/sample-2/_data # ls -l
This should output the following:
total 0
Again, this indicates that the folder is currently empty.
- Next, let’s generate two files with data in the sample volume from within an Alpine container. First, open a new Terminal window, since the other one is blocked by our nsenter session.
- To run the container and mount the sample volume to the /data folder of the container, use the following command:
$ docker container run --rm -it \
  -v sample:/data alpine /bin/sh
- Generate two files in the /data folder inside the container, like this:
/ # echo "Some data" > /data/data.txt
/ # echo "Some more data" > /data/data2.txt
- Exit the Alpine container by pressing Ctrl + D.
- Back in the nsenter session, try to list the content of the sample volume again using these commands:
/ # cd /var/lib/docker/volumes/sample/_data
/var/lib/docker/volumes/sample/_data # ls -l
This time, you should see this:
total 8
-rw-r--r-- 1 root root 10 Dec 4 14:03 data.txt
-rw-r--r-- 1 root root 15 Dec 4 14:03 data2.txt
This indicates that we have data written to the filesystem of the host.
- Let’s try to create a file from within this special container, and then list the content of the folder, as follows:
/var/lib/docker/volumes/sample/_data # echo "I love Docker" > docker.txt
- Now, let’s see what we got:
/var/lib/docker/volumes/sample/_data # ls -l
This gives us something like this:
total 12
-rw-r--r-- 1 root root 10 Dec 4 14:03 data.txt
-rw-r--r-- 1 root root 15 Dec 4 14:03 data2.txt
-rw-r--r-- 1 root root 14 Dec 4 14:25 docker.txt
- Let’s see whether we can see this new file from within a container mounting the sample volume. From within a new Terminal window, run this command:
$ docker container run --rm \
  -v sample:/data \
  centos:7 ls -l /data
That should output this:
total 12
-rw-r--r-- 1 root root 10 Dec 4 14:03 data.txt
-rw-r--r-- 1 root root 15 Dec 4 14:03 data2.txt
-rw-r--r-- 1 root root 14 Dec 4 14:25 docker.txt
The preceding output is showing us that we can add content directly to the host folder backing the volume and then access it from a container that has the volume mounted.
- To exit our special privileged container with the nsenter tool, we can just press Ctrl + D twice.
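If we only want to look at the backing folder of a volume, the interactive nsenter session can be replaced by a one-shot command. The following sketch wraps the same privileged Debian container used above; ls_volume_backing_dir is a hypothetical helper name, and the path assumes the default location reported by docker volume inspect for the local driver.

```shell
# List the backing folder of a volume from inside the Docker host (or the
# Docker Desktop VM) by entering the namespaces of PID 1 with nsenter.
# ls_volume_backing_dir is a hypothetical helper name; the path assumes the
# default local driver location shown by `docker volume inspect`.
ls_volume_backing_dir() {
  # --privileged and --pid=host let the container see the host's PID 1.
  # nsenter flags: -t 1 targets PID 1; -m, -u, -n, and -i enter its
  # mount, UTS, network, and IPC namespaces, respectively.
  docker container run --rm --privileged --pid=host debian \
    nsenter -t 1 -m -u -n -i \
    ls -l "/var/lib/docker/volumes/$1/_data"
}
```

Because the container is started with --rm and without -it, it exits and is removed as soon as the listing has been printed, so no Terminal window stays blocked.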
We have now created data using two different methods:
- From within a container that has a sample volume mounted
- Using a special privileged container to access the hidden VM used by Docker Desktop, and directly writing into the backing folder of the sample volume
In the next section, we will learn how to share data between containers.