Wednesday, October 12, 2022

Docker storage options

Ephemeral storage - Tied to the container itself (the details depend on the OS); when the container is removed, the data is lost.

Persistent storage - Stored outside the container, so the data survives when the container goes away.

Docker data volumes

Docker data volumes provide the ability to create a resource that can be used to persistently store and retrieve data within a container. The functionality of data volumes was significantly enhanced in Docker version 1.9, with the ability to assign meaningful names to a volume, list volumes, and list the container associated with a volume.
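
As a rough illustration of the named-volume features described above, the Python sketch below assembles the argument lists for creating a volume, listing volumes, and listing the containers that use a given volume. The volume name appdata is a made-up example, and the functions only build the commands; handing one of the lists to subprocess.run would execute it on a host where Docker is installed.

```python
import shlex


def volume_create(name):
    """Argument list for creating a named volume."""
    return ["docker", "volume", "create", name]


def volume_list():
    """Argument list for listing the volumes known to the Docker host."""
    return ["docker", "volume", "ls"]


def containers_using(name):
    """Argument list for listing containers (running or stopped) that use a volume."""
    return ["docker", "ps", "--all", "--filter", f"volume={name}"]


if __name__ == "__main__":
    # "appdata" is a hypothetical volume name used only for illustration.
    for argv in (volume_create("appdata"), volume_list(), containers_using("appdata")):
        print(shlex.join(argv))
```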

Data volumes are a step forward from storing data within the container itself and offer better performance for the application. A running container is built from a snapshot of the base container image using file-based copy-on-write (CoW) techniques, so any data stored natively in the container carries a significant management overhead. Data volumes sit outside this CoW mechanism and exist on the host filesystem, so they are more efficient to read from and write to.

However, there are issues with using data volumes. For example, a volume can't be attached to a container that is already running, and by default a volume is not removed when its container is removed, which means a volume can end up orphaned.
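
Orphaned volumes can at least be found after the fact. The sketch below, again only building argument lists rather than running anything, shows one way to list volumes that no container references and to remove a container together with its anonymous volumes. The container name old-app is hypothetical.

```python
import shlex


def list_dangling_volumes():
    """Argument list for listing volumes not referenced by any container."""
    return ["docker", "volume", "ls", "--filter", "dangling=true"]


def remove_container(container, with_volumes=False):
    """Argument list for removing a container.

    With with_volumes=True, the anonymous volumes associated with the
    container are removed along with it (docker rm --volumes).
    """
    argv = ["docker", "rm"]
    if with_volumes:
        argv.append("--volumes")
    argv.append(container)
    return argv


if __name__ == "__main__":
    print(shlex.join(list_dangling_volumes()))
    # "old-app" is a hypothetical container name.
    print(shlex.join(remove_container("old-app", with_volumes=True)))
```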

Data volume container


An alternative solution is to use a dedicated container to host a volume and to mount that volume space into other containers, a so-called data volume container. With this technique, the volume container outlasts the application containers and can be used to share data between more than one container at the same time.
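
The data volume container pattern can be sketched as two commands: one that creates the container holding the volume, and one that starts an application container with that container's volumes mounted. The names dbstore and db1, the path /dbdata, and the postgres image are illustrative assumptions, and the functions only assemble the argument lists.

```python
import shlex


def create_data_container(name, volume_path, image):
    """Argument list for creating a container whose only job is to declare a volume.

    docker create sets the container up without starting it; /bin/true is
    assumed to exist in the chosen image.
    """
    return ["docker", "create", "--volume", volume_path, "--name", name, image, "/bin/true"]


def run_with_volumes_from(data_container, app_name, image):
    """Argument list for starting a container that mounts the data container's volumes."""
    return [
        "docker", "run", "--detach",
        "--name", app_name,
        "--volumes-from", data_container,
        image,
    ]


if __name__ == "__main__":
    # All names below are hypothetical examples.
    print(shlex.join(create_data_container("dbstore", "/dbdata", "postgres")))
    print(shlex.join(run_with_volumes_from("dbstore", "db1", "postgres")))
```

Several application containers can be given the same --volumes-from argument, which is how the data ends up shared between them.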


Having a long-running container to store data provides other opportunities. For instance, a backup container can be spun up that copies or backs up the data in the container volume. In both of the above scenarios, the container volume sits within the file structure of the Docker installation, typically /var/lib/docker/volumes. This means you can use standard tools to access this data, but beware: Docker provides no locking or security mechanisms to maintain data integrity.
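
A backup container of the kind mentioned above might look like the following sketch: a throwaway container mounts the data container's volumes plus a host directory, then writes a tar archive of the volume into that directory. The names dbstore, /dbdata, and /srv/backups, and the choice of the ubuntu image, are assumptions made for illustration; the function only builds the command.

```python
import shlex


def backup_volume(data_container, volume_path, host_backup_dir, archive_name="backup.tar"):
    """Argument list for archiving a data container's volume into a host directory."""
    return [
        "docker", "run", "--rm",                    # remove the helper container afterwards
        "--volumes-from", data_container,           # mount the volumes to back up
        "--volume", f"{host_backup_dir}:/backup",   # host directory that receives the archive
        "ubuntu",
        "tar", "cvf", f"/backup/{archive_name}", volume_path,
    ]


if __name__ == "__main__":
    # "dbstore", "/dbdata" and "/srv/backups" are hypothetical examples.
    print(shlex.join(backup_volume("dbstore", "/dbdata", "/srv/backups")))
```

Because Docker does no locking, it is up to the application to make sure the data is in a consistent state while the archive is written.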



Directory mounts


A third option for persistent data is to mount a local host directory into a container. This goes a step further than the methods described above in that the source directory can be any directory on the host running the container, rather than one under the Docker volumes folder. At container start time, the volume and mount point are specified on the docker run command, providing a directory within the container that can be used by the application, e.g., data.
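
A minimal sketch of a directory mount, assuming Linux-style paths: the function below builds a docker run command that maps a host directory to a mount point inside the container. The paths /srv/appdata and /data and the nginx image are hypothetical, and nothing is executed.

```python
import posixpath
import shlex


def run_with_bind_mount(host_dir, container_dir, image, read_only=False):
    """Argument list for docker run with a host directory mounted into the container."""
    # With --volume, a source that is not an absolute path is treated as a
    # volume name rather than a host directory, so require absolute paths.
    for path in (host_dir, container_dir):
        if not posixpath.isabs(path):
            raise ValueError(f"expected an absolute path, got {path!r}")
    spec = f"{host_dir}:{container_dir}"
    if read_only:
        spec += ":ro"  # mount the directory read-only inside the container
    return ["docker", "run", "--detach", "--volume", spec, image]


if __name__ == "__main__":
    # Hypothetical host path, mount point and image.
    print(shlex.join(run_with_bind_mount("/srv/appdata", "/data", "nginx")))
```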


Storage plugins


Probably the most interesting development for persistent storage has been the ability to connect to external storage platforms through storage plugins. The plugin architecture provides an interface and API that allows storage vendors to build drivers that automate the creation of storage on external arrays and appliances, its mapping into Docker, and its assignment to a container.
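
From the command line, a plugin is selected through the volume driver option. The sketch below builds a docker volume create command for a plugin-backed volume and a docker run command that mounts it. The driver name examplevendor/driver and the size option are invented for illustration; real plugins define their own names and options, so treat this as a shape rather than a working configuration.

```python
import shlex


def create_plugin_volume(name, driver, options=None):
    """Argument list for creating a volume through a named volume driver."""
    argv = ["docker", "volume", "create", "--driver", driver]
    # Driver-specific settings are passed as repeated --opt key=value pairs.
    for key, value in sorted((options or {}).items()):
        argv += ["--opt", f"{key}={value}"]
    argv.append(name)
    return argv


def run_with_volume(volume_name, container_dir, image):
    """Argument list for starting a container with an existing volume mounted."""
    return ["docker", "run", "--detach", "--volume", f"{volume_name}:{container_dir}", image]


if __name__ == "__main__":
    # The driver name and its "size" option are hypothetical.
    print(shlex.join(create_plugin_volume("dbvol", "examplevendor/driver", {"size": "10GB"})))
    print(shlex.join(run_with_volume("dbvol", "/data", "postgres")))
```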


Today there are plugins to automate storage provisioning from HPE 3PAR, EMC (ScaleIO, XtremIO, VMAX, Isilon), and NetApp. There are also plugins to support storage from public cloud services such as Azure File Storage and Google Cloud Platform.


Plugins map storage from a single host to an external storage source, typically an appliance or array. However, if a container is moved to another host for load balancing or failover reasons, then that storage association is lost. ClusterHQ has developed a platform called Flocker that manages and automates the process of moving the volume associated with a container to another host. Many storage vendors, including Hedvig, Nexenta, Kaminario, EMC, Dell, NetApp and Pure Storage, have chosen to write to the Flocker API, providing resilient storage and clustered container support within a single data center.


https://www.edureka.co/commun
