Block Storage Overview
Block storage allows a single pod to mount storage. This guide shows how to create a simple, multi-tier web application on Kubernetes using persistent volumes enabled by Rook.
Prerequisites
This guide assumes a Rook cluster as explained in the Quickstart.
Provision Storage
Before Rook can provision storage, a StorageClass and CephBlockPool CR need to be created. This will allow Kubernetes to interoperate with Rook when provisioning persistent volumes.
Note
This sample requires at least 1 OSD per node, with OSDs located on at least 3 different nodes.
Each OSD must be located on a different node, because the failureDomain is set to host and the replicated.size is set to 3.
Save this StorageClass definition as storageclass.yaml:
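The definition is not reproduced in this text. The following is a minimal sketch consistent with the settings described here (failureDomain set to host, replicated.size set to 3, and a provisioner prefixed with the operator namespace). The pool name replicapool, the storage class name rook-ceph-block, and the CSI secret names are assumptions based on common Rook defaults, so check them against the manifests shipped with your Rook release.

```yaml
# Sketch only: names marked "assumed" are not given in this guide.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool          # assumed pool name
  namespace: rook-ceph       # namespace where the Rook cluster runs
spec:
  failureDomain: host        # place each replica on a different node
  replicated:
    size: 3                  # three copies, hence OSDs on three nodes
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block      # assumed storage class name
# The prefix must match the namespace of the Rook operator.
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph       # namespace of the Rook cluster
  pool: replicapool          # must match the CephBlockPool above
  imageFormat: "2"
  imageFeatures: layering
  # Secret names below are assumed to be the ones Rook creates by default.
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
# Delete removes the RBD image together with the PersistentVolume;
# Retain keeps the image and requires manual cleanup.
reclaimPolicy: Delete
```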
If you've deployed the Rook operator in a namespace other than rook-ceph, change the prefix in the provisioner to match the namespace you used. For example, if the Rook operator is running in the namespace my-namespace, the provisioner value should be my-namespace.rbd.csi.ceph.com.
Create the storage class by running kubectl create -f storageclass.yaml.
Note
As specified by Kubernetes, when using the Retain reclaim policy, any Ceph RBD image that backs a PersistentVolume will continue to exist even after the PersistentVolume has been deleted. These Ceph RBD images will need to be cleaned up manually using rbd rm.
Consume the storage: Wordpress sample
We create a sample app to consume the block storage provisioned by Rook with the classic wordpress and mysql apps. Both of these apps will make use of block volumes provisioned by Rook.
Start mysql and wordpress from the deploy/examples folder by running kubectl create -f mysql.yaml and kubectl create -f wordpress.yaml.
Both of these apps create a block volume and mount it to their respective pod. You can see the Kubernetes volume claims by running kubectl get pvc.
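Each app requests its volume through a PersistentVolumeClaim that names the Rook storage class. As a rough illustration, a claim of this kind might look like the sketch below. The claim name, label, size, and the storage class name rook-ceph-block are assumptions for illustration, not values taken from this guide.

```yaml
# Sketch of a claim that asks Rook for a block volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim              # hypothetical claim name
  labels:
    app: wordpress                  # hypothetical label
spec:
  storageClassName: rook-ceph-block # assumed name of the Rook storage class
  accessModes:
    - ReadWriteOnce                 # block volumes are mounted by a single pod
  resources:
    requests:
      storage: 20Gi                 # illustrative size
```

A pod then mounts the volume by referencing the claim name under spec.volumes[].persistentVolumeClaim.claimName.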
Once the wordpress and mysql pods are in the Running state, get the cluster IP of the wordpress app by running kubectl get svc wordpress and enter it in your browser.
You should see the wordpress app running.
If you are using Minikube, the Wordpress URL can be retrieved with this one-line command:
Note
When running in a vagrant environment, there will be no external IP address to reach wordpress with. You will only be able to reach wordpress via the CLUSTER-IP from inside the Kubernetes cluster.
Consume the storage: Toolbox
With the pool that was created above, we can also create a block image and mount it directly in a pod. See the Direct Block Tools topic for more details.
Teardown
To clean up all the artifacts created by the block demo, delete the wordpress and mysql resources, then delete the CephBlockPool and the StorageClass.
Advanced Example: Erasure Coded Block Storage
If you want to use an erasure coded pool with RBD, your OSDs must use bluestore as their storeType. Additionally, the nodes that are going to mount the erasure coded RBD block storage must have Linux kernel >= 4.11.
Attention
This example requires at least 3 bluestore OSDs, with each OSD located on a different node.
The OSDs must be located on different nodes, because the failureDomain is set to host and the erasureCoded chunk settings require at least 3 different OSDs (2 dataChunks + 1 codingChunks).
To be able to use an erasure coded pool you need to create two pools (as seen below in the definitions): one erasure coded and one replicated.
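The pool definitions are not reproduced in this text. The sketch below shows what the pair might look like under the settings stated here (failureDomain set to host, 2 dataChunks and 1 codingChunks). The pool names replicated-metadata-pool and ec-data-pool, the rook-ceph namespace, and the replica size of the replicated pool are assumptions for illustration.

```yaml
# Sketch only: the replicated pool holds RBD image metadata,
# the erasure coded pool holds the image data.
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicated-metadata-pool   # hypothetical name
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3                        # assumed replica count
---
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: ec-data-pool               # hypothetical name
  namespace: rook-ceph
spec:
  failureDomain: host              # each chunk lands on a different node
  erasureCoded:
    dataChunks: 2                  # data is split into 2 chunks
    codingChunks: 1                # plus 1 parity chunk, 3 OSDs in total
```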
Erasure Coded CSI Driver
The erasure coded pool must be set as the dataPool parameter in storageclass-ec.yaml. It is used for the data of the RBD images.
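As an illustration of the dataPool parameter, a storage class along these lines might be used. This is a sketch under assumptions: the pool names ec-data-pool (erasure coded) and replicated-metadata-pool (replicated), the storage class name, and the CSI secret names are hypothetical or based on common Rook defaults, not taken from this guide.

```yaml
# Sketch of storageclass-ec.yaml; names are assumptions.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-ec          # hypothetical storage class name
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  dataPool: ec-data-pool            # erasure coded pool, stores image data
  pool: replicated-metadata-pool    # replicated pool, stores image metadata
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
```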
Node Loss
If a node goes down while running a pod that has an RBD RWO volume mounted, the volume cannot automatically be mounted on another node. The node must be guaranteed to be offline before the volume can be mounted on another node.
Configure CSI-Addons
Deploy the csi-addons controller and enable the csi-addons sidecar as mentioned in the CSI Addons guide.
Handling Node Loss
Warning
Automated node loss handling is currently disabled. Please refer to the manual steps to recover from the node loss. We are actively working on a new design for this feature. For more details, see the tracking issue.
When a node is confirmed to be down, add the following taints to the node:
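The exact taints are not reproduced in this text. As a sketch, assuming they are the standard Kubernetes out-of-service taint applied with both the NoExecute and NoSchedule effects, the tainted node would show entries like the following under spec.taints. The node name worker-1 and the taint value nodeshutdown are assumptions for illustration.

```yaml
# Sketch of how the taints would appear on the Node object.
apiVersion: v1
kind: Node
metadata:
  name: worker-1                              # hypothetical node name
spec:
  taints:
    - key: node.kubernetes.io/out-of-service  # well-known Kubernetes taint key
      value: nodeshutdown                     # assumed value
      effect: NoExecute                       # evicts pods already on the node
    - key: node.kubernetes.io/out-of-service
      value: nodeshutdown                     # assumed value
      effect: NoSchedule                      # keeps new pods off the node
```

Taints like these are normally added with kubectl taint nodes rather than by editing the Node object directly.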
After the taint is added to the node, Rook will automatically blocklist the node to prevent connections to Ceph from the RBD volume on that node. To verify a node is blocklisted:
The node is blocklisted if the state is Fenced and the result is Succeeded.
Node Recovery
If the node comes back online, the network fence can be removed from the node by removing the node taints: