Storage Classes

This document describes the concept of a StorageClass in Kubernetes. Familiarity with volumes and persistent volumes is suggested.

Introduction

A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called "profiles" in other storage systems.

The StorageClass Resource

Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned.

The name of a StorageClass object is significant, and is how users can request a particular class. Administrators set the name and other parameters of a class when first creating StorageClass objects, and the objects cannot be updated once they are created.

Administrators can specify a default StorageClass only for PVCs that don't request any particular class to bind to: see the PersistentVolumeClaim section for details.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
mountOptions:
  - debug
volumeBindingMode: Immediate

Provisioner

Each StorageClass has a provisioner that determines what volume plugin is used for provisioning PVs. This field must be specified.

Volume Plugin Internal Provisioner Config Example
AWSElasticBlockStore AWS EBS
AzureFile Azure File
AzureDisk Azure Disk
CephFS - -
Cinder OpenStack Cinder
FC - -
FlexVolume - -
Flocker -
GCEPersistentDisk GCE PD
Glusterfs Glusterfs
iSCSI - -
Quobyte Quobyte
NFS - NFS
RBD Ceph RBD
VsphereVolume vSphere
PortworxVolume Portworx Volume
ScaleIO ScaleIO
StorageOS StorageOS
Local - Local

You are not restricted to specifying the "internal" provisioners listed here (whose names are prefixed with "kubernetes.io" and shipped alongside Kubernetes). You can also run and specify external provisioners, which are independent programs that follow a defined by Kubernetes. Authors of external provisioners have full discretion over where their code lives, how the provisioner is shipped, how it needs to be run, what volume plugin it uses (including Flex), etc. The repository houses a library for writing external provisioners that implements the bulk of the specification. Some external provisioners are listed under the repository

For example, NFS doesn't provide an internal provisioner, but an external provisioner can be used. There are also cases when 3rd party storage vendors provide their own external provisioner.

Reclaim Policy

PersistentVolumes that are dynamically created by a StorageClass will have the reclaim policy specified in the reclaimPolicy field of the class, which can be either Delete or Retain. If no reclaimPolicy is specified when a StorageClass object is created, it will default to Delete.

PersistentVolumes that are created manually and managed via a StorageClass will have whatever reclaim policy they were assigned at creation.

Allow Volume Expansion

FEATURE STATE: Kubernetes v1.11 [beta]

PersistentVolumes can be configured to be expandable. This feature when set to true, allows the users to resize the volume by editing the corresponding PVC object.

The following types of volumes support volume expansion, when the underlying StorageClass has the field allowVolumeExpansion set to true.

Table of Volume types and the version of Kubernetes they require
Volume type Required Kubernetes version
gcePersistentDisk 1.11
awsElasticBlockStore 1.11
Cinder 1.11
glusterfs 1.11
rbd 1.11
Azure File 1.11
Azure Disk 1.11
Portworx 1.11
FlexVolume 1.13
CSI 1.14 (alpha), 1.16 (beta)

Mount Options

PersistentVolumes that are dynamically created by a StorageClass will have the mount options specified in the mountOptions field of the class.

If the volume plugin does not support mount options but mount options are specified, provisioning will fail. Mount options are not validated on either the class or PV. If a mount option is invalid, the PV mount fails.

Volume Binding Mode

The volumeBindingMode field controls when volume binding and dynamic provisioning should occur. When unset, "Immediate" mode is used by default.

The Immediate mode indicates that volume binding and dynamic provisioning occurs once the PersistentVolumeClaim is created. For storage backends that are topology-constrained and not globally accessible from all Nodes in the cluster, PersistentVolumes will be bound or provisioned without knowledge of the Pod's scheduling requirements. This may result in unschedulable Pods.

A cluster administrator can address this issue by specifying the WaitForFirstConsumer mode which will delay the binding and provisioning of a PersistentVolume until a Pod using the PersistentVolumeClaim is created. PersistentVolumes will be selected or provisioned conforming to the topology that is specified by the Pod's scheduling constraints. These include, but are not limited to, resource requirements, node selectors, pod affinity and anti-affinity, and taints and tolerations.

The following plugins support WaitForFirstConsumer with dynamic provisioning:

The following plugins support WaitForFirstConsumer with pre-created PersistentVolume binding:

FEATURE STATE: Kubernetes v1.17 [stable]
CSI volumes are also supported with dynamic provisioning and pre-created PVs, but you'll need to look at the documentation for a specific CSI driver to see its supported topology keys and examples.

apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  nodeSelector:
    kubernetes.io/hostname: kube-01
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage

Allowed Topologies

When a cluster operator specifies the WaitForFirstConsumer volume binding mode, it is no longer necessary to restrict provisioning to specific topologies in most situations. However, if still required, allowedTopologies can be specified.

This example demonstrates how to restrict the topology of provisioned volumes to specific zones and should be used as a replacement for the zone and zones parameters for the supported plugins.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-central-1a
    - us-central-1b

Parameters

Storage Classes have parameters that describe volumes belonging to the storage class. Different parameters may be accepted depending on the provisioner. For example, the value io1, for the parameter type, and the parameter iopsPerGB are specific to EBS. When a parameter is omitted, some default is used.

There can be at most 512 parameters defined for a StorageClass. The total length of the parameters object including its keys and values cannot exceed 256 KiB.

AWS EBS

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "10"
  fsType: ext4

GCE PD

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  fstype: ext4
  replication-type: none
  • type: pd-standard or pd-ssd. Default: pd-standard

  • zone (Deprecated): GCE zone. If neither zone nor zones is specified, volumes are generally round-robin-ed across all active zones where Kubernetes cluster has a node. zone and zones parameters must not be used at the same time.

  • zones (Deprecated): A comma separated list of GCE zone(s). If neither zone nor zones is specified, volumes are generally round-robin-ed across all active zones where Kubernetes cluster has a node. zone and zones parameters must not be used at the same time.

  • fstype: ext4 or xfs. Default: ext4. The defined filesystem type must be supported by the host operating system.

  • replication-type: none or regional-pd. Default: none.

If replication-type is set to none, a regular (zonal) PD will be provisioned.

If replication-type is set to regional-pd, a will be provisioned. It's highly recommended to have volumeBindingMode: WaitForFirstConsumer set, in which case when you create a Pod that consumes a PersistentVolumeClaim which uses this StorageClass, a Regional Persistent Disk is provisioned with two zones. One zone is the same as the zone that the Pod is scheduled in. The other zone is randomly picked from the zones available to the cluster. Disk zones can be further constrained using allowedTopologies.

Glusterfs

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://127.0.0.1:8081"
  clusterid: "630372ccdc720a92c681fb928f27b53f"
  restauthenabled: "true"
  restuser: "admin"
  secretNamespace: "default"
  secretName: "heketi-secret"
  gidMin: "40000"
  gidMax: "50000"
  volumetype: "replicate:3"

NFS

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-nfs
provisioner: example.com/external-nfs
parameters:
  server: nfs-server.example.com
  path: /share
  readOnly: "false"
  • server: Server is the hostname or IP address of the NFS server.
  • path: Path that is exported by the NFS server.
  • readOnly: A flag indicating whether the storage will be mounted as read only (default false).

Kubernetes doesn't include an internal NFS provisioner. You need to use an external provisioner to create a StorageClass for NFS. Here are some examples:

OpenStack Cinder

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gold
provisioner: kubernetes.io/cinder
parameters:
  availability: nova
  • availability: Availability Zone. If not specified, volumes are generally round-robin-ed across all active zones where Kubernetes cluster has a node.

vSphere

There are two types of provisioners for vSphere storage classes:

In-tree provisioners are deprecated. For more information on the CSI provisioner, see .

CSI Provisioner

The vSphere CSI StorageClass provisioner works with Tanzu Kubernetes clusters. For an example, refer to the .

vCP Provisioner

The following examples use the VMware Cloud Provider (vCP) StorageClass provisioner.

  1. Create a StorageClass with a user specified disk format.

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    provisioner: kubernetes.io/vsphere-volume
    parameters:
      diskformat: zeroedthick
    

    diskformat: thin, zeroedthick and eagerzeroedthick. Default: "thin".

  2. Create a StorageClass with a disk format on a user specified datastore.

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    provisioner: kubernetes.io/vsphere-volume
    parameters:
        diskformat: zeroedthick
        datastore: VSANDatastore
    

    datastore: The user can also specify the datastore in the StorageClass. The volume will be created on the datastore specified in the StorageClass, which in this case is VSANDatastore. This field is optional. If the datastore is not specified, then the volume will be created on the datastore specified in the vSphere config file used to initialize the vSphere Cloud Provider.

  3. Storage Policy Management inside kubernetes

There are few which you try out for persistent volume management inside Kubernetes for vSphere.

Ceph RBD

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/rbd
parameters:
  monitors: 10.16.153.105:6789
  adminId: kube
  adminSecretName: ceph-secret
  adminSecretNamespace: kube-system
  pool: kube
  userId: kube
  userSecretName: ceph-secret-user
  userSecretNamespace: default
  fsType: ext4
  imageFormat: "2"
  imageFeatures: "layering"
  • monitors: Ceph monitors, comma delimited. This parameter is required.

  • adminId: Ceph client ID that is capable of creating images in the pool. Default is "admin".

  • adminSecretName: Secret Name for adminId. This parameter is required. The provided secret must have type "kubernetes.io/rbd".

  • adminSecretNamespace: The namespace for adminSecretName. Default is "default".

  • pool: Ceph RBD pool. Default is "rbd".

  • userId: Ceph client ID that is used to map the RBD image. Default is the same as adminId.

  • userSecretName: The name of Ceph Secret for userId to map RBD image. It must exist in the same namespace as PVCs. This parameter is required. The provided secret must have type "kubernetes.io/rbd", for example created in this way:

    kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" \
      --from-literal=key='QVFEQ1pMdFhPUnQrSmhBQUFYaERWNHJsZ3BsMmNjcDR6RFZST0E9PQ==' \
      --namespace=kube-system
    
  • userSecretNamespace: The namespace for userSecretName.

  • fsType: fsType that is supported by kubernetes. Default: "ext4".

  • imageFormat: Ceph RBD image format, "1" or "2". Default is "2".

  • imageFeatures: This parameter is optional and should only be used if you set imageFormat to "2". Currently supported features are layering only. Default is "", and no features are turned on.

Quobyte

FEATURE STATE: Kubernetes v1.22 [deprecated]

The Quobyte in-tree storage plugin is deprecated, an StorageClass for the out-of-tree Quobyte plugin can be found at the Quobyte CSI repository.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: slow
provisioner: kubernetes.io/quobyte
parameters:
    quobyteAPIServer: "http://138.68.74.142:7860"
    registry: "138.68.74.142:7861"
    adminSecretName: "quobyte-admin-secret"
    adminSecretNamespace: "kube-system"
    user: "root"
    group: "root"
    quobyteConfig: "BASE"
    quobyteTenant: "DEFAULT"
  • quobyteAPIServer: API Server of Quobyte in the format "http(s)://api-server:7860"

  • registry: Quobyte registry to use to mount the volume. You can specify the registry as <host>:<port> pair or if you want to specify multiple registries, put a comma between them. <host1>:<port>,<host2>:<port>,<host3>:<port>. The host can be an IP address or if you have a working DNS you can also provide the DNS names.

  • adminSecretNamespace: The namespace for adminSecretName. Default is "default".

  • adminSecretName: secret that holds information about the Quobyte user and the password to authenticate against the API server. The provided secret must have type "kubernetes.io/quobyte" and the keys user and password, for example:

    kubectl create secret generic quobyte-admin-secret \
      --type="kubernetes.io/quobyte" --from-literal=user='admin' --from-literal=password='opensesame' \
      --namespace=kube-system
    
  • user: maps all access to this user. Default is "root".

  • group: maps all access to this group. Default is "nfsnobody".

  • quobyteConfig: use the specified configuration to create the volume. You can create a new configuration or modify an existing one with the Web console or the quobyte CLI. Default is "BASE".

  • quobyteTenant: use the specified tenant ID to create/delete the volume. This Quobyte tenant has to be already present in Quobyte. Default is "DEFAULT".

Azure Disk

Azure Unmanaged Disk storage class

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS
  location: eastus
  storageAccount: azure_storage_account_name
  • skuName: Azure storage account Sku tier. Default is empty.
  • location: Azure storage account location. Default is empty.
  • storageAccount: Azure storage account name. If a storage account is provided, it must reside in the same resource group as the cluster, and location is ignored. If a storage account is not provided, a new storage account will be created in the same resource group as the cluster.

Azure Disk storage class (starting from v1.7.2)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: managed
  • storageaccounttype: Azure storage account Sku tier. Default is empty.
  • kind: Possible values are shared, dedicated, and managed (default). When kind is shared, all unmanaged disks are created in a few shared storage accounts in the same resource group as the cluster. When kind is dedicated, a new dedicated storage account will be created for the new unmanaged disk in the same resource group as the cluster. When kind is managed, all managed disks are created in the same resource group as the cluster.
  • resourceGroup: Specify the resource group in which the Azure disk will be created. It must be an existing resource group name. If it is unspecified, the disk will be placed in the same resource group as the current Kubernetes cluster.
  • Premium VM can attach both Standard_LRS and Premium_LRS disks, while Standard VM can only attach Standard_LRS disks.
  • Managed VM can only attach managed disks and unmanaged VM can only attach unmanaged disks.

Azure File

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azurefile
provisioner: kubernetes.io/azure-file
parameters:
  skuName: Standard_LRS
  location: eastus
  storageAccount: azure_storage_account_name
  • skuName: Azure storage account Sku tier. Default is empty.
  • location: Azure storage account location. Default is empty.
  • storageAccount: Azure storage account name. Default is empty. If a storage account is not provided, all storage accounts associated with the resource group are searched to find one that matches skuName and location. If a storage account is provided, it must reside in the same resource group as the cluster, and skuName and location are ignored.
  • secretNamespace: the namespace of the secret that contains the Azure Storage Account Name and Key. Default is the same as the Pod.
  • secretName: the name of the secret that contains the Azure Storage Account Name and Key. Default is azure-storage-account-<accountName>-secret
  • readOnly: a flag indicating whether the storage will be mounted as read only. Defaults to false which means a read/write mount. This setting will impact the ReadOnly setting in VolumeMounts as well.

During storage provisioning, a secret named by secretName is created for the mounting credentials. If the cluster has enabled both RBAC and Controller Roles, add the create permission of resource secret for clusterrole system:controller:persistent-volume-binder.

In a multi-tenancy context, it is strongly recommended to set the value for secretNamespace explicitly, otherwise the storage account credentials may be read by other users.

Portworx Volume

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: portworx-io-priority-high
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "1"
  snap_interval:   "70"
  priority_io:  "high"

  • fs: filesystem to be laid out: none/xfs/ext4 (default: ext4).
  • block_size: block size in Kbytes (default: 32).
  • repl: number of synchronous replicas to be provided in the form of replication factor 1..3 (default: 1) A string is expected here i.e. "1" and not 1.
  • priority_io: determines whether the volume will be created from higher performance or a lower priority storage high/medium/low (default: low).
  • snap_interval: clock/time interval in minutes for when to trigger snapshots. Snapshots are incremental based on difference with the prior snapshot, 0 disables snaps (default: 0). A string is expected here i.e. "70" and not 70.
  • aggregation_level: specifies the number of chunks the volume would be distributed into, 0 indicates a non-aggregated volume (default: 0). A string is expected here i.e. "0" and not 0
  • ephemeral: specifies whether the volume should be cleaned-up after unmount or should be persistent. emptyDir use case can set this value to true and persistent volumes use case such as for databases like Cassandra should set to false, true/false (default false). A string is expected here i.e. "true" and not true.

ScaleIO

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/scaleio
parameters:
  gateway: https://192.168.99.200:443/api
  system: scaleio
  protectionDomain: pd0
  storagePool: sp1
  storageMode: ThinProvisioned
  secretRef: sio-secret
  readOnly: "false"
  fsType: xfs
  • provisioner: attribute is set to kubernetes.io/scaleio
  • gateway: address to a ScaleIO API gateway (required)
  • system: the name of the ScaleIO system (required)
  • protectionDomain: the name of the ScaleIO protection domain (required)
  • storagePool: the name of the volume storage pool (required)
  • storageMode: the storage provision mode: ThinProvisioned (default) or ThickProvisioned
  • secretRef: reference to a configured Secret object (required)
  • readOnly: specifies the access mode to the mounted volume (default false)
  • fsType: the file system to use for the volume (default ext4)

The ScaleIO Kubernetes volume plugin requires a configured Secret object. The secret must be created with type kubernetes.io/scaleio and use the same namespace value as that of the PVC where it is referenced as shown in the following command:

kubectl create secret generic sio-secret --type="kubernetes.io/scaleio" \
--from-literal=username=sioadmin --from-literal=password=d2NABDNjMA== \
--namespace=default

StorageOS

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/storageos
parameters:
  pool: default
  description: Kubernetes volume
  fsType: ext4
  adminSecretNamespace: default
  adminSecretName: storageos-secret
  • pool: The name of the StorageOS distributed capacity pool to provision the volume from. Uses the default pool which is normally present if not specified.
  • description: The description to assign to volumes that were created dynamically. All volume descriptions will be the same for the storage class, but different storage classes can be used to allow descriptions for different use cases. Defaults to Kubernetes volume.
  • fsType: The default filesystem type to request. Note that user-defined rules within StorageOS may override this value. Defaults to ext4.
  • adminSecretNamespace: The namespace where the API configuration secret is located. Required if adminSecretName set.
  • adminSecretName: The name of the secret to use for obtaining the StorageOS API credentials. If not specified, default values will be attempted.

The StorageOS Kubernetes volume plugin can use a Secret object to specify an endpoint and credentials to access the StorageOS API. This is only required when the defaults have been changed. The secret must be created with type kubernetes.io/storageos as shown in the following command:

kubectl create secret generic storageos-secret \
--type="kubernetes.io/storageos" \
--from-literal=apiAddress=tcp://localhost:5705 \
--from-literal=apiUsername=storageos \
--from-literal=apiPassword=storageos \
--namespace=default

Secrets used for dynamically provisioned volumes may be created in any namespace and referenced with the adminSecretNamespace parameter. Secrets used by pre-provisioned volumes must be created in the same namespace as the PVC that references it.

Local

FEATURE STATE: Kubernetes v1.14 [stable]
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

Local volumes do not currently support dynamic provisioning, however a StorageClass should still be created to delay volume binding until Pod scheduling. This is specified by the WaitForFirstConsumer volume binding mode.

Delaying volume binding allows the scheduler to consider all of a Pod's scheduling constraints when choosing an appropriate PersistentVolume for a PersistentVolumeClaim.

Last modified April 27, 2022 at 12:14 PM PST: