Rook Ceph Configuration
The simplest configuration is to enable the upstream default behavior of utilizing all available nodes and devices.
```yaml
# fragment of the rendered values.rook-ceph-cluster.yaml
cephClusterSpec:
  storage:
    useAllNodes: true
    useAllDevices: true
```
This behavior is triggered by the following configuration:
```yaml
# lab.yml
rook_ceph_use_all_nodes: true
```
However, this configuration requires the Kubernetes nodes to have additional block devices (e.g., /dev/vdb or /dev/sdb) to establish a functional cluster. It is therefore recommended to provision the Kubernetes nodes with additional block storage and explicitly enable rook_ceph_use_all_nodes: true.
In environments lacking dedicated disks, loop devices offer an alternative. However, Rook-Ceph does not automatically detect loop devices when useAllNodes and useAllDevices are enabled. Explicit configuration via the storage.nodes object is required.
```yaml
# fragment of the rendered values.rook-ceph-cluster.yaml
cephClusterSpec:
  storage:
    nodes:
      - name: "kubelet1"
        devices:
          - name: "/dev/loop0"
```
The storage.nodes object is parametrized via the lab.yml setting rook_ceph_nodes:
```yaml
# lab.yml
rook_ceph_nodes:
  - name: "kubelet1"
    devices:
      - name: "/dev/loop0"
```
Note: As a prerequisite for using loop devices, the Rook operator must be configured with allowLoopDevices: true, which is done by default.
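For reference, this corresponds to a fragment along the following lines in the rendered operator values (a sketch; the file name shown is an assumption, but the upstream rook-ceph operator chart does expose allowLoopDevices as a top-level value):

```yaml
# fragment of the rendered operator values (file name assumed)
allowLoopDevices: true
```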
Automating this configuration presents challenges:
- Node names must be hardcoded, leading to potential deployment failures if node names vary between environments.
- Creating loop devices on nodes traditionally requires manual or external intervention (e.g., SSH, Ansible).
To mitigate the need for manual device creation, the batteries/rook-ceph/helm/losetup Helm chart provides automation. This chart deploys a Job with privileged pods to create loop devices on Kubernetes nodes (defaulting to a sparse 50Gi file at /var/lib/disk0.img). While this default behavior is currently not configurable, the chart source offers a reference for integrating this logic into systemd services if necessary.
The Job is enabled via:

```yaml
install_losetup_job: true
```
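For orientation, the work performed by the Job amounts to roughly the following manifest (a hypothetical sketch: the Job name, image, and node pinning are assumptions; consult the chart templates for the actual definitions):

```yaml
# Illustrative sketch of a per-node loop-device Job (names and image assumed)
apiVersion: batch/v1
kind: Job
metadata:
  name: losetup-kubelet1
spec:
  template:
    spec:
      restartPolicy: Never
      nodeName: kubelet1                # pin to the node that should get the loop device
      containers:
        - name: losetup
          image: alpine:3.19            # assumed; any image providing losetup works
          securityContext:
            privileged: true            # required to attach loop devices on the host
          command: ["/bin/sh", "-c"]
          args:
            - |
              # create a 50Gi sparse backing file on the host and attach it as /dev/loop0
              truncate -s 50G /host/var/lib/disk0.img
              losetup /dev/loop0 /host/var/lib/disk0.img || true
          volumeMounts:
            - name: host
              mountPath: /host
      volumes:
        - name: host
          hostPath:
            path: /
```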
To address the node naming constraint, automation is provided to dynamically extract node names from kubectl get node and construct the storage.nodes object for the Rook Ceph cluster values. Configuration options for this automation are detailed in lab.default.yml. To enable this automation, use:

```yaml
rook_ceph_nodes_auto: true
```
This dynamic lookup is implemented via external scripting, since Helm templating is available only within chart templates and cannot be used to generate values files.
Alternatively, the OpenEBS LocalPV-LVM storage provisioner can be deployed on the block device (initialized as an LVM VolumeGroup). This enables a shared-storage architecture in which the same device backs both Rook Ceph and other consumers, providing Block and Filesystem volumes with strict capacity enforcement, a feature the previously used Rancher local-path provisioner lacks.
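For illustration, a LocalPV-LVM StorageClass generally looks like the following (a sketch; the class name, VolumeGroup name, and filesystem are assumptions that depend on the lab configuration):

```yaml
# Illustrative LocalPV-LVM StorageClass (names are assumed)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvm
provisioner: local.csi.openebs.io    # OpenEBS LVM CSI driver
allowVolumeExpansion: true
parameters:
  storage: "lvm"
  volgroup: "lvmvg"                  # the VolumeGroup created on the block device
  fsType: "ext4"
```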
Analogous to the loop device automation, the framework allows for automated LVM initialization via the vgcreate_job. Configuration options include device selection and volume group naming.
```yaml
install_vgcreate_job: true
vgcreate_job_devices:
  - /dev/loop0
```
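Analogous to the losetup sketch above, the per-node LVM initialization boils down to roughly the following (again a hypothetical sketch; the image and VolumeGroup name are assumptions):

```yaml
# Illustrative sketch of per-node VolumeGroup creation (names and image assumed)
apiVersion: batch/v1
kind: Job
metadata:
  name: vgcreate-kubelet1
spec:
  template:
    spec:
      restartPolicy: Never
      nodeName: kubelet1
      containers:
        - name: vgcreate
          image: alpine:3.19            # assumed; the lvm2 tools are installed at runtime
          securityContext:
            privileged: true            # required to manage host block devices
          command: ["/bin/sh", "-c"]
          args:
            - |
              apk add --no-cache lvm2
              # initialize the device and create the VolumeGroup used by LocalPV-LVM
              pvcreate /dev/loop0 || true
              vgcreate lvmvg /dev/loop0 || true
```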
Implementation note: the OpenEBS storage provider is installed in localpv-hostpath mode by default. Enabling localpv-lvm mode additionally provides LVM-backed volumes with Block mode support and capacity enforcement.
```yaml
install_openebs_localpv_lvm_storageclass: true
openebs_localpv_lvm_storageclass_as_default: true
openebs_localpv_hostpath_storageclass_as_default: false
```
This configuration enables Rook-Ceph to consume Kubernetes PersistentVolumes via PersistentVolumeClaims (PVCs), aligning with cloud-native storage patterns.
```yaml
rook_ceph_use_pvcs: true
```
Refer to lab.default.yml for additional configuration options regarding this mode.
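In this mode the rendered cluster values typically end up with a storageClassDeviceSets entry instead of a storage.nodes list, roughly along these lines (a sketch: the set name, count, requested size, and storage class are assumptions derived from the LVM setup above):

```yaml
# Illustrative fragment of the rendered values.rook-ceph-cluster.yaml in PVC mode
cephClusterSpec:
  storage:
    storageClassDeviceSets:
      - name: set1
        count: 1                              # one OSD backed by one PVC
        volumeClaimTemplates:
          - metadata:
              name: data
            spec:
              storageClassName: openebs-lvm   # assumed; must support Block mode
              volumeMode: Block
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 10Gi
```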
While Ceph typically requires a minimum of three nodes, configuration adjustments allow for single-node deployments by reducing the replication factor and modifying other settings. This can be explicitly enabled:
```yaml
rook_ceph_singlenode: true
```
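The kind of adjustment involved looks roughly like the following rendered-values fragment (a sketch; which settings the templates actually change is defined in the framework, and the pool name here is assumed):

```yaml
# Illustrative single-node adjustments (actual rendered values may differ)
cephClusterSpec:
  mon:
    count: 1                        # a single monitor instead of three
  mgr:
    count: 1
cephBlockPools:
  - name: ceph-blockpool
    spec:
      failureDomain: osd
      replicated:
        size: 1                     # no replication on a single node
        requireSafeReplicaSize: false
```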
The scripting framework attempts to auto-detect single-node environments, making explicit configuration often unnecessary.
In summary, four primary use cases are defined:
- rook_ceph_use_all_nodes
- rook_ceph_nodes_auto
- rook_ceph_nodes
- rook_ceph_use_pvcs
Most modes support loop devices, LVM logical volumes, or physical devices. The use_all_nodes mode (enabling storage.useAllNodes: true and storage.useAllDevices: true) explicitly requires physical devices and is incompatible with loop or LVM devices.
Review the rendered values.*.yaml files and the installation script to verify the configuration and installation order. Note that the installation script behavior may be dynamic, performing runtime kubectl lookups depending on the selected mode. This dynamic generation offers significant convenience despite increasing the complexity of the underlying Jinja2 templates.
An additional method involves statically defining PersistentVolumes on existing local devices (refer to batteries/static-lvm-pvs/helm/static-lvm-pvs). This option is not integrated into the primary scripting framework.
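For completeness, such a statically defined volume is essentially a local PersistentVolume pinned to a node, roughly like this (a sketch with assumed names, size, and device path; see the chart for the actual templates):

```yaml
# Illustrative static PersistentVolume on an existing LVM logical volume (names assumed)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-lvm-pv-0
spec:
  capacity:
    storage: 50Gi
  volumeMode: Block
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: static-lvm
  local:
    path: /dev/lvmvg/lv0            # pre-created logical volume on the node
  nodeAffinity:                     # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kubelet1
```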