Redundant Storage (ZFS) User Guide

1. Introduction

Starting with EVE-OS version 8.0.0, redundant storage features such as RAID, and storage expansion features such as adding a physical disk to the device, are supported through the open-source OpenZFS (ZFS) storage stack.

2. System Requirements

Because ZFS needs a certain amount of memory and disk resources to support redundant storage, this feature is recommended, but not required, only on devices with the following specifications.

Minimum Memory (GB) | Minimum Physical Drives (Storage) | CPUs
32 | 4 (including EVE-OS install drive) | 10 cores

Note: ZFS can be installed on any system configuration. The requirements listed above are our recommendations for better sustained performance.

3. ZFS Overhead

Like any redundant storage system, a ZFS implementation comes with some overhead in memory, storage usage, and availability. We divide this overhead into two sub-components: memory and storage.

3.1. Storage Overhead

Rigorous performance testing determined that 16K is the optimal block size for the zvol block devices. We also use compression instead of deduplication.

Four RAID levels are supported in the current release: none, raid1 (mirror), raid5 (raidz1), and raid6 (raidz2).

ZFS also requires storage over-provisioning. We recommend setting aside 20% of physical storage. In other words, only 80% of physical storage is available to customers, even before RAID overhead.

The following table shows the minimum number of disks required for each RAID level (excluding the EVE-OS install disk) and the available disk space relative to the physical disk space.

Assume that each physical disk is 100GB in size.

RAID type | Minimum disks | Physical space (GB) | Available space after 20% provisioning (GB) | Available space after RAID (GB) | Available space after 2:1 compression (GB)
raid5 | 3 | 300 | 240 | 2/3 of 240 = 160 | 320
raid6 | 4 | 400 | 320 | 2/4 of 320 = 160 | 320
raid1 | 2 | 200 | 160 | 1/2 of 160 = 80 | 160
None | 1 | 100 | 80 | 80 | 160

A couple of important points to note here:

  • The available space % increases with additional disks added to any RAID level. In other words, the space used for parity decreases as the number of disks increases.
  • Although the customer appears to lose a lot of disk space (the available space after RAID is much lower than the physical space), this is offset once customer workloads are running, because the ZFS compression feature comes into play. Compression ratios depend on the workload: database workloads are known to compress at about 2:1, and VM workloads are generally in a similar range. Assuming a typical 2:1 compression ratio, the space lost to RAID is recovered through compression. In the worst case, a 1:1 compression ratio, the available disk space is the available space after RAID shown in the table above.
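
The arithmetic behind the table above can be reproduced with a short Python sketch. It is an illustration only, not part of EVE-OS: the function name and its parameters are hypothetical, raid1 is modeled as the two-disk mirror from the table, and the compression ratio is an input because it depends on the workload.

```python
from fractions import Fraction

# Parity disks per RAID level, following the table in this guide.
PARITY_DISKS = {"none": 0, "raid5": 1, "raid6": 2}
# Minimum ZFS disks per RAID level (excluding the EVE-OS install disk).
MIN_DISKS = {"none": 1, "raid1": 2, "raid5": 3, "raid6": 4}


def available_space_gb(raid_level, disk_count, disk_size_gb, compression_ratio=1):
    """Return (after_provisioning, after_raid, after_compression) in GB.

    Hypothetical helper: it only mirrors the table's arithmetic.
    """
    if disk_count < MIN_DISKS[raid_level]:
        raise ValueError(f"{raid_level} needs at least {MIN_DISKS[raid_level]} disks")
    physical = disk_count * disk_size_gb
    # 20% of physical storage is set aside for over-provisioning.
    after_provisioning = physical * Fraction(80, 100)
    if raid_level == "raid1":
        # Assumption: two-disk mirror, so half of the space holds data.
        data_fraction = Fraction(1, 2)
    else:
        data_fraction = Fraction(disk_count - PARITY_DISKS[raid_level], disk_count)
    after_raid = after_provisioning * data_fraction
    after_compression = after_raid * compression_ratio
    return float(after_provisioning), float(after_raid), float(after_compression)


if __name__ == "__main__":
    for level, disks in [("raid5", 3), ("raid6", 4), ("raid1", 2), ("none", 1)]:
        print(level, available_space_gb(level, disks, 100, compression_ratio=2))
```

Calling the function with more disks for the same RAID level (for example, six disks with raid5) shows the point made in the first bullet: the share of space used for parity shrinks as the disk count grows.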

3.2. Memory Overhead

When ZFS storage is provisioned, some amount of system RAM will be configured for efficient operation of the storage stack during installation and after a reboot of the device. This RAM will not be available to user applications provisioned through the controller.

ZFS RAM = min ( 256 MiB + 0.003 * poolSize ,  20% of System RAM)
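
As a rough illustration, the formula above can be evaluated in Python. The guide does not state the unit of poolSize, so this sketch assumes that the pool size, the system RAM, and the result are all expressed in bytes. The function name is hypothetical.

```python
MIB = 1024 * 1024


def zfs_ram_bytes(pool_size_bytes, system_ram_bytes):
    """Evaluate ZFS RAM = min(256 MiB + 0.003 * poolSize, 20% of system RAM).

    Assumption: every quantity is in bytes; the guide leaves the unit open.
    """
    limit_from_pool = 256 * MIB + 0.003 * pool_size_bytes
    limit_from_ram = 0.20 * system_ram_bytes
    return min(limit_from_pool, limit_from_ram)


if __name__ == "__main__":
    # Example: a 300 GB pool on a device with 32 GiB of RAM.
    reserved = zfs_ram_bytes(300 * 10**9, 32 * 1024**3)
    print(f"{reserved / MIB:.0f} MiB reserved for ZFS")
```

On devices with little memory, the second term (20% of system RAM) becomes the limit, so the reservation never exceeds one fifth of the installed RAM.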

3.3. Physical Disks Recommendation

We recommend installing EVE-OS on a smaller-capacity disk, separate from the disks used for ZFS. The disk used for EVE-OS can be as small as 10GB.

We recommend that all physical disks used for ZFS be of the same type and capacity. All disks should be NVMe, SSD, or HDD; mixing types technically works but is not recommended for consistent performance. All disks should also have the same capacity for optimum space utilization. Otherwise, the pool capacity is determined by the smallest disk in the pool.

Example 1: Consider three disks of 100GB capacity each.

The storage pool size will be approximately 80% of (3 * 100GB), or 240GB.

Example 2: Two disks are 100GB each, and one disk is 50GB.

The storage pool size will be approximately 80% of (3 * 50GB), or 120GB, wasting approximately 100GB of disk space.
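
The two examples above can be checked with a small Python sketch. It is a simplified model under the assumptions stated in this section: every disk contributes only as much as the smallest disk, and 20% is set aside before any RAID overhead. The function name is hypothetical.

```python
def pool_size_gb(disk_sizes_gb, provisioning_percent=20):
    """Return (approximate_pool_size_gb, wasted_raw_gb) for a list of disk sizes.

    Simplified model: each disk is used only up to the size of the smallest
    disk, and provisioning_percent of that space is set aside.
    """
    smallest = min(disk_sizes_gb)
    usable_raw = smallest * len(disk_sizes_gb)
    pool = usable_raw * (100 - provisioning_percent) / 100
    wasted = sum(disk_sizes_gb) - usable_raw
    return pool, wasted


if __name__ == "__main__":
    print(pool_size_gb([100, 100, 100]))  # Example 1: equal disks
    print(pool_size_gb([100, 100, 50]))   # Example 2: one smaller disk
```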

The following table shows a sample GRUB config for each RAID level.

RAID type | Separate EVE-OS disk | ZFS minimum disks | Example GRUB config
none (raid0) | Not mandatory | 1 | eve_install_zfs_with_raid_level = none
 | | | Single-disk config: eve_install_disk=sda
 | | | Multiple-disk config: eve_install_disk=sda eve_persist_disk=sda,sdb,sdc
 | | | Note: sda must be listed in both the install and persist disks.
raid1 (mirror) | Mandatory | 2 | eve_install_zfs_with_raid_level = raid1
 | | | eve_install_disk=sda eve_persist_disk=sdb,sdc
raid5 (raidz1) | Mandatory | 3 | eve_install_zfs_with_raid_level = raid5
 | | | eve_install_disk=sda eve_persist_disk=sdb,sdc,sdd
raid6 (raidz2) | Mandatory | 4 | eve_install_zfs_with_raid_level = raid6
 | | | eve_install_disk=sda eve_persist_disk=sdb,sdc,sdd,sde

 

4. Installation

EVE-OS supports two installation modes:

4.1. USB 

USB installation is supported. Because a USB install wipes out everything on the device, there are no compatibility issues with existing installs. We strongly recommend that the device meet the minimum system requirements listed in section 2. If the device does not meet those requirements, ext4 storage is recommended.

The supported RAID levels are none, raid1, raid5, and raid6.

The following GRUB config parameters are required to install ZFS storage:

eve_install_zfs_with_raid_level = raid5
eve_install_disk=sda eve_persist_disk=sdb,sdc,sdd

In this example, sda is where EVE-OS will be installed. EVE-OS must be installed on a separate disk if storage redundancy is required.
ZFS storage will be configured on sdb, sdc, and sdd with raid5.
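
To make the rules for these parameters concrete, the following Python sketch checks a disk layout against the minimums given in this guide and prints the corresponding GRUB lines. It is a hypothetical helper, not an EVE-OS tool: the function name and the validation rules are assumptions derived from the sample configs in section 3.3, and the output simply copies the formatting used in those samples.

```python
# Minimum number of ZFS (persist) disks per RAID level, from section 3.3.
MIN_PERSIST_DISKS = {"raid1": 2, "raid5": 3, "raid6": 4}


def grub_zfs_params(raid_level, install_disk, persist_disks=()):
    """Return the GRUB config lines for a ZFS install (hypothetical helper)."""
    persist_disks = list(persist_disks)
    level_line = f"eve_install_zfs_with_raid_level = {raid_level}"
    if raid_level == "none":
        if not persist_disks:
            # Single-disk config: only the install disk is given.
            return [level_line, f"eve_install_disk={install_disk}"]
        if install_disk not in persist_disks:
            raise ValueError("with none, the install disk must also be a persist disk")
    elif raid_level in MIN_PERSIST_DISKS:
        if install_disk in persist_disks:
            raise ValueError("redundant storage needs a separate EVE-OS install disk")
        if len(persist_disks) < MIN_PERSIST_DISKS[raid_level]:
            needed = MIN_PERSIST_DISKS[raid_level]
            raise ValueError(f"{raid_level} needs at least {needed} persist disks")
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    disks = ",".join(persist_disks)
    return [level_line, f"eve_install_disk={install_disk} eve_persist_disk={disks}"]


if __name__ == "__main__":
    for line in grub_zfs_params("raid5", "sda", ["sdb", "sdc", "sdd"]):
        print(line)
```

Running a check like this before preparing the installer can catch a layout that lists too few disks for the chosen RAID level.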

4.2. iPXE

iPXE-based installation is supported and depends on the same grub parameters as USB installation.

5. Upgrade

There is no non-disruptive path to move existing EVE-OS devices from ext4-based storage to ZFS. During the upgrade process, if the device is found to be configured with ext4-based storage, that storage remains intact and ZFS storage is not provisioned.

Users who want redundant storage must back up their apps and then perform a USB installation. This requires re-onboarding (delete/add) the Edge Node.

Upgrades from ZFS-based storage to the next release are supported.

6. Storage Redundancy

Storage redundancy is supported using the RAID-Z feature set in ZFS. The RAID level is specified through a config parameter during installation and depends on the number of physical disks in the system.

6.1. Disk Failures

Any physical disk failures are handled seamlessly by ZFS. The number of drives that can fail is determined by the RAID-Z level chosen.

RAID-Z level | Disk failures tolerated
raid5 | 1
raid6 | 2
None (no RAID support) | 0
raid1 | 1

7. Storage Expansion

Storage expansion at the physical disk level is supported. Storage expansion at the application level is not supported currently.

7.1. Expanding Physical Storage (Storage Pool)

This release does not support expanding the storage pool by adding physical drives. However, users can increase the storage pool size in two ways.

7.1.1. Migrate and Reinstall

Migrate all the user applications to external cloud storage. Then reinstall EVE-OS on the device with additional physical disks, or with bigger disks for additional storage capacity. This can cause app downtime and may not be the preferred choice for users.

7.1.2. Replacing Disks

Replacing physical disks requires an ssh connection to the EVE-OS device. Without one, physical disks cannot be replaced in the current release; that is, replacing disks through ZEDCloud is not supported.

The replacement disk must be the same size as or larger than the original disk, and disks must be replaced one at a time (not all at once).

Perform the following steps to replace a disk in a pool with a raid5 configuration.

  • Check the pool status.
    • Run "zpool status"

        pool: persist
       state: ONLINE
      config:

        NAME          STATE     READ WRITE CKSUM
        persist       ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            sda       ONLINE       0     0     0
            sdb       ONLINE       0     0     0
            sdd       ONLINE       0     0     0

  • Identify the disk to be replaced. For example, assume sdd is to be replaced with disk sdc.
  • Run the replace command.
    • zpool replace persist /dev/sdd /dev/sdc
  • Manually expand the pool size.
    • zpool set autoexpand=on persist
  • Run zpool status to verify that sdd has been replaced with sdc.
    • zpool status

        pool: persist
       state: ONLINE
      config:

        NAME          STATE     READ WRITE CKSUM
        persist       ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            sda       ONLINE       0     0     0
            sdb       ONLINE       0     0     0
            sdc       ONLINE       0     0     0

  • If the replacement disk has a higher capacity, verify that the pool size changed.
    • zpool list

        NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
        persist   668G  8.45M   668G        -         -     0%     0%  1.00x    ONLINE  -
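
The verification step above can also be scripted. The following Python sketch parses the text printed by "zpool status" and checks that the old disk is gone and the new disk is ONLINE. It is a minimal illustration: the function names are hypothetical, the parser only handles the simple layout shown in this guide, and the sample text is copied from the output above rather than captured from a device.

```python
SAMPLE_STATUS = """\
  pool: persist
 state: ONLINE
config:

    NAME          STATE     READ WRITE CKSUM
    persist       ONLINE       0     0     0
      raidz1-0    ONLINE       0     0     0
        sda       ONLINE       0     0     0
        sdb       ONLINE       0     0     0
        sdc       ONLINE       0     0     0

errors: No known data errors
"""


def parse_zpool_config(status_text):
    """Map each name in the config table of `zpool status` to its state."""
    devices = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "NAME":
            in_config = True  # header row of the config table
            continue
        if in_config:
            # A "label:" line (for example "errors:") ends the table.
            if fields[0].endswith(":") or len(fields) < 2:
                break
            devices[fields[0]] = fields[1]
    return devices


def replacement_done(status_text, old_disk, new_disk):
    """True when old_disk is no longer listed and new_disk is ONLINE."""
    devices = parse_zpool_config(status_text)
    return old_disk not in devices and devices.get(new_disk) == "ONLINE"


if __name__ == "__main__":
    print(replacement_done(SAMPLE_STATUS, "sdd", "sdc"))
```

On a device, the text would come from running "zpool status" over the ssh session; here it is passed in as a string so the check can be tried without a pool.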

7.2. Adding Additional Virtual Disk in VM

Adding a virtual disk to VM is not supported currently.

7.3. Expanding Virtual Disk in VM

Virtual disk expansion in the VM is not supported currently.

8. Storage Virtualization

Storage can be exported to VMs through the vhost-scsi interface instead of the virtio-blk-pci interface. Multi-queue support has also been added, which should improve performance for apps that are storage I/O bound.

9. ZEDCloud Enhancement

ZEDCloud -> Edge Node -> Status page has been enhanced to display the physical storage on the device. The following screen shows a device configured with raid5 storage.

 

[Screenshot: Edge Node Status page for a device configured with raid5 storage (image2.png)]
