1. Introduction
Starting with EVE-OS version 8.0.0, redundant storage features such as RAID and storage expansion features such as adding a physical disk to the device are supported using the open-source OpenZFS (ZFS) storage stack.
2. System Requirements
Since ZFS requires a certain amount of memory and disk resources to support redundant storage, this feature is recommended only on devices with the following specifications, although these are not hard requirements.
| Minimum Memory (GB) | Minimum Physical Drives (Storage) | CPUs |
| --- | --- | --- |
| 32 | 4 (including EVE-OS install drive) | 10 cores |
Note: ZFS can be installed on any system configuration. The requirements mentioned above are our recommendations for better sustained performance.
3. ZFS Overhead
Like any redundant storage system, the ZFS implementation comes with some overhead in memory, storage usage, and availability. We can divide this overhead into two sub-components: memory and storage.
3.1. Storage Overhead
Rigorous performance testing determined that 16K is the optimal block size for the zvol block devices, and we use compression instead of deduplication.
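For reference, both settings can be inspected on a running device with standard ZFS commands (a sketch; the pool name persist matches the examples later in this document, and the zvol path is a placeholder):

```
# Confirm compression (rather than deduplication) is enabled on the pool
zfs get compression persist

# Check the block size of a zvol backing an application volume (expected: 16K)
zfs get volblocksize persist/<zvol-name>
```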
Four RAID levels are supported in the current release: none, raid1 (mirror), raid5 (raidz1), and raid6 (raidz2).
ZFS also requires storage over-provisioning: we recommend that 20% of physical storage be set aside. In other words, only 80% of physical storage is available to customers, even before RAID overhead.
The following table shows the minimum disks required for each RAID level (excluding the EVE-OS install disk) and the space available from the physical disk space.
Let us assume that each physical disk is 100GB in size.
| RAID type | Minimum disks | Physical space (GB) | Available space after 20% provisioning (GB) | Available space after RAID (GB) | Available space after 2:1 compression (GB) |
| --- | --- | --- | --- | --- | --- |
| raid5 | 3 | 300 | 240 | 2/3 of 240 = 160 | 320 |
| raid6 | 4 | 400 | 320 | 2/4 of 320 = 160 | 320 |
| raid1 | 2 | 200 | 160 | 1/2 of 160 = 80 | 160 |
| None | 1 | 100 | 80 | 80 | 160 |
A couple of important points to note here:
- The available space percentage increases as additional disks are added at any RAID level. In other words, the fraction of space used for parity decreases as the number of disks increases.
- Though it appears that the customer loses a lot of disk space (physical space is reduced to the available space after RAID), this is typically not the case once customer workloads are running, because the ZFS compression feature comes into the picture. Compression ratios depend on the workload; for example, database workloads are known to compress at about 2:1, and VM workloads are generally in a similar range. Assuming a typical 2:1 compression ratio, all the space lost to RAID is recovered through compression. In the worst case of a 1:1 compression ratio, the available disk space is simply the available space after RAID, as shown in the table above.
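The compression ratio actually achieved by running workloads can be checked on the device with a standard ZFS command (a hedged example; the pool name persist matches the examples in section 7.1.2):

```
# Reports the achieved compression ratio across the pool, e.g. "2.00x"
zfs get compressratio persist
```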
3.2. Memory Overhead
When ZFS storage is provisioned, some amount of system RAM is set aside for efficient operation of the storage stack, both during installation and after a reboot of the device. This RAM is not available to user applications provisioned through the controller. The reserved amount is:
ZFS RAM = min(256 MiB + 0.003 * poolSize, 20% of system RAM)
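For example (assuming poolSize is the raw pool capacity), on a device with 32 GB of RAM and a 300 GB pool: 256 MiB + 0.003 × 300 GB ≈ 256 MiB + 0.9 GB ≈ 1.15 GB, which is below the 20% cap of 6.4 GB, so about 1.15 GB of RAM is reserved for ZFS.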
3.3. Physical Disks Recommendation
We recommend installing EVE-OS on a smaller-capacity disk, separate from the disks used for ZFS. The capacity of the disk used for EVE-OS can be as small as 10GB.
We recommend that all physical disks used for ZFS be of the same type and capacity. All disks should be of NVMe, SSD, or HDD type; mixing types, though it technically works, is not recommended for consistent performance. All disks should have the same capacity for optimum space utilization; otherwise, the pool capacity will be determined by the smallest disk in the pool.
Example 1: Consider three disks of 100GB capacity each.
The storage pool size will be approximately 80% of (3 * 100GB) = 240GB.
Example 2: Two disks are 100GB each, and one disk is 50GB.
The storage pool size will be approximately 80% of (3 * 50GB) = 120GB, wasting approximately 100GB of disk space (the unused 50GB on each of the two larger disks).
The following table shows a sample grub config for each RAID level:
| RAID type | Separate EVE-OS disk | ZFS minimum disks | Example grub config |
| --- | --- | --- | --- |
| none (raid0) | Not mandatory | 1 | `eve_install_zfs_with_raid_level = none`<br>Single disk config: `eve_install_disk=sda`<br>Multiple disk config: `eve_install_disk=sda eve_persist_disk=sda,sdb,sdc`<br>Please note sda should be listed in both install and persist disks. |
| raid1 (mirror) | Mandatory | 2 | `eve_install_zfs_with_raid_level = raid1 eve_install_disk=sda eve_persist_disk=sdb,sdc` |
| raid5 (raidz1) | Mandatory | 3 | `eve_install_zfs_with_raid_level = raid5 eve_install_disk=sda eve_persist_disk=sdb,sdc,sdd` |
| raid6 (raidz2) | Mandatory | 4 | `eve_install_zfs_with_raid_level = raid6 eve_install_disk=sda eve_persist_disk=sdb,sdc,sdd,sde` |
4. Installation
There are two supported installation modes in EVE-OS: USB and iPXE.
4.1. USB
USB installation is supported. Since a USB install wipes out everything, there are no compatibility issues with existing installs. We strongly recommend that the device matches the minimum system requirements mentioned above in section 2. If the installation is on a device that does not meet the minimum system requirements, then ext4 storage is recommended.
The supported raid levels are [none, raid1, raid5, raid6].
The following grub config parameters are required to install ZFS storage:

```
eve_install_zfs_with_raid_level = raid5
eve_install_disk=sda
eve_persist_disk=sdb,sdc,sdd
```
In this example, sda is where EVE-OS will be installed. EVE-OS must be installed on a separate disk if storage redundancy is required.
ZFS storage will be configured on sdb, sdc, and sdd with raid5.
4.2. iPXE
iPXE-based installation is supported and depends on the same grub parameters as USB installation.
5. Upgrade
There is no non-disruptive path to upgrade existing EVE-OS devices running ext4-based storage. During the upgrade process, if the device is found to be configured with ext4-based storage, it will remain intact, and ZFS storage is not provisioned.
If users want redundant storage, they should back up their apps and then do a USB installation to get redundant storage. This requires re-onboarding (delete/add) of the Edge Node.
Upgrades from ZFS-based storage to the next release are supported.
6. Storage Redundancy
Storage redundancy is supported using the RAID-Z feature set in ZFS. The RAID level is specified through a config parameter during installation, depending on the number of physical disks in the system.
6.1. Disk Failures
Any physical disk failures are handled seamlessly by ZFS. The number of drives that can fail is determined by the RAID-Z level chosen.
| RAID-Z level | Disk failures tolerated |
| --- | --- |
| raid5 | 1 |
| raid6 | 2 |
| None (no RAID support) | 0 |
| raid1 | 1 |
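For illustration, the output below is a hypothetical sketch (using the same pool layout as the raid5 example in section 7.1.2) of what `zpool status` reports when one disk in a raid5 pool has failed; the pool stays available in a DEGRADED state until the disk is replaced:

```
zpool status
  pool: persist
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
config:

        NAME        STATE     READ WRITE CKSUM
        persist     DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdd     UNAVAIL      0     0     0
```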
7. Storage Expansion
Storage expansion at the physical disk level is supported. Storage expansion at the application level is not supported currently.
7.1. Expanding Physical Storage (Storage Pool)
This release does not support expanding the storage pool by adding additional physical drives. However, there are a couple of ways users can increase the storage pool size.
7.1.1. Migrate and Reinstall
Migrate all the user applications to external cloud storage, then reinstall EVE-OS on the device with additional physical disks or with bigger disks for additional capacity. This can lead to downtime in app availability and may not be the preferred choice for users.
7.1.2. Replacing Disks
Replacing physical disks requires an ssh connection to the EVE-OS device; without it, the current release cannot replace physical disks, i.e., replacing disks through ZEDCloud is not supported.
The replacement disk should be of the same size as or larger than the original disk, and disks should be replaced one after another (not all at the same time).
The following steps need to be performed to replace a disk for a pool in raid5 configuration.

- Check the pool status by running `zpool status`:

```
zpool status
  pool: persist
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        persist     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
```

- Identify the disk to be replaced. Say sdd was identified to be replaced with disk sdc.
- Run the replace command:

```
zpool replace persist /dev/sdd /dev/sdc
```

- Enable automatic pool expansion so the pool can grow to use a larger replacement disk:

```
zpool set autoexpand=on persist
```

- Verify with `zpool status` that sdd has been replaced with sdc:

```
zpool status
  pool: persist
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        persist     ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
```

- Verify that the pool size changed if the replacement disk is of higher capacity:

```
zpool list
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
persist   668G  8.45M   668G        -         -     0%     0%  1.00x  ONLINE  -
```
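If a larger disk was swapped in before autoexpand was enabled, the extra capacity can also be claimed per device with the generic ZFS expand command (shown here with the example names from above):

```
# Expand the replaced device to use all of its available space
zpool online -e persist sdc
```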
7.2. Adding Additional Virtual Disk in VM
Adding a virtual disk to a VM is not supported currently.
7.3. Expanding Virtual Disk in VM
Virtual disk expansion in the VM is not supported currently.
8. Storage Virtualization
Storage can be exported to VMs through the vhost-scsi interface instead of the virtio-blk-pci interface. Multi-queue support has also been added, which should provide a performance improvement for apps that are storage I/O bound.
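From inside a guest VM, the difference is visible in how the disk enumerates (a hedged illustration; device naming depends on the guest OS): a vhost-scsi volume appears as a SCSI disk such as /dev/sda, while a virtio-blk-pci volume appears as /dev/vda.

```
# Inside the guest: list SCSI-attached block devices.
# A vhost-scsi disk shows up here; a virtio-blk disk does not.
lsblk -S
```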
9. ZEDCloud Enhancement
The ZEDCloud -> Edge Node -> Status page has been enhanced to display the physical storage on the device. The following screen shows a device configured with raid5 storage.