2.5.3 Maintain Free Space in a Storage Pool to Protect Against Disk Failure

An Exascale storage pool is a collection of pool disks that provides the persistent physical storage for Exascale vaults and files. If the storage device hosting one pool disk fails, then the storage pool can continue to function with reduced redundancy. However, the storage pool may go offline if multiple pool disks fail at the same time.

To maximize protection against storage failures, maintain sufficient free space in each storage pool so that an automatic data rebalance operation can complete after a main storage device fails.

The amount of free space in a storage pool is the difference between the following storage pool attributes:

  • spaceRaw: Specifies the total amount of storage space associated with the storage pool.

  • spaceUsed: Specifies the total amount of storage space allocated (used) in the storage pool.

You can use the ESCLI lsstoragepool command to examine these attributes. For example, the following command shows the spaceRaw and spaceUsed values associated with the storage pool named POOL1:

@> lsstoragepool POOL1 --attributes spaceRaw,spaceUsed
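As a minimal sketch of the calculation described above, the following Python snippet derives the free space, and its share of the pool, from the two attribute values. The function name and the sample values are hypothetical. The sketch assumes that both values have already been converted to the same unit (for example, bytes) and does not attempt to parse lsstoragepool output.

```python
def free_space(space_raw: float, space_used: float) -> tuple[float, float]:
    """Return (free space, free space as a percentage of spaceRaw).

    Both inputs are assumed to be expressed in the same unit.
    """
    if space_raw <= 0:
        raise ValueError("space_raw must be positive")
    if not 0 <= space_used <= space_raw:
        raise ValueError("space_used must be between 0 and space_raw")
    free = space_raw - space_used
    return free, 100.0 * free / space_raw


# Hypothetical values: a pool of 100 units with 80 units used.
free, free_pct = free_space(100.0, 80.0)
print(f"free={free} ({free_pct:.1f}% of spaceRaw)")
```

The percentage form is convenient because the recommendations in this section are largely stated as a percentage of the total space in the storage pool.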

The amount of free space required to rebalance a storage pool after losing a storage device largely depends on the number and type of storage servers associated with the storage pool.

Another consideration is the redundancy level used to protect against data loss. However, because high redundancy (triple mirroring) is generally advised, the following recommendations assume that the storage pool only contains high redundancy files. Less free space may be sufficient if the storage pool includes a significant proportion of normal redundancy (double-mirrored) files.

When a storage pool occupies four or more storage servers, the lost data can be reconstructed on any partner storage server. In this case, the free space that you should maintain to successfully rebalance the storage pool after losing one main storage device is the greater of the following:

  • 3% of the total space in the storage pool.

  • Three times the size of the lost pool disk(s).

    Note:

    On an EF storage server with 4 capacity-optimized flash devices, each device contains two pool disks. Consequently, losing one device means losing two pool disks.
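The rule for four or more storage servers can be sketched as follows. This is an illustration only: the function name and sample sizes are hypothetical, and the sketch assumes that "the size of the lost pool disk(s)" means the combined size of all pool disks hosted on the failed device, with every value expressed in the same unit.

```python
def required_free_space_4plus(total_space: float,
                              lost_pool_disk_sizes: list[float]) -> float:
    """Free space to maintain for a pool on four or more storage servers.

    Returns the greater of:
      - 3% of the total space in the storage pool
      - three times the combined size of the lost pool disks (assumption)
    """
    three_percent = total_space * 3 / 100
    triple_lost = 3 * sum(lost_pool_disk_sizes)
    return max(three_percent, triple_lost)


# Hypothetical pool of 1000 units; the failed device hosted one 20-unit pool disk.
print(required_free_space_4plus(1000.0, [20.0]))        # 3 x 20 exceeds 3% of 1000

# Hypothetical failed device that hosted two 10-unit pool disks.
print(required_free_space_4plus(1000.0, [10.0, 10.0]))
```

In a large pool with comparatively small pool disks, the 3% term becomes the larger of the two values and therefore sets the requirement.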

When the storage pool occupies only three storage servers, high redundancy data must be reconstructed on the same storage server that contained the failed device, because each of the three mirror copies resides on a different storage server. In this case, the free space required to reconstruct data after losing a storage device depends on the number of physical storage devices in each storage server.

The following formula describes the free space requirement to reconstruct data after losing one main storage device for a storage pool using only three storage servers. The free space requirement is expressed as a percentage of the total amount of storage space associated with the storage pool. In the formula, DeviceCount represents the number of main storage devices contained in each storage server.

Free Space Percentage = ceiling(100/DeviceCount)+3

For example, on a High Capacity (HC) storage server with 12 disks, one disk represents 8.33% (100/12) of the total storage capacity. So, for a storage pool using three 12-disk HC servers, you should maintain 12% (ceiling(8.33)+3) of the storage pool as free space to reconstruct data after losing a storage device.
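The formula is straightforward to evaluate programmatically. The following Python sketch (the function name is hypothetical) applies it to the device counts discussed in this section:

```python
import math


def free_space_percentage(device_count: int) -> int:
    """Free space to maintain, as a percentage of the total pool space,
    for a storage pool that uses only three storage servers.

    Implements: ceiling(100 / DeviceCount) + 3
    """
    if device_count < 1:
        raise ValueError("device_count must be at least 1")
    return math.ceil(100 / device_count) + 3


# Main storage devices per storage server for the configurations in this section.
for device_count in (12, 6, 8, 4):
    print(f"{device_count} devices: {free_space_percentage(device_count)}%")
```

For a server with 12 devices, the function returns 12, which matches the worked example above.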

Based on the preceding formula, the following table summarizes the amount of free space that should be maintained to successfully rebalance a storage pool spanning only three storage servers after losing one main storage device.

Storage Server Type                                                           Free Space Requirement
----------------------------------------------------------------------------  ------------------------------------------
High Capacity (HC) storage server with 12 disks                               12% of the total space in the storage pool
HC storage server with 6 disks                                                20% of the total space in the storage pool
Extreme Flash (EF) storage server with 8 performance-optimized flash devices  16% of the total space in the storage pool
EF storage server with 4 capacity-optimized flash devices                     28% of the total space in the storage pool